Create self-contained archives for bug reports, archival, and sharing
The Problem
You run a simulation. It crashes, or produces unexpected results. You want to share the exact setup with a colleague, file a bug report, or archive it for your future self. But the run depends on:
- A rendered .josh source (possibly from a .josh.j2 template)
- A rendered .jshc config (from a Jinja sweep)
- Several .jshd data files scattered across directories
- Specific parameter values
- A specific Josh JAR version
Reconstructing all of this from a run hash alone is fragile. Files move, configs get edited, templates change.
What a Bottle Is
A bottle is a self-contained .tar.gz archive with everything needed to reproduce a single Josh simulation run – without Python, joshpy, or any project structure. Just Java and the JAR.
bottle_abc123def456_20260403_143000.tar.gz
  bottle_abc123def456/
    simulation.josh        # the rendered .josh source
    sweep_config.jshc      # the rendered config (not the .j2 template)
    data/
      soil_quality.jshd    # all external data files, copied in
    run.sh                 # exact java command with relative paths
    manifest.json          # full provenance metadata
Everything is realized: the .jshc is the rendered output, not the Jinja template. The .josh is the rendered source, not the .josh.j2. Data files are copied in. run.sh uses relative paths so the archive works anywhere.
To reproduce, a recipient just needs Java and the JAR:
tar xzf bottle_abc123def456_20260403_143000.tar.gz
cd bottle_abc123def456
./run.sh /path/to/joshsim-fat.jar
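The archive name in the example follows the pattern bottle_<run hash>_<date>_<time>.tar.gz, while the directory inside drops the timestamp. A minimal sketch for deriving the directory to cd into, assuming that pattern and a hexadecimal run hash hold for every single-job bottle (the helper bottle_dir_name is hypothetical and not part of joshpy):

```python
import re

# Assumed naming pattern, inferred from the example archive name above.
_BOTTLE_NAME = re.compile(r"^(bottle_[0-9a-f]+)_(\d{8})_(\d{6})\.tar\.gz$")


def bottle_dir_name(archive_name: str) -> str:
    """Return the top-level directory expected inside a single-job bottle."""
    match = _BOTTLE_NAME.match(archive_name)
    if match is None:
        raise ValueError(f"not a single-job bottle name: {archive_name!r}")
    return match.group(1)


print(bottle_dir_name("bottle_abc123def456_20260403_143000.tar.gz"))
```

Sweep bottles use a different name (bottle_sweep_<date>_<time>), so this sketch rejects them rather than guessing.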
Bottling During a Sweep
The most common use case: bottle the first failure automatically so you have a ready-made bug report.
results = manager.run(bottle="first_failure")
This creates an archive in ./bottles/ as soon as a job fails – after the registry callback fires but before stop_on_failure raises. Even if the sweep stops, you have the archive.
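The ordering described above (registry callback, then bottle, then the stop_on_failure exception) can be illustrated with a toy loop. This is only a sketch of the sequence of events; run_sweep and its callbacks are hypothetical names, not joshpy's actual implementation:

```python
def run_sweep(jobs, run_job, on_result, bottle, stop_on_failure=True):
    """Hypothetical sweep loop; every name here is illustrative only."""
    bottled = False
    for job in jobs:
        succeeded = run_job(job)
        on_result(job, succeeded)      # 1. registry callback records the result
        if not succeeded:
            if not bottled:
                bottle(job)            # 2. the first failure is archived
                bottled = True
            if stop_on_failure:        # 3. only then does the sweep stop
                raise RuntimeError(f"job {job} failed")


events = []
try:
    run_sweep(
        jobs=["a", "b", "c"],
        run_job=lambda job: job != "b",  # job "b" fails
        on_result=lambda job, ok: events.append(("registry", job, ok)),
        bottle=lambda job: events.append(("bottle", job)),
    )
except RuntimeError as exc:
    events.append(("raised", str(exc)))

for event in events:
    print(event)
```

Because the bottle step sits between the callback and the raise, the archive exists by the time the exception reaches the caller.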
Bottle Modes
Mode             What gets bottled      Archive type
"first_failure"  First failed job only  Single-job bottle
"first_success"  First successful job   Single-job bottle
"all_failures"   Every failed job       Sweep bottle (shared data)
"all"            Every job              Sweep bottle (shared data)
first_failure and first_success create an archive immediately when a matching job completes. all and all_failures collect matching jobs during the sweep and create a single archive at the end with shared data files (copied once, not per-job).
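The four modes reduce to a small decision per finished job: archive now, collect for the end-of-sweep archive, or skip. A sketch of that decision, where Action and bottle_action are hypothetical names chosen for illustration and only the mode strings come from the table above:

```python
from enum import Enum


class Action(Enum):
    SKIP = "skip"
    ARCHIVE_NOW = "archive_now"  # single-job bottle, written immediately
    COLLECT = "collect"          # added to the sweep bottle written at the end


def bottle_action(mode: str, succeeded: bool, already_bottled: bool) -> Action:
    """Decide what to do with one finished job under a given bottle mode."""
    if mode in ("first_failure", "first_success"):
        wanted = succeeded if mode == "first_success" else not succeeded
        if wanted and not already_bottled:
            return Action.ARCHIVE_NOW
        return Action.SKIP
    if mode == "all":
        return Action.COLLECT
    if mode == "all_failures":
        return Action.SKIP if succeeded else Action.COLLECT
    raise ValueError(f"unknown bottle mode: {mode!r}")


print(bottle_action("first_failure", succeeded=False, already_bottled=False))
print(bottle_action("all_failures", succeeded=True, already_bottled=False))
```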
Sweep Bottles
When using "all" or "all_failures", the archive groups jobs together with shared data:
bottle_sweep_20260403_143000/
  data/
    soil_quality_gradient.jshd   # shared across all jobs
    soil_quality_stripes.jshd
  jobs/
    abc123def456/
      simulation.josh
      sweep_config.jshc
      run.sh                     # --data points to ../../data/
    fed987cba654/
      simulation.josh
      sweep_config.jshc
      run.sh
  manifest.json                  # lists all jobs + summary
To reproduce a single job from a sweep bottle:
cd jobs/abc123def456
./run.sh /path/to/joshsim-fat.jar

import tarfile

if archives:
    with tarfile.open(archives[0], "r:gz") as tar:
        for member in tar.getmembers():
            kind = "dir" if member.isdir() else f"{member.size:>8,} bytes"
            print(f"  {member.name:<50s}{kind}")
bottle_sweep_20260408_170139 dir
bottle_sweep_20260408_170139/data dir
bottle_sweep_20260408_170139/data/soil_quality_gradient.jshd 289,583 bytes
bottle_sweep_20260408_170139/jobs dir
bottle_sweep_20260408_170139/jobs/7b553aeac8ae dir
bottle_sweep_20260408_170139/jobs/7b553aeac8ae/run.sh 512 bytes
bottle_sweep_20260408_170139/jobs/7b553aeac8ae/simulation.josh 1,978 bytes
bottle_sweep_20260408_170139/jobs/7b553aeac8ae/sweep_config.jshc 0 bytes
bottle_sweep_20260408_170139/manifest.json 719 bytes
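A listing like the one above can also be grouped by job. The sketch below assumes the <bottle>/jobs/<run hash>/<file> layout shown earlier; files_by_job and build_demo_archive are hypothetical helpers, and the demo archive is built in memory with empty files purely for illustration:

```python
import io
import tarfile
from collections import defaultdict


def files_by_job(tar: tarfile.TarFile) -> dict:
    """Group regular files under <bottle>/jobs/<run_hash>/ by run hash."""
    jobs = defaultdict(list)
    for member in tar.getmembers():
        parts = member.name.split("/")
        # Expected shape: [bottle_dir, "jobs", run_hash, file_name, ...]
        if member.isfile() and len(parts) >= 4 and parts[1] == "jobs":
            jobs[parts[2]].append("/".join(parts[3:]))
    return {run_hash: sorted(names) for run_hash, names in jobs.items()}


def build_demo_archive() -> bytes:
    """Build a tiny in-memory sweep bottle with empty files (illustration only)."""
    buffer = io.BytesIO()
    root = "bottle_sweep_20260408_170139"
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        directory = tarfile.TarInfo(f"{root}/jobs/7b553aeac8ae")
        directory.type = tarfile.DIRTYPE
        tar.addfile(directory)
        for name in (
            "data/soil_quality_gradient.jshd",
            "jobs/7b553aeac8ae/run.sh",
            "jobs/7b553aeac8ae/simulation.josh",
            "manifest.json",
        ):
            tar.addfile(tarfile.TarInfo(f"{root}/{name}"))
    return buffer.getvalue()


with tarfile.open(fileobj=io.BytesIO(build_demo_archive()), mode="r:gz") as tar:
    grouped = files_by_job(tar)

print(grouped)
```

Shared data files and manifest.json sit outside jobs/, so they are deliberately left out of the result.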
The run.sh Script
if archives:
    with tarfile.open(archives[0], "r:gz") as tar:
        for member in tar.getmembers():
            if member.name.endswith("run.sh"):
                print(tar.extractfile(member).read().decode())
                break
#!/bin/bash
# Bottled by joshpy v0.0.8.6
# JAR SHA256: 02b9bc80736ca05b68e59231d29325802ac76311012c0ba457c520bbe51189b0
# JAR version: 1.0
# Original run hash: 7b553aeac8ae
# Bottled at: 2026-04-08T17:01:40Z
set -euo pipefail
java -jar "${1:?Usage: ./run.sh /path/to/joshsim-fat.jar}" \
  run simulation.josh \
  Main \
  --data sweep_config.jshc=sweep_config.jshc \
  --data soil_quality=../../data/soil_quality_gradient.jshd \
  --custom-tag label=demo_run \
  --custom-tag run_hash=7b553aeac8ae
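Since run.sh records the JAR's SHA256 in a header comment, a recipient can check that their local JAR is the same build before running. A sketch under the assumption that the header always has the form "# JAR SHA256: <64 hex digits>" as in the output above; recorded_jar_sha256 and jar_matches are hypothetical helpers, not joshpy functions:

```python
import hashlib
import re
from pathlib import Path

# Header format assumed from the run.sh shown above.
_SHA_LINE = re.compile(r"^# JAR SHA256: ([0-9a-f]{64})\s*$", re.MULTILINE)


def recorded_jar_sha256(run_sh_text: str):
    """Return the SHA256 recorded in run.sh, or None if the header is absent."""
    match = _SHA_LINE.search(run_sh_text)
    return match.group(1) if match else None


def jar_matches(run_sh_text: str, jar_path: Path) -> bool:
    """Compare a local JAR's SHA256 with the one recorded in run.sh."""
    expected = recorded_jar_sha256(run_sh_text)
    if expected is None:
        raise ValueError("run.sh has no '# JAR SHA256:' header line")
    digest = hashlib.sha256()
    with open(jar_path, "rb") as handle:
        # Read in chunks so a large fat JAR is not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected


header = (
    "#!/bin/bash\n"
    "# Bottled by joshpy v0.0.8.6\n"
    "# JAR SHA256: 02b9bc80736ca05b68e59231d29325802ac76311012c0ba457c520bbe51189b0\n"
)
print(recorded_jar_sha256(header))
```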
The Manifest
import json

if archives:
    with tarfile.open(archives[0], "r:gz") as tar:
        for member in tar.getmembers():
            if member.name.endswith("manifest.json"):
                manifest = json.loads(tar.extractfile(member).read())
                for key, value in manifest.items():
                    if key in ("stderr", "stdout") and len(str(value)) > 80:
                        value = str(value)[:80] + "..."
                    print(f"  {key}: {value}")
                break
The manifest records everything needed to understand the context of a run: JAR version and hash, parameter values, exit code and error output, original file paths, git hash, Python version, and platform.
Bottling from the Registry
Sometimes you discover a problem days after the run. The registry stores the rendered josh source and config content (since PR1), so you can bottle after the fact:
from joshpy.registry import RunRegistry

registry = RunRegistry(str(registry_path))
bottle_dir_2 = Path(tmpdir) / "bottles_later"
archive = registry.bottle("demo_run", output_dir=bottle_dir_2, cli=cli)
print(f"Bottled from registry: {archive.name}")
Bottled from registry: bottle_7b553aeac8ae_20260408_170141.tar.gz
registry.close()
This reconstructs the bottle from stored data. The original .jshd data files must still exist at their recorded paths (they are copied into the archive).
Warning: Data File Availability
Bottling copies .jshd files from their original locations. If a data file is missing, bottling raises FileNotFoundError — data files are critical for reproducibility. Use omit_jshd=True to intentionally skip them (see below).
Lightweight Bottles (omit_jshd)
Data files can be large. When the recipient already has the data locally (e.g., a colleague on the same team), you can skip copying .jshd files to keep the archive small:
# During a sweep
results = manager.run(bottle="first_failure", bottle_omit_jshd=True)

# From the registry
registry.bottle("baseline", cli=cli, omit_jshd=True)
The run.sh still lists all --data flags so the recipient knows which files to provide. The manifest records the original paths and "omit_jshd": true.
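A recipient of a lightweight bottle can read the --data flags out of run.sh to see which .jshd files still need to be supplied. The sketch below assumes the "--data name=path" form shown in the run.sh example, paths without spaces or quoting, and paths relative to the directory containing run.sh; data_flags and missing_jshd are hypothetical helpers:

```python
import re
from pathlib import Path

# Matches "--data name=path"; assumes no spaces or quotes inside either part.
_DATA_FLAG = re.compile(r"--data\s+(\S+?)=(\S+)")


def data_flags(run_sh_text: str) -> dict:
    """Map each --data name to its path exactly as written in run.sh."""
    return dict(_DATA_FLAG.findall(run_sh_text))


def missing_jshd(run_sh_text: str, job_dir) -> list:
    """Return the names of .jshd inputs that do not exist relative to job_dir."""
    missing = []
    for name, rel_path in data_flags(run_sh_text).items():
        if rel_path.endswith(".jshd") and not (Path(job_dir) / rel_path).exists():
            missing.append(name)
    return missing


script = (
    "run simulation.josh \\\n"
    "Main \\\n"
    "--data sweep_config.jshc=sweep_config.jshc \\\n"
    "--data soil_quality=../../data/soil_quality_gradient.jshd \\\n"
)
print(data_flags(script))
```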
Unpacking a Bottle
Use unbottle() to unpack an archive back into a JobConfig for use with joshpy:
from joshpy.bottle import unbottle

# Always returns a list of JobConfigs (one per job in the bottle)
configs = unbottle("bottle_abc123.tar.gz")

# Single-job bottle: configs has one element
# Sweep bottle: configs has one element per job
When omit_jshd=True was used, provide a local data_dir. The original directory structure is preserved — data_dir replaces the common root of the original paths:
from pathlib import Path

# Tell unbottle where YOUR copy of the data lives
configs = unbottle(
    "bottle_abc123.tar.gz",
    data_dir=Path("/home/alice/josh-data/dev_fine"),
)

# unbottle reads the manifest to find the sender's original paths:
#   cover         → /home/bob/project/data/grids/dev_fine/cover.jshd
#   futureTempJan → /home/bob/project/data/grids/dev_fine/monthly/tas_jan.jshd
#
# It strips the common root (/home/bob/project/data/grids/dev_fine)
# and resolves relative paths under YOUR data_dir:
#   cover         → /home/alice/josh-data/dev_fine/cover.jshd
#   futureTempJan → /home/alice/josh-data/dev_fine/monthly/tas_jan.jshd
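The common-root remapping can be reproduced with the standard library. This sketch only mirrors the behaviour described above and is not unbottle's actual code; remap_data_paths is a hypothetical name, and taking the common root over the files' parent directories (so that a single file still maps sensibly) is an assumption:

```python
import posixpath
from pathlib import PurePosixPath


def remap_data_paths(original: dict, data_dir: str) -> dict:
    """Re-root the sender's data paths under the recipient's data_dir."""
    if not original:
        return {}
    # Common root of the containing directories, not of the files themselves.
    parents = [posixpath.dirname(path) for path in original.values()]
    root = posixpath.commonpath(parents)
    return {
        name: str(PurePosixPath(data_dir) / posixpath.relpath(path, root))
        for name, path in original.items()
    }


sender_paths = {
    "cover": "/home/bob/project/data/grids/dev_fine/cover.jshd",
    "futureTempJan": "/home/bob/project/data/grids/dev_fine/monthly/tas_jan.jshd",
}
local_paths = remap_data_paths(sender_paths, "/home/alice/josh-data/dev_fine")
for name, path in local_paths.items():
    print(f"{name} -> {path}")
```

The subdirectory monthly/ survives the remap, which is what "the original directory structure is preserved" means in practice.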
If bottling fails for any reason (disk full, permissions, etc.), the sweep continues. A warning is printed but the sweep is never aborted by a bottling error. The simulation results are more important than the archive.
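The "warn, never abort" behaviour amounts to wrapping the bottling call so that any exception becomes a warning. A minimal sketch of that pattern; bottle_safely is a hypothetical wrapper and failing_bottle simulates a full disk, neither being part of joshpy:

```python
import warnings


def bottle_safely(make_bottle, *args, **kwargs):
    """Call make_bottle; on any error, warn and return None instead of raising."""
    try:
        return make_bottle(*args, **kwargs)
    except Exception as exc:
        warnings.warn(f"bottling failed, sweep continues: {exc}", RuntimeWarning)
        return None


def failing_bottle():
    # Stand-in for a bottling call that hits a full disk.
    raise OSError("disk full")


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = bottle_safely(failing_bottle)

print(result, [str(w.message) for w in caught])
```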
Standalone Usage
For custom workflows outside of SweepManager, use create_bottle() directly:
from pathlib import Path

from joshpy.bottle import create_bottle

archive = create_bottle(
    job=expanded_job,
    cli_result=result,
    cli=cli,
    output_dir=Path("bottles/"),
)