Complete Example

From first run to full sweep: GridSpec, templates, and catalog in one workflow

Overview

This tutorial walks through the complete lifecycle of a joshpy project:

  1. Set up a GridSpec grid manifest, model template, and config template
  2. Run a single scenario to check the model works (ad-hoc)
  3. Try a different variant and compare the two runs
  4. Sweep the full parameter x variant space systematically
  5. Track everything with ProjectCatalog

It uses all three template types together: .josh.j2 (model), .jshc.j2 (config), and GridSpec variant declarations (data paths).

Prerequisites: Complete Preprocessing External Data first to create the .jshd data files used here.

The Three Templates

A joshpy project with external data has three kinds of templates that converge at run time:

Template type         | Controls                               | What varies
----------------------|----------------------------------------|-------------------------------
Model (.josh.j2)      | Grid geometry, export paths            | Grid bounds, resolution
Config (.jshc.j2)     | Parameter values                       | maxGrowth, etc.
Data paths (GridSpec) | File paths via {variant} placeholders  | Soil pattern, climate scenario

Let’s look at each one.

Grid manifest (grid.yaml)

from pathlib import Path

GRID_YAML = Path("../../examples/external_data/variant_sweep_grid.yaml")
print(GRID_YAML.read_text())
name: tutorial_variant
grid:
  size_m: 1000
  low: [34.0, -116.4]
  high: [33.7, -115.4]
  steps: 10

variants:
  pattern:
    values: [gradient, triangle, stripes]
    default: gradient

files:
  soil_quality:
    template_path: "soil_quality_{pattern}.jshd"
    units: percent

The template_path uses a {pattern} placeholder. GridSpec resolves it to concrete .jshd file paths per variant value.
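
As a rough illustration of that resolution step, the sketch below expands the placeholder with plain str.format. It is not GridSpec's implementation: resolve_variant_paths and the base directory are hypothetical names used only for this example.

```python
from pathlib import Path


def resolve_variant_paths(template_path, axis, values, base_dir="."):
    """Expand a one-axis placeholder template into one path per variant value.

    Hypothetical helper for illustration only; GridSpec's own resolution
    logic may differ (for example in how it chooses the base directory).
    """
    base = Path(base_dir)
    return {value: base / template_path.format(**{axis: value}) for value in values}


paths = resolve_variant_paths(
    "soil_quality_{pattern}.jshd",
    axis="pattern",
    values=["gradient", "triangle", "stripes"],
    base_dir="examples/external_data",  # assumed location, for illustration
)
for value, path in paths.items():
    print(f"{value}: {path}")
```

Each variant value maps to exactly one file, which is why a single run can pick one value while a sweep can iterate over all of them.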

Model template (.josh.j2)

SOURCE_TEMPLATE = Path("../../examples/variant_sweep.josh.j2")
print(SOURCE_TEMPLATE.read_text())
# GridSpec variant sweep simulation - tree growth affected by external data
# Demonstrates using .josh.j2 model templates with GridSpec variant sweeps
#
# Grid geometry is injected via template_vars from GridSpec.
# The `external soil_quality` expression reads preprocessed .jshd data
# that varies spatially and by scenario (via GridSpec variants).
# Growth ceiling is controlled by `config sweep_config.maxGrowth`.

start simulation {{ simulation_name }}

  grid.size = {{ size_m }} m
  grid.low = {{ low_lat }} degrees latitude, {{ low_lon }} degrees longitude
  grid.high = {{ high_lat }} degrees latitude, {{ high_lon }} degrees longitude
  grid.patch = "Default"

  steps.low = 0 count
  steps.high = {{ steps }} count

  exportFiles.patch = "file:///tmp/variant_sweep_{run_hash}_{replicate}.csv"

end simulation

start patch Default

  ForeverTree.init = create 10 count of ForeverTree

  soil_quality.step = external soil_quality

  export.average_height.step = mean(ForeverTree.height)
  export.average_age.step = mean(ForeverTree.age)
  export.soil_quality.step = soil_quality

end patch

start organism ForeverTree

  age.init = 0 year
  age.step = prior.age + 1 year

  height.init = 0 meters

  # Growth rate scales with soil quality from external data, capped by maxGrowth
  max_growth.step = map here.soil_quality from [0 percent, 100 percent] to [0 meters, config sweep_config.maxGrowth] linear

  height.step = prior.height + sample uniform from 0 meters to max_growth

end organism

start unit year

  alias years
  alias yr
  alias yrs

end unit

Grid geometry (size_m, low_lat, etc.) comes from GridSpec.template_vars at render time. The model reads external soil quality data and caps tree growth at config sweep_config.maxGrowth.
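
One way to picture where those variables come from is a simple flattening of the manifest's grid section. The sketch below assumes low and high are [latitude, longitude] pairs, which is consistent with the manifest and the template variable names in this tutorial. flatten_grid is a hypothetical helper, not the code behind GridSpec.template_vars.

```python
def flatten_grid(grid_section):
    """Flatten a manifest-style grid mapping into flat template variables.

    Assumes low/high are [latitude, longitude] pairs (an assumption
    based on the variable names the model template expects).
    """
    low_lat, low_lon = grid_section["low"]
    high_lat, high_lon = grid_section["high"]
    return {
        "size_m": grid_section["size_m"],
        "low_lat": low_lat,
        "low_lon": low_lon,
        "high_lat": high_lat,
        "high_lon": high_lon,
        "steps": grid_section["steps"],
    }


# Values copied from the grid manifest shown earlier.
grid_section = {"size_m": 1000, "low": [34.0, -116.4], "high": [33.7, -115.4], "steps": 10}
template_vars = flatten_grid(grid_section)
for key, value in template_vars.items():
    print(f"  {key}: {value}")
```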

Config template (.jshc.j2)

CONFIG_TEMPLATE = Path("../../examples/templates/sweep_config.jshc.j2")
print(CONFIG_TEMPLATE.read_text())
# Auto-generated configuration for tutorial_sweep.josh
# Parameter sweep: maxGrowth={{ maxGrowth }}

# =============================================================================
# STATIC CONFIG VALUES
# These values are the same for all runs in the sweep.
# Use static values for constants that don't need to vary across experiments.
# =============================================================================

# Initial tree count per organism (constant across all sweep runs)
initialTreeCount = 10 count

# =============================================================================
# SWEPT CONFIG VALUES
# These values vary across sweep runs. Each unique combination creates a job.
# Use swept values for parameters you want to explore or optimize.
# =============================================================================

# Maximum growth per timestep (meters) - SWEPT via Jinja template
maxGrowth = {{ maxGrowth }} meters

{{ maxGrowth }} is filled in per job, either from template_vars (ad-hoc) or from ConfigSweepParameter (sweep).
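
To make the per-job substitution concrete, here is a minimal stand-in for the rendering step. It uses a small regular expression rather than Jinja itself, so it handles only plain double-brace placeholders. render_placeholders is a hypothetical name, not part of joshpy.

```python
import re

# Matches a double-brace placeholder with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_placeholders(text, variables):
    """Replace each placeholder with its value; a missing name raises KeyError."""
    return PLACEHOLDER.sub(lambda match: str(variables[match.group(1)]), text)


template_line = "maxGrowth = {{ maxGrowth }} meters"

# One rendered config line per swept value.
for value in (5, 10, 15):
    print(render_placeholders(template_line, {"maxGrowth": value}))
```

Whether the value arrives from template_vars or from a sweep parameter, the rendered config line has the same shape. Only the source of the value differs.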

Setup

from joshpy.grid import GridSpec
from joshpy.jobs import JobConfig, SweepConfig, ConfigSweepParameter
from joshpy.sweep import SweepManager
from joshpy.cli import JoshCLI
from joshpy.jar import JarMode
from joshpy.catalog import ProjectCatalog

grid = GridSpec.from_yaml(GRID_YAML)
cli = JoshCLI(josh_jar=JarMode.DEV)
catalog = ProjectCatalog("complete_catalog.duckdb")

ADHOC_REGISTRY = "complete_demo.duckdb"
SWEEP_REGISTRY = "complete_sweep.duckdb"

A quick look at what GridSpec provides:

print(f"Grid: {grid.name}")
Grid: tutorial_variant
print(f"Variant axes: {list(grid.variants.keys())}")
Variant axes: ['pattern']
print(f"Pattern values: {grid.variants['pattern']['values']}")
Pattern values: ['gradient', 'triangle', 'stripes']
print("\nTemplate vars for .josh.j2:")

Template vars for .josh.j2:
for key, value in grid.template_vars.items():
    print(f"  {key}: {value}")
  size_m: 1000
  low_lat: 34.0
  low_lon: -116.4
  high_lat: 33.7
  high_lon: -115.4
  steps: 10

Part 1: Ad-Hoc Iteration

Before committing to a full sweep, try a single configuration to make sure the model runs and produces reasonable output.

Run 1: Gradient pattern, low growth

For a single run with templates, set parameter values in template_vars and pick a specific variant with file_mappings_for():

config_gradient = JobConfig(
    source_template_path=SOURCE_TEMPLATE,
    template_vars={**grid.template_vars, "simulation_name": "Main", "maxGrowth": 5},
    template_path=CONFIG_TEMPLATE,
    file_mappings=grid.file_mappings_for(pattern="gradient"),
    simulation="Main",
    replicates=2,
    label="gradient_low",
)

manager = (
    SweepManager.builder(config_gradient)
    .with_registry(ADHOC_REGISTRY, experiment_name="gradient_low")
    .with_cli(cli)
    .with_label("gradient_low")
    .with_catalog(catalog, experiment_name="adhoc_iteration")
    .build()
)

try:
    results = manager.run()
    print(f"Completed: {results.succeeded} succeeded, {results.failed} failed")
    if results.failed > 0:
        for job, result in results:
            if not result.success:
                print(f"Error: {result.stderr[:300]}")
        raise RuntimeError("Run failed")
    manager.load_results()
finally:
    manager.cleanup()
    manager.close()
Running 1 jobs (2 total replicates)
[1/1] Running (local): {}
  [OK] Completed successfully
Completed: 1 succeeded, 0 failed
Completed: 1 succeeded, 0 failed
Loading patch results from: /tmp/variant_sweep_{run_hash}_{replicate}.csv
  Loaded 34782 rows from variant_sweep_63b30c830241_0.csv
  Loaded 34782 rows from variant_sweep_63b30c830241_1.csv

Results:
  Jobs in sweep: 1
  Jobs with results loaded: 1
  Total rows loaded: 69564
69564
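
The run, check, load, and cleanup block above recurs unchanged for every run in this tutorial. If the repetition becomes tedious, it could be factored into a small helper along these lines. run_and_load is a hypothetical convenience function, not part of joshpy. It relies only on the manager and result attributes already used above.

```python
def run_and_load(manager, what="Run"):
    """Run a manager, report failures, load results, and always clean up.

    Hypothetical helper: it uses only the calls shown in this tutorial
    (run, load_results, cleanup, close) and returns whatever
    load_results() returns.
    """
    try:
        results = manager.run()
        print(f"{what} complete: {results.succeeded} succeeded, {results.failed} failed")
        if results.failed > 0:
            # Surface the first 300 characters of stderr for each failed job.
            for _job, result in results:
                if not result.success:
                    print(f"Error: {result.stderr[:300]}")
            raise RuntimeError(f"{what} failed")
        return manager.load_results()
    finally:
        # Runs on success and on failure alike.
        manager.cleanup()
        manager.close()
```

With such a helper, each run would reduce to building the manager and calling run_and_load(manager).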

Quick look

from joshpy.registry import RunRegistry
from joshpy.diagnostics import SimulationDiagnostics

registry = RunRegistry(ADHOC_REGISTRY)
diag = SimulationDiagnostics(registry)
diag.plot_timeseries("average_height", title="Gradient Pattern, maxGrowth=5")

registry.close()

Run 2: Stripes pattern, high growth

What happens with a different soil pattern and a higher growth cap? Change maxGrowth in template_vars and the pattern passed to file_mappings_for():

config_stripes = JobConfig(
    source_template_path=SOURCE_TEMPLATE,
    template_vars={**grid.template_vars, "simulation_name": "Main", "maxGrowth": 15},
    template_path=CONFIG_TEMPLATE,
    file_mappings=grid.file_mappings_for(pattern="stripes"),
    simulation="Main",
    replicates=2,
    label="stripes_high",
)

manager = (
    SweepManager.builder(config_stripes)
    .with_registry(ADHOC_REGISTRY, experiment_name="stripes_high")
    .with_cli(cli)
    .with_label("stripes_high")
    .with_catalog(catalog, experiment_name="adhoc_iteration")
    .build()
)

try:
    results = manager.run()
    print(f"Completed: {results.succeeded} succeeded, {results.failed} failed")
    if results.failed > 0:
        for job, result in results:
            if not result.success:
                print(f"Error: {result.stderr[:300]}")
        raise RuntimeError("Run failed")
    manager.load_results()
finally:
    manager.cleanup()
    manager.close()
Running 1 jobs (2 total replicates)
[1/1] Running (local): {}
  [OK] Completed successfully
Completed: 1 succeeded, 0 failed
Completed: 1 succeeded, 0 failed
Loading patch results from: /tmp/variant_sweep_{run_hash}_{replicate}.csv
  Loaded 34782 rows from variant_sweep_c7038b2fe88b_0.csv
  Loaded 34782 rows from variant_sweep_c7038b2fe88b_1.csv

Results:
  Jobs in sweep: 1
  Jobs with results loaded: 1
  Total rows loaded: 69564
69564

Run 3: Re-run gradient with tweaked growth (same label)

What if you want to iterate on the gradient run but keep calling it "gradient_low"? Use on_collision="timestamp" to automatically archive the old label:

config_gradient_v2 = JobConfig(
    source_template_path=SOURCE_TEMPLATE,
    template_vars={**grid.template_vars, "simulation_name": "Main", "maxGrowth": 7},
    template_path=CONFIG_TEMPLATE,
    file_mappings=grid.file_mappings_for(pattern="gradient"),
    simulation="Main",
    replicates=2,
    label="gradient_low",
)

manager = (
    SweepManager.builder(config_gradient_v2)
    .with_registry(ADHOC_REGISTRY, experiment_name="gradient_low_v2")
    .with_cli(cli)
    .with_label("gradient_low", on_collision="timestamp")
    .with_catalog(catalog, experiment_name="adhoc_iteration")
    .build()
)

try:
    results = manager.run()
    print(f"Completed: {results.succeeded} succeeded, {results.failed} failed")
    if results.failed > 0:
        for job, result in results:
            if not result.success:
                print(f"Error: {result.stderr[:300]}")
        raise RuntimeError("Run failed")
    manager.load_results()
finally:
    manager.cleanup()
    manager.close()
Running 1 jobs (2 total replicates)
[1/1] Running (local): {}
  [OK] Completed successfully
Completed: 1 succeeded, 0 failed
Completed: 1 succeeded, 0 failed
Loading patch results from: /tmp/variant_sweep_{run_hash}_{replicate}.csv
  Loaded 34782 rows from variant_sweep_8dee7fb8eec4_0.csv
  Loaded 34782 rows from variant_sweep_8dee7fb8eec4_1.csv

Results:
  Jobs in sweep: 1
  Jobs with results loaded: 1
  Total rows loaded: 69564
69564

The old "gradient_low" run is now archived under a timestamped label. The bare label points to the new run:

registry = RunRegistry(ADHOC_REGISTRY)
print("Labels after collision:")
Labels after collision:
for label, run_hash in registry.list_labels():
    print(f"  {label} -> {run_hash}")
  gradient_low -> 8dee7fb8eec4
  gradient_low_20260408_165823 -> 63b30c830241
  stripes_high -> c7038b2fe88b
registry.close()
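
The label bookkeeping can be modelled on a plain dictionary. The sketch below mimics the behaviour observed in the listing above. It is not joshpy's implementation, and it assumes the suffix is the archive time formatted as YYYYMMDD_HHMMSS. assign_label is a hypothetical name.

```python
from datetime import datetime


def assign_label(labels, label, run_hash, now=None):
    """Point label at run_hash, archiving any previous holder under a timestamp.

    Illustrative model of on_collision="timestamp" on a plain dict;
    the suffix format is an assumption inferred from the listing.
    """
    now = now or datetime.now()
    if label in labels and labels[label] != run_hash:
        archived = f"{label}_{now:%Y%m%d_%H%M%S}"
        labels[archived] = labels[label]
    labels[label] = run_hash
    return labels


# Hashes copied from the runs above; the timestamp is fixed to match the listing.
labels = {"gradient_low": "63b30c830241", "stripes_high": "c7038b2fe88b"}
assign_label(labels, "gradient_low", "8dee7fb8eec4", now=datetime(2026, 4, 8, 16, 58, 23))
for name in sorted(labels):
    print(f"  {name} -> {labels[name]}")
```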

Compare

All three labeled runs (the archived timestamped one, the new gradient_low, and stripes_high) appear in the comparison:

registry = RunRegistry(ADHOC_REGISTRY)
diag = SimulationDiagnostics(registry)

diag.plot_comparison(
    "average_height",
    group_by="label",
    title="Ad-Hoc: Gradient Low vs Stripes High",
)
Figure 1: Ad-hoc comparison: gradient (low growth) vs stripes (high growth).

The registry stores the full config content and josh source for every run. You can view or diff them with the inspect module – see Inspecting Runs from the Registry for the full workflow.

The get_file_mappings() method recalls which .jshd data files were used:

registry.get_file_mappings("gradient_low")
{'soil_quality': PosixPath('/workspaces/joshpy/examples/external_data/soil_quality_gradient.jshd')}
registry.get_file_mappings("stripes_high")
{'soil_quality': PosixPath('/workspaces/joshpy/examples/external_data/soil_quality_stripes.jshd')}
registry.close()

Part 2: Promote to Sweep

The model works. Now sweep the full parameter x variant space.

The same templates and GridSpec are reused. The main change is adding a SweepConfig: maxGrowth moves from template_vars to a ConfigSweepParameter, file_mappings is set to grid.file_mappings, and the single-variant file_mappings_for() call gives way to variant_sweep():

config_sweep = JobConfig(
    source_template_path=SOURCE_TEMPLATE,
    template_vars={**grid.template_vars, "simulation_name": "Main"},
    template_path=CONFIG_TEMPLATE,
    file_mappings=grid.file_mappings,
    simulation="Main",
    replicates=2,
    sweep=SweepConfig(
        config_parameters=[
            ConfigSweepParameter(name="maxGrowth", values=[5, 10, 15]),
        ],
        compound_parameters=[
            grid.variant_sweep("pattern"),
        ],
    ),
)

n_jobs = len(config_sweep.sweep.config_parameters[0].values) * len(grid.variants["pattern"]["values"])
print(f"Cartesian product: {n_jobs} jobs x {config_sweep.replicates} replicates = {n_jobs * config_sweep.replicates} runs")
Cartesian product: 9 jobs x 2 replicates = 18 runs
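
The job count is a plain cartesian product of the two axes. As a standalone sketch (independent of joshpy, with the values copied from the config and manifest above), itertools.product enumerates the same nine combinations:

```python
from itertools import product

max_growth_values = [5, 10, 15]
patterns = ["gradient", "triangle", "stripes"]
replicates = 2

# One job per (maxGrowth, pattern) combination.
jobs = [
    {"maxGrowth": max_growth, "pattern": pattern}
    for max_growth, pattern in product(max_growth_values, patterns)
]

for index, job in enumerate(jobs, start=1):
    print(f"[{index}/{len(jobs)}] {job}")
print(f"{len(jobs)} jobs x {replicates} replicates = {len(jobs) * replicates} runs")
```

Adding a second variant axis or another swept parameter multiplies the job count by the number of values on that axis, so it is worth computing the total before launching a large sweep.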
Note: Ad-hoc vs Sweep: What Changed

Aspect            | Ad-hoc (Part 1)                       | Sweep (Part 2)
------------------|---------------------------------------|-----------------------------------------
Config values     | template_vars={"maxGrowth": 5}        | ConfigSweepParameter(values=[5, 10, 15])
Variant selection | file_mappings_for(pattern="gradient") | variant_sweep("pattern")
Jobs              | 1 per run                             | 3 x 3 = 9 (cartesian product)
Templates         | Same .josh.j2 and .jshc.j2            | Same .josh.j2 and .jshc.j2

manager = (
    SweepManager.builder(config_sweep)
    .with_registry(SWEEP_REGISTRY, experiment_name="full_sweep")
    .with_cli(cli)
    .with_catalog(catalog, experiment_name="full_variant_sweep")
    .build()
)

try:
    results = manager.run()
    print(f"Sweep complete: {results.succeeded} succeeded, {results.failed} failed")
    if results.failed > 0:
        for job, result in results:
            if not result.success:
                print(f"Error: {result.stderr[:300]}")
        raise RuntimeError("Sweep failed")
    manager.load_results()
finally:
    manager.cleanup()
    manager.close()
Running 9 jobs (18 total replicates)
[1/9] Running (local): {'maxGrowth': 5, 'pattern': 'gradient', 'soil_quality': 'soil_quality_gradient'}
  [OK] Completed successfully
[2/9] Running (local): {'maxGrowth': 5, 'pattern': 'triangle', 'soil_quality': 'soil_quality_triangle'}
  [OK] Completed successfully
[3/9] Running (local): {'maxGrowth': 5, 'pattern': 'stripes', 'soil_quality': 'soil_quality_stripes'}
  [OK] Completed successfully
[4/9] Running (local): {'maxGrowth': 10, 'pattern': 'gradient', 'soil_quality': 'soil_quality_gradient'}
  [OK] Completed successfully
[5/9] Running (local): {'maxGrowth': 10, 'pattern': 'triangle', 'soil_quality': 'soil_quality_triangle'}
  [OK] Completed successfully
[6/9] Running (local): {'maxGrowth': 10, 'pattern': 'stripes', 'soil_quality': 'soil_quality_stripes'}
  [OK] Completed successfully
[7/9] Running (local): {'maxGrowth': 15, 'pattern': 'gradient', 'soil_quality': 'soil_quality_gradient'}
  [OK] Completed successfully
[8/9] Running (local): {'maxGrowth': 15, 'pattern': 'triangle', 'soil_quality': 'soil_quality_triangle'}
  [OK] Completed successfully
[9/9] Running (local): {'maxGrowth': 15, 'pattern': 'stripes', 'soil_quality': 'soil_quality_stripes'}
  [OK] Completed successfully
Completed: 9 succeeded, 0 failed
Sweep complete: 9 succeeded, 0 failed
Loading patch results from: /tmp/variant_sweep_{run_hash}_{replicate}.csv
  Loaded 34782 rows from variant_sweep_63b30c830241_0.csv
  Loaded 34782 rows from variant_sweep_63b30c830241_1.csv
  Loaded 34782 rows from variant_sweep_e41f328be84d_0.csv
  Loaded 34782 rows from variant_sweep_e41f328be84d_1.csv
  Loaded 34782 rows from variant_sweep_9f582a08bf21_0.csv
  Loaded 34782 rows from variant_sweep_9f582a08bf21_1.csv
  Loaded 34782 rows from variant_sweep_e3703e25c1d9_0.csv
  Loaded 34782 rows from variant_sweep_e3703e25c1d9_1.csv
  Loaded 34782 rows from variant_sweep_7c94c563a360_0.csv
  Loaded 34782 rows from variant_sweep_7c94c563a360_1.csv
  Loaded 34782 rows from variant_sweep_9797a10bb51c_0.csv
  Loaded 34782 rows from variant_sweep_9797a10bb51c_1.csv
  Loaded 34782 rows from variant_sweep_c45d4d6aa44e_0.csv
  Loaded 34782 rows from variant_sweep_c45d4d6aa44e_1.csv
  Loaded 34782 rows from variant_sweep_270a6ddc1b28_0.csv
  Loaded 34782 rows from variant_sweep_270a6ddc1b28_1.csv
  Loaded 34782 rows from variant_sweep_c7038b2fe88b_0.csv
  Loaded 34782 rows from variant_sweep_c7038b2fe88b_1.csv

Results:
  Jobs in sweep: 9
  Jobs with results loaded: 9
  Total rows loaded: 626076
626076
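
A quick sanity check on the row counts reported above. The first part uses only numbers from the log. The second part rests on an assumption, namely that each replicate exports one row per patch per step for steps 0 through 10 inclusive.

```python
rows_per_replicate = 34782  # from the "Loaded ... rows" lines above
jobs, replicates = 9, 2

# Total rows across the whole sweep.
total_rows = rows_per_replicate * jobs * replicates
print(total_rows)

# Assumption: one exported row per patch per step, steps 0..10 inclusive.
steps = 11
patches, remainder = divmod(rows_per_replicate, steps)
print(patches, remainder)
```

If the remainder is zero, the per-replicate row count is consistent with a fixed number of patches reporting at every step. A total that disagrees with the log would suggest a missing or truncated export file.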

Results

sweep_registry = RunRegistry(SWEEP_REGISTRY)
diag = SimulationDiagnostics(sweep_registry)

diag.plot_comparison(
    "average_height",
    group_by="pattern",
    title="Tree Height by Soil Quality Pattern",
)
Figure 2: Tree height across all three soil quality patterns.
diag.plot_comparison(
    "average_height",
    group_by="maxGrowth",
    title="Tree Height by Max Growth Rate",
)
Figure 3: Tree height across growth rate values.

Cross-tabulation

Both the variant label (pattern) and config parameter (maxGrowth) are stored in config_parameters, so you can query the full cartesian product:

result = sweep_registry.query("""
    SELECT
        cp.pattern,
        cp.maxGrowth,
        AVG(cd.average_height) AS mean_final_height
    FROM cell_data cd
    JOIN config_parameters cp ON cd.run_hash = cp.run_hash
    WHERE cd.step = (SELECT MAX(step) FROM cell_data)
    GROUP BY cp.pattern, cp.maxGrowth
    ORDER BY cp.pattern, cp.maxGrowth
""")
result.df()
    pattern  maxGrowth  mean_final_height
0  gradient        5.0        1385.248434
1  gradient       10.0        2769.153093
2  gradient       15.0        4155.361162
3   stripes        5.0        1393.640254
4   stripes       10.0        2784.943696
5   stripes       15.0        4178.172124
6  triangle        5.0        1311.730401
7  triangle       10.0        2626.050737
8  triangle       15.0        3935.615657
sweep_registry.close()
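
To read the table as a grid, the rows can be pivoted into a pattern-by-maxGrowth mapping. The sketch below uses only the standard library and the values printed above. Dividing each height by its maxGrowth is an extra check added here to see whether height scales roughly linearly with the growth cap within a pattern.

```python
# (pattern, maxGrowth, mean_final_height) rows copied from the query result.
rows = [
    ("gradient", 5.0, 1385.248434),
    ("gradient", 10.0, 2769.153093),
    ("gradient", 15.0, 4155.361162),
    ("stripes", 5.0, 1393.640254),
    ("stripes", 10.0, 2784.943696),
    ("stripes", 15.0, 4178.172124),
    ("triangle", 5.0, 1311.730401),
    ("triangle", 10.0, 2626.050737),
    ("triangle", 15.0, 3935.615657),
]

# Pivot into {pattern: {maxGrowth: height}}.
pivot = {}
for pattern, max_growth, height in rows:
    pivot.setdefault(pattern, {})[max_growth] = height

# Height per unit of maxGrowth; similar values within a row suggest linear scaling.
for pattern, by_growth in pivot.items():
    per_unit = {growth: round(height / growth, 1) for growth, height in by_growth.items()}
    print(pattern, per_unit)
```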

For a deeper dive into the variant API (variant_sweep(), file_mappings_for(), multi-axis cross-products), see GridSpec Variant Sweeps.

Part 3: Recall with Catalog

The catalog tracked both the ad-hoc iteration and the full sweep:

experiments = catalog.list_experiments()
print(f"Total experiments: {len(experiments)}\n")
Total experiments: 4
for exp in experiments:
    print(f"  {exp.name}: status={exp.status}, registry={exp.registry_path}")
  full_variant_sweep: status=completed, registry=complete_sweep.duckdb
  adhoc_iteration: status=completed, registry=complete_demo.duckdb
  adhoc_iteration: status=completed, registry=complete_demo.duckdb
  adhoc_iteration: status=completed, registry=complete_demo.duckdb
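
The name adhoc_iteration appears three times, presumably because each of the three ad-hoc managers in Part 1 registered itself under that experiment name. A short sketch for grouping catalog entries by name follows. Experiment here is a stand-in namedtuple carrying only the attributes printed above, not the catalog's actual record type.

```python
from collections import Counter, namedtuple

# Stand-in for catalog records; only name, status, and registry_path are assumed.
Experiment = namedtuple("Experiment", "name status registry_path")

experiments = [
    Experiment("full_variant_sweep", "completed", "complete_sweep.duckdb"),
    Experiment("adhoc_iteration", "completed", "complete_demo.duckdb"),
    Experiment("adhoc_iteration", "completed", "complete_demo.duckdb"),
    Experiment("adhoc_iteration", "completed", "complete_demo.duckdb"),
]

# Count entries per experiment name.
counts = Counter(exp.name for exp in experiments)
for name, count in counts.items():
    print(f"{name}: {count} entries")
```

The same loop should work on the list returned by catalog.list_experiments(), since it reads only the name attribute.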

Next Steps

Cleanup

catalog.close()