GridSpec Variant Sweeps

Automate file sweeps with GridSpec variant declarations

Introduction

In Sweeping Over External Data, you built FileSweepParameter objects by hand – listing each .jshd path explicitly. That works well for a few files, but scales poorly. A real climate project might have 24 monthly files per scenario across three SSP pathways: 72 paths to manage, and they all need to switch together per scenario.

GridSpec variants solve this. You declare variant axes (e.g., pattern, scenario) in your grid.yaml, use template_path placeholders in file entries, and let variant_sweep() generate the entire CompoundSweepParameter automatically.

This tutorial demonstrates:

  1. Writing a grid.yaml with variant declarations
  2. Exploring the variant API (file_mappings, file_mappings_for, variant_sweep)
  3. Combining variant_sweep() with ConfigSweepParameter for a cartesian sweep
  4. Using a .josh.j2 model template with GridSpec.template_vars
  5. Running and analyzing results with SweepManager

Prerequisites:

The Grid Manifest

A grid.yaml with variants has three sections: grid geometry, variant axes, and file inventory. Here we declare a pattern axis with three values – each corresponding to a soil quality .jshd file created in the preprocessing tutorial:

from pathlib import Path

DATA_DIR = Path("../../examples/external_data")

grid_yaml = """\
name: tutorial_variant
grid:
  size_m: 1000
  low: [34.0, -116.4]
  high: [33.7, -115.4]
  steps: 10

variants:
  pattern:
    values: [gradient, triangle, stripes]
    default: gradient

files:
  soil_quality:
    template_path: "soil_quality_{pattern}.jshd"
    units: percent
"""

grid_yaml_path = DATA_DIR / "grid.yaml"
grid_yaml_path.write_text(grid_yaml)
279
print(grid_yaml)
name: tutorial_variant
grid:
  size_m: 1000
  low: [34.0, -116.4]
  high: [33.7, -115.4]
  steps: 10

variants:
  pattern:
    values: [gradient, triangle, stripes]
    default: gradient

files:
  soil_quality:
    template_path: "soil_quality_{pattern}.jshd"
    units: percent

Key rules:

  • template_path contains {variant_name} placeholders. Each variant value resolves to a concrete file path.
  • path (not used here) is for static files that don’t vary by scenario.
  • variants declares each axis with values (allowed values) and default (used when no variant is specified).

See Project Organization for the full YAML specification.
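
The rules above can be checked mechanically before any sweep is built. The following is a minimal sketch, not joshpy code: it assumes the placeholders follow Python's str.format syntax, mirrors the manifest as a plain dict instead of parsing YAML, and uses two hypothetical helpers (placeholders and check_manifest) that are not part of the GridSpec API.

```python
from string import Formatter

# Plain-dict mirror of the grid.yaml above (no YAML parser needed for the sketch).
manifest = {
    "variants": {
        "pattern": {"values": ["gradient", "triangle", "stripes"], "default": "gradient"},
    },
    "files": {
        "soil_quality": {"template_path": "soil_quality_{pattern}.jshd", "units": "percent"},
    },
}


def placeholders(template):
    """Return the set of placeholder names in a str.format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def check_manifest(manifest):
    """Hypothetical helper: collect rule violations as messages instead of raising."""
    problems = []
    variants = manifest.get("variants", {})
    # Rule: each axis default must be one of its allowed values.
    for axis, spec in variants.items():
        if spec["default"] not in spec["values"]:
            problems.append(f"{axis}: default {spec['default']!r} not in values")
    # Rule: every placeholder in a template_path must name a declared axis.
    for name, entry in manifest.get("files", {}).items():
        if "template_path" in entry:
            for axis in sorted(placeholders(entry["template_path"]) - set(variants)):
                problems.append(f"{name}: undeclared axis {axis!r}")
    return problems


print(check_manifest(manifest))
```

An empty list means the manifest satisfies both rules. Whether GridSpec.from_yaml() performs equivalent validation is not shown in this tutorial, so treat the check as an optional pre-flight step.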

Step 1: Explore the Variant API

Load the manifest and inspect what GridSpec provides:

from joshpy.grid import GridSpec

grid = GridSpec.from_yaml(grid_yaml_path)

print(f"Grid: {grid.name}")
Grid: tutorial_variant
print(f"Variant axes: {list(grid.variants.keys())}")
Variant axes: ['pattern']
print(f"Pattern values: {grid.variants['pattern']['values']}")
Pattern values: ['gradient', 'triangle', 'stripes']
print(f"Default: {grid.variants['pattern']['default']}")
Default: gradient

Default file mappings

file_mappings resolves template_path entries using each axis’s default value:

for name, path in grid.file_mappings.items():
    print(f"  {name}: {path.name}")
  soil_quality: soil_quality_gradient.jshd

Scenario-specific mappings

file_mappings_for() resolves the same entries for a specific variant value, passed as a keyword argument:

for pattern in grid.variants["pattern"]["values"]:
    mappings = grid.file_mappings_for(pattern=pattern)
    print(f"  pattern={pattern}: {mappings['soil_quality'].name}")
  pattern=gradient: soil_quality_gradient.jshd
  pattern=triangle: soil_quality_triangle.jshd
  pattern=stripes: soil_quality_stripes.jshd
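
To make the relationship between the two calls concrete, here is a rough sketch of the resolution logic in plain Python. It is an illustration under assumptions, not the joshpy implementation: resolve_mappings is a hypothetical function, the placeholders are assumed to use str.format syntax, and the error handling for unknown axes or values is a guess at reasonable behavior.

```python
from pathlib import Path

variants = {"pattern": {"values": ["gradient", "triangle", "stripes"], "default": "gradient"}}
files = {"soil_quality": {"template_path": "soil_quality_{pattern}.jshd"}}


def resolve_mappings(files, variants, base_dir=Path("."), **chosen):
    """Hypothetical stand-in for file_mappings (no kwargs) and file_mappings_for (kwargs)."""
    # Start from every axis default, then apply the explicit choices.
    values = {axis: spec["default"] for axis, spec in variants.items()}
    for axis, value in chosen.items():
        if axis not in variants:
            raise KeyError(f"unknown variant axis: {axis}")
        if value not in variants[axis]["values"]:
            raise ValueError(f"{value!r} is not an allowed value for {axis}")
        values[axis] = value

    resolved = {}
    for name, entry in files.items():
        if "template_path" in entry:
            # Variant-dependent file: fill the placeholders.
            resolved[name] = base_dir / entry["template_path"].format(**values)
        else:
            # Static file: "path" is used as-is.
            resolved[name] = base_dir / entry["path"]
    return resolved


defaults = resolve_mappings(files, variants)
stripes = resolve_mappings(files, variants, pattern="stripes")
print(defaults["soil_quality"].name, stripes["soil_quality"].name)
```

Calling the function with no keyword arguments corresponds to the default mappings; passing pattern="stripes" corresponds to the scenario-specific lookup.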

Template variables

template_vars provides the grid geometry dict used by .josh.j2 model templates:

for key, value in grid.template_vars.items():
    print(f"  {key}: {value}")
  size_m: 1000
  low_lat: 34.0
  low_lon: -116.4
  high_lat: 33.7
  high_lon: -115.4
  steps: 10
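
Comparing this output with the grid block in grid.yaml suggests that each [lat, lon] corner pair is split into two scalar keys. The sketch below reproduces that mapping in plain Python; flatten_geometry is a hypothetical helper written for illustration, and the key order simply follows the output shown above.

```python
grid_block = {"size_m": 1000, "low": [34.0, -116.4], "high": [33.7, -115.4], "steps": 10}


def flatten_geometry(grid_block):
    """Hypothetical helper: split [lat, lon] corner pairs into scalar template variables."""
    low_lat, low_lon = grid_block["low"]
    high_lat, high_lon = grid_block["high"]
    return {
        "size_m": grid_block["size_m"],
        "low_lat": low_lat,
        "low_lon": low_lon,
        "high_lat": high_lat,
        "high_lon": high_lon,
        "steps": grid_block["steps"],
    }


geometry = flatten_geometry(grid_block)

# Extra template variables can be layered on top with dict unpacking.
model_vars = {**geometry, "simulation_name": "Main"}
print(model_vars)
```

Flat scalar keys are convenient in a template because each one can be referenced directly, without indexing into a list.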

Generate a sweep parameter

variant_sweep() builds a CompoundSweepParameter that zips all template_path files referencing the axis:

param = grid.variant_sweep("pattern")

print(f"Type: {type(param).__name__}")
Type: CompoundSweepParameter
print(f"Name: {param.name}")
Name: pattern
print(f"Labels: {param.labels}")
Labels: ['gradient', 'triangle', 'stripes']
print(f"Inner parameters: {len(param.parameters)}")
Inner parameters: 1
for fp in param.parameters:
    print(f"\n  {fp.name}:")
    for path in fp.paths:
        print(f"    {path.name}")

  soil_quality:
    soil_quality_gradient.jshd
    soil_quality_triangle.jshd
    soil_quality_stripes.jshd

Each label corresponds to one scenario. All inner FileSweepParameter objects are zipped: they switch together per label rather than expanding as a cartesian product of each other.
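
The difference between zipped and cartesian expansion is easiest to see with two file entries. The sketch below uses only the standard library; the second entry (elevation) is hypothetical and is there purely to show the contrast.

```python
from itertools import product

patterns = ["gradient", "triangle", "stripes"]
soil = [f"soil_quality_{p}.jshd" for p in patterns]
elevation = [f"elevation_{p}.jshd" for p in patterns]  # hypothetical second file entry

# Zipped: one combination per label, both files always share the same pattern.
zipped = list(zip(patterns, soil, elevation))

# Cartesian: every soil file paired with every elevation file, including
# mismatched pairs such as (soil_quality_gradient, elevation_stripes).
crossed = list(product(soil, elevation))

print(len(zipped), len(crossed))
for label, soil_path, elevation_path in zipped:
    print(label, soil_path, elevation_path)
```

With three labels the zipped form yields 3 combinations and the cartesian form yields 9. For files that describe the same scenario, the mismatched pairs are not meaningful, which is why the inner parameters are zipped.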

Step 2: Configure the Sweep

Now combine variant_sweep() with a ConfigSweepParameter. The variant axis and config parameter expand as a cartesian product: 3 patterns x 3 growth rates = 9 jobs.

from joshpy.jobs import JobConfig, SweepConfig, ConfigSweepParameter
from joshpy.strategies import CartesianStrategy

SOURCE_TEMPLATE = Path("../../examples/variant_sweep.josh.j2")
CONFIG_TEMPLATE = Path("../../examples/templates/sweep_config.jshc.j2")

config = JobConfig(
    # Model template -- grid geometry injected from GridSpec
    source_template_path=SOURCE_TEMPLATE,
    template_vars={
        **grid.template_vars,
        "simulation_name": "Main",
    },

    # Config template -- maxGrowth swept per job
    template_path=CONFIG_TEMPLATE,

    # Data files -- defaults from GridSpec, overridden per scenario by sweep
    file_mappings=grid.file_mappings,

    simulation="Main",
    replicates=2,

    # Sweep: 3 growth rates x 3 soil patterns = 9 jobs x 2 replicates = 18 runs
    sweep=SweepConfig(
        config_parameters=[
            ConfigSweepParameter(name="maxGrowth", values=[5, 10, 15]),
        ],
        compound_parameters=[
            grid.variant_sweep("pattern"),
        ],
        strategy=CartesianStrategy(),
    ),
)

n_jobs = len(config.sweep.config_parameters[0].values) * len(grid.variants["pattern"]["values"])
print(f"Config parameter: maxGrowth = {config.sweep.config_parameters[0].values}")
Config parameter: maxGrowth = [5, 10, 15]
print(f"Variant axis: pattern = {grid.variants['pattern']['values']}")
Variant axis: pattern = ['gradient', 'triangle', 'stripes']
print(f"Jobs: {n_jobs}")
Jobs: 9
print(f"Replicates per job: {config.replicates}")
Replicates per job: 2
print(f"Total simulation runs: {n_jobs * config.replicates}")
Total simulation runs: 18

Note: How the pieces fit together

Template type     | Source                          | Resolves                                  | What varies
Model (.josh.j2)  | source_template_path            | Grid geometry via template_vars           | Grid bounds, resolution
Config (.jshc.j2) | template_path                   | Parameter values via ConfigSweepParameter | maxGrowth
Data paths        | file_mappings + variant_sweep() | File paths via CompoundSweepParameter     | Soil quality pattern

For more on how these three template types converge, see Templating.

Step 3: Run with SweepManager

from joshpy.sweep import SweepManager
from joshpy.cli import JoshCLI
from joshpy.jar import JarMode

REGISTRY_PATH = "variant_sweep.duckdb"

manager = (
    SweepManager.builder(config)
    .with_registry(REGISTRY_PATH, experiment_name="variant_sweep_tutorial")
    .with_cli(JoshCLI(josh_jar=JarMode.DEV))
    .build()
)

print(f"Session ID: {manager.session_id}")
Session ID: 81a46111-bbef-468a-bd85-451ffb38b67c
print(f"Total jobs: {manager.job_set.total_jobs}")
Total jobs: 9
print(f"Total replicates: {manager.job_set.total_replicates}")
Total replicates: 18

Inspect expanded jobs

Each job has a unique combination of maxGrowth and pattern:

for job in manager.job_set.jobs:
    tags = job.custom_tags
    print(f"Job {job.run_hash[:8]}: maxGrowth={tags.get('maxGrowth', '?')}, pattern={tags.get('pattern', '?')}")
Job 63b30c83: maxGrowth=5, pattern=gradient
Job e41f328b: maxGrowth=5, pattern=triangle
Job 9f582a08: maxGrowth=5, pattern=stripes
Job e3703e25: maxGrowth=10, pattern=gradient
Job 7c94c563: maxGrowth=10, pattern=triangle
Job 9797a10b: maxGrowth=10, pattern=stripes
Job c45d4d6a: maxGrowth=15, pattern=gradient
Job 270a6ddc: maxGrowth=15, pattern=triangle
Job c7038b2f: maxGrowth=15, pattern=stripes
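
The nine combinations and their order can be reproduced with a plain cartesian product. This is a standard-library sketch of the expansion, not the CartesianStrategy implementation. In itertools.product the rightmost iterable varies fastest, which matches the listing above, where pattern cycles within each maxGrowth value.

```python
from itertools import product

growth_rates = [5, 10, 15]
patterns = ["gradient", "triangle", "stripes"]
replicates = 2

# One dict per job: every growth rate paired with every pattern label.
jobs = [{"maxGrowth": g, "pattern": p} for g, p in product(growth_rates, patterns)]

for job in jobs:
    print(job)

print(f"jobs={len(jobs)}, runs={len(jobs) * replicates}")
```

Each additional swept parameter multiplies the job count by the number of values it takes, so the count is worth checking before launching a large sweep.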

Execute

results = manager.run()
Running 9 jobs (18 total replicates)
[1/9] Running (local): {'maxGrowth': 5, 'pattern': 'gradient', 'soil_quality': 'soil_quality_gradient'}
  [OK] Completed successfully
[2/9] Running (local): {'maxGrowth': 5, 'pattern': 'triangle', 'soil_quality': 'soil_quality_triangle'}
  [OK] Completed successfully
[3/9] Running (local): {'maxGrowth': 5, 'pattern': 'stripes', 'soil_quality': 'soil_quality_stripes'}
  [OK] Completed successfully
[4/9] Running (local): {'maxGrowth': 10, 'pattern': 'gradient', 'soil_quality': 'soil_quality_gradient'}
  [OK] Completed successfully
[5/9] Running (local): {'maxGrowth': 10, 'pattern': 'triangle', 'soil_quality': 'soil_quality_triangle'}
  [OK] Completed successfully
[6/9] Running (local): {'maxGrowth': 10, 'pattern': 'stripes', 'soil_quality': 'soil_quality_stripes'}
  [OK] Completed successfully
[7/9] Running (local): {'maxGrowth': 15, 'pattern': 'gradient', 'soil_quality': 'soil_quality_gradient'}
  [OK] Completed successfully
[8/9] Running (local): {'maxGrowth': 15, 'pattern': 'triangle', 'soil_quality': 'soil_quality_triangle'}
  [OK] Completed successfully
[9/9] Running (local): {'maxGrowth': 15, 'pattern': 'stripes', 'soil_quality': 'soil_quality_stripes'}
  [OK] Completed successfully
Completed: 9 succeeded, 0 failed
print("\nSweep complete!")

Sweep complete!
print(f"Succeeded: {results.succeeded}")
Succeeded: 9
print(f"Failed: {results.failed}")
Failed: 0

if results.failed > 0:
    errors = []
    for job, result in results:
        if not result.success:
            error_msg = result.stderr.strip() if result.stderr else "No error message"
            errors.append(f"Job {job.run_hash}: {error_msg[:500]}")
    error_detail = "\n".join(errors)
    raise RuntimeError(f"Sweep failed: {results.failed} job(s) failed\n\n{error_detail}")

Step 4: Load Results

manager.load_results()
Loading patch results from: /tmp/variant_sweep_{run_hash}_{replicate}.csv
  Loaded 34782 rows from variant_sweep_63b30c830241_0.csv
  Loaded 34782 rows from variant_sweep_63b30c830241_1.csv
  Loaded 34782 rows from variant_sweep_e41f328be84d_0.csv
  Loaded 34782 rows from variant_sweep_e41f328be84d_1.csv
  Loaded 34782 rows from variant_sweep_9f582a08bf21_0.csv
  Loaded 34782 rows from variant_sweep_9f582a08bf21_1.csv
  Loaded 34782 rows from variant_sweep_e3703e25c1d9_0.csv
  Loaded 34782 rows from variant_sweep_e3703e25c1d9_1.csv
  Loaded 34782 rows from variant_sweep_7c94c563a360_0.csv
  Loaded 34782 rows from variant_sweep_7c94c563a360_1.csv
  Loaded 34782 rows from variant_sweep_9797a10bb51c_0.csv
  Loaded 34782 rows from variant_sweep_9797a10bb51c_1.csv
  Loaded 34782 rows from variant_sweep_c45d4d6aa44e_0.csv
  Loaded 34782 rows from variant_sweep_c45d4d6aa44e_1.csv
  Loaded 34782 rows from variant_sweep_270a6ddc1b28_0.csv
  Loaded 34782 rows from variant_sweep_270a6ddc1b28_1.csv
  Loaded 34782 rows from variant_sweep_c7038b2fe88b_0.csv
  Loaded 34782 rows from variant_sweep_c7038b2fe88b_1.csv

Results:
  Jobs in sweep: 9
  Jobs with results loaded: 9
  Total rows loaded: 626076
626076
summary = manager.registry.get_data_summary()
print(summary)
Registry Data Summary
========================================
Sessions: 1
Configs:  9
Runs:     9
Rows:     626,076

Variables: average_age, average_height, soil_quality
Entity types: patch
Parameters: maxGrowth, pattern, soil_quality
Steps: 0 - 10
Replicates: 0 - 1
Spatial extent: lon [-116.39, -115.40], lat [33.70, 34.00]
print("Export variables:", manager.registry.list_export_variables())
Export variables: ['average_age', 'average_height', 'soil_quality']
print("Config parameters:", manager.registry.list_config_parameters())
Config parameters: ['maxGrowth', 'pattern', 'soil_quality']
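
The row counts reported above can be cross-checked with a few lines of arithmetic. The numbers come from the load log and the registry summary; reading the per-step figure as one row per patch is an assumption, not something the log states.

```python
rows_per_file = 34782      # "Loaded 34782 rows" per CSV in the load log
jobs = 9
replicates = 2
steps = range(0, 11)       # "Steps: 0 - 10" in the registry summary

files_loaded = jobs * replicates
total_rows = rows_per_file * files_loaded

# Rows per step within one replicate (assumed to be one row per patch).
rows_per_step, remainder = divmod(rows_per_file, len(steps))

print(files_loaded, total_rows, rows_per_step, remainder)
```

The total matches the 626,076 rows in the summary, and the per-file count divides evenly across the 11 steps. A nonzero remainder or a mismatched total would be a quick signal that some replicate wrote a partial file.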

Step 5: Analysis

Compare patterns (variant axis)

from joshpy.diagnostics import SimulationDiagnostics

diag = SimulationDiagnostics(manager.registry)

diag.plot_comparison(
    "average_height",
    group_by="pattern",
    title="Tree Height by Soil Quality Pattern",
)
Figure 1: Tree height trajectories grouped by soil quality pattern

Compare growth rates (config parameter)

diag.plot_comparison(
    "average_height",
    group_by="maxGrowth",
    title="Tree Height by Max Growth Rate",
)
Figure 2: Tree height trajectories grouped by maximum growth rate

Cross-tabulation via SQL

The variant label (pattern) and config parameter (maxGrowth) are both stored in config_parameters, so you can query the full cartesian product:

result = manager.registry.query("""
    SELECT
        cp.pattern,
        cp.maxGrowth,
        MAX(cd.step) as final_step,
        AVG(cd.average_height) as mean_final_height
    FROM cell_data cd
    JOIN config_parameters cp ON cd.run_hash = cp.run_hash
    WHERE cd.step = (SELECT MAX(step) FROM cell_data)
    GROUP BY cp.pattern, cp.maxGrowth
    ORDER BY cp.pattern, cp.maxGrowth
""")
result.df()
    pattern  maxGrowth  final_step  mean_final_height
0  gradient        5.0          10        1385.506064
1  gradient       10.0          10        2769.271851
2  gradient       15.0          10        4155.474653
3   stripes        5.0          10        1391.506514
4   stripes       10.0          10        2780.865357
5   stripes       15.0          10        4173.973905
6  triangle        5.0          10        1311.726731
7  triangle       10.0          10        2623.793176
8  triangle       15.0          10        3939.685104
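
A long-format result like this is often easier to read once pivoted to one row per pattern. The sketch below does that in plain Python, using the values copied from the table above, and also expresses each height relative to the maxGrowth=5 run of the same pattern. With pandas available, a pivot on the DataFrame would serve the same purpose.

```python
# (pattern, maxGrowth, mean_final_height) rows copied from the query result above.
rows = [
    ("gradient", 5.0, 1385.506064),
    ("gradient", 10.0, 2769.271851),
    ("gradient", 15.0, 4155.474653),
    ("stripes", 5.0, 1391.506514),
    ("stripes", 10.0, 2780.865357),
    ("stripes", 15.0, 4173.973905),
    ("triangle", 5.0, 1311.726731),
    ("triangle", 10.0, 2623.793176),
    ("triangle", 15.0, 3939.685104),
]

# Pivot: pattern -> {maxGrowth: mean_final_height}
wide = {}
for pattern, growth, height in rows:
    wide.setdefault(pattern, {})[growth] = height

# Height relative to the maxGrowth=5 run of the same pattern.
relative = {
    pattern: {g: h / by_growth[5.0] for g, h in by_growth.items()}
    for pattern, by_growth in wide.items()
}

for pattern, by_growth in wide.items():
    cells = "  ".join(f"{by_growth[g]:>10.1f}" for g in (5.0, 10.0, 15.0))
    print(f"{pattern:<9}{cells}")
```

For these results the relative heights land close to 2 and 3 for maxGrowth 10 and 15, so final height scales roughly in proportion to maxGrowth within each pattern.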

Summary

Before: manual FileSweepParameter

# From external-data-sweep tutorial -- paths listed by hand
sweep=SweepConfig(
    file_parameters=[
        FileSweepParameter(
            name="soil_quality",
            paths=[
                DATA_DIR / "soil_quality_gradient.jshd",
                DATA_DIR / "soil_quality_triangle.jshd",
                DATA_DIR / "soil_quality_stripes.jshd",
            ],
        ),
    ],
)

After: variant_sweep()

# Paths declared once in grid.yaml, resolved automatically
grid = GridSpec.from_yaml("data/grids/dev_fine/grid.yaml")

sweep=SweepConfig(
    compound_parameters=[
        grid.variant_sweep("pattern"),  # one line
    ],
)

The GridSpec approach scales to any number of template_path files per axis. With 24 monthly climate files per scenario across 3 SSP scenarios (72 paths in total), variant_sweep("scenario") generates 24 zipped FileSweepParameter objects, each holding 3 paths and all switching together per scenario, in a single call.
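
The bookkeeping for that climate example can be sketched in a few lines. Everything named here is hypothetical: the scenario labels, the climate_NN entry names, and the file naming pattern are invented for illustration, and the structure only mimics the shape of the zipped parameters rather than using joshpy classes.

```python
scenarios = ["ssp126", "ssp245", "ssp585"]  # hypothetical labels

# 24 hypothetical file entries, each with a {scenario} placeholder.
templates = {
    f"climate_{i:02d}": f"climate_{i:02d}_{{scenario}}.jshd" for i in range(1, 25)
}

# One path list per entry, ordered like the scenario labels (the zipped layout).
zipped = {
    name: [template.format(scenario=s) for s in scenarios]
    for name, template in templates.items()
}


def paths_for(label):
    """Pick the path at the label's position from every entry at once."""
    index = scenarios.index(label)
    return {name: paths[index] for name, paths in zipped.items()}


total_paths = sum(len(paths) for paths in zipped.values())
print(len(zipped), total_paths, len(paths_for("ssp245")))
```

The sketch yields 24 entries and 72 paths, and selecting one label returns 24 paths that all belong to the same scenario. That is the property listing paths by hand makes easy to break.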

When to use which

Approach                    | Best for
FileSweepParameter (manual) | Ad-hoc sweeps, files without a naming convention
variant_sweep() (GridSpec)  | Structured projects where files follow {axis} naming patterns
variant_sweep(axes=[...])   | Multi-axis cross-products (e.g., scenario x GCM)
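
For the multi-axis case, a template carries one placeholder per axis and the axes are crossed with each other. The sketch below illustrates that idea with the standard library only. The axis values, the file name template, and the label format are all hypothetical, and how variant_sweep(axes=[...]) actually names its labels is not covered in this tutorial.

```python
from itertools import product

# Hypothetical axes and template, for illustration only.
axes = {
    "scenario": ["ssp245", "ssp585"],
    "gcm": ["model_a", "model_b", "model_c"],
}
template = "precip_{scenario}_{gcm}.jshd"

names = list(axes)

# One dict per combination of axis values (cross-product across axes).
combos = [dict(zip(names, values)) for values in product(*axes.values())]

paths = [template.format(**combo) for combo in combos]
labels = ["_".join(combo[name] for name in names) for combo in combos]  # assumed label format

for label, path in zip(labels, paths):
    print(label, path)
```

Two scenarios and three GCMs give six combinations. Files within one combination would still be zipped, so the cross-product applies between axes, not between files.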

Cleanup

import os

manager.cleanup()
manager.close()

for f in [REGISTRY_PATH, f"{REGISTRY_PATH}.wal"]:
    if os.path.exists(f):
        os.remove(f)

if grid_yaml_path.exists():
    grid_yaml_path.unlink()

Learn More