from pathlib import Path
from joshpy.jobs import JobConfig, SweepConfig, SweepParameter
# Paths to source files (optimized for fast tutorial builds)
SOURCE_PATH = Path("../../examples/tutorial_sweep.josh")
TEMPLATE_PATH = Path("../../examples/templates/sweep_config.jshc.j2")
# Parameter sweep: maxGrowth from 10 to 100 in steps of 10
MAX_GROWTH_VALUES = list(range(10, 101, 10))
config = JobConfig(
    template_path=TEMPLATE_PATH,
    source_path=SOURCE_PATH,
    simulation="Main",
    replicates=3,
    sweep=SweepConfig(
        parameters=[
            # maxGrowth is swept - creates one job per value
            SweepParameter(name="maxGrowth", values=MAX_GROWTH_VALUES),
            # Note: initialTreeCount is a static value in the template (not swept)
        ]
    ),
)
Manual Workflow
Introduction
Josh is an ecological simulation runtime for agent-based modeling developed by the Eric and Wendy Schmidt Center for Data Science and Environment. This demo assumes familiarity with Josh’s simulation language and runtime.
joshpy is a Python client that enables:
- Orchestration: Define parameter sweeps, expand job configurations, and execute simulations programmatically
- Tracking: Register runs in a DuckDB-backed registry with session and config tracking
- Data Loading: Import cell-level CSV exports into queryable tables
- Analysis: Query results across parameter values and replicates
- Diagnostics: Quick matplotlib visualizations for simulation sanity checks
- Visualization: Create publication-quality plots with R/ggplot2 integration
This demo walks through a complete parameter sweep workflow using each component directly. This approach provides maximum control and visibility into each step. For a simplified workflow using SweepManager, see SweepManager Workflow.
We vary the maxGrowth parameter from 10 to 100 meters/step across 10 experiments, each with 3 replicates, then load, query, and visualize the results.
Prerequisites
Ensure the Josh JAR is available at jar/joshsim-fat.jar and joshpy is installed:
pip install -e '.[all]'
For visualization, ensure R is installed with the following packages:
install.packages(c("reticulate", "ggplot2", "dplyr"))
Step 1: Setup - Define Parameter Sweep
The first step is to define our experiment configuration. joshpy uses three key abstractions:
- JobConfig: The top-level configuration specifying source files, templates, and sweep parameters
- SweepConfig: Defines which parameters to sweep and their values
- SweepParameter: A single parameter with a name and list of values
The JobExpander will later compute the cartesian product of all parameters, generating one job per combination.
The sweep creates one job per maxGrowth value (10, 20, …, 100). Static values like initialTreeCount are defined directly in the template, not as sweep parameters.
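The expansion described above can be pictured with a short standard-library sketch. It does not call joshpy: `expand_grid` is a hypothetical helper, and the second parameter in the usage example (`seedCount`) is invented purely to show how a two-parameter product grows.

```python
from itertools import product


def expand_grid(parameters):
    """Return one dict per combination of parameter values (Cartesian product)."""
    names = list(parameters)
    value_lists = [parameters[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# The sweep from this tutorial: a single parameter with ten values.
jobs = expand_grid({"maxGrowth": list(range(10, 101, 10))})
print(len(jobs))  # 10 combinations
print(jobs[0])    # {'maxGrowth': 10}

# Hypothetical second parameter, only to show how the product grows.
two_param = expand_grid({"maxGrowth": [10, 20], "seedCount": [1, 2, 3]})
print(len(two_param))  # 6 combinations
```

With a single swept parameter the product is just the list of values, so the tutorial's sweep yields 10 jobs. At 3 replicates each, that is 30 simulation runs.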
Let’s examine the source files. The .josh file defines the simulation, and the .jshc.j2 template provides parameterized configuration:
Josh Source
print(SOURCE_PATH.read_text())
# Tutorial sweep simulation - optimized for fast documentation builds
# Uses larger grid cells (5000m) for faster execution with same extent
start simulation Main
grid.size = 5000 m
grid.low = 33.7 degrees latitude, -115.4 degrees longitude
grid.high = 34.0 degrees latitude, -116.4 degrees longitude
grid.patch = "Default"
steps.low = 0 count
steps.high = 10 count
exportFiles.patch = "file:///tmp/tutorial_sweep_{maxGrowth}_{replicate}.csv"
end simulation
start patch Default
ForeverTree.init = create 10 count of ForeverTree
export.averageAge.step = mean(ForeverTree.age)
export.averageHeight.step = mean(ForeverTree.height)
end patch
start organism ForeverTree
# Static config value - same for all sweep runs (initial tree count)
initialTreeCount.init = config sweep_config.initialTreeCount
# Swept config value - varies across sweep runs
maxGrowth.init = config sweep_config.maxGrowth
age.init = 0 year
age.step = prior.age + 1 year
height.init = 0 meters
# maxGrowth is swept via sweep_config.jshc
height.step = prior.height + sample uniform from 0 meters to maxGrowth
end organism
start unit year
alias years
alias yr
alias yrs
end unit
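To build intuition for what the sweep changes, the ForeverTree height rule can be mimicked in plain Python. This is only an approximation of the model's arithmetic, not how the Josh runtime evaluates it. `simulate_height`, the fixed seed, and the step count of 10 are assumptions made for illustration. Because each step adds a uniform draw between 0 and maxGrowth, the expected height after n steps is n × maxGrowth / 2.

```python
import random


def simulate_height(max_growth, steps, rng):
    """Accumulate one uniform draw in [0, max_growth] per step, like height.step."""
    height = 0.0
    for _ in range(steps):
        height += rng.uniform(0.0, max_growth)
    return height


rng = random.Random(0)  # fixed seed so the sketch is repeatable
steps = 10              # assumed number of growth steps, for illustration only
for max_growth in (10, 50, 100):
    heights = [simulate_height(max_growth, steps, rng) for _ in range(1000)]
    mean_height = sum(heights) / len(heights)
    expected = steps * max_growth / 2
    print(f"maxGrowth={max_growth}: simulated mean {mean_height:.1f}, expected {expected:.1f}")
```

Under these assumptions, average height should scale roughly linearly with maxGrowth, which is the trend the sweep is designed to expose.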
Template Configuration
print(TEMPLATE_PATH.read_text())
# Auto-generated configuration for tutorial_sweep.josh
# Parameter sweep: maxGrowth={{ maxGrowth }}
# =============================================================================
# STATIC CONFIG VALUES
# These values are the same for all runs in the sweep.
# Use static values for constants that don't need to vary across experiments.
# =============================================================================
# Initial tree count per organism (constant across all sweep runs)
initialTreeCount = 10 count
# =============================================================================
# SWEPT CONFIG VALUES
# These values vary across sweep runs. Each unique combination creates a job.
# Use swept values for parameters you want to explore or optimize.
# =============================================================================
# Maximum growth per timestep (meters) - SWEPT via Jinja template
maxGrowth = {{ maxGrowth }} meters
Notice how the configuration template has two types of values:
- Static values (e.g., initialTreeCount = 10 count): Fixed values that don’t use Jinja templating. These are the same for all runs in the sweep.
- Swept values (e.g., maxGrowth = {{ maxGrowth }} meters): Values that use Jinja variables. These vary across sweep runs based on the SweepParameter definitions.
The .josh file references both via config sweep_config.variableName. At runtime, each config variable pulls its value from the generated .jshc file.
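The difference between static and swept values can be illustrated without joshpy. The sketch below uses a small regular expression as a stand-in for Jinja, which is what renders the real .jshc.j2 template. `render` is a hypothetical helper, and the template string is a trimmed copy of the two value lines shown above.

```python
import re

TEMPLATE = """\
initialTreeCount = 10 count
maxGrowth = {{ maxGrowth }} meters
"""


def render(template, values):
    """Replace each {{ name }} placeholder with its value; other text passes through."""
    def substitute(match):
        return str(values[match.group(1)])
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


# One rendered configuration per swept value; the static line never changes.
for max_growth in (10, 20):
    print(render(TEMPLATE, {"maxGrowth": max_growth}))
```

Only the maxGrowth line differs between the rendered outputs, which is why each swept value produces a distinct configuration file and therefore a distinct job.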
Step 2: Initialize Registry and Expand Jobs
The RunRegistry provides experiment tracking backed by DuckDB. It stores:
- Sessions: High-level experiment metadata
- Configs: Rendered configuration files with parameter values and input file hashes
- Runs: Individual execution records with timing and exit codes
The JobExpander takes our JobConfig and generates concrete jobs - one per parameter combination, each with a unique run hash for tracking. The run hash includes the .josh file content, rendered .jshc content, and hashes of any input data files.
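The idea of a content-derived run hash can be sketched with the standard library. The actual hashing scheme used by joshpy is not shown in this tutorial, so everything here is an assumption: `compute_run_hash` is a hypothetical helper, and SHA-256 and the 12-character truncation are illustrative choices.

```python
import hashlib


def compute_run_hash(josh_source, rendered_config, input_file_hashes):
    """Combine the three inputs named above into one short hex digest."""
    digest = hashlib.sha256()
    digest.update(josh_source.encode("utf-8"))
    digest.update(b"\0")  # separator so adjacent fields cannot run together
    digest.update(rendered_config.encode("utf-8"))
    for name in sorted(input_file_hashes):  # sorted for a stable order
        digest.update(b"\0")
        digest.update(f"{name}={input_file_hashes[name]}".encode("utf-8"))
    return digest.hexdigest()[:12]


source = "start simulation Main ... end simulation"  # placeholder text, not the real file
hash_10 = compute_run_hash(source, "maxGrowth = 10 meters", {})
hash_20 = compute_run_hash(source, "maxGrowth = 20 meters", {})
print(hash_10 != hash_20)  # True: a different rendered config gives a different hash
```

Hashing content rather than file names means that editing the .josh file, a config value, or an input data file changes the hash, while an identical rerun maps to the same hash.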
from joshpy.jobs import JobExpander
from joshpy.registry import RunRegistry
# Registry path - saved to disk for use in analysis tutorial
REGISTRY_PATH = "demo_registry.duckdb"
# Create registry (overwrites if exists)
registry = RunRegistry(REGISTRY_PATH)
# Expand config into individual jobs
expander = JobExpander()
job_set = expander.expand(config)
# Create a session to track this experiment
# Note: create_session() takes a JobConfig directly and auto-stores it in metadata
session_id = registry.create_session(
    config=config,
    experiment_name="growth_rate_sweep",
)
# Register each job's configuration in the registry
for job in job_set.jobs:
    registry.register_run(
        session_id=session_id,
        run_hash=job.run_hash,
        josh_path=str(job.source_path),
        config_content=job.config_content,
        file_mappings=job.file_mappings,
        parameters=job.parameters,
    )
Step 3: Run the Simulations
The JoshCLI executes jobs via the Josh command-line interface. The run_sweep() function handles execution and automatically records runs in the registry when registry and session_id are provided.
from joshpy.cli import JoshCLI
from joshpy.jobs import run_sweep
# Create CLI targeting the local fat JAR
cli = JoshCLI(josh_jar=Path("../../jar/joshsim-fat.jar"))
# Run all jobs with automatic tracking
# Note: run_sweep() now automatically manages session status:
# - Sets status to "running" at start
# - Sets status to "completed" if all jobs succeed
# - Sets status to "failed" if any job fails
results = run_sweep(cli, job_set, registry=registry, session_id=session_id)
Running 10 jobs (30 total replicates)
[1/10] Running (local): {'maxGrowth': 10}
[OK] Completed successfully
[2/10] Running (local): {'maxGrowth': 20}
[OK] Completed successfully
[3/10] Running (local): {'maxGrowth': 30}
[OK] Completed successfully
[4/10] Running (local): {'maxGrowth': 40}
[OK] Completed successfully
[5/10] Running (local): {'maxGrowth': 50}
[OK] Completed successfully
[6/10] Running (local): {'maxGrowth': 60}
[OK] Completed successfully
[7/10] Running (local): {'maxGrowth': 70}
[OK] Completed successfully
[8/10] Running (local): {'maxGrowth': 80}
[OK] Completed successfully
[9/10] Running (local): {'maxGrowth': 90}
[OK] Completed successfully
[10/10] Running (local): {'maxGrowth': 100}
[OK] Completed successfully
Completed: 10 succeeded, 0 failed
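The session status rule noted in the code comments above amounts to a small reduction over job outcomes. The following is a minimal sketch of that rule, not joshpy's implementation. `final_session_status` is a hypothetical helper, and treating exit code 0 as success is an assumption.

```python
def final_session_status(exit_codes):
    """Return "completed" if every job exited with 0, otherwise "failed"."""
    return "completed" if all(code == 0 for code in exit_codes) else "failed"


print(final_session_status([0] * 10))   # completed
print(final_session_status([0, 1, 0]))  # failed
```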
Step 4: Load Cell Data from CSVs
Josh exports simulation data to CSV files. The recover_sweep_results() function automatically discovers export paths from the Josh file (using inspect_exports), resolves template variables for each job, and loads results into the registry.
from joshpy.sweep import recover_sweep_results
# Automatically discover and load CSV results
recover_sweep_results(cli, job_set, registry)
Loading patch results from: /tmp/tutorial_sweep_{maxGrowth}_{replicate}.csv
Loaded 1463 rows from tutorial_sweep_10_0.csv
Loaded 1463 rows from tutorial_sweep_10_1.csv
Loaded 1463 rows from tutorial_sweep_10_2.csv
Loaded 1463 rows from tutorial_sweep_20_0.csv
Loaded 1463 rows from tutorial_sweep_20_1.csv
Loaded 1463 rows from tutorial_sweep_20_2.csv
Loaded 1463 rows from tutorial_sweep_30_0.csv
Loaded 1463 rows from tutorial_sweep_30_1.csv
Loaded 1463 rows from tutorial_sweep_30_2.csv
Loaded 1463 rows from tutorial_sweep_40_0.csv
Loaded 1463 rows from tutorial_sweep_40_1.csv
Loaded 1463 rows from tutorial_sweep_40_2.csv
Loaded 1463 rows from tutorial_sweep_50_0.csv
Loaded 1463 rows from tutorial_sweep_50_1.csv
Loaded 1463 rows from tutorial_sweep_50_2.csv
Loaded 1463 rows from tutorial_sweep_60_0.csv
Loaded 1463 rows from tutorial_sweep_60_1.csv
Loaded 1463 rows from tutorial_sweep_60_2.csv
Loaded 1463 rows from tutorial_sweep_70_0.csv
Loaded 1463 rows from tutorial_sweep_70_1.csv
Loaded 1463 rows from tutorial_sweep_70_2.csv
Loaded 1463 rows from tutorial_sweep_80_0.csv
Loaded 1463 rows from tutorial_sweep_80_1.csv
Loaded 1463 rows from tutorial_sweep_80_2.csv
Loaded 1463 rows from tutorial_sweep_90_0.csv
Loaded 1463 rows from tutorial_sweep_90_1.csv
Loaded 1463 rows from tutorial_sweep_90_2.csv
Loaded 1463 rows from tutorial_sweep_100_0.csv
Loaded 1463 rows from tutorial_sweep_100_1.csv
Loaded 1463 rows from tutorial_sweep_100_2.csv
Results:
Jobs in sweep: 10
Jobs with results loaded: 10
Total rows loaded: 43890
43890
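The path resolution step can be approximated with ordinary Python string formatting. This sketch does not show how recover_sweep_results() is implemented. `expected_export_paths` is a hypothetical helper that fills the two placeholders seen in the log above and checks the totals the log reports.

```python
from pathlib import Path

EXPORT_TEMPLATE = "/tmp/tutorial_sweep_{maxGrowth}_{replicate}.csv"


def expected_export_paths(template, max_growth_values, replicates):
    """Fill the {maxGrowth} and {replicate} placeholders for every run."""
    return [
        Path(template.format(maxGrowth=value, replicate=rep))
        for value in max_growth_values
        for rep in range(replicates)
    ]


paths = expected_export_paths(EXPORT_TEMPLATE, range(10, 101, 10), replicates=3)
print(len(paths))         # 30 files: 10 values x 3 replicates
print(paths[0].name)      # tutorial_sweep_10_0.csv
print(len(paths) * 1463)  # 43890, matching "Total rows loaded" in the log
```

A listing like this is also a convenient way to check that every expected CSV exists before loading, for example with `[p for p in paths if not p.exists()]`.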
Step 5: Verify Data Loaded
Let’s verify the data is in the registry and ready for analysis:
# Get summary of loaded data
summary = registry.get_data_summary()
print(summary)
Registry Data Summary
========================================
Sessions: 1
Configs: 10
Runs: 10
Rows: 43,890
Variables: averageAge, averageHeight
Entity types: patch
Parameters: maxGrowth
Steps: 0 - 10
Replicates: 0 - 2
Spatial extent: lon [-115.37, -114.40], lat [33.41, 33.68]
registry.list_export_variables()
['averageAge', 'averageHeight']
registry.list_config_parameters()
['maxGrowth']
Next Steps: Analysis
Now that data is loaded, see Analysis & Visualization Tutorial for comprehensive coverage of:
- Diagnostic Plots (SimulationDiagnostics) - quick matplotlib visualizations
- Custom Queries (DiagnosticQueries) - get pandas DataFrames
- Direct SQL - full DuckDB access for advanced analysis
- R/ggplot2 - publication-quality figures
Quick example:
from joshpy.diagnostics import SimulationDiagnostics
diag = SimulationDiagnostics(registry)
diag.plot_comparison(
    "averageHeight",
    group_by="maxGrowth",
    title="Tree Height by Growth Rate Parameter",
)
Summary
This demo illustrated the manual joshpy workflow using each component directly:
- Define a parameter sweep using JobConfig and SweepConfig
- Expand jobs with JobExpander to get concrete job specifications
- Register jobs with registry.register_run() for tracking
- Execute with run_sweep() for automatic recording
- Load outputs with recover_sweep_results() for automatic path discovery
- Analyze - see Analysis Tutorial for visualization and queries
Key Design Principles:
- Thin CLI wrapper: JoshCLI maps 1:1 to CLI commands
- Thin DuckDB wrapper: Direct registry.conn access for custom SQL
- Convenience helpers: run_sweep() and recover_sweep_results() for common patterns
- Full control: Each step is explicit and visible
Related Tutorials:
- SweepManager Workflow - Simplified orchestration with SweepManager
- Analysis & Visualization - Analysis and visualization (decoupled from orchestration)
Cleanup
job_set.cleanup() # Remove temporary config files
registry.close()
The registry has been saved to demo_registry.duckdb. Run the Analysis Tutorial to explore the results.