cell_data.CellDataLoader

cell_data.CellDataLoader(registry)

Load Josh export CSVs into the cell_data table.

This class parses CSV exports from Josh simulations and inserts them into the DuckDB cell_data table for spatiotemporal analysis.

Attributes

Name Type Description
registry The RunRegistry to load data into.

Examples

>>> from pathlib import Path
>>> registry = RunRegistry("experiment.duckdb")
>>> loader = CellDataLoader(registry)
>>> # Load a CSV export
>>> rows_loaded = loader.load_csv(
...     csv_path=Path("/tmp/output.csv"),
...     run_id="abc123",
...     run_hash="a1b2c3d4e5f6",
... )
>>> print(f"Loaded {rows_loaded} rows")

Methods

Name Description
load_csv Load a CSV export into the cell_data table.
load_csv_batch Load multiple CSV files in batch.

load_csv

cell_data.CellDataLoader.load_csv(
    csv_path,
    run_id,
    run_hash,
    entity_type='patch',
)

Load a CSV export into the cell_data table.

The CSV is expected to have columns for simulation variables plus:

- step: timestep
- replicate: replicate number
- position.x, position.y: grid coordinates
- position.longitude, position.latitude: Earth coordinates (optional)
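A header check along these lines can catch a malformed export before it is passed to load_csv. This is a minimal sketch, not the loader's own validation: the helper name missing_required_columns and the sample CSV text are hypothetical, and it assumes that every listed column not marked optional is required.

```python
import csv
import io

# Assumption: the columns not marked "(optional)" above are the required ones.
REQUIRED_COLUMNS = ("step", "replicate", "position.x", "position.y")


def missing_required_columns(csv_file):
    """Return the required column names absent from the CSV header row."""
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Made-up sample exports: the second one lacks the replicate column.
good = io.StringIO("step,replicate,position.x,position.y,avg.height\n0,0,1,2,3.5\n")
bad = io.StringIO("step,position.x,position.y,avg.height\n0,1,2,3.5\n")

missing_good = missing_required_columns(good)
missing_bad = missing_required_columns(bad)
print(missing_good, missing_bad)
```

For a file on disk, the same helper can be given an open file handle instead of the in-memory StringIO object.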

Variable columns are stored as typed columns (DOUBLE for numeric values, VARCHAR for strings). Column names preserve the original .josh names as quoted identifiers (e.g., 'avg.height' is stored as "avg.height"), so they must be wrapped in double quotes when queried directly through DuckDB.
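The quoting rule can be illustrated with a small sketch. To keep it runnable with only the standard library, sqlite3 stands in for DuckDB here; both accept standard SQL double-quoted identifiers, but the exact error DuckDB reports for an unquoted dotted name may differ. The quote_ident helper, the two-column table, and the values are all hypothetical.

```python
import sqlite3


def quote_ident(name: str) -> str:
    """Wrap a column name in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


# sqlite3 is only a stand-in for DuckDB; the table and rows are made up.
con = sqlite3.connect(":memory:")
con.execute(
    f"CREATE TABLE cell_data (step INTEGER, {quote_ident('avg.height')} REAL)"
)
con.execute("INSERT INTO cell_data VALUES (0, 1.5), (1, 2.5)")

# Quoted: the dotted name is treated as a single column.
query = f"SELECT {quote_ident('avg.height')} FROM cell_data ORDER BY step"
heights = [row[0] for row in con.execute(query)]

# Unquoted: avg.height is parsed as column "height" of a table "avg" and fails.
try:
    con.execute("SELECT avg.height FROM cell_data")
    unquoted_ok = True
except sqlite3.OperationalError:
    unquoted_ok = False

print(heights, unquoted_ok)
```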

The file is read with DuckDB's native CSV reader for fast bulk loading.

Parameters

Name Type Description Default
csv_path Path Path to the CSV file. required
run_id str The run ID this data belongs to. required
run_hash str Run hash for this run. required
entity_type str Type of entity being exported (default: “patch”). 'patch'

Returns

Name Type Description
int Number of rows loaded.

Raises

Name Type Description
FileNotFoundError If csv_path doesn’t exist.
ValueError If the CSV is missing required columns or a column has a type mismatch.

load_csv_batch

cell_data.CellDataLoader.load_csv_batch(csv_paths, entity_type='patch')

Load multiple CSV files in batch.

Parameters

Name Type Description Default
csv_paths list[tuple[Path, str, str]] List of (csv_path, run_id, run_hash) tuples. required
entity_type str Type of entity being exported. 'patch'

Returns

Name Type Description
int Total number of rows loaded across all files.
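The shape of the csv_paths argument and the meaning of the returned total can be sketched as a loop over load_csv. This is an illustration under stated assumptions, not the library's implementation: load_csv_batch may well do something more efficient internally. The load_each function and the StubLoader class are hypothetical, and the paths, run IDs, and row counts are made up so the sketch runs without a database.

```python
from pathlib import Path


def load_each(loader, csv_paths, entity_type="patch"):
    """Call load_csv once per (csv_path, run_id, run_hash) tuple and sum the counts."""
    total = 0
    for csv_path, run_id, run_hash in csv_paths:
        total += loader.load_csv(
            csv_path=csv_path,
            run_id=run_id,
            run_hash=run_hash,
            entity_type=entity_type,
        )
    return total


class StubLoader:
    """Hypothetical stand-in for CellDataLoader; it records calls instead of loading."""

    def __init__(self, rows_per_file):
        self.rows_per_file = rows_per_file
        self.calls = []

    def load_csv(self, csv_path, run_id, run_hash, entity_type="patch"):
        self.calls.append((csv_path, run_id, run_hash, entity_type))
        return self.rows_per_file


# Each entry is a (csv_path, run_id, run_hash) tuple, as in the parameter table.
csv_paths = [
    (Path("/tmp/run_a.csv"), "abc123", "a1b2c3d4e5f6"),
    (Path("/tmp/run_b.csv"), "def456", "a1b2c3d4e5f6"),
]

stub = StubLoader(rows_per_file=10)
total_rows = load_each(stub, csv_paths)
print(f"Loaded {total_rows} rows from {len(stub.calls)} files")
```

With a real loader, the same list of tuples would be passed directly as loader.load_csv_batch(csv_paths).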