cell_data.CellDataLoader
cell_data.CellDataLoader(registry)
Load Josh export CSVs into the cell_data table.
This class handles parsing CSV exports from Josh simulations and inserting them into the DuckDB cell_data table for spatiotemporal analysis.
Attributes
| Name | Type | Description |
|---|---|---|
| registry | | The RunRegistry to load data into. |
Examples
>>> from pathlib import Path
>>> registry = RunRegistry("experiment.duckdb")
>>> loader = CellDataLoader(registry)
>>> # Load a CSV export
>>> rows_loaded = loader.load_csv(
... csv_path=Path("/tmp/output.csv"),
... run_id="abc123",
... run_hash="a1b2c3d4e5f6",
... )
>>> print(f"Loaded {rows_loaded} rows")
Methods
| Name | Description |
|---|---|
| load_csv | Load a CSV export into the cell_data table. |
| load_csv_batch | Load multiple CSV files in batch. |
load_csv
cell_data.CellDataLoader.load_csv(
csv_path,
run_id,
run_hash,
entity_type='patch',
)
Load a CSV export into the cell_data table.
The CSV is expected to have columns for simulation variables plus:
- step: timestep
- replicate: replicate number
- position.x, position.y: grid coordinates
- position.longitude, position.latitude: Earth coordinates (optional)
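As a rough illustration of the expected layout, the sketch below checks a CSV header before loading. The helper `check_header` is hypothetical and not part of `CellDataLoader`, and treating every column not marked "(optional)" as required is an assumption of this sketch; the loader's own validation may differ.

```python
import csv
import io

# Column names come from the description above. Splitting them into
# "required" and "optional" groups is an assumption of this sketch.
REQUIRED_COLUMNS = {"step", "replicate", "position.x", "position.y"}
OPTIONAL_COLUMNS = {"position.longitude", "position.latitude"}


def check_header(csv_text: str) -> dict[str, list[str]]:
    """Classify the header row of a CSV export (hypothetical helper)."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    names = [name.strip() for name in header]
    known = REQUIRED_COLUMNS | OPTIONAL_COLUMNS
    return {
        # Required columns that do not appear in the header.
        "missing": sorted(REQUIRED_COLUMNS - set(names)),
        # Everything else is treated as a simulation variable column.
        "variables": [name for name in names if name not in known],
    }


sample = "step,replicate,position.x,position.y,avg.height\n0,0,1,2,3.5\n"
report = check_header(sample)
```

Running a check like this before calling `load_csv` can surface a missing column earlier than the `ValueError` the loader documents.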
Variable columns are stored as typed columns (DOUBLE for numeric values, VARCHAR for strings). Column names preserve the original .josh names as quoted identifiers (e.g., avg.height stays as "avg.height"), so they must be wrapped in double quotes when queried directly in DuckDB.
Uses DuckDB’s native CSV reader for optimal performance.
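The quoting requirement for dotted column names can be sketched as follows. Both helpers are hypothetical, and the query assumes that cell_data exposes `step` and `run_hash` as ordinary columns, which this page does not confirm. Doubling embedded double quotes is the standard SQL escaping rule for quoted identifiers.

```python
def quote_identifier(name: str) -> str:
    """Wrap a column name in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_mean_query(variable: str) -> str:
    """Build a per-step average query for one variable column (illustrative)."""
    column = quote_identifier(variable)
    # Assumption: cell_data has `step` and `run_hash` columns.
    return (
        f"SELECT step, AVG({column}) AS mean_value "
        "FROM cell_data WHERE run_hash = ? GROUP BY step ORDER BY step"
    )


query = build_mean_query("avg.height")
```

Without the quotes, a name such as avg.height would be parsed as a reference to column `height` of a table or struct named `avg`, which is why the quoted form is needed.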
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| csv_path | Path | Path to the CSV file. | required |
| run_id | str | The run ID this data belongs to. | required |
| run_hash | str | Run hash for this run. | required |
| entity_type | str | Type of entity being exported (default: “patch”). | 'patch' |
Returns
| Name | Type | Description |
|---|---|---|
| | int | Number of rows loaded. |
Raises
| Name | Type | Description |
|---|---|---|
| | FileNotFoundError | If csv_path doesn’t exist. |
| | ValueError | If the CSV is missing required columns or a column has a type mismatch. |
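One way a caller might handle the two documented exceptions is sketched below. The wrapper `try_load` is hypothetical, and whether to skip a failing file or abort the whole import is a choice left to the caller. It works with any object that exposes a `load_csv` method with the signature shown above.

```python
from pathlib import Path
from typing import Optional


def try_load(loader, csv_path: Path, run_id: str, run_hash: str) -> Optional[int]:
    """Call loader.load_csv; return None instead of raising on documented errors."""
    try:
        return loader.load_csv(csv_path=csv_path, run_id=run_id, run_hash=run_hash)
    except FileNotFoundError:
        # Documented case: csv_path does not exist.
        print(f"skipped {csv_path}: file not found")
    except ValueError as exc:
        # Documented case: missing required columns or a type mismatch.
        print(f"skipped {csv_path}: {exc}")
    return None
```

Any other exception propagates unchanged, so unexpected failures are not silently swallowed.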
load_csv_batch
cell_data.CellDataLoader.load_csv_batch(csv_paths, entity_type='patch')
Load multiple CSV files in batch.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| csv_paths | list[tuple[Path, str, str]] | List of (csv_path, run_id, run_hash) tuples. | required |
| entity_type | str | Type of entity being exported. | 'patch' |
Returns
| Name | Type | Description |
|---|---|---|
| | int | Total number of rows loaded across all files. |
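A minimal sketch of assembling the `csv_paths` argument is shown below. The helper `build_batch` and the `<run_id>.csv` file naming are hypothetical and used only for illustration. The run ID and hash are the same placeholder values as in the class example.

```python
from pathlib import Path


def build_batch(export_dir: Path, runs: dict[str, str]) -> list[tuple[Path, str, str]]:
    """Pair each run with its CSV file.

    `runs` maps run_id -> run_hash. The "<run_id>.csv" naming is a
    hypothetical convention used only in this sketch.
    """
    return [
        (export_dir / f"{run_id}.csv", run_id, run_hash)
        for run_id, run_hash in sorted(runs.items())
    ]


batch = build_batch(Path("/tmp/exports"), {"abc123": "a1b2c3d4e5f6"})
# With a CellDataLoader instance named `loader`:
# total_rows = loader.load_csv_batch(batch)
```

Each tuple follows the documented `(csv_path, run_id, run_hash)` order, so the list can be passed to `load_csv_batch` as is.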