nirs4all.pipeline.storage.io module
Simulation IO Manager - Save and manage simulation outputs
Provides organized storage for pipeline simulation results with dataset and pipeline-based folder structure management.
REFACTORED: Now uses content-addressed artifact storage via serializer. Delegates to focused classes: PipelineWriter, WorkspaceExporter, PredictionResolver.
- class nirs4all.pipeline.storage.io.SimulationSaver(base_path: str | Path | None = None, save_artifacts: bool = True, save_charts: bool = True)
Bases: object
Manages saving simulation results with flat pipeline structure.
Acts as a facade coordinating:
- PipelineWriter: File I/O within pipeline directories
- WorkspaceExporter: Exporting best results
- PredictionResolver: Resolving prediction targets
Works with ManifestManager to create: base_path/NNNN_hash/files
- cleanup(confirm: bool = False) -> None
Remove the current simulation directory and all its contents.
- Parameters:
confirm – Must be True to actually delete files
- Raises:
RuntimeError – If not registered or confirm is False
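The confirm guard described above can be illustrated with a minimal sketch. It is not the library's implementation: `cleanup_dir` is a hypothetical stand-in function, and the error messages are assumptions. It only shows the documented behaviour that deletion is refused unless a directory is registered and `confirm=True` is passed.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def cleanup_dir(current_dir: Optional[Path], confirm: bool = False) -> None:
    """Hypothetical stand-in for a confirm-guarded cleanup."""
    if current_dir is None:
        # Mirrors the documented "not registered" failure case.
        raise RuntimeError("No pipeline directory registered")
    if not confirm:
        # Mirrors the documented "confirm is False" failure case.
        raise RuntimeError("Pass confirm=True to delete files")
    shutil.rmtree(current_dir)


# Usage: the first call is refused, the second deletes the directory.
demo_root = Path(tempfile.mkdtemp())
demo_dir = demo_root / "0001_abc123"
demo_dir.mkdir()
try:
    cleanup_dir(demo_dir)
except RuntimeError as exc:
    print(f"refused: {exc}")
cleanup_dir(demo_dir, confirm=True)
shutil.rmtree(demo_root)
```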
- export_best_for_dataset(dataset_name: str, workspace_path: Path, runs_dir: Path, mode: str = 'predictions') -> Path | None
Export best results for a dataset to exports/ folder.
Delegates to WorkspaceExporter.
Creates exports/{dataset_name}/ with best predictions, pipeline config, and charts. Files are renamed to include run date for tracking.
- Parameters:
dataset_name – Dataset name (matches global prediction JSON filename)
workspace_path – Workspace root path (unused, maintained for compatibility)
runs_dir – Runs directory path
mode – Export mode - “predictions”, “template”, “trained”, or “full”
- Returns:
Path to export directory, or None if no predictions found
- export_best_prediction(predictions_file: Path, exports_dir: Path, dataset_name: str, run_date: str, pipeline_id: str, custom_name: str = None) -> Path
Export predictions CSV to best_predictions/ folder with optional custom name.
Delegates to WorkspaceExporter.
- Parameters:
predictions_file – Path to predictions.csv
exports_dir – Workspace exports directory (unused, maintained for compatibility)
dataset_name – Metadata for naming
run_date – Metadata for naming
pipeline_id – Metadata for naming
custom_name – Optional custom name for export
- Returns:
Path to exported CSV
- export_pipeline_full(pipeline_dir: Path, exports_dir: Path, dataset_name: str, run_date: str, custom_name: str = None) -> Path
Export full pipeline results to flat structure with optional custom name.
Delegates to WorkspaceExporter.
- Parameters:
pipeline_dir – Path to pipeline (NNNN_hash/ or NNNN_pipelinename_hash/)
exports_dir – Workspace exports directory (unused, maintained for compatibility)
dataset_name – Dataset name
run_date – Run date (YYYYMMDD)
custom_name – Optional custom name for export
- Returns:
Path to exported directory
- property exporter: WorkspaceExporter
Get or create WorkspaceExporter instance.
- get_predict_targets(prediction_obj: Dict[str, Any] | str)
Get target variable names for prediction from a prediction object.
Delegates to PredictionResolver.
- list_files() -> Dict[str, List[str]]
List all saved files in the current pipeline.
- Returns:
Dictionary with file lists
- persist_artifact(step_number: int, name: str, obj: Any, format_hint: str | None = None, branch_id: int | None = None, branch_name: str | None = None) -> Dict[str, Any]
Persist artifact using the serializer with content-addressed storage.
NOTE: This is for internal binary artifacts (models, transformers, etc.) For human-readable outputs (charts, reports), use save_output() instead.
- Parameters:
step_number – Pipeline step number
name – Artifact name (for reference)
obj – Object to persist
format_hint – Optional format hint for serializer
branch_id – Optional branch ID for pipeline branching
branch_name – Optional human-readable branch name
- Returns:
Artifact metadata dictionary (empty if save_artifacts=False)
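To make "content-addressed storage" concrete, here is a minimal sketch of the general technique using only the standard library. It is not the nirs4all serializer: `persist_by_content`, the `.pkl` suffix, the use of SHA-256 and the metadata keys are all assumptions chosen for illustration. The idea is that the file name is derived from a hash of the serialized bytes, so persisting identical content twice yields a single file.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path
from typing import Any, Dict


def persist_by_content(root: Path, name: str, obj: Any) -> Dict[str, Any]:
    """Hypothetical content-addressed writer: file name = hash of the payload."""
    payload = pickle.dumps(obj, protocol=4)
    digest = hashlib.sha256(payload).hexdigest()
    path = root / f"{digest}.pkl"
    if not path.exists():
        # Identical content maps to the same path, so it is written only once.
        path.write_bytes(payload)
    # Metadata keys are illustrative, not the library's schema.
    return {"name": name, "hash": digest, "path": str(path), "size": len(payload)}


# Usage: two names pointing at the same content share one stored file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    scaler_params = {"mean": [0.1, 0.2], "scale": [1.0, 2.0]}
    first = persist_by_content(root, "scaler_step1", scaler_params)
    second = persist_by_content(root, "scaler_step3", scaler_params)
    print(first["hash"] == second["hash"], len(list(root.iterdir())))
```

The metadata dictionary keeps the human-readable name separate from the storage key, which is why a reference name can differ while the stored file is shared.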
- register(pipeline_id: str) -> Path
Register a pipeline ID and set current directory.
- Parameters:
pipeline_id – Pipeline ID from ManifestManager (e.g., “0001_abc123”)
- Returns:
Path to the pipeline directory
- register_workspace(workspace_root: Path, dataset_name: str, pipeline_hash: str, run_name: str = None, pipeline_name: str = None) -> Path
Register pipeline in workspace structure with optional custom names.
Creates:
- Without custom names: workspace_root/runs/{dataset}/NNNN_{hash}/
- With run_name: workspace_root/runs/{dataset}_{runname}/NNNN_{hash}/
- With pipeline_name: workspace_root/runs/{dataset}/NNNN_{pipelinename}_{hash}/
- With both: workspace_root/runs/{dataset}_{runname}/NNNN_{pipelinename}_{hash}/
All pipelines for a dataset are stored in the same folder regardless of date.
- Returns:
Full path to pipeline directory
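The four directory layouts listed for register_workspace can be sketched as a pure path-building function. This is a hypothetical helper, not the library's code: `workspace_pipeline_dir` and its `sequence` parameter are assumptions, as is the reading of NNNN as a zero-padded four-digit counter (suggested by the "0001_abc123" example). The sketch creates no directories.

```python
from pathlib import Path
from typing import Optional


def workspace_pipeline_dir(
    workspace_root: Path,
    dataset_name: str,
    pipeline_hash: str,
    sequence: int,
    run_name: Optional[str] = None,
    pipeline_name: Optional[str] = None,
) -> Path:
    """Hypothetical helper reproducing the documented folder naming."""
    # Run folder: {dataset} or {dataset}_{runname}
    run_folder = dataset_name if run_name is None else f"{dataset_name}_{run_name}"
    # Pipeline folder: NNNN_{hash} or NNNN_{pipelinename}_{hash}
    prefix = f"{sequence:04d}"  # assumption: NNNN is a zero-padded counter
    if pipeline_name is None:
        leaf = f"{prefix}_{pipeline_hash}"
    else:
        leaf = f"{prefix}_{pipeline_name}_{pipeline_hash}"
    return Path(workspace_root) / "runs" / run_folder / leaf


# Usage: both custom names supplied.
print(
    workspace_pipeline_dir(
        Path("workspace"), "corn", "abc123", 1, run_name="trial", pipeline_name="pls"
    ).as_posix()
)
```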
- property resolver: TargetResolver
Get or create PredictionResolver instance.
- save_file(filename: str, content: str, overwrite: bool = True, encoding: str = 'utf-8', warn_on_overwrite: bool = True) -> Path
Save a text file to the pipeline directory.
Delegates to PipelineWriter.
- save_json(filename: str, data: Any, overwrite: bool = True, indent: int | None = 2) -> Path
Save data as JSON file.
Delegates to PipelineWriter.
- save_output(step_number: int, name: str, data: bytes | str, extension: str = '.png') -> Path | None
Save a human-readable output file (chart, report, etc.).
Delegates to PipelineWriter.
- Parameters:
step_number – Pipeline step number (unused, kept for compatibility)
name – Output name (e.g., “2D_Chart”)
data – Binary or text data to save
extension – File extension (e.g., “.png”, “.csv”, “.txt”)
- Returns:
Path to saved file, or None if save_charts=False
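The documented contract (bytes or text in, a path or None out) could look roughly like the sketch below. It is an illustration only: `save_output_file` and its `enabled` flag are hypothetical stand-ins for the method and the save_charts setting, and naming the file `name + extension` is an assumption rather than the library's confirmed scheme.

```python
import tempfile
from pathlib import Path
from typing import Optional, Union


def save_output_file(
    directory: Path,
    name: str,
    data: Union[bytes, str],
    extension: str = ".png",
    enabled: bool = True,
) -> Optional[Path]:
    """Hypothetical writer for human-readable outputs."""
    if not enabled:
        # Analogous to the documented None return when saving is disabled.
        return None
    path = directory / f"{name}{extension}"  # assumed naming scheme
    if isinstance(data, bytes):
        path.write_bytes(data)
    else:
        path.write_text(data, encoding="utf-8")
    return path


# Usage: write a text report, then show the disabled case.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_output_file(Path(tmp), "report", "rmse,0.42\n", extension=".csv")
    skipped = save_output_file(Path(tmp), "2D_Chart", b"\x89PNG", enabled=False)
    print(saved.name, skipped)
```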
- property writer: PipelineWriter
Get or create PipelineWriter instance.
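The "get or create" wording on the exporter, resolver and writer properties suggests lazy initialization: the helper is built on first access and reused afterwards. A minimal sketch of that pattern follows, with hypothetical stand-in classes (`StubWriter`, `LazyFacade`) rather than the real PipelineWriter or SimulationSaver.

```python
class StubWriter:
    """Hypothetical stand-in for a delegate such as a pipeline writer."""

    instances_created = 0

    def __init__(self) -> None:
        StubWriter.instances_created += 1


class LazyFacade:
    """Hypothetical facade that builds its delegate on first access."""

    def __init__(self) -> None:
        self._writer = None  # nothing is constructed up front

    @property
    def writer(self) -> StubWriter:
        if self._writer is None:
            # Created once, on first access, then cached on the instance.
            self._writer = StubWriter()
        return self._writer


# Usage: repeated access returns the same cached object.
facade = LazyFacade()
print(facade.writer is facade.writer)
```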