nirs4all.pipeline.run module

class nirs4all.pipeline.run.Run(id: str = <factory>, name: str = '', templates: List[TemplateInfo] = <factory>, datasets: List[DatasetInfo] = <factory>, status: RunStatus = RunStatus.QUEUED, config: RunConfig = <factory>, created_at: str = <factory>, started_at: str | None = None, completed_at: str | None = None, summary: RunSummary = <factory>, checkpoints: List[Dict[str, Any]] = <factory>)

Bases: object

Represents a complete experiment session.

A Run combines pipeline templates with datasets and generates results for every combination of expanded pipeline configurations and datasets.

id

Unique identifier for the run

Type: str

name

Human-readable name

Type: str

templates

List of pipeline templates

Type: List[nirs4all.pipeline.run.TemplateInfo]

datasets

List of datasets

Type: List[nirs4all.pipeline.run.DatasetInfo]

status

Current execution status

Type: nirs4all.pipeline.run.RunStatus

config

Run configuration

Type: nirs4all.pipeline.run.RunConfig

created_at

Creation timestamp

Type: str

started_at

Execution start timestamp

Type: str | None

completed_at

Completion timestamp

Type: str | None

summary

Post-execution summary

Type: nirs4all.pipeline.run.RunSummary

add_checkpoint(result_id: str, metadata: Dict[str, Any] | None = None) → None: Record a completed result as a checkpoint.

can_transition_to(new_status: RunStatus) → bool: Check whether a transition to the new status is valid.

checkpoints: List[Dict[str, Any]]

completed_at: str | None = None

config: RunConfig

created_at: str

datasets: List[DatasetInfo]

classmethod from_dict(data: Dict[str, Any]) → Run: Create a run from a dictionary.

id: str

name: str = ''

started_at: str | None = None

status: RunStatus = 'queued'

summary: RunSummary

templates: List[TemplateInfo]

to_dict() → Dict[str, Any]: Convert the run to a dictionary for serialization.

property total_pipeline_configs: int: Total number of expanded pipeline configurations.

property total_results_expected: int: Expected number of results (configs × datasets).
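The relationship between the two properties above can be sketched as follows. This is an illustrative stand-in, not the library's code: `TemplateStub` and both helper functions are hypothetical, and the sketch assumes that each template contributes `expansion_count` configurations to the total.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TemplateStub:
    """Hypothetical stand-in for TemplateInfo with only the fields used here."""
    id: str
    name: str
    expansion_count: int = 1


def total_pipeline_configs(templates: List[TemplateStub]) -> int:
    # Assumption: the total is the sum of each template's expansion_count.
    return sum(t.expansion_count for t in templates)


def total_results_expected(templates: List[TemplateStub], n_datasets: int) -> int:
    # configs × datasets, as stated in the property description.
    return total_pipeline_configs(templates) * n_datasets


templates = [
    TemplateStub("t1", "pls_sweep", expansion_count=12),
    TemplateStub("t2", "baseline"),
]
print(total_results_expected(templates, n_datasets=3))  # (12 + 1) * 3 = 39
```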

transition_to(new_status: RunStatus) → None

Transition to a new status.

Raises: ValueError – If the transition is not valid
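The pairing of `can_transition_to` and `transition_to` suggests a guarded state machine. The sketch below shows one way such a guard could work. The `Status` enum mirrors the documented `RunStatus` values, but the `ALLOWED` table and the `RunStub` class are assumptions made for illustration; the transitions the library actually permits are not listed on this page.

```python
from enum import Enum


class Status(Enum):
    """Stand-in mirroring the documented RunStatus values."""
    QUEUED = "queued"
    RUNNING = "running"
    PAUSED = "paused"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# Hypothetical transition table; the real rules may differ.
ALLOWED = {
    Status.QUEUED: {Status.RUNNING, Status.CANCELLED},
    Status.RUNNING: {Status.PAUSED, Status.COMPLETED, Status.FAILED, Status.CANCELLED},
    Status.PAUSED: {Status.RUNNING, Status.CANCELLED},
    Status.COMPLETED: set(),
    Status.FAILED: set(),
    Status.CANCELLED: set(),
}


class RunStub:
    """Hypothetical minimal run object that tracks only its status."""

    def __init__(self) -> None:
        self.status = Status.QUEUED

    def can_transition_to(self, new_status: Status) -> bool:
        return new_status in ALLOWED[self.status]

    def transition_to(self, new_status: Status) -> None:
        # Reject the change before mutating state, as the Raises note implies.
        if not self.can_transition_to(new_status):
            raise ValueError(
                f"Invalid transition: {self.status.value} -> {new_status.value}"
            )
        self.status = new_status


run = RunStub()
run.transition_to(Status.RUNNING)
print(run.can_transition_to(Status.QUEUED))  # False under the table assumed above
```

Checking with `can_transition_to` first lets calling code avoid the exception when a transition is optional, while `transition_to` stays strict.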

class nirs4all.pipeline.run.RunConfig(cv_folds: int = 5, cv_strategy: str = 'kfold', random_state: int | None = 42, metric: str = 'r2', save_predictions: bool = True, save_models: bool = True)

Bases: object

Configuration for a run.

cv_folds: int = 5

cv_strategy: str = 'kfold'

metric: str = 'r2'

random_state: int | None = 42

save_models: bool = True

save_predictions: bool = True
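To make the defaults concrete, here is a small sketch that overrides two fields and round-trips the configuration through a plain dictionary. `RunConfigStub` is a local stand-in that copies the field names and defaults from the signature above, so the example runs without the library. Whether the real `RunConfig` is a dataclass that works with `dataclasses.asdict` is an assumption.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class RunConfigStub:
    """Stand-in copying the documented RunConfig fields and defaults."""
    cv_folds: int = 5
    cv_strategy: str = "kfold"
    random_state: Optional[int] = 42
    metric: str = "r2"
    save_predictions: bool = True
    save_models: bool = True


# Override only what differs from the defaults.
config = RunConfigStub(cv_folds=10, metric="rmse")

# Round-trip through a plain dict, e.g. before writing JSON.
payload = asdict(config)
restored = RunConfigStub(**payload)
print(restored == config)  # True
```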

class nirs4all.pipeline.run.RunStatus(value)

Bases: Enum

Run execution status.

CANCELLED = 'cancelled'

COMPLETED = 'completed'

FAILED = 'failed'

PAUSED = 'paused'

QUEUED = 'queued'

RUNNING = 'running'

class nirs4all.pipeline.run.RunSummary(total_results: int = 0, completed_results: int = 0, failed_results: int = 0, best_result: Dict[str, Any] | None = None)

Bases: object

Summary of run results.

best_result: Dict[str, Any] | None = None

completed_results: int = 0

failed_results: int = 0

total_results: int = 0

class nirs4all.pipeline.run.TemplateInfo(id: str, name: str, file_path: str | None = None, expansion_count: int = 1, description: str | None = None)

Bases: object

Information about a pipeline template in a run.

description: str | None = None

expansion_count: int = 1

file_path: str | None = None

id: str

name: str

nirs4all.pipeline.run.generate_run_id(name: str = '') → str

Generate a unique run ID.

Format: YYYY-MM-DD_<Name>_<hash>

Parameters: name – Optional descriptive name
Returns: Unique run ID string
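An ID in the documented `YYYY-MM-DD_<Name>_<hash>` shape could be produced along these lines. Everything except the overall format is an assumption: `sketch_run_id` is a hypothetical helper, the 8-character SHA-1 digest of a random UUID stands in for whatever hash the library uses, and the handling of whitespace and of an empty name is invented for the example.

```python
import hashlib
import uuid
from datetime import date
from typing import Optional


def sketch_run_id(name: str = "", today: Optional[date] = None) -> str:
    """Build an ID shaped like YYYY-MM-DD_<Name>_<hash> (illustrative only)."""
    day = (today or date.today()).isoformat()  # YYYY-MM-DD
    # Assumption: collapse whitespace to hyphens; fall back to "run" if empty.
    label = "-".join(name.split()) or "run"
    # Assumption: a short digest of a random UUID provides uniqueness.
    digest = hashlib.sha1(uuid.uuid4().bytes).hexdigest()[:8]
    return f"{day}_{label}_{digest}"


print(sketch_run_id("PLS sweep", today=date(2024, 1, 31)))
```

The date prefix keeps IDs sortable by creation day, and the random suffix keeps two runs with the same name on the same day distinct.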

nirs4all.pipeline.run.get_metric_info(metric_name: str) → Dict[str, Any]

Get metadata for a metric.

Parameters: metric_name – Name of the metric (e.g., 'r2', 'rmse', 'accuracy')
Returns: Dict with 'higher_is_better', 'optimal', and 'range' keys

nirs4all.pipeline.run.is_better_score(score: float, best_score: float, metric: str) → bool

Compare two scores and determine if the new score is better.

Parameters:

score – New score to compare
best_score – Current best score
metric – Metric name to determine comparison direction

Returns:

True if score is better than best_score
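The comparison direction depends on the metric: a higher R² is better, while a lower RMSE is better. The sketch below shows how a metadata lookup can drive that decision. The `METRICS` table, its values, and both function names are assumptions for illustration. Only the three dictionary keys come from the documentation above, and raising `ValueError` for an unknown metric is a choice made here, not documented behavior.

```python
import math
from typing import Any, Dict

# Hypothetical metadata table using the three documented keys.
METRICS: Dict[str, Dict[str, Any]] = {
    "r2": {"higher_is_better": True, "optimal": 1.0, "range": (-math.inf, 1.0)},
    "rmse": {"higher_is_better": False, "optimal": 0.0, "range": (0.0, math.inf)},
    "accuracy": {"higher_is_better": True, "optimal": 1.0, "range": (0.0, 1.0)},
}


def metric_info(metric_name: str) -> Dict[str, Any]:
    # Assumption: unknown metrics raise; the library may fall back to a default.
    try:
        return METRICS[metric_name.lower()]
    except KeyError:
        raise ValueError(f"Unknown metric: {metric_name!r}") from None


def better(score: float, best_score: float, metric: str) -> bool:
    """Return True only if score strictly improves on best_score."""
    if metric_info(metric)["higher_is_better"]:
        return score > best_score
    return score < best_score


print(better(0.91, 0.88, "r2"))    # True: higher R² wins
print(better(0.91, 0.88, "rmse"))  # False: lower RMSE wins
```

Using a strict comparison means a tie does not replace the current best result, which keeps the first result found at a given score.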