nirs4all.pipeline.predictor module
Pipeline predictor - Handles prediction mode execution.
This module provides the Predictor class for running predictions using trained pipelines on new datasets.
- Phase 5 Enhancement:
The Predictor now supports minimal pipeline execution via TraceBasedExtractor. When an execution trace is available (from Phase 2+), the predictor can extract and run only the required steps, significantly improving prediction speed for complex pipelines.
- class nirs4all.pipeline.predictor.Predictor(runner: PipelineRunner, use_minimal_pipeline: bool = True)[source]
Bases: object
Handles prediction using trained pipelines.
This class manages the prediction workflow: loading saved models, replaying pipeline configurations, and generating predictions on new data.
- Phase 5 Enhancement:
When use_minimal_pipeline=True (default), the predictor will:
1. Check if an execution trace is available for the prediction
2. Extract the minimal pipeline (only required steps) from the trace
3. Execute only those steps, significantly reducing prediction time
This is especially beneficial for complex pipelines with multiple preprocessing options, branches, or steps that aren’t needed for the specific model being predicted.
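The idea of extracting a minimal pipeline can be sketched as a dependency walk over a recorded trace. The sketch below is illustrative only: `TraceStep`, `depends_on`, and `extract_minimal_steps` are hypothetical names invented for this example and are not part of the nirs4all API or the actual TraceBasedExtractor implementation. The step names in the sample trace are likewise placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TraceStep:
    """Hypothetical record of one executed pipeline step (not a nirs4all class)."""
    step_id: int
    name: str
    depends_on: List[int] = field(default_factory=list)  # ids of upstream steps


def extract_minimal_steps(trace: List[TraceStep], target_id: int) -> List[TraceStep]:
    """Return only the steps needed to reach target_id, in original execution order."""
    by_id = {step.step_id: step for step in trace}
    needed = set()
    stack = [target_id]
    # Walk backwards from the target step through its dependencies.
    while stack:
        step_id = stack.pop()
        if step_id in needed:
            continue
        needed.add(step_id)
        stack.extend(by_id[step_id].depends_on)
    # Filtering the trace preserves the order in which steps originally ran.
    return [step for step in trace if step.step_id in needed]


# A trace with two preprocessing branches feeding two different models.
trace = [
    TraceStep(1, "StandardScaler"),
    TraceStep(2, "SavitzkyGolay"),
    TraceStep(3, "PLSRegression", depends_on=[1]),
    TraceStep(4, "RandomForest", depends_on=[2]),
]

minimal = extract_minimal_steps(trace, target_id=3)
print([step.name for step in minimal])  # ['StandardScaler', 'PLSRegression']
```

In this toy trace, predicting with the step-3 model skips the second preprocessing branch and the other model entirely, which is the kind of saving described above.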
- runner
Parent PipelineRunner instance
- saver
File saver for managing outputs
- manifest_manager
Manager for pipeline manifests
- pipeline_uid
Unique identifier for the pipeline
- artifact_loader
Loader for trained model artifacts
- config_path
Path to the pipeline configuration
- target_model
Metadata for the target model
- use_minimal_pipeline
Whether to use minimal pipeline execution (Phase 5)
- predict(prediction_obj: Dict[str, Any] | str, dataset: DatasetConfigs | SpectroDataset | ndarray | Tuple[ndarray, ...] | Dict | List[Dict] | str | List[str], dataset_name: str = 'prediction_dataset', all_predictions: bool = False, verbose: int = 0) → Tuple[ndarray, Predictions] | Tuple[Dict[str, Any], Predictions][source]
Run prediction using a saved model on new dataset.
- Phase 5 Enhancement:
When use_minimal_pipeline=True and an execution trace is available, this method will use TraceBasedExtractor to extract and execute only the required steps, improving prediction speed.
- Parameters:
prediction_obj – Model identifier (dict with config_path or prediction ID)
dataset – New dataset to predict on
dataset_name – Name for the dataset
all_predictions – If True, return all predictions; if False, return single best
verbose – Verbosity level
- Returns:
If all_predictions=False: (y_pred, predictions). If all_predictions=True: (predictions_dict, predictions)
- Return type:
Tuple[ndarray, Predictions] | Tuple[Dict[str, Any], Predictions]
Example
>>> predictor = Predictor(runner)
>>> y_pred, preds = predictor.predict(
...     {"config_path": "0001_abc123"},
...     X_new
... )
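Because the first element of the returned tuple is an array when all_predictions=False and a dict when all_predictions=True, calling code may want to branch on its type. The helper below is a minimal sketch of that check. `summarize_first_element` is a hypothetical name, the plain list stands in for an ndarray, and the dict keys are placeholders, since the structure of the real predictions dict is not described here.

```python
from typing import Any


def summarize_first_element(first: Any) -> str:
    """Describe the first element of a predict() result based on its type."""
    if isinstance(first, dict):
        # all_predictions=True: a dict of predictions (key layout assumed unknown).
        return f"all predictions ({len(first)} entries)"
    # all_predictions=False: a single array-like of predicted values.
    return f"single best prediction ({len(first)} samples)"


# Stand-in values; a real call would pass the first element returned by predict().
print(summarize_first_element([0.1, 0.2, 0.3]))
print(summarize_first_element({"model_a": [0.1], "model_b": [0.2]}))
```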