nirs4all.controllers.models.components.prediction_assembler module

Prediction Data Assembler - Assemble prediction data for storage

This component builds structured prediction records from model outputs. It was extracted from launch_training() (lines 462-494) and _create_fold_averages() so that both paths share one assembly routine instead of duplicating the logic.

class nirs4all.controllers.models.components.prediction_assembler.PartitionPrediction(partition: str, indices: List[int], y_true: ndarray, y_pred: ndarray, score: float)

Bases: object

Prediction data for a single partition.

indices: List[int]

partition: str

score: float

y_pred: ndarray

y_true: ndarray
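
The field list above can be illustrated with a small stand-in. This is a minimal sketch, not the nirs4all class itself: `PartitionPredictionSketch` is a hypothetical dataclass that only mirrors the documented fields, and RMSE is an assumed metric chosen for illustration (the documentation does not say which score is stored).

```python
from dataclasses import dataclass
from typing import List

import numpy as np


@dataclass
class PartitionPredictionSketch:
    """Hypothetical stand-in with the documented fields; not the nirs4all class."""
    partition: str
    indices: List[int]
    y_true: np.ndarray
    y_pred: np.ndarray
    score: float


y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 2.0, 5.0])

# RMSE is an assumed metric, used here only to produce a float score.
rmse = float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

val_part = PartitionPredictionSketch(
    partition="val",
    indices=[10, 11, 12],  # sample indices, one per prediction
    y_true=y_true,
    y_pred=y_pred,
    score=rmse,
)
```

Each entry in `indices` lines up with the element at the same position in `y_true` and `y_pred`, so all three have the same length.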

class nirs4all.controllers.models.components.prediction_assembler.PredictionDataAssembler

Bases: object

Assembles prediction data for storage.

Creates structured prediction records with all metadata required for storage in the prediction database.

Example

>>> assembler = PredictionDataAssembler()
>>> record = assembler.assemble(
...     dataset=dataset,
...     identifiers=identifiers,
...     scores={'train': 0.95, 'val': 0.90, 'test': 0.88},
...     predictions={'train': y_train_pred, 'val': y_val_pred, 'test': y_test_pred},
...     true_values={'train': y_train, 'val': y_val, 'test': y_test},
...     indices={'train': train_idx, 'val': val_idx, 'test': test_idx},
...     runner=runner,
...     X_shape=X_train.shape,
...     best_params=params
... )

assemble(dataset: Any, identifiers: Any, scores: dict, predictions: dict, true_values: dict, indices: dict, runner: Any, X_shape: Tuple[int, ...], best_params: dict | None = None, context: Any = None) → dict

Assemble a complete prediction record.

Parameters:

dataset – SpectroDataset instance
identifiers – ModelIdentifiers with name, id, etc.
scores – Dictionary of scores per partition
predictions – Dictionary of prediction arrays per partition (unscaled)
true_values – Dictionary of true value arrays per partition (unscaled)
indices – Dictionary of sample indices per partition
runner – PipelineRunner instance
X_shape – Shape of input data (for n_features)
best_params – Optional hyperparameters from optimization
context – Optional ExecutionContext for branch information

Returns:

Dictionary ready for storage in prediction database
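
Because scores, predictions, true_values, and indices are all keyed by partition, it can help to verify that they agree before calling assemble(). The helper below is a hypothetical pre-check, not part of nirs4all. It assumes every partition appears in all four dictionaries and that each array has one entry per index, which the class example suggests but the documentation does not state.

```python
from typing import Dict, Sequence

import numpy as np


def check_partition_inputs(
    scores: Dict[str, float],
    predictions: Dict[str, np.ndarray],
    true_values: Dict[str, np.ndarray],
    indices: Dict[str, Sequence[int]],
) -> None:
    """Hypothetical pre-check: raise ValueError if the inputs disagree."""
    keys = set(scores)
    named = (
        ("predictions", predictions),
        ("true_values", true_values),
        ("indices", indices),
    )
    # Every dictionary must describe the same set of partitions.
    for name, mapping in named:
        if set(mapping) != keys:
            raise ValueError(
                f"{name} keys {sorted(mapping)} do not match "
                f"scores keys {sorted(keys)}"
            )
    # Within a partition, arrays must have one entry per sample index.
    for part in sorted(keys):
        n = len(indices[part])
        if len(predictions[part]) != n or len(true_values[part]) != n:
            raise ValueError(
                f"partition {part!r}: array lengths do not match {n} indices"
            )


check_partition_inputs(
    scores={"train": 0.95, "test": 0.88},
    predictions={"train": np.array([1.0, 2.0]), "test": np.array([3.0])},
    true_values={"train": np.array([1.1, 1.9]), "test": np.array([2.7])},
    indices={"train": [0, 1], "test": [2]},
)
```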

assemble_fold_average(base_prediction: dict, averaged_predictions: dict, averaged_scores: dict, is_weighted: bool = False) → dict

Assemble a prediction record for a fold-averaged model.

Parameters:

base_prediction – Base prediction record from a single fold (for metadata)
averaged_predictions – Dictionary of averaged prediction arrays
averaged_scores – Dictionary of averaged scores
is_weighted – Whether averaging was weighted by scores

Returns:

Dictionary ready for storage as fold-averaged prediction
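
The averaged_predictions argument is computed by the caller before this method runs. The following sketch shows one way such arrays could be produced with NumPy, for both the unweighted and the weighted case. The fold values and the weights are made up, and how nirs4all actually derives weights from scores is not documented here, so treat the weighting rule as an assumption.

```python
import numpy as np

# Predictions from three folds for the same four test samples (made-up values).
fold_preds = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 2.0, 3.0, 6.0],
    [3.0, 2.0, 3.0, 8.0],
])

# Unweighted case (is_weighted=False): plain mean across folds.
simple_avg = fold_preds.mean(axis=0)

# Weighted case (is_weighted=True): one weight per fold.
# Assumption: weights come from per-fold scores and favour better folds;
# the library's actual rule may differ.
weights = np.array([0.5, 0.25, 0.25])
weighted_avg = np.average(fold_preds, axis=0, weights=weights)

# Shape expected by the documented parameter: one array per partition.
averaged_predictions = {"test": weighted_avg}
```

With these numbers the first fold counts twice as much as each of the others, so the weighted average is pulled toward its predictions wherever the folds disagree.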

class nirs4all.controllers.models.components.prediction_assembler.PredictionRecord(metadata: dict, partitions: List[Tuple[str, List[int], ndarray, ndarray]])

Bases: object

Complete prediction record for storage.

metadata: dict

partitions: List[Tuple[str, List[int], ndarray, ndarray]]
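
The partitions field packs each partition into a four-element tuple. The sketch below uses a hypothetical stand-in dataclass (not the nirs4all class) to show how such a record might be built and unpacked. The order of the two arrays (true values first, then predictions) is inferred from the PartitionPrediction signature and is an assumption, as is the metadata key.

```python
from dataclasses import dataclass
from typing import List, Tuple

import numpy as np


@dataclass
class PredictionRecordSketch:
    """Hypothetical stand-in with the documented fields; not the nirs4all class."""
    metadata: dict
    partitions: List[Tuple[str, List[int], np.ndarray, np.ndarray]]


record = PredictionRecordSketch(
    metadata={"model_name": "pls_demo"},  # hypothetical metadata key
    partitions=[
        # (partition name, sample indices, y_true, y_pred): assumed order
        ("train", [0, 1], np.array([1.0, 2.0]), np.array([1.1, 1.9])),
        ("test", [2], np.array([3.0]), np.array([2.7])),
    ],
)

# Unpack each tuple to count the samples stored per partition.
partition_sizes = {name: len(idx) for name, idx, _, _ in record.partitions}
```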