nirs4all.controllers.models.components package
Submodules
- nirs4all.controllers.models.components.identifier_generator module
- nirs4all.controllers.models.components.index_normalizer module
- nirs4all.controllers.models.components.prediction_assembler module
- nirs4all.controllers.models.components.prediction_transformer module
- nirs4all.controllers.models.components.score_calculator module
Module contents
Model Controller Components - Modular components for base_model refactoring
This package contains focused, testable components that replace the monolithic logic in the original launch_training() method.
- Components:
identifier_generator: Generate model identifiers and names
prediction_transformer: Handle scaling/unscaling of predictions
prediction_assembler: Assemble prediction data for storage
score_calculator: Calculate evaluation scores
index_normalizer: Normalize and validate sample indices
- class nirs4all.controllers.models.components.IndexNormalizer
Bases: object
Normalizes sample indices to a consistent format.
Converts numpy integer types to Python int and optionally validates that indices are within bounds.
Example
>>> normalizer = IndexNormalizer()
>>> indices = normalizer.normalize([np.int64(0), np.int64(1), np.int64(2)], n_samples=3)
>>> indices
[0, 1, 2]
- normalize(indices: List | ndarray | None, n_samples: int, default_range: bool = True, validate: bool = False) → List[int]
Normalize indices to Python int list.
- Parameters:
indices – Input indices (may be None, list, or numpy array)
n_samples – Total number of samples (for validation and defaults)
default_range – If True and indices is None, return range(n_samples)
validate – If True, validate indices are within bounds
- Returns:
List of Python integers
- Raises:
ValueError – If validate=True and indices are out of bounds
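The behaviour documented for normalize() can be sketched as a standalone function. This is a minimal illustration, not the nirs4all implementation: the function name normalize_indices is hypothetical, and two details are assumptions (an empty list is returned when indices is None and default_range is False, and negative indices count as out of bounds).

```python
from typing import Iterable, List, Optional


def normalize_indices(
    indices: Optional[Iterable],
    n_samples: int,
    default_range: bool = True,
    validate: bool = False,
) -> List[int]:
    """Hypothetical stand-in for the documented normalize() behaviour."""
    if indices is None:
        # Assumption: with default_range disabled, nothing is selected.
        return list(range(n_samples)) if default_range else []

    # int() accepts numpy integer scalars as well as plain Python ints.
    normalized = [int(i) for i in indices]

    if validate:
        # Assumption: valid indices lie in [0, n_samples).
        out_of_bounds = [i for i in normalized if not 0 <= i < n_samples]
        if out_of_bounds:
            raise ValueError(
                f"Indices out of bounds for n_samples={n_samples}: {out_of_bounds}"
            )
    return normalized
```

Validation is off by default here, matching the documented signature, so callers that already trust their splits pay no extra cost.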
- normalize_batch(indices_dict: dict, n_samples_dict: dict) → dict
Normalize a dictionary of indices for multiple partitions.
- Parameters:
indices_dict – Dictionary with keys like ‘train’, ‘val’, ‘test’ and values as index lists/arrays
n_samples_dict – Dictionary with same keys and values as sample counts
- Returns:
Dictionary with same keys but normalized indices
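A per-partition version might look like the sketch below. It is an illustration under assumptions rather than the library's code: the standalone function is hypothetical, and a partition whose indices are None is assumed to fall back to the full range for that partition.

```python
from typing import Dict, Iterable, List, Optional


def normalize_batch(
    indices_dict: Dict[str, Optional[Iterable]],
    n_samples_dict: Dict[str, int],
) -> Dict[str, List[int]]:
    """Hypothetical sketch: normalize each partition's indices independently."""
    normalized: Dict[str, List[int]] = {}
    for partition, indices in indices_dict.items():
        # A missing sample count raises KeyError, which surfaces config mistakes early.
        n_samples = n_samples_dict[partition]
        if indices is None:
            # Assumption: None means "all samples of this partition".
            normalized[partition] = list(range(n_samples))
        else:
            normalized[partition] = [int(i) for i in indices]
    return normalized


result = normalize_batch(
    {"train": [0, 1, 2], "val": None, "test": (3, 4)},
    {"train": 3, "val": 2, "test": 5},
)
print(result)  # {'train': [0, 1, 2], 'val': [0, 1], 'test': [3, 4]}
```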
- class nirs4all.controllers.models.components.ModelIdentifierGenerator(helper=None)
Bases: object
Generates consistent model identifiers for training and persistence.
This component extracts and centralizes all the naming logic that was previously scattered in launch_training().
Example
>>> generator = ModelIdentifierGenerator()
>>> identifiers = generator.generate(
...     model_config={'name': 'MyPLS', 'class': 'sklearn.cross_decomposition.PLSRegression'},
...     runner=runner,
...     context={'step_id': 5},
...     fold_idx=0
... )
>>> identifiers.model_id
'MyPLS_10'
>>> identifiers.display_name
'MyPLS_10_fold0'
- extract_classname_from_config(model_config: Dict[str, Any]) → str
Extract classname from model configuration.
The class name comes from the model declared in the config, from instance.__class__.__name__, or from the function name.
- Parameters:
model_config – Model configuration dictionary.
- Returns:
Class name of the model.
- Return type:
str
- extract_core_name(model_config: Dict[str, Any]) → str
Extract core name from model configuration.
The core name is the user-provided name, or the class name when no name is given.
- Parameters:
model_config – Model configuration dictionary.
- Returns:
Core name extracted from config.
- Return type:
str
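The two extraction steps described above could be sketched as follows. This is only an illustration: the 'name' and 'class' keys come from the example in this page, while the 'model_instance' key, the "UnknownModel" fallback, and the lookup order are assumptions made for the sketch.

```python
from typing import Any, Dict


def extract_classname(model_config: Dict[str, Any]) -> str:
    """Hypothetical sketch of class-name extraction from a model config."""
    declared = model_config.get("class")
    if isinstance(declared, str):
        # Dotted path such as 'sklearn.cross_decomposition.PLSRegression'.
        return declared.rsplit(".", 1)[-1]

    instance = model_config.get("model_instance")  # hypothetical key
    if instance is not None:
        if callable(instance) and hasattr(instance, "__name__"):
            # Functions and class objects carry their own name.
            return instance.__name__
        return type(instance).__name__

    return "UnknownModel"  # assumption: placeholder when nothing is declared


def extract_core_name(model_config: Dict[str, Any]) -> str:
    """Prefer the user-provided name, otherwise fall back to the class name."""
    return model_config.get("name") or extract_classname(model_config)
```

Keeping the class-name lookup separate from the core-name lookup means a user-supplied name never hides the underlying model type.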
- generate(model_config: Dict[str, Any], runner: PipelineRunner, context: ExecutionContext, fold_idx: int | None = None) → ModelIdentifiers
Generate all model identifiers from configuration and context.
- Parameters:
model_config – Model configuration dictionary
runner – Pipeline runner for operation counter
context – Execution context with step_number
fold_idx – Optional fold index for cross-validation
- Returns:
Container with all generated identifiers
- Return type:
ModelIdentifiers
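The class example suggests a naming scheme of the form name_counter for model_id, with a _foldN suffix for display_name. The sketch below reproduces that pattern under assumptions: the Identifiers dataclass and build_identifiers function are hypothetical, the scheme is inferred from the single example ('MyPLS_10', 'MyPLS_10_fold0'), and the display name without a fold is assumed to equal the model id.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Identifiers:
    """Hypothetical container mirroring the documented ModelIdentifiers fields."""
    classname: str
    name: str
    model_id: str
    display_name: str
    operation_counter: int
    step_id: int
    fold_idx: Optional[int]


def build_identifiers(
    name: str,
    classname: str,
    operation_counter: int,
    step_id: int,
    fold_idx: Optional[int] = None,
) -> Identifiers:
    # Inferred from the example: model_id is '<name>_<operation counter>'.
    model_id = f"{name}_{operation_counter}"
    # Assumption: without a fold index the display name is just the model id.
    display_name = model_id if fold_idx is None else f"{model_id}_fold{fold_idx}"
    return Identifiers(
        classname=classname,
        name=name,
        model_id=model_id,
        display_name=display_name,
        operation_counter=operation_counter,
        step_id=step_id,
        fold_idx=fold_idx,
    )


ids = build_identifiers("MyPLS", "PLSRegression", operation_counter=10, step_id=5, fold_idx=0)
print(ids.model_id, ids.display_name)  # MyPLS_10 MyPLS_10_fold0
```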
- class nirs4all.controllers.models.components.ModelIdentifiers(classname: str, name: str, model_id: str, display_name: str, operation_counter: int, step_id: int, fold_idx: int | None)
Bases: object
Container for all model identifiers.
- class nirs4all.controllers.models.components.PartitionPrediction(partition: str, indices: List[int], y_true: ndarray, y_pred: ndarray, score: float)
Bases: object
Single partition prediction data.
- class nirs4all.controllers.models.components.PartitionScores(train: float, val: float, test: float, metric: str, higher_is_better: bool, detailed_scores: Dict[str, float] | None = None)
Bases: object
Scores for a single partition.
- class nirs4all.controllers.models.components.PredictionDataAssembler
Bases: object
Assembles prediction data for storage.
Creates structured prediction records with all metadata required for storage in the prediction database.
Example
>>> assembler = PredictionDataAssembler()
>>> record = assembler.assemble(
...     dataset=dataset,
...     identifiers=identifiers,
...     scores={'train': 0.95, 'val': 0.90, 'test': 0.88},
...     predictions={'train': y_train_pred, 'val': y_val_pred, 'test': y_test_pred},
...     true_values={'train': y_train, 'val': y_val, 'test': y_test},
...     indices={'train': train_idx, 'val': val_idx, 'test': test_idx},
...     runner=runner,
...     X_shape=X_train.shape,
...     best_params=params
... )
- assemble(dataset: Any, identifiers: Any, scores: dict, predictions: dict, true_values: dict, indices: dict, runner: Any, X_shape: Tuple[int, ...], best_params: dict | None = None, context: Any = None) → dict
Assemble complete prediction record.
- Parameters:
dataset – SpectroDataset instance
identifiers – ModelIdentifiers with name, id, etc.
scores – Dictionary of scores per partition
predictions – Dictionary of prediction arrays per partition (unscaled)
true_values – Dictionary of true value arrays per partition (unscaled)
indices – Dictionary of sample indices per partition
runner – PipelineRunner instance
X_shape – Shape of input data (for n_features)
best_params – Optional hyperparameters from optimization
context – Optional ExecutionContext for branch information
- Returns:
Dictionary ready for storage in prediction database
- assemble_fold_average(base_prediction: dict, averaged_predictions: dict, averaged_scores: dict, is_weighted: bool = False) → dict
Assemble prediction record for fold-averaged model.
- Parameters:
base_prediction – Base prediction record from a single fold (for metadata)
averaged_predictions – Dictionary of averaged prediction arrays
averaged_scores – Dictionary of averaged scores
is_weighted – Whether averaging was weighted by scores
- Returns:
Dictionary ready for storage as fold-averaged prediction
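assemble_fold_average() receives predictions that have already been averaged. For context, the averaging step itself might look like the sketch below. It is a generic weighted mean written in plain Python, not code from nirs4all: the function name is hypothetical, and how weights would be derived from fold scores is not specified on this page, so they are passed in explicitly.

```python
from typing import List, Optional, Sequence


def average_fold_predictions(
    fold_predictions: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Hypothetical sketch: element-wise (optionally weighted) mean across folds."""
    if not fold_predictions:
        raise ValueError("At least one fold is required")
    n_samples = len(fold_predictions[0])
    if any(len(p) != n_samples for p in fold_predictions):
        raise ValueError("All folds must predict the same number of samples")

    if weights is None:
        weights = [1.0] * len(fold_predictions)  # plain, unweighted mean
    if len(weights) != len(fold_predictions):
        raise ValueError("One weight per fold is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("Weights must sum to a positive value")

    return [
        sum(w * p[i] for w, p in zip(weights, fold_predictions)) / total
        for i in range(n_samples)
    ]


print(average_fold_predictions([[1.0, 2.0], [3.0, 4.0]]))                  # [2.0, 3.0]
print(average_fold_predictions([[1.0, 2.0], [3.0, 4.0]], weights=[3, 1]))  # [1.5, 2.5]
```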
- class nirs4all.controllers.models.components.PredictionRecord(metadata: dict, partitions: List[Tuple[str, List[int], ndarray, ndarray]])
Bases: object
Complete prediction record for storage.
- class nirs4all.controllers.models.components.PredictionTransformer
Bases: object
Transforms predictions between scaled and unscaled spaces.
- Handles:
Classification tasks: Keep predictions in transformed space
Regression tasks: Transform predictions back to numeric space
Respects current y_processing from context
Example
>>> transformer = PredictionTransformer()
>>> y_pred_unscaled = transformer.transform_to_unscaled(
...     y_pred_scaled,
...     dataset,
...     context
... )
- transform_batch_to_unscaled(predictions_dict: dict, dataset: SpectroDataset, context: ExecutionContext | None = None) → dict
Transform a dictionary of predictions to unscaled space.
- Parameters:
predictions_dict – Dictionary with keys like ‘train’, ‘val’, ‘test’ and values as prediction arrays
dataset – Dataset with transformation info
context – Execution context
- Returns:
Dictionary with same keys but unscaled predictions
- transform_to_unscaled(predictions_scaled: ndarray, dataset: SpectroDataset, context: ExecutionContext | None = None) → ndarray
Transform predictions from scaled/processed space to unscaled/numeric space.
- Parameters:
predictions_scaled – Predictions in scaled/processed space
dataset – Dataset with task type and target transformation info
context – Execution context with y processing info
- Returns:
Predictions in unscaled/numeric space
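The task-dependent rule described for this class (regression predictions are mapped back, classification predictions are left alone) can be illustrated with a toy scaler. Everything in this sketch is hypothetical: the StandardTargetScaler class stands in for whatever y_processing the dataset carries, and the task type is passed as a plain string instead of being read from a SpectroDataset.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class StandardTargetScaler:
    """Hypothetical target scaler: y_scaled = (y - mean) / std."""
    mean: float
    std: float

    def inverse_transform(self, values: Sequence[float]) -> List[float]:
        return [v * self.std + self.mean for v in values]


def to_unscaled(
    predictions_scaled: Sequence[float],
    task_type: str,
    scaler: StandardTargetScaler,
) -> List[float]:
    """Sketch of the documented rule, with the task type given explicitly."""
    if task_type != "regression":
        # Classification: keep predictions in the transformed space.
        return list(predictions_scaled)
    # Regression: undo the target scaling to recover numeric values.
    return scaler.inverse_transform(predictions_scaled)


scaler = StandardTargetScaler(mean=10.0, std=2.0)
print(to_unscaled([0.0, 1.0, -0.5], "regression", scaler))  # [10.0, 12.0, 9.0]
print(to_unscaled([0, 1, 1], "classification", scaler))     # [0, 1, 1]
```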
- class nirs4all.controllers.models.components.ScoreCalculator
Bases: object
Calculates evaluation scores for models.
Uses ModelUtils to select appropriate metrics based on task type, and Evaluator to compute scores.
Example
>>> calculator = ScoreCalculator()
>>> scores = calculator.calculate(
...     y_true={'train': y_train, 'val': y_val, 'test': y_test},
...     y_pred={'train': y_train_pred, 'val': y_val_pred, 'test': y_test_pred},
...     task_type='regression'
... )
>>> scores.test
0.88
- calculate(y_true: Dict[str, ndarray], y_pred: Dict[str, ndarray], task_type: str) → PartitionScores
Calculate scores for all partitions.
- Parameters:
y_true – Dictionary of true values per partition
y_pred – Dictionary of predictions per partition
task_type – Task type string (e.g., ‘regression’, ‘classification’)
- Returns:
PartitionScores with scores for train, val, test
- calculate_single(y_true: ndarray, y_pred: ndarray, task_type: str, metric: str | None = None) → float
Calculate score for a single partition.
- Parameters:
y_true – True values
y_pred – Predictions
task_type – Task type string
metric – Optional metric name (if None, uses best metric for task)
- Returns:
Score value
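The metric selection described above (an explicit metric if given, otherwise a default for the task type) can be sketched without the library's ModelUtils and Evaluator helpers. The defaults chosen here are assumptions: R2 for regression, suggested by the "(R2)" in the format_scores example, and accuracy for classification. All function names are hypothetical.

```python
import math
from typing import Callable, Dict, Optional, Sequence


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    # R2 is undefined when the true values have no variance.
    return float("nan") if ss_tot == 0 else 1.0 - ss_res / ss_tot


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def accuracy(y_true: Sequence, y_pred: Sequence) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


METRICS: Dict[str, Callable] = {"r2": r2, "rmse": rmse, "accuracy": accuracy}
# Assumption: default metric per task type.
DEFAULT_METRIC = {"regression": "r2", "classification": "accuracy"}


def calculate_single(y_true, y_pred, task_type: str, metric: Optional[str] = None) -> float:
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    name = metric or DEFAULT_METRIC[task_type]
    return METRICS[name](y_true, y_pred)


def calculate(y_true: Dict[str, Sequence], y_pred: Dict[str, Sequence], task_type: str) -> Dict[str, float]:
    # One score per partition key, e.g. 'train', 'val', 'test'.
    return {part: calculate_single(y_true[part], y_pred[part], task_type) for part in y_true}
```

A lookup table of metric functions keeps the per-partition loop independent of the task type, which is one plausible reason to separate metric selection from score computation.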
- format_scores(scores: PartitionScores) → str
Format scores as a readable string.
- Parameters:
scores – PartitionScores instance
- Returns:
Formatted string like “Train: 0.95 | Val: 0.90 | Test: 0.88 (R2)”
- Return type:
str
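A formatter producing a string of that shape could be as small as the sketch below. The Scores dataclass is a hypothetical, trimmed-down stand-in for PartitionScores, and the two-decimal precision and upper-cased metric name are assumptions read off the example string.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    """Hypothetical subset of the documented PartitionScores fields."""
    train: float
    val: float
    test: float
    metric: str


def format_scores(scores: Scores) -> str:
    # Assumption: two decimals and an upper-cased metric name, as in the example.
    return (
        f"Train: {scores.train:.2f} | Val: {scores.val:.2f} | "
        f"Test: {scores.test:.2f} ({scores.metric.upper()})"
    )


print(format_scores(Scores(train=0.95, val=0.90, test=0.88, metric="r2")))
# Train: 0.95 | Val: 0.90 | Test: 0.88 (R2)
```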