nirs4all.controllers.shared package

Submodules

Module contents

Shared utilities for controllers.

This module provides utilities shared by multiple controllers, in particular model selection and prediction aggregation, which both MergeController and MetaModelController need.

Phase 2 Components (Stacking Restoration):

ModelSelector – Utility class for selecting models based on validation metrics.
PredictionAggregator – Utility class for aggregating predictions from multiple models.

These utilities were extracted from MergeController to avoid code duplication and provide a single source of truth for model selection and aggregation logic.

Example

>>> from nirs4all.controllers.shared import ModelSelector, PredictionAggregator
>>> from nirs4all.operators.data.merge import AggregationStrategy
>>>
>>> selector = ModelSelector(prediction_store, context)
>>> ranked_models = selector.select_models(available_models, config, branch_id=0)
>>>
>>> aggregated = PredictionAggregator.aggregate(
...     predictions={"PLS": pls_preds, "RF": rf_preds},
...     strategy=AggregationStrategy.MEAN,
... )

class nirs4all.controllers.shared.ModelSelector(prediction_store: Predictions, context: ExecutionContext)

Bases: object

Utility class for selecting models based on validation metrics.

Handles model ranking and selection strategies (all, best, top_k, explicit) for per-branch prediction collection and stacking operations.

This class is shared between MergeController and MetaModelController to avoid code duplication.

prediction_store: Prediction storage instance.

context: Execution context.

LOWER_IS_BETTER_METRICS: Set of metrics where lower values are better.

LOWER_IS_BETTER_METRICS = {'log_loss', 'mae', 'mape', 'mse', 'nmae', 'nmse', 'nrmse', 'rmse'}
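As a rough illustration of why the metric direction matters, the sketch below ranks models by validation score using a copy of the set above. The helper `rank_models` and the sample scores are hypothetical and not part of nirs4all; the library's own ranking logic may differ.

```python
# Copy of the documented set; the ranking helper below is hypothetical.
LOWER_IS_BETTER_METRICS = {"log_loss", "mae", "mape", "mse", "nmae", "nmse", "nrmse", "rmse"}

def rank_models(scores, metric):
    """Return model names ordered best-first for the given metric."""
    # Error-type metrics sort ascending; any other metric (e.g. r2) sorts descending.
    descending = metric not in LOWER_IS_BETTER_METRICS
    return sorted(scores, key=scores.get, reverse=descending)

rmse_scores = {"PLS": 0.42, "RF": 0.57, "SVR": 0.39}  # made-up values
r2_scores = {"PLS": 0.95, "RF": 0.84, "SVR": 0.93}    # made-up values

print(rank_models(rmse_scores, "rmse"))  # ['SVR', 'PLS', 'RF']
print(rank_models(r2_scores, "r2"))      # ['PLS', 'SVR', 'RF']
```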

get_model_scores(model_names: List[str], metric: str, branch_id: int) → Dict[str, float]

Get validation scores for multiple models.

Used for weighted aggregation.

Parameters:

model_names – List of model names.
metric – Metric name.
branch_id – Branch identifier.

Returns:

Dictionary mapping model name to score.
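One plausible way to turn such a score dictionary into weights for weighted aggregation is sketched below. The function `scores_to_weights`, the inverse-score weighting, and the sample scores are all assumptions made for illustration; the documentation does not state which weighting scheme nirs4all applies.

```python
# Copy of the documented class attribute.
LOWER_IS_BETTER_METRICS = {"log_loss", "mae", "mape", "mse", "nmae", "nmse", "nrmse", "rmse"}

def scores_to_weights(model_scores, metric, eps=1e-12):
    """Hypothetical helper: normalise validation scores into weights summing to 1."""
    if metric in LOWER_IS_BETTER_METRICS:
        # Assumption: invert error-type scores so smaller errors weigh more.
        raw = {name: 1.0 / (score + eps) for name, score in model_scores.items()}
    else:
        # Assumption: clip negative scores (e.g. a negative r2) to zero.
        raw = {name: max(score, 0.0) for name, score in model_scores.items()}
    total = sum(raw.values())
    if total == 0:
        # Fall back to equal weights when no model has a usable score.
        return {name: 1.0 / len(raw) for name in raw}
    return {name: value / total for name, value in raw.items()}

weights = scores_to_weights({"PLS": 0.4, "RF": 0.6}, metric="rmse")
print(weights)  # PLS receives the larger weight because its RMSE is lower
```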

select_models(available_models: List[str], config: BranchPredictionConfig, branch_id: int) → List[str]

Select models from available models based on config.

Parameters:

available_models – List of available model names in the branch.
config – Per-branch prediction configuration.
branch_id – Branch identifier.

Returns:

List of selected model names.

Raises:

ValueError – If explicit model selection references unknown models.

select_models_global(available_models: List[str], selection: Any, metric: str | None = None) → List[str]

Select models globally (without branch context).

This is used by MetaModelController for pipelines without branches.

Parameters:

available_models – List of available model names.
selection – Selection configuration, one of:
    "all": Use all models
    "best": Use best model
    {"top_k": N}: Use top N models
    ["model1", "model2"]: Explicit list
metric – Optional metric for ranking.

Returns:

List of selected model names.
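The four selection forms can be pictured as a simple dispatch on the type and value of `selection`. The sketch below is a standalone stand-in, not the library's implementation: the function `select_global`, its `scores` and `lower_is_better` arguments, and the error raised for unknown explicit names (mirroring the one documented for select_models) are assumptions.

```python
def select_global(available_models, selection, scores=None, lower_is_better=True):
    """Hypothetical stand-in covering the four documented selection forms."""
    def ranked():
        if scores is None:
            # Nothing to rank by, so keep the order in which models were given.
            return list(available_models)
        return sorted(available_models, key=lambda m: scores[m],
                      reverse=not lower_is_better)

    if selection == "all":
        return list(available_models)
    if selection == "best":
        return ranked()[:1]
    if isinstance(selection, dict) and "top_k" in selection:
        return ranked()[: selection["top_k"]]
    if isinstance(selection, (list, tuple)):
        unknown = [m for m in selection if m not in available_models]
        if unknown:
            raise ValueError(f"Unknown models in explicit selection: {unknown}")
        return list(selection)
    raise ValueError(f"Unsupported selection: {selection!r}")

models = ["PLS", "RF", "SVR"]
rmse = {"PLS": 0.42, "RF": 0.57, "SVR": 0.39}  # made-up validation scores

print(select_global(models, "all"))                     # ['PLS', 'RF', 'SVR']
print(select_global(models, "best", scores=rmse))       # ['SVR']
print(select_global(models, {"top_k": 2}, scores=rmse)) # ['SVR', 'PLS']
print(select_global(models, ["RF"]))                    # ['RF']
```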

class nirs4all.controllers.shared.PredictionAggregator

Bases: object

Utility class for aggregating predictions from multiple models.

Handles aggregation strategies (separate, mean, weighted_mean, proba_mean) for combining predictions within a branch or across models.

This class is shared between MergeController and MetaModelController to avoid code duplication.

All methods are static as no instance state is needed.

LOWER_IS_BETTER_METRICS = {'log_loss', 'mae', 'mape', 'mse', 'nmae', 'nmse', 'nrmse', 'rmse'}

static aggregate(predictions: Dict[str, ndarray], strategy: AggregationStrategy, model_scores: Dict[str, float] | None = None, proba: bool = False, metric: str | None = None) → ndarray

Aggregate predictions from multiple models.

Parameters:

predictions – Dictionary mapping model names to prediction arrays. Each array has shape (n_samples,) for regression or (n_samples, n_classes) for classification probabilities.
strategy – Aggregation strategy to use.
model_scores – Optional dictionary of model scores for weighted averaging.
proba – Whether predictions are class probabilities.
metric – Metric name (for determining weight direction).

Returns:

Aggregated predictions with shape:

SEPARATE: (n_samples, n_models)
MEAN/WEIGHTED_MEAN: (n_samples, 1)
PROBA_MEAN: (n_samples, n_classes)

Return type:

ndarray

Raises:

ValueError – If predictions dict is empty.
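The documented output shapes can be reproduced with plain NumPy, which may help when checking what a downstream meta-model will receive. This is only a sketch of the shape conventions with made-up arrays and weights; it does not call PredictionAggregator, and the actual implementation may compute these differently.

```python
import numpy as np

# Made-up regression predictions from two models, shape (n_samples,) each.
preds = {"PLS": np.array([1.0, 2.0, 3.0]), "RF": np.array([3.0, 2.0, 1.0])}

# SEPARATE-style: one column per model -> (n_samples, n_models)
separate = np.column_stack(list(preds.values()))

# MEAN-style: average across models, kept as a column -> (n_samples, 1)
mean = separate.mean(axis=1, keepdims=True)

# WEIGHTED_MEAN-style: weights are assumed to be already normalised -> (n_samples, 1)
model_weights = np.array([0.75, 0.25])
weighted = (separate @ model_weights).reshape(-1, 1)

# PROBA_MEAN-style: average class probabilities -> (n_samples, n_classes)
probas = {
    "PLS": np.array([[0.8, 0.2], [0.4, 0.6], [0.1, 0.9]]),
    "RF": np.array([[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]),
}
proba_mean = np.mean(np.stack(list(probas.values())), axis=0)

print(separate.shape, mean.shape, weighted.shape, proba_mean.shape)
# (3, 2) (3, 1) (3, 1) (3, 2)
```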

static aggregate_folds(fold_predictions: List[ndarray], fold_scores: List[float] | None = None, strategy: str = 'mean', metric: str | None = None) → ndarray

Aggregate predictions across CV folds.

Useful for combining test predictions from different folds.

Parameters:

fold_predictions – List of prediction arrays, one per fold.
fold_scores – Optional list of validation scores per fold.
strategy – Aggregation strategy ("mean", "weighted_mean", "best").
metric – Metric name for weighted aggregation.

Returns:

Aggregated predictions.
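A minimal sketch of the three fold strategies follows, assuming every fold predicts the same test samples. The function `combine_folds`, its `lower_is_better` flag, the inverse-score weighting, and the fallback to a plain mean when no scores are given are illustrative assumptions rather than the library's documented behaviour.

```python
import numpy as np

def combine_folds(fold_predictions, fold_scores=None, strategy="mean", lower_is_better=True):
    """Hypothetical stand-in for combining per-fold test predictions."""
    stacked = np.stack(fold_predictions)  # (n_folds, n_samples, ...)
    if strategy == "mean" or fold_scores is None:
        # Assumption: without scores, every strategy degrades to a plain mean.
        return stacked.mean(axis=0)
    scores = np.asarray(fold_scores, dtype=float)
    if strategy == "best":
        best = scores.argmin() if lower_is_better else scores.argmax()
        return stacked[best]
    if strategy == "weighted_mean":
        # Assumption: inverse-score weights for error metrics, raw scores otherwise.
        raw = 1.0 / (scores + 1e-12) if lower_is_better else scores
        fold_weights = raw / raw.sum()
        # Contract the fold axis: sum over folds of weight * prediction.
        return np.tensordot(fold_weights, stacked, axes=1)
    raise ValueError(f"Unsupported strategy: {strategy!r}")

folds = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]  # made-up fold predictions
fold_rmse = [0.5, 0.2]                                # made-up validation scores

print(combine_folds(folds))                              # [2. 3.]
print(combine_folds(folds, fold_rmse, strategy="best"))  # [3. 4.]
print(combine_folds(folds, fold_rmse, strategy="weighted_mean"))
```

With an error metric such as RMSE, the second fold dominates the weighted result because its score is lower.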