nirs4all.controllers.shared.prediction_aggregator module
PredictionAggregator - Utility class for aggregating predictions from multiple models.
This module provides aggregation strategies for combining predictions within a branch or across models for stacking and merge operations.
- Phase 2 Implementation (Stacking Restoration):
Extracted from MergeController to provide shared prediction aggregation logic for both MergeController and MetaModelController.
- Aggregation Strategies:
SEPARATE: Stack predictions as separate features (n_models features)
MEAN: Simple average of predictions (1 feature)
WEIGHTED_MEAN: Weighted average by validation score (1 feature)
PROBA_MEAN: Average class probabilities for classification (n_classes features)
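The output shapes of the four strategies can be illustrated with plain NumPy. This is a minimal sketch, not the library's implementation: the arrays, the scores, and the weight normalisation (scores divided by their sum) are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical predictions from two regression models (3 samples each).
preds = {
    "PLS": np.array([1.0, 2.0, 3.0]),
    "RF": np.array([1.1, 1.9, 3.1]),
}
stacked = np.column_stack([p.ravel() for p in preds.values()])  # (3, 2)

# SEPARATE: one feature column per model -> (n_samples, n_models)
separate = stacked

# MEAN: simple average across models -> (n_samples, 1)
mean = stacked.mean(axis=1, keepdims=True)

# WEIGHTED_MEAN: assumed here to use validation scores normalised to sum to 1.
scores = np.array([0.9, 0.6])  # hypothetical higher-is-better scores
weights = scores / scores.sum()
weighted = (stacked @ weights).reshape(-1, 1)  # (n_samples, 1)

# PROBA_MEAN: average class probabilities -> (n_samples, n_classes)
probas = {
    "clf_a": np.array([[0.8, 0.2], [0.4, 0.6]]),
    "clf_b": np.array([[0.6, 0.4], [0.2, 0.8]]),
}
proba_mean = np.mean(list(probas.values()), axis=0)  # (2, 2)
```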
Example
>>> from nirs4all.controllers.shared import PredictionAggregator
>>> from nirs4all.operators.data.merge import AggregationStrategy
>>> import numpy as np
>>>
>>> predictions = {
... "PLS": np.array([1.0, 2.0, 3.0]),
... "RF": np.array([1.1, 1.9, 3.1]),
... }
>>> aggregated = PredictionAggregator.aggregate(
... predictions=predictions,
... strategy=AggregationStrategy.MEAN,
... )
>>> print(aggregated.shape) # (3, 1)
- class nirs4all.controllers.shared.prediction_aggregator.PredictionAggregator[source]
Bases: object
Utility class for aggregating predictions from multiple models.
Handles aggregation strategies (separate, mean, weighted_mean, proba_mean) for combining predictions within a branch or across models.
This class is shared between MergeController and MetaModelController to avoid code duplication.
All methods are static as no instance state is needed.
- LOWER_IS_BETTER_METRICS = {'log_loss', 'mae', 'mape', 'mse', 'nmae', 'nmse', 'nrmse', 'rmse'}
- static aggregate(predictions: Dict[str, ndarray], strategy: AggregationStrategy, model_scores: Dict[str, float] | None = None, proba: bool = False, metric: str | None = None) → ndarray[source]
Aggregate predictions from multiple models.
- Parameters:
predictions – Dictionary mapping model names to prediction arrays. Each array has shape (n_samples,) for regression or (n_samples, n_classes) for classification probabilities.
strategy – Aggregation strategy to use.
model_scores – Optional dictionary of model scores for weighted averaging.
proba – Whether predictions are class probabilities.
metric – Metric name, used to decide whether lower or higher scores should receive larger weights.
- Returns:
Aggregated predictions with shape:
SEPARATE: (n_samples, n_models)
MEAN/WEIGHTED_MEAN: (n_samples, 1)
PROBA_MEAN: (n_samples, n_classes)
- Return type:
ndarray
- Raises:
ValueError – If predictions dict is empty.
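How `metric` and `model_scores` might interact for weighted averaging can be sketched as follows. The documentation does not state the exact weighting formula, so everything here is an assumption: `scores_to_weights` is a hypothetical helper, the inverse-score transform for lower-is-better metrics is one possible choice, and the metric set is copied from `LOWER_IS_BETTER_METRICS` above.

```python
import numpy as np

# Copied from the documented LOWER_IS_BETTER_METRICS attribute.
LOWER_IS_BETTER = {"log_loss", "mae", "mape", "mse", "nmae", "nmse", "nrmse", "rmse"}

def scores_to_weights(model_scores, metric=None, eps=1e-12):
    """Hypothetical helper: turn validation scores into normalised weights."""
    names = list(model_scores)
    raw = np.array([model_scores[n] for n in names], dtype=float)
    if metric is not None and metric.lower() in LOWER_IS_BETTER:
        # Assumption: invert error-type scores so smaller errors weigh more.
        raw = 1.0 / (raw + eps)
    raw = np.clip(raw, 0.0, None)  # negative scores contribute no weight
    total = raw.sum()
    if total == 0:
        # Fall back to uniform weights when no score is usable.
        weights = np.full(len(names), 1.0 / len(names))
    else:
        weights = raw / total
    return dict(zip(names, weights))

# With an error metric, the model with the lower score gets the larger weight.
rmse_weights = scores_to_weights({"PLS": 0.5, "RF": 1.0}, metric="rmse")
# With a higher-is-better metric, the ordering is reversed.
r2_weights = scores_to_weights({"PLS": 0.5, "RF": 1.0}, metric="r2")
```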
- static aggregate_folds(fold_predictions: List[ndarray], fold_scores: List[float] | None = None, strategy: str = 'mean', metric: str | None = None) → ndarray[source]
Aggregate predictions across CV folds.
Useful for combining test predictions from different folds.
- Parameters:
fold_predictions – List of prediction arrays, one per fold.
fold_scores – Optional list of validation scores per fold.
strategy – Aggregation strategy (“mean”, “weighted_mean”, “best”).
metric – Metric name for weighted aggregation.
- Returns:
Aggregated predictions.
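The three fold strategies named above can be sketched in NumPy. This is an illustration under stated assumptions, not the library's code: `aggregate_folds_sketch` is a hypothetical function, all folds are assumed to predict the same samples (same array shape), missing scores are assumed to fall back to a plain mean, and error metrics are weighted by inverse score.

```python
import numpy as np

# Copied from the documented LOWER_IS_BETTER_METRICS attribute.
LOWER_IS_BETTER = {"log_loss", "mae", "mape", "mse", "nmae", "nmse", "nrmse", "rmse"}

def aggregate_folds_sketch(fold_predictions, fold_scores=None, strategy="mean", metric=None):
    """Hypothetical sketch of combining per-fold test predictions."""
    # Stack to (n_folds, n_samples, ...); requires identical shapes per fold.
    stacked = np.stack([np.asarray(p, dtype=float) for p in fold_predictions])
    if strategy == "mean" or fold_scores is None:
        # Assumption: without scores, every strategy degrades to a plain mean.
        return stacked.mean(axis=0)
    scores = np.asarray(fold_scores, dtype=float)
    lower_is_better = metric is not None and metric.lower() in LOWER_IS_BETTER
    if strategy == "best":
        # Keep only the predictions of the best-scoring fold.
        best = scores.argmin() if lower_is_better else scores.argmax()
        return stacked[best]
    if strategy == "weighted_mean":
        # Assumption: inverse-score weights for error metrics, raw scores otherwise.
        raw = 1.0 / (scores + 1e-12) if lower_is_better else scores
        return np.average(stacked, axis=0, weights=raw)
    raise ValueError(f"Unknown strategy: {strategy!r}")

folds = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
fold_mean = aggregate_folds_sketch(folds)
fold_best = aggregate_folds_sketch(folds, [0.2, 0.5], strategy="best", metric="rmse")
fold_weighted = aggregate_folds_sketch(folds, [1.0, 3.0], strategy="weighted_mean")
```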