nirs4all.data.ensemble_utils module

Ensemble Prediction Utilities - Weighted averaging for ensemble predictions

This module provides utilities for combining predictions from multiple models, weighting each model by its score. It was relocated from utils/model_utils.py to sit alongside the data/prediction modules.

Supports both regression (numeric averaging) and classification (soft/hard voting).

class nirs4all.data.ensemble_utils.EnsembleUtils

Bases: object

Utilities for ensemble prediction with weighted averaging and voting.

static compute_ensemble_prediction(predictions_data: List[Dict[str, Any]], score_metric: str = 'test_score', prediction_key: str = 'y_pred', metric_for_direction: str | None = None, higher_is_better: bool | None = None) → Dict[str, Any]

Compute ensemble prediction from a list of prediction dictionaries.

Parameters:

predictions_data – List of prediction dictionaries, each holding the entries named by score_metric and prediction_key
score_metric – Key to extract score from each prediction
prediction_key – Key to extract predictions array from each prediction
metric_for_direction – Metric name to infer direction (if higher_is_better is None)
higher_is_better – Whether higher scores are better (None to infer)

Returns:

Dictionary with ensemble prediction and metadata

Raises:

ValueError – If predictions_data is empty or missing required keys
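The input handling described above can be sketched as follows. This is a minimal illustration, not the nirs4all implementation: `prepare_ensemble_inputs` and `LOWER_IS_BETTER` are hypothetical names, and treating an unknown or absent metric name as higher-is-better is an assumption made only for this example.

```python
from typing import Any, Dict, List, Optional, Tuple

import numpy as np

# Error metrics are minimized; every other metric name is treated as maximized.
LOWER_IS_BETTER = {"mse", "rmse", "mae"}


def prepare_ensemble_inputs(
    predictions_data: List[Dict[str, Any]],
    score_metric: str = "test_score",
    prediction_key: str = "y_pred",
    metric_for_direction: Optional[str] = None,
    higher_is_better: Optional[bool] = None,
) -> Tuple[List[np.ndarray], List[float], bool]:
    """Hypothetical helper: validate the dictionaries and extract arrays and scores."""
    if not predictions_data:
        raise ValueError("predictions_data is empty")
    arrays, scores = [], []
    for i, entry in enumerate(predictions_data):
        missing = [k for k in (score_metric, prediction_key) if k not in entry]
        if missing:
            raise ValueError(f"prediction {i} is missing keys: {missing}")
        arrays.append(np.asarray(entry[prediction_key], dtype=float))
        scores.append(float(entry[score_metric]))
    if higher_is_better is None:
        # Assumption: fall back to higher-is-better when the name is unknown or absent.
        name = (metric_for_direction or "").lower()
        higher_is_better = name not in LOWER_IS_BETTER
    return arrays, scores, higher_is_better


folds = [
    {"y_pred": [1.0, 2.0], "test_score": 0.20},
    {"y_pred": [1.2, 1.8], "test_score": 0.10},
]
arrays, scores, higher = prepare_ensemble_inputs(folds, metric_for_direction="rmse")
```

With `metric_for_direction="rmse"` the direction is inferred as lower-is-better, so the second fold (score 0.10) would receive the larger weight in any subsequent averaging step.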

static compute_hard_voting(class_predictions: List[ndarray], weights: ndarray | None = None, n_classes: int | None = None) → ndarray

Compute hard voting (majority vote) from class predictions.

Each model votes for a class, and the class with most votes wins. Supports weighted voting where each model’s vote is weighted.

Parameters:

class_predictions – List of class prediction arrays, each shape (n_samples,) or (n_samples, 1).
weights – Optional weights for each model’s vote. If None, uses uniform weights (standard majority vote).
n_classes – Number of classes. If None, inferred from predictions.

Returns:

Final class predictions as (n_samples, 1) array.

Raises:

ValueError – If class_predictions is empty.
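Weighted majority voting as described above can be sketched in plain NumPy. The function name `hard_vote` is hypothetical and this is not the library's code; the sketch assumes integer labels in the range 0..n_classes-1 and breaks ties in favor of the lowest class index, neither of which is confirmed by the documentation.

```python
from typing import List, Optional

import numpy as np


def hard_vote(
    class_predictions: List[np.ndarray],
    weights: Optional[np.ndarray] = None,
    n_classes: Optional[int] = None,
) -> np.ndarray:
    """Hypothetical sketch of (weighted) majority voting."""
    if not class_predictions:
        raise ValueError("class_predictions is empty")
    # Flatten (n_samples, 1) inputs and stack to shape (n_models, n_samples).
    votes = np.stack([np.asarray(p).reshape(-1).astype(int) for p in class_predictions])
    n_models, n_samples = votes.shape
    if weights is None:
        weights = np.ones(n_models)  # uniform weights: standard majority vote
    weights = np.asarray(weights, dtype=float)
    if n_classes is None:
        n_classes = int(votes.max()) + 1  # assumes labels are 0..n_classes-1
    tally = np.zeros((n_samples, n_classes))
    rows = np.arange(n_samples)
    for model_votes, w in zip(votes, weights):
        # Add this model's weight to the class it voted for, sample by sample.
        tally[rows, model_votes] += w
    # argmax returns the lowest index on ties (an assumption of this sketch).
    return tally.argmax(axis=1).reshape(-1, 1)


preds = [np.array([0, 1, 2]), np.array([0, 2, 2]), np.array([1, 2, 1])]
majority = hard_vote(preds)                                  # [[0], [2], [2]]
weighted = hard_vote(preds, weights=np.array([1.0, 1.0, 3.0]))  # [[1], [2], [1]]
```

In the weighted call, the third model carries weight 3 and therefore outvotes the other two combined whenever they agree against it.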

static compute_soft_voting_average(probability_arrays: List[ndarray], weights: ndarray | None = None, use_confidence_weighting: bool = False) → Tuple[ndarray, ndarray]

Compute soft voting average of class probabilities.

Averages probability distributions from multiple models (weighted or simple), then takes argmax to get final class predictions.

Parameters:

probability_arrays – List of probability arrays, each shape (n_samples, n_classes). Arrays can have different numbers of classes; they will be padded/aligned to the maximum number of classes found.
weights – Optional weights for each model (fold weights based on validation scores). If None, uses uniform weights.
use_confidence_weighting – If True, additionally weight each fold’s contribution per-sample by its prediction confidence (max probability). This gives more influence to confident predictions.

Returns:

Tuple of (class_predictions, averaged_probabilities):

class_predictions: Class labels as (n_samples, 1) array
averaged_probabilities: Averaged probabilities as (n_samples, n_classes) array

Raises:

ValueError – If probability_arrays is empty or sample counts don’t match.
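The soft-voting procedure above (pad, weight, average, argmax) can be sketched as follows. `soft_vote` is a hypothetical name and this is not the nirs4all implementation. Two details are assumptions of the sketch: missing class columns are zero-padded on the right, and the per-sample weights are renormalized so each averaged row still sums to 1.

```python
from typing import List, Optional, Tuple

import numpy as np


def soft_vote(
    probability_arrays: List[np.ndarray],
    weights: Optional[np.ndarray] = None,
    use_confidence_weighting: bool = False,
) -> Tuple[np.ndarray, np.ndarray]:
    """Hypothetical sketch of weighted soft voting."""
    if not probability_arrays:
        raise ValueError("probability_arrays is empty")
    probs = [np.asarray(p, dtype=float) for p in probability_arrays]
    n_samples = probs[0].shape[0]
    if any(p.shape[0] != n_samples for p in probs):
        raise ValueError("sample counts don't match")
    n_classes = max(p.shape[1] for p in probs)
    # Assumed alignment rule: zero-pad missing class columns on the right.
    padded = np.stack(
        [np.pad(p, ((0, 0), (0, n_classes - p.shape[1]))) for p in probs]
    )  # shape (n_models, n_samples, n_classes)
    if weights is None:
        weights = np.ones(len(probs))
    # Broadcast model weights to one weight per (model, sample) pair.
    w = np.asarray(weights, dtype=float)[:, None] * np.ones((1, n_samples))
    if use_confidence_weighting:
        # Scale each model's weight by its max probability for that sample.
        w = w * padded.max(axis=2)
    w = w / w.sum(axis=0, keepdims=True)  # renormalize per sample
    averaged = (w[:, :, None] * padded).sum(axis=0)
    return averaged.argmax(axis=1).reshape(-1, 1), averaged


a = np.array([[0.9, 0.1], [0.4, 0.6]])            # model that saw 2 classes
b = np.array([[0.2, 0.3, 0.5], [0.1, 0.8, 0.1]])  # model that saw 3 classes
labels, avg = soft_vote([a, b])
labels_conf, avg_conf = soft_vote([a, b], use_confidence_weighting=True)
```

With uniform weights the averaged rows are [0.55, 0.2, 0.25] and [0.25, 0.7, 0.05], giving labels 0 and 1. Confidence weighting shifts the first row further toward model `a`, which is the more confident of the two on that sample.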

static compute_weighted_average(arrays: List[ndarray], scores: List[float], metric: str | None = None, higher_is_better: bool | None = None) → ndarray

Compute weighted average of arrays based on their scores.

Parameters:

arrays – List of numpy arrays to average (must have same shape)
scores – List of scores corresponding to each array
metric – Name of the metric, used to determine whether higher is better. Supported: ‘mse’, ‘rmse’, ‘mae’, ‘r2’, ‘accuracy’, ‘f1’, ‘precision’, ‘recall’.
higher_is_better – Whether higher scores are better. If None, this is inferred from the metric name.

Returns:

Weighted average array

Raises:

ValueError – If the arrays have different shapes or the parameters are invalid
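One way score-based weighting could work is sketched below. The documentation does not state how scores are turned into weights, so the scheme here is an assumption: weights proportional to the score when higher is better, proportional to the inverse score otherwise, with negative scores clipped to zero. `weighted_average` is a hypothetical name, not the library function.

```python
from typing import List, Optional

import numpy as np

LOWER_IS_BETTER = {"mse", "rmse", "mae"}


def weighted_average(
    arrays: List[np.ndarray],
    scores: List[float],
    metric: Optional[str] = None,
    higher_is_better: Optional[bool] = None,
) -> np.ndarray:
    """Hypothetical sketch of a score-weighted average of same-shape arrays."""
    if not arrays or len(arrays) != len(scores):
        raise ValueError("arrays and scores must be non-empty and of equal length")
    arrs = [np.asarray(a, dtype=float) for a in arrays]
    if any(a.shape != arrs[0].shape for a in arrs):
        raise ValueError("arrays have different shapes")
    if higher_is_better is None:
        if metric is None:
            raise ValueError("provide either metric or higher_is_better")
        higher_is_better = metric.lower() not in LOWER_IS_BETTER
    s = np.asarray(scores, dtype=float)
    # Assumed scheme: weight ~ score (higher is better) or ~ 1/score (lower is better).
    raw = s if higher_is_better else 1.0 / np.maximum(s, 1e-12)
    raw = np.clip(raw, 0.0, None)  # e.g. a negative r2 gets zero weight
    if raw.sum() == 0:
        raw = np.ones_like(raw)  # fall back to a plain mean
    weights = raw / raw.sum()
    # Contract the model axis: sum_i weights[i] * arrs[i].
    return np.tensordot(weights, np.stack(arrs), axes=1)


preds = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
by_rmse = weighted_average(preds, scores=[0.1, 0.3], metric="rmse")
by_r2 = weighted_average(preds, scores=[0.6, 0.2], metric="r2")
```

Both calls yield weights of 0.75 and 0.25, so each returns approximately [1.5, 2.5]: under RMSE the first model wins because its error is three times smaller, under r2 because its score is three times larger.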