nirs4all.controllers.models.meta_model module

MetaModel Controller - Controller for meta-model stacking.

This controller handles MetaModel operators by:
  1. Collecting out-of-fold (OOF) predictions from source models

  2. Constructing training features from these predictions

  3. Training the meta-learner on these features

  4. Storing predictions with proper metadata for serialization

The controller prevents data leakage by using only validation partition predictions from each fold to construct the training set.
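
The leakage rule above can be illustrated with a minimal, library-free sketch. It is not the nirs4all implementation: the fold layout, the `fit_mean` base model, and every function name here are assumptions chosen for illustration. The point it shows is that each sample's meta-feature comes only from the fold in which that sample was held out for validation.

```python
from statistics import mean


def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx) pairs; every sample is validated exactly once."""
    for fold in range(n_folds):
        val_idx = [i for i in range(n_samples) if i % n_folds == fold]
        train_idx = [i for i in range(n_samples) if i % n_folds != fold]
        yield train_idx, val_idx


def fit_mean(y_train):
    """Hypothetical base model: always predicts the mean of its training targets."""
    mu = mean(y_train)
    return lambda n_rows: [mu] * n_rows


def out_of_fold_predictions(y, n_folds):
    """Fill each sample's slot with a prediction from a model that never saw it."""
    oof = [None] * len(y)
    for train_idx, val_idx in k_fold_indices(len(y), n_folds):
        model = fit_mean([y[i] for i in train_idx])
        for i, pred in zip(val_idx, model(len(val_idx))):
            oof[i] = pred  # validation-partition prediction only
    return oof


y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
oof = out_of_fold_predictions(y, n_folds=3)
print(oof)  # [4.0, 3.5, 3.0, 4.0, 3.5, 3.0]
```

Sample 0 (target 1.0) receives 4.0, the mean of the four targets outside its fold, so its own target never influences the feature the meta-learner trains on.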

Phase 2 Enhancement: Delegates OOF reconstruction to TrainingSetReconstructor for cleaner separation of concerns and more robust coverage handling.

Phase 3 Enhancement: Implements prediction mode with dependency resolution and meta-model artifact persistence with source model references.

class nirs4all.controllers.models.meta_model.MetaModelController

Bases: SklearnModelController

Controller for meta-model stacking using pipeline predictions.

This controller handles MetaModel operators, constructing training features from out-of-fold predictions of previous models. It extends SklearnModelController since the meta-learner is always sklearn-compatible.

The key difference from regular model controllers is that get_xy() returns features constructed from predictions rather than the original dataset features.

Key Behavior:
  • Works INDEPENDENTLY of branches (no branch awareness is required for the basic case)

  • Queries prediction_store for ALL models from previous steps

  • Does NOT modify execution context (unlike MergeController)

  • For branch-aware stacking, uses BranchScope configuration

priority

Controller priority (5). This ranks ahead of SklearnModelController (6), which implies that a lower value takes precedence, and ensures that MetaModel operators are handled by this controller.

Type:

int
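
How a priority value and matches() might combine during dispatch can be sketched as follows. This is an assumption-laden illustration, not the nirs4all registry: `select_controller`, the `*Like` controller classes, and the stand-in `MetaModel` and `PlainEstimator` classes are all hypothetical, and the "lowest number wins" rule is inferred from the description of priority 5 ranking ahead of 6.

```python
class MetaModel:
    """Hypothetical stand-in for the MetaModel operator."""

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [0.0 for _ in X]


class PlainEstimator:
    """Hypothetical sklearn-style estimator that is not a MetaModel."""

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [0.0 for _ in X]


class SklearnLikeController:
    priority = 6

    @classmethod
    def matches(cls, step, operator, keyword):
        # Assumed rule: anything exposing fit/predict is sklearn-compatible.
        return hasattr(operator, "fit") and hasattr(operator, "predict")


class MetaModelLikeController(SklearnLikeController):
    priority = 5

    @classmethod
    def matches(cls, step, operator, keyword):
        return isinstance(operator, MetaModel)


def select_controller(controllers, step, operator, keyword=""):
    """Pick the matching controller with the lowest priority number (assumed rule)."""
    candidates = [c for c in controllers if c.matches(step, operator, keyword)]
    if not candidates:
        raise LookupError("no controller matches this operator")
    return min(candidates, key=lambda c: c.priority)


registry = [SklearnLikeController, MetaModelLikeController]
chosen_for_meta = select_controller(registry, step={}, operator=MetaModel())
chosen_for_plain = select_controller(registry, step={}, operator=PlainEstimator())
```

Both controllers match a MetaModel in this sketch because it exposes fit and predict, so the priority value is what breaks the tie in favour of the meta-model controller.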

use_reconstructor

If True, delegate OOF reconstruction to TrainingSetReconstructor.

Type:

bool

execute(step_info: ParsedStep, dataset: SpectroDataset, context: ExecutionContext, runtime_context: RuntimeContext, source: int = -1, mode: str = 'train', loaded_binaries: List[Tuple[str, bytes]] | None = None, prediction_store: Any | None = None) → Tuple[ExecutionContext, List[Tuple[str, bytes]]]

Execute meta-model controller.

Stores MetaModel operator and prediction_store in context for use by get_xy(). Also stores source models for artifact persistence in Phase 3.

Parameters:
  • step_info – Parsed step with MetaModel operator.

  • dataset – SpectroDataset.

  • context – Execution context.

  • runtime_context – Runtime context.

  • source – Data source index.

  • mode – Execution mode.

  • loaded_binaries – Pre-loaded model binaries.

  • prediction_store – Predictions store.

Returns:

Tuple of (updated_context, list_of_binaries).

get_xy(dataset: SpectroDataset, context: ExecutionContext) → Tuple[Any, Any, Any, Any, Any, Any]

Extract train/test splits using meta-features from predictions.

Instead of using the original dataset features, this constructs features from out-of-fold predictions of source models.

For training:
  • X_train: OOF predictions from source models (n_train_samples, n_source_models)

  • y_train: Original target values

For test:
  • X_test: Aggregated source model test predictions

  • y_test: Original target values

Parameters:
  • dataset – SpectroDataset with partitioned data.

  • context – Execution context with partition and branch info.

Returns:

Tuple of (X_train, y_train, X_test, y_test, y_train_unscaled, y_test_unscaled) where X_train and X_test are meta-features from predictions.
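
The shapes described above can be made concrete with a small sketch. It is illustrative only: `build_meta_features` is a hypothetical helper, the inputs are made-up numbers, and averaging the per-fold test predictions is an assumption, since the text says only that test predictions are "aggregated".

```python
def build_meta_features(oof_by_model, test_preds_by_model):
    """Assemble meta-feature matrices from source-model predictions.

    oof_by_model:        {model_name: [OOF prediction per training sample]}
    test_preds_by_model: {model_name: [[prediction per test sample] per fold]}
    Returns (X_train, X_test), each with one column per source model.
    """
    names = list(oof_by_model)  # column order follows insertion order

    # X_train: one row per training sample, one column per source model.
    n_train = len(oof_by_model[names[0]])
    X_train = [[oof_by_model[name][i] for name in names] for i in range(n_train)]

    # Assumed aggregation: average each test sample's prediction across folds.
    averaged = {
        name: [sum(col) / len(col) for col in zip(*test_preds_by_model[name])]
        for name in names
    }
    n_test = len(averaged[names[0]])
    X_test = [[averaged[name][j] for name in names] for j in range(n_test)]
    return X_train, X_test


oof_by_model = {"PLS": [1.0, 2.0, 3.0], "Ridge": [1.5, 2.5, 3.5]}
test_preds_by_model = {
    "PLS": [[4.0, 5.0], [6.0, 7.0]],    # two folds, two test samples
    "Ridge": [[4.0, 4.0], [5.0, 6.0]],
}
X_train, X_test = build_meta_features(oof_by_model, test_preds_by_model)
print(X_train)  # [[1.0, 1.5], [2.0, 2.5], [3.0, 3.5]]
print(X_test)   # [[5.0, 4.5], [6.0, 5.0]]
```

With three training samples and two source models, X_train has shape (3, 2), matching the (n_train_samples, n_source_models) layout; y_train and y_test would still be the original targets.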

classmethod matches(step: Any, operator: Any, keyword: str) → bool

Match MetaModel operators.

Parameters:
  • step – Pipeline step configuration.

  • operator – Instantiated operator object.

  • keyword – Pipeline keyword (unused).

Returns:

True if the operator is a MetaModel instance.

priority: int = 5
use_reconstructor: bool = True

nirs4all.controllers.models.meta_model.build_unique_source_names(source_models: List[ModelCandidate]) → Tuple[List[str], Dict[str, Tuple[str, int | None]]]

Build unique source model names with branch disambiguation.

When the same model_name appears in multiple branches (e.g., three Ridge_MetaModel models from different branches), this function creates unique names by appending a branch suffix (e.g., "Ridge_MetaModel_br0", "Ridge_MetaModel_br1").

Models that appear in only one branch keep their original names.

Parameters:

source_models – List of ModelCandidate objects from selection.

Returns:

Tuple of (unique_names, branch_map):
  • unique_names: List of unique source model names (order preserved).

  • branch_map: Dict mapping unique_name -> (original_name, branch_id). Only contains entries for models that needed disambiguation.

Example

>>> candidates = [
...     ModelCandidate("PLS", ..., branch_id=0),
...     ModelCandidate("Ridge", ..., branch_id=0),
...     ModelCandidate("Ridge", ..., branch_id=1),
...     ModelCandidate("Ridge", ..., branch_id=2),
... ]
>>> names, branch_map = build_unique_source_names(candidates)
>>> names
['PLS', 'Ridge_br0', 'Ridge_br1', 'Ridge_br2']
>>> branch_map
{'Ridge_br0': ('Ridge', 0), 'Ridge_br1': ('Ridge', 1), 'Ridge_br2': ('Ridge', 2)}
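
The documented behaviour can be reproduced with a short sketch. This is not the library's source: the `ModelCandidate` dataclass below is a hypothetical stand-in with only the two fields the example needs, the function is deliberately named `build_unique_source_names_sketch`, and leaving candidates without a branch_id unsuffixed is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ModelCandidate:
    """Hypothetical stand-in; the real class carries more fields."""
    model_name: str
    branch_id: Optional[int] = None


def build_unique_source_names_sketch(
    source_models: List[ModelCandidate],
) -> Tuple[List[str], Dict[str, Tuple[str, Optional[int]]]]:
    # First pass: record which branches each model name appears in.
    branches_per_name = defaultdict(set)
    for candidate in source_models:
        branches_per_name[candidate.model_name].add(candidate.branch_id)

    # Second pass: suffix only names that occur in more than one branch.
    unique_names: List[str] = []
    branch_map: Dict[str, Tuple[str, Optional[int]]] = {}
    for candidate in source_models:
        ambiguous = len(branches_per_name[candidate.model_name]) > 1
        if ambiguous and candidate.branch_id is not None:
            unique = f"{candidate.model_name}_br{candidate.branch_id}"
            branch_map[unique] = (candidate.model_name, candidate.branch_id)
        else:
            unique = candidate.model_name  # single-branch names stay unchanged
        unique_names.append(unique)
    return unique_names, branch_map


candidates = [
    ModelCandidate("PLS", branch_id=0),
    ModelCandidate("Ridge", branch_id=0),
    ModelCandidate("Ridge", branch_id=1),
    ModelCandidate("Ridge", branch_id=2),
]
names, branch_map = build_unique_source_names_sketch(candidates)
```

Two passes keep the input order intact: the first only counts branches per name, and the second emits names in the order the candidates were given.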