nirs4all.controllers.models.meta_model module
MetaModel Controller - Controller for meta-model stacking.
This controller handles MetaModel operators by:
1. Collecting out-of-fold (OOF) predictions from source models
2. Constructing training features from these predictions
3. Training the meta-learner on these features
4. Storing predictions with proper metadata for serialization
The controller prevents data leakage by using only validation partition predictions from each fold to construct the training set.
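The leakage-avoidance idea can be illustrated with a minimal sketch. It is not the nirs4all implementation: the helper names (`fit_linear`, `oof_predictions`), the least-squares base model, and the fold assignment are all assumptions made for illustration. The point is that each training sample's prediction comes from a model that never saw that sample.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with intercept; returns a predict function (illustrative base model)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda X_new: np.column_stack([X_new, np.ones(len(X_new))]) @ coef

def oof_predictions(X, y, n_folds=3, seed=0):
    """Return one out-of-fold prediction per training sample (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    fold_of = rng.permutation(len(X)) % n_folds  # fold id assigned to each sample
    oof = np.full(len(X), np.nan)
    for k in range(n_folds):
        val = fold_of == k
        predict = fit_linear(X[~val], y[~val])   # train on the other folds only
        oof[val] = predict(X[val])               # predict the held-out fold
    return oof, fold_of

# Synthetic data, used only to exercise the sketch.
rng = np.random.default_rng(42)
X = rng.normal(size=(30, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=30)
oof, fold_of = oof_predictions(X, y)
```

The resulting `oof` vector has one entry per training sample and could serve as one column of the meta-learner's training matrix.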
Phase 2 Enhancement: Delegates OOF reconstruction to TrainingSetReconstructor for cleaner separation of concerns and more robust coverage handling.
Phase 3 Enhancement: Implements prediction mode with dependency resolution and meta-model artifact persistence with source model references.
- class nirs4all.controllers.models.meta_model.MetaModelController
Bases: SklearnModelController
Controller for meta-model stacking using pipeline predictions.
This controller handles MetaModel operators, constructing training features from out-of-fold predictions of previous models. It extends SklearnModelController since the meta-learner is always sklearn-compatible.
The key difference from regular model controllers is that get_xy() returns features constructed from predictions rather than the original dataset features.
- Key Behavior:
Works INDEPENDENTLY of branches (no branch awareness required for basic case)
Queries prediction_store for ALL models from previous steps
Does NOT modify execution context (unlike MergeController)
For branch-aware stacking, uses BranchScope configuration
- priority
Controller priority (5). A lower number means higher precedence, so this controller is selected ahead of SklearnModelController (6), which ensures that MetaModel operators are handled by this controller.
- Type:
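As a rough illustration of how such a priority could resolve which controller handles an operator, here is a minimal sketch. The `ControllerSketch` class, the `handles` field, and `pick_controller` are hypothetical names invented for this example. The only thing taken from the documentation above is that priority 5 wins over priority 6.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ControllerSketch:
    """Hypothetical stand-in for a controller class (not the nirs4all API)."""
    name: str
    priority: int          # lower number = higher precedence (as the values 5 vs 6 imply)
    handles: frozenset     # operator kinds this controller claims to match

def pick_controller(controllers, operator_kind):
    """Return the matching controller with the lowest priority number."""
    matching = [c for c in controllers if operator_kind in c.handles]
    return min(matching, key=lambda c: c.priority)

controllers = [
    ControllerSketch("SklearnModelController", 6, frozenset({"sklearn_model", "meta_model"})),
    ControllerSketch("MetaModelController", 5, frozenset({"meta_model"})),
]

chosen_meta = pick_controller(controllers, "meta_model")
chosen_plain = pick_controller(controllers, "sklearn_model")
```

In this sketch both controllers match a meta-model operator, and the lower priority number decides in favour of the meta-model controller.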
- execute(step_info: ParsedStep, dataset: SpectroDataset, context: ExecutionContext, runtime_context: RuntimeContext, source: int = -1, mode: str = 'train', loaded_binaries: List[Tuple[str, bytes]] | None = None, prediction_store: Any | None = None) → Tuple[ExecutionContext, List[Tuple[str, bytes]]]
Execute meta-model controller.
Stores MetaModel operator and prediction_store in context for use by get_xy(). Also stores source models for artifact persistence in Phase 3.
- Parameters:
step_info – Parsed step with MetaModel operator.
dataset – SpectroDataset.
context – Execution context.
runtime_context – Runtime context.
source – Data source index.
mode – Execution mode.
loaded_binaries – Pre-loaded model binaries.
prediction_store – Predictions store.
- Returns:
Tuple of (updated_context, list_of_binaries).
- get_xy(dataset: SpectroDataset, context: ExecutionContext) → Tuple[Any, Any, Any, Any, Any, Any]
Extract train/test splits using meta-features from predictions.
Instead of using the original dataset features, this constructs features from out-of-fold predictions of source models.
- For training:
X_train: OOF predictions from source models, with shape (n_train_samples, n_source_models)
y_train: Original target values
- For test:
X_test: Aggregated source model test predictions
y_test: Original target values
- Parameters:
dataset – SpectroDataset with partitioned data.
context – Execution context with partition and branch info.
- Returns:
Tuple of (X_train, y_train, X_test, y_test, y_train_unscaled, y_test_unscaled) where X_train and X_test are meta-features from predictions.
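The shape of the meta-features can be sketched as follows. This is an illustration under stated assumptions, not the controller's code: the dictionaries, the model names, and the use of a per-fold mean as the test "aggregation" are all hypothetical, since the documentation does not say which aggregation is applied.

```python
import numpy as np

# Hypothetical inputs: one OOF prediction vector per source model on the
# training set, and per-fold predictions on the test set.
oof_by_model = {
    "PLS":   np.array([1.0, 2.0, 3.0, 4.0]),
    "Ridge": np.array([1.1, 1.9, 3.2, 3.8]),
}
test_folds_by_model = {
    "PLS":   np.array([[5.0, 6.0], [5.2, 6.2]]),   # (n_folds, n_test_samples)
    "Ridge": np.array([[4.8, 6.1], [5.0, 5.9]]),
}
model_order = ["PLS", "Ridge"]  # a fixed column order keeps train and test aligned

# One column per source model: (n_train_samples, n_source_models).
X_train = np.column_stack([oof_by_model[m] for m in model_order])

# Assumption: test predictions are aggregated by averaging across folds.
X_test = np.column_stack(
    [test_folds_by_model[m].mean(axis=0) for m in model_order]
)
```

Keeping the same column order for `X_train` and `X_test` matters, because the meta-learner associates each column with a specific source model.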
- nirs4all.controllers.models.meta_model.build_unique_source_names(source_models: List[ModelCandidate]) → Tuple[List[str], Dict[str, Tuple[str, int | None]]]
Build unique source model names with branch disambiguation.
When the same model_name appears in multiple branches (e.g., three Ridge_MetaModel entries from different branches), this function creates unique names by appending branch suffixes (e.g., “Ridge_MetaModel_br0”, “Ridge_MetaModel_br1”).
Models that appear in only one branch keep their original names.
- Parameters:
source_models – List of ModelCandidate objects from selection.
- Returns:
Tuple of (unique_names, branch_map):
unique_names: List of unique source model names (order preserved).
branch_map: Dict mapping unique_name -> (original_name, branch_id). Only contains entries for models that needed disambiguation.
- Return type:
Tuple[List[str], Dict[str, Tuple[str, int | None]]]
Example
>>> candidates = [
...     ModelCandidate("PLS", ..., branch_id=0),
...     ModelCandidate("Ridge", ..., branch_id=0),
...     ModelCandidate("Ridge", ..., branch_id=1),
...     ModelCandidate("Ridge", ..., branch_id=2),
... ]
>>> names, branch_map = build_unique_source_names(candidates)
>>> names
['PLS', 'Ridge_br0', 'Ridge_br1', 'Ridge_br2']
>>> branch_map
{'Ridge_br0': ('Ridge', 0), 'Ridge_br1': ('Ridge', 1), 'Ridge_br2': ('Ridge', 2)}
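The disambiguation rule described above can be approximated with a short standalone sketch. It is a re-implementation written from the docstring alone, not the library's code: `Candidate` is a hypothetical stand-in for ModelCandidate (whose real fields are not shown here), and `unique_source_names_sketch` is an invented name.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

@dataclass
class Candidate:
    """Hypothetical stand-in for ModelCandidate with only the fields the sketch needs."""
    model_name: str
    branch_id: Optional[int] = None

def unique_source_names_sketch(
    candidates: List[Candidate],
) -> Tuple[List[str], Dict[str, Tuple[str, Optional[int]]]]:
    # First pass: record which branches each model name appears in.
    branches_by_name = defaultdict(set)
    for c in candidates:
        branches_by_name[c.model_name].add(c.branch_id)

    names: List[str] = []
    branch_map: Dict[str, Tuple[str, Optional[int]]] = {}
    # Second pass: suffix only the names that occur in more than one branch.
    for c in candidates:
        if len(branches_by_name[c.model_name]) > 1:
            unique = f"{c.model_name}_br{c.branch_id}"
            branch_map[unique] = (c.model_name, c.branch_id)
        else:
            unique = c.model_name
        if unique not in names:  # preserve first-seen order
            names.append(unique)
    return names, branch_map

names, branch_map = unique_source_names_sketch([
    Candidate("PLS", 0),
    Candidate("Ridge", 0),
    Candidate("Ridge", 1),
    Candidate("Ridge", 2),
])
```

With these four candidates the sketch reproduces the names and mapping shown in the documented example. How the real function treats a missing branch_id is not stated, so that case is left unhandled here.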