nirs4all.controllers.models.stacking.serialization module

Meta-Model Serialization - Artifact persistence for meta-model stacking.

This module provides dataclasses and utilities for persisting meta-model artifacts with complete source model dependency tracking.

Phase 3 Implementation - Key components:

  1. SourceModelReference: Reference to a source model with feature mapping

  2. MetaModelArtifact: Complete artifact for meta-model persistence

  3. MetaModelSerializer: Handles serialization/deserialization

The meta-model serialization captures:

  • The trained meta-learner itself (via artifact_registry)

  • Ordered references to source models (for feature column alignment)

  • Stacking configuration (coverage strategy, aggregation, etc.)

  • Branch context (for validation during prediction)

class nirs4all.controllers.models.stacking.serialization.MetaModelArtifact(meta_model_type: str, meta_model_name: str, meta_learner_class: str, source_models: List[SourceModelReference], feature_columns: List[str], stacking_config: Dict[str, Any], selector_config: Dict[str, Any] | None = None, branch_context: Dict[str, Any] | None = None, use_proba: bool = False, n_folds: int = 0, coverage_ratio: float = 1.0, artifact_id: str = '', training_timestamp: str = <factory>, task_type: str = 'regression', n_classes: int | None = None, feature_to_model_mapping: Dict[str, str] | None = None)

Bases: object

Complete artifact for meta-model persistence.

Contains all information needed to:

  • Reload the meta-model and its dependencies

  • Reconstruct feature columns in the correct order

  • Validate branch context during prediction

  • Apply the same stacking configuration

meta_model_type

Type identifier (“MetaModel”).

Type:

str

meta_model_name

Display name of the meta-model.

Type:

str

meta_learner_class

Class name of the meta-learner (e.g., “Ridge”).

Type:

str

source_models

Ordered list of source model references.

Type:

List[nirs4all.controllers.models.stacking.serialization.SourceModelReference]

feature_columns

Feature column names in order.

Type:

List[str]

stacking_config

Serialized stacking configuration.

Type:

Dict[str, Any]

selector_config

Configuration of the model selector used.

Type:

Dict[str, Any] | None

branch_context

Branch context during training.

Type:

Dict[str, Any] | None

use_proba

Whether probability features were used.

Type:

bool

n_folds

Number of cross-validation folds.

Type:

int

coverage_ratio

OOF coverage ratio achieved during training.

Type:

float

artifact_id

The artifact ID for the meta-model itself.

Type:

str

training_timestamp

ISO timestamp of training.

Type:

str

Example

>>> artifact = MetaModelArtifact(
...     meta_model_type="MetaModel",
...     meta_model_name="MetaModel_Ridge",
...     meta_learner_class="Ridge",
...     source_models=[ref1, ref2],
...     feature_columns=["PLS_pred", "RF_pred"],
...     stacking_config=stacking_config_dict,
...     branch_context={"branch_id": None},
...     use_proba=False,
...     n_folds=5,
...     coverage_ratio=1.0,
...     artifact_id="0001:5:all",
...     training_timestamp="2024-12-12T14:30:22Z"
... )
artifact_id: str = ''
branch_context: Dict[str, Any] | None = None
coverage_ratio: float = 1.0
feature_columns: List[str]
feature_to_model_mapping: Dict[str, str] | None = None
classmethod from_dict(data: Dict[str, Any]) → MetaModelArtifact

Create from dictionary.

classmethod from_json(json_str: str) → MetaModelArtifact

Deserialize from JSON string.

get_source_artifact_ids() → List[str]

Get ordered list of source model artifact IDs.

Returns:

List of artifact IDs in feature column order.

get_source_by_index(index: int) → SourceModelReference | None

Get source model reference by feature index.

Parameters:

index – Feature column index.

Returns:

SourceModelReference, or None if the index is out of range.
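
As a rough illustration of the two lookups above, the sketch below uses a hypothetical stand-in class (Ref) instead of the real SourceModelReference. It assumes that "feature column order" means ascending feature_index and that an out-of-range index simply has no matching reference. The helper names are illustrative and are not part of nirs4all.

```python
from typing import List, NamedTuple, Optional


class Ref(NamedTuple):
    """Hypothetical stand-in for SourceModelReference (two fields only)."""
    artifact_id: str
    feature_index: int


def source_artifact_ids(refs: List[Ref]) -> List[str]:
    # Assumption: feature column order is ascending feature_index.
    return [r.artifact_id for r in sorted(refs, key=lambda r: r.feature_index)]


def source_by_index(refs: List[Ref], index: int) -> Optional[Ref]:
    # Return the reference whose feature_index matches, or None if there is none.
    for ref in refs:
        if ref.feature_index == index:
            return ref
    return None


# References deliberately stored out of order to show the sorting step.
refs = [Ref("0001:4:all", 1), Ref("0001:3:all", 0)]
ids = source_artifact_ids(refs)
first = source_by_index(refs, 0)
missing = source_by_index(refs, 5)
```

Returning None for an unknown index, rather than raising, lets a caller probe for a column without wrapping the lookup in a try/except.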

meta_learner_class: str
meta_model_name: str
meta_model_type: str
n_classes: int | None = None
n_folds: int = 0
selector_config: Dict[str, Any] | None = None
source_models: List[SourceModelReference]
stacking_config: Dict[str, Any]
task_type: str = 'regression'
to_dict() → Dict[str, Any]

Convert to dictionary for JSON/YAML serialization.

to_json() → str

Serialize to JSON string.
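
The to_dict/from_dict and to_json/from_json pairs suggest a round trip through plain dictionaries. The following is a minimal sketch of that pattern using hypothetical stand-in dataclasses (SketchRef, SketchArtifact) with only a subset of fields. It is not the nirs4all implementation, which may handle nested objects differently.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List


@dataclass
class SketchRef:
    """Hypothetical stand-in for SourceModelReference (subset of fields)."""
    model_name: str
    artifact_id: str
    feature_index: int


@dataclass
class SketchArtifact:
    """Hypothetical stand-in for MetaModelArtifact (subset of fields)."""
    meta_model_name: str
    source_models: List[SketchRef]
    feature_columns: List[str]
    n_folds: int = 0

    def to_dict(self) -> Dict[str, Any]:
        # asdict recurses into nested dataclasses and returns plain dicts/lists.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "SketchArtifact":
        data = dict(data)  # shallow copy so the caller's dict is not mutated
        # Nested references come back as dicts and must be rebuilt explicitly.
        data["source_models"] = [SketchRef(**d) for d in data["source_models"]]
        return cls(**data)

    def to_json(self) -> str:
        return json.dumps(self.to_dict())

    @classmethod
    def from_json(cls, json_str: str) -> "SketchArtifact":
        return cls.from_dict(json.loads(json_str))


artifact = SketchArtifact(
    meta_model_name="MetaModel_Ridge",
    source_models=[SketchRef("PLSRegression", "0001:3:all", 0)],
    feature_columns=["PLS_pred"],
    n_folds=5,
)
restored = SketchArtifact.from_json(artifact.to_json())
```

The explicit rebuild of source_models in from_dict is the step that is easy to forget: without it, the restored object would hold dictionaries where references are expected.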

training_timestamp: str
use_proba: bool = False
validate_feature_alignment() → bool

Validate that feature columns match source models.

Returns:

True if alignment is valid.
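
The exact rule applied by validate_feature_alignment is not described here. One plausible reading, sketched below under that assumption, is that there is one meta-feature column per source model and that the feature_index values cover 0..n-1 exactly once. The function name features_aligned is hypothetical, and the rule would need adjusting if probability features produce several columns per model.

```python
from typing import Sequence


def features_aligned(feature_columns: Sequence[str],
                     feature_indices: Sequence[int]) -> bool:
    # Assumption: one column per source model, indices 0..n-1 each used once.
    n = len(feature_columns)
    return len(feature_indices) == n and sorted(feature_indices) == list(range(n))


ok = features_aligned(["PLS_pred", "RF_pred"], [1, 0])
too_many_refs = features_aligned(["PLS_pred"], [0, 1])
duplicate_index = features_aligned(["PLS_pred", "RF_pred"], [0, 0])
```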

class nirs4all.controllers.models.stacking.serialization.MetaModelSerializer

Bases: object

Handles serialization and deserialization of meta-model artifacts.

Provides methods to:

  • Build MetaModelArtifact from training context

  • Convert to/from MetaModelConfig for artifact registry

  • Validate artifact completeness

Example

>>> serializer = MetaModelSerializer()
>>> artifact = serializer.build_artifact(
...     meta_operator=meta_model_op,
...     source_models=selected_sources,
...     reconstruction_result=result,
...     context=execution_context
... )
>>> config = serializer.to_meta_model_config(artifact)
build_artifact(meta_operator: MetaModel, source_models: List[ModelCandidate], reconstruction_result: ReconstructionResult | None = None, context: ExecutionContext | None = None, artifact_id: str = '') → MetaModelArtifact

Build MetaModelArtifact from training context.

Parameters:
  • meta_operator – The MetaModel operator being trained.

  • source_models – List of selected source model candidates.

  • reconstruction_result – Optional result from TrainingSetReconstructor.

  • context – Optional execution context for branch info.

  • artifact_id – The artifact ID for this meta-model.

Returns:

MetaModelArtifact ready for persistence.

to_meta_model_config(artifact: MetaModelArtifact) → MetaModelConfig

Convert MetaModelArtifact to MetaModelConfig for registry.

The ArtifactRegistry uses MetaModelConfig to track source model dependencies. This method creates the appropriate config.

Parameters:

artifact – MetaModelArtifact to convert.

Returns:

MetaModelConfig for artifact registry.

validate_artifact(artifact: MetaModelArtifact) → List[str]

Validate artifact completeness and consistency.

Parameters:

artifact – MetaModelArtifact to validate.

Returns:

List of validation error messages (empty if valid).
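
Returning a list of messages, rather than raising on the first problem, lets a caller report every issue at once. The sketch below shows that pattern on a plain dictionary. The individual checks (non-empty artifact_id, at least one source model, coverage_ratio within [0, 1]) are assumptions chosen for illustration and are not the checks nirs4all is known to perform.

```python
from typing import Any, Dict, List


def validate_sketch(artifact: Dict[str, Any]) -> List[str]:
    """Collect every problem found instead of raising on the first one."""
    errors: List[str] = []
    # Hypothetical check: the artifact must carry its own ID.
    if not artifact.get("artifact_id"):
        errors.append("artifact_id is empty")
    # Hypothetical check: a meta-model needs at least one source model.
    if not artifact.get("source_models"):
        errors.append("no source models referenced")
    # Hypothetical check: a ratio is expected to lie in [0, 1].
    ratio = artifact.get("coverage_ratio", 1.0)
    if not 0.0 <= ratio <= 1.0:
        errors.append(f"coverage_ratio out of range: {ratio}")
    return errors


good = validate_sketch(
    {"artifact_id": "0001:5:all", "source_models": ["0001:3:all"], "coverage_ratio": 1.0}
)
bad = validate_sketch({"artifact_id": "", "source_models": [], "coverage_ratio": 1.5})
```

An empty list means the artifact passed, so a caller can write `if errors:` to decide whether to abort persistence.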

class nirs4all.controllers.models.stacking.serialization.SourceModelReference(model_name: str, model_classname: str, step_idx: int, artifact_id: str, feature_index: int, fold_id: str | None = None, branch_id: int | None = None, branch_name: str | None = None, branch_path: List[int] | None = None, val_score: float | None = None, metric: str | None = None)

Bases: object

Reference to a source model used in stacking.

Stores all information needed to locate and validate a source model during prediction mode.

model_name

Display name of the model (e.g., “PLSRegression”).

Type:

str

model_classname

Full class name (e.g., “sklearn.cross_decomposition.PLSRegression”).

Type:

str

step_idx

Pipeline step index where the model was trained.

Type:

int

artifact_id

Unique artifact ID for loading the model binary.

Type:

str

feature_index

Column index in meta-features matrix.

Type:

int

fold_id

Fold ID, set only when the reference is fold-specific.

Type:

str | None

branch_id

Branch ID where model was trained.

Type:

int | None

branch_name

Branch name where model was trained.

Type:

str | None

branch_path

Full branch path for nested branches.

Type:

List[int] | None

val_score

Validation score for weighted averaging.

Type:

float | None

metric

Metric used for scoring (e.g., “r2”, “rmse”).

Type:

str | None

Example

>>> ref = SourceModelReference(
...     model_name="PLSRegression",
...     model_classname="sklearn.cross_decomposition.PLSRegression",
...     step_idx=3,
...     artifact_id="0001:3:all",
...     feature_index=0,
...     branch_id=None,
...     val_score=0.92,
...     metric="r2"
... )
artifact_id: str
branch_id: int | None = None
branch_name: str | None = None
branch_path: List[int] | None = None
feature_index: int
fold_id: str | None = None
classmethod from_dict(data: Dict[str, Any]) → SourceModelReference

Create from dictionary.

metric: str | None = None
model_classname: str
model_name: str
step_idx: int
to_dict() → Dict[str, Any]

Convert to dictionary for JSON/YAML serialization.

val_score: float | None = None
nirs4all.controllers.models.stacking.serialization.stacking_config_from_dict(data: Dict[str, Any]) → StackingConfig

Create StackingConfig from dictionary.

Parameters:

data – Dictionary with config values.

Returns:

StackingConfig instance.

nirs4all.controllers.models.stacking.serialization.stacking_config_to_dict(config: StackingConfig) → Dict[str, Any]

Convert StackingConfig to serializable dictionary.

Parameters:

config – StackingConfig instance.

Returns:

Dictionary with string enum values.
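
"String enum values" suggests that enum members are stored by their value so the dictionary can be written to JSON or YAML, then looked up again on load. The sketch below shows that conversion with a hypothetical enum (Coverage) and a hypothetical config class (SketchConfig). The real StackingConfig fields and enum members are not listed on this page, so every name here is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict


class Coverage(Enum):
    """Hypothetical enum; the real StackingConfig members are not shown here."""
    STRICT = "strict"
    DROP_INCOMPLETE = "drop_incomplete"


@dataclass
class SketchConfig:
    """Hypothetical stand-in for StackingConfig."""
    coverage_strategy: Coverage = Coverage.STRICT
    min_coverage_ratio: float = 1.0


def config_to_dict(config: SketchConfig) -> Dict[str, Any]:
    # Store the enum by its value so the dict holds only plain strings/numbers.
    return {
        "coverage_strategy": config.coverage_strategy.value,
        "min_coverage_ratio": config.min_coverage_ratio,
    }


def config_from_dict(data: Dict[str, Any]) -> SketchConfig:
    # Coverage(value) looks the member up by value; unknown values raise ValueError.
    return SketchConfig(
        coverage_strategy=Coverage(data.get("coverage_strategy", "strict")),
        min_coverage_ratio=float(data.get("min_coverage_ratio", 1.0)),
    )


original = SketchConfig(Coverage.DROP_INCOMPLETE, 0.8)
as_dict = config_to_dict(original)
restored = config_from_dict(as_dict)
```

Falling back to defaults for missing keys keeps older serialized configs loadable, at the cost of silently masking a truncated dictionary; whether nirs4all makes the same trade-off is not stated here.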