nirs4all.controllers.models.stacking.classification module
Classification Support for Meta-Model Stacking.
Phase 5 Implementation. Provides utilities for:
1. Detecting classification vs. regression task types from predictions
2. Extracting probability features for classification stacking
3. Handling binary and multiclass classification scenarios
4. Generating meaningful feature names with class information
Key components:
- ClassificationFeatureExtractor: Extracts probability features from predictions
- TaskTypeDetector: Detects task type from prediction metadata
- FeatureNameGenerator: Creates descriptive feature names for meta-features
- class nirs4all.controllers.models.stacking.classification.ClassificationFeatureExtractor(classification_info: ClassificationInfo, use_proba: bool = False)[source]
Bases: object
Extracts classification features from predictions.
Handles extraction of probability features for binary and multiclass classification, with proper handling of different array shapes.
- extract_features(pred: Dict[str, Any], n_samples: int) ndarray[source]
Extract features from a single prediction entry.
- Parameters:
pred – Prediction dictionary with y_pred and optionally y_proba.
n_samples – Expected number of samples.
- Returns:
Feature array of shape (n_samples,) or (n_samples, n_features).
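The two return shapes can be illustrated with a minimal NumPy sketch. It is not the library's code: the function name `extract_features_sketch`, the explicit `n_classes` argument, and the choice of the last probability column as the positive class are assumptions made for illustration.

```python
from typing import Any, Dict

import numpy as np


def extract_features_sketch(
    pred: Dict[str, Any],
    n_samples: int,
    n_classes: int,
    use_proba: bool = False,
) -> np.ndarray:
    """Hypothetical stand-in for extract_features; not the nirs4all code."""
    y_proba = pred.get("y_proba")
    if use_proba and y_proba is not None:
        proba = np.asarray(y_proba, dtype=float)
        if proba.shape[0] != n_samples:
            raise ValueError("y_proba does not match the expected sample count")
        if proba.ndim == 1:
            # Already a single probability column, shape (n_samples,).
            return proba
        if n_classes == 2:
            # Assumption: the last column holds the positive-class probability.
            return proba[:, -1]
        # Multiclass: keep every class column, shape (n_samples, n_classes).
        return proba
    # Fallback: use the plain predictions, flattened to shape (n_samples,).
    y_pred = np.asarray(pred["y_pred"], dtype=float).reshape(-1)
    if y_pred.shape[0] != n_samples:
        raise ValueError("y_pred does not match the expected sample count")
    return y_pred
```

With `use_proba=True`, a binary entry yields a 1-D array and a multiclass entry yields a 2-D array; in every other case the sketch falls back to `y_pred`.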
- class nirs4all.controllers.models.stacking.classification.ClassificationInfo(task_type: StackingTaskType, n_classes: int | None = None, class_labels: List[Any] | None = None, has_probabilities: bool = False, proba_shape: Tuple[int, ...] | None = None)[source]
Bases: object
Information about classification task detected from predictions.
- task_type
Detected task type (regression/binary/multiclass).
- class_labels
Optional class labels if available.
- Type:
List[Any] | None
- get_n_features_per_model(use_proba: bool = False) int[source]
Get number of features per source model.
- Parameters:
use_proba – Whether probability features are requested.
- Returns:
Number of feature columns per source model:
- Regression: 1 (y_pred)
- Binary + use_proba: 1 (positive class probability)
- Multiclass + use_proba: n_classes (all class probabilities)
- Classification without use_proba: 1 (y_pred)
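The rule documented for get_n_features_per_model can be written out as a short sketch. The `TaskKind` enum and the function `n_features_per_model` are hypothetical names that mirror the documented behaviour; they are not imported from nirs4all.

```python
from enum import Enum
from typing import Optional


class TaskKind(Enum):
    """Hypothetical mirror of the documented StackingTaskType values."""

    REGRESSION = "regression"
    BINARY_CLASSIFICATION = "binary_classification"
    MULTICLASS_CLASSIFICATION = "multiclass_classification"


def n_features_per_model(
    task: TaskKind,
    n_classes: Optional[int] = None,
    use_proba: bool = False,
) -> int:
    """Return the number of meta-feature columns one source model contributes."""
    if task is TaskKind.MULTICLASS_CLASSIFICATION and use_proba:
        if n_classes is None:
            raise ValueError("n_classes is required for multiclass probabilities")
        # One column per class probability.
        return n_classes
    # Regression, binary probabilities, and plain class predictions
    # all contribute a single column.
    return 1
```

Only the multiclass case with probabilities widens the meta-feature matrix; every other combination adds one column per source model.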
- task_type: StackingTaskType
- class nirs4all.controllers.models.stacking.classification.FeatureNameGenerator(classification_info: ClassificationInfo, use_proba: bool = False, pattern: str = '{model_name}_pred')[source]
Bases: object
Generates meaningful feature names for meta-model.
Creates descriptive feature names that include model name and, for classification with probabilities, class information.
- generate_names(source_model_names: List[str]) List[str][source]
Generate feature names for all source models.
- Parameters:
source_model_names – List of source model names.
- Returns:
List of feature column names.
- get_feature_importance_mapping(source_model_names: List[str]) Dict[str, List[str]][source]
Get mapping from source models to their feature names.
Useful for feature importance analysis.
- Parameters:
source_model_names – List of source model names.
- Returns:
Dictionary mapping model name to list of feature names.
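A rough sketch of how names and the model-to-features mapping might fit together is shown here. Only the default pattern `'{model_name}_pred'` comes from the signature above; the `_class_<label>` suffix and both function names are assumptions, since the exact naming scheme for probability columns is not documented on this page.

```python
from typing import Dict, List, Optional, Sequence


def feature_names_for_model(
    model_name: str,
    class_labels: Optional[Sequence[object]] = None,
    pattern: str = "{model_name}_pred",
) -> List[str]:
    """Hypothetical naming helper; the class suffix format is an assumption."""
    base = pattern.format(model_name=model_name)
    if not class_labels:
        # Single column: regression, binary probability, or class prediction.
        return [base]
    # One column per class when multiclass probabilities are used.
    return [f"{base}_class_{label}" for label in class_labels]


def importance_mapping(
    source_model_names: Sequence[str],
    class_labels: Optional[Sequence[object]] = None,
) -> Dict[str, List[str]]:
    """Map each source model to the feature names it contributes."""
    return {
        name: feature_names_for_model(name, class_labels)
        for name in source_model_names
    }
```

For example, `importance_mapping(["pls", "rf"])` gives one `_pred` column per model, while passing class labels expands each model into one name per class.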
- class nirs4all.controllers.models.stacking.classification.MetaFeatureInfo(feature_names: ~typing.List[str], source_models: ~typing.List[str], feature_to_model: ~typing.Dict[str, str], classification_info: ~nirs4all.controllers.models.stacking.classification.ClassificationInfo, n_features_per_model: ~typing.Dict[str, int] = <factory>)[source]
Bases: object
Information about generated meta-features.
Used for tracking feature importance and providing interpretable results.
- classification_info
Classification metadata.
- aggregate_importance_by_model(feature_importances: Dict[str, float]) Dict[str, float][source]
Aggregate feature importances by source model.
Sums importance scores for all features from the same source model.
- Parameters:
feature_importances – Mapping from feature name to importance score.
- Returns:
Mapping from model name to aggregated importance.
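The summation described above can be sketched in a few lines. The standalone function `aggregate_by_model`, its explicit `feature_to_model` argument, and the decision to skip unknown feature names are assumptions; the documented method reads the mapping from the MetaFeatureInfo instance.

```python
from collections import defaultdict
from typing import Dict


def aggregate_by_model(
    feature_importances: Dict[str, float],
    feature_to_model: Dict[str, str],
) -> Dict[str, float]:
    """Sum importance scores of all features that share a source model."""
    totals: Dict[str, float] = defaultdict(float)
    for feature, score in feature_importances.items():
        model = feature_to_model.get(feature)
        if model is None:
            # Assumption: features without a known source model are ignored.
            continue
        totals[model] += score
    return dict(totals)
```

This is mainly useful when multiclass probabilities spread one model over several columns, so that per-column importances can be compared at the model level.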
- classification_info: ClassificationInfo
- class nirs4all.controllers.models.stacking.classification.StackingTaskType(value)[source]
Bases: Enum
Task type for stacking.
- REGRESSION
Regression task using y_pred as features.
- BINARY_CLASSIFICATION
Binary classification (2 classes).
- MULTICLASS_CLASSIFICATION
Multi-class classification (>2 classes).
- UNKNOWN
Could not determine task type.
- BINARY_CLASSIFICATION = 'binary_classification'
- MULTICLASS_CLASSIFICATION = 'multiclass_classification'
- REGRESSION = 'regression'
- UNKNOWN = 'unknown'
- class nirs4all.controllers.models.stacking.classification.TaskTypeDetector(prediction_store: Predictions)[source]
Bases: object
Detects task type from prediction metadata.
Uses prediction store metadata and y_proba presence to determine whether the stacking involves regression or classification.
- detect(source_model_names: List[str], context: ExecutionContext) ClassificationInfo[source]
Detect task type from source model predictions.
Examines predictions from source models to determine task type and gather classification metadata.
- Parameters:
source_model_names – List of source model names to examine.
context – Execution context with branch info.
- Returns:
ClassificationInfo with detected task type and metadata.
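One plausible detection heuristic, combining declared metadata with y_proba presence, is sketched below. It operates on plain dictionaries rather than the Predictions store, and the `task_type` metadata key, the function name, and the order of the checks are all assumptions rather than the library's actual logic.

```python
from typing import Any, Dict, List

KNOWN_KINDS = {
    "regression",
    "binary_classification",
    "multiclass_classification",
}


def detect_task_kind(preds: List[Dict[str, Any]]) -> str:
    """Hypothetical heuristic: metadata first, then y_proba width."""
    # 1. Trust an explicit declaration if any prediction carries one.
    for pred in preds:
        declared = pred.get("task_type")  # hypothetical metadata key
        if declared in KNOWN_KINDS:
            return declared
    # 2. Otherwise infer from the width of the first probability row found.
    for pred in preds:
        proba = pred.get("y_proba")
        if proba is not None and len(proba) > 0:
            first = proba[0]
            n_cols = len(first) if isinstance(first, (list, tuple)) else 1
            if n_cols > 2:
                return "multiclass_classification"
            return "binary_classification"
    # 3. Plain predictions without probabilities are treated as regression.
    if any(pred.get("y_pred") is not None for pred in preds):
        return "regression"
    return "unknown"
```

Step 3 would mislabel a classifier that exposes no probabilities, which is one reason to check metadata before falling back to array shapes.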
- nirs4all.controllers.models.stacking.classification.build_meta_feature_info(source_model_names: List[str], classification_info: ClassificationInfo, use_proba: bool = False, name_pattern: str = '{model_name}_pred') MetaFeatureInfo[source]
Build MetaFeatureInfo from source models and classification info.
- Parameters:
source_model_names – List of source model names.
classification_info – Classification metadata.
use_proba – Whether probability features are used.
name_pattern – Pattern for feature names.
- Returns:
MetaFeatureInfo with all mappings populated.