nirs4all.controllers.models.autogluon_model module
AutoGluon Model Controller - Controller for AutoGluon TabularPredictor
This controller handles AutoGluon TabularPredictor with support for:
- Automatic model selection and ensembling
- Training on tabular data (samples x features)
- Model persistence and prediction storage
- Integration with the nirs4all pipeline
AutoGluon differs from sklearn models in that:
- It trains an ensemble of models automatically
- It uses DataFrames internally, not numpy arrays
- It manages its own model directory for persistence
- It has its own hyperparameter tuning (no need for Optuna)
Lazy loading pattern: AutoGluon is only imported when actually needed for training or prediction, not at module import time.
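The lazy loading pattern described above can be sketched roughly as follows. This is a minimal illustration, not the controller's actual code: the `LazyBackend` class and its method names are hypothetical, and a standard-library module stands in for `autogluon.tabular` so the example runs without AutoGluon installed.

```python
import importlib


class LazyBackend:
    """Hypothetical helper that defers importing a module until first use."""

    def __init__(self, module_name: str):
        self._module_name = module_name
        self._module = None  # nothing is imported at construction time

    @property
    def loaded(self) -> bool:
        return self._module is not None

    def get(self):
        # Import on first use; later calls reuse the cached module object.
        if self._module is None:
            self._module = importlib.import_module(self._module_name)
        return self._module


# Assumption: a real controller would target "autogluon.tabular" here.
# "colorsys" is only a lightweight stand-in for the heavy dependency.
backend = LazyBackend("colorsys")
loaded_before = backend.loaded   # False: creating the helper imports nothing
colorsys = backend.get()         # the import happens here
loaded_after = backend.loaded    # True
```

The point of the design is that importing the controller module stays cheap, and the cost of loading the heavy dependency is paid only by pipelines that actually train or predict with it.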
- class nirs4all.controllers.models.autogluon_model.AutoGluonModelController
Bases: BaseModelController
Controller for AutoGluon TabularPredictor.
This controller handles AutoGluon models with automatic model selection, ensembling, and integration with the nirs4all pipeline.
AutoGluon automatically:
- Trains multiple models (LightGBM, CatBoost, XGBoost, Neural Networks, etc.)
- Performs cross-validation
- Creates weighted ensembles
- Handles hyperparameter tuning internally
Uses lazy loading: AutoGluon is imported only when it is first needed for training or prediction, not at module import time.
- priority
Controller priority (5). Lower numbers take precedence, so this controller is tried before the sklearn controller (6) when AutoGluon is explicitly requested.
- Type:
int
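To illustrate how a numeric priority like this could decide between two controllers that both match a step, here is a small sketch. It assumes that the lowest number wins, as the description above implies. The `ControllerEntry` registry and `pick_controller` function are hypothetical and do not reproduce nirs4all's actual dispatch logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ControllerEntry:
    """Hypothetical registry record: a controller name and its priority."""
    name: str
    priority: int
    keywords: frozenset


def pick_controller(entries: List[ControllerEntry], keyword: str) -> Optional[ControllerEntry]:
    # Keep only the controllers that claim this keyword.
    candidates = [e for e in entries if keyword in e.keywords]
    if not candidates:
        return None
    # Assumption: the smallest priority value takes precedence.
    return min(candidates, key=lambda e: e.priority)


registry = [
    ControllerEntry("sklearn", 6, frozenset({"model"})),
    ControllerEntry("autogluon", 5, frozenset({"model"})),
]
chosen = pick_controller(registry, "model")
```

With the values quoted in this section (5 for AutoGluon, 6 for sklearn), `chosen` is the AutoGluon entry whenever both controllers match.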
- execute(step_info: ParsedStep, dataset: SpectroDataset, context: ExecutionContext, runtime_context: RuntimeContext, source: int = -1, mode: str = 'train', loaded_binaries: List[Tuple[str, bytes]] | None = None, prediction_store: Any | None = None) → Tuple[ExecutionContext, List[ArtifactMeta]]
Execute AutoGluon model controller.
Main entry point for AutoGluon model execution in the pipeline.
- Parameters:
step_info – Parsed step containing model configuration.
dataset (SpectroDataset) – Dataset containing features and targets.
context (ExecutionContext) – Pipeline execution context.
runtime_context (RuntimeContext) – Runtime context.
source (int) – Source index. Defaults to -1.
mode (str) – Execution mode. Defaults to ‘train’.
loaded_binaries – Pre-loaded model binaries for prediction.
prediction_store – Store for managing predictions.
- Returns:
Updated context and list of model binaries.
- Return type:
Tuple[ExecutionContext, List[ArtifactMeta]]
- get_preferred_layout() → str
Return the preferred data layout for AutoGluon.
- Returns:
Data layout preference, ‘2d’ for AutoGluon.
- Return type:
str
- load_model(filepath: str) → Any
Load AutoGluon model from disk.
- Parameters:
filepath (str) – Path to the saved model directory.
- Returns:
Loaded AutoGluon predictor.
- Return type:
TabularPredictor
- classmethod matches(step: Any, operator: Any, keyword: str) → bool
Match AutoGluon TabularPredictor configurations.
- save_model(model: Any, filepath: str) → None
Save AutoGluon model to disk.
AutoGluon models are saved as directories. This method moves the model’s directory to the specified filepath.
- Parameters:
model (TabularPredictor) – Trained AutoGluon predictor.
filepath (str) – Target path for saving.
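Because the saved artifact is a directory and not a single file, "saving" amounts to relocating that directory. The sketch below shows one way such a move could work, using only the standard library. The helper name `move_model_dir`, the file names, and the choice to refuse to overwrite an existing target are all assumptions for illustration, not the controller's actual behavior.

```python
import shutil
import tempfile
from pathlib import Path


def move_model_dir(model_dir: str, filepath: str) -> Path:
    """Hypothetical helper: move a model directory to the requested path."""
    src = Path(model_dir)
    dst = Path(filepath)
    if not src.is_dir():
        raise NotADirectoryError(f"model directory not found: {src}")
    if dst.exists():
        # Assumption: fail loudly instead of silently replacing an artifact.
        raise FileExistsError(f"target already exists: {dst}")
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dst))
    return dst


# Demonstration with a stub directory standing in for a trained predictor.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "AutogluonModels" / "run_1"
    src.mkdir(parents=True)
    (src / "predictor.pkl").write_bytes(b"stub")

    saved = move_model_dir(str(src), str(Path(tmp) / "artifacts" / "model_dir"))
    moved_ok = (saved / "predictor.pkl").exists() and not src.exists()
```

In a real setting the source directory would be the one AutoGluon wrote during training, and the moved directory would later be handed to `load_model` as `filepath`.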