nirs4all.sklearn package
Submodules
- nirs4all.sklearn.classifier module
- nirs4all.sklearn.pipeline module
Module contents
NIRS4All sklearn Integration Module.
This module provides sklearn-compatible wrappers for nirs4all pipelines, enabling integration with scikit-learn tools like cross_validate, GridSearchCV, and SHAP explainers.
- Classes:
NIRSPipeline: sklearn-compatible regressor wrapper for trained pipelines.
NIRSPipelineClassifier: Classification variant of NIRSPipeline.
Important
NIRSPipeline is a PREDICTION wrapper, not a training estimator. Training is done via nirs4all.run(), then the result can be wrapped for sklearn compatibility using NIRSPipeline.from_result() or NIRSPipeline.from_bundle().
Example
>>> import nirs4all
>>> from nirs4all.sklearn import NIRSPipeline
>>> import shap
>>>
>>> # Train with nirs4all
>>> result = nirs4all.run(pipeline, dataset)
>>>
>>> # Wrap for sklearn/SHAP compatibility
>>> pipe = NIRSPipeline.from_result(result)
>>> explainer = shap.Explainer(pipe.predict, X_background)
>>> shap_values = explainer(X_test)
>>>
>>> # Or from exported bundle
>>> pipe = NIRSPipeline.from_bundle("exports/model.n4a")
>>> y_pred = pipe.predict(X_new)
>>> print(f"R²: {pipe.score(X_test, y_test):.4f}")
- class nirs4all.sklearn.NIRSPipeline[source]
Bases: object
sklearn-compatible wrapper for trained nirs4all pipelines.
This class wraps a trained nirs4all pipeline to provide sklearn’s BaseEstimator interface. It is designed for PREDICTION and EXPLANATION, not for training (use nirs4all.run() for training).
- Construction:
Use class methods to create instances:
- NIRSPipeline.from_result(result): From a RunResult
- NIRSPipeline.from_bundle(path): From an exported .n4a bundle
- is_fitted_
Always True for wrapped pipelines.
- model_
The underlying model (fold 0) for SHAP access.
- bundle_loader_
BundleLoader instance (if created from bundle).
- preprocessing_chain
String summary of preprocessing steps.
- model_step_index
Index of the model step in the pipeline.
- fold_weights
Dictionary of fold weights for CV ensemble.
- sklearn Compatibility:
Implements BaseEstimator interface (get_params, set_params)
Implements RegressorMixin (score method)
Works with SHAP explainers
Works with sklearn.model_selection.cross_val_predict (predict only)
Example
>>> result = nirs4all.run(pipeline, dataset)
>>> pipe = NIRSPipeline.from_result(result)
>>> y_pred = pipe.predict(X_new)
>>> print(f"R²: {pipe.score(X_test, y_test):.4f}")
- property bundle_loader_: BundleLoader | None
Get the underlying BundleLoader (if created from bundle).
- Returns:
BundleLoader instance or None.
- fit(X: Any, y: Any, **fit_params: Any) NIRSPipeline[source]
Fit is not supported - use nirs4all.run() for training.
NIRSPipeline is a prediction wrapper, not a training estimator. Training should be done with nirs4all.run(), then wrapped.
- Parameters:
X – Ignored.
y – Ignored.
**fit_params – Ignored.
- Raises:
NotImplementedError – Always, by design.
Example
>>> # Correct workflow:
>>> result = nirs4all.run(pipeline, dataset)  # Training
>>> pipe = NIRSPipeline.from_result(result)   # Wrapping
>>> y_pred = pipe.predict(X_new)              # Prediction
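The prediction-only contract described above can be sketched with a stand-in class. This is a minimal illustration of the pattern, not the nirs4all implementation: `FrozenModelWrapper` and the callable it wraps are hypothetical names introduced here.

```python
from typing import Callable, List, Sequence


class FrozenModelWrapper:
    """Hypothetical stand-in illustrating a prediction-only wrapper."""

    def __init__(self, predict_fn: Callable[[Sequence[float]], float]) -> None:
        self._predict_fn = predict_fn  # an already-trained prediction function
        self.is_fitted_ = True  # a wrapped model is treated as fitted

    def fit(self, X, y=None, **fit_params):
        # Training is deliberately unsupported in this pattern.
        raise NotImplementedError("This wrapper only predicts; train the model elsewhere.")

    def predict(self, X: Sequence[Sequence[float]]) -> List[float]:
        # Apply the wrapped function to each sample (row) of X.
        return [self._predict_fn(row) for row in X]


# The "trained model" here is just the mean of each row.
wrapper = FrozenModelWrapper(lambda row: sum(row) / len(row))
predictions = wrapper.predict([[1.0, 3.0], [2.0, 6.0]])

try:
    wrapper.fit([[1.0, 3.0]], [2.0])
    fit_raised = False
except NotImplementedError:
    fit_raised = True
```

Raising from `fit` makes misuse fail loudly instead of silently retraining or ignoring the call.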
- property fold_weights: Dict[int, float]
Get fold weights for CV ensemble.
- Returns:
Dictionary mapping fold_id to weight.
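How the fold weights are combined is not stated here. One common convention for a CV ensemble is a weighted average of per-fold predictions, sketched below in plain Python. The function name, the normalization by the weight sum, and the sample numbers are assumptions made for illustration.

```python
from typing import Dict, List


def weighted_fold_average(
    fold_predictions: Dict[int, List[float]],
    fold_weights: Dict[int, float],
) -> List[float]:
    """Combine per-fold predictions with a weighted average (assumed convention)."""
    total_weight = sum(fold_weights[fold_id] for fold_id in fold_predictions)
    if total_weight <= 0:
        raise ValueError("Fold weights must sum to a positive value.")
    n_samples = len(next(iter(fold_predictions.values())))
    combined = [0.0] * n_samples
    for fold_id, preds in fold_predictions.items():
        weight = fold_weights[fold_id] / total_weight  # normalize to sum to 1
        for i, value in enumerate(preds):
            combined[i] += weight * value
    return combined


fold_predictions = {0: [1.0, 2.0], 1: [3.0, 4.0]}
fold_weights = {0: 0.25, 1: 0.75}  # same shape as a fold_id -> weight mapping
ensemble = weighted_fold_average(fold_predictions, fold_weights)
```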
- classmethod from_bundle(bundle_path: str | Path, fold: int = 0) NIRSPipeline[source]
Create NIRSPipeline from an exported .n4a bundle.
- Parameters:
bundle_path – Path to the exported .n4a bundle file.
fold – Which fold’s model to use for model_ property (default: 0).
- Returns:
NIRSPipeline instance ready for prediction.
- Raises:
FileNotFoundError – If bundle file doesn’t exist.
ValueError – If bundle is invalid or corrupted.
Example
>>> pipe = NIRSPipeline.from_bundle("exports/model.n4a")
>>> y_pred = pipe.predict(X_new)
- classmethod from_result(result: RunResult, source: Dict[str, Any] | None = None, fold: int = 0) NIRSPipeline[source]
Create NIRSPipeline from a RunResult.
This exports the best (or specified) model from the RunResult to a temporary bundle, then loads it for prediction. This ensures consistent prediction behavior between direct bundle loading and result-based creation.
- Parameters:
result – RunResult from nirs4all.run().
source – Optional prediction dict to wrap. If None, uses best model.
fold – Which fold’s model to use for model_ property (default: 0).
- Returns:
NIRSPipeline instance ready for prediction.
- Raises:
ValueError – If no predictions available in result.
RuntimeError – If export fails.
Example
>>> result = nirs4all.run(pipeline, dataset)
>>> pipe = NIRSPipeline.from_result(result)
>>> y_pred = pipe.predict(X_new)
- get_params(deep: bool = True) Dict[str, Any][source]
Get parameters for this estimator (sklearn interface).
- Parameters:
deep – If True, return nested parameters.
- Returns:
Parameter dictionary.
- get_transformers() List[Tuple[str, Any]][source]
Get list of preprocessing transformers.
- Returns:
List of (name, transformer) tuples.
Example
>>> pipe = NIRSPipeline.from_bundle("model.n4a")
>>> for name, transformer in pipe.get_transformers():
...     print(f"{name}: {type(transformer).__name__}")
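A list of (name, transformer) tuples lends itself to being applied step by step. The sketch below assumes each transformer exposes a sklearn-style `transform` method and that steps run in list order; neither is confirmed above. `Scale`, `Shift`, and `apply_chain` are hypothetical names.

```python
from typing import Any, List, Tuple

Matrix = List[List[float]]


class Scale:
    """Hypothetical transformer: multiply every value by a factor."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def transform(self, X: Matrix) -> Matrix:
        return [[value * self.factor for value in row] for row in X]


class Shift:
    """Hypothetical transformer: add an offset to every value."""

    def __init__(self, offset: float) -> None:
        self.offset = offset

    def transform(self, X: Matrix) -> Matrix:
        return [[value + self.offset for value in row] for row in X]


def apply_chain(transformers: List[Tuple[str, Any]], X: Matrix) -> Matrix:
    """Apply each transformer in list order, feeding each output to the next step."""
    for _name, transformer in transformers:
        X = transformer.transform(X)
    return X


steps = [("scale", Scale(2.0)), ("shift", Shift(1.0))]
step_names = [name for name, _ in steps]
transformed = apply_chain(steps, [[1.0, 2.0], [3.0, 4.0]])
```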
- property model_: Any
Get the underlying model for SHAP access.
Returns the model from the specified fold (default: fold 0). For tree-based models, this enables TreeExplainer. For neural networks, enables DeepExplainer.
- Returns:
The fitted model object.
- Raises:
RuntimeError – If model cannot be accessed.
Example
>>> pipe = NIRSPipeline.from_bundle("model.n4a")
>>> model = pipe.model_
>>> explainer = shap.TreeExplainer(model)  # If tree-based
- property model_step_index: int | None
Get the index of the model step in the pipeline.
- Returns:
Model step index or None.
- predict(X: ndarray) ndarray[source]
Make predictions on new data.
- Parameters:
X – Feature matrix (n_samples, n_features) as numpy array.
- Returns:
Predicted values array (n_samples,).
- Raises:
RuntimeError – If pipeline is not properly initialized.
Example
>>> pipe = NIRSPipeline.from_bundle("model.n4a")
>>> y_pred = pipe.predict(X_test)
- property preprocessing_chain: str
Get string summary of preprocessing steps.
- Returns:
Preprocessing chain description.
- score(X: ndarray, y: ndarray) float[source]
Compute R² score on test data.
- Parameters:
X – Feature matrix (n_samples, n_features).
y – True target values (n_samples,).
- Returns:
R² score (coefficient of determination).
Example
>>> pipe = NIRSPipeline.from_bundle("model.n4a")
>>> r2 = pipe.score(X_test, y_test)
>>> print(f"R²: {r2:.4f}")
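For reference, the coefficient of determination is conventionally defined as R² = 1 - SS_res / SS_tot. The sketch below computes it by hand in plain Python so the returned number is easier to interpret. The helper name `r_squared` and the sample values are illustrative, and the code assumes the true targets are not all identical.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length.")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when y_true is constant.")
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
r2 = r_squared(y_true, y_pred)  # ss_res = 1.0, ss_tot = 5.0
```

A value of 1.0 means perfect predictions, 0.0 matches always predicting the mean of the targets, and negative values are worse than that baseline.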
- set_params(**params: Any) NIRSPipeline[source]
Set parameters for this estimator (sklearn interface).
- Parameters:
**params – Parameters to set. Only ‘fold’ is supported.
- Returns:
self
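The get_params/set_params contract with a single supported parameter can be sketched as follows. `FoldSelector` is a hypothetical stand-in, and rejecting unknown keys with ValueError is an assumption borrowed from sklearn's own convention; the behavior of NIRSPipeline for unsupported keys is not documented here.

```python
from typing import Any, Dict


class FoldSelector:
    """Hypothetical estimator-like object exposing a single 'fold' parameter."""

    def __init__(self, fold: int = 0) -> None:
        self.fold = fold

    def get_params(self, deep: bool = True) -> Dict[str, Any]:
        # No nested estimators in this sketch, so 'deep' changes nothing.
        return {"fold": self.fold}

    def set_params(self, **params: Any) -> "FoldSelector":
        unknown = set(params) - {"fold"}
        if unknown:
            # Assumed behavior: reject anything other than 'fold'.
            raise ValueError(f"Unsupported parameter(s): {sorted(unknown)}")
        if "fold" in params:
            self.fold = int(params["fold"])
        return self  # returning self allows call chaining


selector = FoldSelector()
returned = selector.set_params(fold=2)
params_after = selector.get_params()
```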
- property shap_model: Any
Alias for model_ for SHAP compatibility.
- Returns:
The fitted model object.
- transform(X: ndarray) ndarray[source]
Apply preprocessing steps to data (without model prediction).
This applies all preprocessing transformers but stops before the model step. Useful for obtaining the preprocessed features the model would receive, for example when building a stacking setup or debugging preprocessing.
- Parameters:
X – Feature matrix (n_samples, n_features).
- Returns:
Transformed features (n_samples, n_transformed_features).
- Raises:
RuntimeError – If pipeline is not properly initialized.
- class nirs4all.sklearn.NIRSPipelineClassifier[source]
Bases: NIRSPipeline
sklearn-compatible classifier wrapper for trained nirs4all pipelines.
This is the classification variant of NIRSPipeline, providing ClassifierMixin compatibility (predict_proba, classes_).
- Construction:
Use class methods to create instances:
- NIRSPipelineClassifier.from_result(result): From a RunResult
- NIRSPipelineClassifier.from_bundle(path): From an exported .n4a bundle
- Additional Attributes:
classes_: Array of class labels.
- Additional Methods:
predict_proba(X): Predict class probabilities.
Example
>>> result = nirs4all.run(classification_pipeline, dataset)
>>> clf = NIRSPipelineClassifier.from_result(result)
>>> proba = clf.predict_proba(X_new)
>>> print(f"Accuracy: {clf.score(X_test, y_test):.4f}")
- property classes_: ndarray
Get array of class labels.
- Returns:
Array of unique class labels.
- Raises:
RuntimeError – If classes cannot be determined.
- classmethod from_bundle(bundle_path: str | Path, fold: int = 0) NIRSPipelineClassifier[source]
Create NIRSPipelineClassifier from an exported .n4a bundle.
- Parameters:
bundle_path – Path to the exported .n4a bundle file.
fold – Which fold’s model to use (default: 0).
- Returns:
NIRSPipelineClassifier instance ready for prediction.
Example
>>> clf = NIRSPipelineClassifier.from_bundle("exports/classifier.n4a")
>>> proba = clf.predict_proba(X_new)
- classmethod from_result(result: RunResult, source: Dict[str, Any] | None = None, fold: int = 0) NIRSPipelineClassifier[source]
Create NIRSPipelineClassifier from a RunResult.
- Parameters:
result – RunResult from nirs4all.run() with a classification pipeline.
source – Optional prediction dict to wrap. If None, uses best model.
fold – Which fold’s model to use (default: 0).
- Returns:
NIRSPipelineClassifier instance ready for prediction.
Example
>>> result = nirs4all.run(classification_pipeline, dataset)
>>> clf = NIRSPipelineClassifier.from_result(result)
- predict(X: ndarray) ndarray[source]
Predict class labels for samples.
- Parameters:
X – Feature matrix (n_samples, n_features).
- Returns:
Predicted class labels (n_samples,).
Example
>>> clf = NIRSPipelineClassifier.from_bundle("model.n4a")
>>> y_pred = clf.predict(X_test)
- predict_proba(X: ndarray) ndarray[source]
Predict class probabilities for samples.
- Parameters:
X – Feature matrix (n_samples, n_features).
- Returns:
Class probability matrix (n_samples, n_classes).
- Raises:
RuntimeError – If model doesn’t support predict_proba.
Example
>>> clf = NIRSPipelineClassifier.from_bundle("model.n4a")
>>> proba = clf.predict_proba(X_test)
>>> print(f"Probability of class 0: {proba[:, 0]}")
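In sklearn's convention, the columns of a probability matrix follow the order of classes_, and a label can be recovered by taking the most probable column in each row. The sketch below shows that mapping in plain Python. Whether NIRSPipelineClassifier.predict derives its labels this way is not stated here, and `labels_from_proba` and the sample values are hypothetical.

```python
from typing import List, Sequence


def labels_from_proba(proba: Sequence[Sequence[float]], classes: Sequence[str]) -> List[str]:
    """Map each row of class probabilities to the label of its largest entry."""
    labels = []
    for row in proba:
        if len(row) != len(classes):
            raise ValueError("Each row needs one probability per class.")
        # Index of the highest probability in this row.
        best_index = max(range(len(row)), key=row.__getitem__)
        labels.append(classes[best_index])
    return labels


classes = ["low", "high"]  # plays the role of classes_
proba = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]  # shape (n_samples, n_classes)
labels = labels_from_proba(proba, classes)
```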
- score(X: ndarray, y: ndarray) float[source]
Compute accuracy score on test data.
- Parameters:
X – Feature matrix (n_samples, n_features).
y – True class labels (n_samples,).
- Returns:
Accuracy score (fraction correctly classified).
Example
>>> clf = NIRSPipelineClassifier.from_bundle("model.n4a")
>>> accuracy = clf.score(X_test, y_test)
>>> print(f"Accuracy: {accuracy:.4f}")
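Accuracy as "fraction correctly classified" is simple enough to verify by hand. The sketch below computes it in plain Python; `accuracy_fraction` and the sample labels are illustrative names and data, not part of the package.

```python
from typing import Sequence


def accuracy_fraction(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length.")
    if not y_true:
        raise ValueError("At least one sample is required.")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


true_labels = ["a", "b", "a", "b"]
predicted_labels = ["a", "b", "b", "b"]  # third sample is misclassified
acc = accuracy_fraction(true_labels, predicted_labels)
```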