nirs4all.operators.models.sklearn.dipls module

Dynamic PLS (DiPLS) regressor for nirs4all.

See pls.py for full documentation and usage examples.

class nirs4all.operators.models.sklearn.dipls.DiPLS(n_components: int = 5, lags: int = 1, cv_splits: int = 7, tol: float = 1e-08, max_iter: int = 1000)[source]

Bases: BaseEstimator, RegressorMixin

Dynamic PLS (DiPLS) regressor.

DiPLS extends PLS to handle dynamic systems by including time-lagged variables. It uses the trendfitter package.
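To make "time-lagged variables" concrete, the sketch below builds a lagged data matrix with plain NumPy. It is only an illustration of the idea: build_lagged_matrix is a hypothetical helper, and the block ordering it uses is an assumption, not the internal layout used by trendfitter or nirs4all.

```python
import numpy as np

def build_lagged_matrix(X, lags):
    """Stack each row of X with its `lags` preceding rows (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    if not 0 <= lags < n_samples:
        raise ValueError("lags must satisfy 0 <= lags < n_samples")
    # The first `lags` samples have no complete history, so they yield no row.
    n_rows = n_samples - lags
    # Block j holds the samples delayed by j steps: X[t - j] for t = lags..n_samples-1.
    blocks = [X[lags - j : lags - j + n_rows] for j in range(lags + 1)]
    return np.hstack(blocks)

X = np.arange(12, dtype=float).reshape(6, 2)  # 6 time-ordered samples, 2 features
Z = build_lagged_matrix(X, lags=2)
print(Z.shape)  # (4, 6)
print(Z[0])     # [4. 5. 2. 3. 0. 1.]
```

With lags=2, each output row combines the current sample with the two before it, which is why the result has n_samples - lags rows.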

Parameters:
  • n_components (int, default=5) – Number of latent variables to extract.

  • lags (int, default=1) – Number of time lags to consider (s parameter in DiPLS).

  • cv_splits (int, default=7) – Number of cross-validation splits for automatic component selection.

  • tol (float, default=1e-8) – Convergence tolerance.

  • max_iter (int, default=1000) – Maximum number of iterations.

n_features_in_

Number of features seen during fit.

Type:

int

n_components_

Actual number of components used.

Type:

int

Examples

>>> from nirs4all.operators.models.sklearn.dipls import DiPLS
>>> import numpy as np
>>> X = np.random.randn(100, 50)
>>> y = np.random.randn(100)
>>> model = DiPLS(n_components=5, lags=2)
>>> model.fit(X, y)
DiPLS(n_components=5, lags=2)
>>> predictions = model.predict(X)

Notes

Requires the trendfitter package: pip install trendfitter

DiPLS is particularly useful for:

  • Process monitoring with temporal dependencies

  • NIR data collected over time

  • Batch process analytics

See also

sklearn.cross_decomposition.PLSRegression

Standard PLS without dynamics.

fit(X, y)[source]

Fit the DiPLS model.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training data (time-ordered measurements).

  • y (array-like of shape (n_samples,) or (n_samples, n_targets)) – Target values.

Returns:

self – Fitted estimator.

Return type:

DiPLS

Raises:

ImportError – If trendfitter package is not installed.

get_params(deep=True)[source]

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

predict(X)[source]

Predict using the DiPLS model.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to predict.

Returns:

y_pred – Predicted values.

Return type:

ndarray of shape (n_samples,) or (n_samples, n_targets)

Notes

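DiPLS uses Hankelization (building a matrix of time-shifted copies of the data), which may produce fewer predictions than input samples. To keep the output length equal to n_samples, this implementation pads the beginning of the prediction array with the first predicted value, which maintains compatibility with sklearn cross-validation.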
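The padding step described above can be sketched as follows. pad_predictions is a hypothetical helper written for illustration; it mirrors the behaviour stated in this note and is not copied from the nirs4all source.

```python
import numpy as np

def pad_predictions(y_short, n_samples):
    """Left-pad predictions with the first predicted value (hypothetical helper)."""
    y_short = np.asarray(y_short, dtype=float)
    if y_short.shape[0] == 0:
        raise ValueError("need at least one prediction to pad from")
    n_missing = n_samples - y_short.shape[0]
    if n_missing < 0:
        raise ValueError("more predictions than samples")
    if n_missing == 0:
        return y_short
    # Repeat the first prediction once per missing leading row.
    head = np.repeat(y_short[:1], n_missing, axis=0)
    return np.concatenate([head, y_short], axis=0)

y_short = np.array([0.4, 0.5, 0.7])  # e.g. 5 input samples, 2 rows lost to lagging
y_full = pad_predictions(y_short, n_samples=5)
print(y_full)  # [0.4 0.4 0.4 0.5 0.7]
```

The padded leading values are copies, not real predictions, so scores computed on very short series will be influenced by them.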

set_params(**params)[source]

Set the parameters of this estimator.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

DiPLS

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → DiPLS

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). See the scikit-learn User Guide on metadata routing for how the mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object