nirs4all.operators.models.sklearn.recursive_pls module

Recursive PLS (RPLS) regressor for nirs4all.

A sklearn-compatible implementation of Recursive Partial Least Squares (RPLS). RPLS supports online model updates for drifting processes: new samples are incorporated incrementally, and a forgetting factor down-weights older ones.

Supports both NumPy (CPU) and JAX (GPU/TPU) backends.

References

  • Qin, S. J. (1998). Recursive PLS algorithms for adaptive data modeling. Computers & Chemical Engineering, 22(4-5), 503-514.

  • Helland, K., Berntsen, H. E., Borgen, O. S., & Martens, H. (1992). Recursive algorithm for partial least squares regression. Chemometrics and Intelligent Laboratory Systems, 14(1-3), 129-137.

  • Dayal, B. S., & MacGregor, J. F. (1997). Recursive exponentially weighted PLS and its applications to adaptive control and prediction. Journal of Process Control, 7(3), 169-179.

class nirs4all.operators.models.sklearn.recursive_pls.RecursivePLS(n_components: int = 10, forgetting_factor: float = 0.99, scale: bool = True, center: bool = True, backend: str = 'numpy')[source]

Bases: BaseEstimator, RegressorMixin

Recursive Partial Least Squares (Recursive PLS) regressor.

Recursive PLS enables online model updates for drifting processes. It uses a forgetting factor to exponentially weight old samples, allowing the model to adapt to non-stationary data streams.

The algorithm maintains running covariance matrices that are updated incrementally with each new batch of samples. The PLS loadings are then recomputed from these updated covariances.
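The covariance update described above can be sketched in NumPy. This is a minimal illustration, not the nirs4all implementation: the helper name `update_covariances`, the use of raw cross-product matrices, one discount per batch, and the assumption that the data is already centered are all choices made for this example.

```python
import numpy as np

def update_covariances(Cxx, Cxy, X_new, Y_new, forgetting_factor=0.99):
    """Discount the stored cross-products, then add the new batch's contribution.

    Hypothetical helper: assumes X_new and Y_new are already centered.
    """
    Cxx = forgetting_factor * Cxx + X_new.T @ X_new
    Cxy = forgetting_factor * Cxy + X_new.T @ Y_new
    return Cxx, Cxy

rng = np.random.default_rng(0)
X1, Y1 = rng.normal(size=(20, 4)), rng.normal(size=(20, 1))
X2, Y2 = rng.normal(size=(5, 4)), rng.normal(size=(5, 1))

# Cross-products from the initial batch.
Cxx, Cxy = X1.T @ X1, X1.T @ Y1

# With forgetting_factor=1.0 the update reproduces the batch cross-products.
Cxx_nf, Cxy_nf = update_covariances(Cxx, Cxy, X2, Y2, forgetting_factor=1.0)
X_all, Y_all = np.vstack([X1, X2]), np.vstack([Y1, Y2])
matches_batch = (np.allclose(Cxx_nf, X_all.T @ X_all)
                 and np.allclose(Cxy_nf, X_all.T @ Y_all))

# With forgetting_factor<1.0 the initial batch contributes less.
Cxx_ff, Cxy_ff = update_covariances(Cxx, Cxy, X2, Y2, forgetting_factor=0.9)
```

Because only the two cross-product matrices are stored, the cost of an update in this sketch depends on the batch size and the number of features, not on how many samples have been seen so far.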

Parameters:
  • n_components (int, default=10) – Number of PLS components to extract.

  • forgetting_factor (float, default=0.99) – Forgetting factor in (0, 1]. Controls the rate of adaptation:

      • 1.0: No forgetting, standard batch PLS.

      • <1.0: Exponential forgetting of old samples.

      • Typical values: 0.95-0.999, depending on drift speed.

  • scale (bool, default=True) – Whether to scale X and Y to unit variance.

  • center (bool, default=True) – Whether to center X and Y (subtract mean).

  • backend (str, default='numpy') – Computational backend to use:

      • 'numpy': NumPy backend (CPU only).

      • 'jax': JAX backend (supports GPU/TPU acceleration).

n_features_in_

Number of features seen during fit.

Type:

int

n_components_

Actual number of components used.

Type:

int

n_samples_seen_

Total number of samples seen (including partial_fit calls).

Type:

int

x_mean_

Mean of X (updated with exponential moving average).

Type:

ndarray of shape (n_features,)

x_std_

Standard deviation of X.

Type:

ndarray of shape (n_features,)

y_mean_

Mean of Y (updated with exponential moving average).

Type:

ndarray of shape (n_targets,)

y_std_

Standard deviation of Y.

Type:

ndarray of shape (n_targets,)

x_weights_

X weights (W).

Type:

ndarray of shape (n_features, n_components_)

x_loadings_

X loadings (P).

Type:

ndarray of shape (n_features, n_components_)

y_loadings_

Y loadings (Q).

Type:

ndarray of shape (n_targets, n_components_)

coef_

Regression coefficients.

Type:

ndarray of shape (n_features, n_targets)

Examples

>>> from nirs4all.operators.models.sklearn.recursive_pls import RecursivePLS
>>> import numpy as np
>>> # Initial batch fit
>>> np.random.seed(42)
>>> X_init = np.random.randn(100, 50)
>>> y_init = X_init[:, :5].sum(axis=1) + 0.1 * np.random.randn(100)
>>> model = RecursivePLS(n_components=10, forgetting_factor=0.99)
>>> model.fit(X_init, y_init)
RecursivePLS(n_components=10)
>>> # Online update with new samples
>>> X_new = np.random.randn(10, 50)
>>> y_new = X_new[:, :5].sum(axis=1) + 0.1 * np.random.randn(10)
>>> model.partial_fit(X_new, y_new)
>>> # Predict
>>> predictions = model.predict(X_new)
>>> print(f"Samples seen: {model.n_samples_seen_}")

Notes

Recursive PLS is particularly useful when:

  • Data arrives in streams and batch retraining is too expensive.

  • Process conditions drift over time (sensor aging, raw material changes).

  • You need to adapt a calibration model to local conditions.

The forgetting factor controls the adaptation speed:

  • Higher values (0.99-0.999): Slow adaptation, stable model.

  • Lower values (0.9-0.95): Fast adaptation, may be unstable.
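One way to build intuition for these ranges is to convert a forgetting factor into an approximate memory length. The sketch below relies only on the geometric series: with weights λ^k on a contribution that is k updates old, the weights sum to 1 / (1 - λ), and a contribution's weight halves after ln(0.5) / ln(λ) updates. The function names are hypothetical, and whether one "update" corresponds to one sample or one batch depends on the implementation.

```python
import math

def effective_memory(forgetting_factor):
    """Sum of the weights lam**k for k >= 0, i.e. 1 / (1 - lam)."""
    if not 0.0 < forgetting_factor <= 1.0:
        raise ValueError("forgetting_factor must be in (0, 1]")
    if forgetting_factor == 1.0:
        return math.inf  # no forgetting: every update keeps full weight
    return 1.0 / (1.0 - forgetting_factor)

def half_life(forgetting_factor):
    """Number of updates after which a contribution's weight has halved."""
    if not 0.0 < forgetting_factor <= 1.0:
        raise ValueError("forgetting_factor must be in (0, 1]")
    if forgetting_factor == 1.0:
        return math.inf
    return math.log(0.5) / math.log(forgetting_factor)

for lam in (0.9, 0.95, 0.99, 0.999):
    print(f"lambda={lam}: memory ~{effective_memory(lam):.0f} updates, "
          f"half-life ~{half_life(lam):.1f} updates")
```

Under this reading, 0.9 remembers roughly the last 10 updates and 0.999 roughly the last 1000, which is why the lower range adapts quickly but is more sensitive to noise.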

See also

SIMPLS

Batch SIMPLS algorithm.

sklearn.cross_decomposition.PLSRegression

sklearn’s batch PLS.


__repr__() str[source]

Return string representation.

fit(X: ArrayLike, y: ArrayLike) RecursivePLS[source]

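Fit the Recursive PLS model on an initial batch.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training samples of the initial batch.

  • y (array-like of shape (n_samples,) or (n_samples, n_targets)) – Target values.

Returns: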

self – Fitted estimator.

Return type:

RecursivePLS

Raises:
  • ValueError – If backend is not 'numpy' or 'jax', or if forgetting_factor is not in (0, 1].

  • ImportError – If backend is 'jax' and JAX is not installed.

get_params(deep: bool = True) dict[source]

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

partial_fit(X: ArrayLike, y: ArrayLike) RecursivePLS[source]

Update the Recursive PLS model with new samples.

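Parameters:
  • X (array-like of shape (n_samples, n_features)) – New samples used to update the model.

  • y (array-like of shape (n_samples,) or (n_samples, n_targets)) – Target values for the new samples.

Returns: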

self – Updated estimator.

Return type:

RecursivePLS

Raises:

NotFittedError – If the model has not been fitted yet.
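A common way to use an incremental update method is a test-then-train (prequential) loop: each incoming batch is first predicted with the current model and only then used to update it. The sketch below shows the pattern with a deliberately simple stand-in class, `EwmaMeanRegressor`, which is hypothetical and not part of nirs4all. It only mimics the fit / partial_fit / predict interface documented here so that the example runs without the library.

```python
import numpy as np

class EwmaMeanRegressor:
    """Hypothetical stand-in with fit / partial_fit / predict.

    Predicts an exponentially weighted mean of y; it is not a PLS model.
    """

    def __init__(self, forgetting_factor=0.9):
        self.forgetting_factor = forgetting_factor

    def fit(self, X, y):
        self.y_mean_ = float(np.mean(y))
        return self

    def partial_fit(self, X, y):
        if not hasattr(self, "y_mean_"):
            raise RuntimeError("call fit before partial_fit")
        lam = self.forgetting_factor
        self.y_mean_ = lam * self.y_mean_ + (1.0 - lam) * float(np.mean(y))
        return self

    def predict(self, X):
        return np.full(len(X), self.y_mean_)

def prequential_rmse(model, batches):
    """Score each batch before the model learns from it."""
    errors = []
    for X_batch, y_batch in batches:
        y_pred = model.predict(X_batch)
        errors.append(float(np.sqrt(np.mean((y_batch - y_pred) ** 2))))
        model.partial_fit(X_batch, y_batch)
    return errors

rng = np.random.default_rng(0)
X_init = rng.normal(size=(50, 3))
y_init = rng.normal(loc=0.0, scale=0.1, size=50)
model = EwmaMeanRegressor(forgetting_factor=0.5).fit(X_init, y_init)

# Simulated drift: the target level shifts from 0.0 to 5.0.
batches = [(rng.normal(size=(10, 3)), rng.normal(loc=5.0, scale=0.1, size=10))
           for _ in range(8)]
errors = prequential_rmse(model, batches)
```

Since RecursivePLS documents the same three methods, it should be possible to pass a fitted RecursivePLS instance to `prequential_rmse` in place of the stand-in. The per-batch error trace is then a direct view of how quickly a given forgetting_factor tracks the drift.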

predict(X: ArrayLike) NDArray[floating][source]

Predict using the Recursive PLS model.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to predict.

Returns:

y_pred – Predicted values.

Return type:

ndarray of shape (n_samples,) or (n_samples, n_targets)
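The fitted attributes listed for the class (x_mean_, x_std_, y_mean_, y_std_, coef_) suggest how a prediction might be assembled. The sketch below assumes that coef_ maps standardized X to standardized y. That is an assumption for illustration only: the page does not state which scale coef_ is stored on, and if the library keeps it on the original scale, the rescaling steps drop out. Ordinary least squares on standardized data stands in for fitted PLS coefficients, and `predict_from_attributes` is a hypothetical helper.

```python
import numpy as np

def predict_from_attributes(X, x_mean, x_std, y_mean, y_std, coef):
    """Assumed pipeline: standardize X, apply coef, undo the y standardization."""
    X_scaled = (np.asarray(X, dtype=float) - x_mean) / x_std
    return (X_scaled @ coef) * y_std + y_mean

rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, scale=2.0, size=(30, 4))
true_coef = np.array([[1.0], [0.5], [0.0], [-2.0]])
Y = X @ true_coef + 1.5  # noise-free linear target

x_mean, x_std = X.mean(axis=0), X.std(axis=0)
y_mean, y_std = Y.mean(axis=0), Y.std(axis=0)

# Stand-in for fitted coefficients: least squares in the standardized space.
coef, *_ = np.linalg.lstsq((X - x_mean) / x_std, (Y - y_mean) / y_std, rcond=None)

Y_pred = predict_from_attributes(X, x_mean, x_std, y_mean, y_std, coef)
```

In this sketch coef has shape (n_features, n_targets), matching the documented shape of coef_, and the result has shape (n_samples, n_targets).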

set_params(**params) RecursivePLS[source]

Set the parameters of this estimator.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

RecursivePLS

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') RecursivePLS

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

transform(X: ArrayLike) NDArray[floating][source]

Transform X to score space.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to transform.

Returns:

T – X scores.

Return type:

ndarray of shape (n_samples, n_components_)
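In standard PLS, scores for new data are obtained from the weights W and loadings P through the rotation matrix R = W (PᵀW)⁻¹, so that T = X_c R for centered (and, if enabled, scaled) X_c. The sketch below checks this identity against a textbook single-target NIPALS decomposition. Both helper functions are hypothetical, and the page does not say how RecursivePLS computes its scores, so this shows the general relationship between x_weights_, x_loadings_ and the scores rather than the library's code.

```python
import numpy as np

def nipals_pls1(X, y, n_components):
    """Textbook NIPALS for one target; returns weights W, loadings P, scores T."""
    Xk = X.copy()
    W, P, T = [], [], []
    for _ in range(n_components):
        w = Xk.T @ y
        w = w / np.linalg.norm(w)      # unit-norm weight vector
        t = Xk @ w                     # scores for this component
        p = Xk.T @ t / (t @ t)         # X loadings for this component
        Xk = Xk - np.outer(t, p)       # deflate X
        W.append(w)
        P.append(p)
        T.append(t)
    return np.column_stack(W), np.column_stack(P), np.column_stack(T)

def rotation_matrix(x_weights, x_loadings):
    """R = W (P^T W)^{-1}, which maps centered X directly to scores."""
    return x_weights @ np.linalg.inv(x_loadings.T @ x_weights)

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 6))
y = X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=30)
X_c = X - X.mean(axis=0)
y_c = y - y.mean()

W, P, T = nipals_pls1(X_c, y_c, n_components=3)
R = rotation_matrix(W, P)
scores = X_c @ R  # shape (n_samples, n_components)
```

The rotation is needed because the weights in W apply to successively deflated versions of X. R folds the deflation in, so new samples can be projected in a single matrix product.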