nirs4all.operators.models.sklearn.simpls module

SIMPLS (Simple PLS) regressor for nirs4all.

A sklearn-compatible implementation of the SIMPLS algorithm by de Jong (1993). SIMPLS is an alternative to NIPALS that computes PLS components via projections of the covariance matrix, avoiding the iterative deflation of X.

Supports both NumPy (CPU) and JAX (GPU/TPU) backends.

class nirs4all.operators.models.sklearn.simpls.SIMPLS(n_components: int = 10, scale: bool = True, center: bool = True, backend: str = 'numpy')

Bases: BaseEstimator, RegressorMixin

SIMPLS (Simple PLS) regressor.

SIMPLS is an alternative to NIPALS-based PLS that computes components via projections of the covariance matrix X’Y. For univariate Y it produces the same predictions as sklearn’s PLSRegression; for multivariate Y the fitted models differ slightly but give comparable prediction accuracy.

SIMPLS is often faster than NIPALS for high-dimensional data because it avoids the iterative deflation of X.

Parameters:
  • n_components (int, default=10) – Number of PLS components to extract.

  • scale (bool, default=True) – Whether to scale X and Y to unit variance.

  • center (bool, default=True) – Whether to center X and Y (subtract mean).

  • backend (str, default='numpy') – Computational backend to use:

      • ‘numpy’: NumPy backend (CPU only).

      • ‘jax’: JAX backend (supports GPU/TPU acceleration).

    The JAX backend requires JAX to be installed: pip install jax. For GPU support: pip install jax[cuda12].
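The center and scale options describe standard column-wise preprocessing. A minimal sketch of that step follows. The helper name center_and_scale is hypothetical, and the use of the sample standard deviation (ddof=1) and the handling of constant columns are assumptions, not confirmed details of this class.

```python
import numpy as np

def center_and_scale(A, center=True, scale=True):
    """Return the preprocessed array plus the mean and std that were applied."""
    A = np.asarray(A, dtype=float)
    if A.ndim == 1:
        A = A.reshape(-1, 1)  # treat a 1-D target as a single column
    mean = A.mean(axis=0) if center else np.zeros(A.shape[1])
    # Assumption: sample standard deviation (ddof=1); constant columns are
    # left unscaled to avoid division by zero.
    std = A.std(axis=0, ddof=1) if scale else np.ones(A.shape[1])
    std = np.where(std == 0.0, 1.0, std)
    return (A - mean) / std, mean, std

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=3.0, size=(20, 4))
Xs, x_mean, x_std = center_and_scale(X)
```

The returned mean and std play the role that x_mean_ and x_std_ (or y_mean_ and y_std_) play on a fitted estimator: they are needed again at prediction time to preprocess new samples the same way.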

n_features_in_

Number of features seen during fit.

Type:

int

n_components_

Actual number of components used (may be less than n_components if limited by data dimensions).

Type:

int
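One way to read "limited by data dimensions": a centered X with n_samples rows and n_features columns has rank at most min(n_samples - 1, n_features), so no more components than that can be extracted. The sketch below encodes that bound. The helper name is hypothetical, and whether the class applies exactly this cap is an assumption.

```python
def effective_n_components(n_components, n_samples, n_features):
    """Hypothetical helper: cap the requested components at the rank bound."""
    # Centering removes one degree of freedom from the rows, hence n_samples - 1.
    return max(1, min(n_components, n_samples - 1, n_features))

print(effective_n_components(10, n_samples=100, n_features=50))
print(effective_n_components(10, n_samples=8, n_features=50))
```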

x_mean_

Mean of X.

Type:

ndarray of shape (n_features,)

x_std_

Standard deviation of X.

Type:

ndarray of shape (n_features,)

y_mean_

Mean of Y.

Type:

ndarray of shape (n_targets,)

y_std_

Standard deviation of Y.

Type:

ndarray of shape (n_targets,)

x_scores_

X scores (T).

Type:

ndarray of shape (n_samples, n_components_)

y_scores_

Y scores (U).

Type:

ndarray of shape (n_samples, n_components_)

x_weights_

X weights (W).

Type:

ndarray of shape (n_features, n_components_)

x_loadings_

X loadings (P).

Type:

ndarray of shape (n_features, n_components_)

y_loadings_

Y loadings (Q).

Type:

ndarray of shape (n_targets, n_components_)

coef_

Regression coefficients (using all components).

Type:

ndarray of shape (n_features, n_targets)

Examples

>>> from nirs4all.operators.models.sklearn.simpls import SIMPLS
>>> import numpy as np
>>> # Generate sample data
>>> np.random.seed(42)
>>> X = np.random.randn(100, 50)
>>> y = X[:, :5].sum(axis=1) + 0.1 * np.random.randn(100)
>>> # Fit SIMPLS model
>>> model = SIMPLS(n_components=10)
>>> model.fit(X, y)
SIMPLS(n_components=10)
>>> predictions = model.predict(X)
>>> # Use JAX backend for GPU acceleration
>>> model_jax = SIMPLS(n_components=10, backend='jax')

Notes

SIMPLS differs from NIPALS in how the deflation is performed:

  • NIPALS deflates X after each component (X := X - t*p’).

  • SIMPLS deflates the covariance matrix S = X’Y.

For univariate Y, both methods produce identical predictions. For multivariate Y, SIMPLS produces Y loadings that span the same space as NIPALS but with slightly different orientations.
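The deflation of S rather than X can be illustrated with a compact NumPy sketch that follows de Jong's published algorithm. It is not this class's source code: the function name simpls_sketch is hypothetical, only centering is applied (no scaling), and Y scores are omitted for brevity.

```python
import numpy as np

def simpls_sketch(X, Y, n_components):
    """Minimal NumPy sketch of SIMPLS (centering only, no scaling)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y.reshape(-1, 1)
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    X0, Y0 = X - x_mean, Y - y_mean

    n, p = X0.shape
    m = Y0.shape[1]
    R = np.zeros((p, n_components))  # weights applied to centered X
    T = np.zeros((n, n_components))  # orthonormal X scores
    P = np.zeros((p, n_components))  # X loadings
    Q = np.zeros((m, n_components))  # Y loadings
    V = np.zeros((p, n_components))  # orthonormal basis of the X loadings

    S = X0.T @ Y0  # cross-product matrix X'Y, the only quantity deflated
    for a in range(n_components):
        # Dominant left singular vector of the current S
        r = np.linalg.svd(S, full_matrices=False)[0][:, 0]
        t = X0 @ r
        norm_t = np.linalg.norm(t)
        t, r = t / norm_t, r / norm_t  # unit-norm score
        p_a = X0.T @ t
        q_a = Y0.T @ t
        # Orthogonalize the new loading against the previous ones
        v = p_a - V[:, :a] @ (V[:, :a].T @ p_a)
        v /= np.linalg.norm(v)
        S = S - np.outer(v, v @ S)  # deflate S; X0 is never modified
        R[:, a], T[:, a], P[:, a], Q[:, a], V[:, a] = r, t, p_a, q_a, v

    coef = R @ Q.T                      # shape (n_features, n_targets)
    intercept = y_mean - x_mean @ coef  # shape (n_targets,)
    return coef, intercept, T

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=40)
coef, intercept, T = simpls_sketch(X, y, n_components=4)
y_hat = X @ coef + intercept
```

Because the scores are kept orthonormal, the coefficients are simply R Q’, and when the number of components equals the rank of X the result coincides with ordinary least squares.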

See also

sklearn.cross_decomposition.PLSRegression

sklearn’s NIPALS-based PLS.

IKPLS

Fast PLS implementation from the ikpls package.

References

  • de Jong, S. (1993). SIMPLS: An alternative approach to partial least squares regression. Chemometrics and Intelligent Laboratory Systems, 18(3), 251-263.

__repr__() str

Return string representation.

fit(X: ArrayLike, y: ArrayLike) SIMPLS

Fit the SIMPLS model.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training data.

  • y (array-like of shape (n_samples,) or (n_samples, n_targets)) – Target values.

Returns:

self – Fitted estimator.

Return type:

SIMPLS

Raises:
  • ValueError – If backend is not ‘numpy’ or ‘jax’.

  • ImportError – If backend is ‘jax’ and JAX is not installed.
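The two error conditions above can be reproduced with a small standalone check. The helper check_backend is hypothetical and only mirrors the documented behaviour; the error messages are illustrative, not the ones the class emits.

```python
import importlib.util

def check_backend(backend):
    """Hypothetical validation helper mirroring the documented errors."""
    if backend not in ("numpy", "jax"):
        raise ValueError(f"backend must be 'numpy' or 'jax', got {backend!r}")
    # find_spec returns None when the package cannot be located
    if backend == "jax" and importlib.util.find_spec("jax") is None:
        raise ImportError("backend='jax' requires JAX: pip install jax")
    return backend

try:
    check_backend("torch")
except ValueError as exc:
    print(f"rejected: {exc}")
```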

get_params(deep: bool = True) dict

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

predict(X: ArrayLike, n_components: int | None = None) NDArray[floating]

Predict using the SIMPLS model.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Samples to predict.

  • n_components (int, optional) – Number of components to use for prediction. If None, uses all fitted components.

Returns:

y_pred – Predicted values.

Return type:

ndarray of shape (n_samples,) or (n_samples, n_targets)
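Predicting with fewer components than were fitted does not require refitting, because in SIMPLS the coefficients for the first k components are built from the first k columns of the weights and Y loadings. The sketch below shows the idea. The function predict_with_k and its arguments R, Q, x_mean, and y_mean are hypothetical stand-ins, and scaling is left out; how the class does this internally is an assumption.

```python
import numpy as np

def predict_with_k(X, R, Q, x_mean, y_mean, k=None):
    """Hypothetical helper: predict using only the first k components."""
    k = R.shape[1] if k is None else k
    if not 1 <= k <= R.shape[1]:
        raise ValueError("k must be between 1 and the number of fitted components")
    coef_k = R[:, :k] @ Q[:, :k].T  # (n_features, n_targets)
    y_pred = (np.asarray(X, dtype=float) - x_mean) @ coef_k + y_mean
    # Return a 1-D array for a single target, as documented for predict
    return y_pred.ravel() if y_pred.shape[1] == 1 else y_pred

# Placeholder arrays standing in for fitted weights and loadings
rng = np.random.default_rng(1)
R = rng.normal(size=(6, 3))
Q = rng.normal(size=(1, 3))
x_mean = np.zeros(6)
y_mean = np.array([2.0])
X_new = rng.normal(size=(5, 6))

y_all = predict_with_k(X_new, R, Q, x_mean, y_mean)       # all 3 components
y_one = predict_with_k(X_new, R, Q, x_mean, y_mean, k=1)  # first component only
```

This makes it cheap to evaluate a range of component counts from a single fit, for example when choosing n_components by validation error.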

set_params(**params) SIMPLS

Set the parameters of this estimator.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

SIMPLS

set_predict_request(*, n_components: bool | None | str = '$UNCHANGED$') SIMPLS

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

n_components (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for n_components parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') SIMPLS

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

transform(X: ArrayLike) NDArray[floating]

Transform X to score space.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to transform.

Returns:

T – X scores.

Return type:

ndarray of shape (n_samples, n_components_)
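In de Jong's formulation the weights act directly on the preprocessed X, so projecting new samples to score space is a single matrix product. The sketch below illustrates this with a one-component example. The name transform_sketch is hypothetical, scaling is omitted, and it is an assumption that the class's x_weights_ are used in exactly this way.

```python
import numpy as np

def transform_sketch(X, x_mean, R):
    """Hypothetical helper: project samples onto the score space."""
    return (np.asarray(X, dtype=float) - x_mean) @ R

rng = np.random.default_rng(2)
X_train = rng.normal(size=(30, 5))
y_train = X_train[:, 0] + 0.1 * rng.normal(size=30)

x_mean = X_train.mean(axis=0)
X0 = X_train - x_mean
# First SIMPLS direction for univariate y is proportional to X0'y0
r = X0.T @ (y_train - y_train.mean())
r /= np.linalg.norm(X0 @ r)  # rescale so the training score has unit norm
R = r.reshape(-1, 1)

T_train = transform_sketch(X_train, x_mean, R)
T_new = transform_sketch(rng.normal(size=(4, 5)), x_mean, R)
```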