nirs4all.operators.models.sklearn.nlpls module

Nonlinear PLS (NL-PLS / Kernel PLS) regressor for nirs4all.

A sklearn-compatible implementation of Nonlinear PLS using kernel methods. This approach maps the data into a higher-dimensional feature space using a kernel function (e.g., RBF) and then fits a standard PLS model on the kernel matrix.

Supports both NumPy (CPU) and JAX (GPU/TPU) backends.

Two implementations are provided:

  1. KernelPLS (KPLS): simple Kernel PLS. Maps X into kernel space using a nonlinear kernel (RBF, polynomial, etc.) and fits PLS on the kernel matrix K = kernel(X, X).

  2. MIRPLS: Monotonic Inner Relation PLS (experimental). Implements the MIR-PLS algorithm from Zheng et al. (2024), which uses monotonic cubic spline piecewise regression for the inner model.

References

  • Rosipal, R., & Trejo, L. J. (2001). Kernel partial least squares regression in reproducing kernel Hilbert space. Journal of Machine Learning Research, 2, 97-123.

  • Zheng, X., Nie, B., Du, J., et al. (2024). A non-linear partial least squares based on monotonic inner relation. Frontiers in Physiology, 15, 1369165. doi:10.3389/fphys.2024.1369165

  • Qin, S. J., & McAvoy, T. J. (1992). Nonlinear PLS modeling using neural networks. Computers & Chemical Engineering, 16(4), 379-391.

nirs4all.operators.models.sklearn.nlpls.KPLS

alias of KernelPLS

class nirs4all.operators.models.sklearn.nlpls.KernelPLS(n_components: int = 10, kernel: Literal['rbf', 'linear', 'poly', 'sigmoid'] = 'rbf', gamma: float | None = None, degree: int = 3, coef0: float = 1.0, center_kernel: bool = True, scale_y: bool = True, backend: str = 'numpy')[source]

Bases: BaseEstimator, RegressorMixin

Nonlinear PLS using Kernel Methods (Kernel PLS / NL-PLS).

Kernel PLS maps the input data X into a higher-dimensional feature space using a kernel function (RBF, polynomial, sigmoid) and then fits a PLS model on the kernel matrix K(X, X). This allows capturing nonlinear relationships between X and Y while retaining the interpretability of PLS.

The algorithm:

  1. Compute the kernel matrix K = kernel(X_train, X_train).

  2. Center the kernel matrix.

  3. Fit PLS on the centered K with target Y.

  4. For prediction, compute K_test = kernel(X_test, X_train), center it, and predict.

The approach combines the nonlinear modeling capacity of kernel methods with the dimensionality reduction of PLS.

Parameters:
  • n_components (int, default=10) – Number of PLS components to extract.

  • kernel ({'rbf', 'linear', 'poly', 'sigmoid'}, default='rbf') – Kernel function to use:

      - ‘rbf’: Radial basis function, K(x, y) = exp(-gamma ||x - y||^2)

      - ‘linear’: Linear kernel, K(x, y) = x^T y (equivalent to standard PLS)

      - ‘poly’: Polynomial kernel, K(x, y) = (gamma * x^T y + coef0)^degree

      - ‘sigmoid’: Sigmoid kernel, K(x, y) = tanh(gamma * x^T y + coef0)

  • gamma (float, optional) – Kernel coefficient for ‘rbf’, ‘poly’, and ‘sigmoid’ kernels. If None, defaults to 1/n_features.

  • degree (int, default=3) – Degree for polynomial kernel.

  • coef0 (float, default=1.0) – Independent term in polynomial and sigmoid kernels.

  • center_kernel (bool, default=True) – Whether to center the kernel matrix. Recommended for most cases.

  • scale_y (bool, default=True) – Whether to center and scale Y to zero mean and unit variance.

  • backend (str, default='numpy') – Computational backend to use:

      - ‘numpy’: NumPy backend (CPU only).

      - ‘jax’: JAX backend (supports GPU/TPU acceleration).
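The kernel formulas listed under the kernel parameter can be checked numerically. The following is a minimal sketch, assuming gamma falls back to 1/n_features as documented above. The arrays X and Y are synthetic, and scikit-learn's pairwise kernels serve only as an independent cross-check; whether KernelPLS calls them internally is not stated here.

```python
import numpy as np
from sklearn.metrics.pairwise import (
    linear_kernel,
    polynomial_kernel,
    rbf_kernel,
    sigmoid_kernel,
)

# Synthetic inputs (assumption: small shapes chosen for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
Y = rng.normal(size=(3, 4))

gamma = 1.0 / X.shape[1]  # documented default when gamma is None
degree, coef0 = 3, 1.0    # documented defaults

dot = X @ Y.T                                                  # x^T y for every pair
sq_dist = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)  # ||x - y||^2

K_rbf = np.exp(-gamma * sq_dist)
K_linear = dot
K_poly = (gamma * dot + coef0) ** degree
K_sigmoid = np.tanh(gamma * dot + coef0)

# Compare each hand-written formula with the scikit-learn equivalent
checks = {
    "rbf": np.allclose(K_rbf, rbf_kernel(X, Y, gamma=gamma)),
    "rbf_default_gamma": np.allclose(K_rbf, rbf_kernel(X, Y)),
    "linear": np.allclose(K_linear, linear_kernel(X, Y)),
    "poly": np.allclose(
        K_poly, polynomial_kernel(X, Y, degree=degree, gamma=gamma, coef0=coef0)
    ),
    "sigmoid": np.allclose(
        K_sigmoid, sigmoid_kernel(X, Y, gamma=gamma, coef0=coef0)
    ),
}
```

Each kernel matrix has shape (n_samples_X, n_samples_Y), here (6, 3).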

n_features_in_

Number of features seen during fit.

Type:

int

n_components_

Actual number of components used.

Type:

int

X_train_

Training data (stored for kernel computation at predict time).

Type:

ndarray of shape (n_train, n_features)

K_train_

Raw (uncentered) training kernel matrix.

Type:

ndarray of shape (n_train, n_train)

y_mean_

Mean of Y (if scale_y=True).

Type:

ndarray of shape (n_targets,)

y_std_

Standard deviation of Y (if scale_y=True).

Type:

ndarray of shape (n_targets,)

x_scores_

X scores in kernel space (T).

Type:

ndarray of shape (n_train, n_components)

y_scores_

Y scores (U).

Type:

ndarray of shape (n_train, n_components)

coef_

Kernel regression coefficients.

Type:

ndarray of shape (n_train, n_targets)

Examples

>>> from nirs4all.operators.models.sklearn.nlpls import KernelPLS
>>> import numpy as np
>>> # Generate nonlinear data
>>> np.random.seed(42)
>>> X = np.random.randn(100, 50)
>>> y = np.sin(X[:, :5].sum(axis=1)) + 0.1 * np.random.randn(100)
>>> # Fit Kernel PLS with RBF kernel
>>> model = KernelPLS(n_components=10, kernel='rbf', gamma=0.1)
>>> model.fit(X, y)
KernelPLS(...)
>>> predictions = model.predict(X)
>>> print(f"R^2 score: {model.score(X, y):.4f}")

Notes

Kernel PLS is particularly useful when:

  • The relationship between X and Y is nonlinear.

  • Standard linear PLS gives poor predictions.

  • You want to use kernel methods but need PLS-style dimensionality reduction.

The choice of kernel and gamma parameter significantly affects performance. Cross-validation is recommended for hyperparameter tuning.

For NIRS data, the RBF kernel with small gamma often works well for capturing nonlinear spectral-property relationships.

See also

KOPLS

Kernel OPLS with orthogonal variation filtering.

sklearn.cross_decomposition.PLSRegression

Standard linear PLS.

References

  • Rosipal, R., & Trejo, L. J. (2001). Kernel partial least squares regression in reproducing kernel Hilbert space. Journal of Machine Learning Research, 2, 97-123.

__repr__() str[source]

Return string representation.

fit(X: _Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | complex | bytes | str | _NestedSequence[complex | bytes | str], y: _Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | complex | bytes | str | _NestedSequence[complex | bytes | str]) KernelPLS[source]

Fit the Kernel PLS model.

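Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training data.

  • y (array-like of shape (n_samples,) or (n_samples, n_targets)) – Target values.

Returns: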

self – Fitted estimator.

Return type:

KernelPLS

Raises:
  • ValueError – If backend is not ‘numpy’ or ‘jax’, or if kernel is not one of the supported types.

  • ImportError – If backend is ‘jax’ and JAX is not installed.

get_params(deep: bool = True) dict[source]

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

predict(X: _Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | complex | bytes | str | _NestedSequence[complex | bytes | str], n_components: int | None = None) ndarray[tuple[Any, ...], dtype[floating]][source]

Predict using the Kernel PLS model.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Samples to predict.

  • n_components (int, optional) – Number of components to use for prediction. If None, uses all fitted components.

Returns:

y_pred – Predicted values.

Return type:

ndarray of shape (n_samples,) or (n_samples, n_targets)

set_params(**params) KernelPLS[source]

Set the parameters of this estimator.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

KernelPLS

set_predict_request(*, n_components: bool | None | str = '$UNCHANGED$') KernelPLS

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

n_components (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for n_components parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') KernelPLS

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

transform(X: _Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | complex | bytes | str | _NestedSequence[complex | bytes | str]) ndarray[tuple[Any, ...], dtype[floating]][source]

Transform X to kernel PLS score space.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to transform.

Returns:

T – X scores in kernel space.

Return type:

ndarray of shape (n_samples, n_components_)

nirs4all.operators.models.sklearn.nlpls.NLPLS

alias of KernelPLS