nirs4all.operators.models.sklearn.kopls module

Kernel Orthogonal PLS (K-OPLS) regressor for nirs4all.

A sklearn-compatible implementation of K-OPLS that combines kernel methods with Orthogonal PLS to handle nonlinear relationships in the data. K-OPLS separates Y-predictive variation from Y-orthogonal variation in kernel space.

This implementation is based on the ConsensusOPLS R package algorithm from https://github.com/sib-swiss/ConsensusOPLS, which itself is based on the original K-OPLS algorithm by Bylesjo, Rantalainen, et al.

Supports both NumPy (CPU) and JAX (GPU/TPU) backends.

References

Bylesjo, M., Rantalainen, M., Cloarec, O., Nicholson, J. K., Holmes, E., & Trygg, J. (2006). OPLS discriminant analysis: combining the strengths of PLS-DA and SIMCA classification. Journal of Chemometrics, 20(8-10), 341-351.
Rantalainen, M., Bylesjo, M., Cloarec, O., Nicholson, J. K., Holmes, E., & Trygg, J. (2007). Kernel-based orthogonal projections to latent structures (K-OPLS). Journal of Chemometrics, 21(7-9), 376-385.
ConsensusOPLS R package: https://github.com/sib-swiss/ConsensusOPLS

class nirs4all.operators.models.sklearn.kopls.KOPLS(n_components: int = 5, n_ortho_components: int = 1, kernel: Literal['linear', 'rbf', 'poly'] = 'rbf', gamma: float | None = None, degree: int = 3, coef0: float = 1.0, center: bool = True, scale: bool = True, backend: str = 'numpy')[source]

Bases: BaseEstimator, RegressorMixin

Kernel Orthogonal PLS (K-OPLS) regressor.

K-OPLS combines kernel methods with Orthogonal PLS to handle nonlinear relationships in the data. It first removes Y-orthogonal variation from the kernel matrix, then fits a kernel PLS model on the filtered kernel.

This implementation follows the algorithm from ConsensusOPLS R package, which is based on the original K-OPLS algorithm by Rantalainen et al.
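The filtering step described above can be illustrated with a heavily simplified NumPy sketch. It is not the package's code: the helper name `deflate_once`, the restriction to a single centered target, and the removal of exactly one orthogonal component are assumptions made to keep the example short. It shows only how a unit-norm orthogonal score vector can be projected out of a kernel matrix.

```python
import numpy as np

def deflate_once(K, y):
    """One simplified orthogonal-filtering step for a single centered target.

    Hypothetical helper for illustration; not part of nirs4all.
    """
    y = y.reshape(-1, 1)
    # Predictive score direction: pass y through the kernel and rescale.
    t_p = K @ y / np.sqrt((y.T @ K @ y).item())
    # Part of the kernel that the predictive score does not explain.
    residual = K - t_p @ t_p.T
    # Candidate orthogonal score, normalized to unit length.
    t_o = residual @ t_p
    t_o /= np.linalg.norm(t_o)
    # Project the orthogonal direction out of the kernel on both sides.
    P = np.eye(K.shape[0]) - t_o @ t_o.T
    return P @ K @ P, t_o

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 6))
y = X[:, 0] + 0.1 * rng.standard_normal(20)

Xc = X - X.mean(axis=0)
K = Xc @ Xc.T  # centered linear kernel, chosen only for simplicity
K_filtered, t_o = deflate_once(K, y - y.mean())
```

After this step the filtered kernel no longer carries any variation along `t_o`, so `K_filtered @ t_o` is numerically zero. A kernel PLS model would then be fitted on `K_filtered`.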

Parameters:

n_components (int, default=5) – Number of predictive PLS components.
n_ortho_components (int, default=1) – Number of orthogonal components to remove. These capture variation in X that is uncorrelated with Y and would otherwise degrade prediction.
kernel (str, default='rbf') – Kernel function to use:
    - ‘linear’: Linear kernel, K(x, y) = x^T y
    - ‘rbf’: Radial basis function, K(x, y) = exp(-gamma ||x - y||^2)
    - ‘poly’: Polynomial kernel, K(x, y) = (gamma x^T y + coef0)^degree
gamma (float, optional) – Kernel coefficient for ‘rbf’ and ‘poly’ kernels. If None, uses 1/n_features.
degree (int, default=3) – Degree for polynomial kernel.
coef0 (float, default=1.0) – Independent term in polynomial kernel.
center (bool, default=True) – Whether to center the kernel matrix.
scale (bool, default=True) – Whether to scale Y to unit variance.
backend (str, default='numpy') – Computational backend to use:
    - ‘numpy’: NumPy backend (CPU only).
    - ‘jax’: JAX backend (supports GPU/TPU acceleration).
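The three kernel formulas listed for the kernel parameter can be written out directly in NumPy. The sketch below is an illustration under assumptions, not the package's implementation: the function names are hypothetical, and the gamma fallback of 1/n_features simply mirrors the documented default.

```python
import numpy as np

def linear_kernel(A, B):
    # K(x, y) = x^T y
    return A @ B.T

def rbf_kernel(A, B, gamma=None):
    # K(x, y) = exp(-gamma ||x - y||^2), with gamma = 1/n_features if None
    gamma = 1.0 / A.shape[1] if gamma is None else gamma
    sq_dist = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * (A @ B.T)
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def poly_kernel(A, B, gamma=None, degree=3, coef0=1.0):
    # K(x, y) = (gamma x^T y + coef0)^degree
    gamma = 1.0 / A.shape[1] if gamma is None else gamma
    return (gamma * (A @ B.T) + coef0) ** degree

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 4))

K_lin = linear_kernel(X, X)
K_rbf = rbf_kernel(X, X)
K_poly = poly_kernel(X, X)
```

Each call returns an (n_samples, n_samples) matrix here. Passing different arrays for `A` and `B` gives the rectangular kernel between new samples and training samples.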

n_features_in_

Number of features seen during fit.

Type: int

n_components_

Actual number of predictive components used.

Type: int

n_ortho_components_

Actual number of orthogonal components used.

Type: int

X_train_

Training data (stored for kernel computation at predict time).

Type: ndarray of shape (n_samples, n_features)

y_mean_

Mean of Y.

Type: ndarray of shape (n_targets,)

y_std_

Standard deviation of Y.

Type: ndarray of shape (n_targets,)

x_scores_

X scores from filtered kernel PLS (T).

Type: ndarray of shape (n_samples, n_components)

y_scores_

Y scores (U).

Type: ndarray of shape (n_samples, n_components)

y_loadings_

Y loadings (C).

Type: ndarray of shape (n_targets, n_components)

ortho_scores_

Orthogonal scores (T_ortho).

Type: ndarray of shape (n_samples, n_ortho_components)
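The training data is kept because a kernel model needs the kernel between new samples and the training samples at predict time. When the kernel is centered, the test kernel is usually centered with the training statistics as well. The sketch below shows that standard formula with a linear kernel so the result can be checked against explicit mean-centering. The helper name `center_test_kernel` is hypothetical, and whether the package centers in exactly this way is an assumption.

```python
import numpy as np

def center_test_kernel(K_test, K_train):
    """Center a (n_test, n_train) kernel using training-set statistics.

    Hypothetical helper for illustration; not part of nirs4all.
    """
    m, n = K_test.shape
    J_test = np.full((m, n), 1.0 / n)  # averages over training samples
    J = np.full((n, n), 1.0 / n)
    return K_test - J_test @ K_train - K_test @ J + J_test @ K_train @ J

rng = np.random.default_rng(1)
X_train = rng.standard_normal((10, 3))
X_test = rng.standard_normal((4, 3))

K_train = X_train @ X_train.T  # linear kernel on training data
K_test = X_test @ X_train.T    # kernel between test and training data
Kc_test = center_test_kernel(K_test, K_train)

# For a linear kernel, centering in kernel space matches subtracting the
# training mean from both sets before taking inner products.
mu = X_train.mean(axis=0)
expected = (X_test - mu) @ (X_train - mu).T
```

For nonlinear kernels the feature-space mean is never formed explicitly, which is why the centering is expressed purely in terms of kernel matrices.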

Examples

>>> from nirs4all.operators.models.sklearn.kopls import KOPLS
>>> import numpy as np
>>> # Generate nonlinear data
>>> np.random.seed(42)
>>> X = np.random.randn(100, 50)
>>> y = np.sin(X[:, :5].sum(axis=1)) + 0.1 * np.random.randn(100)
>>> # Fit K-OPLS with RBF kernel
>>> model = KOPLS(n_components=5, n_ortho_components=2, kernel='rbf')
>>> model.fit(X, y)
KOPLS(...)
>>> predictions = model.predict(X)
>>> # Transform to score space
>>> T = model.transform(X)
>>> print(T.shape)
(100, 5)

__repr__() → str[source]: Return string representation.

fit(X, y) → KOPLS

Fit the K-OPLS model.

Parameters:

X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,) or (n_samples, n_targets)) – Target values.

Returns:

self – Fitted estimator.

Return type:

KOPLS

get_params(deep: bool = True) → dict[source]: Get parameters for this estimator.

predict(X) → ndarray

Predict using the K-OPLS model.

Parameters: X (array-like of shape (n_samples, n_features)) – Samples to predict.
Returns: y_pred – Predicted values.
Return type: ndarray of shape (n_samples,) or (n_samples, n_targets)

set_params(**params) → KOPLS[source]: Set the parameters of this estimator.

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → KOPLS

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters: sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
Returns: self – The updated object.
Return type: object

transform(X) → ndarray

Transform X to K-OPLS score space.

Parameters: X (array-like of shape (n_samples, n_features)) – Samples to transform.
Returns: T – X scores in the filtered kernel PLS space.
Return type: ndarray of shape (n_samples, n_components_)