nirs4all.operators.models.sklearn.kopls module
Kernel Orthogonal PLS (K-OPLS) regressor for nirs4all.
A sklearn-compatible implementation of K-OPLS that combines kernel methods with Orthogonal PLS to handle nonlinear relationships in the data. K-OPLS separates Y-predictive variation from Y-orthogonal variation in kernel space.
This implementation is based on the ConsensusOPLS R package algorithm from https://github.com/sib-swiss/ConsensusOPLS, which itself is based on the original K-OPLS algorithm by Bylesjo, Rantalainen, et al.
Supports both NumPy (CPU) and JAX (GPU/TPU) backends.
References
Bylesjo, M., Rantalainen, M., Cloarec, O., Nicholson, J. K., Holmes, E., & Trygg, J. (2006). OPLS discriminant analysis: combining the strengths of PLS-DA and SIMCA classification. Journal of Chemometrics, 20(8-10), 341-351.
Rantalainen, M., Bylesjo, M., Cloarec, O., Nicholson, J. K., Holmes, E., & Trygg, J. (2007). Kernel-based orthogonal projections to latent structures (K-OPLS). Journal of Chemometrics, 21(7-9), 376-385.
ConsensusOPLS R package: https://github.com/sib-swiss/ConsensusOPLS
- class nirs4all.operators.models.sklearn.kopls.KOPLS(n_components: int = 5, n_ortho_components: int = 1, kernel: Literal['linear', 'rbf', 'poly'] = 'rbf', gamma: float | None = None, degree: int = 3, coef0: float = 1.0, center: bool = True, scale: bool = True, backend: str = 'numpy')
Bases: BaseEstimator, RegressorMixin
Kernel Orthogonal PLS (K-OPLS) regressor.
K-OPLS combines kernel methods with Orthogonal PLS to handle nonlinear relationships in the data. It first removes Y-orthogonal variation from the kernel matrix, then fits a kernel PLS model on the filtered kernel.
This implementation follows the algorithm from ConsensusOPLS R package, which is based on the original K-OPLS algorithm by Rantalainen et al.
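The filtering step described above can be illustrated with a minimal NumPy sketch. It assumes a unit-norm Y-orthogonal score vector has already been found; here `t_o` is a random stand-in, whereas K-OPLS derives it from the kernel and Y. The helper name `deflate_kernel` is hypothetical and is not part of nirs4all.

```python
import numpy as np

def deflate_kernel(K, t_ortho):
    """Remove the direction spanned by t_ortho from a symmetric training kernel K."""
    t = t_ortho / np.linalg.norm(t_ortho)        # normalize the score vector
    P = np.eye(K.shape[0]) - np.outer(t, t)      # projector onto the complement of t
    return P @ K @ P                             # two-sided deflation keeps K symmetric

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
K = X @ X.T                                      # linear kernel used as a stand-in
t_o = rng.standard_normal(20)                    # hypothetical orthogonal score vector
K_filtered = deflate_kernel(K, t_o)

# After deflation the kernel carries no variation along t_o.
print(np.abs(K_filtered @ t_o).max())
```

Repeating this step once per orthogonal component is what n_ortho_components would control under this reading; the predictive kernel PLS model is then fitted on the deflated kernel.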
- Parameters:
n_components (int, default=5) – Number of predictive PLS components.
n_ortho_components (int, default=1) – Number of orthogonal components to remove. These capture Y-orthogonal variation, that is, systematic variation in X that is uncorrelated with Y and would otherwise degrade the predictive components.
kernel (str, default='rbf') – Kernel function to use:
- ‘linear’: Linear kernel, K(x, y) = x^T y
- ‘rbf’: Radial basis function kernel, K(x, y) = exp(-gamma ||x - y||^2)
- ‘poly’: Polynomial kernel, K(x, y) = (gamma x^T y + coef0)^degree
gamma (float, optional) – Kernel coefficient for ‘rbf’ and ‘poly’ kernels. If None, uses 1/n_features.
degree (int, default=3) – Degree for polynomial kernel.
coef0 (float, default=1.0) – Independent term in polynomial kernel.
center (bool, default=True) – Whether to center the kernel matrix.
scale (bool, default=True) – Whether to scale Y to unit variance.
backend (str, default='numpy') – Computational backend to use:
- ‘numpy’: NumPy backend (CPU only).
- ‘jax’: JAX backend (supports GPU/TPU acceleration).
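As an illustration of the kernel, gamma, degree, and coef0 parameters, the three kernel formulas can be written directly in NumPy. This is a sketch of the formulas as documented, not the library's internal code; the function names are hypothetical, and the gamma default of 1/n_features follows the parameter description above.

```python
import numpy as np

def linear_kernel(X, Y):
    """K(x, y) = x^T y"""
    return X @ Y.T

def rbf_kernel(X, Y, gamma=None):
    """K(x, y) = exp(-gamma ||x - y||^2)"""
    if gamma is None:
        gamma = 1.0 / X.shape[1]                 # documented default: 1 / n_features
    # Squared Euclidean distances via ||x||^2 + ||y||^2 - 2 x^T y
    sq = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * (X @ Y.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clip tiny negative rounding errors

def poly_kernel(X, Y, gamma=None, degree=3, coef0=1.0):
    """K(x, y) = (gamma x^T y + coef0)^degree"""
    if gamma is None:
        gamma = 1.0 / X.shape[1]
    return (gamma * (X @ Y.T) + coef0) ** degree

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 4))
K_lin = linear_kernel(X, X)
K_rbf = rbf_kernel(X, X)
K_poly = poly_kernel(X, X)
print(K_lin.shape, K_rbf.shape, K_poly.shape)
```

With the RBF kernel every sample has similarity 1 with itself, so the diagonal of `K_rbf` is all ones regardless of gamma.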
- n_features_in\_
Number of features seen during fit.
- Type:
int
- n_components\_
Actual number of predictive components used.
- Type:
int
- n_ortho_components\_
Actual number of orthogonal components used.
- Type:
int
- X_train\_
Training data (stored for kernel computation at predict time).
- Type:
ndarray of shape (n_samples, n_features)
- y_mean\_
Mean of Y.
- Type:
ndarray of shape (n_targets,)
- y_std\_
Standard deviation of Y.
- Type:
ndarray of shape (n_targets,)
- x_scores\_
X scores from filtered kernel PLS (T).
- Type:
ndarray of shape (n_samples, n_components)
- y_scores\_
Y scores (U).
- Type:
ndarray of shape (n_samples, n_components)
- y_loadings\_
Y loadings (C).
- Type:
ndarray of shape (n_targets, n_components)
- ortho_scores\_
Orthogonal scores (T_ortho).
- Type:
ndarray of shape (n_samples, n_ortho_components)
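The role of y_mean\_ and y_std\_ can be sketched as a standardize-then-restore round trip. This is a minimal illustration, not the library's code: the use of the sample standard deviation (`ddof=1`) and the guard against zero variance are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
y = 3.0 + 2.0 * rng.standard_normal((50, 1))     # one target column

y_mean = y.mean(axis=0)                          # plays the role of y_mean_
y_std = y.std(axis=0, ddof=1)                    # plays the role of y_std_ (ddof=1 is an assumption)
y_std = np.where(y_std == 0, 1.0, y_std)         # avoid division by zero for constant targets

y_scaled = (y - y_mean) / y_std                  # what the model would be fitted on
y_restored = y_scaled * y_std + y_mean           # how predictions map back to original units

print(y_mean.shape, y_std.shape)
```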
Examples
>>> from nirs4all.operators.models.sklearn.kopls import KOPLS
>>> import numpy as np
>>> # Generate nonlinear data
>>> np.random.seed(42)
>>> X = np.random.randn(100, 50)
>>> y = np.sin(X[:, :5].sum(axis=1)) + 0.1 * np.random.randn(100)
>>> # Fit K-OPLS with RBF kernel
>>> model = KOPLS(n_components=5, n_ortho_components=2, kernel='rbf')
>>> model.fit(X, y)
KOPLS(...)
>>> predictions = model.predict(X)
>>> # Transform to score space
>>> T = model.transform(X)
>>> print(T.shape)
(100, 5)
- fit(X: ArrayLike, y: ArrayLike) → KOPLS
Fit the K-OPLS model.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,) or (n_samples, n_targets)) – Target values.
- Returns:
self – Fitted estimator.
- Return type:
KOPLS
- predict(X: ArrayLike) → ndarray[tuple[Any, ...], dtype[floating]]
Predict using the K-OPLS model.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
y_pred – Predicted values.
- Return type:
ndarray of shape (n_samples,) or (n_samples, n_targets)
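Prediction needs the stored training data because the kernel between new samples and training samples has to be computed, and, when the kernel is centered, it has to be centered with the training statistics. The sketch below shows one standard way to center such a test kernel and checks it against explicit mean-centering for a linear kernel. Whether KOPLS uses exactly this formula internally is an assumption, and `center_test_kernel` is a hypothetical helper name.

```python
import numpy as np

def center_test_kernel(K_test, K_train):
    """Center a (n_test, n_train) kernel using the training kernel's statistics."""
    n = K_train.shape[0]
    one_n = np.full((n, n), 1.0 / n)                   # averaging over training samples
    one_t = np.full((K_test.shape[0], n), 1.0 / n)
    return K_test - one_t @ K_train - K_test @ one_n + one_t @ K_train @ one_n

rng = np.random.default_rng(0)
X_train = rng.standard_normal((30, 6))                 # stands in for X_train_
X_new = rng.standard_normal((8, 6))

K_train = X_train @ X_train.T                          # linear kernel, shape (30, 30)
K_test = X_new @ X_train.T                             # cross-kernel, shape (8, 30)
K_test_centered = center_test_kernel(K_test, K_train)

# Reference: for a linear kernel, centering in feature space is plain mean-centering.
mu = X_train.mean(axis=0)
K_reference = (X_new - mu) @ (X_train - mu).T
print(np.abs(K_test_centered - K_reference).max())
```

The same formula applies to nonlinear kernels, where the feature-space mean cannot be computed explicitly.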
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → KOPLS
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- transform(X: ArrayLike) → ndarray[tuple[Any, ...], dtype[floating]]
Transform X to K-OPLS score space.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to transform.
- Returns:
T – X scores in the filtered kernel PLS space.
- Return type:
ndarray of shape (n_samples, n_components_)