X (array-like of shape (n_samples, n_features)) – Samples to predict.
Returns:
y_pred – Predicted values.
Return type:
ndarray of shape (n_samples,) or (n_samples, n_targets)
Notes
DiPLS uses Hankelization, which may produce fewer predictions than
input samples. This implementation pads the beginning of the output with
the first predicted value to maintain compatibility with sklearn cross-validation.
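The padding described in this note can be sketched as follows. This is a minimal illustration, assuming the shortened prediction array is left-padded with copies of its first row; `pad_predictions` is a hypothetical helper name, not part of the library's API.

```python
import numpy as np

def pad_predictions(y_short, n_samples):
    """Left-pad predictions with the first predicted value to reach n_samples rows."""
    y_short = np.asarray(y_short, dtype=float)
    n_missing = n_samples - y_short.shape[0]
    if n_missing < 0:
        raise ValueError("more predictions than input samples")
    if n_missing == 0:
        return y_short
    # Repeat the first prediction n_missing times along the sample axis.
    head = np.repeat(y_short[:1], n_missing, axis=0)
    return np.concatenate([head, y_short], axis=0)

# Hypothetical case: 5 input samples, but only 3 predictions after Hankelization.
y_short = np.array([0.7, 0.9, 1.1])
y_pred = pad_predictions(y_short, n_samples=5)
```

Keeping the output length equal to the input length is what lets sklearn utilities such as cross-validation align predictions with targets.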
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a
sub-estimator within a meta-estimator and metadata routing is enabled
with enable_metadata_routing=True (see sklearn.set_config()).
Please check the User Guide on how the routing
mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the
existing request. This allows you to change the request for some
parameters and not others.
Added in version 1.3.
Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
FCK-PLS builds spectral features by convolving input spectra with a bank
of fractional order filters, then applies PLS regression on the expanded
feature space. This approach captures derivative-like information at
various fractional orders.
The pipeline is:
1. Optional standardization of X and Y
2. FractionalConvFeaturizer: X -> X_feat (feature expansion)
3. PLSRegression: X_feat, Y -> predictions
Parameters:
n_components (int, default=10) – Number of PLS components to extract.
alphas (sequence of float, default=(0.0, 0.5, 1.0, 1.5, 2.0)) – Fractional orders for the filter bank.
sigmas (sequence of float, default=(2.0,)) – Scale parameters for fractional kernels.
kernel_size (int, default=15) – Size of convolution kernels (must be odd).
mode (str, default='same') – Convolution mode: ‘same’ or ‘valid’.
kernel_type (str, default='heuristic') – Fractional kernel type: ‘heuristic’ or ‘grunwald’.
standardize (bool, default=True) – Whether to standardize X and Y before fitting.
Convolutional featurizer using a bank of fractional filters.
Builds features by convolving input spectra with multiple fractional
order filters at different scales. This captures derivative-like
information at various fractional orders, which can be useful for
identifying spectral features.
Parameters:
alphas (sequence of float, default=(0.0, 0.5, 1.0, 1.5, 2.0)) – Fractional orders for the filter bank.
- 0: Smoothing/identity-like
- 0.5: Half-derivative
- 1: First derivative
- 1.5: Fractional between 1st and 2nd derivative
- 2: Second derivative
sigmas (sequence of float, default=(2.0,)) – Scale parameters. If single value, same sigma for all alphas.
If same length as alphas, pairs (alpha[i], sigma[i]).
kernel_size (int, default=15) – Size of convolution kernels (should be odd).
mode (str, default='same') – Convolution mode:
- ‘same’: Output same length as input
- ‘valid’: Output shorter (no padding)
kernel_type (str, default='heuristic') – Type of fractional kernel:
- ‘heuristic’: Gaussian-modulated fractional power
- ‘grunwald’: Grünwald-Letnikov coefficients
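To make the filter-bank idea concrete, here is a rough sketch of the 'grunwald' kernel type using the standard Grünwald-Letnikov recurrence. The function names are hypothetical, the sigma scale parameter and the 'heuristic' kernel are not modelled, and the library's actual kernels may be normalized or centered differently.

```python
import numpy as np

def grunwald_letnikov_kernel(alpha, kernel_size):
    """Weights w_k = (-1)^k * binom(alpha, k) for k = 0 .. kernel_size - 1."""
    w = np.empty(kernel_size)
    w[0] = 1.0
    for k in range(1, kernel_size):
        # Standard recurrence for the signed generalized binomial coefficients.
        w[k] = w[k - 1] * (1.0 - (alpha + 1.0) / k)
    return w

def fractional_features(X, alphas, kernel_size=15, mode="same"):
    """Convolve each spectrum (row of X) with one kernel per alpha, then stack."""
    X = np.asarray(X, dtype=float)
    blocks = []
    for alpha in alphas:
        kernel = grunwald_letnikov_kernel(alpha, kernel_size)
        blocks.append(np.stack([np.convolve(row, kernel, mode=mode) for row in X]))
    # One block of columns per fractional order.
    return np.hstack(blocks)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 50))  # 4 synthetic spectra, 50 wavelengths
X_feat = fractional_features(X, alphas=(0.0, 0.5, 1.0, 1.5, 2.0))
```

For integer orders the weights reduce to familiar finite differences: order 1 gives (1, -1, 0, ...) and order 2 gives (1, -2, 1, 0, ...). With mode='same' each filter keeps the input length, so five orders turn 50 wavelengths into 250 features.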
A sklearn-compatible wrapper for the ikpls package, which provides
fast PLS implementations using NumPy or JAX (for GPU/TPU acceleration).
IKPLS is significantly faster than sklearn’s PLSRegression, especially
for cross-validation.
Parameters:
n_components (int, default=10) – Number of PLS components to extract.
algorithm (int, default=1) – IKPLS algorithm variant (1 or 2). Algorithm 1 is generally faster.
center (bool, default=True) – Whether to center X and Y before fitting.
scale (bool, default=True) – Whether to scale X and Y before fitting.
backend (str, default='numpy') – Backend to use for computation. Options are:
- ‘numpy’: Use NumPy backend (CPU only).
- ‘jax’: Use JAX backend (supports GPU/TPU acceleration).
JAX backend requires JAX to be installed: pip install jax
For GPU support: pip install jax[cuda12]
Configure whether metadata should be requested to be passed to the predict method.
Note that this method is only relevant when this estimator is used as a
sub-estimator within a meta-estimator and metadata routing is enabled
with enable_metadata_routing=True (see sklearn.set_config()).
Please check the User Guide on how the routing
mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the
existing request. This allows you to change the request for some
parameters and not others.
Added in version 1.3.
Parameters:
n_components (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for n_components parameter in predict.
iPLS evaluates PLS models on contiguous wavelength intervals to identify
optimal spectral regions for prediction. This is particularly useful for
NIR spectroscopy where not all wavelengths contribute equally to the
prediction.
The algorithm divides the spectrum into intervals and evaluates each
interval (or combination of intervals) using cross-validation. Different
selection modes are available:
- ‘single’: Select only the best performing interval
- ‘forward’: Iteratively add intervals that improve performance
- ‘backward’: Start with all intervals and remove those that don’t help
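The interval splitting and the 'forward' mode can be sketched as below. This is an illustrative outline only: `interval_ranges` and `forward_select` are hypothetical helpers, and the toy score stands in for the cross-validated PLS score that would be used in practice.

```python
def interval_ranges(n_features, n_intervals):
    """Return (start, end) index pairs for contiguous, near-equal-width intervals."""
    edges = [(i * n_features) // n_intervals for i in range(n_intervals + 1)]
    return list(zip(edges[:-1], edges[1:]))

def forward_select(n_candidates, score):
    """Greedy forward selection: add the best remaining interval while the score improves."""
    selected, best = [], float("-inf")
    while len(selected) < n_candidates:
        remaining = [i for i in range(n_candidates) if i not in selected]
        trial_score, trial = max((score(selected + [i]), i) for i in remaining)
        if trial_score <= best:
            break  # no remaining interval improves the score
        selected.append(trial)
        best = trial_score
    return selected, best

ranges = interval_ranges(n_features=200, n_intervals=10)

# Toy stand-in for a cross-validated score: intervals 2 and 3 help, the rest hurt.
gains = {2: 0.5, 3: 0.3}

def toy_score(interval_indices):
    return sum(gains.get(i, -0.1) for i in interval_indices)

selected, best = forward_select(len(ranges), toy_score)
regions = [ranges[i] for i in selected]
```

Backward elimination follows the same pattern in reverse: start from all intervals and drop the one whose removal most improves the score, stopping when no removal helps.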
Parameters:
n_components (int, default=5) – Number of PLS components to extract for each interval model.
n_intervals (int, default=10) – Number of equal-width intervals to divide X into.
interval_width (int, optional) – Fixed width for each interval. If specified, overrides n_intervals.
cv (int, default=5) – Number of cross-validation folds for interval evaluation.
scoring (str, default='r2') – Scoring metric for cross-validation. Supports sklearn metrics
like ‘r2’, ‘neg_mean_squared_error’, etc.
mode ({'single', 'forward', 'backward'}, default='forward') – Interval selection mode:
- ‘single’: Use only the best single interval
- ‘forward’: Forward selection of intervals
- ‘backward’: Backward elimination of intervals
combination_method ({'best', 'union'}, default='union') – How to combine selected intervals for the final model:
- ‘best’: Use only the single best interval
- ‘union’: Use union of all selected intervals
backend (str, default='numpy') – Computational backend:
- ‘numpy’: NumPy backend (CPU only, default)
- ‘jax’: JAX backend (supports GPU/TPU acceleration)
Note: JAX backend accelerates interval evaluation but final
model fitting uses sklearn for compatibility.
Examples
>>> from nirs4all.operators.models.sklearn.ipls import IntervalPLS
>>> import numpy as np
>>> # Generate sample spectral data
>>> np.random.seed(42)
>>> X = np.random.randn(100, 200)  # 200 wavelengths
>>> y = X[:, 50:70].sum(axis=1) + 0.1 * np.random.randn(100)  # Signal in 50-70
>>> # Fit iPLS to find informative regions
>>> model = IntervalPLS(n_components=5, n_intervals=10, mode='forward')
>>> model.fit(X, y)
IntervalPLS(n_components=5, n_intervals=10, mode='forward')
>>> # See which intervals were selected
>>> print(f"Selected intervals: {model.selected_intervals_}")
>>> print(f"Selected regions: {model.selected_regions_}")
>>> # Predict
>>> predictions = model.predict(X)
Notes
iPLS is particularly effective for NIR spectroscopy because:
1. Different spectral regions contain different chemical information
2. Some regions may be dominated by noise or uninformative signals
3. Selecting optimal intervals can improve both prediction and interpretation
The JAX backend provides acceleration for interval evaluation when using
GPU/TPU, which is beneficial when evaluating many intervals.
References
Nørgaard, L., et al. (2000). Interval partial least-squares
regression (iPLS): A comparative chemometric study with an example
from near-infrared spectroscopy. Applied Spectroscopy, 54(3), 413-419.
Get detailed information about intervals and selection.
Returns:
info – Dictionary containing:
- ‘n_intervals’: Number of intervals
- ‘interval_scores’: CV scores for each interval
- ‘interval_ranges’: List of (start, end) for each interval
- ‘selected_intervals’: Indices of selected intervals
- ‘selected_regions’: (start, end) pairs for selected regions
- ‘n_selected_features’: Total number of selected features
K-OPLS combines kernel methods with Orthogonal PLS to handle nonlinear
relationships in the data. It first removes Y-orthogonal variation from
the kernel matrix, then fits a kernel PLS model on the filtered kernel.
This implementation follows the algorithm from the ConsensusOPLS R package,
which is based on the original K-OPLS algorithm by Rantalainen et al.
Parameters:
n_components (int, default=5) – Number of predictive PLS components.
n_ortho_components (int, default=1) – Number of orthogonal components to remove. These represent
Y-orthogonal variation that would hurt prediction.
kernel (str, default='rbf') – Kernel function to use:
- ‘linear’: Linear kernel K(x,y) = x^T y
- ‘rbf’: Radial basis function K(x,y) = exp(-gamma ||x-y||^2)
- ‘poly’: Polynomial kernel K(x,y) = (gamma x^T y + coef0)^degree
gamma (float, optional) – Kernel coefficient for ‘rbf’ and ‘poly’ kernels.
If None, uses 1/n_features.
degree (int, default=3) – Degree for polynomial kernel.
coef0 (float, default=1.0) – Independent term in polynomial kernel.
center (bool, default=True) – Whether to center the kernel matrix.
scale (bool, default=True) – Whether to scale Y to unit variance.
Training data (stored for kernel computation at predict time).
Type:
ndarray of shape (n_samples, n_features)
y_mean_
Mean of Y.
Type:
ndarray of shape (n_targets,)
y_std_
Standard deviation of Y.
Type:
ndarray of shape (n_targets,)
x_scores_
X scores from filtered kernel PLS (T).
Type:
ndarray of shape (n_samples, n_components)
y_scores_
Y scores (U).
Type:
ndarray of shape (n_samples, n_components)
y_loadings_
Y loadings (C).
Type:
ndarray of shape (n_targets, n_components)
ortho_scores_
Orthogonal scores (T_ortho).
Type:
ndarray of shape (n_samples, n_ortho_components)
Examples
>>> from nirs4all.operators.models.sklearn.kopls import KOPLS
>>> import numpy as np
>>> # Generate nonlinear data
>>> np.random.seed(42)
>>> X = np.random.randn(100, 50)
>>> y = np.sin(X[:, :5].sum(axis=1)) + 0.1 * np.random.randn(100)
>>> # Fit K-OPLS with RBF kernel
>>> model = KOPLS(n_components=5, n_ortho_components=2, kernel='rbf')
>>> model.fit(X, y)
KOPLS(...)
>>> predictions = model.predict(X)
>>> # Transform to score space
>>> T = model.transform(X)
>>> print(T.shape)
(100, 5)
References
Rantalainen, M., Bylesjö, M., Cloarec, O., Nicholson, J. K.,
Holmes, E., & Trygg, J. (2007). Kernel-based orthogonal
projections to latent structures (K-OPLS). Journal of
Chemometrics, 21(7-9), 376-385.
Nonlinear PLS using Kernel Methods (Kernel PLS / NL-PLS).
Kernel PLS maps the input data X into a higher-dimensional feature space
using a kernel function (RBF, polynomial, sigmoid) and then fits a PLS
model on the kernel matrix K(X, X). This allows capturing nonlinear
relationships between X and Y while retaining the interpretability of PLS.
The algorithm:
1. Compute kernel matrix K = kernel(X_train, X_train)
2. Center the kernel matrix
3. Fit PLS on K with target Y
4. For prediction: K_test = kernel(X_test, X_train), center, predict
This is a simple and effective approach for nonlinear regression that
combines the power of kernel methods with PLS dimensionality reduction.
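The kernel computation and centering steps can be sketched with NumPy as below. This is a minimal illustration using the standard feature-space centering formulas; the helper names are hypothetical, and the PLS fit on the centered kernel is omitted.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """K(x, y) = exp(-gamma * ||x - y||^2) for all row pairs of A and B."""
    sq_dist = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))  # clip tiny negative round-off

def center_train_kernel(K):
    """Center the training kernel in feature space (rows and columns sum to zero)."""
    col_mean = K.mean(axis=0, keepdims=True)
    row_mean = K.mean(axis=1, keepdims=True)
    return K - col_mean - row_mean + K.mean()

def center_test_kernel(K_test, K_train):
    """Center a test kernel with statistics taken from the training kernel."""
    train_col_mean = K_train.mean(axis=0, keepdims=True)
    test_row_mean = K_test.mean(axis=1, keepdims=True)
    return K_test - train_col_mean - test_row_mean + K_train.mean()

rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 5))
X_test = rng.normal(size=(7, 5))
gamma = 1.0 / X_train.shape[1]  # the documented default, 1/n_features

K_train = rbf_kernel(X_train, X_train, gamma)
K_train_c = center_train_kernel(K_train)
K_test_c = center_test_kernel(rbf_kernel(X_test, X_train, gamma), K_train)
```

The test kernel has shape (n_test, n_train) and must be centered with the training statistics; otherwise the test samples are expressed in a different feature-space origin than the one the PLS model was fitted in.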
Parameters:
n_components (int, default=10) – Number of PLS components to extract.
kernel ({'rbf', 'linear', 'poly', 'sigmoid'}, default='rbf') – Kernel function to use:
- ‘rbf’: Radial basis function K(x,y) = exp(-gamma ||x-y||^2)
- ‘linear’: Linear kernel K(x,y) = x^T y (equivalent to standard PLS)
- ‘poly’: Polynomial kernel K(x,y) = (gamma * x^T y + coef0)^degree
- ‘sigmoid’: Sigmoid kernel K(x,y) = tanh(gamma * x^T y + coef0)
gamma (float, optional) – Kernel coefficient for ‘rbf’, ‘poly’, and ‘sigmoid’ kernels.
If None, defaults to 1/n_features.
degree (int, default=3) – Degree for polynomial kernel.
coef0 (float, default=1.0) – Independent term in polynomial and sigmoid kernels.
center_kernel (bool, default=True) – Whether to center the kernel matrix. Recommended for most cases.
scale_y (bool, default=True) – Whether to center and scale Y to zero mean and unit variance.
Kernel PLS is particularly useful when:
- The relationship between X and Y is nonlinear
- Standard linear PLS gives poor predictions
- You want to use kernel methods but need PLS-style dimensionality reduction
The choice of kernel and gamma parameter significantly affects performance.
Cross-validation is recommended for hyperparameter tuning.
For NIRS data, the RBF kernel with small gamma often works well for
capturing nonlinear spectral-property relationships.
References
Rosipal, R., & Trejo, L. J. (2001). Kernel partial least squares
regression in reproducing kernel Hilbert space. Journal of Machine
Learning Research, 2, 97-123.
Locally-Weighted Partial Least Squares (LWPLS) regressor.
LWPLS builds a local PLS model for each query sample, weighting
training samples by their similarity (proximity) to the query.
This approach is useful for:
- Data with local nonlinearity
- Drifting processes where the relationship changes over time
- Heterogeneous data where a single global model is inadequate
The similarity is computed using a Gaussian kernel based on
Euclidean distance, controlled by the lambda_in_similarity parameter.
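As a rough sketch of how such similarity weights might be computed, the snippet below uses a distance-based exponential weighting that is common in LWPLS implementations. The exact expression is an assumption (a strict Gaussian would square the distance), and `similarity_weights` is a hypothetical helper name.

```python
import numpy as np

def similarity_weights(X_train, x_query, lambda_in_similarity):
    """Weight each training sample by its proximity to the query sample."""
    distances = np.linalg.norm(X_train - x_query, axis=1)
    # Assumed form: distances are normalized by their standard deviation,
    # so lambda_in_similarity acts as a dimensionless kernel width.
    return np.exp(-distances / (distances.std(ddof=1) * lambda_in_similarity))

rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 5.0, size=(70, 2))
x_query = np.array([2.5, 2.5])

# Small width: weights decay quickly, so the local model is dominated by neighbours.
w_local = similarity_weights(X_train, x_query, lambda_in_similarity=0.25)
# Large width: weights flatten out, approaching an unweighted (global) PLS fit.
w_global = similarity_weights(X_train, x_query, lambda_in_similarity=32.0)
```

Because the weights depend on Euclidean distances, features on very different scales would dominate the similarity, which is why standardizing X is recommended.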
Parameters:
n_components (int, default=10) – Maximum number of PLS components to extract for each local model.
lambda_in_similarity (float, default=1.0) – Kernel width parameter. Smaller values create more localized models
(more weight on nearby samples), larger values approach global PLS.
Typical values range from 2^-9 to 2^5 depending on the data.
scale (bool, default=True) – Whether to standardize X and y before fitting. Strongly recommended
as LWPLS uses Euclidean distances.
backend (str, default='numpy') – Computational backend to use. Options are:
- ‘numpy’: NumPy backend (CPU only, default).
- ‘jax’: JAX backend (supports GPU/TPU acceleration).
- ‘torch’: PyTorch backend (supports GPU acceleration).
JAX backend requires JAX to be installed: pip install jax
For GPU support: pip install jax[cuda12]
PyTorch backend requires PyTorch: pip install torch
For GPU support: pip install torch with CUDA.
batch_size (int, default=64) – Number of test samples to process per batch (JAX/torch backends).
Reduce this if running out of GPU memory on large datasets.
Ignored for NumPy backend.
Stored training X data (standardized if scale=True).
Type:
ndarray of shape (n_samples, n_features)
y_train_
Stored training y data (standardized if scale=True).
Type:
ndarray of shape (n_samples,)
x_scaler_
Fitted scaler for X (if scale=True).
Type:
StandardScaler or None
y_scaler_
Fitted scaler for y (if scale=True).
Type:
StandardScaler or None
Examples
>>> from nirs4all.operators.models.sklearn.lwpls import LWPLS
>>> import numpy as np
>>> # Nonlinear data
>>> np.random.seed(42)
>>> X = 5 * np.random.rand(100, 2)
>>> y = 3 * X[:, 0] ** 2 + 10 * np.log(X[:, 1] + 0.1) + np.random.randn(100)
>>> # Split data
>>> X_train, X_test = X[:70], X[70:]
>>> y_train, y_test = y[:70], y[70:]
>>> # Fit LWPLS with NumPy backend (default)
>>> model = LWPLS(n_components=5, lambda_in_similarity=0.25)
>>> model.fit(X_train, y_train)
LWPLS(n_components=5, lambda_in_similarity=0.25)
>>> y_pred = model.predict(X_test)
>>> # Use JAX backend for GPU acceleration
>>> model_jax = LWPLS(n_components=5, lambda_in_similarity=0.25, backend='jax')
>>> model_jax.fit(X_train, y_train)
>>> y_pred_jax = model_jax.predict(X_test)
>>> # Use PyTorch backend for GPU acceleration
>>> model_torch = LWPLS(n_components=5, lambda_in_similarity=0.25, backend='torch')
>>> model_torch.fit(X_train, y_train)
>>> y_pred_torch = model_torch.predict(X_test)
Notes
LWPLS is computationally more expensive than standard PLS because
it builds a separate weighted model for each prediction. The training
data must be stored for prediction.
The JAX backend provides significant speedups on GPU by:
- Vectorizing the per-sample loop using jax.vmap
- JIT-compiling the prediction function
- Running on GPU/TPU when available
The PyTorch backend provides GPU acceleration by:
- Running tensor operations on CUDA or MPS devices
- Batched processing to control memory usage
- Automatic device selection when device='auto'
The optimal lambda_in_similarity should be tuned via cross-validation.
A typical search grid is 2^k for integer k from -9 to 5.
References
Kim, S., et al. (2011). Estimation of active pharmaceutical
ingredient content using locally weighted partial least squares.
International Journal of Pharmaceutics, 421(2), 269-274.
MB-PLS fuses multiple X blocks (e.g., different preprocessing variants,
multiple sensors) into a single predictive model. Each block contributes
to the latent variables according to its relevance to Y.
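One way to picture how blocks contribute according to their relevance is the first component of a classic multiblock PLS for a single response. The sketch below is an assumption-laden illustration (block scores scaled by the square root of the block width, super weights normalized to unit length) and is not taken from the library's NIPALS code; `mbpls_first_component` is a hypothetical name.

```python
import numpy as np

def mbpls_first_component(blocks, y):
    """First MB-PLS component for one response (u = y, so no iteration is needed)."""
    u = y - y.mean()
    block_scores = []
    for Xb in blocks:
        Xb = Xb - Xb.mean(axis=0)
        w_b = Xb.T @ u
        w_b /= np.linalg.norm(w_b)  # unit-length block weights
        # Divide by sqrt(block width) so wide blocks do not dominate by size alone.
        block_scores.append(Xb @ w_b / np.sqrt(Xb.shape[1]))
    T = np.column_stack(block_scores)
    w_super = T.T @ u
    w_super /= np.linalg.norm(w_super)  # relative contribution of each block
    t_super = T @ w_super               # super score used to predict y
    return w_super, t_super

rng = np.random.default_rng(0)
X_signal = rng.normal(size=(100, 10))  # block that carries the signal
X_noise = rng.normal(size=(100, 10))   # block unrelated to y
y = X_signal[:, :3].sum(axis=1) + 0.1 * rng.normal(size=100)

w_super, t_super = mbpls_first_component([X_signal, X_noise], y)
```

In this synthetic case the first super weight is larger than the second, reflecting that the signal block covaries more strongly with y than the noise block.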
Parameters:
n_components (int, default=5) – Number of latent variables to extract.
method (str, default='NIPALS') – Decomposition method. Currently only ‘NIPALS’ is supported.
standardize (bool, default=True) – Whether to standardize blocks before fitting.
max_tol (float, default=1e-14) – Convergence tolerance for NIPALS.
X (array-like of shape (n_samples, n_features) or list of arrays) – Training data. Can be a single matrix or a list of X blocks
for true multiblock analysis (NumPy backend only).
y (array-like of shape (n_samples,) or (n_samples, n_targets)) – Target values.
Online Koopman Latent-Mode Partial Least Squares (OKLM-PLS).
OKLM-PLS combines Koopman operator theory with PLS for time-series
regression. It learns latent scores T = ψ(X) @ W and simultaneously:
- Enforces dynamic coherence: T_{t+1} ≈ F @ T_t
- Learns regression: Y_t ≈ T_t @ B
This is useful for spectral data collected over time where temporal
coherence provides additional predictive information.
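The two terms above can be written as a single objective over time-ordered rows. The sketch below is illustrative only: it assumes the scores T are already given, stores one time step per row (so T_{t+1} ≈ F @ T_t becomes T[1:] ≈ T[:-1] @ F.T), and uses hypothetical helper names.

```python
import numpy as np

def oklm_objective(T, Y, F, B, lambda_dyn=1.0, lambda_reg_y=1.0):
    """Weighted sum of the dynamic-consistency loss and the regression loss."""
    dyn_residual = T[1:] - T[:-1] @ F.T  # T_{t+1} - F @ T_t, one row per step
    reg_residual = Y - T @ B             # Y_t - T_t @ B
    return lambda_dyn * np.sum(dyn_residual**2) + lambda_reg_y * np.sum(reg_residual**2)

def fit_dynamics(T):
    """Least-squares estimate of F in T_{t+1} ≈ F @ T_t."""
    F_transposed, *_ = np.linalg.lstsq(T[:-1], T[1:], rcond=None)
    return F_transposed.T

# Synthetic latent trajectory that follows exact linear dynamics.
F_true = np.array([[0.9, -0.2],
                   [0.2, 0.9]])
T = np.empty((30, 2))
T[0] = [1.0, 0.0]
for t in range(29):
    T[t + 1] = F_true @ T[t]

B_true = np.array([[2.0], [-1.0]])
Y = T @ B_true

F_hat = fit_dynamics(T)
loss = oklm_objective(T, Y, F_hat, B_true)
```

Setting lambda_dyn to zero removes the first term, leaving only the regression loss on the latent scores.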
Parameters:
n_components (int, default=5) – Number of latent components.
featurizer (TransformerMixin, optional) – Feature map ψ: X -> Z. If None, identity is used.
Options include PolynomialFeaturizer and RBFFeaturizer.
lambda_dyn (float, default=1.0) – Weight for dynamic consistency loss ||T_{t+1} - F @ T_t||².
Higher values enforce stronger temporal coherence.
lambda_reg_y (float, default=1.0) – Weight for regression loss ||Y - T @ B||².
max_iter (int, default=50) – Maximum alternating optimization iterations.
tol (float, default=1e-4) – Convergence tolerance on the objective function.
warm_start_pls (bool, default=True) – If True, initialize W/B from a standard PLSRegression fit.
standardize (bool, default=True) – Whether to standardize X and Y before fitting.
Examples
>>> from nirs4all.operators.models.sklearn.oklmpls import OKLMPLS
>>> import numpy as np
>>> # Generate time-series data
>>> np.random.seed(42)
>>> X = np.random.randn(100, 50)
>>> y = X[:, :5].sum(axis=1) + 0.1 * np.random.randn(100)
>>> # Fit OKLM-PLS
>>> model = OKLMPLS(n_components=10, lambda_dyn=1.0, lambda_reg_y=1.0)
>>> model.fit(X, y)
OKLMPLS(...)
>>> predictions = model.predict(X)
>>> # Use with polynomial featurizer for nonlinearity
>>> from nirs4all.operators.models.sklearn.oklmpls import PolynomialFeaturizer
>>> model_poly = OKLMPLS(n_components=10, featurizer=PolynomialFeaturizer(degree=2))
>>> model_poly.fit(X, y)
Notes
OKLM-PLS is designed for temporally-ordered data where samples are
sequential in time. The dynamics constraint helps capture temporal
patterns and can improve prediction when the underlying process
has smooth temporal evolution.
For non-temporal data, set lambda_dyn=0 to disable the dynamics
constraint (equivalent to standard PLS with optional featurization).
The method works on simple estimators as well as on nested objects
(such as Pipeline). The latter have
parameters of the form <component>__<parameter> so that it’s
possible to update each component of a nested object.
# Explicitly declare estimator type for sklearn compatibility (e.g., StackingClassifier)
_estimator_type = "classifier"
OPLS-DA combines OPLS filtering with PLS-DA classification.
It removes Y-orthogonal variation from X before applying PLS-DA,
improving class separation and model interpretability.
Parameters:
n_components (int, default=1) – Number of orthogonal components to remove.
pls_components (int, default=5) – Number of PLS components for the discriminant model.
scale (bool, default=True) – Whether to scale X before fitting.
Bylesjö, M., et al. (2006). OPLS discriminant analysis: combining
the strengths of PLS-DA and SIMCA classification. Journal of
Chemometrics, 20(8-10), 341-351.
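The filtering step can be sketched in NumPy for a single response. This is a textbook-style OPLS filter written under stated assumptions (column-centered X, one numeric y such as a 0/1 class indicator); the function name opls_filter is hypothetical and this is not necessarily how the estimator is implemented.

```python
import numpy as np

def opls_filter(X, y, n_orthogonal=1):
    """Remove n_orthogonal Y-orthogonal components from column-centered X."""
    X_res = X - X.mean(axis=0)
    y_c = y - y.mean()
    # Predictive weight direction: normalized covariance between X and y.
    w = X_res.T @ y_c
    w /= np.linalg.norm(w)
    orth_scores = []
    for _ in range(n_orthogonal):
        t = X_res @ w                      # predictive score
        p = X_res.T @ t / (t @ t)          # predictive loading
        w_orth = p - (w @ p) * w           # part of the loading orthogonal to w
        w_orth /= np.linalg.norm(w_orth)
        t_orth = X_res @ w_orth            # Y-orthogonal score
        p_orth = X_res.T @ t_orth / (t_orth @ t_orth)
        X_res = X_res - np.outer(t_orth, p_orth)   # remove the orthogonal variation
        orth_scores.append(t_orth)
    return X_res, np.column_stack(orth_scores)

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 8))
y = (X[:, 0] > 0).astype(float)            # binary class indicator
X_filtered, T_orth = opls_filter(X, y, n_orthogonal=2)
```

The removed scores are uncorrelated with y by construction, so the filtered matrix can be passed to a PLS-DA model without losing class-related variation.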
Recursive Partial Least Squares (Recursive PLS) regressor.
Recursive PLS enables online model updates for drifting processes.
It uses a forgetting factor to exponentially weight old samples,
allowing the model to adapt to non-stationary data streams.
The algorithm maintains running covariance matrices that are updated
incrementally with each new batch of samples. The PLS loadings are
then recomputed from these updated covariances.
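The covariance update and recomputation can be sketched as below. This is a minimal illustration for a single response without centering or scaling; the function names are hypothetical, and the loadings are recomputed with a kernel-style PLS1 routine that needs only X'X and X'y, which may differ from the estimator's internals.

```python
import numpy as np

def update_covariances(C_xx, C_xy, X_new, y_new, forgetting_factor=0.99):
    """Discount the running covariances, then add the new batch."""
    C_xx = forgetting_factor * C_xx + X_new.T @ X_new
    C_xy = forgetting_factor * C_xy + X_new.T @ y_new
    return C_xx, C_xy

def kernel_pls_coef(C_xx, C_xy, n_components):
    """PLS1 regression vector computed from X'X and X'y only."""
    S = np.asarray(C_xy, dtype=float).copy()
    n_features = C_xx.shape[0]
    R = np.zeros((n_features, n_components))   # weights that map X directly to scores
    P = np.zeros((n_features, n_components))   # X loadings
    q = np.zeros(n_components)                 # y loadings
    for a in range(n_components):
        w = S / np.linalg.norm(S)
        r = w.copy()
        for j in range(a):
            r -= (P[:, j] @ w) * R[:, j]
        tt = r @ C_xx @ r                      # squared norm of the score t = X r
        P[:, a] = C_xx @ r / tt
        q[a] = (r @ S) / tt
        S = S - P[:, a] * q[a] * tt            # deflate X'y
        R[:, a] = r
    return R @ q

rng = np.random.default_rng(0)
n_features = 4
true_coef = np.array([1.0, -2.0, 0.5, 0.0])
C_xx = np.zeros((n_features, n_features))
C_xy = np.zeros(n_features)
for _ in range(5):                             # five incoming batches
    X_batch = rng.normal(size=(20, n_features))
    y_batch = X_batch @ true_coef + 0.01 * rng.normal(size=20)
    C_xx, C_xy = update_covariances(C_xx, C_xy, X_batch, y_batch, 0.99)
    coef = kernel_pls_coef(C_xx, C_xy, n_components=3)
```

Only the two covariance arrays are stored between updates, so memory use does not grow with the number of samples seen.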
Parameters:
n_components (int, default=10) – Number of PLS components to extract.
forgetting_factor (float, default=0.99) – Forgetting factor in (0, 1]. Controls the rate of adaptation:
- 1.0: No forgetting, standard batch PLS
- <1.0: Exponential forgetting of old samples
- Typical values: 0.95-0.999 depending on drift speed
scale (bool, default=True) – Whether to scale X and Y to unit variance.
center (bool, default=True) – Whether to center X and Y (subtract mean).
Recursive PLS is particularly useful when:
- Data arrives in streams and batch retraining is too expensive
- Process conditions drift over time (sensor aging, raw material changes)
- You need to adapt a calibration model to local conditions
The forgetting factor controls the adaptation speed:
- Higher values (0.99-0.999): Slow adaptation, stable model
- Lower values (0.9-0.95): Fast adaptation, may be unstable
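As a rough guide to choosing the value, exponential forgetting gives an update of age k the weight lambda**k, so the effective memory is about 1 / (1 - lambda) updates. The sketch below assumes the factor is applied once per update; the helper names are hypothetical.

```python
import numpy as np

def effective_memory(forgetting_factor):
    """Approximate number of past updates that still carry weight."""
    if forgetting_factor >= 1.0:
        return np.inf                      # no forgetting: all history is kept
    return 1.0 / (1.0 - forgetting_factor)

def update_weights(forgetting_factor, n_updates):
    """Weight of each past update, newest first (age 0, 1, 2, ...)."""
    return forgetting_factor ** np.arange(n_updates)

for lam in (0.9, 0.95, 0.99, 0.999):
    print(lam, round(effective_memory(lam)))
```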
Qin, S. J. (1998). Recursive PLS algorithms for adaptive data
modeling. Computers & Chemical Engineering, 22(4-5), 503-514.
Dayal, B. S., & MacGregor, J. F. (1997). Recursive exponentially
weighted PLS and its applications to adaptive control and prediction.
Journal of Process Control, 7(3), 169-179.
Robust Partial Least Squares (Robust PLS) regressor.
Robust PLS uses iteratively reweighted least squares (IRLS) to down-weight
outliers during model fitting. This makes the model more resistant to
outliers in both X (leverage points) and Y (vertical outliers).
The algorithm iterates between:
1. Fitting PLS with a weighted covariance matrix
2. Computing residuals and updating weights using robust M-estimation
Two weighting schemes are available:
- ‘huber’: Huber’s psi function - smooth transition from L2 to L1
- ‘tukey’: Tukey’s bisquare - completely down-weights extreme outliers
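The two weight functions can be written out as below, using their standard definitions and the usual 95%-efficiency tuning constants. Scaling residuals by the MAD is a common choice assumed here for illustration; the helper names are hypothetical.

```python
import numpy as np

def robust_scale(residuals):
    """MAD-based scale estimate (consistent with the standard deviation for Gaussian data)."""
    med = np.median(residuals)
    return 1.4826 * np.median(np.abs(residuals - med))

def huber_weights(r, c=1.345):
    """1 inside [-c, c], c/|r| outside: large residuals are shrunk but never zeroed."""
    abs_r = np.abs(r)
    return np.where(abs_r <= c, 1.0, c / np.maximum(abs_r, 1e-12))

def tukey_weights(r, c=4.685):
    """(1 - (r/c)^2)^2 inside [-c, c], exactly 0 outside."""
    u = r / c
    return np.where(np.abs(u) <= 1.0, (1.0 - u**2) ** 2, 0.0)

# One weight update of an IRLS loop: standardize residuals, then map them to weights.
residuals = np.array([0.1, -0.3, 0.2, 0.0, -0.1, 8.0])   # last sample is a vertical outlier
scaled = residuals / robust_scale(residuals)
w_huber = huber_weights(scaled)
w_tukey = tukey_weights(scaled)
```

With Huber weights the outlier keeps a small positive weight, while Tukey's bisquare removes it from the fit entirely once its scaled residual exceeds c.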
Parameters:
n_components (int, default=10) – Number of PLS components to extract.
weighting ({'huber', 'tukey'}, default='huber') – Robust weighting scheme:
- ‘huber’: Huber’s weight function; large residuals are down-weighted smoothly but never fully rejected.
- ‘tukey’: Tukey’s bisquare with hard rejection of outliers.
c (float or None, default=None) – Tuning constant for the weight function. Controls the threshold
beyond which observations are down-weighted.
- For ‘huber’: default is 1.345 (95% efficiency)
- For ‘tukey’: default is 4.685 (95% efficiency)
max_iter (int, default=100) – Maximum number of IRLS iterations.
tol (float, default=1e-6) – Convergence tolerance for weight changes.
scale (bool, default=True) – Whether to scale X and Y to unit variance.
center (bool, default=True) – Whether to center X and Y (subtract mean).
backend (str, default='numpy') – Computational backend to use:
- ‘numpy’: NumPy backend (CPU only).
- ‘jax’: JAX backend (supports GPU/TPU acceleration).
Note: IRLS weight computation is always done in NumPy for consistency.
The backend affects only the final PLS fit and prediction.
sample_weights_ (ndarray of shape (n_samples,)) – Final sample weights from IRLS. Low values indicate potential outliers.
Examples
>>> from nirs4all.operators.models.sklearn.robust_pls import RobustPLS
>>> import numpy as np
>>> # Generate data with outliers
>>> np.random.seed(42)
>>> X = np.random.randn(100, 50)
>>> y = X[:, :5].sum(axis=1) + 0.1 * np.random.randn(100)
>>> # Add outliers
>>> y[0:5] = y[0:5] + 10  # Vertical outliers
>>> # Fit Robust PLS
>>> model = RobustPLS(n_components=10, weighting='huber')
>>> model.fit(X, y)
RobustPLS(n_components=10, weighting='huber')
>>> predictions = model.predict(X)
>>> # Check which samples were down-weighted (potential outliers)
>>> outlier_mask = model.sample_weights_ < 0.5
>>> print(f"Potential outliers: {np.where(outlier_mask)[0]}")
Notes
Robust PLS is particularly useful when:
- Data contains outliers in X or Y
- Standard PLS gives poor predictions due to leverage points
- You want to identify potential outliers via sample weights
The sample_weights_ attribute can be used to identify outliers after fitting.
Samples with low weights (e.g., < 0.5) may be outliers worth investigating.
Hubert, M., & Vanden Branden, K. (2003). Robust procedures for
partial least squares regression. Chemometrics and Intelligent
Laboratory Systems, 65(2), 101-121.
Gil, J. A., & Romera, R. (1998). On robust partial least squares
(PLS) methods. Journal of Chemometrics, 12(6), 365-378.
Configure whether metadata should be requested to be passed to the predict method.
Note that this method is only relevant when this estimator is used as a
sub-estimator within a meta-estimator and metadata routing is enabled
with enable_metadata_routing=True (see sklearn.set_config()).
Please check the User Guide on how the routing
mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the
existing request. This allows you to change the request for some
parameters and not others.
Added in version 1.3.
Parameters:
n_components (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for n_components parameter in predict.
SIMPLS is an alternative to NIPALS-based PLS that computes components
via projections of the covariance matrix X’Y. It produces the same
predictions as PLSRegression for univariate Y, and slightly different
results for multivariate Y, typically with comparable prediction accuracy.
SIMPLS is often faster than NIPALS for high-dimensional data because
it avoids the iterative deflation of X.
Parameters:
n_components (int, default=10) – Number of PLS components to extract.
scale (bool, default=True) – Whether to scale X and Y to unit variance.
center (bool, default=True) – Whether to center X and Y (subtract mean).
backend (str, default='numpy') – Computational backend to use:
- ‘numpy’: NumPy backend (CPU only).
- ‘jax’: JAX backend (supports GPU/TPU acceleration).
JAX backend requires JAX to be installed: pip install jax
For GPU support: pip install jax[cuda12]
>>> from nirs4all.operators.models.sklearn.simpls import SIMPLS
>>> import numpy as np
>>> # Generate sample data
>>> np.random.seed(42)
>>> X = np.random.randn(100, 50)
>>> y = X[:, :5].sum(axis=1) + 0.1 * np.random.randn(100)
>>> # Fit SIMPLS model
>>> model = SIMPLS(n_components=10)
>>> model.fit(X, y)
SIMPLS(n_components=10)
>>> predictions = model.predict(X)
>>> # Use JAX backend for GPU acceleration
>>> model_jax = SIMPLS(n_components=10, backend='jax')
Notes
SIMPLS differs from NIPALS in how the deflation is performed:
- NIPALS deflates X after each component (X := X - t*p’)
- SIMPLS deflates the covariance matrix S = X’Y
For univariate Y, both methods produce identical predictions.
For multivariate Y, SIMPLS produces Y loadings that span the same
space as NIPALS but with slightly different orientations.
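The covariance deflation described above can be sketched for a single response as follows. This follows the general outline of de Jong's algorithm under simplifying assumptions (univariate y, centering only, no scaling); simpls_fit and simpls_predict are hypothetical names, not the estimator's API.

```python
import numpy as np

def simpls_fit(X, y, n_components):
    """SIMPLS for a single response; returns (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    X0, y0 = X - x_mean, y - y_mean
    S = X0.T @ y0                               # covariance vector X'y
    n_features = X.shape[1]
    R = np.zeros((n_features, n_components))    # weights mapping X0 to scores
    V = np.zeros((n_features, n_components))    # orthonormal basis of the X loadings
    q = np.zeros(n_components)                  # y loadings
    for a in range(n_components):
        r = S / np.linalg.norm(S)               # univariate y: dominant direction is S itself
        t = X0 @ r
        norm_t = np.linalg.norm(t)
        t, r = t / norm_t, r / norm_t           # unit-length score
        p = X0.T @ t                            # X loading
        q[a] = y0 @ t
        v = p - V[:, :a] @ (V[:, :a].T @ p)     # orthogonalize against earlier loadings
        v /= np.linalg.norm(v)
        S = S - v * (v @ S)                     # deflate the covariance; X0 is never modified
        R[:, a], V[:, a] = r, v
    return R @ q, x_mean, y_mean

def simpls_predict(X, coef, x_mean, y_mean):
    return (X - x_mean) @ coef + y_mean

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = X @ np.array([1.0, 0.5, -1.0, 2.0]) + 0.1 * rng.normal(size=50)
coef, x_mean, y_mean = simpls_fit(X, y, n_components=2)
y_hat = simpls_predict(X, coef, x_mean, y_mean)
```

Because only the small vector S is deflated, each component costs a few matrix-vector products with X0 rather than a rank-one update of the full data matrix.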
de Jong, S. (1993). SIMPLS: An alternative approach to partial
least squares regression. Chemometrics and Intelligent Laboratory
Systems, 18(3), 251-263.
Sparse PLS (sPLS) regressor with L1 regularization.
Sparse PLS performs joint prediction and variable selection by applying
L1 (Lasso) regularization to the PLS loadings. This produces sparse
loadings where many wavelengths/features have zero weights, effectively
selecting the most relevant variables.
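One common way to obtain such sparsity is to soft-threshold the covariance direction X'y, which is the proximal step for an L1 penalty. The sketch below assumes that approach and an illustrative alpha scale; how this estimator scales alpha internally is not specified here, and the function names are hypothetical.

```python
import numpy as np

def soft_threshold(w, alpha):
    """Shrink every entry toward zero by alpha; entries smaller than alpha become 0."""
    return np.sign(w) * np.maximum(np.abs(w) - alpha, 0.0)

def sparse_direction(X, y, alpha):
    """First sparse weight vector: soft-thresholded sample covariance between X and y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    cov = Xc.T @ yc / X.shape[0]
    return soft_threshold(cov, alpha)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = X[:, :3].sum(axis=1) + 0.1 * rng.normal(size=100)   # only three informative features
w = sparse_direction(X, y, alpha=0.5)                    # alpha chosen for illustration
print(np.flatnonzero(w))
```

Features whose covariance with y is below the threshold receive an exactly zero weight, which is what makes the result usable for variable selection.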
Parameters:
n_components (int, default=5) – Number of latent variables to extract.
alpha (float, default=1.0) – Regularization strength. Higher values produce more sparsity.
max_iter (int, default=500) – Maximum number of iterations.
tol (float, default=1e-6) – Convergence tolerance.
scale (bool, default=True) – Whether to scale X and y before fitting.
backend (str, default='numpy') – Backend to use for computation. Options are:
- ‘numpy’: Use NumPy backend (CPU only).
- ‘jax’: Use JAX backend (supports GPU/TPU acceleration).
JAX backend requires JAX: pip install jax
For GPU support: pip install jax[cuda12]