nirs4all.operators.models.sklearn.fckpls module

Fractional Convolutional Kernel PLS (FCK-PLS) regressor for nirs4all.

A scikit-learn-compatible implementation of FCK-PLS that uses fractional-order convolutional filters to build spectral features, then applies PLS regression. This approach is particularly suited to NIRS data, where derivative-like features at various fractional orders can capture different spectral signatures.

Supports both NumPy (CPU) and JAX (GPU/TPU) backends.

Mathematical formulation

Let

X ∈ ℝ^{n×p} be the input matrix of n samples and p features (e.g. NIRS spectra, treated as 1D signals over wavelength),
Y ∈ ℝ^{n×q} be the response matrix.

FCK-PLS builds an explicit feature map Φ : ℝ^p → ℝ^{D} by convolving each spectrum with a bank of L fractional filters { h_ℓ }_{ℓ=1,…,L}, and then applies PLS in this expanded feature space.

Fractional filter bank

Each filter h_ℓ ∈ ℝ^{k} (k odd) is defined by parameters (α_ℓ, σ_ℓ) that control its fractional “order” and scale. Conceptually, h_ℓ approximates a 1D operator whose frequency response has the form

H_ℓ(ω) ∝ |ω|^{α_ℓ} exp(−σ_ℓ ω²),

so that:

α_ℓ ≈ 0 corresponds to a smoothing operator,
α_ℓ ≈ 1 to a first-derivative-like operator,
α_ℓ ≈ 2 to a second-derivative-like operator,

with intermediate values giving fractional-order behavior. In practice, h_ℓ is implemented as a discrete, symmetric 1D kernel constructed from (α_ℓ, σ_ℓ, k) and normalized for numerical stability.
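
As a rough illustration of the frequency response above, a kernel can be obtained by sampling H(ω) on an FFT grid and keeping the k central taps of its inverse transform. This is a minimal sketch under stated assumptions, not the construction used by nirs4all: the helper name fractional_kernel, the grid size n_fft, and the unit L2-norm normalization are all hypothetical choices.

```python
import numpy as np

def fractional_kernel(alpha, sigma, k=15, n_fft=512):
    """Hypothetical helper: k-tap kernel sampled from |w|^alpha * exp(-sigma * w^2)."""
    if k % 2 == 0:
        raise ValueError("k must be odd")
    # Angular frequencies of the FFT grid, in [-pi, pi).
    omega = 2.0 * np.pi * np.fft.fftfreq(n_fft)
    # Target frequency response; real and even in omega.
    H = np.abs(omega) ** alpha * np.exp(-sigma * omega ** 2)
    # A real, even response has a real, even impulse response; center it.
    h_full = np.fft.fftshift(np.real(np.fft.ifft(H)))
    c = n_fft // 2
    h = h_full[c - k // 2 : c + k // 2 + 1]
    # Assumed normalization: unit L2 norm.
    return h / np.linalg.norm(h)

h0 = fractional_kernel(0.0, 2.0)   # smoothing-like: taps sum to a positive value
h2 = fractional_kernel(2.0, 2.0)   # second-derivative-like: taps sum to roughly zero
```

Because |ω|^α is even in ω, every kernel built this way is symmetric, including α = 1; such a kernel matches a first derivative in magnitude response rather than in sign.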

For a single spectrum x ∈ ℝ^p, the convolution with filter ℓ is

f_ℓ = x * h_ℓ, f_ℓ ∈ ℝ^{p′},

where * denotes 1D discrete convolution along the wavelength axis, and the output length is p′ = p in “same” mode or p′ = p − k + 1 in “valid” mode.

The feature map Φ stacks all convolved signals:

Φ(x) = [ f_1ᵀ, f_2ᵀ, …, f_Lᵀ ]ᵀ ∈ ℝ^{D}, D = L · p′.

Collecting all samples, we form the feature-expanded matrix

X_feat ∈ ℝ^{n×D}, row i = Φ(x_i)ᵀ.
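
The feature expansion can be sketched with plain NumPy as follows. The function name feature_map is hypothetical, and the three integer-order filters (smoothing, central difference, second difference) are stand-ins for a fractional filter bank; only the stacking logic and the resulting shapes are the point here.

```python
import numpy as np

def feature_map(X, filters, mode="same"):
    """Convolve every row of X with every filter and concatenate the results."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    # One (n x p') block per filter, built row by row.
    blocks = [
        np.stack([np.convolve(x, h, mode=mode) for x in X])
        for h in filters
    ]
    # Row i of the result is Phi(x_i)^T, with D = L * p' columns.
    return np.hstack(blocks)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 50))                 # n = 5 spectra, p = 50 wavelengths
filters = [
    np.array([0.25, 0.5, 0.25]),             # smoothing stand-in
    np.array([1.0, 0.0, -1.0]) / 2.0,        # first-derivative stand-in
    np.array([1.0, -2.0, 1.0]),              # second-derivative stand-in
]
X_same = feature_map(X, filters, mode="same")    # p' = p = 50
X_valid = feature_map(X, filters, mode="valid")  # p' = p - k + 1 = 48
print(X_same.shape, X_valid.shape)               # (5, 150) (5, 144)
```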

PLS in feature space

On X_feat and Y, FCK-PLS applies a standard PLS regression:

find a weight matrix W_feat ∈ ℝ^{D×r},
scores T = X_feat W_feat ∈ ℝ^{n×r},
regression matrix C ∈ ℝ^{r×q},

such that:

the covariance between T and Y is maximized (PLS objective), and
the regression Y ≈ T C is well fitted in the least-squares sense.

Equivalently, one can define a kernel in the original input space

K_{ij} = Φ(x_i)ᵀ Φ(x_j),

and view FCK-PLS as a (linear) Kernel PLS in the feature space induced by the fractional convolutional map Φ. In this implementation, the feature map is explicit (X_feat is computed directly) and a standard PLSRegression is applied.
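
The kernel view can be checked numerically: because Φ stacks one block per filter, K is the sum of the per-filter Gram matrices. The snippet below is an illustrative sketch with two stand-in integer-order filters, not the library's filter bank.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 40))
filters = [np.array([0.25, 0.5, 0.25]), np.array([1.0, -2.0, 1.0])]

# Per-filter feature blocks F_l (n x p') and the stacked feature matrix.
blocks = [np.stack([np.convolve(x, h, mode="same") for x in X]) for h in filters]
X_feat = np.hstack(blocks)

# K_ij = Phi(x_i)^T Phi(x_j)
K = X_feat @ X_feat.T
# Same matrix, accumulated filter by filter.
K_sum = sum(F @ F.T for F in blocks)

print(np.allclose(K, K_sum))  # True
```

Since K is a Gram matrix it is symmetric positive semi-definite, which is what a kernel method requires.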

Prediction for new data X* proceeds as follows:

apply the same preprocessing and fractional convolutional featurizer to get X*_feat,
compute scores T* = X*_feat W_feat,
output Ŷ* = T* C (with inverse scaling if standardization is used).
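
The fit and prediction steps can be sketched for a single response (q = 1) with a textbook NIPALS PLS1 loop. This is an assumption-laden illustration: the source only states that a standard PLSRegression is applied, and the names pls1_fit and pls1_predict are hypothetical. In the sketch, the rotation matrix R plays the role of W_feat and the vector c plays the role of C.

```python
import numpy as np

def pls1_fit(X_feat, y, r):
    """NIPALS PLS1 on centered data; returns what prediction needs."""
    x_mean, y_mean = X_feat.mean(axis=0), y.mean()
    Xk, yk = X_feat - x_mean, y - y_mean
    W, P, c, scores = [], [], [], []
    for _ in range(r):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)        # weight: direction of maximal covariance with y
        t = Xk @ w                    # score vector on the deflated X
        tt = t @ t
        p = Xk.T @ t / tt             # X loading
        ck = yk @ t / tt              # regression coefficient of y on t
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - ck * t              # deflate y
        W.append(w)
        P.append(p)
        c.append(ck)
        scores.append(t)
    W, P = np.array(W).T, np.array(P).T
    # Rotation so that scores come straight from the centered, undeflated X.
    R = W @ np.linalg.inv(P.T @ W)
    return {"x_mean": x_mean, "y_mean": y_mean, "R": R,
            "c": np.array(c), "T": np.array(scores).T}

def pls1_predict(model, X_feat_new):
    T_new = (X_feat_new - model["x_mean"]) @ model["R"]   # scores T*
    return T_new @ model["c"] + model["y_mean"]            # Y_hat* = T* C

rng = np.random.default_rng(0)
X_feat = rng.normal(size=(60, 30))            # stands in for an expanded feature matrix
y = X_feat[:, :3].sum(axis=1) + 0.05 * rng.normal(size=60)
model = pls1_fit(X_feat, y, r=5)
y_hat = pls1_predict(model, X_feat)
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
```

If standardization is enabled, the same mean and scale learned at fit time would be applied to new data before this step and inverted on the output.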

By tuning { (α_ℓ, σ_ℓ) } and the number of components r, FCK-PLS can adaptively emphasize different fractional smooth/derivative behaviors and scales in the spectra, providing a flexible family of preprocessing+PLS models specialized to 1D spectral data.

class nirs4all.operators.models.sklearn.fckpls.FCKPLS(n_components: int = 10, alphas: Sequence[float] = (0.0, 0.5, 1.0, 1.5, 2.0), sigmas: Sequence[float] = (2.0,), kernel_size: int = 15, mode: Literal['same', 'valid'] = 'same', kernel_type: Literal['heuristic', 'grunwald'] = 'heuristic', standardize: bool = True, backend: str = 'numpy')[source]

Bases: BaseEstimator, RegressorMixin

Fractional Convolutional Kernel PLS (FCK-PLS).

FCK-PLS builds spectral features by convolving input spectra with a bank of fractional-order filters, then applies PLS regression on the expanded feature space. This approach captures derivative-like information at various fractional orders.

The pipeline is:

1. Optional standardization of X and Y
2. FractionalConvFeaturizer: X -> X_feat (feature expansion)
3. PLSRegression: X_feat, Y -> predictions

Parameters:

n_components (int, default=10) – Number of PLS components to extract.
alphas (sequence of float, default=(0.0, 0.5, 1.0, 1.5, 2.0)) – Fractional orders for the filter bank.
sigmas (sequence of float, default=(2.0,)) – Scale parameters for fractional kernels.
kernel_size (int, default=15) – Size of convolution kernels (must be odd).
mode (str, default='same') – Convolution mode: ‘same’ or ‘valid’.
kernel_type (str, default='heuristic') – Fractional kernel type: ‘heuristic’ or ‘grunwald’.
standardize (bool, default=True) – Whether to standardize X and Y before fitting.
backend (str, default='numpy') – Computational backend:
- ‘numpy’: NumPy/SciPy backend (CPU)
- ‘jax’: JAX backend (supports GPU/TPU)

n_features_in_

Number of input features.

Type: int

n_features_out_

Number of features after convolution.

Type: int

featurizer_

The fitted fractional featurizer.

Type: FractionalConvFeaturizer

pls_

The fitted PLS model.

Type: PLSRegression

Examples

>>> from nirs4all.operators.models.sklearn.fckpls import FCKPLS
>>> import numpy as np
>>> # Generate spectral data
>>> np.random.seed(42)
>>> X = np.random.randn(100, 200)  # 100 samples, 200 wavelengths
>>> y = X[:, 50:60].mean(axis=1) + 0.1 * np.random.randn(100)
>>> # Fit FCK-PLS with default fractional orders
>>> model = FCKPLS(n_components=10, alphas=(0.0, 0.5, 1.0, 1.5, 2.0))
>>> model.fit(X, y)
FCKPLS(...)
>>> predictions = model.predict(X)
>>> # Use specific fractional orders
>>> model2 = FCKPLS(n_components=10, alphas=(0.0, 1.0, 2.0), sigmas=(3.0,))
>>> model2.fit(X, y)
FCKPLS(...)

Notes

The fractional order α controls the type of spectral feature extracted:
- α ≈ 0: Smoothed spectrum (low-pass filtering)
- α ≈ 1: First derivative-like (highlights slopes)
- α ≈ 2: Second derivative-like (highlights peaks/valleys)
- Fractional α: Intermediate behavior
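
For intuition about how a fractional order interpolates between integer derivatives, the classical Grünwald-Letnikov weights are a useful reference. They follow the recurrence w_0 = 1, w_j = w_{j−1} · (1 − (α + 1)/j). The sketch below only computes these one-sided weights; how the ‘grunwald’ option of kernel_type turns them into a kernel is not described here, so the helper gl_coefficients is hypothetical and should not be read as the library's construction.

```python
import numpy as np

def gl_coefficients(alpha, k):
    """First k Grünwald-Letnikov weights, w_j = (-1)^j * binom(alpha, j)."""
    w = np.empty(k)
    w[0] = 1.0
    for j in range(1, k):
        w[j] = w[j - 1] * (1.0 - (alpha + 1.0) / j)
    return w

w1 = gl_coefficients(1.0, 4)     # backward difference: 1, -1, then zeros
w2 = gl_coefficients(2.0, 4)     # second difference: 1, -2, 1, then zeros
w_half = gl_coefficients(0.5, 4) # half-order: 1, -0.5, -0.125, -0.0625
```

For integer α the weights terminate after α + 1 terms, while for fractional α they decay slowly and never vanish, which is why a finite kernel_size truncates them.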

The sigma parameter controls the scale of the filter. Larger sigma captures broader spectral features; smaller sigma captures local details.

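FCK-PLS can be computationally expensive with many filters and long spectra, because the expanded feature matrix has D = L · p′ columns (for example, 5 filters applied to 200-point spectra in ‘same’ mode give 1000 features). Consider using the JAX backend for GPU acceleration.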