nirs4all.operators.models.sklearn.fckpls module

Fractional Convolutional Kernel PLS (FCK-PLS) regressor for nirs4all.

A scikit-learn-compatible implementation of FCK-PLS that uses fractional-order convolutional filters to build spectral features, then applies PLS regression. This approach is particularly suited to NIRS data, where derivative-like features at various fractional orders can capture different spectral signatures.

Supports both NumPy (CPU) and JAX (GPU/TPU) backends.

Mathematical formulation

Let

  • X ∈ ℝ^{n×p} be the input matrix of n samples and p features (e.g. NIRS spectra, treated as 1D signals over wavelength),

  • Y ∈ ℝ^{n×q} be the response matrix.

FCK-PLS builds an explicit feature map Φ : ℝ^p → ℝ^{D} by convolving each spectrum with a bank of L fractional filters { h_ℓ }_{ℓ=1,…,L}, and then applies PLS in this expanded feature space.

Fractional filter bank

Each filter h_ℓ ∈ ℝ^{k} (k odd) is defined by parameters (α_ℓ, σ_ℓ) that control its fractional “order” and scale. Conceptually, h_ℓ approximates a 1D operator whose frequency response has the form

H_ℓ(ω) ∝ |ω|^{α_ℓ} exp(−σ_ℓ ω²),

so that:

  • α_ℓ ≈ 0 corresponds to a smoothing operator,

  • α_ℓ ≈ 1 to a first-derivative-like operator,

  • α_ℓ ≈ 2 to a second-derivative-like operator,

with intermediate values giving fractional-order behavior. In practice, h_ℓ is implemented as a discrete, symmetric 1D kernel constructed from (α_ℓ, σ_ℓ, k) and normalized for numerical stability.
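As an illustration of the frequency response above, the following sketch samples H(ω) on a discrete grid and inverts it with an FFT to obtain a symmetric kernel. The function name `kernel_from_frequency_response` and the unit-norm normalization are assumptions made for this example; the kernels actually built by this module may be constructed and normalized differently.

```python
import numpy as np

def kernel_from_frequency_response(alpha, sigma, kernel_size):
    """Hypothetical helper: kernel whose DFT is |w|**alpha * exp(-sigma * w**2)."""
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")
    # Angular frequencies of the DFT bins, in [-pi, pi).
    omega = 2.0 * np.pi * np.fft.fftfreq(kernel_size)
    response = np.abs(omega) ** alpha * np.exp(-sigma * omega ** 2)
    # The response is real and even, so the inverse transform is real and symmetric.
    h = np.real(np.fft.ifft(response))
    h = np.fft.fftshift(h)  # move the zero-lag tap to the center of the kernel
    norm = np.linalg.norm(h)
    return h / norm if norm > 0 else h  # assumption: unit Euclidean norm

h_smooth = kernel_from_frequency_response(0.0, 2.0, 15)  # low-pass
h_deriv = kernel_from_frequency_response(1.0, 2.0, 15)   # band-pass, zero DC gain
```

For α > 0 the response vanishes at ω = 0, so the kernel taps sum to zero and constant baselines are removed; for α = 0 the kernel is a pure low-pass filter.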

For a single spectrum x ∈ ℝ^p, the convolution with filter ℓ is

f_ℓ = x * h_ℓ, f_ℓ ∈ ℝ^{p′},

where * denotes 1D discrete convolution along the wavelength axis (with either “same” or “valid” output length p′).
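The two output lengths can be checked with `numpy.convolve`. In this sketch the kernel is a plain moving average used as a placeholder, not one of the fractional filters, and the spectrum is synthetic.

```python
import numpy as np

p, k = 200, 15
rng = np.random.default_rng(0)
x = rng.normal(size=p)   # one synthetic "spectrum"
h = np.ones(k) / k       # placeholder kernel (moving average)

f_same = np.convolve(x, h, mode="same")    # zero-padded at the edges, p' = p
f_valid = np.convolve(x, h, mode="valid")  # fully overlapping positions only, p' = p - k + 1

print(f_same.shape, f_valid.shape)  # (200,) (186,)
```

Away from the edges the two modes agree: the central p - k + 1 values of the "same" output are exactly the "valid" output.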

The feature map Φ stacks all convolved signals:

Φ(x) = [ f_1ᵀ, f_2ᵀ, …, f_Lᵀ ]ᵀ ∈ ℝ^{D}, D = L · p′.

Collecting all samples, we form the feature-expanded matrix

X_feat ∈ ℝ^{n×D}, row i = Φ(x_i)ᵀ.
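A minimal sketch of this stacking, assuming three small hand-written kernels as stand-ins for the fractional filter bank (the helper `expand_features` is hypothetical and is not the module's FractionalConvFeaturizer):

```python
import numpy as np

def expand_features(X, kernels, mode="same"):
    """Convolve every row of X with every kernel and concatenate filter by filter."""
    blocks = [
        np.stack([np.convolve(row, h, mode=mode) for row in X])
        for h in kernels
    ]
    return np.hstack(blocks)  # shape (n, L * p')

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 50))  # n = 8 spectra, p = 50 wavelengths
kernels = [
    np.array([0.25, 0.5, 0.25]),  # smoothing-like
    np.array([0.5, 0.0, -0.5]),   # first-difference-like
    np.array([1.0, -2.0, 1.0]),   # second-difference-like
]

X_feat = expand_features(X, kernels)
print(X_feat.shape)  # (8, 150), i.e. D = L * p' = 3 * 50
```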

PLS in feature space

On X_feat and Y, FCK-PLS applies a standard PLS regression:

  • find loading matrix W_feat ∈ ℝ^{D×r},

  • scores T = X_feat W_feat ∈ ℝ^{n×r},

  • regression matrix C ∈ ℝ^{r×q},

such that:

  1. the covariance between T and Y is maximized (PLS objective), and

  2. the regression Y ≈ T C is well fitted in the least-squares sense.

Equivalently, one can define a kernel in the original input space

K_{ij} = Φ(x_i)ᵀ Φ(x_j),

and view FCK-PLS as a (linear) Kernel PLS in the feature space induced by the fractional convolutional map Φ. In this implementation, the feature map is explicit (X_feat is computed directly) and a standard PLSRegression is applied.
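The kernel view can be made concrete with a few lines of NumPy. The sketch below uses two placeholder kernels and a hypothetical helper `phi`; it only illustrates that the Gram matrix of an explicit, linear feature map is symmetric and positive semidefinite.

```python
import numpy as np

def phi(x, kernels):
    """Feature map for one spectrum: concatenated 'same' convolutions."""
    return np.concatenate([np.convolve(x, h, mode="same") for h in kernels])

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 40))
kernels = [np.array([0.25, 0.5, 0.25]), np.array([0.5, 0.0, -0.5])]

X_feat = np.stack([phi(x, kernels) for x in X])
K = X_feat @ X_feat.T  # K[i, j] = phi(x_i) . phi(x_j)

eigenvalues = np.linalg.eigvalsh(K)  # all non-negative up to rounding error
```

Because convolution is linear in x, Φ is a linear map, which is why the resulting kernel method is a linear Kernel PLS rather than a nonlinear one.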

Prediction for new data X* proceeds as follows:

  1. apply the same preprocessing and fractional convolutional featurizer to get X*_feat,

  2. compute scores T* = X*_feat W_feat,

  3. output Ŷ* = T* C (with inverse scaling if standardization is used).
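Steps 2 and 3 can be sketched for a single response with a small NIPALS PLS1 written in NumPy. This is a stand-in chosen to keep the example self-contained, not the PLSRegression estimator that this implementation applies; the function names are hypothetical, and the rotation matrix `R` plays the role of W_feat (it maps centered features directly to scores).

```python
import numpy as np

def pls1_fit(X_feat, y, r):
    """Fit PLS1 with r components by NIPALS; return what prediction needs."""
    x_mean, y_mean = X_feat.mean(axis=0), y.mean()
    Xk, yk = X_feat - x_mean, y - y_mean
    W, P, c = [], [], []
    for _ in range(r):
        w = Xk.T @ yk              # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                 # score vector of this component
        tt = t @ t
        p = Xk.T @ t / tt          # X loading
        c_k = yk @ t / tt          # regression of y on the score
        Xk = Xk - np.outer(t, p)   # deflate X
        yk = yk - c_k * t          # deflate y
        W.append(w)
        P.append(p)
        c.append(c_k)
    W, P = np.column_stack(W), np.column_stack(P)
    R = W @ np.linalg.inv(P.T @ W)  # scores from centered features: T = Xc @ R
    return x_mean, y_mean, R, np.array(c)

def pls1_predict(X_new_feat, x_mean, y_mean, R, c):
    T_new = (X_new_feat - x_mean) @ R  # step 2: scores
    return T_new @ c + y_mean          # step 3: regression, then undo centering

rng = np.random.default_rng(0)
X_feat = rng.normal(size=(30, 5))
y = X_feat @ np.array([1.0, -2.0, 0.5, 0.0, 3.0])  # noiseless linear response

params = pls1_fit(X_feat, y, r=5)
y_hat = pls1_predict(X_feat, *params)
```

With as many components as features, PLS coincides with ordinary least squares, so the noiseless response in this toy example is reproduced to rounding error. In practice r is much smaller than D and acts as the regularization parameter.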

By tuning { (α_ℓ, σ_ℓ) } and the number of components r, FCK-PLS can adaptively emphasize different fractional smooth/derivative behaviors and scales in the spectra, providing a flexible family of preprocessing+PLS models specialized to 1D spectral data.

class nirs4all.operators.models.sklearn.fckpls.FCKPLS(n_components: int = 10, alphas: Sequence[float] = (0.0, 0.5, 1.0, 1.5, 2.0), sigmas: Sequence[float] = (2.0,), kernel_size: int = 15, mode: Literal['same', 'valid'] = 'same', kernel_type: Literal['heuristic', 'grunwald'] = 'heuristic', standardize: bool = True, backend: str = 'numpy')[source]

Bases: BaseEstimator, RegressorMixin

Fractional Convolutional Kernel PLS (FCK-PLS).

FCK-PLS builds spectral features by convolving input spectra with a bank of fractional-order filters, then applies PLS regression on the expanded feature space. This approach captures derivative-like information at various fractional orders.

The pipeline is:

  1. Optional standardization of X and Y

  2. FractionalConvFeaturizer: X -> X_feat (feature expansion)

  3. PLSRegression: X_feat, Y -> predictions

Parameters:
  • n_components (int, default=10) – Number of PLS components to extract.

  • alphas (sequence of float, default=(0.0, 0.5, 1.0, 1.5, 2.0)) – Fractional orders for the filter bank.

  • sigmas (sequence of float, default=(2.0,)) – Scale parameters for fractional kernels.

  • kernel_size (int, default=15) – Size of convolution kernels (must be odd).

  • mode (str, default='same') – Convolution mode: ‘same’ or ‘valid’.

  • kernel_type (str, default='heuristic') – Fractional kernel type: ‘heuristic’ or ‘grunwald’.

  • standardize (bool, default=True) – Whether to standardize X and Y before fitting.

  • backend (str, default='numpy') – Computational backend:

      • ‘numpy’: NumPy/SciPy backend (CPU)

      • ‘jax’: JAX backend (supports GPU/TPU)

n_features_in_

Number of input features.

Type:

int

n_features_out_

Number of features after convolution.

Type:

int

featurizer_

The fitted fractional featurizer.

Type:

FractionalConvFeaturizer

pls_

The fitted PLS model.

Type:

PLSRegression

Examples

>>> from nirs4all.operators.models.sklearn.fckpls import FCKPLS
>>> import numpy as np
>>> # Generate spectral data
>>> np.random.seed(42)
>>> X = np.random.randn(100, 200)  # 100 samples, 200 wavelengths
>>> y = X[:, 50:60].mean(axis=1) + 0.1 * np.random.randn(100)
>>> # Fit FCK-PLS with default fractional orders
>>> model = FCKPLS(n_components=10, alphas=(0.0, 0.5, 1.0, 1.5, 2.0))
>>> model.fit(X, y)
FCKPLS(...)
>>> predictions = model.predict(X)
>>> # Use specific fractional orders
>>> model2 = FCKPLS(n_components=10, alphas=(0.0, 1.0, 2.0), sigmas=(3.0,))
>>> model2.fit(X, y)
FCKPLS(...)

Notes

The fractional order α controls the type of spectral feature extracted:

  • α ≈ 0: Smoothed spectrum (low-pass filtering)

  • α ≈ 1: First derivative-like (highlights slopes)

  • α ≈ 2: Second derivative-like (highlights peaks/valleys)

  • Fractional α: Intermediate behavior

The sigma parameter controls the scale of the filter. Larger sigma captures broader spectral features; smaller sigma captures local details.
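The effect of sigma can be quantified under the assumption that it enters as in the frequency response H(ω) ∝ |ω|^α exp(−σ ω²) from the mathematical formulation (the discrete kernels built by this module may scale sigma differently). Setting the derivative of α ln ω − σ ω² to zero gives the peak frequency ω* = sqrt(α / (2σ)) for α > 0, so a larger sigma moves the pass band toward lower frequencies, i.e. broader features. The helper names below are hypothetical.

```python
import math

def gain(omega, alpha, sigma):
    """Magnitude of the assumed response |w|**alpha * exp(-sigma * w**2)."""
    return abs(omega) ** alpha * math.exp(-sigma * omega ** 2)

def peak_frequency(alpha, sigma):
    """Angular frequency at which the assumed response is largest (alpha > 0)."""
    return math.sqrt(alpha / (2.0 * sigma))

for sigma in (0.5, 2.0, 8.0):
    print(sigma, peak_frequency(1.0, sigma))
# 0.5 1.0
# 2.0 0.5
# 8.0 0.25
```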

FCK-PLS can be computationally expensive with many filters and large spectra. Consider using the JAX backend for GPU acceleration.

See also

SIMPLS

Standard PLS without feature expansion.

IntervalPLS

PLS with wavelength interval selection.

__repr__() str[source]

Return string representation.

fit(X: ArrayLike, y: ArrayLike) FCKPLS[source]

Fit the FCK-PLS model.

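Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training spectra.

  • y (array-like of shape (n_samples,) or (n_samples, n_targets)) – Target values.

Returns: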

self – Fitted estimator.

Return type:

FCKPLS

Raises:

get_filter_info() dict[source]

Get information about the fractional filter bank.

Returns:

info – Dictionary containing filter parameters.

Return type:

dict

get_fractional_features(X: ArrayLike) NDArray[floating][source]

Get the fractional convolution features.

Parameters:

X (array-like of shape (n_samples, n_features)) – Input spectra.

Returns:

X_feat – Fractional convolution features.

Return type:

ndarray of shape (n_samples, n_features_out)

get_params(deep: bool = True) dict[source]

Get parameters for this estimator.

predict(X: ArrayLike) NDArray[floating][source]

Predict using the FCK-PLS model.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to predict.

Returns:

y_pred – Predicted values.

Return type:

ndarray of shape (n_samples,) or (n_samples, n_targets)

set_params(**params) FCKPLS[source]

Set the parameters of this estimator.

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') FCKPLS

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

transform(X: ArrayLike) NDArray[floating][source]

Transform X to PLS score space.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to transform.

Returns:

T – PLS scores in the feature-expanded space.

Return type:

ndarray of shape (n_samples, n_components)

class nirs4all.operators.models.sklearn.fckpls.FractionalConvFeaturizer(alphas: Sequence[float] = (0.0, 0.5, 1.0, 1.5, 2.0), sigmas: Sequence[float] = (2.0,), kernel_size: int = 15, mode: Literal['same', 'valid'] = 'same', kernel_type: Literal['heuristic', 'grunwald'] = 'heuristic')[source]

Bases: BaseEstimator, TransformerMixin

Convolutional featurizer using a bank of fractional filters.

Builds features by convolving input spectra with multiple fractional-order filters at different scales. This captures derivative-like information at various fractional orders, which can be useful for identifying spectral features.

Parameters:
  • alphas (sequence of float, default=(0.0, 0.5, 1.0, 1.5, 2.0)) – Fractional orders for the filter bank:

      • 0: Smoothing/identity-like

      • 0.5: Half-derivative

      • 1: First derivative

      • 1.5: Fractional between 1st and 2nd derivative

      • 2: Second derivative

  • sigmas (sequence of float, default=(2.0,)) – Scale parameters. A single value applies the same sigma to every alpha. A sequence of the same length as alphas is paired elementwise, giving (alphas[i], sigmas[i]).

  • kernel_size (int, default=15) – Size of convolution kernels (should be odd).

  • mode (str, default='same') – Convolution mode:

      • ‘same’: Output same length as input

      • ‘valid’: Output shorter (no padding)

  • kernel_type (str, default='heuristic') – Type of fractional kernel:

      • ‘heuristic’: Gaussian-modulated fractional power

      • ‘grunwald’: Grünwald-Letnikov coefficients
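The pairing rule described for sigmas can be sketched as follows. The helper `pair_alphas_sigmas` is hypothetical, and raising an error for any other length is an assumption: this page only documents the two cases handled first.

```python
from typing import List, Sequence, Tuple

def pair_alphas_sigmas(
    alphas: Sequence[float], sigmas: Sequence[float]
) -> List[Tuple[float, float]]:
    """Hypothetical helper: list the (alpha, sigma) pairs of the filter bank."""
    if len(sigmas) == 1:
        # One sigma shared by every fractional order.
        return [(alpha, sigmas[0]) for alpha in alphas]
    if len(sigmas) == len(alphas):
        # Elementwise pairing (alphas[i], sigmas[i]).
        return list(zip(alphas, sigmas))
    # Assumption: other lengths are not documented, so this sketch rejects them.
    raise ValueError("sigmas must have length 1 or the same length as alphas")

pairs_shared = pair_alphas_sigmas((0.0, 1.0, 2.0), (2.0,))
pairs_zipped = pair_alphas_sigmas((0.0, 1.0, 2.0), (1.0, 2.0, 4.0))
```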

kernels_

Precomputed convolution kernels.

Type:

list of ndarray

n_kernels_

Number of kernels in the filter bank.

Type:

int

fit(X, y=None)[source]

Precompute convolution kernels.

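Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input spectra.

  • y (default=None) – Not used by the kernel precomputation.

Returns: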

self

Return type:

FractionalConvFeaturizer

get_kernel_info() dict[source]

Get information about the filter bank.

Returns:

info – Dictionary containing kernel parameters and shapes.

Return type:

dict

transform(X)[source]

Apply fractional convolution filter bank.

Parameters:

X (array-like of shape (n_samples, n_features)) – Input spectra.

Returns:

X_feat – Convolved features. n_features_out depends on mode:

  • ‘same’: n_features * n_kernels

  • ‘valid’: (n_features - kernel_size + 1) * n_kernels

Return type:

ndarray of shape (n_samples, n_features_out)
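The output width stated for transform can be worked through with a small helper. The function `n_features_out` below is written for this example only and simply restates the two formulas.

```python
def n_features_out(n_features, n_kernels, kernel_size, mode="same"):
    """Number of columns of X_feat for the documented convolution modes."""
    if mode == "same":
        per_kernel = n_features
    elif mode == "valid":
        per_kernel = n_features - kernel_size + 1
    else:
        raise ValueError("mode must be 'same' or 'valid'")
    return per_kernel * n_kernels

# 200 wavelengths, five kernels, kernel_size=15
print(n_features_out(200, 5, 15, mode="same"))   # 1000
print(n_features_out(200, 5, 15, mode="valid"))  # 930
```

The expanded matrix grows linearly with the number of kernels, which is worth keeping in mind when choosing n_components and when memory is limited.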

nirs4all.operators.models.sklearn.fckpls.FractionalPLS

alias of FCKPLS

nirs4all.operators.models.sklearn.fckpls.fractional_kernel_1d(alpha: float, sigma: float, kernel_size: int) ndarray[tuple[Any, ...], dtype[floating]][source]

Build a 1D discrete kernel for fractional smoothing/derivative.

This kernel approximates fractional order operators by combining a Gaussian envelope with a fractional power modulation. The result captures derivative-like behavior at non-integer orders.

Parameters:
  • alpha (float) – Fractional order in [0, 2]:

      • 0: Pure smoothing (Gaussian-like)

      • 1: First-derivative-like behavior

      • 2: Second-derivative-like behavior

    Intermediate values provide fractional derivatives.

  • sigma (float) – Scale parameter controlling the width of the kernel. Larger sigma = wider filter, more smoothing.

  • kernel_size (int) – Number of points in the kernel (should be odd).

Returns:

h – Normalized discrete filter.

Return type:

ndarray of shape (kernel_size,)

Notes

This is a heuristic approximation suitable for spectral data. For mathematically rigorous fractional derivatives, use FFT-based implementations with frequency domain multiplication by |ω|^α.
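A minimal sketch of the FFT-based alternative mentioned in this note, assuming a periodic, uniformly sampled signal. The function `riesz_fractional_derivative` is hypothetical and is not part of this module. Multiplying by |ω|^α defines a symmetric (Riesz-type) fractional operator; for α = 2 it equals the negated second derivative.

```python
import numpy as np

def riesz_fractional_derivative(signal, alpha, spacing=1.0):
    """Multiply the spectrum of a periodic signal by |w|**alpha."""
    n = signal.shape[-1]
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=spacing)  # angular frequencies
    spectrum = np.fft.fft(signal, axis=-1)
    return np.real(np.fft.ifft(np.abs(omega) ** alpha * spectrum, axis=-1))

x = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
dx = x[1] - x[0]
s = np.sin(3.0 * x)

d2 = riesz_fractional_derivative(s, 2.0, spacing=dx)  # equals 9 * sin(3x)
half = riesz_fractional_derivative(s, 0.5, spacing=dx)
half_twice = riesz_fractional_derivative(half, 0.5, spacing=dx)
d1 = riesz_fractional_derivative(s, 1.0, spacing=dx)  # same as half_twice
```

Applying order 0.5 twice gives the same result as applying order 1 once, which is the composition property the heuristic kernels only approximate. Real spectra are not periodic, so edge handling (padding or windowing) would be needed before using this approach on them.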

nirs4all.operators.models.sklearn.fckpls.fractional_kernel_grunwald_letnikov(alpha: float, kernel_size: int) ndarray[tuple[Any, ...], dtype[floating]][source]

Build Grünwald-Letnikov fractional derivative kernel.

This is a more mathematically rigorous approximation of the fractional derivative operator using the Grünwald-Letnikov definition.

Parameters:
  • alpha (float) – Fractional order (can be any real number).

  • kernel_size (int) – Number of points in the kernel.

Returns:

h – Grünwald-Letnikov coefficients.

Return type:

ndarray of shape (kernel_size,)

Notes

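The Grünwald-Letnikov definition is

D^α f(x) = lim_{h→0} (1/h^α) Σ_{j=0}^{n} (−1)^j C(α, j) f(x − jh),

where C(α, j) = Γ(α+1) / (Γ(j+1) Γ(α−j+1)) is the generalized binomial coefficient. For a fixed step h and a finite number of terms n, the sum is an approximation of D^α f(x).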
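The weights (−1)^j C(α, j) can be computed without evaluating Gamma functions by using the recurrence w_0 = 1, w_j = w_{j−1} (j − 1 − α) / j, which also avoids the poles of Γ(α−j+1) at integer α. The sketch below, with the hypothetical name `gl_coefficients`, returns only these raw weights; any ordering, centering, or normalization applied by this module's kernel is not reproduced here.

```python
import math

def gl_coefficients(alpha, n_terms):
    """Return w_j = (-1)**j * C(alpha, j) for j = 0 .. n_terms - 1."""
    weights = [1.0]
    for j in range(1, n_terms):
        weights.append(weights[-1] * (j - 1 - alpha) / j)
    return weights

first_order = gl_coefficients(1.0, 4)   # backward difference: 1, -1, then zeros
second_order = gl_coefficients(2.0, 4)  # second difference: 1, -2, 1, then zero
half_order = gl_coefficients(0.5, 6)    # slowly decaying tail for fractional order
```

Integer orders recover the familiar finite-difference stencils with a finite number of non-zero taps, while fractional orders give an infinite, slowly decaying sequence that a kernel of length kernel_size has to truncate.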