nirs4all.operators.transforms.resampler module

Wavelength resampling operators for NIRS spectral data.

This module provides resampling functionality to interpolate spectral data to new wavelength grids using various scipy interpolation methods.

class nirs4all.operators.transforms.resampler.Resampler(target_wavelengths: ndarray, method: Literal['linear', 'nearest', 'cubic', 'quadratic', 'slinear', 'zero'] = 'linear', crop_range: Tuple[float, float] | None = None, fill_value: float | str = 0.0, bounds_error: bool = False, copy: bool = True)

Bases: TransformerMixin, BaseEstimator

Resample spectral data to a new wavelength grid using interpolation.

This transformer interpolates NIRS spectral data from the original wavelength grid to a target wavelength grid using scipy interpolation methods.

Parameters:
  • target_wavelengths (array-like) – Target wavelengths for resampling. Must be 1D array.

  • method (str, default='linear') – Interpolation method. Supported methods:

      - ‘linear’: Linear interpolation
      - ‘nearest’: Nearest neighbor interpolation
      - ‘cubic’: Cubic spline interpolation
      - ‘quadratic’: Quadratic spline interpolation
      - ‘slinear’: Linear spline (order 1)
      - ‘zero’: Zero-order spline (piecewise constant)

    Future: may support additional scipy methods.

  • crop_range (tuple of (float, float) or None, default=None) – Optional (min_wavelength, max_wavelength) to crop original data before resampling.

  • fill_value (float or 'extrapolate', default=0.0) – Value used for target wavelengths outside the original range:

      - float: Fill out-of-range points with this constant. The default, 0.0, pads with zeros (safe choice).
      - ‘extrapolate’: Extrapolate using the interpolation method.

  • bounds_error (bool, default=False) – If True, raise error when target wavelengths are outside original range. If False, use fill_value for out-of-bounds points.

  • copy (bool, default=True) – Whether to copy input data or modify in place.
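The interaction between crop_range, fill_value and bounds_error can be sketched directly with scipy. The method names above match the `kind` argument of `scipy.interpolate.interp1d`, so this sketch assumes a similar backend; the source does not confirm which scipy routine Resampler actually calls, and the inclusive crop bounds are also an assumption.

```python
import numpy as np
from scipy.interpolate import interp1d

# Original grid: 5 wavelengths (nm) and one spectrum that is linear in wavelength.
original_wl = np.array([1000.0, 1100.0, 1200.0, 1300.0, 1400.0])
X = (original_wl / 100.0).reshape(1, -1)  # [[10, 11, 12, 13, 14]]

# Crop step, mimicking crop_range=(1100, 1300). Inclusive bounds are assumed.
lo, hi = 1100.0, 1300.0
keep = (original_wl >= lo) & (original_wl <= hi)
wl_c, X_c = original_wl[keep], X[:, keep]

# Target grid with one point below and one point above the cropped range.
target_wl = np.array([1050.0, 1150.0, 1250.0, 1350.0])

# Constant fill: out-of-range targets receive fill_value (0.0 here).
f_zero = interp1d(wl_c, X_c, kind="linear", axis=1,
                  bounds_error=False, fill_value=0.0)
X_zero = f_zero(target_wl)

# Extrapolation: out-of-range targets continue the interpolant.
f_ext = interp1d(wl_c, X_c, kind="linear", axis=1,
                 bounds_error=False, fill_value="extrapolate")
X_ext = f_ext(target_wl)

# bounds_error=True: out-of-range targets raise instead of being filled.
try:
    interp1d(wl_c, X_c, kind="linear", axis=1, bounds_error=True)(target_wl)
    raised = False
except ValueError:
    raised = True
```

With constant fill the two outer targets become 0.0, while with 'extrapolate' they follow the linear trend of the cropped spectrum.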

original_wavelengths_

Original wavelength grid from fit data

Type:

ndarray of shape (n_features,)

n_features_in_

Number of features (wavelengths) in input data

Type:

int

n_features_out_

Number of features (wavelengths) in output data

Type:

int

interpolator_params_

Stored interpolation parameters for reconstruction

Type:

dict

Examples

>>> from nirs4all.operators.transforms import Resampler
>>> import numpy as np
>>>
>>> # Original data at 1000-2500 nm with 200 points
>>> X = np.random.randn(100, 200)
>>> original_wl = np.linspace(1000, 2500, 200)
>>>
>>> # Resample to 100 evenly-spaced wavelengths
>>> target_wl = np.linspace(1000, 2500, 100)
>>> resampler = Resampler(target_wavelengths=target_wl, method='cubic')
>>> resampler.fit(X, wavelengths=original_wl)
>>> X_resampled = resampler.transform(X)
>>> X_resampled.shape
(100, 100)

Notes

  • Wavelengths must be strictly increasing

  • Warns if target wavelengths extend beyond original range

  • Raises error if no wavelengths overlap between original and target
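The three notes above can be read as a small validation routine. The following is a minimal sketch, not the library's code: `check_wavelength_grids` is a hypothetical helper, it assumes the "strictly increasing" requirement applies to the original grid, and it assumes "overlap" means that at least one target wavelength falls inside the original range.

```python
import warnings

import numpy as np


def check_wavelength_grids(original_wl, target_wl):
    """Hypothetical helper mirroring the notes; returns a mask of in-range targets."""
    original_wl = np.asarray(original_wl, dtype=float)
    target_wl = np.asarray(target_wl, dtype=float)

    # Note 1: the original grid must be 1D and strictly increasing.
    if original_wl.ndim != 1 or np.any(np.diff(original_wl) <= 0):
        raise ValueError("wavelengths must be 1D and strictly increasing")

    # Targets that fall inside [min(original), max(original)].
    inside = (target_wl >= original_wl[0]) & (target_wl <= original_wl[-1])

    # Note 3: no overlap at all is an error.
    if not inside.any():
        raise ValueError("no overlap between original and target wavelengths")

    # Note 2: partial overlap only triggers a warning.
    if not inside.all():
        warnings.warn("some target wavelengths lie outside the original range",
                      UserWarning)
    return inside
```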

__repr__()

String representation of the resampler.

fit(X, y=None, wavelengths: ndarray | None = None)

Fit the resampler by storing the original wavelength grid.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training data.

  • y (None) – Ignored. Present for API consistency.

  • wavelengths (array-like of shape (n_features,), optional) – Original wavelength grid. If None, the grid is extracted from the dataset headers by the controller.

Returns:

self – Fitted resampler.

Return type:

Resampler

get_feature_names_out(input_features=None)

Get output feature names (target wavelengths as strings).

Parameters:

input_features (array-like of str or None, default=None) – Ignored. Present for API consistency.

Returns:

feature_names_out – Target wavelengths as strings.

Return type:

ndarray of str
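As an illustration of the return value, the sketch below converts a target grid to an array of strings. The exact number formatting used by Resampler is not stated in this page, so `str(float(w))` is an assumption made only for the example.

```python
import numpy as np

# Target grid of 4 wavelengths: 1000, 1500, 2000, 2500 nm.
target_wl = np.linspace(1000, 2500, 4)

# One string per target wavelength; the formatting choice is an assumption.
feature_names_out = np.asarray([str(float(w)) for w in target_wl], dtype=object)
```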

set_fit_request(*, wavelengths: bool | None | str = '$UNCHANGED$') → Resampler

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in scikit-learn version 1.3.

Parameters:

wavelengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for wavelengths parameter in fit.

Returns:

self – The updated object.

Return type:

object

transform(X)

Resample spectral data to the target wavelength grid.

Parameters:

X (array-like of shape (n_samples, n_features)) – Spectral data to resample. Should have the same number of features as the training data.

Returns:

X_resampled – Resampled spectral data.

Return type:

ndarray of shape (n_samples, n_features_out_)
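The fit/transform contract described above can be sketched with a small, linear-only stand-in built on `numpy.interp`. `MiniResampler` is a hypothetical class written for illustration; it is not the nirs4all implementation, it supports only linear interpolation with a constant fill, and it requires an explicit `wavelengths` argument.

```python
import numpy as np


class MiniResampler:
    """Hypothetical linear-only stand-in for the fit/transform contract."""

    def __init__(self, target_wavelengths, fill_value=0.0):
        self.target_wavelengths = np.asarray(target_wavelengths, dtype=float)
        self.fill_value = fill_value

    def fit(self, X, y=None, wavelengths=None):
        X = np.asarray(X, dtype=float)
        if wavelengths is None:
            raise ValueError("this sketch needs an explicit wavelength grid")
        wavelengths = np.asarray(wavelengths, dtype=float)
        if wavelengths.shape != (X.shape[1],):
            raise ValueError("wavelengths must have shape (n_features,)")
        # Store the fitted state under the attribute names documented above.
        self.original_wavelengths_ = wavelengths
        self.n_features_in_ = X.shape[1]
        self.n_features_out_ = self.target_wavelengths.shape[0]
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        if X.shape[1] != self.n_features_in_:
            raise ValueError("X has a different number of features than in fit")
        # Interpolate each spectrum (row) onto the target grid.
        rows = [
            np.interp(self.target_wavelengths, self.original_wavelengths_, row,
                      left=self.fill_value, right=self.fill_value)
            for row in X
        ]
        return np.vstack(rows)


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 200))
original_wl = np.linspace(1000, 2500, 200)
target_wl = np.linspace(1000, 2500, 100)

model = MiniResampler(target_wl).fit(X, wavelengths=original_wl)
X_resampled = model.transform(X)
```

The output has one column per target wavelength, so its second dimension equals `n_features_out_` rather than `n_features_in_`.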