nirs4all.operators.transforms.nirs module
- class nirs4all.operators.transforms.nirs.ASLSBaseline(lam: float = 1000000.0, p: float = 0.01, max_iter: int = 50, tol: float = 0.001, *, copy: bool = True)[source]
Bases: _BaselineMethodAlias
Asymmetric Least Squares (AsLS) baseline correction.
Convenience class for ASLS baseline correction. This is equivalent to PyBaselineCorrection(method='asls', ...).
- Parameters:
References
Eilers, P.H.C. and Boelens, H.F.M. (2005). Baseline Correction with Asymmetric Least Squares Smoothing.
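As an illustration of the AsLS idea only, the following minimal sketch alternates a penalised least-squares fit with an asymmetric weight update. It is not the pybaselines or nirs4all implementation: the function name `asls_baseline`, the dense matrices, and the stopping rule are assumptions made for the example, while `lam`, `p`, `max_iter` and `tol` mirror the constructor arguments above.

```python
import numpy as np

def asls_baseline(y, lam=1e6, p=0.01, max_iter=50, tol=1e-3):
    """Estimate a baseline for one spectrum with asymmetric least squares."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-order difference operator, shape (n - 2, n); dense for brevity.
    D = np.diff(np.eye(n), n=2, axis=0)
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    for _ in range(max_iter):
        # Weighted, penalised least-squares fit for the current weights.
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        # Points above the fit (peaks) get weight p, the others get 1 - p.
        w_new = np.where(y > z, p, 1.0 - p)
        # Assumed stopping rule: small relative change in the weights.
        if np.linalg.norm(w_new - w) / np.linalg.norm(w) < tol:
            break
        w = w_new
    return z

x = np.linspace(0.0, 1.0, 200)
true_baseline = 1.0 + x                               # sloping background
peak = np.exp(-0.5 * ((x - 0.5) / 0.03) ** 2)         # one narrow peak
baseline = asls_baseline(true_baseline + peak)
corrected = true_baseline + peak - baseline
```

A larger `lam` gives a stiffer baseline, and a smaller `p` lets the fit ignore peaks more strongly. For long spectra a sparse solver would be the usual choice instead of dense matrices.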
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') ASLSBaseline
Configure whether metadata should be requested to be passed to the transform method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- class nirs4all.operators.transforms.nirs.AirPLS(lam: float = 1000000.0, max_iter: int = 50, tol: float = 0.001, *, copy: bool = True)[source]
Bases: _BaselineMethodAlias
Adaptive Iteratively Reweighted Penalized Least Squares baseline correction.
A robust baseline correction method that adaptively adjusts weights based on the difference between the fitted baseline and the data.
- Parameters:
References
Zhang, Z.M., et al. (2010). Baseline correction using adaptive iteratively reweighted penalized least squares. Analyst, 135(5), 1138-1146.
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') AirPLS
- class nirs4all.operators.transforms.nirs.ArPLS(lam: float = 1000000.0, max_iter: int = 50, tol: float = 0.001, *, copy: bool = True)[source]
Bases: _BaselineMethodAlias
Asymmetrically Reweighted Penalized Least Squares baseline correction.
- Parameters:
References
Baek, S.J., et al. (2015). Baseline correction using asymmetrically reweighted penalized least squares smoothing. Analyst, 140(1), 250-257.
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') ArPLS
- class nirs4all.operators.transforms.nirs.AreaNormalization(method: str = 'sum', *, copy: bool = True)[source]
Bases: TransformerMixin, BaseEstimator
Area normalization of spectra.
Normalizes each spectrum by dividing by its total area (sum of absolute values). This removes intensity variations while preserving spectral shape.
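The description above (divide each spectrum by the sum of its absolute values) can be sketched in a few lines of NumPy. This is only an illustration of that one rule: the helper name `area_normalize` and the guard for all-zero spectra are assumptions, and other values of `method` are not covered.

```python
import numpy as np

def area_normalize(X):
    """Divide each row (spectrum) by the sum of its absolute values."""
    X = np.asarray(X, dtype=float)
    area = np.abs(X).sum(axis=1, keepdims=True)
    # Assumed guard: leave all-zero spectra unchanged instead of dividing by 0.
    area[area == 0] = 1.0
    return X / area

# Two spectra with the same shape but a tenfold intensity difference.
X = np.array([[1.0, 2.0, 1.0], [10.0, 20.0, 10.0]])
X_norm = area_normalize(X)
```

After normalization both rows are identical, which is the sense in which intensity variation is removed while shape is preserved.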
- Parameters:
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') AreaNormalization
- class nirs4all.operators.transforms.nirs.BEADS(lam_0: float = 1.0, lam_1: float = 1.0, lam_2: float = 1.0, max_iter: int = 50, tol: float = 0.01, *, copy: bool = True)[source]
Bases: _BaselineMethodAlias
Baseline Estimation And Denoising with Sparsity.
Simultaneously estimates baseline and removes noise using sparsity constraints.
- Parameters:
lam_0 (float, default=1.0) – Regularization parameter for the baseline.
lam_1 (float, default=1.0) – Regularization parameter for the first derivative.
lam_2 (float, default=1.0) – Regularization parameter for the second derivative.
max_iter (int, default=50) – Maximum number of iterations.
tol (float, default=1e-2) – Convergence tolerance.
copy (bool, default=True) – Whether to copy input data.
References
Ning, X., et al. (2014). Chromatogram baseline estimation and denoising using sparsity (BEADS). Chemometrics and Intelligent Laboratory Systems, 139, 156-167.
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') BEADS
- class nirs4all.operators.transforms.nirs.ExtendedMultiplicativeScatterCorrection(degree: int = 2, scale: bool = True, *, copy: bool = True)[source]
Bases: TransformerMixin, BaseEstimator
Extended Multiplicative Scatter Correction (EMSC).
EMSC extends MSC by including polynomial terms to model chemical and physical light scattering effects.
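One common way to write this down is to regress each spectrum on a reference spectrum plus polynomial terms, then remove the polynomial part and divide by the multiplicative coefficient. The sketch below follows that formulation under stated assumptions: the function name `emsc`, the mean spectrum as default reference, and the normalised axis `t` are illustrative and may differ from what this class does internally.

```python
import numpy as np

def emsc(X, reference=None, degree=2):
    """Correct each spectrum against a reference plus polynomial terms."""
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    if reference is None:
        reference = X.mean(axis=0)          # assumed default reference
    # Polynomial terms 1, t, ..., t**degree on a normalised wavelength axis.
    t = np.linspace(-1.0, 1.0, n_features)
    poly = np.vander(t, degree + 1, increasing=True)
    design = np.column_stack([reference, poly])
    # Least-squares coefficients for every spectrum at once.
    coefs, *_ = np.linalg.lstsq(design, X.T, rcond=None)
    scale = coefs[0]                        # multiplicative term, one per spectrum
    additive = poly @ coefs[1:]             # polynomial part, (n_features, n_samples)
    return ((X.T - additive) / scale).T

t = np.linspace(-1.0, 1.0, 50)
reference = np.sin(3.0 * t) + 2.0
# A copy of the reference distorted by scaling and a quadratic offset.
distorted = 1.5 * reference + 0.2 + 0.1 * t - 0.05 * t**2
recovered = emsc(distorted[None, :], reference=reference, degree=2)
```

With `degree=0` the model reduces to the plain MSC form: an offset and a scale relative to the reference.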
- Parameters:
- class nirs4all.operators.transforms.nirs.FirstDerivative(delta: float = 1.0, edge_order: int = 2, *, copy: bool = True)[source]
Bases: TransformerMixin, BaseEstimator
First numerical derivative using numpy.gradient.
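For reference, this is how `numpy.gradient` behaves with a sample spacing and `edge_order=2` on a single spectrum. It is a small sketch; applying it along `axis=1` (one spectrum per row) is an assumption about how the transformer lays out its data.

```python
import numpy as np

# One spectrum sampled from f(x) = x**2 at x = 0, 1, 2, 3, 4.
X = np.array([[0.0, 1.0, 4.0, 9.0, 16.0]])
delta = 1.0  # spacing between adjacent points

# Central differences inside, second-order one-sided differences at the edges.
dX = np.gradient(X, delta, axis=1, edge_order=2)
```

Because both the interior and the edge formulas are second-order accurate, the result for a quadratic is exact: `dX` equals `2 * x` at every point, including the two ends.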
- Parameters:
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') FirstDerivative
- class nirs4all.operators.transforms.nirs.Haar(*, copy: bool = True)[source]
Bases: Wavelet
Shortcut to the Wavelet haar transform.
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') Haar
- class nirs4all.operators.transforms.nirs.IASLS(lam: float = 1000000.0, p: float = 0.01, lam_1: float = 0.0001, max_iter: int = 50, tol: float = 0.001, *, copy: bool = True)[source]
Bases: _BaselineMethodAlias
Improved Asymmetric Least Squares baseline correction.
An improvement over ASLS that uses a different weighting scheme.
- Parameters:
lam (float, default=1e6) – Smoothness parameter.
p (float, default=0.01) – Asymmetry parameter.
lam_1 (float, default=1e-4) – First derivative smoothing parameter.
max_iter (int, default=50) – Maximum number of iterations.
tol (float, default=1e-3) – Convergence tolerance.
copy (bool, default=True) – Whether to copy input data.
References
He, S., et al. (2014). Baseline correction for Raman spectra using an improved asymmetric least squares method. Analytical Methods, 6(12), 4402-4407.
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') IASLS
- class nirs4all.operators.transforms.nirs.IModPoly(poly_order: int = 5, max_iter: int = 250, tol: float = 0.001, *, copy: bool = True)[source]
Bases: _BaselineMethodAlias
Improved Modified Polynomial baseline correction.
A polynomial-based baseline correction that iteratively fits and removes points above the baseline.
- Parameters:
References
Zhao, J., et al. (2007). Automated autofluorescence background subtraction algorithm for biomedical Raman spectroscopy. Applied Spectroscopy, 61(11), 1225-1232.
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') IModPoly
- class nirs4all.operators.transforms.nirs.LogTransform(base: float = 2.718281828459045, offset: float = 0.0, auto_offset: bool = True, min_value: float = 1e-08, *, copy: bool = True)[source]
Bases: TransformerMixin, BaseEstimator
Elementwise logarithm with automatic handling of edge cases.
- Parameters:
base (float, default=np.e) – Logarithm base.
offset (float, default=0.0) – Fixed value added before log to handle non-positives.
auto_offset (bool, default=True) – If True, automatically add offset to handle zeros/negatives.
min_value (float, default=1e-8) – Minimum value after offset when auto_offset=True.
copy (bool, default=True) – Whether to copy input.
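The interaction of these parameters can be sketched as follows. The exact shifting rule used by the class is not stated above, so the rule here (shift the data so that its minimum equals `min_value` whenever non-positive values remain after `offset`) is an assumption, as is the helper name `log_transform`.

```python
import numpy as np

def log_transform(X, base=np.e, offset=0.0, auto_offset=True, min_value=1e-8):
    """Logarithm in an arbitrary base with an optional automatic shift."""
    X = np.asarray(X, dtype=float) + offset
    if auto_offset and X.min() <= 0:
        # Assumed behaviour: shift so the smallest value becomes min_value.
        X = X + (min_value - X.min())
    # Change of base: log_b(x) = ln(x) / ln(b).
    return np.log(X) / np.log(base)

plain = log_transform(np.array([[1.0, 10.0, 100.0]]), base=10.0)
shifted = log_transform(np.array([[-1.0, 0.0, 1.0]]), base=10.0)
```

Strictly positive input passes through untouched, so `plain` is the ordinary base-10 logarithm. The second call contains zero and negative values, which are shifted before the logarithm so that the result stays finite.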
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') LogTransform
- class nirs4all.operators.transforms.nirs.ModPoly(poly_order: int = 5, max_iter: int = 250, tol: float = 0.001, *, copy: bool = True)[source]
Bases: _BaselineMethodAlias
Modified Polynomial baseline correction.
- Parameters:
References
Lieber, C.A. and Mahadevan-Jansen, A. (2003). Automated method for subtraction of fluorescence from biological Raman spectra. Applied Spectroscopy, 57(11), 1363-1367.
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') ModPoly
- class nirs4all.operators.transforms.nirs.MultiplicativeScatterCorrection(scale=True, *, copy=True)[source]
Bases: TransformerMixin, BaseEstimator
- class nirs4all.operators.transforms.nirs.PyBaselineCorrection(method: str = 'asls', *, copy: bool = True, **method_params)[source]
Bases: TransformerMixin, BaseEstimator
General baseline correction using pybaselines library.
A flexible wrapper for the pybaselines library that provides access to numerous baseline correction algorithms. This transformer allows easy integration of any pybaselines method into sklearn pipelines.
- Parameters:
method (str, default='asls') –
The baseline correction method to use. Available methods by category:
- Whittaker-based (smooth baselines with asymmetric weighting):
'asls': Asymmetric Least Squares
'iasls': Improved Asymmetric Least Squares
'airpls': Adaptive Iteratively Reweighted PLS
'arpls': Asymmetrically Reweighted PLS
'drpls': Doubly Reweighted PLS
'iarpls': Improved ARPLS
'aspls': Adaptive Smoothness PLS
'psalsa': Peaked Signal's Asymmetric Least Squares
'derpsalsa': Derivative PSALSA
- Polynomial (polynomial fitting):
'poly': Regular polynomial
'modpoly': Modified polynomial
'imodpoly': Improved modified polynomial
'penalized_poly': Penalized polynomial
'loess': Locally estimated scatterplot smoothing
'quant_reg': Quantile regression
- Morphological (morphological operations):
'mor': Morphological
'imor': Improved morphological
'mormol': Morphological and mollified
'amormol': Averaging morphological and mollified
'rolling_ball': Rolling ball algorithm
'mwmv': Moving window minimum value
'tophat': Top-hat transform
'mpspline': Morphological penalized spline
'jbcd': Joint baseline correction and denoising
- Spline (spline-based methods):
'mixture_model': Mixture model
'irsqr': Iteratively reweighted spline quantile regression
'corner_cutting': Corner-cutting
'pspline_asls', 'pspline_iasls', 'pspline_airpls', etc.
- Smooth (smoothing-based):
'noise_median': Noise median
'snip': Statistics-sensitive Non-linear Iterative Peak-clipping
'swima': Small-Window Moving Average
'ipsa': Iterative Polynomial Smoothing Algorithm
- Misc:
'beads': Baseline estimation and denoising with sparsity
'interp_pts': Interpolation between points
copy (bool, default=True) – Whether to copy input data.
**method_params (dict) – Additional parameters passed to the specific baseline method. Common parameters include:
- lam (float): Smoothness parameter for Whittaker methods
- p (float): Asymmetry parameter for ASLS-type methods
- poly_order (int): Polynomial order for polynomial methods
- max_half_window (int): Window size for morphological/smooth methods
- max_iter (int): Maximum iterations
- tol (float): Convergence tolerance
Examples
>>> from nirs4all.operators.transforms.nirs import PyBaselineCorrection
>>> import numpy as np
>>> # spectra: array-like of shape (n_samples, n_features)
Basic usage with ASLS:
>>> transformer = PyBaselineCorrection(method='asls', lam=1e6, p=0.01)
>>> corrected = transformer.fit_transform(spectra)
Using airPLS:
>>> transformer = PyBaselineCorrection(method='airpls', lam=1e5)
>>> corrected = transformer.fit_transform(spectra)
Using improved modified polynomial:
>>> transformer = PyBaselineCorrection(method='imodpoly', poly_order=3)
>>> corrected = transformer.fit_transform(spectra)
Using SNIP for Raman-like data:
>>> transformer = PyBaselineCorrection(method='snip', max_half_window=40)
>>> corrected = transformer.fit_transform(spectra)
Using rolling ball:
>>> transformer = PyBaselineCorrection(method='rolling_ball', half_window=50)
>>> corrected = transformer.fit_transform(spectra)
In a pipeline:
>>> from sklearn.pipeline import Pipeline
>>> from sklearn.preprocessing import StandardScaler
>>> pipeline = Pipeline([
...     ('baseline', PyBaselineCorrection(method='airpls', lam=1e5)),
...     ('scale', StandardScaler()),
... ])
References
pybaselines documentation: https://pybaselines.readthedocs.io/
- fit(X, y=None)[source]
Fit the transformer (validates method and stores number of features).
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (None) – Ignored.
- Returns:
self – Fitted transformer.
- Return type:
- static list_methods()[source]
List all available baseline correction methods.
- Returns:
Dictionary with method categories as keys and list of methods as values.
- Return type:
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') PyBaselineCorrection
- transform(X, copy=None)[source]
Apply baseline correction to the data.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input spectra.
copy (bool or None, optional) – Whether to copy the input data.
- Returns:
X_corrected – Baseline-corrected spectra.
- Return type:
ndarray of shape (n_samples, n_features)
- class nirs4all.operators.transforms.nirs.ReflectanceToAbsorbance(min_value: float = 1e-08, percent: bool = False, *, copy: bool = True)[source]
Bases: TransformerMixin, BaseEstimator
Convert reflectance spectra to absorbance using Beer-Lambert law.
Applies the transformation: A = -log10(R) = log10(1/R) where R is reflectance and A is absorbance.
This is a fundamental transformation in NIR spectroscopy, as absorbance is linearly related to concentration (Beer-Lambert law), while reflectance is not.
- Parameters:
min_value (float, default=1e-8) – Minimum value to clamp reflectance to avoid log(0). Values below this threshold will be set to min_value before applying the log transform.
percent (bool, default=False) – If True, assumes input reflectance is in percentage (0-100) and divides by 100 before conversion.
copy (bool, default=True) – Whether to copy input data.
Notes
Input reflectance values should be positive.
For reflectance in range (0, 1], output absorbance is non-negative.
For reflectance > 1 (e.g., percentage values), set percent=True.
Examples
>>> from nirs4all.operators.transforms.nirs import ReflectanceToAbsorbance
>>> import numpy as np
>>> R = np.array([[0.5, 0.25, 0.1], [0.8, 0.4, 0.2]])
>>> transformer = ReflectanceToAbsorbance()
>>> A = transformer.fit_transform(R)
>>> # A ≈ [[0.301, 0.602, 1.0], [0.097, 0.398, 0.699]]
- fit(X, y=None)[source]
Fit the transformer (no-op, included for API compatibility).
- Parameters:
X (array-like of shape (n_samples, n_features)) – Reflectance spectra.
y (None) – Ignored.
- Returns:
self – Fitted transformer.
- Return type:
- inverse_transform(X)[source]
Convert absorbance back to reflectance.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Absorbance spectra.
- Returns:
X_reflectance – Reflectance spectra.
- Return type:
ndarray of shape (n_samples, n_features)
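The forward formula A = -log10(R) and its inverse R = 10^(-A) can be checked with a short standalone sketch. The two helper functions are hypothetical stand-ins, not the class methods, and rescaling to percent in the inverse is an assumption.

```python
import numpy as np

def reflectance_to_absorbance(R, min_value=1e-8, percent=False):
    """A = -log10(R), with R clamped from below to avoid log(0)."""
    R = np.asarray(R, dtype=float)
    if percent:
        R = R / 100.0
    return -np.log10(np.clip(R, min_value, None))

def absorbance_to_reflectance(A, percent=False):
    """Inverse mapping R = 10**(-A); assumed to rescale when percent=True."""
    R = 10.0 ** (-np.asarray(A, dtype=float))
    return R * 100.0 if percent else R

R = np.array([[0.5, 0.25, 0.1], [0.8, 0.4, 0.2]])
A = reflectance_to_absorbance(R)
R_back = absorbance_to_reflectance(A)
```

The round trip recovers the input only for values at or above `min_value`, since clamped values cannot be restored.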
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') ReflectanceToAbsorbance
- transform(X, copy=None)[source]
Convert reflectance to absorbance.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Reflectance spectra.
copy (bool or None, optional) – Whether to copy the input data.
- Returns:
X_transformed – Absorbance spectra.
- Return type:
ndarray of shape (n_samples, n_features)
- class nirs4all.operators.transforms.nirs.RollingBall(half_window: int = 50, smooth_half_window: int = None, *, copy: bool = True)[source]
Bases: _BaselineMethodAlias
Rolling Ball baseline correction.
A morphological approach that simulates rolling a ball beneath the spectrum.
- Parameters:
References
Kneen, M.A. and Annegarn, H.J. (1996). Algorithm for fitting XRF, SEM and PIXE X-ray spectra backgrounds. Nuclear Instruments and Methods in Physics Research B, 109, 209-213.
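With a flat structuring element, the rolling-ball idea is commonly expressed as a morphological opening: a sliding minimum followed by a sliding maximum. The sketch below shows only that core step in plain NumPy. The function name and the edge padding are assumptions, and the optional smoothing controlled by `smooth_half_window` is left out.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rolling_ball_baseline(y, half_window=50):
    """Morphological opening with a flat window of 2 * half_window + 1 points."""
    y = np.asarray(y, dtype=float)
    width = 2 * half_window + 1

    def windowed(values, reducer):
        # Pad with edge values so the output keeps the input length.
        padded = np.pad(values, half_window, mode="edge")
        return reducer(sliding_window_view(padded, width), axis=1)

    eroded = windowed(y, np.min)       # sliding minimum removes narrow peaks
    return windowed(eroded, np.max)    # sliding maximum restores the level

y = np.ones(100)
y[40:45] += 2.0                        # narrow peak on a flat baseline
baseline = rolling_ball_baseline(y, half_window=5)
corrected = y - baseline
```

The window must be wider than the peaks to be removed. Here the peak spans 5 points and the window 11, so the baseline stays flat underneath it.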
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') RollingBall
- class nirs4all.operators.transforms.nirs.SNIP(max_half_window: int = 40, decreasing: bool = True, smooth_half_window: int = None, *, copy: bool = True)[source]
Bases: _BaselineMethodAlias
Statistics-sensitive Non-linear Iterative Peak-clipping baseline correction.
Particularly effective for spectra with many peaks (e.g., Raman, XRF).
- Parameters:
max_half_window (int, default=40) – Maximum half-window size for the algorithm.
decreasing (bool, default=True) – Whether to use decreasing window sizes.
smooth_half_window (int or None, default=None) – Half-window for smoothing. None means no smoothing.
copy (bool, default=True) – Whether to copy input data.
References
Ryan, C.G., et al. (1988). SNIP, a statistics-sensitive background treatment for the quantitative analysis of PIXE spectra in geoscience applications. Nuclear Instruments and Methods in Physics Research B, 34(3), 396-402.
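The core peak-clipping loop of SNIP is short enough to sketch directly: for each half-window i, every point is replaced by the smaller of its own value and the mean of the two points i samples away. This is an illustration only. The function name is an assumption, and both the optional smoothing and the log-based rescaling that some SNIP variants apply before clipping are omitted.

```python
import numpy as np

def snip_baseline(y, max_half_window=40, decreasing=True):
    """Iterative peak clipping over a range of half-window sizes."""
    z = np.asarray(y, dtype=float).copy()
    n = z.size
    windows = range(1, max_half_window + 1)
    if decreasing:
        windows = reversed(windows)    # start with the widest window
    for i in windows:
        if 2 * i >= n:
            continue                   # window does not fit in the signal
        # Mean of the two points i samples to either side of each interior point.
        neighbour_mean = 0.5 * (z[:-2 * i] + z[2 * i:])
        # Clip: keep the lower of the current value and the neighbour mean.
        z[i:n - i] = np.minimum(z[i:n - i], neighbour_mean)
    return z

y = np.ones(100)
y[40:45] += 2.0                        # narrow peak on a flat baseline
baseline = snip_baseline(y, max_half_window=10)
corrected = y - baseline
```

`max_half_window` should be at least about half the width of the widest peak, otherwise part of the peak survives in the baseline.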
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') SNIP
- class nirs4all.operators.transforms.nirs.SavitzkyGolay(window_length: int = 11, polyorder: int = 3, deriv: int = 0, delta: float = 1.0, *, copy: bool = True)[source]
Bases: TransformerMixin, BaseEstimator
A class for smoothing and differentiating data using the Savitzky-Golay filter.
- Parameters:
window_length (int, default=11) – The length of the window used for smoothing.
polyorder (int, default=3) – The order of the polynomial used for fitting the samples within the window.
deriv (int, default=0) – The order of the derivative to compute.
delta (float, default=1.0) – The sampling distance of the data.
copy (bool, default=True) – Whether to copy the input data.
Methods:
fit(X, y=None) – Fits the transformer to the data X.
transform(X, copy=None) – Applies the Savitzky-Golay filter to the data X.
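To make the roles of `window_length`, `polyorder`, `deriv` and `delta` concrete, the sketch below derives Savitzky-Golay coefficients from a local least-squares polynomial fit and applies them to interior points only. It is a from-scratch illustration with hypothetical helper names, not the filter routine this class calls, and it assumes an odd `window_length`.

```python
import numpy as np
from math import factorial

def savgol_coefficients(window_length=11, polyorder=3, deriv=0, delta=1.0):
    """Filter weights that evaluate the fitted polynomial's derivative at the centre."""
    half = window_length // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    # Each row of the design matrix is [1, k, k**2, ...] for one window offset k.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row `deriv` of the pseudo-inverse yields the fitted coefficient of k**deriv.
    return np.linalg.pinv(design)[deriv] * factorial(deriv) / delta**deriv

def savgol_valid(y, **kwargs):
    """Apply the filter where the window fits entirely inside the signal."""
    coeffs = savgol_coefficients(**kwargs)
    # Reversing the weights turns the convolution into a correlation.
    return np.convolve(y, coeffs[::-1], mode="valid")

delta = 0.1
t = np.arange(50) * delta
y = t**3 - 2.0 * t
dy = savgol_valid(y, window_length=11, polyorder=3, deriv=1, delta=delta)
```

Because the signal is a cubic and `polyorder=3`, the filtered first derivative matches `3 * t**2 - 2` at the interior points `t[5:-5]`. Edge handling, which a full implementation needs, is left out.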
- fit(X, y=None)[source]
Verify the X data compliance with Savitzky-Golay filter.
- Parameters:
X (array-like) – The data to transform.
y (None) – Ignored.
- Raises:
ValueError – If the input X is a sparse matrix.
- Returns:
The fitted object.
- Return type:
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') SavitzkyGolay
- class nirs4all.operators.transforms.nirs.SecondDerivative(delta: float = 1.0, edge_order: int = 2, *, copy: bool = True)[source]
Bases: TransformerMixin, BaseEstimator
Second numerical derivative using numpy.gradient.
- Parameters:
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') SecondDerivative
Configure whether metadata should be requested to be passed to the transform method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- class nirs4all.operators.transforms.nirs.Wavelet(wavelet: str = 'haar', mode: str = 'periodization', *, copy: bool = True)[source]
Bases: TransformerMixin, BaseEstimator
Single level Discrete Wavelet Transform.
Performs a discrete wavelet transform on data, using a wavelet function.
- Parameters:
wavelet (Wavelet object or name, default='haar') – Wavelet to use: [‘Haar’, ‘Daubechies’, ‘Symlets’, ‘Coiflets’, ‘Biorthogonal’, ‘Reverse biorthogonal’, ‘Discrete Meyer (FIR Approximation)’…]
mode (str, optional, default='periodization') – Signal extension mode.
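To make the idea of a single-level transform concrete, here is a minimal NumPy-only sketch of one Haar analysis step. It is not this class's implementation: the helper name `haar_dwt` is hypothetical, an even number of features is assumed, and how the class combines or resamples the resulting coefficients is not covered.

```python
import numpy as np


def haar_dwt(spectra):
    """Single-level Haar DWT along the last axis (even length assumed)."""
    x = np.asarray(spectra, dtype=float)
    if x.shape[-1] % 2:
        raise ValueError("this sketch assumes an even number of features")
    even, odd = x[..., 0::2], x[..., 1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-pass: scaled local sums
    detail = (even - odd) / np.sqrt(2.0)  # high-pass: scaled local differences
    return approx, detail


spectra = np.array([[1.0, 3.0, 2.0, 2.0],
                    [4.0, 0.0, 1.0, 5.0]])
approx, detail = haar_dwt(spectra)
```

Each output has half as many columns as the input, and because the Haar step is orthonormal the total energy of the coefficients equals that of the input.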
- fit(X, y=None)[source]
Verify the X data compliance with wavelet transform.
- Parameters:
X (array-like, spectra) – The data to transform.
y (None) – Ignored.
- Raises:
ValueError – If the input X is a sparse matrix.
- Returns:
The fitted object.
- Return type:
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') Wavelet
Configure whether metadata should be requested to be passed to the transform method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- class nirs4all.operators.transforms.nirs.WaveletFeatures(wavelet: str = 'db4', max_level: int = 5, n_coeffs_per_level: int = 10, *, copy: bool = True)[source]
Bases: TransformerMixin, BaseEstimator
Discrete Wavelet Transform feature extractor for spectral data.
Decomposes spectra into approximation (smooth trends) and detail (sharp features) coefficients at multiple scales, then extracts statistical features from each level. This captures both global baseline variations and local absorption peaks.
- Scientific basis:
Multi-resolution analysis captures features at different scales
Daubechies wavelets (db4) are well-suited for smooth signals
Wavelet coefficients are partially decorrelated
- Parameters:
wavelet (str, default='db4') – Wavelet to use (e.g., ‘haar’, ‘db4’, ‘coif3’, ‘sym4’).
max_level (int, default=5) – Maximum decomposition level.
n_coeffs_per_level (int, default=10) – Number of top coefficients (by magnitude) to extract per level.
copy (bool, default=True) – Whether to copy input data.
- actual_level_
Actual decomposition level used (may be less than max_level depending on signal length).
- Type:
References
Mallat (1989). A theory for multiresolution signal decomposition: the wavelet representation. IEEE PAMI.
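The interplay of max_level, actual_level_ and n_coeffs_per_level can be illustrated with a small NumPy-only sketch. It makes several assumptions: a Haar filter stands in for the default 'db4', only the top-magnitude coefficients are kept (the statistical features mentioned above are left out), and all function names are hypothetical rather than part of nirs4all.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)


def haar_step(x):
    """One Haar analysis step along axis 1; an odd trailing point is dropped."""
    if x.shape[1] % 2:
        x = x[:, :-1]
    even, odd = x[:, 0::2], x[:, 1::2]
    return (even + odd) / SQRT2, (even - odd) / SQRT2


def top_k_by_magnitude(coeffs, k):
    """Per row, the k largest-magnitude coefficients, zero-padded if fewer exist."""
    k_eff = min(k, coeffs.shape[1])
    order = np.argsort(-np.abs(coeffs), axis=1)[:, :k_eff]
    top = np.take_along_axis(coeffs, order, axis=1)
    return np.pad(top, ((0, 0), (0, k - k_eff)))


def wavelet_features_sketch(spectra, max_level=5, n_coeffs_per_level=10):
    approx = np.asarray(spectra, dtype=float)
    blocks, level = [], 0
    # Stop early when the signal is too short to split again.
    while level < max_level and approx.shape[1] >= 2:
        approx, detail = haar_step(approx)
        blocks.append(top_k_by_magnitude(detail, n_coeffs_per_level))
        level += 1
    blocks.append(top_k_by_magnitude(approx, n_coeffs_per_level))
    return np.hstack(blocks), level


rng = np.random.default_rng(0)
features, actual_level = wavelet_features_sketch(
    rng.normal(size=(4, 8)), n_coeffs_per_level=3
)
```

With only 8 features the signal can be halved three times, so the level actually used is lower than the requested maximum of 5, which is the situation actual_level_ describes.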
- fit(X, y=None)[source]
Fit the wavelet feature extractor.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (None) – Ignored.
- Returns:
self – Fitted transformer.
- Return type:
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') WaveletFeatures
Configure whether metadata should be requested to be passed to the transform method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- transform(X, copy=None)[source]
Extract wavelet features from spectra.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input spectra.
copy (bool or None, optional) – Ignored (for API compatibility).
- Returns:
X_transformed – Wavelet features.
- Return type:
ndarray of shape (n_samples, n_features_out_)
- class nirs4all.operators.transforms.nirs.WaveletPCA(wavelet: str = 'db4', max_level: int = 4, n_components_per_level: int = 3, whiten: bool = True, *, copy: bool = True)[source]
Bases: TransformerMixin, BaseEstimator
Multi-scale PCA on wavelet coefficients.
Applies PCA separately to each wavelet decomposition level, creating a compact multi-scale representation where each scale contributes a few principal components. This preserves frequency-specific information while reducing dimensionality.
- Scientific basis:
Combines multi-resolution analysis with decorrelation
Each scale captures different frequency information
PCA per scale reduces redundancy within each frequency band
Results in a compact, interpretable feature set
- Parameters:
wavelet (str, default='db4') – Wavelet to use (e.g., ‘haar’, ‘db4’, ‘coif3’, ‘sym4’).
max_level (int, default=4) – Maximum decomposition level.
n_components_per_level (int, default=3) – Number of PCA components to keep per decomposition level.
whiten (bool, default=True) – Whether to whiten the PCA components.
copy (bool, default=True) – Whether to copy input data.
References
Trygg & Wold (1998). PLS regression on wavelet compressed NIR spectra.
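The per-level PCA idea can be sketched with NumPy alone. This is an illustration under stated assumptions, not the class's code: a Haar filter replaces the default 'db4', whitening is omitted, the PCA is fitted and applied on the same data in one pass (a real transformer would store the components in fit and reuse them in transform), and the helper names are hypothetical.

```python
import numpy as np


def haar_levels(spectra, max_level):
    """Return [detail_1, ..., detail_L, approx_L] from repeated Haar steps."""
    approx = np.asarray(spectra, dtype=float)
    bands = []
    for _ in range(max_level):
        if approx.shape[1] < 2:
            break
        if approx.shape[1] % 2:
            approx = approx[:, :-1]  # drop an odd trailing point
        even, odd = approx[:, 0::2], approx[:, 1::2]
        bands.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    bands.append(approx)
    return bands


def pca_scores(band, n_components):
    """PCA scores of one band via SVD of the column-centered data."""
    centered = band - band.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    k = min(n_components, s.size)
    return u[:, :k] * s[:k]


rng = np.random.default_rng(0)
spectra = rng.normal(size=(20, 64))
bands = haar_levels(spectra, max_level=4)
features = np.hstack([pca_scores(band, 3) for band in bands])
```

Four detail bands plus the final approximation each contribute three components, giving 15 features per spectrum in this example.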
- fit(X, y=None)[source]
Fit the wavelet-PCA transformer.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (None) – Ignored.
- Returns:
self – Fitted transformer.
- Return type:
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') WaveletPCA
Configure whether metadata should be requested to be passed to the transform method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- transform(X, copy=None)[source]
Transform spectra to wavelet-PCA features.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input spectra.
copy (bool or None, optional) – Ignored (for API compatibility).
- Returns:
X_transformed – Wavelet-PCA features.
- Return type:
ndarray of shape (n_samples, n_features_out_)
- class nirs4all.operators.transforms.nirs.WaveletSVD(wavelet: str = 'db4', max_level: int = 4, n_components_per_level: int = 3, *, copy: bool = True)[source]
Bases: TransformerMixin, BaseEstimator
Multi-scale SVD on wavelet coefficients.
Applies truncated SVD separately to each wavelet decomposition level, creating a compact multi-scale representation. Similar to WaveletPCA, but truncated SVD does not center the data and works better for sparse data.
- Scientific basis:
Combines multi-resolution analysis with dimensionality reduction
Each scale captures different frequency information
SVD per scale reduces redundancy within each frequency band
Results in a compact feature set
- Parameters:
References
Trygg & Wold (1998). PLS regression on wavelet compressed NIR spectra.
- fit(X, y=None)[source]
Fit the wavelet-SVD transformer.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (None) – Ignored.
- Returns:
self – Fitted transformer.
- Return type:
- set_transform_request(*, copy: bool | None | str = '$UNCHANGED$') WaveletSVD
Configure whether metadata should be requested to be passed to the transform method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- transform(X, copy=None)[source]
Transform spectra to wavelet-SVD features.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input spectra.
copy (bool or None, optional) – Ignored (for API compatibility).
- Returns:
X_transformed – Wavelet-SVD features.
- Return type:
ndarray of shape (n_samples, n_features_out_)
- nirs4all.operators.transforms.nirs.asls_baseline(spectra: ndarray, lam: float = 1000000.0, p: float = 0.01, max_iter: int = 50, tol: float = 0.001) ndarray[source]
Compute baseline using Asymmetric Least Squares Smoothing.
This is a convenience wrapper around pybaseline_correction with method='asls'.
- Parameters:
spectra (numpy.ndarray) – NIRS data matrix (n_samples, n_features).
lam (float) – Smoothness parameter (lambda). Default is 1e6.
p (float) – Asymmetry parameter (0 < p < 1). Default is 0.01.
max_iter (int) – Maximum number of iterations. Default is 50.
tol (float) – Convergence tolerance. Default is 1e-3.
- Returns:
Baseline-corrected spectra with same shape as input.
- Return type:
numpy.ndarray
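For readers who want to see how lam, p, max_iter and tol interact, the following is a small dense NumPy sketch of the asymmetric least squares iteration described by Eilers and Boelens. It is not the pybaselines implementation that this function wraps (which uses sparse solvers), the name `asls_baseline_dense` is hypothetical, and it handles a single spectrum. The sketch returns the estimated baseline and subtracts it explicitly.

```python
import numpy as np


def asls_baseline_dense(y, lam=1e6, p=0.01, max_iter=50, tol=1e-3):
    """Estimate an AsLS baseline for one spectrum using dense linear algebra."""
    y = np.asarray(y, dtype=float)
    n = y.size
    d2 = np.diff(np.eye(n), n=2, axis=0)  # (n - 2, n) second-difference operator
    penalty = lam * d2.T @ d2             # roughness penalty on the baseline
    w = np.ones(n)
    for _ in range(max_iter):
        # Weighted penalized least squares: (W + lam * D'D) z = W y
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        # Points above the baseline get the small weight p, points below get 1 - p.
        w_new = np.where(y > z, p, 1.0 - p)
        if np.linalg.norm(w_new - w) / np.linalg.norm(w) < tol:
            break
        w = w_new
    return z


x = np.linspace(0.0, 1.0, 100)
signal = 0.5 + 0.3 * x + np.exp(-0.5 * ((x - 0.5) / 0.03) ** 2)
baseline = asls_baseline_dense(signal, lam=1e5)
corrected = signal - baseline
```

A larger lam gives a stiffer baseline, while a smaller p makes the fit ignore positive peaks more strongly.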
- nirs4all.operators.transforms.nirs.first_derivative(spectra: ndarray, delta: float = 1.0, edge_order: int = 2) ndarray[source]
First numerical derivative along feature axis using central differences.
- Parameters:
spectra (numpy.ndarray) – NIRS data matrix (n_samples, n_features).
delta (float) – Sampling step along the feature axis.
edge_order (int) – 1 or 2, order of accuracy at the boundaries.
- Returns:
First derivative dX/dλ with same shape as input.
- Return type:
numpy.ndarray
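The effect of delta and edge_order can be checked on synthetic data with numpy.gradient, which uses central differences in the interior. That this function delegates to numpy.gradient is an assumption based on its parameters; the data below are made up for illustration.

```python
import numpy as np

# Two synthetic "spectra" sampled every 2 nm: a quadratic and a straight line.
wavelengths = np.arange(1000.0, 1010.0, 2.0)
spectra = np.vstack([wavelengths ** 2, 3.0 * wavelengths])

# axis=1 differentiates along the feature axis; delta=2.0 is the sampling step.
d1 = np.gradient(spectra, 2.0, axis=1, edge_order=2)
```

With edge_order=2 the one-sided boundary estimates are second-order accurate, so the derivative of the quadratic is exact at the first and last wavelength as well.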
- nirs4all.operators.transforms.nirs.log_transform(spectra: ndarray, base: float = 2.718281828459045, offset: float = 0.0, auto_offset: bool = True, min_value: float = 1e-08) ndarray[source]
Apply elementwise logarithm with automatic handling of edge cases.
- Parameters:
spectra (numpy.ndarray) – NIRS data matrix.
base (float) – Logarithm base. Default is e.
offset (float) – Fixed value added before log to handle non-positives.
auto_offset (bool) – If True, automatically add offset for problematic values.
min_value (float) – Minimum value after offset when auto_offset=True.
- Returns:
Log-transformed spectra.
- Return type:
numpy.ndarray
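One plausible reading of how offset, auto_offset and min_value combine is sketched below. The exact rule used by the library is not stated here, so the shifting logic (move the data so that its minimum equals min_value whenever non-positive values remain) is an assumption, and `log_transform_sketch` is a hypothetical name.

```python
import numpy as np


def log_transform_sketch(spectra, base=np.e, offset=0.0,
                         auto_offset=True, min_value=1e-8):
    x = np.asarray(spectra, dtype=float) + offset
    if auto_offset and x.min() <= 0:
        # Assumed behaviour: shift everything so the smallest value is min_value.
        x = x + (min_value - x.min())
    # Change of base: log_b(x) = ln(x) / ln(b)
    return np.log(x) / np.log(base)


decades = log_transform_sketch(np.array([[1.0, 10.0, 100.0]]), base=10.0)
shifted = log_transform_sketch(np.array([[-1.0, 0.0, 1.0]]))
```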
- nirs4all.operators.transforms.nirs.msc(spectra, scaled=True)[source]
Performs multiplicative scatter correction to the mean.
- Parameters:
spectra (numpy.ndarray) – NIRS data matrix.
scaled (bool) – Whether to scale the data. Defaults to True.
- Returns:
Scatter-corrected NIR spectra.
- Return type:
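Multiplicative scatter correction to the mean can be sketched in a few lines: each spectrum is regressed on the mean spectrum and the fitted offset and slope are removed. This is a textbook version under the assumption that the mean spectrum is the reference; it does not reproduce the scaled option, and `msc_sketch` is a hypothetical name.

```python
import numpy as np


def msc_sketch(spectra):
    x = np.asarray(spectra, dtype=float)
    reference = x.mean(axis=0)  # mean spectrum used as the reference
    corrected = np.empty_like(x)
    for i, row in enumerate(x):
        # Fit row ≈ intercept + slope * reference (polyfit returns slope first).
        slope, intercept = np.polyfit(reference, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected


base = np.array([0.2, 0.5, 0.9, 0.4, 0.3])
spectra = np.vstack([1.0 * base, 1.5 * base + 0.1, 0.8 * base - 0.05])
corrected = msc_sketch(spectra)
```

Because the three synthetic spectra differ only by an offset and a gain, the correction maps all of them onto the mean spectrum.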
- nirs4all.operators.transforms.nirs.pybaseline_correction(spectra: ndarray, method: str = 'asls', **kwargs) ndarray[source]
Apply baseline correction using pybaselines library.
This is a general wrapper for all pybaselines methods, allowing flexible baseline correction with various algorithms.
- Parameters:
spectra (numpy.ndarray) – NIRS data matrix (n_samples, n_features).
method (str) – Baseline correction method. Available methods:
Whittaker: ‘asls’, ‘iasls’, ‘airpls’, ‘arpls’, ‘drpls’, ‘iarpls’, ‘aspls’, ‘psalsa’, ‘derpsalsa’
Polynomial: ‘poly’, ‘modpoly’, ‘imodpoly’, ‘penalized_poly’, ‘loess’, ‘quant_reg’
Morphological: ‘mor’, ‘imor’, ‘mormol’, ‘amormol’, ‘rolling_ball’, ‘mwmv’, ‘tophat’, ‘mpspline’, ‘jbcd’
Spline: ‘mixture_model’, ‘irsqr’, ‘corner_cutting’, ‘pspline_asls’, etc.
Smooth: ‘noise_median’, ‘snip’, ‘swima’, ‘ipsa’
Classification: ‘dietrich’, ‘golotvin’, ‘std_distribution’, ‘fastchrom’, ‘cwt_br’
Optimizers: ‘collab_pls’, ‘optimize_extended_range’, ‘adaptive_minmax’
Misc: ‘interp_pts’, ‘beads’
**kwargs – Additional parameters passed to the specific baseline method.
- Returns:
Baseline-corrected spectra with same shape as input.
- Return type:
numpy.ndarray
- Raises:
ImportError – If pybaselines is not installed.
ValueError – If an unknown method is specified.
Examples
>>> from nirs4all.operators.transforms.nirs import pybaseline_correction
>>> corrected = pybaseline_correction(spectra, method='airpls', lam=1e5)
>>> corrected = pybaseline_correction(spectra, method='imodpoly', poly_order=3)
>>> corrected = pybaseline_correction(spectra, method='snip', max_half_window=30)
- nirs4all.operators.transforms.nirs.reflectance_to_absorbance(spectra: ndarray, min_value: float = 1e-08) ndarray[source]
Convert reflectance spectra to absorbance.
Applies the Beer-Lambert law: A = -log10(R) = log10(1/R) where R is reflectance and A is absorbance.
- Parameters:
spectra (numpy.ndarray) – Reflectance NIRS data matrix (n_samples, n_features). Values should be in range (0, 1] or as percentages (0, 100].
min_value (float) – Minimum value to clamp reflectance to avoid log(0). Default is 1e-8.
- Returns:
Absorbance spectra with same shape as input.
- Return type:
numpy.ndarray
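The formula A = -log10(R) with clamping can be written directly in NumPy. The sketch below assumes reflectance is given as a fraction in (0, 1]; how the library detects or rescales percentage inputs is not shown here, and `reflectance_to_absorbance_sketch` is a hypothetical name.

```python
import numpy as np


def reflectance_to_absorbance_sketch(reflectance, min_value=1e-8):
    # Clamp from below so that zero or negative readings do not produce inf or nan.
    r = np.clip(np.asarray(reflectance, dtype=float), min_value, None)
    return -np.log10(r)


absorbance = reflectance_to_absorbance_sketch(np.array([[1.0, 0.1, 0.01, 0.0]]))
```

The zero reading is clamped to 1e-8, so its absorbance is capped at 8 instead of becoming infinite.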
- nirs4all.operators.transforms.nirs.savgol(spectra: ndarray, window_length: int = 11, polyorder: int = 3, deriv: int = 0, delta: float = 1.0) ndarray[source]
Perform Savitzky–Golay filtering on the data (also calculates derivatives). This function is a wrapper for scipy.signal.savgol_filter.
- Parameters:
spectra (numpy.ndarray) – NIRS data matrix.
window_length (int) – Size of the filter window in samples (default 11).
polyorder (int) – Order of the polynomial fitted within each window (default 3).
deriv (int) – Order of the derivative to compute (default 0).
delta (float) – Sampling distance of the data.
- Returns:
NIRS data smoothed with Savitzky-Golay filtering.
- Return type:
numpy.ndarray
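To show what window_length, polyorder, deriv and delta mean, here is a NumPy-only sketch of the Savitzky-Golay idea: fit a polynomial to each window by least squares and evaluate it (or its derivative) at the window centre. It does not call scipy.signal.savgol_filter and only returns the interior points where a full window fits, whereas the SciPy routine also handles the edges; the function names are hypothetical.

```python
from math import factorial

import numpy as np


def savgol_coefficients(window_length, polyorder, deriv=0, delta=1.0):
    half = window_length // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    # Columns are offsets**0, offsets**1, ..., offsets**polyorder.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row `deriv` of the pseudo-inverse yields the fitted coefficient of
    # offsets**deriv; scaling turns it into the derivative at the centre.
    return np.linalg.pinv(design)[deriv] * factorial(deriv) / delta ** deriv


def savgol_interior(spectrum, window_length=11, polyorder=3, deriv=0, delta=1.0):
    weights = savgol_coefficients(window_length, polyorder, deriv, delta)
    # np.convolve flips its kernel, so reverse the weights to correlate.
    return np.convolve(spectrum, weights[::-1], mode="valid")


x = np.arange(30.0)
y = x ** 3 - 2.0 * x
smoothed = savgol_interior(y, window_length=11, polyorder=3)
slope = savgol_interior(y, window_length=11, polyorder=3, deriv=1)
```

A cubic signal passes through a polyorder=3 filter unchanged, and the deriv=1 output matches its analytic derivative, which makes a convenient sanity check.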
- nirs4all.operators.transforms.nirs.second_derivative(spectra: ndarray, delta: float = 1.0, edge_order: int = 2) ndarray[source]
Second numerical derivative along feature axis.
- Parameters:
spectra (numpy.ndarray) – NIRS data matrix (n_samples, n_features).
delta (float) – Sampling step along the feature axis.
edge_order (int) – 1 or 2, order of accuracy at the boundaries.
- Returns:
Second derivative d²X/dλ² with same shape as input.
- Return type:
numpy.ndarray
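A second derivative can be obtained by applying numpy.gradient twice along the feature axis. Whether the library does exactly this is an assumption; `second_derivative_sketch` is a hypothetical name and the data are synthetic.

```python
import numpy as np


def second_derivative_sketch(spectra, delta=1.0, edge_order=2):
    # Differentiate along the feature axis, then differentiate the result again.
    d1 = np.gradient(spectra, delta, axis=1, edge_order=edge_order)
    return np.gradient(d1, delta, axis=1, edge_order=edge_order)


x = np.arange(0.0, 5.0, 0.5)
spectra = np.vstack([x ** 2, 4.0 * x + 1.0])
d2 = second_derivative_sketch(spectra, delta=0.5)
```

The quadratic row yields a constant curvature of 2 and the linear row yields 0, as expected.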
- nirs4all.operators.transforms.nirs.wavelet_transform(spectra: ndarray, wavelet: str, mode: str = 'periodization') ndarray[source]
Computes the discrete wavelet transform of the spectra using PyWavelets.
- Parameters:
spectra (numpy.ndarray) – NIRS data matrix.
wavelet (str) – Name of the wavelet to use for the transformation.
mode (str) – Signal extension mode. Default is 'periodization'.
- Returns:
Wavelet-transformed and resampled spectra.
- Return type:
numpy.ndarray