nirs4all.operators.augmentation.edge_artifacts module
Edge artifacts and spectral boundary effects augmentation operators.
This module provides wavelength-aware augmentation operators that simulate edge-related artifacts commonly observed in NIR spectra, including:
- Detector sensitivity roll-off at wavelength boundaries
- Stray light effects (more pronounced at spectral edges)
- Edge curvature/bending due to optical aberrations
These artifacts often manifest as deformations at the start or end of spectra and can significantly impact model performance if not accounted for.
References
Workman, J., & Weyer, L. (2012). Practical Guide and Spectral Atlas for Interpretive Near-Infrared Spectroscopy. CRC Press.
Burns, D. A., & Ciurczak, E. W. (2007). Handbook of Near-Infrared Analysis. CRC Press. Chapter on instrumental considerations.
Chalmers, J. M., & Griffiths, P. R. (2001). Mid-Infrared Spectroscopy: Anomalies, Artifacts and Common Errors. Wiley.
JASCO (2020). Advantages of high-sensitivity InGaAs detector in UV-Vis/NIR spectrophotometer. Technical Note.
Applied Optics (1975). Resolution and stray light in near infrared spectroscopy, 14(8), 1977.
- class nirs4all.operators.augmentation.edge_artifacts.DetectorModel(name: str, optimal_range: Tuple[float, float], roll_off_rate: float, min_sensitivity: float)
Bases: object
Parameters for detector sensitivity roll-off modeling.
Detector sensitivity curves typically follow a profile where sensitivity peaks in the middle of the spectral range and rolls off at the edges, often following an exponential decay pattern.
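As a rough illustration of such a profile, the sketch below evaluates a relative sensitivity curve that is flat inside the optimal range and decays exponentially outside it. `DetectorModelSketch` and `sensitivity_curve` are hypothetical names that only mirror the documented fields; the unit of `roll_off_rate` (assumed here to be 1/nm) and the exact functional form used by the library are assumptions.

```python
from dataclasses import dataclass
from typing import Tuple

import numpy as np


@dataclass
class DetectorModelSketch:
    """Hypothetical stand-in mirroring the documented DetectorModel fields."""
    name: str
    optimal_range: Tuple[float, float]  # (low, high) in nm
    roll_off_rate: float                # assumed unit: 1/nm
    min_sensitivity: float              # lower bound on relative sensitivity


def sensitivity_curve(model, wavelengths):
    """Relative sensitivity: 1 inside optimal_range, exponential decay outside."""
    wl = np.asarray(wavelengths, dtype=float)
    low, high = model.optimal_range
    # Distance (nm) from the optimal range; zero for wavelengths inside it.
    distance = np.maximum(low - wl, 0.0) + np.maximum(wl - high, 0.0)
    sensitivity = np.exp(-model.roll_off_rate * distance)
    # Never let the curve fall below the configured floor.
    return np.maximum(sensitivity, model.min_sensitivity)


model = DetectorModelSketch("sketch_ingaas", (1000.0, 1600.0), 0.005, 0.1)
wavelengths = np.linspace(900.0, 1700.0, 9)  # 900, 1000, ..., 1700 nm
curve = sensitivity_curve(model, wavelengths)
```

With these illustrative values, only the first and last points (100 nm outside the optimal range) drop below 1.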
- class nirs4all.operators.augmentation.edge_artifacts.DetectorRollOffAugmenter(detector_model: str = 'generic_nir', effect_strength: float = 1.0, noise_amplification: float = 0.02, include_baseline_distortion: bool = True, random_state: int | None = None)
Bases: SpectraTransformerMixin
Simulate detector sensitivity roll-off at spectral edges.
NIR detectors have wavelength-dependent sensitivity curves that typically roll off at the edges of their spectral range. This causes:
- Increased noise at edge wavelengths (lower SNR)
- Apparent baseline curvature near spectral boundaries
- Reduced peak heights at the edges
The effect is modeled as an exponential decay of detector sensitivity outside the optimal wavelength range, which manifests as multiplicative noise amplification and slight baseline distortion.
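One possible reading of this model is sketched below with NumPy only. The sensitivity profile, the mapping from sensitivity to noise scale (`1 / sensitivity - 1`), and all numeric values are assumptions chosen for illustration; they are not taken from the library's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic input: 4 flat spectra sampled every 10 nm from 900 to 1700 nm.
wavelengths = np.linspace(900.0, 1700.0, 81)
X = np.full((4, wavelengths.size), 0.5)

# Assumed sensitivity: 1 inside 1000-1600 nm, exponential decay outside,
# clipped at a floor so the noise scale stays bounded.
low, high, rate, floor = 1000.0, 1600.0, 0.005, 0.1
distance = np.maximum(low - wavelengths, 0.0) + np.maximum(wavelengths - high, 0.0)
sensitivity = np.maximum(np.exp(-rate * distance), floor)

noise_amplification = 0.02
effect_strength = 1.0

# Noise scale grows as sensitivity drops and is zero where sensitivity == 1.
noise_scale = effect_strength * noise_amplification * (1.0 / sensitivity - 1.0)

# Multiplicative noise: each point is perturbed relative to its own value.
X_aug = X * (1.0 + rng.normal(size=X.shape) * noise_scale)
```

In this sketch the spectra are unchanged inside the optimal range and become progressively noisier towards both edges.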
- Parameters:
detector_model (str, default="generic_nir") – Detector type to simulate. Available models:
- "ingaas_standard": Standard InGaAs (1000-1600 nm optimal)
- "ingaas_extended": Extended InGaAs (1100-2200 nm optimal)
- "pbs": Lead sulfide (1000-2800 nm optimal)
- "silicon_ccd": Silicon CCD (400-900 nm optimal)
- "generic_nir": Generic NIR detector
effect_strength (float, default=1.0) – Scaling factor for the roll-off effect (0-2).
noise_amplification (float, default=0.02) – Additional noise added at low-sensitivity wavelengths.
include_baseline_distortion (bool, default=True) – Whether to include slight baseline distortion at edges.
random_state (int, optional) – Random seed for reproducibility.
Examples
>>> from nirs4all.operators.augmentation import DetectorRollOffAugmenter
>>> aug = DetectorRollOffAugmenter(detector_model="ingaas_standard")
>>> X_aug = aug.transform(X, wavelengths=wavelengths)
>>> # Stronger effect for portable spectrometers
>>> aug = DetectorRollOffAugmenter(effect_strength=1.5)
>>> pipeline = [aug, SNV(), PLSRegression(10)]
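The examples above assume that `X` and `wavelengths` already exist. A minimal way to build stand-ins with the documented shapes, `(n_samples, n_features)` and `(n_features,)`, is sketched below; the band positions, widths, and noise level are arbitrary illustrative choices, not real measurements.

```python
import numpy as np

rng = np.random.default_rng(42)

n_samples, n_features = 20, 256

# Wavelength axis in nm, shape (n_features,).
wavelengths = np.linspace(1000.0, 2500.0, n_features)

# Two Gaussian absorption bands (arbitrary centers and widths, in nm).
centers = np.array([1450.0, 1940.0])
widths = np.array([60.0, 80.0])
bands = np.exp(
    -0.5 * ((wavelengths[None, :] - centers[:, None]) / widths[:, None]) ** 2
)  # shape (2, n_features)

# Random band heights per sample plus a small amount of measurement noise.
heights = rng.uniform(0.2, 1.0, size=(n_samples, 2))
X = heights @ bands + rng.normal(scale=0.002, size=(n_samples, n_features))
```

Arrays built this way can be passed as `X` and `wavelengths` in the calls shown in the examples.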
References
JASCO (2020). Advantages of high-sensitivity InGaAs detector.
LaserComponents InGaAs Photodiodes specifications.
- set_transform_request(*, wavelengths: bool | None | str = '$UNCHANGED$') → DetectorRollOffAugmenter
Configure whether metadata should be requested to be passed to the transform method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to transform.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- transform_with_wavelengths(X: ndarray, wavelengths: ndarray) → ndarray
Apply detector roll-off effects to spectra.
- Parameters:
X (ndarray of shape (n_samples, n_features)) – Input spectra.
wavelengths (ndarray of shape (n_features,)) – Wavelength array in nm.
- Returns:
X_transformed – Spectra with detector roll-off effects applied.
- Return type:
ndarray of shape (n_samples, n_features)
- class nirs4all.operators.augmentation.edge_artifacts.EdgeArtifactsAugmenter(detector_roll_off: bool = True, stray_light: bool = True, edge_curvature: bool = True, truncated_peaks: bool = True, overall_strength: float = 1.0, detector_model: str = 'generic_nir', random_state: int | None = None)
Bases: SpectraTransformerMixin
Combined augmenter for edge-related spectral artifacts.
This is a convenience class that combines multiple edge artifact effects:
- Detector roll-off
- Stray light
- Edge curvature
- Truncated peaks
Each effect can be individually enabled/disabled.
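The general pattern of chaining individually switchable effects under one shared strength can be sketched as follows. The two toy effects, the `apply_enabled` helper, and the order of application are hypothetical and stand in for the real effects only to show the composition logic.

```python
import numpy as np


def tilt_edges(X, wavelengths, strength):
    """Toy effect: add a parabolic baseline that grows towards both edges."""
    u = np.linspace(-1.0, 1.0, wavelengths.size)
    return X + 0.01 * strength * u**2


def damp_signal(X, wavelengths, strength):
    """Toy effect: slightly compress the whole signal."""
    return X * (1.0 - 0.05 * strength)


def apply_enabled(X, wavelengths, flags, strength=1.0):
    """Apply each effect whose flag is True, in a fixed order."""
    effects = {"tilt_edges": tilt_edges, "damp_signal": damp_signal}
    out = np.array(X, dtype=float)  # copy so the input is left untouched
    for name, effect in effects.items():
        if flags.get(name, False):
            out = effect(out, wavelengths, strength)
    return out


X = np.ones((2, 5))
wl = np.linspace(1000.0, 2000.0, 5)
unchanged = apply_enabled(X, wl, {})
damped = apply_enabled(X, wl, {"damp_signal": True}, strength=0.8)
```

With every flag off the input passes through unchanged, which matches the intent of being able to disable each effect independently.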
- Parameters:
detector_roll_off (bool, default=True) – Enable detector sensitivity roll-off effect.
stray_light (bool, default=True) – Enable stray light effect.
edge_curvature (bool, default=True) – Enable edge curvature/bending effect.
truncated_peaks (bool, default=True) – Enable truncated peak effect at boundaries.
overall_strength (float, default=1.0) – Scaling factor for all effects (0-2).
detector_model (str, default="generic_nir") – Detector model for roll-off simulation.
random_state (int, optional) – Random seed for reproducibility.
Examples
>>> from nirs4all.operators.augmentation import EdgeArtifactsAugmenter
>>> aug = EdgeArtifactsAugmenter(overall_strength=0.8)
>>> X_aug = aug.transform(X, wavelengths=wavelengths)
>>> # Only detector and stray light effects
>>> aug = EdgeArtifactsAugmenter(
...     detector_roll_off=True,
...     stray_light=True,
...     edge_curvature=False,
...     truncated_peaks=False
... )
>>> pipeline = [aug, SNV(), PLSRegression(10)]
- set_transform_request(*, wavelengths: bool | None | str = '$UNCHANGED$') → EdgeArtifactsAugmenter
Configure whether metadata should be requested to be passed to the transform method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to transform.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- transform_with_wavelengths(X: ndarray, wavelengths: ndarray) → ndarray
Apply all enabled edge artifact effects to spectra.
- Parameters:
X (ndarray of shape (n_samples, n_features)) – Input spectra.
wavelengths (ndarray of shape (n_features,)) – Wavelength array in nm.
- Returns:
X_transformed – Spectra with edge artifacts applied.
- Return type:
ndarray of shape (n_samples, n_features)
- class nirs4all.operators.augmentation.edge_artifacts.EdgeCurvatureAugmenter(curvature_strength: float = 0.02, curvature_type: str = 'random', asymmetry: float = 0.0, edge_focus: float = 0.7, random_state: int | None = None)
Bases: SpectraTransformerMixin
Simulate edge curvature and baseline bending at spectral boundaries.
Edge curvature can arise from various sources:
- Optical aberrations in the spectrometer
- Wavelength-dependent baseline drift
- Polynomial baseline correction artifacts
- Sample holder effects
This operator adds smooth curvature that increases towards the spectral edges, mimicking the characteristic “smile” or “frown” patterns often seen in real spectra.
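A curvature term of this kind can be sketched as a power-law profile on a normalized wavelength axis. The `edge_curvature` function and the mapping from `edge_focus` to an exponent are assumptions made for illustration; the library may use a different profile.

```python
import numpy as np


def edge_curvature(n_features, curvature_strength=0.02,
                   curvature_type="smile", edge_focus=0.7):
    """Return an additive baseline that is zero at the center and largest at the edges."""
    # Normalized axis: -1 at the left edge, 0 at the center, +1 at the right edge.
    u = np.linspace(-1.0, 1.0, n_features)
    # Assumed mapping: a higher edge_focus gives a higher exponent, which
    # concentrates the curvature closer to the edges.
    exponent = 2.0 + 6.0 * edge_focus
    profile = np.abs(u) ** exponent  # 0 at the center, 1 at both edges
    # "smile" bends the edges upward, anything else is treated as "frown".
    sign = 1.0 if curvature_type == "smile" else -1.0
    return sign * curvature_strength * profile


smile = edge_curvature(101, curvature_type="smile")
frown = edge_curvature(101, curvature_type="frown")
```

Adding `smile` to a spectrum raises both ends by `curvature_strength` while leaving the middle essentially untouched; `frown` is its mirror image.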
- Parameters:
curvature_strength (float, default=0.02) – Maximum curvature amplitude (in absorbance units).
curvature_type (str, default="random") – Type of curvature pattern:
- "random": Randomly choose smile/frown/asymmetric
- "smile": Upward curvature at edges (convex)
- "frown": Downward curvature at edges (concave)
- "asymmetric": Different curvature at each edge
asymmetry (float, default=0.0) – For “asymmetric” type, ratio of left/right curvature (-1 to 1). Positive values emphasize left edge, negative emphasize right.
edge_focus (float, default=0.7) – How concentrated the curvature is at edges (0-1). Higher values create sharper edge effects.
random_state (int, optional) – Random seed for reproducibility.
Examples
>>> from nirs4all.operators.augmentation import EdgeCurvatureAugmenter
>>> aug = EdgeCurvatureAugmenter(curvature_strength=0.03)
>>> X_aug = aug.transform(X, wavelengths=wavelengths)
>>> # Simulate baseline correction artifacts
>>> aug = EdgeCurvatureAugmenter(
...     curvature_type="asymmetric",
...     asymmetry=0.5,
...     edge_focus=0.8
... )
>>> pipeline = [aug, Detrend(), PLSRegression(10)]
References
Cao, A., et al. (2007). A robust method for automated background subtraction of tissue fluorescence. Journal of Raman Spectroscopy.
NIRPY Research (2019). Two methods for baseline correction of spectral data.
- set_transform_request(*, wavelengths: bool | None | str = '$UNCHANGED$') → EdgeCurvatureAugmenter
Configure whether metadata should be requested to be passed to the transform method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to transform.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- transform_with_wavelengths(X: ndarray, wavelengths: ndarray) → ndarray
Apply edge curvature effects to spectra.
- Parameters:
X (ndarray of shape (n_samples, n_features)) – Input spectra.
wavelengths (ndarray of shape (n_features,)) – Wavelength array in nm.
- Returns:
X_transformed – Spectra with edge curvature applied.
- Return type:
ndarray of shape (n_samples, n_features)
- class nirs4all.operators.augmentation.edge_artifacts.StrayLightAugmenter(stray_light_fraction: float = 0.001, edge_enhancement: float = 2.0, edge_width: float = 0.1, include_peak_truncation: bool = True, random_state: int | None = None)
Bases: SpectraTransformerMixin
Simulate stray light effects on NIR spectra.
Stray light is unwanted radiation that reaches the detector without passing through the intended optical path. Its effects are most pronounced:
- At high-absorbance wavelengths (peaks appear truncated)
- At spectral edges where instrument sensitivity is lower
- Near the limits of the detector’s wavelength range
The primary effect is a reduction in observed peak height, causing apparent negative deviations from Beer’s law. This is particularly problematic at the edges of spectra where stray light often constitutes a larger fraction of the total signal.
- Parameters:
stray_light_fraction (float, default=0.001) – Base stray light as fraction of total signal (0.001 = 0.1%). Typical values: 0.0001-0.01 depending on instrument quality.
edge_enhancement (float, default=2.0) – Factor by which stray light increases at spectral edges.
edge_width (float, default=0.1) – Fraction of spectral range considered “edge” (0-0.5).
include_peak_truncation (bool, default=True) – Whether to simulate peak height reduction at high absorbance.
random_state (int, optional) – Random seed for reproducibility.
Examples
>>> from nirs4all.operators.augmentation import StrayLightAugmenter
>>> aug = StrayLightAugmenter(stray_light_fraction=0.005)
>>> X_aug = aug.transform(X, wavelengths=wavelengths)
>>> # High stray light (older/portable instruments)
>>> aug = StrayLightAugmenter(stray_light_fraction=0.01, edge_enhancement=3.0)
>>> pipeline = [aug, MSC(), PLSRegression(10)]
Notes
The observed transmittance with stray light is:
T_obs = (T_true + s) / (1 + s)
where s is the stray light fraction. This causes:
- At high absorbance (low T_true): T_obs ≈ s, creating a floor on the observed transmittance
- At low absorbance (high T_true): minimal effect
Converting to absorbance gives, whenever T_true < 1:
A_obs = -log10(T_obs) < A_true
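The formula above can be checked numerically with a few lines of NumPy. `apply_stray_light` is a hypothetical helper that applies a single, wavelength-independent fraction `s`; the edge enhancement described in the parameters is deliberately left out.

```python
import numpy as np


def apply_stray_light(absorbance, s):
    """Convert absorbance to transmittance, add stray light, convert back."""
    A_true = np.asarray(absorbance, dtype=float)
    T_true = 10.0 ** (-A_true)
    T_obs = (T_true + s) / (1.0 + s)
    return -np.log10(T_obs)


A_true = np.array([0.1, 1.0, 2.0, 3.0])
A_obs = apply_stray_light(A_true, s=0.001)

# Deviation from the true absorbance; it grows with A_true.
deviation = A_true - A_obs
```

For s = 0.001, a true absorbance of 3.0 is observed as roughly 2.70, and the observed absorbance can never exceed log10((1 + s) / s) ≈ 3.0, which is the peak truncation described above.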
References
Applied Optics (1975). Resolution and stray light in near infrared spectroscopy, 14(8), 1977.
Chalmers & Griffiths (2001). Mid-Infrared Spectroscopy: Anomalies, Artifacts and Common Errors.
- set_transform_request(*, wavelengths: bool | None | str = '$UNCHANGED$') → StrayLightAugmenter
Configure whether metadata should be requested to be passed to the transform method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to transform.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- transform_with_wavelengths(X: ndarray, wavelengths: ndarray) → ndarray
Apply stray light effects to spectra.
- Parameters:
X (ndarray of shape (n_samples, n_features)) – Input spectra (in absorbance units).
wavelengths (ndarray of shape (n_features,)) – Wavelength array in nm.
- Returns:
X_transformed – Spectra with stray light effects applied.
- Return type:
ndarray of shape (n_samples, n_features)
- class nirs4all.operators.augmentation.edge_artifacts.TruncatedPeakAugmenter(peak_probability: float = 0.3, amplitude_range: Tuple[float, float] = (0.01, 0.1), width_range: Tuple[float, float] = (50, 200), left_edge: bool = True, right_edge: bool = True, random_state: int | None = None)
Bases: SpectraTransformerMixin
Simulate truncated absorption peaks at spectral boundaries.
When measuring NIR spectra, absorption bands that have their centers outside the measured wavelength range will appear as partial peaks at the spectral edges. This creates characteristic rising or falling baselines at the spectrum boundaries.
This effect is common when:
- The spectrometer range doesn’t cover the full absorption band
- Strong absorbers (e.g., water) have peaks just outside the range
- Mid-IR absorption bands tail into the NIR region
- Parameters:
peak_probability (float, default=0.3) – Probability of adding truncated peaks (0-1).
amplitude_range (tuple of (float, float), default=(0.01, 0.1)) – Range of peak amplitudes (in absorbance units).
width_range (tuple of (float, float), default=(50, 200)) – Range of peak widths (in nm). Controls how fast the edge rises/falls.
left_edge (bool, default=True) – Whether to potentially add truncated peak at left (low wavelength) edge.
right_edge (bool, default=True) – Whether to potentially add truncated peak at right (high wavelength) edge.
random_state (int, optional) – Random seed for reproducibility.
Examples
>>> from nirs4all.operators.augmentation import TruncatedPeakAugmenter
>>> aug = TruncatedPeakAugmenter(peak_probability=0.5)
>>> X_aug = aug.transform(X, wavelengths=wavelengths)
>>> # Strong truncated peaks (e.g., water band edge)
>>> aug = TruncatedPeakAugmenter(
...     amplitude_range=(0.05, 0.2),
...     width_range=(100, 300)
... )
>>> pipeline = [aug, SNV(), PLSRegression(10)]
Notes
The truncated peak is modeled as a Gaussian band with its center positioned outside the measured wavelength range. Only the “tail” of this band appears in the spectrum.
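A minimal sketch of such a tail is given below. The `truncated_peak` helper, the `offset` parameter (how far outside the range the band center sits, in units of the width), and the choice to treat the width as the Gaussian standard deviation are all assumptions for illustration.

```python
import numpy as np


def truncated_peak(wavelengths, amplitude, width, edge="left", offset=0.5):
    """Gaussian band centered outside the measured range; only its tail is returned."""
    wl = np.asarray(wavelengths, dtype=float)
    # Place the band center just beyond the chosen edge of the measured range.
    if edge == "left":
        center = wl.min() - offset * width
    else:
        center = wl.max() + offset * width
    return amplitude * np.exp(-0.5 * ((wl - center) / width) ** 2)


wavelengths = np.linspace(1000.0, 2500.0, 301)
tail = truncated_peak(wavelengths, amplitude=0.05, width=100.0, edge="left")
```

Adding `tail` to a spectrum produces a baseline that is highest at the low-wavelength edge and falls off smoothly into the measured range, without ever reaching the full band amplitude.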
- set_transform_request(*, wavelengths: bool | None | str = '$UNCHANGED$') → TruncatedPeakAugmenter
Configure whether metadata should be requested to be passed to the transform method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to transform.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- transform_with_wavelengths(X: ndarray, wavelengths: ndarray) → ndarray
Apply truncated peak effects to spectra.
- Parameters:
X (ndarray of shape (n_samples, n_features)) – Input spectra.
wavelengths (ndarray of shape (n_features,)) – Wavelength array in nm.
- Returns:
X_transformed – Spectra with truncated peaks at edges.
- Return type:
ndarray of shape (n_samples, n_features)