nirs4all.operators.augmentation.environmental module

Environmental effects augmentation operators for spectral data.

This module provides wavelength-aware augmentation operators that simulate environmental effects on NIR spectra, including temperature-induced changes and moisture/water activity effects.

These operators inherit from SpectraTransformerMixin and automatically receive wavelength information from the dataset when used in nirs4all pipelines.

References

  • Maeda, H., Ozaki, Y., et al. (1995). Near infrared spectroscopy and chemometrics studies of temperature-dependent spectral variations of water. Journal of Near Infrared Spectroscopy, 3(4), 191-201.

  • Segtnan, V. H., et al. (2001). Studies on the structure of water using two-dimensional near-infrared correlation spectroscopy. Analytical Chemistry, 73(13), 3153-3161.

  • Luck, W. A. P. (1998). The importance of cooperativity for the properties of liquid water. Journal of Molecular Structure, 448(2-3), 131-142.

class nirs4all.operators.augmentation.environmental.MoistureAugmenter(water_activity_delta: float = 0.1, water_activity_range: Tuple[float, float] | None = None, reference_water_activity: float = 0.5, free_water_fraction: float = 0.3, bound_water_shift: float = 25.0, moisture_content: float = 0.1, enable_shift: bool = True, enable_intensity: bool = True, random_state: int | None = None)

Bases: SpectraTransformerMixin

Simulate moisture-induced spectral changes for data augmentation.

Water activity and moisture content affect NIR spectra through shifts in water bands between free and bound states. Higher water activity leads to more free water, while lower water activity means more water is hydrogen-bonded to the sample matrix.
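One way to picture the free/bound balance is a two-component toy band. The sketch below is illustrative only and is not this operator's implementation: it assumes Gaussian band shapes and a hypothetical 30 nm band width, and it reuses the first-overtone peak positions that the class exposes as FREE_WATER_PEAK_1ST and BOUND_WATER_PEAK_1ST. The helper names gaussian and toy_water_band are invented for the example.

```python
import numpy as np

FREE_WATER_PEAK_1ST = 1410.0   # nm, value of the class constant
BOUND_WATER_PEAK_1ST = 1460.0  # nm, value of the class constant


def gaussian(wl, center, width):
    """Unit-height Gaussian band centered at `center` (nm)."""
    return np.exp(-0.5 * ((wl - center) / width) ** 2)


def toy_water_band(wl, free_fraction, width=30.0):
    """Hypothetical two-state band: weighted sum of free and bound water.

    `width` is an assumed placeholder, not a literature value.
    """
    free = gaussian(wl, FREE_WATER_PEAK_1ST, width)
    bound = gaussian(wl, BOUND_WATER_PEAK_1ST, width)
    return free_fraction * free + (1.0 - free_fraction) * bound


wavelengths = np.linspace(1300.0, 1600.0, 601)  # 0.5 nm grid

# Mostly bound water (low water activity) vs. mostly free water (high).
mostly_bound = toy_water_band(wavelengths, free_fraction=0.1)
mostly_free = toy_water_band(wavelengths, free_fraction=0.9)

peak_bound = wavelengths[np.argmax(mostly_bound)]
peak_free = wavelengths[np.argmax(mostly_free)]
```

In this toy model, raising the free-water fraction moves the apparent band maximum toward the shorter-wavelength free-water position, which is the kind of shift the description above refers to.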

Parameters:
  • water_activity_delta (float, default=0.1) – Change in water activity from reference (0-1 scale).

  • water_activity_range (tuple of (float, float), optional) – If provided, randomly sample water_activity_delta from this range for each sample.

  • reference_water_activity (float, default=0.5) – Reference water activity for the input spectra.

  • free_water_fraction (float, default=0.3) – Base fraction of water that is “free” vs. bound (0-1).

  • bound_water_shift (float, default=25.0) – Wavelength shift (nm) for bound water relative to free water.

  • moisture_content (float, default=0.1) – Base moisture content as a fraction (affects intensity).

  • enable_shift (bool, default=True) – Apply water band position shifts.

  • enable_intensity (bool, default=True) – Apply water band intensity changes based on moisture content.

  • random_state (int, optional) – Random seed for reproducibility.

_requires_wavelengths

Always True - this operator requires wavelength information.

Type:

bool

Examples

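>>> import numpy as np
>>> from sklearn.cross_decomposition import PLSRegression
>>> from nirs4all.operators.augmentation import MoistureAugmenter
>>> wavelengths = np.linspace(1100, 2500, 700)  # synthetic wavelength axis
>>> X = np.random.default_rng(0).random((20, 700))  # placeholder spectra
>>> aug = MoistureAugmenter(water_activity_delta=0.2)
>>> X_aug = aug.transform(X, wavelengths=wavelengths)
>>> # Random moisture variation in pipeline
>>> aug = MoistureAugmenter(water_activity_range=(-0.2, 0.2))
>>> pipeline = [aug, PLSRegression(10)]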

References

  • Büning-Pfaue, H. (2003). Analysis of water in food by near infrared spectroscopy. Food Chemistry, 82(1), 107-115.

  • Luck, W. A. P. (1998). The importance of cooperativity for the properties of liquid water. Journal of Molecular Structure, 448(2-3), 131-142.

BOUND_WATER_PEAK_1ST = 1460
BOUND_WATER_PEAK_COMB = 1940
FREE_WATER_PEAK_1ST = 1410
FREE_WATER_PEAK_COMB = 1920
set_transform_request(*, wavelengths: bool | None | str = '$UNCHANGED$') → MoistureAugmenter

Configure whether metadata should be requested to be passed to the transform method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to transform.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

wavelengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for wavelengths parameter in transform.

Returns:

self – The updated object.

Return type:

object

transform_with_wavelengths(X: ndarray, wavelengths: ndarray) → ndarray

Apply moisture effects to spectra.

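Parameters:
  • X (ndarray of shape (n_samples, n_features)) – Input spectra.

  • wavelengths (ndarray of shape (n_features,)) – Wavelength axis matching the columns of X.

Returns: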

X_transformed – Spectra with moisture effects applied.

Return type:

ndarray of shape (n_samples, n_features)

class nirs4all.operators.augmentation.environmental.TemperatureAugmenter(temperature_delta: float = 5.0, temperature_range: Tuple[float, float] | None = None, reference_temperature: float = 25.0, enable_shift: bool = True, enable_intensity: bool = True, enable_broadening: bool = True, region_specific: bool = True, random_state: int | None = None)

Bases: SpectraTransformerMixin

Simulate temperature-induced spectral changes for data augmentation.

Temperature affects NIR spectra through:

  • Peak position shifts (especially O-H, N-H bands)

  • Intensity changes (hydrogen bonding disruption)

  • Band broadening (thermal motion)

This operator applies region-specific temperature effects based on literature values for NIR spectroscopy.
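The three effects can be illustrated on a single synthetic band. The following is a minimal sketch, not the operator's implementation: the function toy_temperature_effect and all three per-degree coefficients are hypothetical placeholders chosen for readability, not the literature values the operator uses, and a uniform wavelength grid is assumed.

```python
import numpy as np


def toy_temperature_effect(spectrum, wavelengths, delta_t,
                           shift_per_deg=-0.3,       # nm per °C (assumed)
                           intensity_per_deg=-0.002,  # relative per °C (assumed)
                           broadening_per_deg=0.1):   # nm per °C (assumed)
    """Apply a shift, an intensity scaling, and a broadening to one spectrum."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    out = np.asarray(spectrum, dtype=float)

    # 1. Peak shift: resample so that features move by shift_nm.
    shift_nm = shift_per_deg * delta_t
    out = np.interp(wavelengths - shift_nm, wavelengths, out)

    # 2. Intensity change: simple multiplicative scaling.
    out = out * (1.0 + intensity_per_deg * delta_t)

    # 3. Broadening: convolve with a normalized Gaussian kernel.
    sigma_nm = broadening_per_deg * abs(delta_t)
    if sigma_nm > 0:
        step = wavelengths[1] - wavelengths[0]  # uniform grid assumed
        sigma_pts = sigma_nm / step
        half = int(np.ceil(4 * sigma_pts))
        offsets = np.arange(-half, half + 1)
        kernel = np.exp(-0.5 * (offsets / sigma_pts) ** 2)
        kernel /= kernel.sum()
        padded = np.pad(out, half, mode="edge")  # avoid edge roll-off
        out = np.convolve(padded, kernel, mode="valid")
    return out


wavelengths = np.linspace(1300.0, 1600.0, 601)  # 0.5 nm grid
band = np.exp(-0.5 * ((wavelengths - 1450.0) / 25.0) ** 2)
heated = toy_temperature_effect(band, wavelengths, delta_t=10.0)
peak_heated = wavelengths[np.argmax(heated)]
```

With these placeholder coefficients, a +10 °C change moves the band maximum 3 nm toward shorter wavelengths and slightly lowers and widens the band.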

Parameters:
  • temperature_delta (float, default=5.0) – Temperature change from reference (°C). Positive = heating.

  • temperature_range (tuple of (float, float), optional) – If provided, randomly sample temperature_delta from this range for each sample. Overrides the temperature_delta parameter.

  • reference_temperature (float, default=25.0) – Reference temperature for the input spectra (°C).

  • enable_shift (bool, default=True) – Apply peak position shifts.

  • enable_intensity (bool, default=True) – Apply intensity changes.

  • enable_broadening (bool, default=True) – Apply band broadening.

  • region_specific (bool, default=True) – Apply region-specific effects (recommended). If False, applies uniform average effects across all wavelengths.

  • random_state (int, optional) – Random seed for reproducibility.
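How temperature_delta, temperature_range, and random_state relate can be sketched as follows. This is a minimal illustration that assumes uniform sampling within the range; sample_deltas is a hypothetical helper, and the distribution the operator actually uses is not stated in this documentation.

```python
import numpy as np


def sample_deltas(n_samples, delta=5.0, delta_range=None, random_state=None):
    """Return one temperature delta per sample (hypothetical helper)."""
    if delta_range is None:
        # No range given: every sample gets the same fixed delta.
        return np.full(n_samples, float(delta))
    low, high = delta_range
    # A range overrides the fixed delta; the seed makes draws repeatable.
    rng = np.random.default_rng(random_state)
    return rng.uniform(low, high, size=n_samples)


fixed = sample_deltas(4, delta=5.0)
varied = sample_deltas(4, delta_range=(-5, 10), random_state=0)
repeat = sample_deltas(4, delta_range=(-5, 10), random_state=0)
```

Here fixed holds the same value for every sample, while varied and repeat hold identical per-sample draws because they share a seed.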

_requires_wavelengths

Always True - this operator requires wavelength information.

Type:

bool

Examples

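>>> import numpy as np
>>> from sklearn.cross_decomposition import PLSRegression
>>> from nirs4all.operators.augmentation import TemperatureAugmenter
>>> wavelengths = np.linspace(1100, 2500, 700)  # synthetic wavelength axis
>>> X = np.random.default_rng(0).random((20, 700))  # placeholder spectra
>>> aug = TemperatureAugmenter(temperature_delta=10.0)
>>> X_aug = aug.transform(X, wavelengths=wavelengths)
>>> # Random temperature variation in pipeline
>>> aug = TemperatureAugmenter(temperature_range=(-5, 10))
>>> pipeline = [aug, PLSRegression(10)]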

References

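  • Maeda, H., Ozaki, Y., et al. (1995). Near infrared spectroscopy and chemometrics studies of temperature-dependent spectral variations of water. Journal of Near Infrared Spectroscopy, 3(4), 191-201.

  • Segtnan, V. H., et al. (2001). Studies on the structure of water using two-dimensional near-infrared correlation spectroscopy. Analytical Chemistry, 73(13), 3153-3161.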

set_transform_request(*, wavelengths: bool | None | str = '$UNCHANGED$') → TemperatureAugmenter

Configure whether metadata should be requested to be passed to the transform method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to transform.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

wavelengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for wavelengths parameter in transform.

Returns:

self – The updated object.

Return type:

object

transform_with_wavelengths(X: ndarray, wavelengths: ndarray) → ndarray

Apply temperature effects to spectra.

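Parameters:
  • X (ndarray of shape (n_samples, n_features)) – Input spectra.

  • wavelengths (ndarray of shape (n_features,)) – Wavelength axis matching the columns of X.

Returns: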

X_transformed – Spectra with temperature effects applied.

Return type:

ndarray of shape (n_samples, n_features)