nirs4all.operators.augmentation.synthesis module
Synthesis-derived augmentation operators for spectral data.
This module provides augmentation operators extracted from the synthetic NIRS spectra generator’s effect chain. These operators simulate realistic instrumental and physical effects that occur during NIR measurements.
- Operators:
PathLengthAugmenter: Optical path length variation (multiplicative scaling)
BatchEffectAugmenter: Batch/session measurement effects (offset + gain)
InstrumentalBroadeningAugmenter: Instrumental spectral broadening (Gaussian convolution)
HeteroscedasticNoiseAugmenter: Signal-dependent detector noise
DeadBandAugmenter: Dead spectral bands (detector saturation/failure)
References
Burns, D. A., & Ciurczak, E. W. (2007). Handbook of Near-Infrared Analysis. CRC Press.
Workman Jr, J., & Weyer, L. (2012). Practical Guide and Spectral Atlas for Interpretive Near-Infrared Spectroscopy. CRC Press.
- class nirs4all.operators.augmentation.synthesis.BatchEffectAugmenter(offset_std=0.02, slope_std=0.01, gain_std=0.03, random_state=None, variation_scope='sample')
Bases: SpectraTransformerMixin
Simulates batch/session effects in spectroscopic measurements.
Applies a wavelength-dependent additive offset and a uniform multiplicative gain to simulate variations between measurement sessions or instruments.
The offset consists of a constant term plus a wavelength-dependent slope. The gain is a uniform multiplicative factor.
- Parameters:
offset_std (float, default=0.02) – Standard deviation of the constant offset term.
slope_std (float, default=0.01) – Standard deviation of the wavelength-dependent slope.
gain_std (float, default=0.03) – Standard deviation of the multiplicative gain (centered at 1.0).
random_state (int or None, default=None) – Random seed for reproducibility.
variation_scope (str, default="sample") – Scope of variation: “sample” for per-sample effects, “batch” for a single effect applied to all samples.
Examples
>>> from nirs4all.operators.augmentation.synthesis import BatchEffectAugmenter
>>> aug = BatchEffectAugmenter(offset_std=0.03, gain_std=0.05)
>>> X_aug = aug.fit_transform(X, wavelengths=wavelengths)
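The offset, slope, and gain model described for this class can be sketched in plain NumPy. This is an illustration only, not the library's implementation: the function name `batch_effect_sketch`, its `seed` argument, the rescaling of wavelengths to [-1, 1], and the order of operations (gain first, then baseline) are all assumptions made here, since the documentation does not specify them.

```python
import numpy as np

def batch_effect_sketch(X, wavelengths, offset_std=0.02, slope_std=0.01,
                        gain_std=0.03, variation_scope="sample", seed=None):
    """Illustrative offset + slope + gain model (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    wl = np.asarray(wavelengths, dtype=float)
    # Assumption: the slope acts on wavelengths rescaled to [-1, 1].
    wl_norm = 2.0 * (wl - wl.min()) / (wl.max() - wl.min()) - 1.0
    # One draw per sample, or a single draw shared by the whole batch.
    n_draws = X.shape[0] if variation_scope == "sample" else 1
    offset = rng.normal(0.0, offset_std, size=(n_draws, 1))
    slope = rng.normal(0.0, slope_std, size=(n_draws, 1))
    gain = rng.normal(1.0, gain_std, size=(n_draws, 1))
    # Assumption: the gain is applied first, then the additive baseline.
    return gain * X + offset + slope * wl_norm
```

With `variation_scope="batch"`, every row receives the same offset, slope, and gain, which mimics one measurement session shifting all spectra together.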
- class nirs4all.operators.augmentation.synthesis.DeadBandAugmenter(n_bands=1, width_range=(10, 30), noise_std=0.05, probability=1.0, random_state=None, variation_scope='sample')
Bases: TransformerMixin, BaseEstimator
Simulates dead spectral bands (detector saturation/failure regions).
Zeroes out random wavelength regions and adds noise, simulating detector dead bands or saturation artifacts.
- Parameters:
n_bands (int, default=1) – Number of dead bands to introduce per sample.
width_range (tuple of (int, int), default=(10, 30)) – Range for the width (in wavelength points) of each dead band.
noise_std (float, default=0.05) – Standard deviation of the noise injected into dead band regions.
probability (float, default=1.0) – Probability that a dead band is applied to a given sample.
random_state (int or None, default=None) – Random seed for reproducibility.
variation_scope (str, default="sample") – Scope of variation: “sample” for per-sample dead bands, “batch” for the same dead bands across all samples.
Examples
>>> from nirs4all.operators.augmentation.synthesis import DeadBandAugmenter
>>> aug = DeadBandAugmenter(n_bands=2, width_range=(15, 40))
>>> X_aug = aug.fit_transform(X)
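The "zero out a region, then add noise" behaviour can be illustrated with a short NumPy sketch. It is a rough approximation under stated assumptions, not the library's code: `dead_band_sketch` and `seed` are hypothetical names, the width bounds are treated as inclusive, band positions are drawn uniformly, and only the per-sample scope is shown.

```python
import numpy as np

def dead_band_sketch(X, n_bands=1, width_range=(10, 30), noise_std=0.05,
                     probability=1.0, seed=None):
    """Illustrative per-sample dead bands (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X_out = np.array(X, dtype=float)  # copy so the input stays untouched
    n_samples, n_points = X_out.shape
    for i in range(n_samples):
        if rng.random() >= probability:
            continue  # this sample keeps its original spectrum
        for _ in range(n_bands):
            # Assumption: width bounds are inclusive and capped at n_points.
            width = int(rng.integers(width_range[0], width_range[1] + 1))
            width = min(width, n_points)
            start = int(rng.integers(0, n_points - width + 1))
            # Zeroing the band and adding zero-mean noise is the same as
            # overwriting it with the noise itself.
            X_out[i, start:start + width] = rng.normal(0.0, noise_std, size=width)
    return X_out
```

Setting `noise_std=0.0` in this sketch leaves the bands exactly zero, which makes the affected regions easy to inspect.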
- class nirs4all.operators.augmentation.synthesis.HeteroscedasticNoiseAugmenter(noise_base=0.001, noise_signal_dep=0.005, random_state=None, variation_scope='sample')
Bases: TransformerMixin, BaseEstimator
Simulates signal-dependent (heteroscedastic) detector noise.
The noise standard deviation grows linearly with signal magnitude, giving a simple model of shot noise and detector-limited measurements.
- The noise standard deviation at each point is computed as:
sigma = noise_base + noise_signal_dep * |X|
- Parameters:
noise_base (float, default=0.001) – Base noise standard deviation (signal-independent component).
noise_signal_dep (float, default=0.005) – Signal-dependent noise coefficient.
random_state (int or None, default=None) – Random seed for reproducibility.
variation_scope (str, default="sample") – Scope of variation: “sample” for independent noise per sample, “batch” for a shared noise pattern across all samples.
Examples
>>> from nirs4all.operators.augmentation.synthesis import HeteroscedasticNoiseAugmenter
>>> aug = HeteroscedasticNoiseAugmenter(noise_base=0.002, noise_signal_dep=0.01)
>>> X_aug = aug.fit_transform(X)
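The formula sigma = noise_base + noise_signal_dep * |X| translates directly into NumPy. The sketch below assumes zero-mean Gaussian noise drawn independently at every point, which the documentation does not state explicitly. `heteroscedastic_noise_sketch` and `seed` are hypothetical names used only for illustration.

```python
import numpy as np

def heteroscedastic_noise_sketch(X, noise_base=0.001, noise_signal_dep=0.005,
                                 seed=None):
    """Illustrative signal-dependent noise (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Per-point standard deviation, as given in the class description.
    sigma = noise_base + noise_signal_dep * np.abs(X)
    # Assumption: zero-mean Gaussian noise, independent at every point.
    return X + sigma * rng.standard_normal(X.shape)
```

With the default parameters, a point with absorbance 0.1 gets sigma = 0.001 + 0.005 * 0.1 = 0.0015, while a point with absorbance 2.0 gets sigma = 0.001 + 0.005 * 2.0 = 0.011, so strongly absorbing regions are noticeably noisier.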
- class nirs4all.operators.augmentation.synthesis.InstrumentalBroadeningAugmenter(fwhm=3.0, fwhm_range=None, random_state=None, variation_scope='sample')
Bases: SpectraTransformerMixin
Simulates instrumental spectral broadening.
Applies Gaussian convolution to simulate the finite spectral resolution of the instrument. FWHM is converted to sigma for the Gaussian kernel.
- The relationship between FWHM and Gaussian sigma is:
sigma = FWHM / (2 * sqrt(2 * ln(2)))
- Parameters:
fwhm (float, default=3.0) – Full Width at Half Maximum in wavelength units (e.g., nm).
fwhm_range (tuple of (float, float) or None, default=None) – If provided, randomly sample FWHM from this range instead of using the fixed fwhm value.
random_state (int or None, default=None) – Random seed for reproducibility.
variation_scope (str, default="sample") – Scope of variation when fwhm_range is used: “sample” for per-sample FWHM, “batch” for a single FWHM for all samples.
Examples
>>> from nirs4all.operators.augmentation.synthesis import InstrumentalBroadeningAugmenter
>>> aug = InstrumentalBroadeningAugmenter(fwhm=5.0)
>>> X_aug = aug.fit_transform(X, wavelengths=wavelengths)
>>> # Variable broadening
>>> aug = InstrumentalBroadeningAugmenter(fwhm_range=(2.0, 6.0))
>>> X_aug = aug.fit_transform(X, wavelengths=wavelengths)
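The FWHM-to-sigma conversion and the Gaussian convolution can be sketched as follows. This is a minimal illustration, not the library's implementation: `fwhm_to_sigma` and `broaden_sketch` are hypothetical helpers, the wavelength grid is assumed to be uniformly spaced, the kernel is truncated at four sigma, and edges are handled by repeating the boundary values.

```python
import math
import numpy as np

def fwhm_to_sigma(fwhm):
    """sigma = FWHM / (2 * sqrt(2 * ln 2)), in the same units as fwhm."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def broaden_sketch(X, wavelengths, fwhm=3.0):
    """Illustrative Gaussian broadening (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    wl = np.asarray(wavelengths, dtype=float)
    # Assumption: uniform grid, so sigma converts to points via the mean step.
    step = abs(float(np.mean(np.diff(wl))))
    sigma_pts = fwhm_to_sigma(fwhm) / step
    # Assumption: truncate the kernel at 4 sigma on each side.
    half = max(1, int(math.ceil(4.0 * sigma_pts)))
    t = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (t / sigma_pts) ** 2)
    kernel /= kernel.sum()  # unit area, so a flat spectrum stays flat
    # Assumption: pad each spectrum with its edge values before convolving.
    X_pad = np.pad(X, ((0, 0), (half, half)), mode="edge")
    return np.stack([np.convolve(row, kernel, mode="valid") for row in X_pad])
```

With the default `fwhm=3.0`, `fwhm_to_sigma` gives a sigma of about 1.27 in the same wavelength units.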
- class nirs4all.operators.augmentation.synthesis.PathLengthAugmenter(path_length_std=0.05, min_path_length=0.5, random_state=None, variation_scope='sample')
Bases: TransformerMixin, BaseEstimator
Simulates optical path length variation.
Multiplicatively scales spectra to simulate variations in optical path length due to sample positioning, particle size effects, etc.
The path length factor L is drawn from a normal distribution centered at 1.0, then clipped to a minimum value to prevent sign inversion.
- Parameters:
path_length_std (float, default=0.05) – Standard deviation of the path length factor distribution.
min_path_length (float, default=0.5) – Minimum allowed path length factor.
random_state (int or None, default=None) – Random seed for reproducibility.
variation_scope (str, default="sample") – Scope of variation: “sample” for per-sample variation, “batch” for a single factor applied to all samples.
Examples
>>> from nirs4all.operators.augmentation.synthesis import PathLengthAugmenter
>>> aug = PathLengthAugmenter(path_length_std=0.1)
>>> X_aug = aug.fit_transform(X)
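The draw-then-clip scaling described for this class can be illustrated in a few lines of NumPy. The sketch assumes the factor is clipped only from below, as the description suggests. `path_length_sketch` and `seed` are hypothetical names, not part of nirs4all.

```python
import numpy as np

def path_length_sketch(X, path_length_std=0.05, min_path_length=0.5,
                       variation_scope="sample", seed=None):
    """Illustrative multiplicative path length scaling (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # One factor per sample, or a single factor shared by the whole batch.
    n_draws = X.shape[0] if variation_scope == "sample" else 1
    L = rng.normal(1.0, path_length_std, size=(n_draws, 1))
    # Clip from below so a rare large negative draw cannot flip the spectrum.
    L = np.maximum(L, min_path_length)
    return L * X
```

Each row is multiplied by a single scalar, so band positions and relative band ratios are preserved while overall intensity changes.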