Standard Normal Variate (SNV) Transformation
Overview
Standard Normal Variate (SNV) is a scatter correction technique commonly used in Near-Infrared Spectroscopy (NIRS) and other spectroscopic applications. It normalizes each spectrum (sample) individually to remove multiplicative scatter effects.
Implementation
The StandardNormalVariate class in nirs4all provides a flexible implementation that can work in two modes:
1. Row-wise SNV (Default - Spectroscopy Standard)
By default, SNV operates row-wise (axis=1), which is the standard approach in spectroscopy. Each spectrum (row) is centered and scaled independently:
from nirs4all.operators.transforms import StandardNormalVariate
import numpy as np
# Example spectral data (3 samples, 5 wavelengths)
X = np.array([[1, 2, 3, 4, 5],
[10, 20, 30, 40, 50],
[100, 200, 300, 400, 500]], dtype=float)
# Apply SNV (row-wise by default)
snv = StandardNormalVariate()
X_transformed = snv.fit_transform(X)
# Each row now has mean≈0 and std≈1
Formula (per sample):
SNV(x) = (x - mean(x)) / std(x)
2. Column-wise SNV (Like StandardScaler)
You can also apply SNV column-wise (axis=0), which makes it equivalent to sklearn’s StandardScaler:
# Apply SNV column-wise
snv_colwise = StandardNormalVariate(axis=0)
X_transformed = snv_colwise.fit_transform(X)
# Each column now has mean≈0 and std≈1
Parameters
axis (int, default=1): Axis along which to compute mean and standard deviation
axis=1: Row-wise (default, standard SNV for spectroscopy)axis=0: Column-wise (equivalent to StandardScaler)
with_mean (bool, default=True): If True, center the data before scaling
with_std (bool, default=True): If True, scale the data to unit variance
ddof (int, default=0): Delta Degrees of Freedom for standard deviation calculation
copy (bool, default=True): If False, try to avoid a copy and do inplace scaling
Use Cases
Row-wise SNV (axis=1) - Spectroscopy
Purpose: Remove multiplicative scatter effects from individual spectra
When to use: Standard preprocessing for NIRS, Raman, and other spectroscopic data
Effect: Each spectrum is normalized independently, removing baseline shifts and scaling differences
Column-wise SNV (axis=0) - Feature Scaling
Purpose: Standardize features across samples
When to use: When you want to normalize features (wavelengths) rather than samples
Effect: Equivalent to sklearn’s StandardScaler
Examples
Example 1: Basic SNV for Spectroscopy
from nirs4all.operators.transforms import StandardNormalVariate
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
# Create preprocessing pipeline
pipeline = Pipeline([
('snv', StandardNormalVariate()), # Row-wise SNV
('pca', PCA(n_components=10))
])
# Apply to spectral data
X_processed = pipeline.fit_transform(X_spectra)
Example 2: SNV with Other Preprocessing
from nirs4all.operators.transforms import (
StandardNormalVariate,
SavitzkyGolay,
MultiplicativeScatterCorrection
)
from sklearn.pipeline import Pipeline
# Compare different scatter correction methods
pipelines = {
'snv': Pipeline([('snv', StandardNormalVariate())]),
'msc': Pipeline([('msc', MultiplicativeScatterCorrection())]),
'snv+savgol': Pipeline([
('snv', StandardNormalVariate()),
('savgol', SavitzkyGolay())
])
}
Example 3: Column-wise Standardization
# If you need column-wise standardization (like StandardScaler)
snv_colwise = StandardNormalVariate(axis=0)
X_scaled = snv_colwise.fit_transform(X)
Technical Details
Mathematical Formulation
For each sample (row) when axis=1:
x_i,transformed = (x_i - μ_i) / σ_i
Where:
x_iis the i-th sample (spectrum)μ_iis the mean of the i-th sampleσ_iis the standard deviation of the i-th sample
Handling Edge Cases
Zero standard deviation: If a sample has zero standard deviation (all values are the same), the std is set to 1.0 to avoid division by zero
Sparse matrices: Not supported (will raise TypeError)
Comparison with StandardScaler
Feature |
StandardNormalVariate (axis=1) |
StandardNormalVariate (axis=0) |
sklearn.StandardScaler |
|---|---|---|---|
Default behavior |
Row-wise (per sample) |
Column-wise (per feature) |
Column-wise (per feature) |
Typical use case |
Spectroscopy |
Feature scaling |
Feature scaling |
Fits parameters |
No (stateless) |
No (stateless) |
Yes (stores mean/std) |
Memory in pipeline |
Minimal |
Minimal |
Stores statistics |
Migration from Previous Implementation
If you were using the old alias (StandardScaler as StandardNormalVariate), the behavior has changed:
Old behavior (column-wise):
# This used to be sklearn's StandardScaler (column-wise)
snv = StandardNormalVariate()
New behavior (row-wise - proper SNV):
# Now it's true row-wise SNV by default
snv = StandardNormalVariate() # Row-wise by default
# To get the old behavior (column-wise):
snv = StandardNormalVariate(axis=0)
References
Barnes, R. J., Dhanoa, M. S., & Lister, S. J. (1989). Standard normal variate transformation and de-trending of near-infrared diffuse reflectance spectra. Applied Spectroscopy, 43(5), 772-777.
Rinnan, Å., van den Berg, F., & Engelsen, S. B. (2009). Review of the most common pre-processing techniques for near-infrared spectra. TrAC Trends in Analytical Chemistry, 28(10), 1201-1222.
See Also
MultiplicativeScatterCorrection: Another scatter correction methodSavitzkyGolay: Smoothing and derivative calculationsklearn.preprocessing.StandardScaler: Column-wise standardization