# Standard Normal Variate (SNV) Transformation

## Overview

Standard Normal Variate (SNV) is a scatter correction technique commonly used in Near-Infrared Spectroscopy (NIRS) and other spectroscopic applications. It normalizes each spectrum (sample) individually to remove multiplicative scatter effects.

## Implementation

The `StandardNormalVariate` class in `nirs4all` provides a flexible implementation that can work in two modes:

### 1. Row-wise SNV (Default - Spectroscopy Standard)

By default, SNV operates row-wise (axis=1), which is the standard approach in spectroscopy. Each spectrum (row) is centered and scaled independently:

```python
from nirs4all.operators.transforms import StandardNormalVariate
import numpy as np

# Example spectral data (3 samples, 5 wavelengths)
X = np.array([[1, 2, 3, 4, 5],
              [10, 20, 30, 40, 50],
              [100, 200, 300, 400, 500]], dtype=float)

# Apply SNV (row-wise by default)
snv = StandardNormalVariate()
X_transformed = snv.fit_transform(X)

# Each row now has mean≈0 and std≈1
```

**Formula (per sample):**
```
SNV(x) = (x - mean(x)) / std(x)
```

### 2. Column-wise SNV (Like StandardScaler)

You can also apply SNV column-wise (axis=0), which makes it equivalent to sklearn's StandardScaler:

```python
# Apply SNV column-wise
snv_colwise = StandardNormalVariate(axis=0)
X_transformed = snv_colwise.fit_transform(X)

# Each column now has mean≈0 and std≈1
```

## Parameters

- **axis** (int, default=1): Axis along which to compute mean and standard deviation
  - `axis=1`: Row-wise (default, standard SNV for spectroscopy)
  - `axis=0`: Column-wise (equivalent to StandardScaler)

- **with_mean** (bool, default=True): If True, center the data before scaling

- **with_std** (bool, default=True): If True, scale the data to unit variance

- **ddof** (int, default=0): Delta Degrees of Freedom for standard deviation calculation

- **copy** (bool, default=True): If False, try to avoid a copy and do inplace scaling

## Use Cases

### Row-wise SNV (axis=1) - Spectroscopy
- **Purpose**: Remove multiplicative scatter effects from individual spectra
- **When to use**: Standard preprocessing for NIRS, Raman, and other spectroscopic data
- **Effect**: Each spectrum is normalized independently, removing baseline shifts and scaling differences

### Column-wise SNV (axis=0) - Feature Scaling
- **Purpose**: Standardize features across samples
- **When to use**: When you want to normalize features (wavelengths) rather than samples
- **Effect**: Equivalent to sklearn's StandardScaler

## Examples

### Example 1: Basic SNV for Spectroscopy

```python
from nirs4all.operators.transforms import StandardNormalVariate
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA

# Create preprocessing pipeline
pipeline = Pipeline([
    ('snv', StandardNormalVariate()),  # Row-wise SNV
    ('pca', PCA(n_components=10))
])

# Apply to spectral data
X_processed = pipeline.fit_transform(X_spectra)
```

### Example 2: SNV with Other Preprocessing

```python
from nirs4all.operators.transforms import (
    StandardNormalVariate,
    SavitzkyGolay,
    MultiplicativeScatterCorrection
)
from sklearn.pipeline import Pipeline

# Compare different scatter correction methods
pipelines = {
    'snv': Pipeline([('snv', StandardNormalVariate())]),
    'msc': Pipeline([('msc', MultiplicativeScatterCorrection())]),
    'snv+savgol': Pipeline([
        ('snv', StandardNormalVariate()),
        ('savgol', SavitzkyGolay())
    ])
}
```

### Example 3: Column-wise Standardization

```python
# If you need column-wise standardization (like StandardScaler)
snv_colwise = StandardNormalVariate(axis=0)
X_scaled = snv_colwise.fit_transform(X)
```

## Technical Details

### Mathematical Formulation

For each sample (row) when axis=1:
```
x_i,transformed = (x_i - μ_i) / σ_i
```

Where:
- `x_i` is the i-th sample (spectrum)
- `μ_i` is the mean of the i-th sample
- `σ_i` is the standard deviation of the i-th sample

### Handling Edge Cases

- **Zero standard deviation**: If a sample has zero standard deviation (all values are the same), the std is set to 1.0 to avoid division by zero
- **Sparse matrices**: Not supported (will raise TypeError)

## Comparison with StandardScaler

| Feature | StandardNormalVariate (axis=1) | StandardNormalVariate (axis=0) | sklearn.StandardScaler |
|---------|-------------------------------|-------------------------------|------------------------|
| Default behavior | Row-wise (per sample) | Column-wise (per feature) | Column-wise (per feature) |
| Typical use case | Spectroscopy | Feature scaling | Feature scaling |
| Fits parameters | No (stateless) | No (stateless) | Yes (stores mean/std) |
| Memory in pipeline | Minimal | Minimal | Stores statistics |

## Migration from Previous Implementation

If you were using the old alias (`StandardScaler` as `StandardNormalVariate`), the behavior has changed:

**Old behavior (column-wise):**
```python
# This used to be sklearn's StandardScaler (column-wise)
snv = StandardNormalVariate()
```

**New behavior (row-wise - proper SNV):**
```python
# Now it's true row-wise SNV by default
snv = StandardNormalVariate()  # Row-wise by default

# To get the old behavior (column-wise):
snv = StandardNormalVariate(axis=0)
```

## References

- Barnes, R. J., Dhanoa, M. S., & Lister, S. J. (1989). Standard normal variate transformation and de-trending of near-infrared diffuse reflectance spectra. *Applied Spectroscopy*, 43(5), 772-777.
- Rinnan, Å., van den Berg, F., & Engelsen, S. B. (2009). Review of the most common pre-processing techniques for near-infrared spectra. *TrAC Trends in Analytical Chemistry*, 28(10), 1201-1222.

## See Also

- `MultiplicativeScatterCorrection`: Another scatter correction method
- `SavitzkyGolay`: Smoothing and derivative calculation
- `sklearn.preprocessing.StandardScaler`: Column-wise standardization