nirs4all.data.synthetic.accelerated module
GPU-accelerated generation for synthetic NIRS data.
This module provides optional GPU acceleration for generating large synthetic datasets using JAX or CuPy, and falls back to NumPy when neither is available.
- Phase 4 Features:
Automatic backend detection (JAX, CuPy, NumPy)
Batch spectrum generation on GPU
Significant speedup for large datasets (10x+)
Graceful fallback to CPU when GPU unavailable
Note
This module is optional. GPU acceleration requires additional dependencies (jax[cuda] or cupy-cuda*).
- class nirs4all.data.synthetic.accelerated.AcceleratedArrays(backend: AcceleratorBackend, zeros: Callable, ones: Callable, arange: Callable, linspace: Callable, array: Callable, exp: Callable, log: Callable, sqrt: Callable, sin: Callable, cos: Callable, sum: Callable, dot: Callable, matmul: Callable, random_normal: Callable, random_uniform: Callable, to_numpy: Callable)[source]
Bases: object
Container for accelerated array operations.
- backend: AcceleratorBackend
- class nirs4all.data.synthetic.accelerated.AcceleratedGenerator(backend: AcceleratorBackend | None = None, random_state: int | None = None)[source]
Bases: object
GPU-accelerated synthetic spectrum generator.
This class provides a high-level interface for generating large batches of synthetic spectra using GPU acceleration when available.
- Parameters:
backend – Backend to use (auto-detect if None).
random_state – Random state for reproducibility.
Example
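>>> import numpy as np
>>> from nirs4all.data.synthetic.accelerated import AcceleratedGenerator
>>> gen = AcceleratedGenerator(random_state=42)
>>> print(f"Using backend: {gen.backend}")
>>>
>>> # Generate 10000 spectra
>>> # E and C are user-supplied arrays: E is (n_components, 700), C is (10000, n_components)
>>> X = gen.generate_batch(
...     n_samples=10000,
...     wavelengths=np.linspace(1000, 2500, 700),
...     component_spectra=E,
...     concentrations=C,
... )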
- generate_batch(n_samples: int, wavelengths: ndarray, component_spectra: ndarray, concentrations: ndarray, noise_level: float = 0.01) ndarray[source]
Generate a batch of spectra.
- Parameters:
n_samples – Number of samples.
wavelengths – Wavelength array.
component_spectra – Component spectra (n_components, n_wavelengths).
concentrations – Concentrations (n_samples, n_components).
noise_level – Noise level (default 0.01).
- Returns:
Generated spectra (n_samples, n_wavelengths).
- generate_voigt_profiles(wavelengths: ndarray, centers: ndarray, amplitudes: ndarray, sigmas: ndarray, gammas: ndarray) ndarray[source]
Generate Voigt profiles for component spectra.
- Parameters:
wavelengths – Wavelength array.
centers – Band centers.
amplitudes – Band amplitudes.
sigmas – Gaussian widths.
gammas – Lorentzian widths.
- Returns:
Spectrum array.
- class nirs4all.data.synthetic.accelerated.AcceleratorBackend(value)[source]
Available acceleration backends.
- CUPY = 'cupy'
- JAX = 'jax'
- NUMPY = 'numpy'
- nirs4all.data.synthetic.accelerated.benchmark_backends(n_samples: int = 1000, n_wavelengths: int = 700, n_components: int = 5, n_trials: int = 5) Dict[str, float][source]
Benchmark available backends.
- Parameters:
n_samples – Number of samples to generate.
n_wavelengths – Number of wavelengths.
n_components – Number of components.
n_trials – Number of timing trials.
- Returns:
Dictionary of backend name to mean time in seconds.
Example
>>> results = benchmark_backends()
>>> for backend, time in results.items():
...     print(f"{backend}: {time:.4f}s")
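The timing loop behind a benchmark like this can be sketched in a few lines. This is only an illustration of averaging over `n_trials`, not the module's implementation: `mean_runtime` is a hypothetical helper, and the workload is a plain NumPy matrix product with the default benchmark shapes.

```python
import time

import numpy as np


def mean_runtime(fn, n_trials=5):
    """Mean wall-clock seconds over n_trials calls, after one warm-up call."""
    fn()  # warm-up: keeps one-off costs (JIT compilation, allocation) out of the mean
    timings = []
    for _ in range(n_trials):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return sum(timings) / n_trials


rng = np.random.default_rng(0)
C = rng.random((1000, 5))   # (n_samples, n_components)
E = rng.random((5, 700))    # (n_components, n_wavelengths)

# Only the NumPy workload is timed here; a GPU backend would add its own entry.
results = {"numpy": mean_runtime(lambda: C @ E)}
for name, seconds in results.items():
    print(f"{name}: {seconds:.4f}s")
```

When timing a GPU backend by hand, the timed callable should also wait for the device to finish (for example `block_until_ready()` on a JAX array), since those backends dispatch work asynchronously.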
- nirs4all.data.synthetic.accelerated.create_accelerated_arrays(backend: AcceleratorBackend | None = None, seed: int = 0) AcceleratedArrays[source]
Create accelerated array operations for the specified backend.
- Parameters:
backend – Backend to use (auto-detect if None).
seed – Random seed.
- Returns:
AcceleratedArrays with operations for the backend.
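To illustrate the idea of a container of backend-specific callables, here is a minimal NumPy-only sketch. `ArrayOps` and `make_numpy_ops` are hypothetical names, only a few of the fields are reproduced, and the `random_normal(shape)` call signature is an assumption rather than the documented one.

```python
from dataclasses import dataclass
from typing import Callable

import numpy as np


@dataclass(frozen=True)
class ArrayOps:
    """Hypothetical, trimmed-down stand-in for a container of array operations."""
    backend: str
    zeros: Callable
    matmul: Callable
    random_normal: Callable
    to_numpy: Callable


def make_numpy_ops(seed=0):
    """Bind the NumPy implementations; the seed fixes the random stream."""
    rng = np.random.default_rng(seed)
    return ArrayOps(
        backend="numpy",
        zeros=np.zeros,
        matmul=np.matmul,
        # Assumed signature: shape in, standard-normal samples out.
        random_normal=lambda shape: rng.standard_normal(shape),
        to_numpy=np.asarray,
    )


ops = make_numpy_ops(seed=0)
noise = ops.random_normal((2, 3))
out = ops.to_numpy(ops.matmul(ops.zeros((4, 2)), noise))
print(out.shape)  # (4, 3)
```

Code written against such a container does not need to know which library is behind it; swapping the backend means building the container with different callables.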
- nirs4all.data.synthetic.accelerated.detect_best_backend() AcceleratorBackend[source]
Detect the best available acceleration backend.
- Returns:
AcceleratorBackend enum indicating best available option.
Example
>>> backend = detect_best_backend()
>>> print(f"Using backend: {backend}")
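One plausible way to implement this kind of detection is to test which packages are importable. The sketch below is an assumption, not the library's logic: `pick_backend` and `PREFERENCE` are hypothetical, the preference order simply follows the order in which the backends are listed in the module description, and it checks only that a package is installed, not that a GPU is actually usable.

```python
from importlib.util import find_spec

# Assumed preference order (JAX, then CuPy, then NumPy); the real order may differ.
PREFERENCE = ("jax", "cupy", "numpy")


def pick_backend(preference=PREFERENCE, is_available=None):
    """Return the first available backend name, falling back to 'numpy'."""
    if is_available is None:
        def is_available(name):
            # find_spec returns None when a top-level package is not installed.
            return find_spec(name) is not None
    for name in preference:
        if is_available(name):
            return name
    return "numpy"  # CPU fallback


print(pick_backend())
```

The `is_available` hook exists only so the selection logic can be exercised without installing any GPU package.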
- nirs4all.data.synthetic.accelerated.generate_spectra_batch_accelerated(n_samples: int, wavelengths: ndarray, component_spectra: ndarray, concentrations: ndarray, noise_level: float = 0.01, arrays: AcceleratedArrays | None = None) ndarray[source]
Generate batch of spectra using GPU acceleration.
- Parameters:
n_samples – Number of samples to generate.
wavelengths – Wavelength array.
component_spectra – Pure component spectra (n_components, n_wavelengths).
concentrations – Concentration matrix (n_samples, n_components).
noise_level – Noise level as fraction of signal.
arrays – Accelerated arrays (optional; defaults to None).
- Returns:
Generated spectra (n_samples, n_wavelengths).
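The documented shapes are consistent with a linear mixing model, where each spectrum is a concentration-weighted sum of component spectra plus noise. The NumPy sketch below works under that assumption; `mix_spectra` is a hypothetical name, and the noise model (Gaussian, scaled to a fraction of the mean absolute signal) is a guess at what "fraction of signal" means, not the library's documented behavior.

```python
import numpy as np


def mix_spectra(component_spectra, concentrations, noise_level=0.01, seed=0):
    """Linear mixing: (n_samples, n_components) @ (n_components, n_wavelengths)."""
    E = np.asarray(component_spectra, dtype=float)
    C = np.asarray(concentrations, dtype=float)
    if C.shape[1] != E.shape[0]:
        raise ValueError("concentrations and component_spectra disagree on n_components")
    clean = C @ E
    # Assumption: additive Gaussian noise with std = noise_level * mean |signal|.
    rng = np.random.default_rng(seed)
    scale = noise_level * np.abs(clean).mean()
    return clean + rng.normal(0.0, scale, size=clean.shape)


rng = np.random.default_rng(42)
E = rng.random((5, 700))     # 5 components, 700 wavelengths
C = rng.random((1000, 5))    # 1000 samples
X = mix_spectra(E, C)
print(X.shape)  # (1000, 700)
```

A single matrix product over the whole batch is the kind of operation that benefits from a GPU backend, which is presumably why the batch is generated at once rather than sample by sample.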
- nirs4all.data.synthetic.accelerated.generate_voigt_profiles_accelerated(wavelengths: ndarray, centers: ndarray, amplitudes: ndarray, sigmas: ndarray, gammas: ndarray, arrays: AcceleratedArrays | None = None) ndarray[source]
Generate Voigt profiles using GPU acceleration.
Uses Pseudo-Voigt approximation for efficiency.
- Parameters:
wavelengths – Wavelength array (n_wavelengths,).
centers – Band centers (n_bands,).
amplitudes – Band amplitudes (n_bands,).
sigmas – Gaussian widths (n_bands,).
gammas – Lorentzian widths (n_bands,).
arrays – Accelerated arrays (auto-create if None).
- Returns:
Spectrum array (n_wavelengths,).
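For readers unfamiliar with the pseudo-Voigt approximation, the sketch below shows one common form: a weighted sum of a Gaussian and a Lorentzian sharing one width, with the width and mixing weight taken from the widely used Thompson-Cox-Hastings formulas. Whether the library uses this exact variant is not stated here; `pseudo_voigt_sum` is a hypothetical name, and treating `amplitudes` as peak heights is an assumption.

```python
import numpy as np


def pseudo_voigt_sum(wavelengths, centers, amplitudes, sigmas, gammas):
    """Sum of peak-normalized pseudo-Voigt bands, shape (n_wavelengths,)."""
    x = np.asarray(wavelengths, dtype=float)[None, :]   # (1, n_wavelengths)
    c = np.asarray(centers, dtype=float)[:, None]       # (n_bands, 1)
    a = np.asarray(amplitudes, dtype=float)[:, None]
    # Full widths at half maximum of the Gaussian and Lorentzian parts.
    f_g = 2.0 * np.sqrt(2.0 * np.log(2.0)) * np.asarray(sigmas, dtype=float)[:, None]
    f_l = 2.0 * np.asarray(gammas, dtype=float)[:, None]
    # Combined width and mixing weight (Thompson-Cox-Hastings approximation).
    f = (f_g**5 + 2.69269 * f_g**4 * f_l + 2.42843 * f_g**3 * f_l**2
         + 4.47163 * f_g**2 * f_l**3 + 0.07842 * f_g * f_l**4 + f_l**5) ** 0.2
    r = f_l / f
    eta = 1.36603 * r - 0.47719 * r**2 + 0.11116 * r**3
    d2 = (x - c) ** 2
    gauss = np.exp(-4.0 * np.log(2.0) * d2 / f**2)
    lorentz = 1.0 / (1.0 + 4.0 * d2 / f**2)
    # Assumption: each amplitude is the peak height of its band.
    bands = a * (eta * lorentz + (1.0 - eta) * gauss)
    return bands.sum(axis=0)


wl = np.linspace(1000, 2500, 700)
spectrum = pseudo_voigt_sum(wl, [1450, 1940], [1.0, 0.6], [20, 25], [10, 15])
print(spectrum.shape)  # (700,)
```

The approximation avoids evaluating the exact Voigt convolution and uses only elementwise operations that broadcast over bands and wavelengths, which maps well onto GPU array libraries.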
- nirs4all.data.synthetic.accelerated.get_acceleration_speedup_estimate(n_samples: int) float[source]
Estimate speedup from GPU acceleration.
- Parameters:
n_samples – Number of samples to generate.
- Returns:
Estimated speedup factor (1.0 for CPU).