nirs4all.data.synthetic.accelerated module

GPU-accelerated generation for synthetic NIRS data.

This module provides optional GPU acceleration for generating large synthetic datasets. It uses JAX or CuPy when available and falls back to NumPy otherwise.

Phase 4 Features:

Automatic backend detection (JAX, CuPy, NumPy)
Batch spectrum generation on GPU
Significant speedup for large datasets (10x+)
Graceful fallback to CPU when GPU unavailable

Note

This module is optional. GPU acceleration requires additional dependencies (jax[cuda] or cupy-cuda*).

class nirs4all.data.synthetic.accelerated.AcceleratedArrays(backend: AcceleratorBackend, zeros: Callable, ones: Callable, arange: Callable, linspace: Callable, array: Callable, exp: Callable, log: Callable, sqrt: Callable, sin: Callable, cos: Callable, sum: Callable, dot: Callable, matmul: Callable, random_normal: Callable, random_uniform: Callable, to_numpy: Callable)[source]

Bases: object

Container for accelerated array operations.

arange: Callable

array: Callable

backend: AcceleratorBackend

cos: Callable

dot: Callable

exp: Callable

linspace: Callable

log: Callable

matmul: Callable

ones: Callable

random_normal: Callable

random_uniform: Callable

sin: Callable

sqrt: Callable

sum: Callable

to_numpy: Callable

zeros: Callable
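The container pattern above can be illustrated with a reduced sketch. It is not the library's implementation: `MiniArrays` and `make_numpy_arrays` are hypothetical names, only a handful of the documented fields are reproduced, and the call signatures of the stored callables (for example `random_normal(shape)`) are assumptions. The idea is that generation code calls the stored callables and never imports a specific array library directly.

```python
from dataclasses import dataclass
from typing import Callable

import numpy as np


@dataclass(frozen=True)
class MiniArrays:
    """Hypothetical, reduced stand-in for AcceleratedArrays (NumPy only)."""

    backend: str
    zeros: Callable
    exp: Callable
    matmul: Callable
    random_normal: Callable
    to_numpy: Callable


def make_numpy_arrays(seed: int = 0) -> MiniArrays:
    """Bind NumPy functions and a seeded generator into one namespace."""
    rng = np.random.default_rng(seed)
    return MiniArrays(
        backend="numpy",
        zeros=np.zeros,
        exp=np.exp,
        matmul=np.matmul,
        # Assumed signature: random_normal(shape) -> standard normal samples.
        random_normal=lambda shape: rng.standard_normal(shape),
        to_numpy=np.asarray,
    )


# Backend-agnostic code only touches the container's callables.
xp = make_numpy_arrays(seed=0)
noise = xp.random_normal((3, 4))
ones = xp.exp(xp.zeros((2, 3)))            # exp(0) == 1 everywhere
result = xp.to_numpy(xp.matmul(ones, noise))  # shape (2, 4)
```

Swapping the backend then means building the same container from a different library's functions, while the calling code stays unchanged.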

class nirs4all.data.synthetic.accelerated.AcceleratedGenerator(backend: AcceleratorBackend | None = None, random_state: int | None = None)[source]

Bases: object

GPU-accelerated synthetic spectrum generator.

This class provides a high-level interface for generating large batches of synthetic spectra using GPU acceleration when available.

Parameters:

backend – Backend to use (auto-detect if None).
random_state – Random state for reproducibility.

Example

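>>> import numpy as np
>>> from nirs4all.data.synthetic.accelerated import AcceleratedGenerator
>>>
>>> gen = AcceleratedGenerator(random_state=42)
>>> print(f"Using backend: {gen.backend}")
>>>
>>> # E: component spectra, shape (n_components, 700)
>>> # C: concentrations, shape (10000, n_components)
>>> # Generate 10000 spectra
>>> X = gen.generate_batch(
...     n_samples=10000,
...     wavelengths=np.linspace(1000, 2500, 700),
...     component_spectra=E,
...     concentrations=C,
... )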

generate_batch(n_samples: int, wavelengths: ndarray, component_spectra: ndarray, concentrations: ndarray, noise_level: float = 0.01) → ndarray[source]

Generate a batch of spectra.

Parameters:

n_samples – Number of samples.
wavelengths – Wavelength array.
component_spectra – Component spectra (n_components, n_wavelengths).
concentrations – Concentrations (n_samples, n_components).
noise_level – Noise level.

Returns:

Generated spectra (n_samples, n_wavelengths).
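The documented shapes are consistent with a linear mixing model, where each spectrum is a concentration-weighted sum of component spectra plus noise. The sketch below assumes that model; the source does not state the exact formula or noise model, so `mix_spectra`, its `seed` parameter, and the noise scaling are illustrative assumptions and not the library's code.

```python
import numpy as np


def mix_spectra(component_spectra, concentrations, noise_level=0.01, seed=0):
    """Linear mixing sketch: X = C @ E + noise (assumed model)."""
    E = np.asarray(component_spectra, dtype=float)  # (n_components, n_wavelengths)
    C = np.asarray(concentrations, dtype=float)     # (n_samples, n_components)
    if C.shape[1] != E.shape[0]:
        raise ValueError("concentrations and component_spectra disagree on n_components")

    clean = C @ E                                   # (n_samples, n_wavelengths)

    # Assumption: Gaussian noise whose standard deviation is noise_level
    # times the mean absolute signal of the clean batch.
    rng = np.random.default_rng(seed)
    scale = noise_level * np.abs(clean).mean()
    return clean + scale * rng.standard_normal(clean.shape)


wavelengths = np.linspace(1000, 2500, 700)
# Two hypothetical Gaussian bands standing in for pure-component spectra.
E = np.stack([
    np.exp(-0.5 * ((wavelengths - 1450.0) / 40.0) ** 2),
    np.exp(-0.5 * ((wavelengths - 1940.0) / 60.0) ** 2),
])
C = np.random.default_rng(1).uniform(0.0, 1.0, size=(100, 2))
X = mix_spectra(E, C, noise_level=0.01, seed=42)    # shape (100, 700)
```

Because the whole batch is a single matrix product, the same expression maps directly onto a GPU array library, which is where batch generation would be expected to benefit most.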

generate_voigt_profiles(wavelengths: ndarray, centers: ndarray, amplitudes: ndarray, sigmas: ndarray, gammas: ndarray) → ndarray[source]

Generate Voigt profiles for component spectra.

Parameters:

wavelengths – Wavelength array.
centers – Band centers.
amplitudes – Band amplitudes.
sigmas – Gaussian widths.
gammas – Lorentzian widths.

Returns:

Spectrum array.

class nirs4all.data.synthetic.accelerated.AcceleratorBackend(value)[source]

Bases: str, Enum

Available acceleration backends.

CUPY = 'cupy'

JAX = 'jax'

NUMPY = 'numpy'
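Because the enum derives from both str and Enum, its members can be looked up from plain strings and compared with them. The sketch below uses a local copy named `Backend` (a hypothetical stand-in with the three documented values) so that it runs without the package installed.

```python
from enum import Enum


class Backend(str, Enum):
    """Illustrative copy of the three documented members."""

    CUPY = "cupy"
    JAX = "jax"
    NUMPY = "numpy"


from_config = Backend("jax")             # lookup by value, e.g. from a config file
is_jax = from_config is Backend.JAX      # members are singletons
equals_str = Backend.NUMPY == "numpy"    # str mixin allows comparison with str
values = [b.value for b in Backend]      # iteration follows definition order
```

When a stable string is needed for logging or serialization, `.value` is the safer choice, since the result of formatting a mixed-in enum member in an f-string has differed between Python versions.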

nirs4all.data.synthetic.accelerated.benchmark_backends(n_samples: int = 1000, n_wavelengths: int = 700, n_components: int = 5, n_trials: int = 5) → Dict[str, float][source]

Benchmark available backends.

Parameters:

n_samples – Number of samples to generate.
n_wavelengths – Number of wavelengths.
n_components – Number of components.
n_trials – Number of timing trials.

Returns:

Dictionary of backend name to mean time in seconds.

Example

>>> results = benchmark_backends()
>>> for backend, time in results.items():
...     print(f"{backend}: {time:.4f}s")
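A benchmark that returns a mean time per backend can be sketched with the standard library alone. The helper below, `time_callables`, is hypothetical and does not reproduce the library's benchmark; it only shows the usual structure of one warm-up call followed by `n_trials` timed calls.

```python
import time
from statistics import mean
from typing import Callable, Dict


def time_callables(
    candidates: Dict[str, Callable[[], object]],
    n_trials: int = 5,
) -> Dict[str, float]:
    """Return the mean wall-clock time in seconds for each named callable."""
    results: Dict[str, float] = {}
    for name, fn in candidates.items():
        fn()  # warm-up call, excluded from timing (covers one-off setup costs)
        durations = []
        for _ in range(n_trials):
            start = time.perf_counter()
            fn()
            durations.append(time.perf_counter() - start)
        results[name] = mean(durations)
    return results


timings = time_callables(
    {"sum_of_squares": lambda: sum(i * i for i in range(10_000))},
    n_trials=3,
)
```

When timing GPU backends by hand, the timed callable should wait for the device to finish (for example `block_until_ready()` on a JAX array, or synchronizing the CuPy stream), because both libraries dispatch work asynchronously and the timer would otherwise stop too early.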

nirs4all.data.synthetic.accelerated.create_accelerated_arrays(backend: AcceleratorBackend | None = None, seed: int = 0) → AcceleratedArrays[source]

Create accelerated array operations for the specified backend.

Parameters:

backend – Backend to use (auto-detect if None).
seed – Random seed.

Returns:

AcceleratedArrays with operations for the backend.

nirs4all.data.synthetic.accelerated.detect_best_backend() → AcceleratorBackend[source]

Detect the best available acceleration backend.

Returns:

AcceleratorBackend enum indicating the best available option.

Example

>>> backend = detect_best_backend()
>>> print(f"Using backend: {backend}")
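One way such detection could work is to probe for installed packages in order of preference. The sketch below is an assumption, not the library's logic: `pick_backend` is a hypothetical helper, the JAX, CuPy, NumPy order simply follows the order in which the backends are listed in this module's description, and the check only establishes that a package is importable, not that a GPU is actually present.

```python
from importlib.util import find_spec
from typing import Sequence


def pick_backend(preference: Sequence[str] = ("jax", "cupy", "numpy")) -> str:
    """Return the first installed package from `preference`.

    Falls back to "numpy" when nothing in the list is installed.
    """
    for name in preference:
        # find_spec returns None when the top-level package is not installed;
        # it does not import the package.
        if find_spec(name) is not None:
            return name
    return "numpy"


chosen = pick_backend()
```

Probing with `find_spec` avoids the import cost of each candidate, which can be noticeable for GPU libraries.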

nirs4all.data.synthetic.accelerated.generate_spectra_batch_accelerated(n_samples: int, wavelengths: ndarray, component_spectra: ndarray, concentrations: ndarray, noise_level: float = 0.01, arrays: AcceleratedArrays | None = None) → ndarray[source]

Generate batch of spectra using GPU acceleration.

Parameters:

n_samples – Number of samples to generate.
wavelengths – Wavelength array.
component_spectra – Pure component spectra (n_components, n_wavelengths).
concentrations – Concentration matrix (n_samples, n_components).
noise_level – Noise level as fraction of signal.
arrays – Accelerated arrays (auto-create if None).

Returns:

Generated spectra (n_samples, n_wavelengths).

nirs4all.data.synthetic.accelerated.generate_voigt_profiles_accelerated(wavelengths: ndarray, centers: ndarray, amplitudes: ndarray, sigmas: ndarray, gammas: ndarray, arrays: AcceleratedArrays | None = None) → ndarray[source]

Generate Voigt profiles using GPU acceleration.

Uses Pseudo-Voigt approximation for efficiency.

Parameters:

wavelengths – Wavelength array (n_wavelengths,).
centers – Band centers (n_bands,).
amplitudes – Band amplitudes (n_bands,).
sigmas – Gaussian widths (n_bands,).
gammas – Lorentzian widths (n_bands,).
arrays – Accelerated arrays (auto-create if None).

Returns:

Spectrum array (n_wavelengths,).
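For readers unfamiliar with the pseudo-Voigt approximation: it replaces the Gaussian-Lorentzian convolution with a weighted sum of the two shapes at a common width. The NumPy sketch below uses the widely cited Thompson-Cox-Hastings coefficients for the combined width and mixing weight. Whether the library uses these coefficients, and whether it normalizes bands by peak height as done here, is not stated in the source, so treat `pseudo_voigt_sum` as a hypothetical illustration.

```python
import numpy as np


def pseudo_voigt_sum(wavelengths, centers, amplitudes, sigmas, gammas):
    """Sum of height-normalised pseudo-Voigt bands, shape (n_wavelengths,)."""
    x = np.asarray(wavelengths, dtype=float)[None, :]  # (1, n_wavelengths)
    c = np.asarray(centers, dtype=float)[:, None]      # (n_bands, 1)
    a = np.asarray(amplitudes, dtype=float)[:, None]
    s = np.asarray(sigmas, dtype=float)[:, None]
    g = np.asarray(gammas, dtype=float)[:, None]

    f_g = 2.0 * s * np.sqrt(2.0 * np.log(2.0))  # Gaussian FWHM from sigma
    f_l = 2.0 * g                               # Lorentzian FWHM from gamma

    # Combined FWHM and Lorentzian weight (Thompson-Cox-Hastings form).
    f = (
        f_g**5
        + 2.69269 * f_g**4 * f_l
        + 2.42843 * f_g**3 * f_l**2
        + 4.47163 * f_g**2 * f_l**3
        + 0.07842 * f_g * f_l**4
        + f_l**5
    ) ** 0.2
    r = f_l / f
    eta = 1.36603 * r - 0.47719 * r**2 + 0.11116 * r**3

    # Both shapes share FWHM f and equal 1 at the band center.
    u = (x - c) / f
    gauss = np.exp(-4.0 * np.log(2.0) * u**2)
    lorentz = 1.0 / (1.0 + 4.0 * u**2)

    profiles = a * (eta * lorentz + (1.0 - eta) * gauss)  # (n_bands, n_wavelengths)
    return profiles.sum(axis=0)


wl = np.linspace(1000, 2500, 751)  # 2 nm grid
spectrum = pseudo_voigt_sum(
    wl,
    centers=[1450.0, 1940.0],
    amplitudes=[1.0, 0.6],
    sigmas=[20.0, 30.0],
    gammas=[10.0, 15.0],
)
```

Broadcasting the band parameters against the wavelength axis evaluates every band at once, which is the kind of elementwise workload that transfers well to a GPU array library.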

nirs4all.data.synthetic.accelerated.get_acceleration_speedup_estimate(n_samples: int) → float[source]

Estimate speedup from GPU acceleration.

Parameters:

n_samples – Number of samples to generate.

Returns:

Estimated speedup factor (1.0 for CPU).

nirs4all.data.synthetic.accelerated.get_backend_info() → Dict[str, Any][source]

Get detailed information about available backends.

Returns:

Dictionary with backend availability and details.

nirs4all.data.synthetic.accelerated.is_gpu_available() → bool[source]

Check if GPU acceleration is available.

Returns:

True if JAX with GPU or CuPy is available.

Example

>>> if is_gpu_available():
...     print("GPU acceleration enabled!")
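A check of this kind can be approximated outside the library as follows. This is a hedged sketch, not the module's implementation: `gpu_visible` is a hypothetical helper, and the set of platform names accepted for JAX devices is a defensive assumption. Every probe is wrapped so that a missing package or a broken driver yields False instead of an exception.

```python
def gpu_visible() -> bool:
    """Best-effort check for a usable GPU through JAX or CuPy."""
    try:
        import jax

        # Assumption: GPU devices report one of these platform names.
        if any(d.platform in ("gpu", "cuda", "rocm") for d in jax.devices()):
            return True
    except Exception:
        pass  # JAX missing or failed to initialise; try CuPy next

    try:
        import cupy

        return cupy.cuda.runtime.getDeviceCount() > 0
    except Exception:
        return False  # CuPy missing or no CUDA runtime available


has_gpu = gpu_visible()
```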