nirs4all.data.synthetic.accelerated module

GPU-accelerated generation for synthetic NIRS data.

This module provides optional GPU acceleration for generating large synthetic datasets. It uses JAX or CuPy when one of them is available and falls back to NumPy otherwise.

Phase 4 Features:
  • Automatic backend detection (JAX, CuPy, NumPy)

  • Batch spectrum generation on GPU

  • Significant speedup for large datasets (10x+)

  • Graceful fallback to CPU when GPU unavailable

Note

This module is optional. GPU acceleration requires additional dependencies (jax[cuda] or cupy-cuda*).

class nirs4all.data.synthetic.accelerated.AcceleratedArrays(backend: AcceleratorBackend, zeros: Callable, ones: Callable, arange: Callable, linspace: Callable, array: Callable, exp: Callable, log: Callable, sqrt: Callable, sin: Callable, cos: Callable, sum: Callable, dot: Callable, matmul: Callable, random_normal: Callable, random_uniform: Callable, to_numpy: Callable)

Bases: object

Container for accelerated array operations.

arange: Callable
array: Callable
backend: AcceleratorBackend
cos: Callable
dot: Callable
exp: Callable
linspace: Callable
log: Callable
matmul: Callable
ones: Callable
random_normal: Callable
random_uniform: Callable
sin: Callable
sqrt: Callable
sum: Callable
to_numpy: Callable
zeros: Callable

class nirs4all.data.synthetic.accelerated.AcceleratedGenerator(backend: AcceleratorBackend | None = None, random_state: int | None = None)

Bases: object

GPU-accelerated synthetic spectrum generator.

This class provides a high-level interface for generating large batches of synthetic spectra using GPU acceleration when available.

Parameters:
  • backend – Backend to use (auto-detect if None).

  • random_state – Random state for reproducibility.

Example

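>>> import numpy as np
>>> from nirs4all.data.synthetic.accelerated import AcceleratedGenerator
>>>
>>> gen = AcceleratedGenerator(random_state=42)
>>> print(f"Using backend: {gen.backend}")
>>>
>>> # Placeholder inputs for illustration only: 3 components, 700 wavelengths
>>> rng = np.random.default_rng(0)
>>> E = rng.random((3, 700))    # component spectra (n_components, n_wavelengths)
>>> C = rng.random((10000, 3))  # concentrations (n_samples, n_components)
>>>
>>> # Generate 10000 spectra
>>> X = gen.generate_batch(
...     n_samples=10000,
...     wavelengths=np.linspace(1000, 2500, 700),
...     component_spectra=E,
...     concentrations=C,
... )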

generate_batch(n_samples: int, wavelengths: ndarray, component_spectra: ndarray, concentrations: ndarray, noise_level: float = 0.01) → ndarray

Generate a batch of spectra.

Parameters:
  • n_samples – Number of samples.

  • wavelengths – Wavelength array.

  • component_spectra – Component spectra (n_components, n_wavelengths).

  • concentrations – Concentrations (n_samples, n_components).

  • noise_level – Noise level.

Returns:

Generated spectra (n_samples, n_wavelengths).

generate_voigt_profiles(wavelengths: ndarray, centers: ndarray, amplitudes: ndarray, sigmas: ndarray, gammas: ndarray) → ndarray

Generate Voigt profiles for component spectra.

Parameters:
  • wavelengths – Wavelength array.

  • centers – Band centers.

  • amplitudes – Band amplitudes.

  • sigmas – Gaussian widths.

  • gammas – Lorentzian widths.

Returns:

Spectrum array.

class nirs4all.data.synthetic.accelerated.AcceleratorBackend(value)

Bases: str, Enum

Available acceleration backends.

CUPY = 'cupy'
JAX = 'jax'
NUMPY = 'numpy'

nirs4all.data.synthetic.accelerated.benchmark_backends(n_samples: int = 1000, n_wavelengths: int = 700, n_components: int = 5, n_trials: int = 5) → Dict[str, float]

Benchmark available backends.

Parameters:
  • n_samples – Number of samples to generate.

  • n_wavelengths – Number of wavelengths.

  • n_components – Number of components.

  • n_trials – Number of timing trials.

Returns:

Dictionary of backend name to mean time in seconds.

Example

>>> from nirs4all.data.synthetic.accelerated import benchmark_backends
>>>
>>> results = benchmark_backends()
>>> for backend, seconds in results.items():
...     print(f"{backend}: {seconds:.4f}s")
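
The timing procedure described above (repeat a workload n_trials times and report the mean duration per backend) can be sketched as follows. This is a minimal illustration that times only a NumPy matrix product; the helper name time_numpy_mixing and the choice of workload are assumptions made for the example, not the module's actual benchmark.

```python
import time

import numpy as np


def time_numpy_mixing(n_samples=1000, n_wavelengths=700, n_components=5, n_trials=5):
    """Return {"numpy": mean seconds} for one concentration-by-spectra product.

    Hypothetical helper: the real benchmark_backends may time a different workload.
    """
    rng = np.random.default_rng(0)
    concentrations = rng.random((n_samples, n_components))
    component_spectra = rng.random((n_components, n_wavelengths))

    durations = []
    for _ in range(n_trials):
        start = time.perf_counter()
        _ = concentrations @ component_spectra  # workload being timed
        durations.append(time.perf_counter() - start)

    # Mean wall-clock time over all trials, keyed by backend name
    return {"numpy": sum(durations) / n_trials}


results = time_numpy_mixing(n_samples=200, n_trials=3)
for backend, seconds in results.items():
    print(f"{backend}: {seconds:.6f}s")
```

When timing a GPU backend by hand, the device usually has to be synchronized before the timer is stopped (for example block_until_ready() on a JAX array), otherwise only the kernel launch is measured.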

nirs4all.data.synthetic.accelerated.create_accelerated_arrays(backend: AcceleratorBackend | None = None, seed: int = 0) → AcceleratedArrays

Create accelerated array operations for the specified backend.

Parameters:
  • backend – Backend to use (auto-detect if None).

  • seed – Random seed.

Returns:

AcceleratedArrays with operations for the backend.
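
To illustrate the idea of a container of backend-specific operations, here is a reduced NumPy-only sketch. MiniArrays and make_numpy_arrays are hypothetical names invented for this example; the real AcceleratedArrays has more fields, and its callables may take different arguments.

```python
from dataclasses import dataclass
from typing import Callable

import numpy as np


@dataclass(frozen=True)
class MiniArrays:
    """Hypothetical, reduced stand-in for AcceleratedArrays (NumPy only)."""

    backend: str
    linspace: Callable
    exp: Callable
    matmul: Callable
    random_normal: Callable
    to_numpy: Callable


def make_numpy_arrays(seed: int = 0) -> MiniArrays:
    """Bind each operation to its NumPy implementation."""
    rng = np.random.default_rng(seed)  # seeded generator for reproducible noise
    return MiniArrays(
        backend="numpy",
        linspace=np.linspace,
        exp=np.exp,
        matmul=np.matmul,
        random_normal=lambda shape: rng.normal(size=shape),
        to_numpy=np.asarray,
    )


# Calling code only touches the container, never NumPy directly
ops = make_numpy_arrays(seed=0)
x = ops.linspace(-1.0, 1.0, 5)
bump = ops.exp(-x**2)
noise = ops.random_normal((5,))
result = ops.to_numpy(bump + 0.01 * noise)
```

Because the calling code goes through the container, swapping in a JAX or CuPy implementation would only require a different factory function.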

nirs4all.data.synthetic.accelerated.detect_best_backend() → AcceleratorBackend

Detect the best available acceleration backend.

Returns:

AcceleratorBackend enum indicating best available option.

Example

>>> from nirs4all.data.synthetic.accelerated import detect_best_backend
>>>
>>> backend = detect_best_backend()
>>> print(f"Using backend: {backend}")
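
A backend fallback of this kind can be sketched with the standard library alone. The preference order (JAX, then CuPy, then NumPy) is an assumption taken from the order in which the backends are listed in this module; the names Backend, is_installed, and pick_backend are hypothetical. Note that this sketch only checks whether a package is importable, not whether a GPU is actually present.

```python
from enum import Enum
from importlib.util import find_spec


class Backend(str, Enum):
    """Hypothetical mirror of the AcceleratorBackend values."""

    JAX = "jax"
    CUPY = "cupy"
    NUMPY = "numpy"


def is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be found without importing it."""
    return find_spec(module_name) is not None


def pick_backend(installed=is_installed) -> Backend:
    """Return the first installed candidate, falling back to NumPy."""
    for candidate in (Backend.JAX, Backend.CUPY):  # assumed preference order
        if installed(candidate.value):
            return candidate
    return Backend.NUMPY


print(f"Using backend: {pick_backend().value}")
```

Passing the availability check as a parameter keeps the selection logic testable on machines without any GPU library installed.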

nirs4all.data.synthetic.accelerated.generate_spectra_batch_accelerated(n_samples: int, wavelengths: ndarray, component_spectra: ndarray, concentrations: ndarray, noise_level: float = 0.01, arrays: AcceleratedArrays | None = None) → ndarray

Generate a batch of spectra using GPU acceleration.

Parameters:
  • n_samples – Number of samples to generate.

  • wavelengths – Wavelength array.

  • component_spectra – Pure component spectra (n_components, n_wavelengths).

  • concentrations – Concentration matrix (n_samples, n_components).

  • noise_level – Noise level as fraction of signal.

  • arrays – Accelerated arrays (auto-create if None).

Returns:

Generated spectra (n_samples, n_wavelengths).
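
The documented shapes imply a linear mixing step: concentrations (n_samples, n_components) multiplied by component spectra (n_components, n_wavelengths) yield spectra of shape (n_samples, n_wavelengths). The NumPy sketch below shows that step. The function name mix_spectra and the noise model (additive Gaussian noise whose standard deviation is noise_level times the mean absolute signal) are assumptions for illustration; the module's actual noise model may differ.

```python
import numpy as np


def mix_spectra(component_spectra, concentrations, noise_level=0.01, seed=0):
    """Linear mixing plus additive noise (hypothetical helper)."""
    # (n_samples, n_components) @ (n_components, n_wavelengths)
    clean = concentrations @ component_spectra
    rng = np.random.default_rng(seed)
    # Assumed noise scaling: a fraction of the mean absolute signal
    scale = noise_level * np.abs(clean).mean()
    return clean + rng.normal(0.0, scale, size=clean.shape)


rng = np.random.default_rng(1)
E = rng.random((3, 50))  # 3 components, 50 wavelengths (placeholder data)
C = rng.random((8, 3))   # 8 samples, 3 components (placeholder data)
X = mix_spectra(E, C)
print(X.shape)
```

With noise_level=0.0 the result reduces to the plain matrix product, which is a convenient sanity check for shape conventions.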

nirs4all.data.synthetic.accelerated.generate_voigt_profiles_accelerated(wavelengths: ndarray, centers: ndarray, amplitudes: ndarray, sigmas: ndarray, gammas: ndarray, arrays: AcceleratedArrays | None = None) → ndarray

Generate Voigt profiles using GPU acceleration.

Uses Pseudo-Voigt approximation for efficiency.

Parameters:
  • wavelengths – Wavelength array (n_wavelengths,).

  • centers – Band centers (n_bands,).

  • amplitudes – Band amplitudes (n_bands,).

  • sigmas – Gaussian widths (n_bands,).

  • gammas – Lorentzian widths (n_bands,).

  • arrays – Accelerated arrays (auto-create if None).

Returns:

Spectrum array (n_wavelengths,).
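
For readers unfamiliar with the pseudo-Voigt approximation, the NumPy sketch below sums one profile per band into a single spectrum of shape (n_wavelengths,). It uses the widely cited Thompson-Cox-Hastings mixing coefficients and makes several assumptions that may not match this module: sigmas are Gaussian standard deviations, gammas are Lorentzian half-widths at half maximum, and each band is peak-normalized so that it reaches its amplitude at the center. The function name pseudo_voigt_sum is hypothetical.

```python
import numpy as np


def pseudo_voigt_sum(wavelengths, centers, amplitudes, sigmas, gammas):
    """Sum of peak-normalized pseudo-Voigt bands (illustrative sketch)."""
    x = np.asarray(wavelengths, dtype=float)[None, :]  # (1, n_wavelengths)
    c = np.asarray(centers, dtype=float)[:, None]      # (n_bands, 1)
    a = np.asarray(amplitudes, dtype=float)[:, None]
    s = np.asarray(sigmas, dtype=float)[:, None]
    g = np.asarray(gammas, dtype=float)[:, None]

    # Full widths at half maximum of the Gaussian and Lorentzian parts
    f_g = 2.0 * np.sqrt(2.0 * np.log(2.0)) * s
    f_l = 2.0 * g

    # Combined width and mixing fraction (Thompson-Cox-Hastings coefficients)
    f = (f_g**5 + 2.69269 * f_g**4 * f_l + 2.42843 * f_g**3 * f_l**2
         + 4.47163 * f_g**2 * f_l**3 + 0.07842 * f_g * f_l**4 + f_l**5) ** 0.2
    r = f_l / f
    eta = 1.36603 * r - 0.47719 * r**2 + 0.11116 * r**3

    d2 = (x - c) ** 2
    gauss = np.exp(-4.0 * np.log(2.0) * d2 / f**2)  # equals 1 at the center
    lorentz = 1.0 / (1.0 + 4.0 * d2 / f**2)         # equals 1 at the center
    bands = a * (eta * lorentz + (1.0 - eta) * gauss)
    return bands.sum(axis=0)                        # (n_wavelengths,)


wl = np.linspace(1000.0, 2500.0, 751)
spectrum = pseudo_voigt_sum(wl, [1450.0, 1940.0], [1.0, 0.6], [20.0, 30.0], [10.0, 15.0])
print(spectrum.shape)
```

Broadcasting a (n_bands, 1) column against a (1, n_wavelengths) row evaluates all bands at once, which is the same pattern that maps well onto GPU array libraries.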

nirs4all.data.synthetic.accelerated.get_acceleration_speedup_estimate(n_samples: int) → float

Estimate speedup from GPU acceleration.

Parameters:

n_samples – Number of samples to generate.

Returns:

Estimated speedup factor (1.0 for CPU).

nirs4all.data.synthetic.accelerated.get_backend_info() → Dict[str, Any]

Get detailed information about available backends.

Returns:

Dictionary with backend availability and details.

nirs4all.data.synthetic.accelerated.is_gpu_available() → bool

Check if GPU acceleration is available.

Returns:

True if JAX with GPU or CuPy is available.

Example

>>> from nirs4all.data.synthetic.accelerated import is_gpu_available
>>>
>>> if is_gpu_available():
...     print("GPU acceleration enabled!")