nirs4all.data.synthetic.reconstruction.generator module
Synthetic data generation from learned parameter distributions.
Generates realistic synthetic spectra by:
1. Sampling physical parameters from learned distributions
2. Running the forward model chain
3. Adding appropriate noise
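The three steps above can be illustrated with a minimal NumPy sketch. It does not use the nirs4all classes: `toy_generate`, the lognormal and normal parameter distributions, the Gaussian band shapes, and the Beer-Lambert-style forward model are all assumptions chosen for illustration, standing in for the learned distributions and the calibrated forward chain.

```python
import numpy as np


def toy_generate(n_samples, component_spectra, noise_std=0.001, seed=None):
    """Toy version of the sample -> forward model -> noise sequence."""
    rng = np.random.default_rng(seed)
    n_components = component_spectra.shape[0]

    # 1. Sample physical parameters (stand-in distributions, not learned ones).
    concentrations = rng.lognormal(mean=-1.0, sigma=0.3, size=(n_samples, n_components))
    path_lengths = rng.normal(loc=1.0, scale=0.05, size=n_samples)

    # 2. Forward model: linear mixing of component spectra, scaled by path length.
    clean = (concentrations @ component_spectra) * path_lengths[:, None]

    # 3. Add additive Gaussian noise.
    X = clean + rng.normal(scale=noise_std, size=clean.shape)
    return X, concentrations, path_lengths


# Hypothetical wavelength grid and two Gaussian absorption bands.
wavelengths = np.linspace(1000.0, 2500.0, 200)
centers = np.array([1450.0, 1940.0])
component_spectra = np.exp(-0.5 * ((wavelengths[None, :] - centers[:, None]) / 40.0) ** 2)

X, concentrations, path_lengths = toy_generate(50, component_spectra, seed=0)
```

Passing the same `seed` reproduces the same spectra, which mirrors the role of `random_state` in the functions documented below.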
- class nirs4all.data.synthetic.reconstruction.generator.GenerationResult(X: ndarray, concentrations: ndarray, path_lengths: ndarray, baseline_coeffs: ndarray, wavelengths: ndarray, noise_level: float = 0.0, wl_shifts: ndarray | None = None, temperature_deltas: ndarray | None = None, water_activities: ndarray | None = None, scattering_powers: ndarray | None = None, scattering_amplitudes: ndarray | None = None)
Bases: object
Result of synthetic generation.
- X
Generated spectra (n_samples, n_wavelengths).
- Type:
numpy.ndarray
- concentrations
Sampled concentrations (n_samples, n_components).
- Type:
numpy.ndarray
- path_lengths
Sampled path lengths (n_samples,).
- Type:
numpy.ndarray
- baseline_coeffs
Sampled baseline coefficients.
- Type:
numpy.ndarray
- wavelengths
Wavelength grid.
- Type:
numpy.ndarray
- wl_shifts
Per-sample wavelength shifts.
- Type:
numpy.ndarray | None
- temperature_deltas
Per-sample temperature deviations (°C).
- Type:
numpy.ndarray | None
- water_activities
Per-sample water activity values.
- Type:
numpy.ndarray | None
- scattering_powers
Per-sample scattering exponents.
- Type:
numpy.ndarray | None
- scattering_amplitudes
Per-sample scattering amplitudes.
- Type:
numpy.ndarray | None
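The documented shapes imply that the per-sample arrays share their first dimension with `X`. The sketch below checks that on a hypothetical stand-in dataclass, `ResultSketch`, which copies only a subset of the fields listed above; it is not the library class. The assumption that `wavelengths` is one-dimensional with length `n_wavelengths` is ours, and `baseline_coeffs` is left unchecked because its shape is not documented.

```python
from dataclasses import dataclass
from typing import Optional

import numpy as np


@dataclass
class ResultSketch:
    """Hypothetical stand-in mirroring some GenerationResult fields."""
    X: np.ndarray
    concentrations: np.ndarray
    path_lengths: np.ndarray
    baseline_coeffs: np.ndarray
    wavelengths: np.ndarray
    noise_level: float = 0.0
    wl_shifts: Optional[np.ndarray] = None


def check_shapes(res):
    """Return True if the per-sample arrays agree with X on n_samples."""
    n_samples, n_wavelengths = res.X.shape
    ok = (
        res.concentrations.shape[0] == n_samples
        and res.path_lengths.shape == (n_samples,)
        and res.wavelengths.shape == (n_wavelengths,)  # assumed 1-D grid
    )
    # Optional per-sample arrays are only checked when present.
    if res.wl_shifts is not None:
        ok = ok and res.wl_shifts.shape[0] == n_samples
    return bool(ok)


res = ResultSketch(
    X=np.zeros((4, 10)),
    concentrations=np.zeros((4, 3)),
    path_lengths=np.ones(4),
    baseline_coeffs=np.zeros((4, 2)),
    wavelengths=np.linspace(1000.0, 2500.0, 10),
)
```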
- class nirs4all.data.synthetic.reconstruction.generator.ReconstructionGenerator(noise_level: float = 0.001, multiplicative_noise: float = 0.01, add_noise: bool = True, noise_type: str = 'both')
Bases: object
Generate synthetic spectra from learned parameter distributions.
Uses the calibrated forward chain and learned parameter distributions to generate realistic synthetic data that matches the statistical properties of the original dataset.
- forward_chain
Calibrated forward chain.
- sampler
Parameter sampler with learned distributions.
- noise_estimator
Estimated noise level from inversion residuals.
- generate(n_samples: int, forward_chain: ForwardChain, sampler: ParameterSampler, random_state: int | None = None) -> GenerationResult
Generate synthetic spectra.
- Parameters:
n_samples – Number of samples to generate.
forward_chain – Calibrated forward chain.
sampler – Parameter sampler.
random_state – Random seed.
- Returns:
GenerationResult with generated spectra and parameters.
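The constructor parameters `noise_level`, `multiplicative_noise`, and `noise_type` suggest a combined additive and multiplicative noise model. The sketch below shows one plausible way such noise could be applied to clean spectra. The function `apply_noise`, the Gaussian form of both terms, and the `noise_type` values other than `'both'` are assumptions, not the library's confirmed behavior.

```python
import numpy as np


def apply_noise(clean, noise_level=0.001, multiplicative_noise=0.01,
                noise_type="both", seed=None):
    """Apply assumed Gaussian additive and/or multiplicative noise."""
    rng = np.random.default_rng(seed)
    noisy = np.array(clean, dtype=float)  # copy so the input is not modified

    # Multiplicative term: each value is scaled by (1 + eps), eps ~ N(0, sigma_m).
    if noise_type in ("multiplicative", "both"):
        noisy = noisy * (1.0 + rng.normal(scale=multiplicative_noise, size=noisy.shape))

    # Additive term: independent Gaussian offset on every point.
    if noise_type in ("additive", "both"):
        noisy = noisy + rng.normal(scale=noise_level, size=noisy.shape)

    return noisy


clean = np.ones((3, 5))
noisy_a = apply_noise(clean, seed=42)
noisy_b = apply_noise(clean, seed=42)
```

Seeding the generator makes the two calls return identical arrays, which is the behavior one would expect from a fixed `random_state`.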
- generate_matched(X_real: np.ndarray, forward_chain: ForwardChain, sampler: ParameterSampler, random_state: int | None = None) -> GenerationResult
Generate synthetic data matched to real data statistics.
Generates the same number of samples as the real data and optionally adjusts the noise level based on estimated residuals.
- Parameters:
X_real – Real data matrix for reference.
forward_chain – Calibrated forward chain.
sampler – Parameter sampler.
random_state – Random seed.
- Returns:
GenerationResult.
- nirs4all.data.synthetic.reconstruction.generator.estimate_noise_from_residuals(inversion_results: List['InversionResult']) -> Tuple[float, float]
Estimate noise parameters from inversion residuals.
- Parameters:
inversion_results – List of inversion results with residuals.
- Returns:
Tuple of (additive_noise_std, multiplicative_noise_std).
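How the two standard deviations are derived is not described here. As a rough illustration only, the sketch below estimates them from plain arrays of residuals and fitted spectra: the additive term as the standard deviation of the raw residuals, and the multiplicative term as the standard deviation of the residuals relative to the fitted signal. The function `estimate_noise_sketch`, its inputs, and both estimators are assumptions; the structure of `InversionResult` is not used.

```python
import numpy as np


def estimate_noise_sketch(residuals, fitted, eps=1e-8):
    """Return (additive_std, multiplicative_std) from per-sample arrays."""
    # Pool all samples into flat vectors.
    r = np.concatenate([np.ravel(x) for x in residuals]).astype(float)
    f = np.concatenate([np.ravel(x) for x in fitted]).astype(float)

    # Additive estimate: spread of the raw residuals.
    additive_std = float(np.std(r))

    # Multiplicative estimate: spread of residuals relative to the signal,
    # skipping points where the fitted value is too close to zero.
    mask = np.abs(f) > eps
    multiplicative_std = float(np.std(r[mask] / f[mask])) if mask.any() else 0.0

    return additive_std, multiplicative_std


residuals = [np.array([0.1, -0.1]), np.array([0.1, -0.1])]
fitted = [np.array([1.0, 1.0]), np.array([1.0, 1.0])]
additive_std, multiplicative_std = estimate_noise_sketch(residuals, fitted)
```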
- nirs4all.data.synthetic.reconstruction.generator.generate_synthetic_dataset(forward_chain: ForwardChain, distribution_result: DistributionResult, n_samples: int, noise_level: float = 0.001, multiplicative_noise: float = 0.01, random_state: int | None = None) -> GenerationResult
Complete pipeline to generate synthetic dataset.
- Parameters:
forward_chain – Calibrated forward chain.
distribution_result – Fitted parameter distributions.
n_samples – Number of samples to generate.
noise_level – Additive noise level.
multiplicative_noise – Multiplicative noise level.
random_state – Random seed.
- Returns:
GenerationResult with synthetic spectra.