nirs4all.synthesis.reconstruction package
Submodules
- nirs4all.synthesis.reconstruction.calibration module
CalibrationResult: wl_shift, wl_stretch, ils_sigma, stray_light, gain, offset, prototype_residuals, prototype_r2, total_loss, from_array(), to_dict()
GlobalCalibrator: forward_chain, wl_shift_bounds, wl_stretch_bounds, ils_sigma_bounds, regularization, use_global_search, calibrate(), refine()
PrototypeSelector
multistage_calibration()
- nirs4all.synthesis.reconstruction.distributions module
DistributionResult: param_names, distributions, correlations, factor_loadings, transform_params, n_samples_fitted, summary()
ParameterDistributionFitter: positive_params, bounded_params, use_factor_model, n_factors, min_std, fit()
ParameterSampler
fit_parameter_distributions()
- nirs4all.synthesis.reconstruction.environmental module
EnvironmentalEffectsModel: temperature_delta, water_activity, scattering_power, scattering_amplitude, enabled, reference_wavelength, apply(), copy(), from_dict(), get_jacobian_wrt_scattering_amplitude(), get_jacobian_wrt_scattering_power(), get_jacobian_wrt_temperature(), get_jacobian_wrt_water_activity(), to_dict()
EnvironmentalParameterConfig: compute_prior_penalty(), get_bounds_list(), scattering_amplitude_bounds, scattering_amplitude_prior_scale, scattering_power_bounds, scattering_power_prior_mean, scattering_power_prior_std, temperature_bounds, temperature_prior_mean, temperature_prior_std, water_activity_bounds, water_activity_prior_alpha, water_activity_prior_beta
- nirs4all.synthesis.reconstruction.forward module
CanonicalForwardModel: canonical_grid, component_names, component_spectra, baseline_order, continuum_order, __post_init__(), compute_absorption(), get_design_matrix(), n_baseline, n_components, n_continuum, n_linear_params
DomainTransform
ForwardChain: canonical_model, environmental_model, instrument_model, domain_transform, preprocessing, create(), forward(), forward_design_matrix()
InstrumentModel: target_grid, wl_shift, wl_stretch, wl_poly_coeffs, ils_sigma, stray_light, gain, offset, apply(), from_params(), get_jacobian_wrt_ils_sigma(), get_jacobian_wrt_wl_shift()
PreprocessingOperator: preprocessing_type, sg_window, sg_polyorder, sg_deriv, reference_spectrum, apply(), apply_to_matrix(), from_detection()
- nirs4all.synthesis.reconstruction.generator module
GenerationResult: X, concentrations, path_lengths, baseline_coeffs, wavelengths, noise_level, wl_shifts, temperature_deltas, water_activities, scattering_powers, scattering_amplitudes, n_samples, n_wavelengths
ReconstructionGenerator: forward_chain, sampler, noise_estimator, add_noise, noise_type, noise_level, multiplicative_noise, generate(), generate_matched()
estimate_noise_from_residuals()
generate_synthetic_dataset()
- nirs4all.synthesis.reconstruction.inversion module
InversionResult: concentrations, baseline_coeffs, continuum_coeffs, path_length, wl_shift_residual, scatter_coeffs, fitted_spectrum, residuals, r_squared, rmse, converged, temperature_delta, water_activity, scattering_power, scattering_amplitude, linear_params, to_dict()
MultiscaleSchedule: smooth_sigmas, derivative_weights, baseline_regularization, max_iterations, n_stages, quick(), thorough()
VariableProjectionSolver: path_length_bounds, wl_shift_bounds, concentration_regularization, baseline_smoothness_penalty, use_derivatives, fit_environmental, temperature_bounds, water_activity_bounds, scattering_power_bounds, scattering_amplitude_bounds, environmental_prior_weight, verbose, fit(), fit_batch()
invert_dataset()
- nirs4all.synthesis.reconstruction.pipeline module
DatasetConfig: wavelengths, signal_type, preprocessing, domain, sg_window, sg_polyorder, name, from_data()
PipelineResult: config, calibration, inversion_results, distribution, X_synthetic, validation, forward_chain, summary()
ReconstructionPipeline: config, component_names, canonical_resolution, baseline_order, continuum_order, n_prototypes, fit_environmental, verbose, __post_init__(), fit(), generate()
reconstruct_and_generate()
- nirs4all.synthesis.reconstruction.validation module
ReconstructionValidator: r2_threshold, residual_autocorr_threshold, pca_distance_threshold, concentration_max, path_length_bounds, validate(), validate_parameters(), validate_reconstruction(), validate_synthetic()
ValidationResult: reconstruction_metrics, synthetic_metrics, parameter_metrics, overall_score, passed, warnings, summary()
compute_diagnostic_data()
Module contents
Physical signal-chain reconstruction and variance modeling for NIR spectra.
This module implements a physically realistic “full signal-chain” reconstruction workflow that:
1. Reconstructs spectra using a physical forward model (Beer-Lambert + instrument chain)
2. Learns distributions of physical parameters for variance modeling
3. Generates realistic synthetic datasets by sampling from learned distributions
- Key Components:
CanonicalForwardModel: Physical model on canonical grid
InstrumentModel: Wavelength warp, ILS convolution, gain/offset
EnvironmentalEffectsModel: Temperature, moisture, and scattering effects
DomainTransform: Absorbance/reflectance transformation
PreprocessingOperator: Match dataset preprocessing (SG derivatives, SNV, etc.)
VariableProjectionSolver: NNLS inner solve + nonlinear outer optimization
GlobalCalibrator: Prototype-based instrument parameter estimation
ParameterDistributionFitter: Learn distributions in parameter space
ReconstructionGenerator: Generate synthetic data from learned distributions
Example
>>> from nirs4all.synthesis.reconstruction import (
...     ReconstructionPipeline,
...     DatasetConfig,
... )
>>>
>>> # Configure for a dataset
>>> config = DatasetConfig(
...     wavelengths=wavelengths,
...     signal_type="absorbance",
...     preprocessing="first_derivative",
...     domain="food_dairy",
... )
>>>
>>> # Run full reconstruction pipeline
>>> pipeline = ReconstructionPipeline(config)
>>> result = pipeline.fit(X_real)
>>>
>>> # Generate synthetic data
>>> X_synth = pipeline.generate(n_samples=1000)
References
Burns, D. A., & Ciurczak, E. W. (2007). Handbook of Near-Infrared Analysis.
Workman Jr, J., & Weyer, L. (2012). Practical Guide and Spectral Atlas for Interpretive Near-Infrared Spectroscopy.
- class nirs4all.synthesis.reconstruction.CalibrationResult(wl_shift: float = 0.0, wl_stretch: float = 1.0, ils_sigma: float = 4.0, stray_light: float = 0.0, gain: float = 1.0, offset: float = 0.0, prototype_residuals: ndarray | None = None, prototype_r2: ndarray | None = None, total_loss: float = inf)[source]
Bases: object
Result of global calibration.
- prototype_residuals
Residuals for each prototype.
- Type:
numpy.ndarray | None
- prototype_r2
R² for each prototype.
- Type:
numpy.ndarray | None
- classmethod from_array(params: ndarray) CalibrationResult[source]
Create from parameter array [wl_shift, wl_stretch, ils_sigma].
- class nirs4all.synthesis.reconstruction.CanonicalForwardModel(canonical_grid: ndarray, component_names: List[str] = <factory>, baseline_order: int = 5, continuum_order: int = 3, _component_spectra: ndarray | None = None, _baseline_basis: ndarray | None = None, _continuum_basis: ndarray | None = None)[source]
Bases: object
Physical model on canonical high-resolution wavelength grid.
- Computes absorption coefficient K(λ) from chemical components:
K(λ) = Σ c_k * ε_k(λ) + K0(λ)
- where:
c_k: concentration of component k
ε_k(λ): molar absorptivity (from component library)
K0(λ): continuum/background absorption (low-frequency)
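The mixing formula above can be illustrated with a small standard-library sketch. The Gaussian band shapes, band centers, concentrations, and flat continuum below are made-up illustration values, not entries from the component library, and the helper names are hypothetical.

```python
import math

def gaussian_band(wavelengths, center, width):
    """Hypothetical component spectrum: a single Gaussian absorption band."""
    return [math.exp(-0.5 * ((wl - center) / width) ** 2) for wl in wavelengths]

def absorption_coefficient(concentrations, component_spectra, continuum):
    """K(lambda) = sum_k c_k * eps_k(lambda) + K0(lambda), point by point."""
    return [
        sum(c * eps[i] for c, eps in zip(concentrations, component_spectra))
        + continuum[i]
        for i in range(len(continuum))
    ]

wavelengths = [1400.0 + 10.0 * i for i in range(21)]  # 1400..1600 nm
component_spectra = [
    gaussian_band(wavelengths, center=1450.0, width=20.0),
    gaussian_band(wavelengths, center=1550.0, width=30.0),
]
continuum = [0.05] * len(wavelengths)  # flat K0, illustrative only
K = absorption_coefficient([0.8, 0.3], component_spectra, continuum)
```

Because K(λ) is linear in the concentrations, the component spectra can be stacked as columns of a design matrix, which is what makes a linear inner solve possible later in the chain.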
- canonical_grid
High-resolution wavelength grid (nm).
- Type:
numpy.ndarray
- component_spectra
Pre-computed component spectra on canonical grid.
- compute_absorption(concentrations: ndarray, path_length: float = 1.0, baseline_coeffs: ndarray | None = None, continuum_coeffs: ndarray | None = None) ndarray[source]
Compute absorption coefficient on canonical grid.
- Parameters:
concentrations – Component concentrations, shape (n_components,).
path_length – Optical path length factor.
baseline_coeffs – Baseline polynomial coefficients.
continuum_coeffs – Continuum absorption coefficients.
- Returns:
Absorbance spectrum on canonical grid.
- class nirs4all.synthesis.reconstruction.DatasetConfig(wavelengths: ndarray, signal_type: Literal['absorbance', 'reflectance', 'unknown'] = 'absorbance', preprocessing: Literal['none', 'first_derivative', 'second_derivative', 'snv', 'msc', 'unknown'] = 'none', domain: str = 'unknown', sg_window: int = 15, sg_polyorder: int = 2, name: str = 'dataset')[source]
Bases: object
Configuration for a dataset to be reconstructed.
Captures all dataset-specific information needed for reconstruction:
- Wavelength grid
- Signal type (absorbance, reflectance)
- Preprocessing applied
- Application domain (for component selection)
- wavelengths
Wavelength grid in nm.
- Type:
numpy.ndarray
- signal_type
Signal type (‘absorbance’, ‘reflectance’).
- Type:
Literal[‘absorbance’, ‘reflectance’, ‘unknown’]
- preprocessing
Detected or specified preprocessing type.
- Type:
Literal[‘none’, ‘first_derivative’, ‘second_derivative’, ‘snv’, ‘msc’, ‘unknown’]
- classmethod from_data(X: ndarray, wavelengths: ndarray, name: str = 'dataset') DatasetConfig[source]
Create configuration by auto-detecting properties from data.
- Parameters:
X – Spectra matrix (n_samples, n_wavelengths).
wavelengths – Wavelength grid.
name – Dataset name.
- Returns:
DatasetConfig with detected properties.
- class nirs4all.synthesis.reconstruction.DistributionResult(param_names: List[str], distributions: Dict[str, Dict[str, Any]], correlations: ndarray | None = None, factor_loadings: ndarray | None = None, transform_params: Dict[str, Dict[str, Any]] = <factory>, n_samples_fitted: int = 0)[source]
Bases: object
Result of parameter distribution fitting.
- correlations
Correlation matrix of transformed parameters.
- Type:
numpy.ndarray | None
- factor_loadings
Low-rank factor model loadings (optional).
- Type:
numpy.ndarray | None
- class nirs4all.synthesis.reconstruction.DomainTransform(domain: Literal['absorbance', 'reflectance', 'transmittance', 'km'] = 'absorbance', scatter_coeffs: ndarray | None = None, scatter_wavelength_exp: float = 0.0)[source]
Bases: object
Transform between physical domains (absorbance, reflectance, etc.).
For absorbance datasets: A(λ) = absorption coefficient (direct)
For reflectance datasets: R(λ) computed via Kubelka-Munk or approximation
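For the reflectance case, the textbook Kubelka-Munk relation F(R) = (1 - R)^2 / (2R) = K/S can be sketched as follows. This is a minimal illustration with hypothetical function names and scalar inputs. The class may use an approximation instead, so it should not be read as the library's implementation.

```python
import math

def km_remission(reflectance):
    """Kubelka-Munk function F(R) = (1 - R)^2 / (2 R), which equals K / S."""
    return (1.0 - reflectance) ** 2 / (2.0 * reflectance)

def km_reflectance(absorption, scatter):
    """Solve F(R) = K / S for the diffuse reflectance R in (0, 1]."""
    ratio = absorption / scatter
    return 1.0 + ratio - math.sqrt(ratio * ratio + 2.0 * ratio)

K, S = 0.2, 1.0                  # illustrative absorption and scattering values
R = km_reflectance(K, S)         # absorption -> reflectance
K_back = km_remission(R) * S     # reflectance -> absorption (round trip)
```

The mapping is nonlinear in K, which is consistent with the note under forward_design_matrix() that the domain transform is kept out of the linear design matrix.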
- domain
Domain type (‘absorbance’, ‘reflectance’, ‘transmittance’, ‘km’).
- Type:
Literal[‘absorbance’, ‘reflectance’, ‘transmittance’, ‘km’]
- scatter_coeffs
Scattering coefficients for KM model (reflectance).
- Type:
numpy.ndarray | None
- scatter_wavelength_exp
Exponent n of the wavelength-dependent scatter term (λ^-n).
- inverse_transform(spectrum: ndarray, wavelengths: ndarray, scatter: ndarray | None = None) ndarray[source]
Inverse transform from domain to absorption.
- Parameters:
spectrum – Spectrum in domain representation.
wavelengths – Wavelength grid.
scatter – Scattering coefficient for reflectance.
- Returns:
Absorption coefficient.
- transform(absorption: ndarray, wavelengths: ndarray, scatter: ndarray | None = None) ndarray[source]
Transform absorption to target domain.
- Parameters:
absorption – Absorption coefficient K(λ).
wavelengths – Wavelength grid.
scatter – Scattering coefficient S(λ) for reflectance.
- Returns:
Spectrum in target domain representation.
- class nirs4all.synthesis.reconstruction.EnvironmentalEffectsModel(temperature_delta: float = 0.0, water_activity: float = 0.5, scattering_power: float = 1.5, scattering_amplitude: float = 0.0, enabled: bool = True, reference_wavelength: float = 1500.0, _region_masks: Dict[str, ndarray] | None = None, _cached_wavelengths: ndarray | None = None)[source]
Bases: object
Environmental effects on the canonical absorption spectrum.
Applied to absorption in canonical space before domain transform and instrument effects. Implements region-specific temperature and moisture effects based on literature parameters.
- apply(absorption: ndarray, wavelengths: ndarray) ndarray[source]
Apply environmental effects to absorption spectrum.
Effects are applied in order:
1. Temperature effects (region-specific shifts, intensity changes)
2. Moisture effects (water band shifts based on water activity)
3. Scattering baseline (wavelength-dependent λ^-n)
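The scattering step is the simplest of the three to sketch. The version below assumes an additive baseline of the form amplitude * (λ / λ_ref)^(-power), normalized at reference_wavelength. Both the additive form and the normalization are assumptions made for illustration, and the function names are hypothetical.

```python
def scattering_baseline(wavelengths, amplitude, power, reference_wavelength=1500.0):
    """Assumed form: amplitude * (wl / reference_wavelength) ** (-power)."""
    return [amplitude * (wl / reference_wavelength) ** (-power) for wl in wavelengths]

def add_scattering(absorption, wavelengths, amplitude, power):
    """Add the baseline to an absorption spectrum (additive model is an assumption)."""
    baseline = scattering_baseline(wavelengths, amplitude, power)
    return [a + b for a, b in zip(absorption, baseline)]

wavelengths = [1000.0, 1500.0, 2000.0]
absorption = [0.30, 0.50, 0.40]
modified = add_scattering(absorption, wavelengths, amplitude=0.05, power=1.5)
```

With this form the baseline equals the amplitude at the reference wavelength and grows toward shorter wavelengths, so a zero scattering_amplitude (the default in the signature) leaves the spectrum unchanged.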
- Parameters:
absorption – Absorption coefficient on canonical grid.
wavelengths – Wavelength grid (nm).
- Returns:
Modified absorption spectrum with environmental effects.
- copy() EnvironmentalEffectsModel[source]
Create a copy of this model.
- get_jacobian_wrt_scattering_amplitude(absorption: ndarray, wavelengths: ndarray, eps: float = 0.001) ndarray[source]
Numerical Jacobian w.r.t. scattering_amplitude.
- get_jacobian_wrt_scattering_power(absorption: ndarray, wavelengths: ndarray, eps: float = 0.05) ndarray[source]
Numerical Jacobian w.r.t. scattering_power.
- get_jacobian_wrt_temperature(absorption: ndarray, wavelengths: ndarray, eps: float = 0.1) ndarray[source]
Numerical Jacobian w.r.t. temperature_delta.
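The get_jacobian_wrt_* methods are described as numerical Jacobians with a step size eps. A generic finite-difference sketch is shown below. It uses central differences, which is an assumption (the library may use forward differences), and toy_model is a hypothetical stand-in for the real apply() call.

```python
def numerical_jacobian(model, param, eps):
    """Central-difference derivative of a spectrum-valued function of one parameter."""
    upper = model(param + eps)
    lower = model(param - eps)
    return [(u - l) / (2.0 * eps) for u, l in zip(upper, lower)]

def toy_model(temperature_delta):
    """Hypothetical stand-in: band heights scale linearly with temperature."""
    base = [0.2, 0.5, 0.3]
    return [b * (1.0 + 0.01 * temperature_delta) for b in base]

# One derivative value per wavelength point.
jacobian = numerical_jacobian(toy_model, param=0.0, eps=0.1)
```

The differing default eps values in the signatures (0.001, 0.05, 0.1) suggest the step is scaled to the typical magnitude of each parameter.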
- class nirs4all.synthesis.reconstruction.EnvironmentalParameterConfig(temperature_bounds: Tuple[float, float] = (-15.0, 15.0), temperature_prior_mean: float = 0.0, temperature_prior_std: float = 5.0, water_activity_bounds: Tuple[float, float] = (0.1, 0.9), water_activity_prior_alpha: float = 2.0, water_activity_prior_beta: float = 2.0, scattering_power_bounds: Tuple[float, float] = (0.5, 3.0), scattering_power_prior_mean: float = 1.5, scattering_power_prior_std: float = 0.5, scattering_amplitude_bounds: Tuple[float, float] = (0.0, 0.2), scattering_amplitude_prior_scale: float = 0.02)[source]
Bases: object
Configuration for environmental parameter fitting.
Defines bounds and prior distributions for each parameter.
- compute_prior_penalty(temperature_delta: float, water_activity: float, scattering_power: float, scattering_amplitude: float) float[source]
Compute prior penalty for regularization.
Returns negative log-prior (to be added to objective function).
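A negative log-prior of this kind could look like the sketch below. The prior families are guessed from the parameter names only (mean/std suggests Gaussian, alpha/beta suggests Beta, scale suggests exponential). They are assumptions, not the documented behavior, and constants that do not depend on the parameters are dropped. The numeric defaults are the ones shown in the class signature.

```python
import math

def gaussian_nll(x, mean, std):
    """Negative log of a Gaussian density, without the constant term."""
    return 0.5 * ((x - mean) / std) ** 2

def beta_nll(x, alpha, beta):
    """Negative log of a Beta density on (0, 1), without the normalizer."""
    return -((alpha - 1.0) * math.log(x) + (beta - 1.0) * math.log(1.0 - x))

def exponential_nll(x, scale):
    """Negative log of an exponential density on x >= 0, without the constant."""
    return x / scale

def prior_penalty(temperature_delta, water_activity, scattering_power, scattering_amplitude):
    """Sum of per-parameter negative log-priors (assumed families, see lead-in)."""
    return (
        gaussian_nll(temperature_delta, 0.0, 5.0)
        + beta_nll(water_activity, 2.0, 2.0)
        + gaussian_nll(scattering_power, 1.5, 0.5)
        + exponential_nll(scattering_amplitude, 0.02)
    )

at_prior_center = prior_penalty(0.0, 0.5, 1.5, 0.0)
away_from_prior = prior_penalty(10.0, 0.2, 2.5, 0.1)
```

Adding such a term to the data-fit objective pulls poorly constrained environmental parameters toward plausible values instead of letting them absorb noise.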
- class nirs4all.synthesis.reconstruction.ForwardChain(canonical_model: CanonicalForwardModel, instrument_model: InstrumentModel, domain_transform: DomainTransform, preprocessing: PreprocessingOperator, environmental_model: 'EnvironmentalEffectsModel' | None = None)[source]
Bases: object
Complete forward measurement chain combining all components.
Chain: CanonicalForwardModel → [EnvironmentalEffects] → DomainTransform → InstrumentModel → PreprocessingOperator
- canonical_model
Physical model on canonical grid.
- Type:
CanonicalForwardModel
- environmental_model
Optional environmental effects (temperature, moisture, scattering).
- Type:
Optional[‘EnvironmentalEffectsModel’]
- instrument_model
Instrument effects.
- Type:
InstrumentModel
- domain_transform
Domain conversion.
- Type:
DomainTransform
- preprocessing
Dataset preprocessing.
- Type:
PreprocessingOperator
- canonical_model: CanonicalForwardModel
- classmethod create(canonical_grid: ndarray, target_grid: ndarray, component_names: List[str], domain: str = 'absorbance', preprocessing_type: str = 'none', instrument_params: Dict[str, float] | None = None, baseline_order: int = 5, continuum_order: int = 3, sg_window: int = 15, sg_polyorder: int = 2, include_environmental: bool = False) ForwardChain[source]
Convenience factory method to create ForwardChain.
- Parameters:
canonical_grid – High-resolution canonical wavelength grid.
target_grid – Target dataset wavelength grid.
component_names – Names of components to include.
domain – Domain type (‘absorbance’, ‘reflectance’).
preprocessing_type – Preprocessing type.
instrument_params – Instrument parameters dict.
baseline_order – Baseline polynomial order.
continuum_order – Continuum polynomial order.
sg_window – Savitzky-Golay window.
sg_polyorder – Savitzky-Golay polynomial order.
include_environmental – Whether to include environmental effects model.
- Returns:
Configured ForwardChain instance.
- domain_transform: DomainTransform
- forward(concentrations: ndarray, path_length: float = 1.0, baseline_coeffs: ndarray | None = None, continuum_coeffs: ndarray | None = None, scatter: ndarray | None = None) ndarray[source]
Run full forward chain.
- Parameters:
concentrations – Component concentrations.
path_length – Optical path length factor.
baseline_coeffs – Baseline polynomial coefficients.
continuum_coeffs – Continuum absorption coefficients.
scatter – Scattering coefficients for reflectance.
- Returns:
Spectrum on target grid with preprocessing applied.
- forward_design_matrix(path_length: float = 1.0) ndarray[source]
Get transformed design matrix for linear fitting.
Returns the design matrix after applying instrument and preprocessing transforms. Note: Domain transform is not applied here as it may be nonlinear (KM).
- instrument_model: InstrumentModel
- preprocessing: PreprocessingOperator
- class nirs4all.synthesis.reconstruction.GenerationResult(X: ndarray, concentrations: ndarray, path_lengths: ndarray, baseline_coeffs: ndarray, wavelengths: ndarray, noise_level: float = 0.0, wl_shifts: ndarray | None = None, temperature_deltas: ndarray | None = None, water_activities: ndarray | None = None, scattering_powers: ndarray | None = None, scattering_amplitudes: ndarray | None = None)[source]
Bases: object
Result of synthetic generation.
- X
Generated spectra (n_samples, n_wavelengths).
- Type:
numpy.ndarray
- concentrations
Sampled concentrations (n_samples, n_components).
- Type:
numpy.ndarray
- path_lengths
Sampled path lengths (n_samples,).
- Type:
numpy.ndarray
- baseline_coeffs
Sampled baseline coefficients.
- Type:
numpy.ndarray
- wavelengths
Wavelength grid.
- Type:
numpy.ndarray
- wl_shifts
Per-sample wavelength shifts.
- Type:
numpy.ndarray | None
- temperature_deltas
Per-sample temperature deviations (°C).
- Type:
numpy.ndarray | None
- water_activities
Per-sample water activity values.
- Type:
numpy.ndarray | None
- scattering_powers
Per-sample scattering exponents.
- Type:
numpy.ndarray | None
- scattering_amplitudes
Per-sample scattering amplitudes.
- Type:
numpy.ndarray | None
- class nirs4all.synthesis.reconstruction.GlobalCalibrator(wl_shift_bounds: Tuple[float, float] = (-10.0, 10.0), wl_stretch_bounds: Tuple[float, float] = (0.98, 1.02), ils_sigma_bounds: Tuple[float, float] = (2.0, 20.0), regularization: float = 1e-06, use_global_search: bool = False)[source]
Bases: object
Calibrate global instrument parameters using prototype spectra.
Optimizes θ_global = {wl_shift, wl_stretch, ils_sigma} to minimize total fitting loss across all prototypes, with per-prototype linear parameters solved via NNLS.
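The split between shared nonlinear parameters and per-prototype linear parameters can be sketched in a deliberately reduced form. Here there is one global parameter (wl_shift), one component per prototype so that NNLS collapses to a clipped projection, and a grid search instead of a continuous optimizer. All names and values are hypothetical. The real calibrator optimizes three parameters with a multi-column NNLS solve.

```python
import math

def band(wavelengths, center, width=15.0):
    """Hypothetical single-component spectrum (one Gaussian band)."""
    return [math.exp(-0.5 * ((wl - center) / width) ** 2) for wl in wavelengths]

def nonneg_amplitude(basis, target):
    """One-column non-negative least squares: a projection clipped at zero."""
    num = sum(b * t for b, t in zip(basis, target))
    den = sum(b * b for b in basis)
    return max(0.0, num / den)

def total_loss(shift, wavelengths, prototypes):
    """Sum of squared residuals over prototypes after the inner linear solve."""
    basis = band(wavelengths, 1500.0 + shift)
    loss = 0.0
    for target in prototypes:
        amp = nonneg_amplitude(basis, target)
        loss += sum((amp * b - t) ** 2 for b, t in zip(basis, target))
    return loss

wavelengths = [1400.0 + 2.0 * i for i in range(101)]
true_shift = 4.0
prototypes = [
    [a * v for v in band(wavelengths, 1500.0 + true_shift)] for a in (0.5, 1.0, 2.0)
]

# Outer search: a coarse grid over the shared shift within (-10, 10) nm.
candidates = [-10.0 + 0.5 * i for i in range(41)]
best_shift = min(candidates, key=lambda s: total_loss(s, wavelengths, prototypes))
```

Each prototype gets its own amplitude, but all prototypes must agree on the shift, which is what makes the instrument parameters identifiable from a handful of spectra.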
- forward_chain
ForwardChain for computing model predictions.
- calibrate(prototypes: np.ndarray, forward_chain: ForwardChain, initial_guess: np.ndarray | None = None) CalibrationResult[source]
Calibrate global parameters on prototype spectra.
- Parameters:
prototypes – Prototype spectra (n_prototypes, n_wavelengths).
forward_chain – Forward chain for model evaluation.
initial_guess – Initial [wl_shift, wl_stretch, ils_sigma].
- Returns:
CalibrationResult with optimized parameters.
- refine(current_result: CalibrationResult, prototypes: np.ndarray, forward_chain: ForwardChain) CalibrationResult[source]
Refine calibration with tighter bounds around current estimate.
- Parameters:
current_result – Current calibration result.
prototypes – Prototype spectra.
forward_chain – Forward chain.
- Returns:
Refined CalibrationResult.
- class nirs4all.synthesis.reconstruction.InstrumentModel(target_grid: ndarray, wl_shift: float = 0.0, wl_stretch: float = 1.0, wl_poly_coeffs: ndarray | None = None, ils_sigma: float = 4.0, stray_light: float = 0.0, gain: float = 1.0, offset: float = 0.0)[source]
Bases: object
Instrument effects: warp, ILS convolution, gain/offset, resampling.
- Transforms spectrum from canonical grid to target instrument grid:
Wavelength warp: λ* → λ’ (shift + stretch + optional higher order)
ILS convolution: Gaussian or Voigt line shape
Stray light / gain / offset
Resample to target grid
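The four steps listed above can be sketched with the standard library alone. This is an illustration under stated assumptions: the stretch is applied about the center of the canonical grid, the ILS is a Gaussian on a uniformly spaced grid, stray light and higher-order warp terms are omitted, and every function name is hypothetical.

```python
import bisect
import math

def gaussian_kernel(sigma_nm, step_nm):
    """Normalized Gaussian line shape sampled at the canonical grid spacing."""
    half = int(math.ceil(4.0 * sigma_nm / step_nm))
    weights = [math.exp(-0.5 * (k * step_nm / sigma_nm) ** 2) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_same(signal, kernel):
    """Same-length convolution; edge samples are repeated past the borders."""
    half = len(kernel) // 2
    last = len(signal) - 1
    return [
        sum(w * signal[min(max(i + j - half, 0), last)] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]

def interpolate(x, xs, ys):
    """Linear interpolation on increasing xs, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_right(xs, x)
    t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + t * (ys[i] - ys[i - 1])

def apply_instrument(spectrum, canonical_grid, target_grid, wl_shift=0.0,
                     wl_stretch=1.0, ils_sigma=4.0, gain=1.0, offset=0.0):
    center = 0.5 * (canonical_grid[0] + canonical_grid[-1])
    # 1. Warp the wavelength axis (stretch about the grid center is an assumption).
    warped = [center + wl_stretch * (wl - center) + wl_shift for wl in canonical_grid]
    # 2. Blur with a Gaussian ILS (assumes a uniformly spaced canonical grid).
    step = canonical_grid[1] - canonical_grid[0]
    blurred = convolve_same(spectrum, gaussian_kernel(ils_sigma, step))
    # 3. Gain and offset (stray light is left out of this sketch).
    scaled = [gain * v + offset for v in blurred]
    # 4. Resample onto the target grid.
    return [interpolate(wl, warped, scaled) for wl in target_grid]

canonical_grid = [1000.0 + i for i in range(101)]
target_grid = [1020.0 + i for i in range(61)]
band = [math.exp(-0.5 * ((wl - 1050.0) / 10.0) ** 2) for wl in canonical_grid]
measured = apply_instrument(band, canonical_grid, target_grid, wl_shift=5.0)
peak_wavelength = target_grid[measured.index(max(measured))]
```

In this toy run a band centered at 1050 nm on the canonical grid appears at 1055 nm on the target grid, and the Gaussian blur lowers and widens it.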
- target_grid
Target wavelength grid (dataset grid).
- Type:
numpy.ndarray
- wl_poly_coeffs
Higher-order polynomial warp coefficients.
- Type:
numpy.ndarray | None
- apply(spectrum: ndarray, canonical_grid: ndarray) ndarray[source]
Apply instrument chain to transform spectrum.
- Parameters:
spectrum – Input spectrum on canonical grid.
canonical_grid – Canonical wavelength grid.
- Returns:
Transformed spectrum on target grid.
- classmethod from_params(target_grid: ndarray, params: Dict[str, float]) InstrumentModel[source]
Create InstrumentModel from parameter dictionary.
- get_jacobian_wrt_ils_sigma(spectrum: ndarray, canonical_grid: ndarray, eps: float = 0.1) ndarray[source]
Numerical Jacobian w.r.t. ILS sigma.
- class nirs4all.synthesis.reconstruction.InversionResult(concentrations: ndarray, baseline_coeffs: ndarray, continuum_coeffs: ndarray | None = None, path_length: float = 1.0, wl_shift_residual: float = 0.0, scatter_coeffs: ndarray | None = None, fitted_spectrum: ndarray | None = None, residuals: ndarray | None = None, r_squared: float = 0.0, rmse: float = inf, converged: bool = False, temperature_delta: float = 0.0, water_activity: float = 0.5, scattering_power: float = 1.5, scattering_amplitude: float = 0.0)[source]
Bases: object
Result of per-sample inversion.
- concentrations
Fitted component concentrations.
- Type:
numpy.ndarray
- baseline_coeffs
Fitted baseline coefficients.
- Type:
numpy.ndarray
- continuum_coeffs
Fitted continuum coefficients.
- Type:
numpy.ndarray | None
- scatter_coeffs
Fitted scatter coefficients (reflectance).
- Type:
numpy.ndarray | None
- fitted_spectrum
Reconstructed spectrum.
- Type:
numpy.ndarray | None
- residuals
Fitting residuals.
- Type:
numpy.ndarray | None
- class nirs4all.synthesis.reconstruction.MultiscaleSchedule(smooth_sigmas: List[float] = <factory>, derivative_weights: List[float] = <factory>, baseline_regularization: List[float] = <factory>, max_iterations: List[int] = <factory>)[source]
Bases: object
Configuration for multiscale fitting curriculum.
Fits coarse features first, then progressively adds detail:
1. Smooth target + no derivatives + strong baseline prior
2. Less smooth + partial derivative weight
3. Full resolution + full preprocessing
- classmethod quick() MultiscaleSchedule[source]
Quick schedule for fast fitting.
- classmethod thorough() MultiscaleSchedule[source]
Thorough schedule for best accuracy.
- class nirs4all.synthesis.reconstruction.ParameterDistributionFitter(positive_params: List[str] = <factory>, bounded_params: Dict[str, Tuple[float, float]] = <factory>, use_factor_model: bool = False, n_factors: int = 3, min_std: float = 1e-06)[source]
Bases: object
Fit distributions to parameter samples.
- For positive parameters (concentrations, path_length):
Use log-normal or gamma distributions
Transform to log space for correlation modeling
- For shift parameters (wl_shift):
Use Gaussian distributions
- For bounded parameters:
Use truncated normal or beta distributions
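The first two cases above can be sketched with moment estimates: a log-normal fit for a positive parameter (mean and standard deviation of the logs) and a Gaussian fit for a shift parameter. The dictionary layout and the use of min_std as a floor on the estimated spread are assumptions made for illustration, and the function names are hypothetical.

```python
import math
import statistics

def fit_lognormal(samples, min_std=1e-6):
    """Fit a log-normal by taking the mean and std of the log-samples."""
    logs = [math.log(s) for s in samples]
    return {
        "type": "lognormal",
        "mu": statistics.fmean(logs),
        "sigma": max(statistics.stdev(logs), min_std),
    }

def fit_gaussian(samples, min_std=1e-6):
    """Fit a Gaussian, flooring the std so constant parameters stay usable."""
    return {
        "type": "normal",
        "mean": statistics.fmean(samples),
        "std": max(statistics.stdev(samples), min_std),
    }

path_lengths = [0.9, 1.0, 1.1, 1.2, 0.8]   # positive parameter
fit = fit_lognormal(path_lengths)
constant_fit = fit_gaussian([0.0, 0.0, 0.0])  # degenerate shift parameter
```

Working in log space keeps every sampled value positive and makes the transformed values closer to Gaussian, which suits correlation modeling.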
- fit(params: Dict[str, ndarray], param_names: List[str] | None = None) DistributionResult[source]
Fit distributions to parameter samples.
- Parameters:
params – Dict of parameter arrays. Each array has shape (n_samples,) or (n_samples, n_features) for multi-dimensional params.
param_names – Optional list of parameter names to fit.
- Returns:
DistributionResult with fitted distributions.
- class nirs4all.synthesis.reconstruction.ParameterSampler(distribution_result: DistributionResult, use_correlations: bool = True)[source]
Bases: object
Sample parameters from fitted distributions.
Uses Gaussian copula to maintain correlations between parameters while respecting marginal distributions.
- distribution_result
Fitted DistributionResult.
- distribution_result: DistributionResult
- sample(n_samples: int, random_state: int | None = None) Dict[str, ndarray][source]
Sample parameters from fitted distributions.
- Parameters:
n_samples – Number of samples to generate.
random_state – Random seed.
- Returns:
Dict of parameter arrays with same structure as fit input.
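The Gaussian copula idea behind ParameterSampler can be sketched for two parameters: draw correlated standard normals, map them to uniforms with the normal CDF, then push each uniform through the inverse CDF of its own marginal. The marginals, the correlation value, and the function name below are hypothetical.

```python
import math
import random
from statistics import NormalDist

def sample_copula(n_samples, rho, seed=None):
    """Draw (path_length, wl_shift) pairs linked by a Gaussian copula."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    log_path = NormalDist(mu=0.0, sigma=0.1)  # path_length is log-normal (assumed)
    shift = NormalDist(mu=0.0, sigma=1.0)     # wl_shift is Gaussian (assumed)
    out = {"path_length": [], "wl_shift": []}
    for _ in range(n_samples):
        # Correlated standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Map to uniforms, clamped away from 0 and 1 so inv_cdf stays defined.
        u1 = min(max(std_normal.cdf(z1), 1e-12), 1.0 - 1e-12)
        u2 = min(max(std_normal.cdf(z2), 1e-12), 1.0 - 1e-12)
        # Map each uniform through its own marginal's inverse CDF.
        out["path_length"].append(math.exp(log_path.inv_cdf(u1)))
        out["wl_shift"].append(shift.inv_cdf(u2))
    return out

samples = sample_copula(2000, rho=0.8, seed=0)
```

The dependence structure lives entirely in the correlated normals, so each marginal can be swapped (for example to a beta or truncated normal) without touching the correlation model.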
- class nirs4all.synthesis.reconstruction.PipelineResult(config: DatasetConfig, calibration: 'CalibrationResult' | None = None, inversion_results: List['InversionResult'] | None = None, distribution: 'DistributionResult' | None = None, X_synthetic: np.ndarray | None = None, validation: 'ValidationResult' | None = None, forward_chain: 'ForwardChain' | None = None)
Bases: object
Result of reconstruction pipeline.
- Contains all outputs from the reconstruction workflow:
Calibration results
Inversion results
Learned distributions
Generated synthetic data
Validation metrics
- config
Dataset configuration used.
- Type:
DatasetConfig
- calibration
Global calibration result.
- Type:
Optional[‘CalibrationResult’]
- inversion_results
Per-sample inversion results.
- Type:
Optional[List[‘InversionResult’]]
- distribution
Learned parameter distributions.
- Type:
Optional[‘DistributionResult’]
- X_synthetic
Generated synthetic spectra.
- Type:
Optional[np.ndarray]
- validation
Validation result.
- Type:
Optional[‘ValidationResult’]
- forward_chain
Calibrated forward chain.
- Type:
Optional[‘ForwardChain’]
- config: DatasetConfig
- class nirs4all.synthesis.reconstruction.PreprocessingOperator(preprocessing_type: Literal['none', 'first_derivative', 'second_derivative', 'snv', 'msc', 'detrend', 'mean_centered'] = 'none', sg_window: int = 15, sg_polyorder: int = 2, sg_deriv: int = 0, reference_spectrum: ndarray | None = None)
Bases: object
Apply dataset preprocessing to match stored representation.
- Implements exact preprocessing steps:
Savitzky-Golay derivatives (1st, 2nd order)
SNV (Standard Normal Variate)
MSC (Multiplicative Scatter Correction)
Detrend
Mean centering
- preprocessing_type
Type of preprocessing.
- Type:
Literal[‘none’, ‘first_derivative’, ‘second_derivative’, ‘snv’, ‘msc’, ‘detrend’, ‘mean_centered’]
- reference_spectrum
Reference for MSC (mean of calibration set).
- Type:
numpy.ndarray | None
- apply(spectrum: ndarray) -> ndarray
Apply preprocessing to spectrum.
- Parameters:
spectrum – Input spectrum, shape (n_wavelengths,) or (n_samples, n_wavelengths).
- Returns:
Preprocessed spectrum or spectra.
- classmethod from_detection(preprocessing_type: str, sg_window: int = 15, sg_polyorder: int = 2) -> PreprocessingOperator
Create PreprocessingOperator from detected preprocessing type.
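For orientation, two of the preprocessing steps listed for this class, SNV and MSC, can be written in a few lines of NumPy. This is a generic sketch of the standard definitions, not the code behind PreprocessingOperator.apply(); the function names `snv` and `msc` and the toy spectra are assumptions.

```python
import numpy as np

def snv(X):
    """Standard Normal Variate: centre and scale each spectrum by its own mean and std."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

def msc(X, reference):
    """Multiplicative Scatter Correction against a reference spectrum."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    corrected = np.empty_like(X)
    for i, row in enumerate(X):
        # Fit row ~ offset + slope * reference, then undo both effects.
        slope, offset = np.polyfit(reference, row, deg=1)
        corrected[i] = (row - offset) / slope
    return corrected

# Toy spectra: the same band seen with different gains and offsets.
wl = np.linspace(1100.0, 2500.0, 200)
base = np.exp(-0.5 * ((wl - 1450.0) / 40.0) ** 2)
X = np.vstack([1.0 * base, 2.0 * base + 0.5, 0.5 * base - 0.1])

reference = X.mean(axis=0)  # mean of the calibration set, as described for reference_spectrum
X_snv = snv(X)
X_msc = msc(X, reference)
```

Because every toy spectrum is an affine function of the same band, MSC maps all of them onto the reference, which is the behaviour the correction is designed to produce.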
- class nirs4all.synthesis.reconstruction.PrototypeSelector(n_prototypes: int = 5, include_median: bool = True, include_quantiles: bool = True, pca_components: int = 5)
Bases: object
Select representative prototype spectra from a dataset.
Uses multiple strategies to ensure robust global calibration:
1. Median spectrum (robust central tendency)
2. Quantile spectra (25%, 75% in PC1)
3. K-medoids in PCA space (capture diversity)
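The first two strategies above can be sketched with NumPy alone. The function `select_prototypes_sketch` is hypothetical; in particular, taking the element-wise median and picking the real samples nearest to the PC1 quantiles are assumptions about what "median spectrum" and "quantile spectra" mean here. The k-medoids step is omitted.

```python
import numpy as np

def select_prototypes_sketch(X):
    """Return a median spectrum and the samples nearest the 25% / 75% PC1 quantiles."""
    X = np.asarray(X, dtype=float)
    # Strategy 1: element-wise median across samples.
    median_spectrum = np.median(X, axis=0)
    # Strategy 2: PC1 scores from the SVD of the mean-centred data.
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ vt[0]
    idx = [int(np.argmin(np.abs(scores - np.quantile(scores, q)))) for q in (0.25, 0.75)]
    return median_spectrum, X[idx], idx

# Toy dataset: one band with varying intensity plus a little noise.
rng = np.random.default_rng(0)
wl = np.linspace(1100.0, 2500.0, 120)
base = np.exp(-0.5 * ((wl - 1940.0) / 60.0) ** 2)
X = rng.uniform(0.5, 1.5, size=(50, 1)) * base + 0.01 * rng.standard_normal((50, 120))

median_spectrum, quantile_spectra, quantile_idx = select_prototypes_sketch(X)
```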
- class nirs4all.synthesis.reconstruction.ReconstructionGenerator(noise_level: float = 0.001, multiplicative_noise: float = 0.01, add_noise: bool = True, noise_type: str = 'both')
Bases: object
Generate synthetic spectra from learned parameter distributions.
Uses the calibrated forward chain and learned parameter distributions to generate realistic synthetic data that matches the statistical properties of the original dataset.
- forward_chain
Calibrated forward chain.
- sampler
Parameter sampler with learned distributions.
- noise_estimator
Estimated noise level from inversion residuals.
- generate(n_samples: int, forward_chain: ForwardChain, sampler: ParameterSampler, random_state: int | None = None) -> GenerationResult
Generate synthetic spectra.
- Parameters:
n_samples – Number of samples to generate.
forward_chain – Calibrated forward chain.
sampler – Parameter sampler.
random_state – Random seed.
- Returns:
GenerationResult with generated spectra and parameters.
- generate_matched(X_real: np.ndarray, forward_chain: ForwardChain, sampler: ParameterSampler, random_state: int | None = None) -> GenerationResult
Generate synthetic data matched to real data statistics.
Generates the same number of samples as the real data and optionally adjusts the noise level based on estimated residuals.
- Parameters:
X_real – Real data matrix for reference.
forward_chain – Calibrated forward chain.
sampler – Parameter sampler.
random_state – Random seed.
- Returns:
GenerationResult.
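The constructor exposes both an additive `noise_level` and a `multiplicative_noise` term. How the library combines them is not stated here, so the following is only one plausible reading: a per-spectrum gain perturbation plus independent per-wavelength additive noise. The function `add_noise_sketch` is hypothetical.

```python
import numpy as np

def add_noise_sketch(X_clean, noise_level=0.001, multiplicative_noise=0.01, random_state=None):
    """Apply multiplicative (per-spectrum) and additive (per-wavelength) Gaussian noise."""
    rng = np.random.default_rng(random_state)
    X_clean = np.asarray(X_clean, dtype=float)
    # Assumed model: one gain factor per spectrum.
    gain = 1.0 + multiplicative_noise * rng.standard_normal((X_clean.shape[0], 1))
    # Assumed model: independent additive noise at every wavelength.
    additive = noise_level * rng.standard_normal(X_clean.shape)
    return X_clean * gain + additive

X_clean = np.ones((5, 50))
X_noisy = add_noise_sketch(X_clean, random_state=0)
```

Passing the same `random_state` reproduces the same noise realisation, which mirrors the `random_state` argument of the generate methods.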
- class nirs4all.synthesis.reconstruction.ReconstructionPipeline(config: DatasetConfig, component_names: List[str] | None = None, canonical_resolution: float = 0.5, baseline_order: int = 5, continuum_order: int = 3, n_prototypes: int = 5, fit_environmental: bool = False, verbose: bool = True)
Bases: object
Complete reconstruction pipeline.
Orchestrates the full workflow:
1. Configuration and component selection
2. Prototype selection and global calibration
3. Per-sample inversion (optionally with environmental parameters)
4. Parameter distribution learning
5. Synthetic generation
6. Validation
- config
Dataset configuration.
- config: DatasetConfig
- fit(X: ndarray, max_samples: int | None = None) -> PipelineResult
Run full reconstruction pipeline.
- Parameters:
X – Spectra matrix (n_samples, n_wavelengths).
max_samples – Max samples to invert (for speed).
- Returns:
PipelineResult with all outputs.
- generate(n_samples: int, result: PipelineResult, random_state: int | None = None) -> ndarray
Generate additional synthetic samples using fitted pipeline.
- Parameters:
n_samples – Number of samples to generate.
result – PipelineResult from fit().
random_state – Random seed.
- Returns:
Synthetic spectra matrix.
- class nirs4all.synthesis.reconstruction.ReconstructionValidator(r2_threshold: float = 0.9, residual_autocorr_threshold: float = 0.3, pca_distance_threshold: float = 3.0, concentration_max: float = 10.0, path_length_bounds: Tuple[float, float] = (0.3, 3.0))
Bases: object
Validate reconstruction quality and synthetic realism.
Checks:
1. Residuals should be structureless (no systematic patterns)
2. Synthetic should match real in PCA space
3. Per-wavelength statistics should be similar
4. Parameters should be physically plausible
- validate(inversion_results: List['InversionResult'], X_real: np.ndarray, X_synth: np.ndarray) -> ValidationResult
Run full validation.
- Parameters:
inversion_results – Inversion results.
X_real – Real data.
X_synth – Synthetic data.
- Returns:
ValidationResult.
- validate_parameters(inversion_results: List['InversionResult']) -> Dict[str, Any]
Validate parameter plausibility.
- Parameters:
inversion_results – List of inversion results.
- Returns:
Dict of parameter metrics.
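One common way to test whether residuals are structureless is their lag-1 autocorrelation along the wavelength axis. The sketch below assumes that this is the kind of quantity `residual_autocorr_threshold` is compared against; the exact metric used by the validator is not documented here, and `lag1_autocorr` is a hypothetical helper.

```python
import numpy as np

def lag1_autocorr(residual):
    """Lag-1 autocorrelation of a residual vector (0.0 for a constant residual)."""
    r = np.asarray(residual, dtype=float)
    r = r - r.mean()
    denom = float(np.dot(r, r))
    return float(np.dot(r[:-1], r[1:]) / denom) if denom > 0 else 0.0

rng = np.random.default_rng(0)
white_residual = rng.standard_normal(500)                       # structureless noise
structured_residual = np.sin(np.linspace(0.0, 4.0 * np.pi, 500))  # an unmodelled smooth feature

threshold = 0.3  # default residual_autocorr_threshold from the signature above
white_ok = abs(lag1_autocorr(white_residual)) < threshold
structured_ok = abs(lag1_autocorr(structured_residual)) < threshold
```

White noise gives a value near zero, while a smooth unmodelled band gives a value near one, so a threshold between the two separates the cases.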
- class nirs4all.synthesis.reconstruction.ValidationResult(reconstruction_metrics: Dict[str, Any] = <factory>, synthetic_metrics: Dict[str, Any] = <factory>, parameter_metrics: Dict[str, Any] = <factory>, overall_score: float = 0.0, passed: bool = False, warnings: List[str] = <factory>)
Bases: object
Result of reconstruction validation.
- class nirs4all.synthesis.reconstruction.VariableProjectionSolver(path_length_bounds: Tuple[float, float] = (0.5, 2.0), wl_shift_bounds: Tuple[float, float] = (-2.0, 2.0), concentration_regularization: float = 1e-06, baseline_smoothness_penalty: float = 0.0001, use_derivatives: bool = False, verbose: bool = False, fit_environmental: bool = False, temperature_bounds: Tuple[float, float] = (-15.0, 15.0), water_activity_bounds: Tuple[float, float] = (0.1, 0.9), scattering_power_bounds: Tuple[float, float] = (0.5, 3.0), scattering_amplitude_bounds: Tuple[float, float] = (0.0, 0.2), environmental_prior_weight: float = 0.1)
Bases: object
Variable projection solver for spectral inversion.
- Separates optimization into:
Nonlinear params (outer loop): path_length, per-sample wl_shift, and optionally environmental parameters
Linear params (inner NNLS/QP): concentrations, baseline, continuum
- fit(target: np.ndarray, forward_chain: ForwardChain, schedule: MultiscaleSchedule | None = None, initial_params: Dict[str, float] | None = None) -> InversionResult
Fit forward model to target spectrum.
- Parameters:
target – Target spectrum to fit.
forward_chain – Forward chain with calibrated global params.
schedule – Multiscale fitting schedule.
initial_params – Initial nonlinear parameters.
- Returns:
InversionResult with fitted parameters.
- fit_batch(X: np.ndarray, forward_chain: ForwardChain, schedule: MultiscaleSchedule | None = None, n_jobs: int = 1) -> List[InversionResult]
Fit multiple spectra.
- Parameters:
X – Spectra matrix (n_samples, n_wavelengths).
forward_chain – Forward chain with calibrated global params.
schedule – Multiscale fitting schedule.
n_jobs – Number of parallel jobs (1 = sequential).
- Returns:
List of InversionResult for each sample.
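The split between nonlinear and linear parameters described for this class can be illustrated on a toy model with one Gaussian band. This is a simplified sketch, not the solver itself: a grid search stands in for the outer optimizer, unconstrained least squares stands in for the inner NNLS/QP step, and the names `band` and `varpro_fit_sketch` are hypothetical.

```python
import numpy as np

def band(wl, center, width=30.0):
    """Toy absorption band: a unit-height Gaussian."""
    return np.exp(-0.5 * ((wl - center) / width) ** 2)

def varpro_fit_sketch(wl, target, center=1450.0, wl_shift_bounds=(-2.0, 2.0), n_grid=81):
    """Outer search over wl_shift; inner linear solve for concentration and offset."""
    best = None
    for shift in np.linspace(*wl_shift_bounds, n_grid):
        # With the shift fixed, the model is linear in (concentration, offset).
        A = np.column_stack([band(wl, center + shift), np.ones_like(wl)])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        loss = float(np.sum((A @ coef - target) ** 2))
        if best is None or loss < best["loss"]:
            best = {
                "wl_shift": float(shift),
                "concentration": float(coef[0]),
                "offset": float(coef[1]),
                "loss": loss,
            }
    return best

# Noiseless toy target with a known shift, concentration and baseline offset.
wl = np.linspace(1300.0, 1600.0, 301)
target = 0.8 * band(wl, 1450.0 + 0.75) + 0.05
fit = varpro_fit_sketch(wl, target)
```

The outer loop only ever searches over one parameter, because the linear parameters are eliminated by the inner solve at every candidate shift. That reduction in search dimension is the main appeal of variable projection.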
- nirs4all.synthesis.reconstruction.reconstruct_and_generate(X: ndarray, wavelengths: ndarray, n_synthetic: int | None = None, domain: str = 'unknown', component_names: List[str] | None = None, fit_environmental: bool = False, verbose: bool = True) -> Tuple[ndarray, PipelineResult]
Convenience function for end-to-end reconstruction and generation.
- Parameters:
X – Real spectra matrix.
wavelengths – Wavelength grid.
n_synthetic – Number of synthetic samples (default: same as X).
domain – Application domain.
component_names – Components to use.
fit_environmental – Whether to fit environmental parameters (temperature, water activity, scattering).
verbose – Print progress.
- Returns:
Tuple of (X_synthetic, PipelineResult).
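A call to this function might look like the sketch below, which uses only the arguments listed in the signature above. The toy spectra, the nanometre wavelength grid, and the wrapper name `run_reconstruction` are assumptions. The import is deferred so that the snippet loads without nirs4all, and whether such a small synthetic dataset reconstructs well is not guaranteed.

```python
import numpy as np

# Toy stand-in for a real dataset: 40 spectra on an assumed 1100-2500 nm grid.
rng = np.random.default_rng(0)
wavelengths = np.linspace(1100.0, 2500.0, 200)
band_shape = np.exp(-0.5 * ((wavelengths - 1940.0) / 60.0) ** 2)
X = rng.uniform(0.5, 1.5, size=(40, 1)) * band_shape + 0.01 * rng.standard_normal((40, 200))

def run_reconstruction(X, wavelengths, n_synthetic=100):
    """Call the documented convenience function (requires nirs4all to be installed)."""
    from nirs4all.synthesis.reconstruction import reconstruct_and_generate

    X_synthetic, result = reconstruct_and_generate(
        X,
        wavelengths,
        n_synthetic=n_synthetic,
        fit_environmental=False,
        verbose=False,
    )
    return X_synthetic, result
```

In an environment with nirs4all installed, `X_synthetic, result = run_reconstruction(X, wavelengths)` returns the synthetic spectra together with the PipelineResult described earlier.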