nirs4all.controllers.data.sample_augmentation module

class nirs4all.controllers.data.sample_augmentation.SampleAugmentationController

Bases: OperatorController

Sample Augmentation Controller with delegation pattern.

This controller orchestrates sample augmentation by:

1. Calculating augmentation distribution (standard or balanced mode)
2. Creating transformer→samples mapping
3. Emitting ONE run_step per transformer with target samples

The actual augmentation work is delegated to TransformerMixinController.

execute(step_info: ParsedStep, dataset: SpectroDataset, context: ExecutionContext, runtime_context: RuntimeContext, source: int = -1, mode: str = 'train', loaded_binaries: Any | None = None, prediction_store: Any | None = None) -> Tuple[ExecutionContext, List]

Execute sample augmentation in standard or balanced mode.

Step format for standard mode:

    {
        "sample_augmentation": {
            "transformers": [transformer1, transformer2, ...],
            "count": int,
            "selection": "random" or "all",  # Default "random"
            "random_state": int  # Optional
        }
    }
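The standard-mode keys above can be illustrated with a minimal sketch of how a transformer→samples mapping might be built. This is not the controller's actual code: `build_transformer_map` is a hypothetical helper, and the semantics of `count` (augmentations per sample) and of `selection` ("random" draws one transformer per augmentation, "all" cycles through the transformers) are assumptions that this page does not confirm.

```python
import random
from collections import defaultdict


def build_transformer_map(sample_ids, n_transformers, count,
                          selection="random", random_state=None):
    """Map each transformer index to the sample ids it should augment.

    Assumed semantics (hypothetical, not taken from nirs4all):
    - every sample receives `count` augmentations;
    - "random" draws a transformer independently for each augmentation;
    - "all" cycles through the transformers in order.
    """
    rng = random.Random(random_state)  # local RNG keeps the result reproducible
    mapping = defaultdict(list)
    for sample_id in sample_ids:
        for k in range(count):
            if selection == "random":
                t = rng.randrange(n_transformers)
            elif selection == "all":
                t = k % n_transformers
            else:
                raise ValueError(f"unknown selection: {selection!r}")
            mapping[t].append(sample_id)
    return dict(mapping)


mapping = build_transformer_map([0, 1, 2], n_transformers=2, count=2,
                                selection="all")
print(mapping)  # {0: [0, 1, 2], 1: [0, 1, 2]}
```

Under these assumptions, one run_step would then be emitted per key of `mapping`, each carrying the sample ids listed for that transformer.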

Step format for balanced mode (choose one balancing strategy):

Mode 1 - Fixed target size per class:

    {
        "sample_augmentation": {
            "transformers": [...],
            "balance": "y" or "metadata_column",  # Default "y"
            "target_size": int,  # Fixed target samples per class
            "selection": "random" or "all",
            "random_state": int
        }
    }

Mode 2 - Multiplier for augmentation:

    {
        "sample_augmentation": {
            "transformers": [...],
            "balance": "y" or "metadata_column",
            "max_factor": float,  # Multiplier (e.g., 3 means class grows 3x)
            "selection": "random" or "all",
            "random_state": int
        }
    }

Mode 3 - Percentage of majority class:

    {
        "sample_augmentation": {
            "transformers": [...],
            "balance": "y" or "metadata_column",
            "ref_percentage": float,  # Target as % of majority (0.0-1.0)
            "selection": "random" or "all",
            "random_state": int
        }
    }
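To make the three balancing strategies concrete, here is a small sketch of how the number of synthetic samples per class could be derived. `augmentations_per_class` is a hypothetical helper, not part of nirs4all. It assumes that exactly one strategy is set, that classes already at or above their target receive nothing, and that `max_factor` growth is capped at the majority class size. None of these details is confirmed by this page.

```python
from collections import Counter


def augmentations_per_class(labels, target_size=None, max_factor=None,
                            ref_percentage=None):
    """Return {class: number of synthetic samples to add} (illustrative only)."""
    chosen = [v for v in (target_size, max_factor, ref_percentage)
              if v is not None]
    if len(chosen) != 1:
        raise ValueError(
            "set exactly one of target_size, max_factor, ref_percentage")

    counts = Counter(labels)
    majority = max(counts.values())
    needed = {}
    for cls, n in counts.items():
        if target_size is not None:
            target = target_size                         # Mode 1: fixed size
        elif max_factor is not None:
            # Mode 2: grow by max_factor; the majority cap is an assumption
            target = min(int(n * max_factor), majority)
        else:
            target = int(majority * ref_percentage)      # Mode 3: share of majority
        needed[cls] = max(0, target - n)                 # never remove samples
    return needed


labels = ["a"] * 10 + ["b"] * 4 + ["c"] * 2
print(augmentations_per_class(labels, target_size=8))       # {'a': 0, 'b': 4, 'c': 6}
print(augmentations_per_class(labels, max_factor=3))        # {'a': 0, 'b': 6, 'c': 4}
print(augmentations_per_class(labels, ref_percentage=0.5))  # {'a': 0, 'b': 1, 'c': 3}
```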

Binning for regression (automatic when balance="y" and task is regression):

    {
        "sample_augmentation": {
            "transformers": [...],
            "balance": "y",
            "bins": int,  # Number of virtual classes (default: 10)
            "binning_strategy": "equal_width" or "quantile",  # Default: "equal_width"
            "max_factor": float,  # Choose one balancing mode
            "selection": "random" or "all",
            "random_state": int
        }
    }
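The difference between the two binning strategies can be shown with a standard-library sketch that turns continuous targets into virtual class indices. `bin_targets` is a hypothetical helper written for illustration. How nirs4all computes bin edges and handles values that fall exactly on an edge is not stated here, so those choices are assumptions.

```python
from bisect import bisect_right
from statistics import quantiles


def bin_targets(y, bins=10, binning_strategy="equal_width"):
    """Assign each continuous target a virtual class index in [0, bins - 1]."""
    if binning_strategy == "equal_width":
        lo, hi = min(y), max(y)
        width = (hi - lo) / bins
        # interior edges only; values beyond the last edge land in the last bin
        edges = [lo + i * width for i in range(1, bins)]
    elif binning_strategy == "quantile":
        # bins - 1 cut points that split the data into equally populated groups
        edges = quantiles(y, n=bins, method="inclusive")
    else:
        raise ValueError(f"unknown binning_strategy: {binning_strategy!r}")
    return [bisect_right(edges, value) for value in y]


y = [0.0, 1.0, 2.0, 3.0, 10.0]
print(bin_targets(y, bins=2, binning_strategy="equal_width"))  # [0, 0, 0, 0, 1]
print(bin_targets(y, bins=2, binning_strategy="quantile"))     # [0, 0, 1, 1, 1]
```

With a skewed target such as this one, equal-width bins leave one virtual class nearly empty, while quantile bins keep the classes closer in size. The resulting indices can then be balanced like ordinary class labels.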

classmethod matches(step: Any, operator: Any, keyword: str) -> bool

Check if the operator matches the step and keyword.

static normalize_generator_spec(spec: Any) -> Any

Normalize generator spec for sample_augmentation context.

In sample_augmentation context, multi-selection should use combinations by default since the order of transformers doesn’t matter. Translates legacy ‘size’ to ‘pick’ for explicit semantics.

Parameters:

spec – Generator specification (may contain _or_, size, pick, arrange).

Returns:

Normalized spec with ‘size’ converted to ‘pick’ if needed.
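
As a rough illustration of the translation described above, the sketch below renames a legacy 'size' key to 'pick' and contrasts order-insensitive with order-sensitive selection. `normalize_spec_sketch` is a hypothetical stand-in, not the real method. It assumes that 'pick' corresponds to combinations and 'arrange' to permutations, and the transformer names are placeholders.

```python
from itertools import combinations, permutations


def normalize_spec_sketch(spec):
    """Hypothetical stand-in: rename legacy 'size' to 'pick' when unambiguous."""
    if not isinstance(spec, dict):
        return spec  # non-dict specs pass through unchanged
    if "size" in spec and "pick" not in spec and "arrange" not in spec:
        normalized = {k: v for k, v in spec.items() if k != "size"}
        normalized["pick"] = spec["size"]
        return normalized
    return spec


# Placeholder transformer names, for illustration only.
spec = {"_or_": ["noise", "shift", "scale"], "size": 2}
normalized = normalize_spec_sketch(spec)
print(normalized)  # {'_or_': ['noise', 'shift', 'scale'], 'pick': 2}

# Assumed meaning of the two keys: 'pick' ignores order, 'arrange' keeps it.
picked = list(combinations(normalized["_or_"], normalized["pick"]))
arranged = list(permutations(normalized["_or_"], 2))
print(len(picked), len(arranged))  # 3 6
```

Because applying the same set of transformers in a different order yields the same augmentation plan, combinations avoid generating duplicate pipeline variants.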

priority: int = 10

classmethod supports_prediction_mode() -> bool

Sample augmentation only runs during training.

classmethod use_multi_source() -> bool

Check if the operator supports multi-source datasets.