nirs4all.controllers.data.feature_augmentation module

class nirs4all.controllers.data.feature_augmentation.FeatureAugmentationController

Bases: OperatorController

Controller for feature augmentation with multiple action modes.

The feature_augmentation controller supports three action modes that control how preprocessing operations interact with existing processings:

extend (default): Add new processings to the set. Each operation runs independently on the base processing; a processing that already exists is not duplicated. Growth is linear: n existing processings and m operations yield at most n + m.
add: Chain each operation on top of ALL existing processings and keep the originals alongside the new chained versions. Growth is multiplicative with originals (n + n×m).
replace: Chain each operation on top of ALL existing processings and discard the originals, keeping only the chained versions. Growth is multiplicative without originals (n×m).

Example

>>> # Extend mode (default) - linear growth
>>> {"feature_augmentation": [SNV, Gaussian], "action": "extend"}
>>> # With raw_A already present: raw_A, raw_SNV, raw_Gaussian

>>> # Add mode - multiplicative with originals
>>> {"feature_augmentation": [SNV, Gaussian], "action": "add"}
>>> # With raw_A present: raw_A, raw_A_SNV, raw_A_Gaussian

>>> # Replace mode - multiplicative, discards originals
>>> {"feature_augmentation": [SNV, Gaussian], "action": "replace"}
>>> # With raw_A present: raw_A_SNV, raw_A_Gaussian (raw_A discarded)
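
The three growth patterns can be sketched with plain strings. This is only an illustration of the naming and counting described above, not the controller's implementation: the helper simulate_processings, its base argument, and the underscore-joined names are assumptions made for the example.

```python
from typing import List


def simulate_processings(existing: List[str], operations: List[str],
                         action: str = "extend", base: str = "raw") -> List[str]:
    """Return the processing names expected after one step (illustrative only)."""
    if action == "extend":
        # Each operation runs on the base processing; existing names are not duplicated.
        result = list(existing)
        for op in operations:
            name = f"{base}_{op}"
            if name not in result:
                result.append(name)
        return result
    if action in ("add", "replace"):
        # Each operation is chained on top of every existing processing (n x m names).
        chained = [f"{proc}_{op}" for proc in existing for op in operations]
        # "add" keeps the originals, "replace" discards them.
        return list(existing) + chained if action == "add" else chained
    # Mirrors the documented behaviour of rejecting unknown action modes.
    raise ValueError(f"Invalid action mode: {action!r}")


ops = ["SNV", "Gaussian"]
for action in ("extend", "add", "replace"):
    print(action, simulate_processings(["raw_A"], ops, action))
```

With two existing processings and two operations, the sketch gives 2 + 2×2 = 6 names for add and 2×2 = 4 names for replace, which is where the multiplicative modes start to diverge from extend.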

execute(step_info: ParsedStep, dataset: SpectroDataset, context: ExecutionContext, runtime_context: RuntimeContext, source: int = -1, mode: str = 'train', loaded_binaries: List[Tuple[str, Any]] | None = None, prediction_store: Any | None = None) → Tuple[ExecutionContext, List[Tuple[str, bytes]]]

Execute feature augmentation with specified action mode.

Parameters:

step_info – Parsed step information containing the operation list and action mode.
dataset – The spectroscopic dataset to process.
context – Current execution context with processing state.
runtime_context – Runtime infrastructure for step execution.
source – Source index (-1 for all sources).
mode – Execution mode (“train”, “predict”, etc.).
loaded_binaries – Pre-loaded binary artifacts for prediction mode.
prediction_store – Store for prediction-time state.

Returns:

Tuple of (updated_context, artifacts_list).

Raises:

ValueError – If action mode is invalid.

classmethod matches(step: Any, operator: Any, keyword: str) → bool

Check if the operator matches the step and keyword.

static normalize_generator_spec(spec: Any) → Any

Normalize generator spec for feature_augmentation context.

In feature_augmentation context, multi-selection should use combinations by default since the order of parallel feature channels doesn’t matter. Translates legacy ‘size’ to ‘pick’ for explicit semantics.

Parameters:

spec – Generator specification (may contain _or_, size, pick, arrange).

Returns:

Normalized spec with ‘size’ converted to ‘pick’ if needed.
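
A rough sketch of the translation described above, under stated assumptions: the function name normalize_spec_sketch is hypothetical, and the conditions under which the real method rewrites a spec (here: a dict with _or_ and size but no pick or arrange) are guesses made for illustration. The second part uses the standard library to show why combinations suit parallel feature channels; "Detrend" is only a placeholder operation name.

```python
from itertools import combinations, permutations
from typing import Any


def normalize_spec_sketch(spec: Any) -> Any:
    """Rename a legacy 'size' key to 'pick' in a generator spec (illustrative only)."""
    # Non-dict specs, or dicts without an "_or_" choice list, pass through unchanged.
    if not isinstance(spec, dict) or "_or_" not in spec:
        return spec
    # Nothing to translate if 'size' is absent or the selection is already explicit.
    if "size" not in spec or "pick" in spec or "arrange" in spec:
        return spec
    # Copy the spec so the caller's dict is not mutated.
    normalized = {key: value for key, value in spec.items() if key != "size"}
    normalized["pick"] = spec["size"]
    return normalized


choices = ["SNV", "Gaussian", "Detrend"]
print(normalize_spec_sketch({"_or_": choices, "size": 2}))

# Order-insensitive selection: (SNV, Gaussian) and (Gaussian, SNV) describe
# the same set of parallel channels, so combinations avoid redundant variants.
print(len(list(combinations(choices, 2))), "combinations")
print(len(list(permutations(choices, 2))), "permutations")
```

Choosing 2 of 3 operations gives 3 combinations but 6 ordered arrangements, so treating the selection as a combination halves the number of generated variants in this case.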

priority: int = 10

classmethod supports_prediction_mode() → bool

Feature augmentation should NOT execute during prediction mode, because the transformations are already applied and saved.

classmethod use_multi_source() → bool

Check if the operator supports multi-source datasets.