nirs4all.controllers.data.feature_augmentation module

class nirs4all.controllers.data.feature_augmentation.FeatureAugmentationController[source]

Bases: OperatorController

Controller for feature augmentation with multiple action modes.

The feature_augmentation controller supports three action modes that control how preprocessing operations interact with existing processings:

  • extend (default): Add new processings to the set. Each operation runs independently on the base processing. If a processing already exists, it is not duplicated. Growth pattern is linear.

  • add: Chain each operation on top of ALL existing processings. Keep original processings alongside new chained versions. Growth pattern is multiplicative with originals (n + n×m).

  • replace: Chain each operation on top of ALL existing processings. Discard original processings, keeping only the chained versions. Growth pattern is multiplicative without originals (n×m).
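The three growth patterns above can be sketched with plain lists of processing names. This is only an illustration, not the controller's implementation: `simulate_action` and its `base` parameter are hypothetical names, and the `<processing>_<operation>` naming scheme is assumed from the example in this docstring.

```python
from typing import List


def simulate_action(
    existing: List[str],
    operations: List[str],
    action: str = "extend",
    base: str = "raw",  # assumption: the base processing is named "raw"
) -> List[str]:
    """Return the processing names after one step (illustrative only)."""
    if action == "extend":
        # Each operation runs on the base processing; existing names are kept
        # and a name that is already present is not duplicated.
        result = list(existing)
        for op in operations:
            name = f"{base}_{op}"
            if name not in result:
                result.append(name)
        return result

    # "add" and "replace" both chain every operation on every existing processing.
    chained = [f"{proc}_{op}" for proc in existing for op in operations]
    if action == "add":
        return list(existing) + chained  # n + n*m names
    if action == "replace":
        return chained  # n*m names, originals discarded
    raise ValueError(f"Unknown action mode: {action!r}")
```

With `existing=["raw_A"]` and `operations=["SNV", "Gaussian"]`, the sketch returns the same name sets as the example in this docstring. With two existing processings and three operations, `add` yields 2 + 2×3 = 8 names and `replace` yields 2×3 = 6.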

Example

>>> # Extend mode (default) - linear growth
>>> {"feature_augmentation": [SNV, Gaussian], "action": "extend"}
>>> # With raw_A already present: raw_A, raw_SNV, raw_Gaussian
>>> # Add mode - multiplicative with originals
>>> {"feature_augmentation": [SNV, Gaussian], "action": "add"}
>>> # With raw_A present: raw_A, raw_A_SNV, raw_A_Gaussian
>>> # Replace mode - multiplicative, discards originals
>>> {"feature_augmentation": [SNV, Gaussian], "action": "replace"}
>>> # With raw_A present: raw_A_SNV, raw_A_Gaussian (raw_A discarded)

execute(step_info: ParsedStep, dataset: SpectroDataset, context: ExecutionContext, runtime_context: RuntimeContext, source: int = -1, mode: str = 'train', loaded_binaries: List[Tuple[str, Any]] | None = None, prediction_store: Any | None = None) → Tuple[ExecutionContext, List[Tuple[str, bytes]]][source]

Execute feature augmentation with specified action mode.

Parameters:
  • step_info – Parsed step information containing the operation list and action mode.

  • dataset – The spectroscopic dataset to process.

  • context – Current execution context with processing state.

  • runtime_context – Runtime infrastructure for step execution.

  • source – Source index (-1 for all sources).

  • mode – Execution mode (“train”, “predict”, etc.).

  • loaded_binaries – Pre-loaded binary artifacts for prediction mode.

  • prediction_store – Store for prediction-time state.

Returns:

Tuple of (updated_context, artifacts_list).

Raises:

ValueError – If action mode is invalid.
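
The default action and the documented `ValueError` can be sketched as follows, assuming the step is a plain dict of the form shown in the class example. `read_action` and `VALID_ACTIONS` are hypothetical names used for illustration; the real method receives a `ParsedStep`, whose structure is not described here.

```python
from typing import Any, Dict

# The three action modes listed in the class description.
VALID_ACTIONS = ("extend", "add", "replace")


def read_action(step: Dict[str, Any]) -> str:
    """Return the action mode of a step dict, defaulting to "extend"."""
    action = step.get("action", "extend")
    if action not in VALID_ACTIONS:
        # Mirrors the documented behaviour: an invalid mode raises ValueError.
        raise ValueError(
            f"Invalid action {action!r}; expected one of {VALID_ACTIONS}"
        )
    return action
```

A step without an `"action"` key resolves to `"extend"` in this sketch, matching the documented default.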

classmethod matches(step: Any, operator: Any, keyword: str) → bool[source]

Check if the operator matches the step and keyword.

static normalize_generator_spec(spec: Any) → Any[source]

Normalize generator spec for feature_augmentation context.

In feature_augmentation context, multi-selection should use combinations by default since the order of parallel feature channels doesn’t matter. Translates legacy ‘size’ to ‘pick’ for explicit semantics.

Parameters:

spec – Generator specification (may contain _or_, size, pick, arrange).

Returns:

Normalized spec with ‘size’ converted to ‘pick’ if needed.
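
A minimal sketch of the translation described above, assuming the spec is a dict whose `_or_` key holds a list of candidate operations. `normalize_spec_sketch` is a hypothetical stand-in, not the library function, and leaving a spec unchanged when it already has `pick` is an assumption. The second half uses `itertools` to show why combinations suit unordered feature channels.

```python
from itertools import combinations, permutations
from typing import Any


def normalize_spec_sketch(spec: Any) -> Any:
    """Rename a legacy 'size' key to 'pick'; leave other specs unchanged."""
    if isinstance(spec, dict) and "size" in spec and "pick" not in spec:
        normalized = {key: value for key, value in spec.items() if key != "size"}
        normalized["pick"] = spec["size"]
        return normalized
    return spec


# "Detrend" is a placeholder third operation name for the illustration.
options = ["SNV", "Gaussian", "Detrend"]
spec = normalize_spec_sketch({"_or_": options, "size": 2})

# Order-insensitive selection: (SNV, Gaussian) and (Gaussian, SNV) are one choice.
unordered = list(combinations(spec["_or_"], spec["pick"]))
# Order-sensitive selection counts both orderings separately.
ordered = list(permutations(spec["_or_"], spec["pick"]))

print(len(unordered), len(ordered))  # 3 6
```

Picking 2 of 3 operations gives 3 combinations but 6 permutations, so treating parallel channels as unordered halves the number of generated variants in this case.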

priority: int = 10

classmethod supports_prediction_mode() → bool[source]

Feature augmentation should NOT execute during prediction mode, because the transformations are already applied and saved.

classmethod use_multi_source() → bool[source]

Check if the operator supports multi-source datasets.