nirs4all.controllers.data.concat_transform module

Concat Augmentation Controller.

This module provides the ConcatAugmentationController, which concatenates the outputs of multiple transformers horizontally. It can either:

  • REPLACE each processing with its concatenated version (top-level usage)

  • ADD a new processing holding the concatenated output (inside feature_augmentation)

class nirs4all.controllers.data.concat_transform.ConcatAugmentationController

Bases: OperatorController

Controller that concatenates multiple transformer outputs.

Semantics:

  • Top-level (add_feature=False): REPLACES each processing with its concatenated version

  • Inside feature_augmentation (add_feature=True): ADDS one new processing

Supports:

  • Single transformers: PCA(50)

  • Chained transformers: [Wavelet(), PCA(50)] → sequential application

  • Mixed: [PCA(50), [Wavelet(), SVD(30)], LocalStats()]
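The three supported forms can be illustrated with a minimal NumPy sketch. It is not the controller's implementation: first_k, center, local_stats, apply_branch and concat_transform are hypothetical stand-ins for fitted transformers and for the concatenation logic, chosen only to show how a bare transformer, a chain, and a mixed list might combine into one horizontally stacked output.

```python
import numpy as np

# Hypothetical stand-ins for fitted transformers (not nirs4all classes).
# Each maps (n_samples, n_in) -> (n_samples, n_out).
def first_k(k):
    """Keep the first k columns (placeholder for a reducer such as PCA(k))."""
    return lambda X: X[:, :k]

def center(X):
    """Subtract each sample's mean (placeholder for a same-width transform)."""
    return X - X.mean(axis=1, keepdims=True)

def local_stats(X):
    """Per-sample mean and standard deviation (placeholder for LocalStats())."""
    return np.column_stack([X.mean(axis=1), X.std(axis=1)])

def apply_branch(branch, X):
    """A bare transformer is applied once; a list is applied sequentially."""
    steps = branch if isinstance(branch, list) else [branch]
    for step in steps:
        X = step(X)
    return X

def concat_transform(branches, X):
    """Apply every branch to the same input and stack the outputs column-wise."""
    return np.hstack([apply_branch(b, X) for b in branches])

X = np.random.default_rng(0).normal(size=(500, 500))
spec = [first_k(50), [center, first_k(30)], local_stats]  # single, chain, single
out = concat_transform(spec, X)
print(out.shape)  # (500, 82): 50 + 30 + 2 columns
```

Every branch receives the same input, so the output width is the sum of the branch widths and the sample count is unchanged.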

Examples

Top-level replacement:

>>> pipeline = [{"concat_transform": [PCA(50), SVD(50)]}]
# Before: (500, 3, 500) with ["raw", "snv", "savgol"]
# After:  (500, 3, 100) with ["raw_concat_PCA_SVD", "snv_concat_PCA_SVD", …]

Nested inside feature_augmentation:

>>> pipeline = [{
...     "feature_augmentation": [
...         SNV(),
...         {"concat_transform": [PCA(50), SVD(50)]}
...     ]
... }]
# Before: (500, 1, 500) with ["raw"]
# After:  (500, 3, 500) with ["raw", "snv", "concat_PCA_SVD"] (padded)
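The shape changes in the two examples above can be reproduced with a small NumPy sketch. All names here (concat_branches, replace_mode, add_mode) are hypothetical, column slicing stands in for PCA(50) and SVD(50), and zero padding is an assumption: the docstring only says the added processing is "padded". The sketch also assumes the concatenated width does not exceed the existing feature width.

```python
import numpy as np

def concat_branches(X2d):
    """Hypothetical concat of two 50-column reductions (stand-ins for PCA(50), SVD(50))."""
    return np.hstack([X2d[:, :50], X2d[:, -50:]])

def replace_mode(X3d):
    """Top-level usage: every processing is replaced by its concatenated version."""
    return np.stack(
        [concat_branches(X3d[:, p, :]) for p in range(X3d.shape[1])], axis=1
    )

def add_mode(X3d, source_index=0):
    """Inside feature_augmentation: append one processing.

    Zero padding up to the existing width is an assumption of this sketch.
    """
    new = concat_branches(X3d[:, source_index, :])
    padded = np.zeros((X3d.shape[0], X3d.shape[2]), dtype=X3d.dtype)
    padded[:, : new.shape[1]] = new
    return np.concatenate([X3d, padded[:, None, :]], axis=1)

rng = np.random.default_rng(0)
three = rng.normal(size=(500, 3, 500))   # e.g. raw, snv, savgol
replaced = replace_mode(three)
print(replaced.shape)                    # (500, 3, 100)

two = rng.normal(size=(500, 2, 500))     # e.g. raw, snv
added = add_mode(two)
print(added.shape)                       # (500, 3, 500)
```

Replacement shrinks the feature axis for every processing, while addition keeps the existing width and grows the processing axis by one.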

execute(step_info: ParsedStep, dataset: SpectroDataset, context: ExecutionContext, runtime_context: RuntimeContext, source: int = -1, mode: str = 'train', loaded_binaries: List[Tuple[str, Any]] | None = None, prediction_store: Any | None = None) → Tuple[ExecutionContext, List[Tuple[str, bytes]]]

Execute concat augmentation.

Parameters:
  • step_info – Parsed step containing the concat_transform config

  • dataset – SpectroDataset to operate on

  • context – Execution context with selector and metadata

  • runtime_context – Runtime infrastructure (saver, step_number, etc.)

  • source – Source index (-1 for all sources)

  • mode – Execution mode (“train”, “predict”, “explain”)

  • loaded_binaries – Pre-fitted transformers for predict/explain mode

  • prediction_store – Not used by this controller

Returns:

Tuple of (updated_context, list_of_artifacts)
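The relationship between mode, loaded_binaries, and the returned artifacts can be sketched as follows. This is an illustration of the general fit-then-reuse pattern, not the controller's code: MeanCenter, run_step, the artifact name "MeanCenter_0", and the JSON serialization are all hypothetical, and the real artifact format is not specified in this documentation.

```python
import json

class MeanCenter:
    """Hypothetical transformer with a fit/transform interface."""

    def fit(self, rows):
        n = len(rows)
        self.means_ = [sum(col) / n for col in zip(*rows)]
        return self

    def transform(self, rows):
        return [[v - m for v, m in zip(row, self.means_)] for row in rows]

    def to_bytes(self):
        # JSON is used only for illustration; the real format is an unknown.
        return json.dumps(self.means_).encode("utf-8")

    @classmethod
    def from_bytes(cls, blob):
        obj = cls()
        obj.means_ = json.loads(blob.decode("utf-8"))
        return obj

def run_step(rows, mode="train", loaded_binaries=None):
    """Fit and serialize in train mode; reuse pre-fitted objects otherwise."""
    artifacts = []  # list of (name, bytes) pairs, mirroring the return type
    if mode == "train":
        op = MeanCenter().fit(rows)
        artifacts.append(("MeanCenter_0", op.to_bytes()))
    else:
        fitted = dict(loaded_binaries or [])  # list of (name, object) pairs
        op = fitted["MeanCenter_0"]
    return op.transform(rows), artifacts

train_out, artifacts = run_step([[1.0, 2.0], [3.0, 6.0]], mode="train")
loaded = [(name, MeanCenter.from_bytes(blob)) for name, blob in artifacts]
pred_out, pred_artifacts = run_step([[2.0, 4.0]], mode="predict", loaded_binaries=loaded)
print(pred_out)        # [[0.0, 0.0]]
print(pred_artifacts)  # []
```

In this sketch only train mode produces artifacts; predict mode applies the statistics learned during training and never refits.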

classmethod matches(step: Any, operator: Any, keyword: str) → bool

Check if step is a concat_transform operation.

static normalize_generator_spec(spec: Any) → Any

Normalize generator spec for concat_transform context.

In concat_transform context, multi-selection should use combinations by default since the order of concatenated features doesn’t matter. Translates legacy ‘size’ to ‘pick’ for explicit semantics.

Parameters:

spec – Generator specification (may contain _or_, size, pick, arrange).

Returns:

Normalized spec with ‘size’ converted to ‘pick’ if needed.
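The normalization described above might look like the following sketch. normalize_spec_sketch is a hypothetical re-implementation written from the docstring alone, and the reading of 'pick' as an order-independent selection (combinations rather than permutations) is an inference from that docstring, not confirmed behavior.

```python
from itertools import combinations, permutations

def normalize_spec_sketch(spec):
    """Hypothetical: rename legacy 'size' to 'pick' in an _or_ spec."""
    if not isinstance(spec, dict) or "_or_" not in spec:
        return spec                  # nothing to normalize
    if "size" in spec and "pick" not in spec:
        spec = dict(spec)            # copy so the caller's dict is untouched
        spec["pick"] = spec.pop("size")
    return spec

legacy = {"_or_": ["PCA", "SVD", "Wavelet"], "size": 2}
spec = normalize_spec_sketch(legacy)
print(spec)  # {'_or_': ['PCA', 'SVD', 'Wavelet'], 'pick': 2}

# Why combinations are the natural default when order does not matter:
n_combinations = len(list(combinations(spec["_or_"], spec["pick"])))
n_permutations = len(list(permutations(spec["_or_"], spec["pick"])))
print(n_combinations, n_permutations)  # 3 6
```

With three candidates and two picks, permutations would generate each pair twice, in both orders. That doubles the number of pipeline variants without changing which features are concatenated.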

priority: int = 10

classmethod supports_prediction_mode() → bool

Supports prediction mode for applying saved transformers.

classmethod use_multi_source() → bool

Supports multi-source datasets.