nirs4all.data.targets module

Target data management with processing chains.

class nirs4all.data.targets.Targets[source]

Bases: object

Target manager that stores target arrays with processing chains.

Manages multiple versions of target data (raw, numeric, scaled, etc.) with processing ancestry tracking and transformation capabilities. Delegates specialized operations to helper components for better maintainability.

num_samples

Number of samples in target data

Type:

int

num_targets

Number of target variables

Type:

int

num_classes

Number of unique classes (for classification tasks)

Type:

int

num_processings

Number of processing versions

Type:

int

processing_ids

Names of available processings

Type:

list of str

Examples

>>> import numpy as np
>>> from nirs4all.data.targets import Targets
>>> targets = Targets()
>>> targets.add_targets(np.array([1, 2, 3, 1, 2]))
>>> targets.num_samples
5
>>> targets.num_classes
3
>>> # Add scaled version
>>> from sklearn.preprocessing import StandardScaler
>>> scaler = StandardScaler()
>>> scaled_data = scaler.fit_transform(targets.get_targets('numeric'))
>>> targets.add_processed_targets('scaled', scaled_data, 'numeric', scaler)
>>> # Transform predictions back to numeric space
>>> # (`model` is assumed to be fitted on the scaled targets; `X_test` is held-out data)
>>> predictions = model.predict(X_test)
>>> numeric_preds = targets.transform_predictions(
...     predictions, 'scaled', 'numeric'
... )

See also

ProcessingChain: Manages processing ancestry

NumericConverter: Converts raw data to numeric

TargetTransformer: Transforms predictions between states

__repr__() str[source]

Return unambiguous string representation.

Returns:

String showing samples, targets, and processings

Return type:

str

__str__() str[source]

Return readable string representation with statistics.

Returns:

Multi-line string with processing statistics

Return type:

str

Notes:
  • Skips ‘raw’ processing in display

  • Shows min/max/mean for numeric processings

  • Computed statistics are not cached

add_processed_targets(processing_name: str, targets: ndarray | List | tuple, ancestor: str = 'numeric', transformer: TransformerMixin | None = None, mode: str = 'train', labelizer: bool = True) None[source]

Add processed version of target data.

Parameters:
  • processing_name (str) – Unique name for this processing

  • targets (array-like) – Processed target data (same number of samples)

  • ancestor (str, optional) – Source processing name. Defaults to ‘numeric’.

  • transformer (TransformerMixin, optional) – Transformer used to create this processing

  • mode (str, optional) – Mode for validation (‘train’ enforces shape checks). Defaults to ‘train’.

  • labelizer (bool, optional) – Legacy parameter (currently unused). Defaults to True.

Raises:
  • ValueError – If processing_name already exists

  • ValueError – If ancestor doesn’t exist

  • ValueError – If shape doesn’t match existing data (in train mode)

Examples:

>>> from sklearn.preprocessing import StandardScaler
>>> scaler = StandardScaler()
>>> scaled = scaler.fit_transform(targets.get_targets('numeric'))
>>> targets.add_processed_targets('scaled', scaled, 'numeric', scaler)

add_targets(targets: ndarray | List | tuple) None[source]

Add target samples. Can be called multiple times to append.

Automatically creates ‘raw’ and ‘numeric’ processings on first call. Subsequent calls append to existing data.

Parameters:

targets (array-like) – Target data as 1D (single target) or 2D (multiple targets)

Raises:
  • ValueError – If processings beyond ‘raw’ and ‘numeric’ exist

  • ValueError – If target dimensions don’t match existing data

Notes:
  • First call: creates ‘raw’ and ‘numeric’ processings

  • Subsequent calls: appends to existing arrays

  • Invalidates statistics cache

Examples:

>>> targets = Targets()
>>> targets.add_targets([1, 2, 3])
>>> targets.num_samples
3
>>> targets.add_targets([4, 5])
>>> targets.num_samples
5
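The append behaviour described above can be illustrated with a small dependency-free sketch. It is not the nirs4all implementation: `as_rows` and `append_targets` are hypothetical helpers, plain lists stand in for arrays, and storing 1D input as a single column is an assumption inferred from the (n_samples, n_targets) shape documented for get_targets. Only the dimension check is modelled.

```python
def as_rows(values):
    """Turn 1D input into single-element rows; copy 2D input row by row."""
    rows = []
    for value in values:
        if isinstance(value, (list, tuple)):
            rows.append(list(value))   # already one row of a 2D input
        else:
            rows.append([value])       # 1D input: one target per sample
    return rows


def append_targets(existing, new_values):
    """Return the existing rows followed by the new rows (hypothetical helper)."""
    new_rows = as_rows(new_values)
    # Reject input whose number of target columns differs from stored data.
    if existing and new_rows and len(existing[0]) != len(new_rows[0]):
        raise ValueError("target dimensions don't match existing data")
    return existing + new_rows


stored = append_targets([], [1, 2, 3])    # first call creates the data
stored = append_targets(stored, [4, 5])   # second call appends to it
print(len(stored))  # 5
```

The sample count after the second call matches the num_samples value shown in the example above.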

get_processing_ancestry(processing: str) List[str][source]

Get the full ancestry chain for a processing.

Parameters:

processing (str) – Processing name

Returns:

Processing names from root to specified processing

Return type:

list of str

Raises:

ValueError – If processing doesn’t exist

Examples:

>>> targets.get_processing_ancestry('scaled')
['raw', 'numeric', 'scaled']
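One way to picture a root-to-leaf ancestry chain is a walk over parent links. The sketch below is an illustration only: the `parents` mapping and the `ancestry` function are hypothetical and say nothing about how ProcessingChain stores its data internally.

```python
def ancestry(parents, processing):
    """Return processing names from the root down to `processing`."""
    if processing not in parents:
        raise ValueError(f"Processing '{processing}' does not exist")
    chain = []
    current = processing
    # Follow parent links upwards until the root (parent is None) is reached.
    while current is not None:
        chain.append(current)
        current = parents[current]
    chain.reverse()  # collected leaf-to-root, reported root-to-leaf
    return chain


# Hypothetical mapping: each processing name points at its ancestor.
parents = {"raw": None, "numeric": "raw", "scaled": "numeric"}
print(ancestry(parents, "scaled"))  # ['raw', 'numeric', 'scaled']
```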

get_targets(processing: str = 'numeric', indices: List[int] | ndarray | None = None) ndarray[source]

Get target data for a specific processing.

Parameters:
  • processing (str, optional) – Processing name to retrieve. Defaults to ‘numeric’.

  • indices (array-like of int, optional) – Sample indices to retrieve (None for all)

Returns:

Target array of shape (n_samples, n_targets) or (selected_samples, n_targets)

Return type:

np.ndarray

Raises:

ValueError – If processing doesn’t exist

Examples:

>>> targets.get_targets('numeric')
array([[1.], [2.], [3.]])

>>> targets.get_targets('numeric', indices=[0, 2])
array([[1.], [3.]])

get_task_type_for_processing(processing: str) TaskType | None[source]

Get the task type for a specific processing.

This method allows retrieving the task type that was detected when a specific processing was added. Useful for understanding how different transformations (e.g., discretization, binning) affect the task type.

Parameters:

processing (str) – Processing name to query

Returns:

Task type for the processing, or None if not available

Return type:

Optional[TaskType]

Examples

>>> targets.add_targets([1.0, 2.0, 3.0, 4.0, 5.0])
>>> targets.get_task_type_for_processing('numeric')
TaskType.REGRESSION
>>> # After discretization
>>> targets.add_processed_targets('binned', [0, 0, 1, 1, 2], 'numeric')
>>> targets.get_task_type_for_processing('binned')
TaskType.MULTICLASS_CLASSIFICATION

invert_transform(y_pred: ndarray, from_processing: str, to_processing: str = 'raw') ndarray[source]

Inverse transform predictions from one processing back to another.

Parameters:
  • y_pred (np.ndarray) – Predictions to transform

  • from_processing (str) – Source processing name

  • to_processing (str, optional) – Target processing name. Defaults to ‘raw’.

Returns:

Inverse transformed predictions

Return type:

np.ndarray

Notes: This method delegates to transform_predictions for the actual transformation.

See Also: transform_predictions: Main transformation method

property num_classes: int

Get the number of unique classes from numeric targets.

Returns:

Number of unique classes

Return type:

int

Raises:

Notes:
  • Uses numeric targets (not raw)

  • For multi-target, uses first column

  • Result is cached until data changes

  • NaN values are excluded from count
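The counting rules in the notes above (first column only, NaN excluded) can be sketched without any dependencies. `count_classes` is a hypothetical helper working on plain lists of rows; caching is not modelled, and this is not the library's code.

```python
import math


def count_classes(rows):
    """Count distinct non-NaN values in the first column of `rows`."""
    first_column = [row[0] for row in rows]
    labels = {
        value
        for value in first_column
        # NaN never equals itself, so it is filtered out explicitly.
        if not (isinstance(value, float) and math.isnan(value))
    }
    return len(labels)


rows = [
    [1.0, 7.0],
    [2.0, 7.0],
    [float("nan"), 7.0],  # ignored: NaN in the first column
    [1.0, 8.0],           # duplicate label, counted once
    [3.0, 8.0],
]
print(count_classes(rows))  # 3
```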

property num_processings: int

Get the number of unique processings.

Returns:

Number of processing versions

Return type:

int

property num_samples: int

Get the number of samples.

Returns:

Number of samples (0 if no data)

Return type:

int

property num_targets: int

Get the number of target variables.

Returns:

Number of targets (0 if no data)

Return type:

int

property processing_ids: List[str]

Get the list of processing IDs.

Returns:

Copy of processing names

Return type:

list of str

set_task_type(task_type: TaskType, forced: bool = True) None[source]

Set the task type explicitly.

Parameters:
  • task_type (TaskType) – TaskType enum value

  • forced (bool, optional) – If True, prevents auto-detection from overriding this value in subsequent processing (e.g., after MinMaxScaler). Defaults to True.

property task_type: TaskType | None

Get the detected task type.

Returns:

TaskType enum or None if no targets added

property task_type_forced: bool

Check if task type was explicitly forced (disabling auto-detection).

transform_predictions(y_pred: ndarray, from_processing: str, to_processing: str) ndarray[source]

Transform predictions from one processing state to another.

Applies appropriate forward or inverse transformations based on the ancestry relationship between processings.

Parameters:
  • y_pred (np.ndarray) – Prediction array to transform

  • from_processing (str) – Current processing state of predictions

  • to_processing (str) – Target processing state

Returns:

Transformed predictions in target processing state

Return type:

np.ndarray

Raises:

Examples:

>>> # Model trained on scaled targets
>>> predictions = model.predict(X_test)
>>> # Transform back to numeric space
>>> numeric_preds = targets.transform_predictions(
...     predictions, 'scaled', 'numeric'
... )

Notes:
  • Empty predictions return empty array

  • Uses cached ancestry for efficiency

  • Handles both forward and inverse transformations

See Also: TargetTransformer: Handles transformation logic
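As a rough illustration of the inverse direction only, predictions can be moved back towards an ancestor by undoing one transformer per step along the ancestry chain. Everything in this sketch is an assumption made for illustration: the `steps` mapping, the `to_ancestor` function, and the toy `ShiftScale` class (which only mimics the transform/inverse_transform shape of a scikit-learn scaler) are hypothetical and do not describe TargetTransformer's internals.

```python
class ShiftScale:
    """Toy transformer: subtract an offset, then divide by a scale."""

    def __init__(self, offset, scale):
        self.offset = offset
        self.scale = scale

    def transform(self, values):
        return [(v - self.offset) / self.scale for v in values]

    def inverse_transform(self, values):
        return [v * self.scale + self.offset for v in values]


# Hypothetical registry: processing name -> (ancestor, transformer that produced it).
steps = {
    "numeric": (None, None),
    "scaled": ("numeric", ShiftScale(offset=10.0, scale=2.0)),
    "halved": ("scaled", ShiftScale(offset=0.0, scale=2.0)),
}


def to_ancestor(values, from_processing, to_processing):
    """Undo transformers one step at a time until `to_processing` is reached."""
    current = from_processing
    while current != to_processing:
        ancestor, transformer = steps[current]
        if ancestor is None:
            raise ValueError(
                f"'{to_processing}' is not an ancestor of '{from_processing}'"
            )
        values = transformer.inverse_transform(values)
        current = ancestor
    return values


print(to_ancestor([0.0, 1.0], "scaled", "numeric"))  # [10.0, 12.0]
```

An empty prediction list passes through unchanged, which is consistent with the note that empty predictions return an empty array.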

y(indices: list[int] | ndarray, processing: str) ndarray[source]

Convenience method to get targets with indices.

Alias for get_targets with different parameter order.

Parameters:
  • indices (array-like of int) – Sample indices to retrieve

  • processing (str) – Processing name

Returns:

Target array for specified indices

Return type:

np.ndarray

Examples:

>>> targets.y([0, 1, 2], 'numeric')
array([[1.], [2.], [3.]])