nirs4all.data.targets module

Target data management with processing chains.

class nirs4all.data.targets.Targets

Bases: object

Target manager that stores target arrays with processing chains.

Manages multiple versions of target data (raw, numeric, scaled, etc.) with processing ancestry tracking and transformation capabilities. Delegates specialized operations to helper components for better maintainability.

num_samples

Number of samples in target data

Type: int

num_targets

Number of target variables

Type: int

num_classes

Number of unique classes (for classification tasks)

Type: int

num_processings

Number of processing versions

Type: int

processing_ids

Names of available processings

Type: list of str

Examples

>>> import numpy as np
>>> from nirs4all.data.targets import Targets
>>> targets = Targets()
>>> targets.add_targets(np.array([1, 2, 3, 1, 2]))
>>> targets.num_samples
5
>>> targets.num_classes
3

>>> # Add scaled version
>>> from sklearn.preprocessing import StandardScaler
>>> scaler = StandardScaler()
>>> scaled_data = scaler.fit_transform(targets.get_targets('numeric'))
>>> targets.add_processed_targets('scaled', scaled_data, 'numeric', scaler)

>>> # Transform predictions back to numeric space
>>> # (model and X_test are assumed to be defined elsewhere)
>>> predictions = model.predict(X_test)
>>> numeric_preds = targets.transform_predictions(
...     predictions, 'scaled', 'numeric'
... )

See also

ProcessingChain: Manages processing ancestry
NumericConverter: Converts raw data to numeric
TargetTransformer: Transforms predictions between states

__repr__() → str

Return unambiguous string representation.

Returns: String showing samples, targets, and processings
Return type: str

__str__() → str

Return readable string representation with statistics.

Returns: Multi-line string with processing statistics
Return type: str

Notes:

- Skips ‘raw’ processing in display
- Shows min/max/mean for numeric processings
- Computed statistics are not cached

add_processed_targets(processing_name: str, targets: ndarray | List | tuple, ancestor: str = 'numeric', transformer: TransformerMixin | None = None, mode: str = 'train', labelizer: bool = True) → None

Add processed version of target data.

Parameters:

processing_name (str) – Unique name for this processing
targets (array-like) – Processed target data (same number of samples)
ancestor (str, optional) – Source processing name. Defaults to ‘numeric’.
transformer (TransformerMixin, optional) – Transformer used to create this processing
mode (str, optional) – Mode for validation (‘train’ enforces shape checks). Defaults to ‘train’.
labelizer (bool, optional) – Legacy parameter (currently unused). Defaults to True.

Raises:

ValueError – If processing_name already exists
ValueError – If ancestor doesn’t exist
ValueError – If shape doesn’t match existing data (in train mode)

Examples

>>> from sklearn.preprocessing import StandardScaler
>>> scaler = StandardScaler()
>>> scaled = scaler.fit_transform(targets.get_targets('numeric'))
>>> targets.add_processed_targets('scaled', scaled, 'numeric', scaler)
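The three error conditions listed under Raises can be pictured with a small stand-alone sketch. It is not the nirs4all implementation: the function name check_new_processing and the plain dict store are hypothetical, and only the checks named above are modelled.

```python
# Hypothetical stand-in for the checks listed under "Raises"; not nirs4all code.
def check_new_processing(store, name, data, ancestor="numeric", mode="train"):
    """Validate a new processing against `store` (dict: name -> list of rows)."""
    if name in store:
        raise ValueError(f"Processing '{name}' already exists")
    if ancestor not in store:
        raise ValueError(f"Ancestor '{ancestor}' does not exist")
    # The sample-count check is only enforced in 'train' mode.
    if mode == "train" and len(data) != len(store[ancestor]):
        raise ValueError(
            f"Expected {len(store[ancestor])} samples, got {len(data)}"
        )
    store[name] = list(data)


store = {"raw": [[1], [2], [3]], "numeric": [[1.0], [2.0], [3.0]]}
check_new_processing(store, "scaled", [[-1.2], [0.0], [1.2]])
```

Registering 'scaled' a second time, or naming an ancestor that was never added, would raise ValueError in this sketch.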

add_targets(targets: ndarray | List | tuple) → None

Add target samples. Can be called multiple times to append.

Automatically creates ‘raw’ and ‘numeric’ processings on first call. Subsequent calls append to existing data.

Parameters:

targets (array-like) – Target data as 1D (single target) or 2D (multiple targets)

Raises:

ValueError – If processings beyond ‘raw’ and ‘numeric’ exist
ValueError – If target dimensions don’t match existing data

Notes:

- First call: creates ‘raw’ and ‘numeric’ processings
- Subsequent calls: appends to existing arrays
- Invalidates statistics cache

Examples

>>> targets = Targets()
>>> targets.add_targets([1, 2, 3])
>>> targets.num_samples
3
>>> targets.add_targets([4, 5])
>>> targets.num_samples
5
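The append behaviour described above might look roughly like the following simplified model. TinyTargets and as_rows are hypothetical names, plain lists stand in for arrays, and float() stands in for the real numeric conversion, which this page does not describe in detail.

```python
def as_rows(targets):
    """Turn 1D input into one-column rows; keep 2D input as a list of rows."""
    return [list(t) if isinstance(t, (list, tuple)) else [t] for t in targets]


class TinyTargets:
    """Hypothetical, simplified model of the append behaviour."""

    def __init__(self):
        self.data = {}  # processing name -> list of rows

    def add_targets(self, targets):
        rows = as_rows(targets)
        if not self.data:
            # First call: create both base processings.
            self.data["raw"] = list(rows)
            self.data["numeric"] = [[float(v) for v in r] for r in rows]
            return
        if set(self.data) - {"raw", "numeric"}:
            raise ValueError("Cannot append once derived processings exist")
        if rows and len(rows[0]) != len(self.data["raw"][0]):
            raise ValueError("Target dimensions do not match existing data")
        # Subsequent calls: append to the existing processings.
        self.data["raw"].extend(rows)
        self.data["numeric"].extend([float(v) for v in r] for r in rows)


t = TinyTargets()
t.add_targets([1, 2, 3])
t.add_targets([4, 5])
```

After the two calls, both processings in this sketch hold five one-column rows.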

get_processing_ancestry(processing: str) → List[str]

Get the full ancestry chain for a processing.

Parameters: processing (str) – Processing name
Returns: Processing names from root to specified processing
Return type: list of str
Raises: ValueError – If processing doesn’t exist

Examples

>>> targets.get_processing_ancestry('scaled')
['raw', 'numeric', 'scaled']
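One way to picture a root-to-leaf ancestry lookup is a walk over a child-to-parent mapping. This is only an illustrative sketch: the ancestry function and the parents dict are assumptions, not the ProcessingChain API.

```python
def ancestry(parents, processing):
    """Return names from the root down to `processing`.

    `parents` maps each processing to its ancestor (None for the root).
    """
    if processing not in parents:
        raise ValueError(f"Processing '{processing}' does not exist")
    chain = []
    current = processing
    while current is not None:
        chain.append(current)
        current = parents[current]
    # The walk goes leaf -> root, so reverse it before returning.
    return chain[::-1]


parents = {"raw": None, "numeric": "raw", "scaled": "numeric"}
print(ancestry(parents, "scaled"))  # ['raw', 'numeric', 'scaled']
```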

get_targets(processing: str = 'numeric', indices: List[int] | ndarray | None = None) → ndarray

Get target data for a specific processing.

Parameters:

processing (str, optional) – Processing name to retrieve. Defaults to ‘numeric’.
indices (array-like of int, optional) – Sample indices to retrieve (None for all)

Returns:

Target array of shape (n_samples, n_targets) or (selected_samples, n_targets)

Return type:

np.ndarray

Raises:

ValueError – If processing doesn’t exist

Examples

>>> targets.get_targets('numeric')
array([[1.],
       [2.],
       [3.]])

>>> targets.get_targets('numeric', indices=[0, 2])
array([[1.],
       [3.]])

get_task_type_for_processing(processing: str) → TaskType | None

Get the task type for a specific processing.

This method allows retrieving the task type that was detected when a specific processing was added. Useful for understanding how different transformations (e.g., discretization, binning) affect the task type.

Parameters: processing (str) – Processing name to query
Returns: Task type for the processing, or None if not available
Return type: Optional[TaskType]

Examples

>>> targets.add_targets([1.0, 2.0, 3.0, 4.0, 5.0])
>>> targets.get_task_type_for_processing('numeric')
TaskType.REGRESSION

>>> # After discretization
>>> targets.add_processed_targets('binned', [0, 0, 1, 1, 2], 'numeric')
>>> targets.get_task_type_for_processing('binned')
TaskType.MULTICLASS_CLASSIFICATION

invert_transform(y_pred: ndarray, from_processing: str, to_processing: str = 'raw') → ndarray

Inverse transform predictions from one processing back to another.

Parameters:

y_pred (np.ndarray) – Predictions to transform
from_processing (str) – Source processing name
to_processing (str, optional) – Target processing name. Defaults to ‘raw’.

Returns:

Inverse transformed predictions

Return type:

np.ndarray

Notes: This method delegates to transform_predictions for the actual transformation.

See Also: transform_predictions: Main transformation method

property num_classes: int

Get the number of unique classes from numeric targets.

Returns:

Number of unique classes

Return type:

int

Raises:

ValueError – If no target data available
ValueError – If numeric targets not available

Notes:

- Uses numeric targets (not raw)
- For multi-target, uses first column
- Result is cached until data changes
- NaN values are excluded from count
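The counting rule in the notes (first column only, NaN excluded) can be sketched in a few lines. count_classes is a hypothetical helper that works on plain lists of rows; caching is left out.

```python
import math


def count_classes(numeric_rows):
    """Count distinct non-NaN values in the first column of `numeric_rows`."""
    if not numeric_rows:
        raise ValueError("No target data available")
    first_column = (row[0] for row in numeric_rows)
    return len({v for v in first_column if not math.isnan(v)})


rows = [[1.0, 9.0], [2.0, 9.0], [float("nan"), 9.0], [1.0, 8.0]]
print(count_classes(rows))  # 2
```

The second column and the NaN entry do not affect the result, so only the values 1.0 and 2.0 are counted.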

property num_processings: int

Get the number of unique processings.

Returns: Number of processing versions
Return type: int

property num_samples: int

Get the number of samples.

Returns: Number of samples (0 if no data)
Return type: int

property num_targets: int

Get the number of target variables.

Returns: Number of targets (0 if no data)
Return type: int

property processing_ids: List[str]

Get the list of processing IDs.

Returns: Copy of processing names
Return type: list of str

set_task_type(task_type: TaskType, forced: bool = True) → None

Set the task type explicitly.

Parameters:

task_type (TaskType) – TaskType enum value
forced (bool, optional) – If True, prevents auto-detection from overriding this value in subsequent processing (e.g., after MinMaxScaler). Defaults to True.
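The effect of the forced flag can be illustrated with a toy state holder. Everything here is an assumption made for illustration: the stand-in TaskType enum lists only the two member names that appear on this page, and TaskTypeState with its on_auto_detect hook is hypothetical.

```python
from enum import Enum


class TaskType(Enum):
    # Stand-in enum; only these two member names appear on this page.
    REGRESSION = "regression"
    MULTICLASS_CLASSIFICATION = "multiclass_classification"


class TaskTypeState:
    """Hypothetical holder showing how a forced type could block auto-detection."""

    def __init__(self):
        self.task_type = None
        self.task_type_forced = False

    def set_task_type(self, task_type, forced=True):
        self.task_type = task_type
        self.task_type_forced = forced

    def on_auto_detect(self, detected):
        # Auto-detection only wins when nothing was forced.
        if not self.task_type_forced:
            self.task_type = detected


state = TaskTypeState()
state.set_task_type(TaskType.MULTICLASS_CLASSIFICATION)
state.on_auto_detect(TaskType.REGRESSION)  # ignored: the type is forced
```

In this sketch, calling set_task_type with forced=False would let a later detection replace the value again.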

property task_type: TaskType | None

Get the detected task type.

Returns: TaskType enum or None if no targets added

property task_type_forced: bool: Check if task type was explicitly forced (disabling auto-detection).

transform_predictions(y_pred: ndarray, from_processing: str, to_processing: str) → ndarray

Transform predictions from one processing state to another.

Applies appropriate forward or inverse transformations based on the ancestry relationship between processings.

Parameters:

y_pred (np.ndarray) – Prediction array to transform
from_processing (str) – Current processing state of predictions
to_processing (str) – Target processing state

Returns:

Transformed predictions in target processing state

Return type:

np.ndarray

Raises:

ValueError – If either processing doesn’t exist
ValueError – If no transformation path exists
ValueError – If transformation fails

Examples

>>> # Model trained on scaled targets
>>> predictions = model.predict(X_test)
>>> # Transform back to numeric space
>>> numeric_preds = targets.transform_predictions(
...     predictions, 'scaled', 'numeric'
... )

Notes:

- Empty predictions return empty array
- Uses cached ancestry for efficiency
- Handles both forward and inverse transformations

See Also: TargetTransformer: Handles transformation logic
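The idea of choosing forward or inverse transformations from the ancestry relationship can be sketched as follows. This is a toy model under stated assumptions, not TargetTransformer itself: the Scale class, the parents and transformers dicts, and the extra function arguments are all hypothetical, and only a single ancestry line is handled.

```python
class Scale:
    """Toy transformer with scikit-learn style method names (hypothetical)."""

    def __init__(self, factor):
        self.factor = factor

    def transform(self, y):
        return [v * self.factor for v in y]

    def inverse_transform(self, y):
        return [v / self.factor for v in y]


def ancestry(parents, name):
    """Return processing names from the root down to `name`."""
    chain = []
    while name is not None:
        chain.append(name)
        name = parents[name]
    return chain[::-1]


def transform_predictions(y_pred, src, dst, parents, transformers):
    """Move `y_pred` from processing `src` to `dst` along one ancestry line."""
    if src not in parents or dst not in parents:
        raise ValueError("Unknown processing")
    if not y_pred or src == dst:
        return list(y_pred)
    up = ancestry(parents, src)
    down = ancestry(parents, dst)
    if dst in up:
        # dst is an ancestor of src: undo each step from src back to dst.
        for name in reversed(up[up.index(dst) + 1:]):
            y_pred = transformers[name].inverse_transform(y_pred)
        return y_pred
    if src in down:
        # dst is a descendant of src: apply each step from src forward to dst.
        for name in down[down.index(src) + 1:]:
            y_pred = transformers[name].transform(y_pred)
        return y_pred
    raise ValueError(f"No transformation path from '{src}' to '{dst}'")


parents = {"numeric": None, "scaled": "numeric"}
transformers = {"scaled": Scale(10.0)}
print(transform_predictions([5.0, 20.0], "scaled", "numeric", parents, transformers))
# [0.5, 2.0]
```

Swapping the two processing names applies the forward transform instead, and an empty prediction list is returned unchanged.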

y(indices: list[int] | ndarray, processing: str) → ndarray

Convenience method to get targets with indices.

Alias for get_targets with different parameter order.

Parameters:

indices (array-like of int) – Sample indices to retrieve
processing (str) – Processing name

Returns:

Target array for specified indices

Return type:

np.ndarray

Examples

>>> targets.y([0, 1, 2], 'numeric')
array([[1.],
       [2.],
       [3.]])