nirs4all.data.targets module
Target data management with processing chains.
- class nirs4all.data.targets.Targets
Bases: object
Target manager that stores target arrays with processing chains.
Manages multiple versions of target data (raw, numeric, scaled, etc.) with processing ancestry tracking and transformation capabilities. Delegates specialized operations to helper components for better maintainability.
Examples
>>> import numpy as np
>>> from nirs4all.data.targets import Targets
>>> targets = Targets()
>>> targets.add_targets(np.array([1, 2, 3, 1, 2]))
>>> targets.num_samples
5
>>> targets.num_classes
3

>>> # Add scaled version
>>> from sklearn.preprocessing import StandardScaler
>>> scaler = StandardScaler()
>>> scaled_data = scaler.fit_transform(targets.get_targets('numeric'))
>>> targets.add_processed_targets('scaled', scaled_data, 'numeric', scaler)

>>> # Transform predictions back to numeric space
>>> # (model and X_test are assumed to be defined elsewhere)
>>> predictions = model.predict(X_test)
>>> numeric_preds = targets.transform_predictions(
...     predictions, 'scaled', 'numeric'
... )
See also
ProcessingChain: Manages processing ancestry
NumericConverter: Converts raw data to numeric
TargetTransformer: Transforms predictions between states
- __repr__() -> str
Return unambiguous string representation.
- Returns:
String showing samples, targets, and processings
- Return type:
str
- __str__() -> str
Return readable string representation with statistics.
- Returns:
Multi-line string with processing statistics
- Return type:
str
Notes:
  - Skips the ‘raw’ processing in the display
  - Shows min/max/mean for numeric processings
  - Statistics are recomputed on each call (not cached)
- add_processed_targets(processing_name: str, targets: ndarray | List | tuple, ancestor: str = 'numeric', transformer: TransformerMixin | None = None, mode: str = 'train', labelizer: bool = True) -> None
Add processed version of target data.
- Parameters:
processing_name (str) – Unique name for this processing
targets (array-like) – Processed target data (same number of samples)
ancestor (str, optional) – Source processing name. Defaults to ‘numeric’.
transformer (TransformerMixin, optional) – Transformer used to create this processing
mode (str, optional) – Mode for validation (‘train’ enforces shape checks). Defaults to ‘train’.
labelizer (bool, optional) – Legacy parameter (currently unused). Defaults to True.
- Raises:
ValueError – If processing_name already exists
ValueError – If ancestor doesn’t exist
ValueError – If shape doesn’t match existing data (in train mode)
Examples:
>>> from sklearn.preprocessing import StandardScaler
>>> scaler = StandardScaler()
>>> scaled = scaler.fit_transform(targets.get_targets('numeric'))
>>> targets.add_processed_targets('scaled', scaled, 'numeric', scaler)
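The three ValueError conditions listed above can be pictured with a small stand-alone sketch. This is not the nirs4all implementation: the `add_processed` function, the `store` dictionary, and the `parents` dictionary are hypothetical names, plain lists stand in for NumPy arrays, and the shape check is simplified to a sample-count comparison.

```python
from typing import Dict, List, Optional


def add_processed(
    store: Dict[str, List[list]],
    parents: Dict[str, Optional[str]],
    name: str,
    values: List[list],
    ancestor: str = "numeric",
    mode: str = "train",
) -> None:
    """Register a processed version of the targets (hypothetical helper)."""
    if name in store:
        raise ValueError(f"Processing '{name}' already exists")
    if ancestor not in store:
        raise ValueError(f"Ancestor '{ancestor}' does not exist")
    # Simplification: only the number of samples is compared, and only in train mode.
    if mode == "train" and len(values) != len(store[ancestor]):
        raise ValueError("Shape does not match existing data")
    store[name] = list(values)
    parents[name] = ancestor  # remember where this processing came from


store = {"raw": [[1], [2], [3]], "numeric": [[1.0], [2.0], [3.0]]}
parents: Dict[str, Optional[str]] = {"raw": None, "numeric": "raw"}

add_processed(store, parents, "scaled", [[-1.0], [0.0], [1.0]])
print(parents["scaled"])  # numeric
```

Recording the ancestor at registration time is what later makes it possible to walk from a processing back to ‘raw’.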
- add_targets(targets: ndarray | List | tuple) -> None
Add target samples. Can be called multiple times to append.
Automatically creates ‘raw’ and ‘numeric’ processings on first call. Subsequent calls append to existing data.
- Parameters:
targets (array-like) – Target data as 1D (single target) or 2D (multiple targets)
- Raises:
ValueError – If processings beyond ‘raw’ and ‘numeric’ exist
ValueError – If target dimensions don’t match existing data
Notes:
  - First call: creates ‘raw’ and ‘numeric’ processings
  - Subsequent calls: appends to existing arrays
  - Invalidates statistics cache
Examples:
>>> targets = Targets()
>>> targets.add_targets([1, 2, 3])
>>> targets.num_samples
3
>>> targets.add_targets([4, 5])
>>> targets.num_samples
5
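The append behaviour and its two error conditions can be sketched without the library. `MiniTargets` below is a hypothetical stand-in, not the real `Targets` class: it keeps plain lists instead of NumPy arrays, and the `float()` conversion used for the ‘numeric’ copy is an assumption made only for illustration (the actual conversion is delegated to NumericConverter).

```python
from typing import Dict, List


class MiniTargets:
    """Hypothetical, list-based stand-in illustrating append semantics."""

    def __init__(self) -> None:
        self.data: Dict[str, List[list]] = {}  # processing name -> rows

    @property
    def num_samples(self) -> int:
        return len(self.data.get("raw", []))

    def add_targets(self, targets) -> None:
        # Treat 1D input as a single target column.
        rows = [list(t) if isinstance(t, (list, tuple)) else [t] for t in targets]
        if set(self.data) - {"raw", "numeric"}:
            raise ValueError("Cannot append once other processings exist")
        # The first call creates both processings; later calls reuse them.
        raw = self.data.setdefault("raw", [])
        numeric = self.data.setdefault("numeric", [])
        if raw and rows and len(rows[0]) != len(raw[0]):
            raise ValueError("Target dimensions do not match existing data")
        raw.extend(rows)
        numeric.extend([float(v) for v in row] for row in rows)


targets = MiniTargets()
targets.add_targets([1, 2, 3])
targets.add_targets([4, 5])
print(targets.num_samples)  # 5
```

Refusing to append once a derived processing exists keeps every stored version at the same number of samples.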
- get_processing_ancestry(processing: str) -> List[str]
Get the full ancestry chain for a processing.
- Parameters:
processing (str) – Processing name
- Returns:
Processing names from root to specified processing
- Return type:
List[str]
- Raises:
ValueError – If processing doesn’t exist
Examples:
>>> targets.get_processing_ancestry('scaled')
['raw', 'numeric', 'scaled']
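One way to picture the root-to-leaf chain is a parent map walked upward and then reversed. The sketch below is illustrative only: the `parents` dictionary and the `ancestry` function are hypothetical names and do not describe how ProcessingChain stores its data.

```python
from typing import Dict, List, Optional

# Hypothetical parent map: each processing points to its ancestor.
# 'raw' is the root and has no ancestor.
parents: Dict[str, Optional[str]] = {
    "raw": None,
    "numeric": "raw",
    "scaled": "numeric",
}


def ancestry(processing: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Return processing names from the root down to `processing`."""
    if processing not in parents:
        raise ValueError(f"Processing '{processing}' not found")
    chain: List[str] = []
    current: Optional[str] = processing
    while current is not None:  # climb until the root is passed
        chain.append(current)
        current = parents[current]
    chain.reverse()  # collected leaf-to-root, reported root-to-leaf
    return chain


print(ancestry("scaled", parents))  # ['raw', 'numeric', 'scaled']
```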
- get_targets(processing: str = 'numeric', indices: List[int] | ndarray | None = None) -> ndarray
Get target data for a specific processing.
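- Parameters:
processing (str, optional) – Processing name to retrieve. Defaults to ‘numeric’.
indices (List[int] | ndarray, optional) – Sample indices to select. Defaults to None, which returns all samples.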
- Returns:
Target array of shape (n_samples, n_targets) or (selected_samples, n_targets)
- Return type:
np.ndarray
- Raises:
ValueError – If processing doesn’t exist
Examples:
>>> targets.get_targets('numeric')
array([[1.], [2.], [3.]])
>>> targets.get_targets('numeric', indices=[0, 2])
array([[1.], [3.]])
- get_task_type_for_processing(processing: str) -> TaskType | None
Get the task type for a specific processing.
This method allows retrieving the task type that was detected when a specific processing was added. Useful for understanding how different transformations (e.g., discretization, binning) affect the task type.
- Parameters:
processing (str) – Processing name to query
- Returns:
Task type for the processing, or None if not available
- Return type:
Optional[TaskType]
Examples
>>> targets.add_targets([1.0, 2.0, 3.0, 4.0, 5.0])
>>> targets.get_task_type_for_processing('numeric')
TaskType.REGRESSION

>>> # After discretization
>>> targets.add_processed_targets('binned', [0, 0, 1, 1, 2], 'numeric')
>>> targets.get_task_type_for_processing('binned')
TaskType.MULTICLASS_CLASSIFICATION
- invert_transform(y_pred: ndarray, from_processing: str, to_processing: str = 'raw') -> ndarray
Inverse transform predictions from one processing back to another.
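- Parameters:
y_pred (np.ndarray) – Predictions to transform
from_processing (str) – Processing state the predictions are currently in
to_processing (str, optional) – Processing state to transform back to. Defaults to ‘raw’.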
- Returns:
Inverse transformed predictions
- Return type:
np.ndarray
Notes: This method delegates to transform_predictions for the actual transformation.
See Also: transform_predictions: Main transformation method
- property num_classes: int
Get the number of unique classes from numeric targets.
- Returns:
Number of unique classes
- Return type:
int
- Raises:
ValueError – If no target data available
ValueError – If numeric targets not available
Notes:
  - Uses numeric targets (not raw)
  - For multi-target data, uses the first column
  - Result is cached until data changes
  - NaN values are excluded from the count
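The counting rules in the notes above (first column only, NaN excluded) can be illustrated in a few lines. The `count_classes` function is a hypothetical helper written for this sketch, with nested lists standing in for a 2D NumPy array; caching is left out.

```python
import math
from typing import List


def count_classes(numeric_targets: List[List[float]]) -> int:
    """Count distinct non-NaN values in the first target column."""
    if not numeric_targets:
        raise ValueError("No target data available")
    first_column = [row[0] for row in numeric_targets]
    # NaN never compares equal to itself, so it is filtered out explicitly
    # rather than relying on the set to deduplicate it.
    return len({value for value in first_column if not math.isnan(value)})


rows = [
    [1.0, 7.0],
    [2.0, 7.0],
    [float("nan"), 8.0],
    [1.0, 9.0],
    [3.0, 9.0],
]
print(count_classes(rows))  # 3
```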
- property num_processings: int
Get the number of unique processings.
- Returns:
Number of processing versions
- Return type:
int
- property num_samples: int
Get the number of samples.
- Returns:
Number of samples (0 if no data)
- Return type:
int
- property num_targets: int
Get the number of target variables.
- Returns:
Number of targets (0 if no data)
- Return type:
int
- set_task_type(task_type: TaskType, forced: bool = True) -> None
Set the task type explicitly.
- Parameters:
task_type (TaskType) – TaskType enum value
forced (bool, optional) – If True, prevents auto-detection from overriding this value in subsequent processing (e.g., after MinMaxScaler). Defaults to True.
- property task_type: TaskType | None
Get the detected task type.
- Returns:
TaskType enum or None if no targets added
- property task_type_forced: bool
Check if task type was explicitly forced (disabling auto-detection).
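The interplay between `set_task_type`, `task_type`, and `task_type_forced` can be sketched as a small state holder. Everything below is an assumption made for illustration: the `TaskType` enum is a local stand-in with made-up values (only the member names REGRESSION and MULTICLASS_CLASSIFICATION come from the examples in this page), and `on_auto_detect` is a hypothetical hook representing whatever the library does when it re-detects the task type.

```python
from enum import Enum
from typing import Optional


class TaskType(Enum):
    """Local stand-in for the library's enum (values are made up)."""

    REGRESSION = "regression"
    MULTICLASS_CLASSIFICATION = "multiclass_classification"


class TaskTypeState:
    """Hypothetical holder showing how a forced task type resists re-detection."""

    def __init__(self) -> None:
        self.task_type: Optional[TaskType] = None
        self.task_type_forced: bool = False

    def set_task_type(self, task_type: TaskType, forced: bool = True) -> None:
        self.task_type = task_type
        self.task_type_forced = forced

    def on_auto_detect(self, detected: TaskType) -> None:
        # Auto-detection only takes effect when nothing was forced.
        if not self.task_type_forced:
            self.task_type = detected


state = TaskTypeState()
state.set_task_type(TaskType.MULTICLASS_CLASSIFICATION)  # forced by default
state.on_auto_detect(TaskType.REGRESSION)  # e.g. after scaling the targets
print(state.task_type)  # TaskType.MULTICLASS_CLASSIFICATION
```

Forcing matters when a transformation such as scaling turns integer class labels into floats that would otherwise look like a regression target.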
- transform_predictions(y_pred: ndarray, from_processing: str, to_processing: str) -> ndarray
Transform predictions from one processing state to another.
Applies appropriate forward or inverse transformations based on the ancestry relationship between processings.
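- Parameters:
y_pred (np.ndarray) – Predictions to transform
from_processing (str) – Processing state the predictions are currently in
to_processing (str) – Processing state to transform the predictions to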
- Returns:
Transformed predictions in target processing state
- Return type:
np.ndarray
- Raises:
ValueError – If either processing doesn’t exist
ValueError – If no transformation path exists
ValueError – If transformation fails
Examples:
>>> # Model trained on scaled targets
>>> predictions = model.predict(X_test)
>>> # Transform back to numeric space
>>> numeric_preds = targets.transform_predictions(
...     predictions, 'scaled', 'numeric'
... )
Notes:
  - Empty predictions return an empty array
  - Uses cached ancestry for efficiency
  - Handles both forward and inverse transformations
See Also: TargetTransformer: Handles transformation logic
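The inverse direction can be sketched as a walk up the ancestry, undoing one transformer per step. This is a simplified illustration and not the TargetTransformer logic: `ShiftScale` is a toy stand-in for a fitted scikit-learn transformer, `parents`, `transformers`, and `to_ancestor` are hypothetical names, plain lists replace NumPy arrays, and the forward direction is omitted.

```python
from typing import Dict, List, Optional


class ShiftScale:
    """Toy stand-in for a fitted transformer with an inverse (hypothetical)."""

    def __init__(self, mean: float, std: float) -> None:
        self.mean = mean
        self.std = std

    def transform(self, values: List[float]) -> List[float]:
        return [(v - self.mean) / self.std for v in values]

    def inverse_transform(self, values: List[float]) -> List[float]:
        return [v * self.std + self.mean for v in values]


# Each processing points to its ancestor; transformers are keyed by the
# processing they produced.
parents: Dict[str, Optional[str]] = {"raw": None, "numeric": "raw", "scaled": "numeric"}
transformers: Dict[str, ShiftScale] = {"scaled": ShiftScale(mean=10.0, std=2.0)}


def to_ancestor(y_pred: List[float], from_processing: str, to_processing: str) -> List[float]:
    """Map predictions back from a processing to one of its ancestors."""
    for name in (from_processing, to_processing):
        if name not in parents:
            raise ValueError(f"Processing '{name}' not found")
    values = list(y_pred)
    current = from_processing
    while current != to_processing:
        parent = parents[current]
        if parent is None:
            raise ValueError("No transformation path exists")
        transformer = transformers.get(current)
        if transformer is not None:  # steps without a transformer pass through
            values = transformer.inverse_transform(values)
        current = parent
    return values


print(to_ancestor([0.0, 1.0, -1.0], "scaled", "numeric"))  # [10.0, 12.0, 8.0]
```

An empty prediction list passes through the loop unchanged and comes back empty, which mirrors the first note above.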
- y(indices: list[int] | ndarray, processing: str) -> ndarray
Convenience method to get targets with indices.
Alias for get_targets with different parameter order.
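- Parameters:
indices (list[int] | ndarray) – Sample indices to select
processing (str) – Processing name to retrieve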
- Returns:
Target array for specified indices
- Return type:
np.ndarray
Examples:
>>> targets.y([0, 1, 2], 'numeric')
array([[1.], [2.], [3.]])