nirs4all.data.selection.role_assigner module
Role assigner for dataset configuration.
This module assigns DataFrame columns to feature (X), target (Y), or metadata roles, and validates the assignment so that no column ends up in more than one role.
Example
>>> assigner = RoleAssigner()
>>> result = assigner.assign(df, {
... "features": "2:-1",
... "targets": -1,
... "metadata": [0, 1]
... })
>>> print(result.features) # Features DataFrame
>>> print(result.targets) # Targets DataFrame
>>> print(result.metadata) # Metadata DataFrame
- class nirs4all.data.selection.role_assigner.RoleAssigner(case_sensitive: bool = True, allow_overlap: bool = False)
Bases: object
Assign columns to data roles (features, targets, metadata).
Validates that:
- No column is assigned to multiple roles
- At least features are assigned
- Indices are valid
Supports the same column selection syntax as ColumnSelector.
Example
>>> assigner = RoleAssigner()
>>> result = assigner.assign(df, {
... "features": "2:-1", # All columns except first 2 and last
... "targets": -1, # Last column
... "metadata": [0, 1] # First 2 columns
... })
- assign(df: DataFrame, roles: Dict[str, int | str | List[int] | List[str] | Dict[str, Any] | slice | None]) -> RoleAssignmentResult
Assign columns to roles.
- Parameters:
df – The DataFrame to assign roles from.
roles – Dictionary mapping role names to column selections. Supported roles: “features”, “targets”, “metadata”. Also accepts “x” (alias for features) and “y” (alias for targets).
- Returns:
RoleAssignmentResult with separated DataFrames.
- Raises:
RoleAssignmentError – If assignment is invalid (overlap, missing features).
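The following is a minimal sketch, in plain pandas, of the kind of checks described above: resolving each selection to column positions, requiring features, and rejecting overlap. It is not the nirs4all implementation. The helper names `resolve_positions` and `split_by_roles` are hypothetical, the string form is assumed to behave like a Python slice, only integer, slice-string, and integer-list selections are handled, and `ValueError` stands in for `RoleAssignmentError`.

```python
from typing import Dict, List, Union

import pandas as pd

# Only three of the documented selection forms are covered in this sketch.
Selection = Union[int, str, List[int]]


def resolve_positions(selection: Selection, n_columns: int) -> List[int]:
    """Turn one selection into a list of non-negative column positions."""
    positions = list(range(n_columns))
    if isinstance(selection, int):
        return [positions[selection]]  # e.g. -1 resolves to the last column
    if isinstance(selection, str):
        if ":" not in selection:
            raise ValueError(f"unsupported selection in this sketch: {selection!r}")
        # Assumption: "start:stop" follows Python slice semantics.
        start, _, stop = selection.partition(":")
        return positions[slice(int(start) if start else None,
                               int(stop) if stop else None)]
    return [positions[i] for i in selection]


def split_by_roles(df: pd.DataFrame,
                   roles: Dict[str, Selection]) -> Dict[str, pd.DataFrame]:
    """Split df into one DataFrame per role, rejecting overlapping columns."""
    resolved = {role: resolve_positions(sel, df.shape[1])
                for role, sel in roles.items()}
    if not resolved.get("features"):
        raise ValueError("no feature columns selected")
    owner: Dict[int, str] = {}
    for role, cols in resolved.items():
        for col in cols:
            if col in owner:
                raise ValueError(
                    f"column {col} assigned to both {owner[col]!r} and {role!r}")
            owner[col] = role
    return {role: df.iloc[:, cols] for role, cols in resolved.items()}


# Illustrative data: two metadata columns, two spectral columns, one target.
df = pd.DataFrame({
    "sample_id": [1, 2],
    "batch": ["a", "b"],
    "wl_1100": [0.1, 0.2],
    "wl_1200": [0.3, 0.4],
    "protein": [11.0, 12.5],
})
parts = split_by_roles(df, {"features": "2:-1", "targets": -1, "metadata": [0, 1]})
```

With this five-column frame, `"2:-1"` selects the two spectral columns, `-1` the last column, and `[0, 1]` the first two, so the three roles partition the frame without overlap.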
- assign_auto(df: DataFrame, target_columns: int | str | List[int] | List[str] | Dict[str, Any] | slice | None = None, metadata_columns: int | str | List[int] | List[str] | Dict[str, Any] | slice | None = None) -> RoleAssignmentResult
Auto-assign roles with specified targets and metadata.
Features are automatically set to all remaining columns.
- Parameters:
df – The DataFrame to assign roles from.
target_columns – Column selection for targets (Y).
metadata_columns – Column selection for metadata.
- Returns:
RoleAssignmentResult with separated DataFrames.
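The "all remaining columns" rule can be illustrated with plain pandas. This is a sketch of the idea only, not the library's code; `remaining_positions` is a hypothetical helper, and the selections are assumed to be already resolved to integer positions.

```python
from typing import List, Sequence

import pandas as pd


def remaining_positions(n_columns: int, *claimed: Sequence[int]) -> List[int]:
    """Return the column positions not claimed by any other role."""
    # The modulo maps negative positions (e.g. -1) to their non-negative form.
    used = {pos % n_columns for group in claimed for pos in group}
    return [pos for pos in range(n_columns) if pos not in used]


df = pd.DataFrame({
    "sample_id": [1, 2],
    "batch": ["a", "b"],
    "wl_1100": [0.1, 0.2],
    "wl_1200": [0.3, 0.4],
    "protein": [11.0, 12.5],
})

target_positions = [-1]       # last column holds the target
metadata_positions = [0, 1]   # first two columns hold metadata

# Features are whatever targets and metadata did not claim.
feature_positions = remaining_positions(
    df.shape[1], target_positions, metadata_positions)
features = df.iloc[:, feature_positions]
```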
- extract_y_from_x(df: DataFrame, y_columns: int | str | List[int] | List[str] | Dict[str, Any] | slice | None) -> RoleAssignmentResult
Extract target columns from a features DataFrame.
This is useful when Y columns are embedded in the X data.
- Parameters:
df – DataFrame containing both features and targets.
y_columns – Column selection for targets to extract.
- Returns:
RoleAssignmentResult with features (remaining) and targets (extracted).
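For comparison, the same split can be written directly in pandas when the target columns are known by name. This is only an illustration of the outcome described above (targets extracted, features are what remains), not how the method is implemented; the column names are made up.

```python
import pandas as pd

# A single table in which the target column sits among the spectral columns.
df = pd.DataFrame({
    "wl_1100": [0.1, 0.2],
    "protein": [11.0, 12.5],
    "wl_1200": [0.3, 0.4],
})

y_columns = ["protein"]

targets = df[y_columns]                # extracted Y columns
features = df.drop(columns=y_columns)  # remaining X columns, original order kept
```

Neither operation modifies `df` in place, so the original table is still available afterwards.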
- validate_roles(df: DataFrame, roles: Dict[str, int | str | List[int] | List[str] | Dict[str, Any] | slice | None]) -> List[str]
Validate a role specification without performing assignment.
- Parameters:
df – The DataFrame to validate against.
roles – Role specification to validate.
- Returns:
List of warning messages (empty if no warnings).
- Raises:
RoleAssignmentError – If role specification is invalid.
- exception nirs4all.data.selection.role_assigner.RoleAssignmentError
Bases: Exception
Raised when role assignment fails.
- class nirs4all.data.selection.role_assigner.RoleAssignmentResult(features: DataFrame | None, targets: DataFrame | None, metadata: DataFrame | None, feature_indices: List[int], target_indices: List[int], metadata_indices: List[int])
Bases: object
Result of role assignment.
- features
DataFrame containing feature columns (X).
- Type:
pandas.core.frame.DataFrame | None
- targets
DataFrame containing target columns (Y).
- Type:
pandas.core.frame.DataFrame | None
- metadata
DataFrame containing metadata columns.
- Type:
pandas.core.frame.DataFrame | None