nirs4all.data.metadata module

Metadata management for SpectroDataset.

This module contains the Metadata class, which manages sample-level auxiliary data. Metadata holds one row per sample, and its rows align with the indexer’s row indices.

class nirs4all.data.metadata.Metadata

Bases: object

Lightweight metadata manager for sample-level auxiliary data.

add_column(column: str, values: List | ndarray) → None

Add a new metadata column.

Parameters:
  • column – Column name

  • values – Column values (length must equal the number of rows)

add_metadata(data: ndarray | DataFrame | DataFrame, headers: List[str] | None = None) → None

Add metadata rows.

Parameters:
  • data – 2D array (n_samples, n_cols) or DataFrame

  • headers – Column names (required if data is an ndarray)

property columns: List[str]

List of metadata column names (excluding row_id).

get(indices: List[int] | ndarray | None = None, columns: List[str] | None = None) → DataFrame

Get metadata as a DataFrame.

Parameters:
  • indices – Row indices to select (None = all)

  • columns – Columns to return (None = all except row_id)

Returns:

Polars DataFrame (without the row_id column)
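
The selection rules described for get can be pictured with a small standalone sketch. It does not use nirs4all or Polars: the select helper, the dict-of-lists storage, and the sample values are all hypothetical. It assumes only what the description above states, namely that indices pick rows, columns pick fields, and row_id is never returned.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical column store: one list per column, one entry per sample.
# This mimics the documented layout only; it is not the nirs4all implementation.
table: Dict[str, list] = {
    "row_id": [0, 1, 2, 3],
    "batch": ["A", "A", "B", "C"],
    "moisture": [12.1, 11.8, 13.4, 12.9],
}


def select(
    store: Dict[str, list],
    indices: Optional[Sequence[int]] = None,
    columns: Optional[List[str]] = None,
) -> Dict[str, list]:
    """Return the requested rows and columns, always dropping row_id."""
    if columns is None:
        # None means "every column except row_id".
        columns = [name for name in store if name != "row_id"]
    n_rows = len(next(iter(store.values())))
    # None means "every row".
    rows = range(n_rows) if indices is None else indices
    return {
        name: [store[name][i] for i in rows]
        for name in columns
        if name != "row_id"
    }


subset = select(table, indices=[0, 2], columns=["batch"])
everything = select(table)
```

Here subset holds only the batch values for rows 0 and 2, and everything holds all rows of batch and moisture without row_id.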

get_column(column: str, indices: List[int] | ndarray | None = None) → ndarray

Get a single column as a NumPy array.

Parameters:
  • column – Column name

  • indices – Row indices to select (None = all)

Returns:

NumPy array of column values

property num_rows: int

Number of metadata rows.

to_numeric(column: str, indices: List[int] | ndarray | None = None, method: Literal['label', 'onehot'] = 'label') → tuple[ndarray, Dict]

Convert a categorical column to a numeric encoding.

Parameters:
  • column – Column name

  • indices – Row indices (None = all)

  • method – “label” for label encoding, “onehot” for one-hot encoding

Returns:

A (numeric_array, encoding_info) tuple, where encoding_info contains the method details and class mappings
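
To make the two encodings concrete, here is a minimal pure-Python sketch of label and one-hot encoding. It is not the code behind to_numeric. The helper names, the alphabetical class ordering, and the keys of the returned info dictionary are assumptions chosen for illustration, and plain lists stand in for NumPy arrays.

```python
from typing import Dict, List, Tuple


def label_encode(values: List[str]) -> Tuple[List[int], Dict]:
    """Map each distinct category to one integer code."""
    classes = sorted(set(values))  # assumption: classes ordered alphabetically
    mapping = {name: code for code, name in enumerate(classes)}
    codes = [mapping[v] for v in values]
    # Hypothetical info layout; the real encoding_info may differ.
    return codes, {"method": "label", "classes": mapping}


def onehot_encode(values: List[str]) -> Tuple[List[List[int]], Dict]:
    """Map each category to a row with a single 1 in its class position."""
    classes = sorted(set(values))  # same ordering assumption as above
    position = {name: j for j, name in enumerate(classes)}
    rows = [
        [1 if position[v] == j else 0 for j in range(len(classes))]
        for v in values
    ]
    return rows, {"method": "onehot", "classes": classes}


batches = ["B", "A", "C", "A"]
codes, label_info = label_encode(batches)
matrix, onehot_info = onehot_encode(batches)
```

Label encoding yields one integer per sample, while one-hot encoding yields one row per sample with as many entries as there are classes. In both cases the returned mapping is what lets a caller translate codes back to category names.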

update_metadata(indices: List[int] | ndarray, column: str, values: List | ndarray) → None

Update metadata values for specific rows.

Parameters:
  • indices – Row indices to update

  • column – Column name

  • values – New values (length must equal the length of indices)
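
The length requirement on values can be illustrated with a short standalone sketch. The update_rows helper and the sample data are hypothetical and are not part of nirs4all. Unlike update_metadata, which returns None, this helper returns a modified copy so the effect is easy to inspect.

```python
from typing import Sequence


def update_rows(column_values: list, indices: Sequence[int], values: Sequence) -> list:
    """Return a copy of column_values with values written at the given row indices."""
    if len(indices) != len(values):
        # Mirrors the documented constraint: one new value per selected row.
        raise ValueError(
            f"expected {len(indices)} values, got {len(values)}"
        )
    updated = list(column_values)
    for row, value in zip(indices, values):
        updated[row] = value
    return updated


batch = ["A", "A", "B", "C"]
new_batch = update_rows(batch, indices=[1, 3], values=["B", "D"])
```

Rows 1 and 3 receive the new values and the other rows are untouched. Passing a values sequence whose length differs from indices raises ValueError in this sketch.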