nirs4all.data.loaders.csv_loader_new module
CSV file loader implementation.
This module provides the CSVLoader class for loading CSV files, including support for compressed CSV files (.csv.gz, .csv.zip).
- class nirs4all.data.loaders.csv_loader_new.CSVLoader
Bases: FileLoader
Loader for CSV files.
Supports:
- Plain CSV files (.csv)
- Gzip-compressed CSV files (.csv.gz)
- Zip-compressed CSV files (.csv.zip)
- Parameters:
delimiter – Field delimiter (default: ‘;’)
decimal_separator – Decimal separator (default: ‘.’)
has_header – Whether first row is header (default: True)
header_unit – Unit for headers (‘cm-1’, ‘nm’, etc.)
na_policy – How to handle NA values (‘remove’ or ‘abort’; load() also accepts ‘auto’, which is its default)
categorical_mode – How to handle categorical data (‘auto’, ‘preserve’, ‘none’)
data_type – Type of data being loaded (‘x’, ‘y’, or ‘metadata’)
encoding – File encoding (default: ‘utf-8’)
member – For zip files, specific member to extract
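The three supported file variants and the delimiter and decimal_separator parameters can be illustrated with a minimal standard-library sketch. This is not the CSVLoader implementation: the helper names read_csv_text and parse_rows are hypothetical, and the fallback to the first archive member when member is not given is an assumption made for the example.

```python
import csv
import gzip
import io
import zipfile
from pathlib import Path


def read_csv_text(path, encoding="utf-8", member=None):
    """Return the decoded text of a .csv, .csv.gz, or .csv.zip file."""
    path = Path(path)
    name = path.name.lower()
    if name.endswith(".gz"):
        # Gzip-compressed CSV: decompress and decode in one step.
        with gzip.open(path, "rt", encoding=encoding, newline="") as fh:
            return fh.read()
    if name.endswith(".zip"):
        with zipfile.ZipFile(path) as archive:
            # Assumption: use the first member when none is named.
            target = member if member is not None else archive.namelist()[0]
            return archive.read(target).decode(encoding)
    # Plain CSV.
    return path.read_text(encoding=encoding)


def parse_rows(text, delimiter=";", decimal_separator=".", has_header=True):
    """Split CSV text into (headers, rows of floats)."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    rows = [row for row in reader if row]  # skip blank lines
    headers = rows[0] if has_header else None
    body = rows[1:] if has_header else rows
    values = [
        [float(cell.replace(decimal_separator, ".")) for cell in row]
        for row in body
    ]
    return headers, values
```

In this sketch the same parsing step runs on the decoded text regardless of how the file was stored, so compression handling and CSV dialect handling stay independent.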
- load(path: Path, na_policy: str = 'auto', data_type: str = 'x', categorical_mode: str = 'auto', header_unit: str = 'cm-1', encoding: str = 'utf-8', member: str | None = None, **user_params: Any) → LoaderResult
Load data from a CSV file.
- Parameters:
path – Path to the CSV file.
na_policy – How to handle NA values (‘remove’, ‘abort’, or ‘auto’).
data_type – Type of data (‘x’, ‘y’, or ‘metadata’).
categorical_mode – How to handle categorical columns (‘auto’, ‘preserve’, or ‘none’).
header_unit – Unit for headers (‘cm-1’, ‘nm’, etc.).
encoding – File encoding.
member – For zip files, specific member to extract.
**user_params – Additional CSV parsing parameters.
- Returns:
LoaderResult with the loaded data.
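The difference between the ‘remove’ and ‘abort’ values of na_policy can be sketched as follows. This is one plausible reading, not the library's code: it assumes ‘remove’ drops every row that contains a missing value and ‘abort’ raises an error instead. The function name apply_na_policy is hypothetical, and the behaviour of ‘auto’ is not documented here, so the sketch rejects it.

```python
import math


def _is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def apply_na_policy(rows, na_policy="remove"):
    """Return (clean_rows, na_mask) under an assumed reading of na_policy."""
    # One flag per row: True when the row holds at least one missing value.
    na_mask = [any(_is_missing(v) for v in row) for row in rows]
    if na_policy == "abort":
        # Assumption: 'abort' refuses to load data that contains NA values.
        if any(na_mask):
            raise ValueError("missing values found and na_policy is 'abort'")
        return list(rows), na_mask
    if na_policy == "remove":
        # Assumption: 'remove' drops the affected rows.
        kept = [row for row, bad in zip(rows, na_mask) if not bad]
        return kept, na_mask
    raise ValueError(f"na_policy not covered by this sketch: {na_policy!r}")
```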
- nirs4all.data.loaders.csv_loader_new.load_csv(path, na_policy: str = 'auto', data_type: str = 'x', categorical_mode: str = 'auto', header_unit: str = 'cm-1', **user_params)
Load a CSV file using the CSVLoader.
This function maintains backward compatibility with the original load_csv API.
- Parameters:
path – Path to the CSV file.
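na_policy – How to handle NA values (‘remove’, ‘abort’, or ‘auto’).
data_type – Type of data being loaded (‘x’, ‘y’, or ‘metadata’).
categorical_mode – How to handle categorical columns (‘auto’, ‘preserve’, or ‘none’).
header_unit – Unit for headers (‘cm-1’, ‘nm’, etc.).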
**user_params – Additional CSV parsing parameters.
- Returns:
Tuple of (DataFrame, report, na_mask, headers, header_unit).
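Because load_csv returns a positional five-item tuple, callers may find it convenient to label the items once. The sketch below takes the loader as an argument so it runs without nirs4all installed. The helper name describe_csv is hypothetical, and only the tuple order is taken from the description above.

```python
def describe_csv(load_csv, path, **load_kwargs):
    """Call a load_csv-style function and label the five returned items."""
    # Order follows the documented return value:
    # (DataFrame, report, na_mask, headers, header_unit)
    data, report, na_mask, headers, header_unit = load_csv(path, **load_kwargs)
    return {
        "data": data,
        "report": report,
        "na_mask": na_mask,
        "headers": headers,
        "header_unit": header_unit,
    }
```

With nirs4all installed, the real function could presumably be passed in after `from nirs4all.data.loaders.csv_loader_new import load_csv`, for example `describe_csv(load_csv, "spectra.csv", header_unit="nm")`, where the file name is a placeholder.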