nirs4all.data.parsers.legacy_parser module

Legacy parser for dataset configuration.

This parser handles the current train_x/test_x format, which is fully implemented and widely used. It provides backward compatibility with existing configurations.

class nirs4all.data.parsers.legacy_parser.LegacyParser

Bases: BaseParser

Parser for legacy train_x/test_x configuration format.

This parser handles dictionary configurations using the established key format: train_x, train_y, test_x, test_y, train_group, test_group.

It also handles flexible key naming (X_train, Xtrain, etc.) by normalizing to the standard format.
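
As an illustration, a dictionary in the standard key format might look like the following. Only the six key names come from the description above; the file paths are hypothetical placeholders, and using path strings as values is an assumption of this sketch.

```python
# The six standard keys named in the class description.
STANDARD_KEYS = (
    "train_x", "train_y",
    "test_x", "test_y",
    "train_group", "test_group",
)

# Hypothetical paths, used purely for illustration.
legacy_config = {
    "train_x": "data/train_x.csv",
    "train_y": "data/train_y.csv",
    "test_x": "data/test_x.csv",
    "test_y": "data/test_y.csv",
}

# Collect any key that is not already in standard form.
non_standard = [key for key in legacy_config if key not in STANDARD_KEYS]
print(non_standard)
```

A configuration written with variant names such as X_train would list those names in non_standard until the keys are normalized.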

can_parse(input_data: Any) → bool

Check if this is a legacy format configuration.

Parameters:

input_data – The input to check.

Returns:

True if input is a dict with legacy keys or data arrays.
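
The check described above can be approximated with a short stand-alone function. This is a minimal sketch, not the library's implementation: looks_like_legacy_config is a hypothetical name, and the sketch covers only the "dict with legacy keys" case, leaving out the data-array case mentioned in the return description.

```python
from typing import Any

# Standard keys listed in the LegacyParser description.
LEGACY_KEYS = {
    "train_x", "train_y",
    "test_x", "test_y",
    "train_group", "test_group",
}


def looks_like_legacy_config(input_data: Any) -> bool:
    """Hypothetical stand-in for a legacy-format check (key test only)."""
    # Anything that is not a dictionary cannot be a legacy configuration.
    if not isinstance(input_data, dict):
        return False
    # Accept the dictionary as soon as one standard key is present.
    return any(key in LEGACY_KEYS for key in input_data)
```

Under these assumptions, a dictionary containing train_x is accepted, while a bare string or a dictionary with unrelated keys is rejected.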

parse(input_data: Dict[str, Any]) → ParserResult

Parse a legacy format configuration.

Parameters:

input_data – Dictionary configuration to parse.

Returns:

ParserResult with normalized configuration.

nirs4all.data.parsers.legacy_parser.normalize_config_keys(config: Dict[str, Any]) → Dict[str, Any]

Normalize dataset configuration keys to standard format.

Maps variations like ‘x_train’, ‘X_train’, ‘Xtrain’ to ‘train_x’. Maps metadata variations like ‘metadata_train’, ‘train_metadata’, ‘m_train’ to ‘train_group’.

Parameters:

config – Original configuration dictionary.

Returns:

Normalized configuration with standardized keys.
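
The mapping described above can be sketched with a small alias table. This is an illustrative approximation rather than the actual function: normalize_keys_sketch and ALIASES are hypothetical names, the table holds only the variations named in this documentation, and the case-insensitive lookup is an assumption.

```python
from typing import Any, Dict

# Alias table built only from the variations named in the documentation;
# the real function may recognize more. Keys are stored in lowercase.
ALIASES = {
    "x_train": "train_x",
    "xtrain": "train_x",
    "metadata_train": "train_group",
    "train_metadata": "train_group",
    "m_train": "train_group",
}


def normalize_keys_sketch(config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical sketch: return a copy of config with known aliases renamed."""
    normalized: Dict[str, Any] = {}
    for key, value in config.items():
        # Lowercasing lets 'X_train' and 'x_train' share one table entry (assumption).
        standard_key = ALIASES.get(key.lower(), key)
        normalized[standard_key] = value
    return normalized
```

In this sketch, keys without a table entry pass through unchanged, and the input dictionary is not modified.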