nirs4all.data.serialization package

Module contents

Serialization module for dataset configuration.

This module provides functionality for serializing and deserializing dataset configurations to/from YAML and JSON formats.

class nirs4all.data.serialization.ConfigSerializer(include_defaults: bool = False, normalize: bool = True, sort_keys: bool = True)

Bases: object

Serializer for dataset configurations.

Handles serialization to YAML/JSON with:
  • Normalization of configs before serialization

  • Conversion of numpy arrays to lists

  • Conversion of Path objects to strings

  • Enum value serialization

  • Removal of internal/private keys
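The conversions listed above can be illustrated with a small standard-library sketch. This is not the library's implementation: `to_plain`, the `Task` enum, and the sample keys `train_x` and `task_type` are hypothetical names chosen for illustration, and numpy arrays are detected by duck typing on `tolist()` so that the example runs without numpy installed.

```python
from enum import Enum
from pathlib import PurePath, PurePosixPath
from typing import Any

# Copied from the INTERNAL_KEYS attribute documented on this page.
INTERNAL_KEYS = {
    "_normalized", "_original", "_parsed",
    "_sources", "_variation_mode", "_variations",
}


def to_plain(value: Any) -> Any:
    """Recursively convert a config value into JSON/YAML-friendly types."""
    if isinstance(value, dict):
        # Drop internal/private keys and convert the remaining values.
        return {
            str(key): to_plain(item)
            for key, item in value.items()
            if key not in INTERNAL_KEYS
        }
    if isinstance(value, (list, tuple)):
        return [to_plain(item) for item in value]
    if isinstance(value, Enum):
        return to_plain(value.value)
    if isinstance(value, PurePath):
        return str(value)
    if hasattr(value, "tolist"):
        # Assumption: array-like objects such as numpy arrays expose tolist().
        return to_plain(value.tolist())
    return value


class Task(Enum):
    """Hypothetical enum used only for this example."""
    REGRESSION = "regression"


config = {
    "train_x": PurePosixPath("data/Xcal.csv"),  # hypothetical key
    "task_type": Task.REGRESSION,               # hypothetical key
    "_parsed": True,                            # internal key, removed
}
clean = to_plain(config)
```

After the call, `clean` contains only plain strings, so it can be passed to `json.dumps` or a YAML dumper without a custom encoder.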

Example

```python
from nirs4all.data.serialization import ConfigSerializer

serializer = ConfigSerializer()

# Serialize to YAML
yaml_str = serializer.to_yaml(config_dict)

# Serialize to JSON
json_str = serializer.to_json(config_dict)

# Save to file
serializer.save(config_dict, "config.yaml")

# Load from file
config = serializer.load("config.yaml")

# Compare configs
diff = serializer.diff(old_config, new_config)
```

INTERNAL_KEYS = {'_normalized', '_original', '_parsed', '_sources', '_variation_mode', '_variations'}

diff(old_config: Dict[str, Any] | DatasetConfigSchema, new_config: Dict[str, Any] | DatasetConfigSchema) → ConfigDiff

Compare two configurations.

Parameters:
  • old_config – Original configuration.

  • new_config – New configuration.

Returns:

ConfigDiff with differences.
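The structure of `ConfigDiff` is not described on this page. As a rough idea of what a key-level comparison between two configuration dictionaries can look like, here is a minimal sketch; `SimpleDiff`, `simple_diff`, and the sample keys are hypothetical, and the real `ConfigDiff` may expose different fields.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class SimpleDiff:
    """Hypothetical stand-in for ConfigDiff; the real class may differ."""
    added: Dict[str, Any] = field(default_factory=dict)
    removed: Dict[str, Any] = field(default_factory=dict)
    changed: Dict[str, Tuple[Any, Any]] = field(default_factory=dict)

    def has_changes(self) -> bool:
        return bool(self.added or self.removed or self.changed)


def simple_diff(old: Dict[str, Any], new: Dict[str, Any]) -> SimpleDiff:
    """Compare two flat config dicts key by key."""
    diff = SimpleDiff()
    for key in new.keys() - old.keys():
        diff.added[key] = new[key]
    for key in old.keys() - new.keys():
        diff.removed[key] = old[key]
    for key in old.keys() & new.keys():
        if old[key] != new[key]:
            # Store the (old, new) pair for every changed value.
            diff.changed[key] = (old[key], new[key])
    return diff


old = {"delimiter": ";", "header": True}
new = {"delimiter": ",", "header": True, "decimal": "."}
result = simple_diff(old, new)
```

This sketch only compares top-level keys; nested dictionaries are treated as opaque values.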

load(path: str | Path) → Dict[str, Any]

Load config from file.

Parameters:

path – Path to config file.

Returns:

Configuration dictionary.

Raises:

save(config: Dict[str, Any] | DatasetConfigSchema, path: str | Path, format: SerializationFormat | None = None) → None

Save config to file.

Parameters:
  • config – Configuration to save.

  • path – Output file path.

  • format – Output format (auto-detected from extension if None).
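Auto-detecting the format from the file extension could work along the following lines. The `Fmt` enum, the suffix table, and `resolve_format` are hypothetical; which extensions the library actually recognizes, and how it reacts to an unknown one, is not stated on this page.

```python
from enum import Enum
from pathlib import Path
from typing import Optional, Union


class Fmt(Enum):
    """Hypothetical stand-in for SerializationFormat."""
    YAML = "yaml"
    JSON = "json"


# Assumed mapping from file suffix to format.
_SUFFIXES = {".yaml": Fmt.YAML, ".yml": Fmt.YAML, ".json": Fmt.JSON}


def resolve_format(path: Union[str, Path], fmt: Optional[Fmt] = None) -> Fmt:
    """Return the explicit format if given, else infer it from the suffix."""
    if fmt is not None:
        return fmt
    suffix = Path(path).suffix.lower()
    if suffix not in _SUFFIXES:
        raise ValueError(f"Cannot infer format from extension {suffix!r}")
    return _SUFFIXES[suffix]
```

An explicit `format` argument takes precedence in this sketch, so a file named `config.txt` can still be written as JSON by passing the format directly.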

to_json(config: Dict[str, Any] | DatasetConfigSchema, indent: int = 2, **kwargs) → str

Serialize config to JSON string.

Parameters:
  • config – Configuration dict or schema object.

  • indent – Indentation level.

  • **kwargs – Additional arguments for json.dumps.

Returns:

JSON string.
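Since the extra keyword arguments are documented as going to `json.dumps`, any option that function accepts (for example `ensure_ascii` or `separators`) should be usable. The wrapper below is a sketch of that forwarding pattern, not the library's code; `dumps_config` is a hypothetical name, and defaulting `sort_keys` to `True` is an assumption based on the constructor default shown above.

```python
import json
from typing import Any, Dict


def dumps_config(config: Dict[str, Any], indent: int = 2, **kwargs: Any) -> str:
    """Serialize a plain dict, forwarding extra options to json.dumps."""
    # Assumption: key sorting is on unless the caller overrides it.
    kwargs.setdefault("sort_keys", True)
    return json.dumps(config, indent=indent, **kwargs)


# ensure_ascii=False is forwarded unchanged to json.dumps.
text = dumps_config({"b": 1, "a": "é"}, ensure_ascii=False)
```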

to_yaml(config: Dict[str, Any] | DatasetConfigSchema, **kwargs) → str

Serialize config to YAML string.

Parameters:
  • config – Configuration dict or schema object.

  • **kwargs – Additional arguments for yaml.dump.

Returns:

YAML string.

nirs4all.data.serialization.deserialize_config(content: str, format: SerializationFormat = SerializationFormat.YAML) → Dict[str, Any]

Convenience function to deserialize config.

Parameters:
  • content – Serialized content.

  • format – Content format.

Returns:

Configuration dictionary.
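Deserializing by format amounts to dispatching to the matching parser. The sketch below uses plain strings in place of `SerializationFormat` and a hypothetical `parse_config` helper; the YAML branch relies on the third-party PyYAML package and its `yaml.safe_load` function, imported lazily so that the JSON path works without it.

```python
import json
from typing import Any, Dict


def parse_config(content: str, fmt: str = "yaml") -> Dict[str, Any]:
    """Parse serialized config text into a dictionary."""
    if fmt == "json":
        data = json.loads(content)
    elif fmt == "yaml":
        import yaml  # PyYAML, only needed for YAML input
        data = yaml.safe_load(content)
    else:
        raise ValueError(f"Unsupported format: {fmt!r}")
    if not isinstance(data, dict):
        # A config is expected to be a mapping at the top level.
        raise TypeError("Config content must deserialize to a mapping")
    return data


parsed = parse_config('{"delimiter": ";", "header": true}', fmt="json")
```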

nirs4all.data.serialization.diff_configs(old_config: Dict[str, Any] | DatasetConfigSchema, new_config: Dict[str, Any] | DatasetConfigSchema) → ConfigDiff

Convenience function to diff configs.

Parameters:
  • old_config – Original configuration.

  • new_config – New configuration.

Returns:

ConfigDiff with differences.

nirs4all.data.serialization.serialize_config(config: Dict[str, Any] | DatasetConfigSchema, format: SerializationFormat = SerializationFormat.YAML, **kwargs) → str

Convenience function to serialize config.

Parameters:
  • config – Configuration to serialize.

  • format – Output format.

  • **kwargs – Additional serializer options.

Returns:

Serialized string.