nirs4all.data.serialization.serializer module

Configuration serializer for dataset configurations.

This module provides serialization/deserialization of dataset configurations to/from YAML and JSON formats, with normalization and diffing support.

Phase 8 Implementation - Dataset Configuration Roadmap Section 8.3: Configuration Serialization

class nirs4all.data.serialization.serializer.ConfigDiff(added: Dict[str, Any], removed: Dict[str, Any], changed: Dict[str, Tuple[Any, Any]], unchanged: Set[str])

Bases: object

Result of comparing two configurations.

added

Keys added in the new config.

Type:

Dict[str, Any]

removed

Keys removed from the old config.

Type:

Dict[str, Any]

changed

Keys with different values, with (old, new) tuples.

Type:

Dict[str, Tuple[Any, Any]]

unchanged

Keys with identical values.

Type:

Set[str]

added: Dict[str, Any]
changed: Dict[str, Tuple[Any, Any]]
is_identical() -> bool

Check if configs are identical.

removed: Dict[str, Any]
summary() -> str

Get a summary of changes.

unchanged: Set[str]
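
The four fields and two methods above can be illustrated with a small stand-in. This is a minimal sketch, not the nirs4all implementation: `ConfigDiffSketch` is a hypothetical class, the rule used by `is_identical()` (no additions, removals, or changes) is an assumption, and the text returned by `summary()` is invented because the real format is not documented here. The config keys are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set, Tuple


@dataclass
class ConfigDiffSketch:
    """Hypothetical stand-in mirroring the documented ConfigDiff fields."""

    added: Dict[str, Any] = field(default_factory=dict)
    removed: Dict[str, Any] = field(default_factory=dict)
    changed: Dict[str, Tuple[Any, Any]] = field(default_factory=dict)
    unchanged: Set[str] = field(default_factory=set)

    def is_identical(self) -> bool:
        # Assumption: identical means nothing was added, removed, or changed.
        return not (self.added or self.removed or self.changed)

    def summary(self) -> str:
        # Assumption: this output format is invented for the sketch.
        if self.is_identical():
            return "No changes"
        return (
            f"{len(self.added)} added, {len(self.removed)} removed, "
            f"{len(self.changed)} changed, {len(self.unchanged)} unchanged"
        )


# Illustrative keys only; they are not taken from the nirs4all schema.
diff = ConfigDiffSketch(
    added={"delimiter": ";"},
    changed={"header": ("none", "first_row")},
    unchanged={"train_x"},
)
print(diff.is_identical())
print(diff.summary())
```
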
class nirs4all.data.serialization.serializer.ConfigSerializer(include_defaults: bool = False, normalize: bool = True, sort_keys: bool = True)

Bases: object

Serializer for dataset configurations.

Handles serialization to YAML/JSON with:
  • Normalization of configs before serialization

  • Conversion of numpy arrays to lists

  • Conversion of Path objects to strings

  • Enum value serialization

  • Removal of internal/private keys

Example

```python
from nirs4all.data.serialization.serializer import ConfigSerializer

serializer = ConfigSerializer()

# Serialize to YAML
yaml_str = serializer.to_yaml(config_dict)

# Serialize to JSON
json_str = serializer.to_json(config_dict)

# Save to file
serializer.save(config_dict, "config.yaml")

# Load from file
config = serializer.load("config.yaml")

# Compare configs
diff = serializer.diff(old_config, new_config)
```

INTERNAL_KEYS = {'_normalized', '_original', '_parsed', '_sources', '_variation_mode', '_variations'}
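
The conversions the class description lists (arrays to lists, Path objects to strings, Enum members to their values, internal keys dropped) could be approximated as below. This is a sketch under stated assumptions, not the library's code: `to_plain`, `Mode`, and `FakeArray` are hypothetical names, arrays are detected by duck typing on `tolist()` so the example needs no numpy, and whether the real serializer strips internal keys at every nesting level is not stated in this page.

```python
from enum import Enum
from pathlib import PurePath, PurePosixPath
from typing import Any

# Copied from the documented ConfigSerializer.INTERNAL_KEYS attribute.
INTERNAL_KEYS = {
    "_normalized", "_original", "_parsed",
    "_sources", "_variation_mode", "_variations",
}


def to_plain(value: Any) -> Any:
    """Hypothetical helper: recursively convert a value to plain types."""
    if isinstance(value, dict):
        # Assumption: internal keys are dropped at every nesting level.
        return {k: to_plain(v) for k, v in value.items() if k not in INTERNAL_KEYS}
    if isinstance(value, (list, tuple)):
        return [to_plain(v) for v in value]
    if isinstance(value, Enum):
        return value.value
    if isinstance(value, PurePath):
        return str(value)
    if hasattr(value, "tolist"):
        # Duck-typed stand-in for numpy arrays, which expose tolist().
        return to_plain(value.tolist())
    return value


class Mode(Enum):
    """Illustrative enum, not part of nirs4all."""
    TRAIN = "train"


class FakeArray:
    """Minimal array stand-in that keeps the sketch dependency-free."""

    def __init__(self, data):
        self._data = data

    def tolist(self):
        return list(self._data)


raw = {
    "train_x": PurePosixPath("data/train.csv"),
    "mode": Mode.TRAIN,
    "wavelengths": FakeArray([1100.0, 1102.0]),
    "_parsed": True,  # internal key, expected to be dropped
}
plain = to_plain(raw)
print(plain)
```
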
diff(old_config: Dict[str, Any] | DatasetConfigSchema, new_config: Dict[str, Any] | DatasetConfigSchema) -> ConfigDiff

Compare two configurations.

Parameters:
  • old_config – Original configuration.

  • new_config – New configuration.

Returns:

ConfigDiff with differences.
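
To make the four result buckets concrete, here is one way a key-by-key comparison of two plain dictionaries might be computed. It is a sketch only: `diff_top_level` is a hypothetical function, it compares top-level keys without recursing, and it skips the normalization that the real `diff` may apply first. The config keys are illustrative.

```python
from typing import Any, Dict, Set, Tuple


def diff_top_level(
    old: Dict[str, Any], new: Dict[str, Any]
) -> Tuple[Dict[str, Any], Dict[str, Any], Dict[str, Tuple[Any, Any]], Set[str]]:
    """Hypothetical helper: split keys into added/removed/changed/unchanged."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    shared = old.keys() & new.keys()
    # Changed entries hold (old, new) tuples, as in the documented ConfigDiff.
    changed = {k: (old[k], new[k]) for k in shared if old[k] != new[k]}
    unchanged = {k for k in shared if old[k] == new[k]}
    return added, removed, changed, unchanged


old_config = {"train_x": "a.csv", "delimiter": ",", "header": "none"}
new_config = {"train_x": "a.csv", "delimiter": ";", "decimal": "."}
added, removed, changed, unchanged = diff_top_level(old_config, new_config)
```
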

load(path: str | Path) -> Dict[str, Any]

Load config from file.

Parameters:

path – Path to config file.

Returns:

Configuration dictionary.

Raises:
save(config: Dict[str, Any] | DatasetConfigSchema, path: str | Path, format: SerializationFormat | None = None) -> None

Save config to file.

Parameters:
  • config – Configuration to save.

  • path – Output file path.

  • format – Output format (auto-detected from extension if None).
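
The auto-detection mentioned for `format` might work along these lines. Everything here is an assumption made for illustration: `Format` stands in for `SerializationFormat`, `resolve_format` is a hypothetical helper, and the extension mapping (including `.yml`) and the `ValueError` for unknown extensions are guesses rather than documented behavior.

```python
from enum import Enum
from pathlib import Path
from typing import Optional, Union


class Format(str, Enum):
    """Stand-in for the documented SerializationFormat enum."""
    JSON = "json"
    YAML = "yaml"


# Assumption: this extension-to-format mapping is a guess.
_EXTENSIONS = {".json": Format.JSON, ".yaml": Format.YAML, ".yml": Format.YAML}


def resolve_format(path: Union[str, Path], format: Optional[Format] = None) -> Format:
    """Hypothetical helper: use the explicit format, else infer it from the suffix."""
    if format is not None:
        return format
    suffix = Path(path).suffix.lower()
    try:
        return _EXTENSIONS[suffix]
    except KeyError:
        raise ValueError(f"Cannot infer format from extension {suffix!r}") from None


print(resolve_format("config.yaml"))
print(resolve_format("config.txt", Format.JSON))
```

An explicit `format` wins over the extension in this sketch, so a file with an unusual suffix can still be written in a chosen format.
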

to_json(config: Dict[str, Any] | DatasetConfigSchema, indent: int = 2, **kwargs) -> str

Serialize config to JSON string.

Parameters:
  • config – Configuration dict or schema object.

  • indent – Indentation level.

  • **kwargs – Additional arguments for json.dumps.

Returns:

JSON string.
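
Since `**kwargs` are forwarded to `json.dumps`, it may help to see what the two most relevant options do in the standard library itself. The sketch below calls `json.dumps` directly rather than `to_json`, and the config keys are illustrative.

```python
import json

config = {"train_x": "data/train.csv", "delimiter": ";", "header": "first_row"}

# indent=2 matches the documented default; sort_keys=True makes the output
# independent of the dictionary's insertion order.
json_str = json.dumps(config, indent=2, sort_keys=True)

# The same content inserted in reverse order serializes to the same string.
reordered = dict(reversed(list(config.items())))
same_str = json.dumps(reordered, indent=2, sort_keys=True)

restored = json.loads(json_str)
print(json_str)
```

Stable key order is what makes serialized configs easy to compare in version control.
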

to_yaml(config: Dict[str, Any] | DatasetConfigSchema, **kwargs) -> str

Serialize config to YAML string.

Parameters:
  • config – Configuration dict or schema object.

  • **kwargs – Additional arguments for yaml.dump.

Returns:

YAML string.

class nirs4all.data.serialization.serializer.SerializationFormat(value)

Bases: str, Enum

Supported serialization formats.

JSON = 'json'
YAML = 'yaml'
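
Because the enum derives from both `str` and `Enum`, its members can be looked up by value and compared directly with plain strings. The sketch below uses a local copy named `FormatSketch` (a hypothetical name) so that it runs without nirs4all installed.

```python
from enum import Enum


class FormatSketch(str, Enum):
    """Local copy of the documented members, for illustration only."""
    JSON = "json"
    YAML = "yaml"


fmt = FormatSketch("yaml")        # look up a member by its value
matches = fmt == "yaml"           # str mixin: compares equal to a plain string
values = [member.value for member in FormatSketch]
```
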
nirs4all.data.serialization.serializer.deserialize_config(content: str, format: SerializationFormat = SerializationFormat.YAML) -> Dict[str, Any]

Convenience function to deserialize config.

Parameters:
  • content – Serialized content.

  • format – Content format.

Returns:

Configuration dictionary.

nirs4all.data.serialization.serializer.diff_configs(old_config: Dict[str, Any] | DatasetConfigSchema, new_config: Dict[str, Any] | DatasetConfigSchema) -> ConfigDiff

Convenience function to diff configs.

Parameters:
  • old_config – Original configuration.

  • new_config – New configuration.

Returns:

ConfigDiff with differences.

nirs4all.data.serialization.serializer.serialize_config(config: Dict[str, Any] | DatasetConfigSchema, format: SerializationFormat = SerializationFormat.YAML, **kwargs) -> str

Convenience function to serialize config.

Parameters:
  • config – Configuration to serialize.

  • format – Output format.

  • **kwargs – Additional serializer options.

Returns:

Serialized string.