nirs4all.data.performance.cache module
Data caching for dataset loading.
This module provides caching functionality to avoid redundant file loading and improve performance for repeated data access.
Phase 8 Implementation - Dataset Configuration Roadmap
Section 8.5: Performance Optimization - Caching
- class nirs4all.data.performance.cache.CacheEntry(data: Any, key: str, timestamp: float = <factory>, size_bytes: int = 0, source_path: str | None = None, source_mtime: float | None = None, hit_count: int = 0)
Bases: object
A cached data entry.
- data
The cached data.
- Type:
Any
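The field names in the signature above suggest how an entry could support file modification detection. The following is a minimal sketch, not the library's code: `EntrySketch` is a hypothetical stand-in that mirrors the documented fields, `is_stale` is a hypothetical helper, and the use of `time.time` for the `<factory>` default is an assumption.

```python
import os
import time
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class EntrySketch:
    """Hypothetical stand-in mirroring the documented CacheEntry fields."""

    data: Any
    key: str
    # Assumption: the "<factory>" default records the creation time.
    timestamp: float = field(default_factory=time.time)
    size_bytes: int = 0
    source_path: Optional[str] = None
    source_mtime: Optional[float] = None
    hit_count: int = 0


def is_stale(entry: EntrySketch) -> bool:
    """Hypothetical helper: True if the source file changed or disappeared."""
    if entry.source_path is None or entry.source_mtime is None:
        return False  # no source recorded, so there is nothing to compare
    try:
        return os.path.getmtime(entry.source_path) != entry.source_mtime
    except OSError:
        return True  # file is missing or unreadable
```

An entry built with `source_mtime=os.path.getmtime(path)` would be reported as stale once the file at `path` is rewritten or deleted.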
- class nirs4all.data.performance.cache.DataCache(max_size_mb: float = 500, max_entries: int = 100, ttl_seconds: float | None = None)
Bases: object
LRU cache for loaded data.
Provides in-memory caching with:
- Configurable size limits
- LRU eviction policy
- File modification detection
- Thread-safe access
- Cache statistics
Example
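```python
from nirs4all.data.performance.cache import DataCache

cache = DataCache(max_size_mb=500)

# Store data (numpy_array is a placeholder for an already loaded array)
cache.set("my_data", numpy_array, source_path="/path/to/file.csv")

# Retrieve data
data = cache.get("my_data")

# With automatic loading (load_expensive_data is a placeholder loader)
data = cache.get_or_load("key", lambda: load_expensive_data())

# Check stats
print(cache.stats())
```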
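To illustrate what size-limited LRU eviction with thread-safe access can look like, here is a minimal, self-contained sketch. It is not the `DataCache` implementation: `TinyLRU` and its parameters are hypothetical, sizes are estimated with `sys.getsizeof` unless given explicitly, and keeping an oversized newest entry is a design choice made only for this example.

```python
import sys
import threading
from collections import OrderedDict
from typing import Any, Optional


class TinyLRU:
    """Hypothetical LRU cache bounded by entry count and total bytes."""

    def __init__(self, max_bytes: int, max_entries: int) -> None:
        self._items: "OrderedDict[str, tuple]" = OrderedDict()  # key -> (data, size)
        self._max_bytes = max_bytes
        self._max_entries = max_entries
        self._total = 0
        self._lock = threading.RLock()

    def set(self, key: str, data: Any, size_bytes: Optional[int] = None) -> None:
        size = sys.getsizeof(data) if size_bytes is None else size_bytes
        with self._lock:
            if key in self._items:
                self._total -= self._items.pop(key)[1]
            self._items[key] = (data, size)  # newest entry goes to the end
            self._total += size
            # Evict least recently used entries until both limits hold.
            # The newest entry is always kept, even if it alone is too large.
            while len(self._items) > 1 and (
                self._total > self._max_bytes
                or len(self._items) > self._max_entries
            ):
                _, (_, old_size) = self._items.popitem(last=False)
                self._total -= old_size

    def get(self, key: str) -> Any:
        with self._lock:
            if key not in self._items:
                return None
            self._items.move_to_end(key)  # mark as most recently used
            return self._items[key][0]

    def invalidate(self, key: str) -> bool:
        with self._lock:
            if key not in self._items:
                return False
            self._total -= self._items.pop(key)[1]
            return True
```

Reading a key moves it to the most recently used end, so entries that are never read are the first to be evicted when a limit is exceeded.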
- get(key: str) → Any | None
Get data from cache.
- Parameters:
key – Cache key.
- Returns:
Cached data or None if not found.
- get_or_load(key: str, loader: Callable[[], T], source_path: str | None = None) → T
Get from cache or load and cache.
- Parameters:
key – Cache key.
loader – Function to call if not cached.
source_path – Optional source file path.
- Returns:
Cached or newly loaded data.
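The get-or-load pattern described above can be sketched in a few lines. This is an illustration under stated assumptions, not the method's actual code: `get_or_load_sketch`, `_store`, and `_lock` are hypothetical names, `source_path` handling is omitted, and running the loader outside the lock is a choice made for this example.

```python
import threading
from typing import Callable, Dict, TypeVar

T = TypeVar("T")

_store: Dict[str, object] = {}  # hypothetical backing store
_lock = threading.Lock()


def get_or_load_sketch(key: str, loader: Callable[[], T]) -> T:
    """Return the cached value for key, calling loader only on a miss."""
    with _lock:
        if key in _store:
            return _store[key]  # type: ignore[return-value]
    # Load outside the lock so a slow loader does not block other keys.
    value = loader()
    with _lock:
        # If another thread stored the key meanwhile, keep its value.
        return _store.setdefault(key, value)  # type: ignore[return-value]
```

With this shape, a second call using the same key returns the stored object without invoking the loader again.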
- invalidate(key: str) → bool
Remove entry from cache.
- Parameters:
key – Cache key.
- Returns:
True if entry was removed.