nirs4all.operators.filters.report module
Filtering report generator for sample filtering operations.
Provides utilities to generate comprehensive reports about sample filtering, including statistics, visualizations, and export capabilities.
- class nirs4all.operators.filters.report.FilterResult(filter_name: str, reason: str, n_samples: int, n_excluded: int, n_kept: int, exclusion_rate: float, excluded_indices: List[int] = <factory>, stats: Dict[str, Any] = <factory>)
Bases: object
Result of applying a single filter.
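How the count fields might relate to one another can be sketched with a stand-in dataclass. This is not the library's class: `FilterResultSketch` and `make_result` are hypothetical names, the `reason` string is made up, and the relations `n_kept = n_samples - n_excluded` and `exclusion_rate = n_excluded / n_samples` are assumptions inferred from the field names rather than confirmed behaviour.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FilterResultSketch:
    """Hypothetical stand-in mirroring the documented FilterResult fields."""

    filter_name: str
    reason: str
    n_samples: int
    n_excluded: int
    n_kept: int
    exclusion_rate: float
    excluded_indices: List[int] = field(default_factory=list)
    stats: Dict[str, Any] = field(default_factory=dict)


def make_result(filter_name: str, reason: str, n_samples: int,
                excluded_indices: List[int]) -> FilterResultSketch:
    """Derive the count fields from the excluded indices (assumed relations)."""
    n_excluded = len(excluded_indices)
    # Guard against an empty partition to avoid dividing by zero.
    rate = n_excluded / n_samples if n_samples else 0.0
    return FilterResultSketch(
        filter_name=filter_name,
        reason=reason,
        n_samples=n_samples,
        n_excluded=n_excluded,
        n_kept=n_samples - n_excluded,
        exclusion_rate=rate,
        excluded_indices=list(excluded_indices),
    )


result = make_result("YOutlierFilter", "iqr outlier", 200, [3, 17, 42, 108])
```

Deriving every count from `excluded_indices` in one place keeps the fields mutually consistent, which is useful if results are later aggregated into a report.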
- class nirs4all.operators.filters.report.FilteringReport(dataset_name: str, partition: str, timestamp: str = <factory>, filter_results: List[FilterResult] = <factory>, combined_mode: str = 'any', n_total_samples: int = 0, n_final_excluded: int = 0, n_final_kept: int = 0, cascade_to_augmented: bool = True, n_augmented_excluded: int = 0)
Bases: object
Comprehensive report of sample filtering operations.
This class aggregates results from multiple filters and provides methods for analysis, visualization, and export.
- filter_results
List of individual filter results.
- Type: List[FilterResult]
- add_filter_result(result: FilterResult) -> None
Add a filter result to the report.
- filter_results: List[FilterResult]
- print_report(verbose: int = 1) -> None
Print the filtering report to console.
- Parameters:
verbose – Verbosity level (0=minimal, 1=normal, 2=detailed)
- class nirs4all.operators.filters.report.FilteringReportGenerator(dataset: SpectroDataset)
Bases: object
Generator for creating comprehensive filtering reports.
This class provides utilities for collecting filter statistics, generating reports, and exporting results.
Example
>>> generator = FilteringReportGenerator(dataset)
>>> report = generator.create_report(
...     filters=[YOutlierFilter(method="iqr")],
...     mode="any",
...     partition="train"
... )
>>> report.print_report()
- compare_filters(filters: List[SampleFilter], X: ndarray, y: ndarray) -> Dict[str, Any]
Compare multiple filters on the same data without applying them.
Useful for understanding which filter is more aggressive or to find the overlap between filter decisions.
- Parameters:
filters – List of filters to compare
X – Feature array
y – Target array
- Returns:
Dictionary with comparison statistics:
individual: Per-filter stats
overlap: Samples flagged by multiple filters
unique: Samples flagged by only one filter
- Return type:
Dict[str, Any]
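The three keys described above can be illustrated with plain index sets. This is only a sketch under stated assumptions: `compare_flagged` is a hypothetical helper, the filter names are invented, and the exact structure of the values returned by `compare_filters` is not documented here, so the shapes below (counts, sorted index lists) are illustrative choices.

```python
from typing import Any, Dict, Set


def compare_flagged(flagged: Dict[str, Set[int]]) -> Dict[str, Any]:
    """Summarise per-filter flagged sample indices (hypothetical helper)."""
    # Count how many filters flagged each sample index.
    counts: Dict[int, int] = {}
    for indices in flagged.values():
        for idx in indices:
            counts[idx] = counts.get(idx, 0) + 1

    individual = {name: {"n_flagged": len(indices)}
                  for name, indices in flagged.items()}
    # Samples flagged by more than one filter.
    overlap = sorted(idx for idx, c in counts.items() if c > 1)
    # Samples flagged by exactly one filter, grouped by that filter.
    unique = {name: sorted(idx for idx in indices if counts[idx] == 1)
              for name, indices in flagged.items()}
    return {"individual": individual, "overlap": overlap, "unique": unique}


summary = compare_flagged({
    "y_outlier": {2, 5, 9},        # invented filter name and indices
    "x_outlier": {5, 9, 11, 14},   # invented filter name and indices
})
```

A filter with a large `unique` list relative to the others is the more aggressive one on this data, while a large `overlap` suggests the filters are largely redundant.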
- create_report(filters: List[SampleFilter], X: ndarray, y: ndarray, sample_indices: ndarray, mode: str = 'any', partition: str = 'train', cascade_to_augmented: bool = True, dry_run: bool = True) -> FilteringReport
Create a filtering report by applying filters to data.
- Parameters:
filters – List of SampleFilter instances to apply
X – Feature array (n_samples, n_features)
y – Target array (n_samples,) or (n_samples, n_targets)
sample_indices – Array of sample indices corresponding to X/y
mode – Filter combination mode (“any” or “all”)
partition – Which partition is being filtered
cascade_to_augmented – Whether exclusions are cascaded to augmented samples
dry_run – If True, don’t actually mark samples as excluded
- Returns:
FilteringReport with all statistics and results
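The `mode` parameter combines the decisions of several filters. A plausible reading, which is an assumption and not confirmed by this page, is that "any" excludes a sample when at least one filter flags it, and "all" excludes it only when every filter flags it. The sketch below shows that interpretation; `combine_masks` is a hypothetical helper and the masks are invented.

```python
from typing import List


def combine_masks(masks: List[List[bool]], mode: str = "any") -> List[bool]:
    """Combine per-filter exclusion masks (True = flagged for exclusion).

    Assumed semantics: "any" flags a sample if at least one filter does,
    "all" flags it only if every filter does.
    """
    if mode not in ("any", "all"):
        raise ValueError(f"mode must be 'any' or 'all', got {mode!r}")
    reducer = any if mode == "any" else all
    # zip(*masks) yields, for each sample, its flags across all filters.
    return [reducer(flags) for flags in zip(*masks)]


mask_a = [True, False, True, False]   # invented decisions of a first filter
mask_b = [True, True, False, False]   # invented decisions of a second filter

excluded_any = combine_masks([mask_a, mask_b], mode="any")
excluded_all = combine_masks([mask_a, mask_b], mode="all")
```

Under this reading, "any" is the stricter setting (more samples excluded) and "all" the more conservative one.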
- generate_from_indexer(partition: str | None = 'train') -> FilteringReport
Generate a report from current indexer exclusion state.
This method creates a report based on samples already marked as excluded in the indexer, rather than applying filters.
- Parameters:
partition – Partition to report on (None for all partitions)
- Returns:
FilteringReport based on current exclusion state