nirs4all.visualization.chart_utils package

Module contents

Utility classes for chart generation.

class nirs4all.visualization.chart_utils.ChartAnnotator(config: ChartConfig | Dict[str, Any] | None = None)

Bases: object

Helper for adding annotations to charts.

Centralizes text formatting, positioning, and color selection for chart annotations. Uses ChartConfig for styling.

config: ChartConfig instance for customization.

add_heatmap_annotations(ax, matrix: ndarray, normalized_matrix: ndarray, count_matrix: ndarray, x_labels: List, y_labels: List, show_counts: bool = True, precision: int = 3) → None

Add text annotations to heatmap cells.

Parameters:

ax – Matplotlib axes object.
matrix – Original score matrix.
normalized_matrix – Normalized matrix for color selection.
count_matrix – Matrix of sample counts.
x_labels – List of x-axis labels.
y_labels – List of y-axis labels.
show_counts – Whether to show sample counts.
precision – Number of decimal places for scores.

add_statistics_box(ax, values: List[float], position: str = 'upper right', precision: int = 4) → None

Add statistics text box to plot.

Parameters:

ax – Matplotlib axes object.
values – List of values to compute statistics from.
position – Position string for text box placement.
precision – Number of decimal places.

static get_text_color(background_value: float, threshold: float = 0.5) → str

Determine text color based on background for optimal contrast.

Parameters:

background_value – Normalized background value (0-1).
threshold – Threshold for switching from white to black text.

Returns:

Color string (always ‘black’ for consistency).

class nirs4all.visualization.chart_utils.DataAggregator

Bases: object

Aggregate scores using different strategies.

Supports multiple aggregation methods with proper handling of ranking information (when display and rank metrics differ).

static aggregate(scores: List, method: str, higher_better: bool) → float

Aggregate scores using specified method.

Parameters:

scores – List of scores (can be floats or tuples of (display_score, rank_score)).
method – Aggregation method (‘best’, ‘worst’, ‘mean’, ‘median’).
higher_better – Whether higher values are better.

Returns:

Aggregated score value.
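The aggregation described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the function name aggregate_scores is hypothetical, and it assumes that 'best' and 'worst' select an entry by its rank score and report that entry's display score, that higher_better refers to the rank score, and that an empty list yields NaN.

```python
from statistics import mean, median
from typing import List, Tuple, Union

Score = Union[float, Tuple[float, float]]


def aggregate_scores(scores: List[Score], method: str, higher_better: bool) -> float:
    """Hypothetical stand-in for DataAggregator.aggregate (not the library code)."""
    if not scores:
        # Assumption: an empty group has no meaningful score.
        return float("nan")

    # Treat every entry as a (display_score, rank_score) pair;
    # a bare float ranks by its own value.
    pairs = [s if isinstance(s, tuple) else (s, s) for s in scores]

    if method in ("best", "worst"):
        # Choose the entry by rank score, then report its display score.
        pick_max = higher_better if method == "best" else not higher_better
        chooser = max if pick_max else min
        return chooser(pairs, key=lambda p: p[1])[0]

    display = [p[0] for p in pairs]
    if method == "mean":
        return mean(display)
    if method == "median":
        return median(display)
    raise ValueError(f"Unknown aggregation method: {method!r}")
```

With tuples, the rank score decides which entry wins while the display score is what ends up in the chart, which is the case the class description calls out (display and rank metrics differ).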

class nirs4all.visualization.chart_utils.MatrixBuilder

Bases: object

Build matrices for heatmap visualizations.

Handles grouping scores by variables and creating 2D matrices with support for different aggregation strategies.

Optimized to work with PredictionResultsList from predictions.top().

static build_matrices(score_dict: Dict, aggregation: str, higher_better: bool, natural_sort: bool = True) → Tuple[List, List, ndarray, ndarray]

Build matrices from score dictionary.

Parameters:

score_dict – Dict of scores grouped by x and y variables. Can be {y: {x: [scores]}} or {y: {x: (score, count)}}.
aggregation – Aggregation method (‘best’, ‘mean’, ‘median’, ‘identity’). Use ‘identity’ if scores are already aggregated tuples.
higher_better – Whether higher values are better.
natural_sort – Whether to use natural sorting for labels.

Returns:

Tuple of (y_labels, x_labels, score_matrix, count_matrix).

static build_score_dict(predictions_list, x_var: str, y_var: str, display_score_field: str, rank_field: str | None = None) → Dict

Group scores by x and y variables from PredictionResultsList.

Parameters:

predictions_list – List of prediction results.
x_var – Variable name for x-axis grouping.
y_var – Variable name for y-axis grouping.
display_score_field – Field name for display scores.
rank_field – Optional field name for ranking scores.

Returns:

Dict structure: {y_val: {x_val: [(display_score, rank_score), …]}}, or {y_val: {x_val: [score1, score2, …]}} if no rank_field is given.
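The grouping itself can be sketched with nested dictionaries. This is an illustration under stated assumptions, not the library's implementation: group_scores is a hypothetical name, prediction results are modeled as plain dicts (the real PredictionResult objects may expose fields differently), and records without a display score are skipped.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping, Optional


def group_scores(
    records: Iterable[Mapping[str, Any]],
    x_var: str,
    y_var: str,
    display_score_field: str,
    rank_field: Optional[str] = None,
) -> Dict[Any, Dict[Any, list]]:
    """Hypothetical grouping of scores into {y_val: {x_val: [...]}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for rec in records:
        display = rec.get(display_score_field)
        if display is None:
            # Assumption: records without a display score are ignored.
            continue
        if rank_field is None:
            entry = display
        else:
            # Keep the rank score next to the display score.
            entry = (display, rec.get(rank_field))
        grouped[rec[y_var]][rec[x_var]].append(entry)
    # Convert to plain dicts so the result compares and prints cleanly.
    return {y: dict(xs) for y, xs in grouped.items()}
```

When rank_field is given, each list entry becomes a (display_score, rank_score) tuple, which is the second input form accepted by the aggregation step.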

static build_score_dict_with_dynamic_partition(predictions_list, x_var: str, y_var: str, metric: str, use_rank_scores: bool = False) → Dict

Group scores by x and y variables when partition is one of the grouping variables.

This method handles the special case where ‘partition’ is used as x_var or y_var. It extracts the score from the appropriate partition field based on the partition value.

Parameters:

predictions_list – List of prediction results.
x_var – Variable name for x-axis grouping.
y_var – Variable name for y-axis grouping.
metric – Metric name to extract scores for.
use_rank_scores – If True, include rank scores for proper aggregation.

Returns:

Dict structure: {y_val: {x_val: [score1, score2, …]}}, or {y_val: {x_val: [(display_score, rank_score), …]}} if use_rank_scores is True.

class nirs4all.visualization.chart_utils.PredictionsAdapter(predictions)

Bases: object

Adapter for Predictions API with optimized data access.

Wraps the refactored Predictions API to provide convenient methods for charts. Leverages predictions.top(), lazy loading, and structured results.

Key optimizations:

- Uses predictions.top() for efficient ranking.
- Supports lazy loading (load_arrays=False) for metadata-only queries.
- Works with the PredictionResult and PredictionResultsList classes.
- Avoids redundant metric calculations.

predictions: Predictions object instance.

extract_metric_values(predictions_list: PredictionResultsList, metric: str, partition: str = 'test') → List[float]

Extract metric values from prediction results.

Parameters:

predictions_list – List of prediction results.
metric – Metric name to extract.
partition – Partition to extract from (default: ‘test’).

Returns:

List of metric values.

get_all_predictions_metadata(rank_metric: str = 'rmse', rank_partition: str = 'test', **filters) → PredictionResultsList

Get all predictions matching filters (metadata only, fast).

Parameters:

rank_metric – Metric for sorting (default: ‘rmse’).
rank_partition – Partition for sorting (default: ‘test’).
**filters – Filters to apply (dataset_name, model_name, etc.).

Returns:

PredictionResultsList with all matching predictions (no arrays loaded).

get_top_models(n: int, rank_metric: str, rank_partition: str = 'val', ascending: bool | None = None, load_arrays: bool = True, **filters) → PredictionResultsList

Get top N models using predictions.top() API.

Parameters:

n – Number of top models to retrieve.
rank_metric – Metric to rank by.
rank_partition – Partition to rank on (default: ‘val’).
ascending – Sort order (None = auto-detect from metric).
load_arrays – Whether to load prediction arrays (default: True).
**filters – Additional filters (dataset_name, model_name, etc.).

Returns:

PredictionResultsList of top N models.

static is_higher_better(metric: str) → bool

Check if metric is higher-is-better.

Parameters:

metric – Metric name.

Returns:

True if higher is better, False otherwise.
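The way a metric's direction can drive the sort order of a top-N query (ascending=None meaning auto-detect) is sketched below. This is an illustration only: HIGHER_IS_BETTER, is_higher_better_sketch and top_n are hypothetical names, the metric set is an assumed example rather than the library's actual list, and results are modeled as plain dicts instead of PredictionResult objects.

```python
from typing import Any, Dict, List, Optional

# Assumption: an illustrative set, not the library's real metric table.
HIGHER_IS_BETTER = {"r2", "accuracy", "f1"}


def is_higher_better_sketch(metric: str) -> bool:
    """Return True when larger values of the metric mean a better model."""
    return metric.lower() in HIGHER_IS_BETTER


def top_n(
    records: List[Dict[str, Any]],
    n: int,
    rank_metric: str,
    ascending: Optional[bool] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical top-N selection with auto-detected sort order."""
    if ascending is None:
        # Error-like metrics (lower is better) sort ascending,
        # score-like metrics (higher is better) sort descending.
        ascending = not is_higher_better_sketch(rank_metric)
    ranked = sorted(records, key=lambda r: r[rank_metric], reverse=not ascending)
    return ranked[:n]
```

Passing ascending explicitly overrides the detection, which mirrors the documented meaning of None as "auto-detect from metric".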

class nirs4all.visualization.chart_utils.ScoreNormalizer

Bases: object

Normalize scores for visualization.

Handles normalization to [0, 1] range with support for both higher-is-better and lower-is-better metrics.

static is_higher_better(metric: str) → bool

Check if metric is higher-is-better.

Parameters:

metric – Metric name.

Returns:

True if higher is better, False otherwise.

static normalize(matrix: ndarray, higher_better: bool, per_row: bool = False, per_column: bool = False) → ndarray

Normalize matrix values to [0, 1] range.

Parameters:

matrix – Input matrix to normalize.
higher_better – Whether higher values are better.
per_row – If True, normalize each row independently.
per_column – If True, normalize each column independently. Takes precedence over per_row if both are True.

Returns:

Normalized matrix with values in [0, 1] range.
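A minimal min-max sketch of this normalization with NumPy follows. It is not the library's implementation: normalize_sketch is a hypothetical name, and three behaviors are assumptions: the output is flipped for lower-is-better metrics so that 1 always marks the best cell, a constant row, column or matrix maps to 0.5, and NaN cells stay NaN.

```python
import numpy as np


def normalize_sketch(matrix, higher_better: bool,
                     per_row: bool = False, per_column: bool = False) -> np.ndarray:
    """Hypothetical min-max scaling to [0, 1], where 1 marks the best value."""
    values = np.asarray(matrix, dtype=float)

    # per_column takes precedence over per_row, as documented above.
    if per_column:
        axis = 0
    elif per_row:
        axis = 1
    else:
        axis = None  # one global minimum and maximum

    lo = np.nanmin(values, axis=axis, keepdims=True)
    hi = np.nanmax(values, axis=axis, keepdims=True)
    span = hi - lo

    # Avoid dividing by zero when all values in a group are equal.
    safe_span = np.where(span == 0, 1.0, span)
    scaled = np.where(span == 0, 0.5, (values - lo) / safe_span)

    # Keep missing cells missing.
    scaled = np.where(np.isnan(values), np.nan, scaled)

    # Flip lower-is-better metrics so that 1 is always the best cell.
    return scaled if higher_better else 1.0 - scaled
```

For example, normalizing [[1, 2, 3], [10, 20, 30]] per row with higher_better=False gives [[1, 0.5, 0], [1, 0.5, 0]], since the smallest value in each row is the best one.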