nirs4all.analysis.transfer_metrics module
Transfer Metrics Computation.
This module provides fast, optimized computation of transfer-focused metrics between two datasets in PCA space. Metrics are designed to assess how well preprocessing aligns datasets for transfer learning scenarios.
Metrics computed:
- Centroid Distance: Euclidean distance between dataset centroids in PCA space
- CKA (Centered Kernel Alignment): Representation similarity
- Grassmann Distance: Angular distance between PCA subspaces
- RV Coefficient: Multivariate correlation structure
- Procrustes Disparity: Shape alignment after optimal transformation
- Trustworthiness: Neighborhood preservation
- Spread Distance: Distribution overlap combining covariance and sample distances
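To make two of the listed metrics concrete, here is a minimal NumPy sketch of a centroid distance and a Grassmann distance. It is not the module's implementation: the helper names (`pca_basis`, `centroid_distance`, `grassmann_distance`) are hypothetical, and projecting both datasets onto the PCA basis fitted on the source is an assumption, since the text above does not say which PCA space is used.

```python
import numpy as np


def pca_basis(X, n_components):
    """Return (mean, components); components has shape (n_components, n_features)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]


def centroid_distance(X_source, X_target, n_components=2):
    """Euclidean distance between the two centroids in PCA space.

    Assumption: both datasets are projected with the PCA fitted on the source.
    """
    mean, comps = pca_basis(X_source, n_components)
    Z_src = (X_source - mean) @ comps.T
    Z_tgt = (X_target - mean) @ comps.T
    return float(np.linalg.norm(Z_src.mean(axis=0) - Z_tgt.mean(axis=0)))


def grassmann_distance(X_source, X_target, n_components=2):
    """Geodesic distance between PCA subspaces: sqrt(sum of squared principal angles)."""
    _, U = pca_basis(X_source, n_components)
    _, V = pca_basis(X_target, n_components)
    # Singular values of U V^T are the cosines of the principal angles.
    cosines = np.linalg.svd(U @ V.T, compute_uv=False)
    angles = np.arccos(np.clip(cosines, -1.0, 1.0))
    return float(np.sqrt(np.sum(angles ** 2)))


rng = np.random.default_rng(0)
X_src = rng.normal(size=(50, 8))
X_tgt = X_src + 3.0  # constant offset: centroids move, the PCA subspace does not
print(centroid_distance(X_src, X_tgt), grassmann_distance(X_src, X_tgt))
```

A constant offset between datasets shows up in the centroid distance but leaves the Grassmann distance near zero, which is why the two metrics are complementary.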
- class nirs4all.analysis.transfer_metrics.TransferMetrics(centroid_distance: float, cka_similarity: float, grassmann_distance: float, rv_coefficient: float, procrustes_disparity: float, trustworthiness: float, spread_distance: float, evr_source: float, evr_target: float)
Bases: object
Container for transfer metrics between two datasets.
- class nirs4all.analysis.transfer_metrics.TransferMetricsComputer(n_components: int = 10, k_neighbors: int = 10, random_state: int = 0)
Bases: object
Fast computation of transfer metrics between two datasets.
Key optimization: Computes PCA once per dataset, then reuses for all metric computations.
- Parameters:
n_components – Number of PCA components for projection.
k_neighbors – Number of neighbors for trustworthiness computation.
random_state – Random state for reproducibility.
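The "compute PCA once, reuse everywhere" idea can be sketched as a small cache object. This is an illustration only: `CachedPCA` and its attributes are hypothetical names, not part of nirs4all, and treating the explained variance as a single summed ratio per dataset is an assumption about what the `evr_source` and `evr_target` fields hold.

```python
import numpy as np


class CachedPCA:
    """Hypothetical helper: fit PCA once and keep what later metrics need."""

    def __init__(self, X, n_components=10):
        # Cap the number of components by what the data can support.
        k = min(n_components, X.shape[0] - 1, X.shape[1])
        self.mean_ = X.mean(axis=0)
        _, s, Vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = Vt[:k]  # (k, n_features)
        self.scores_ = (X - self.mean_) @ self.components_.T  # (n_samples, k)
        var = s ** 2
        # Fraction of the total variance retained by the k kept components.
        self.evr_ = float(var[:k].sum() / var.sum())

    def transform(self, X):
        """Project new samples onto the cached basis without refitting."""
        return (X - self.mean_) @ self.components_.T


rng = np.random.default_rng(0)
X = rng.normal(size=(30, 6))
pca = CachedPCA(X, n_components=3)
print(pca.scores_.shape, pca.evr_)
```

Every metric that needs scores, components, or explained variance can then read them from the same object, so the decomposition is paid for only once per dataset.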
- compute(X_source: ndarray, X_target: ndarray, compute_trust: bool = True) → TransferMetrics
Compute all transfer metrics between two datasets.
- Parameters:
X_source – Source dataset (n_samples_src, n_features).
X_target – Target dataset (n_samples_tgt, n_features).
compute_trust – Whether to compute trustworthiness (slower).
- Returns:
TransferMetrics containing all computed metrics.
- compute_raw_and_preprocessed(X_source_raw: ndarray, X_target_raw: ndarray, X_source_pp: ndarray, X_target_pp: ndarray, compute_trust: bool = True) → Tuple[TransferMetrics, TransferMetrics, Dict[str, float]]
Compute metrics for both raw and preprocessed data, plus improvement.
- Parameters:
X_source_raw – Raw source dataset.
X_target_raw – Raw target dataset.
X_source_pp – Preprocessed source dataset.
X_target_pp – Preprocessed target dataset.
compute_trust – Whether to compute trustworthiness.
- Returns:
Tuple of (raw_metrics, pp_metrics, improvements_dict)
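The documentation does not specify how the improvements dictionary is derived. One plausible reading, sketched below under that assumption, is a signed relative change per metric where a positive value always means preprocessing helped. The function name, the `eps` guard, and the use of plain dictionaries instead of `TransferMetrics` objects are all illustrative choices.

```python
from typing import Dict

# Distances and disparities: smaller values mean better alignment.
LOWER_IS_BETTER = ("centroid_distance", "grassmann_distance",
                   "procrustes_disparity", "spread_distance")
# Similarities: larger values mean better alignment.
HIGHER_IS_BETTER = ("cka_similarity", "rv_coefficient", "trustworthiness")


def relative_improvements(raw: Dict[str, float], pp: Dict[str, float],
                          eps: float = 1e-12) -> Dict[str, float]:
    """Signed relative change per metric; positive means preprocessing helped."""
    out = {}
    for name in LOWER_IS_BETTER:
        out[name] = (raw[name] - pp[name]) / (abs(raw[name]) + eps)
    for name in HIGHER_IS_BETTER:
        out[name] = (pp[name] - raw[name]) / (abs(raw[name]) + eps)
    return out


raw = {"centroid_distance": 2.0, "grassmann_distance": 1.0,
       "procrustes_disparity": 0.4, "spread_distance": 3.0,
       "cka_similarity": 0.5, "rv_coefficient": 0.6, "trustworthiness": 0.8}
pp = {"centroid_distance": 1.0, "grassmann_distance": 1.0,
      "procrustes_disparity": 0.5, "spread_distance": 1.5,
      "cka_similarity": 0.75, "rv_coefficient": 0.6, "trustworthiness": 0.8}
improvements = relative_improvements(raw, pp)
print(improvements)
```

Flipping the sign for distance-like metrics keeps the dictionary easy to read: any negative entry flags a metric that preprocessing made worse.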
- nirs4all.analysis.transfer_metrics.compute_transfer_score(metrics: TransferMetrics, raw_metrics: TransferMetrics | None = None, weights: Dict[str, float] | None = None) → float
Compute a composite transfer score from metrics.
Higher scores indicate better transfer potential.
- Parameters:
metrics – TransferMetrics from preprocessed data.
raw_metrics – Optional baseline metrics for computing improvements.
weights – Optional custom weights for metric combination.
- Returns:
Composite transfer score (0-1 scale, higher is better). Returns NaN if critical metrics are invalid.
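As a rough illustration of a composite score with these properties (0 to 1 scale, higher is better, NaN when critical metrics are invalid), the sketch below maps each metric to a "goodness" value in [0, 1] and takes a weighted average. The weights, the `1 / (1 + d)` mapping for distances, and the choice of critical metrics are all assumptions made for the example, not the library's actual formula.

```python
import math
from typing import Dict, Optional, Tuple

DISTANCES = ("centroid_distance", "grassmann_distance",
             "procrustes_disparity", "spread_distance")
SIMILARITIES = ("cka_similarity", "rv_coefficient", "trustworthiness")
# Hypothetical default: every metric counts equally.
DEFAULT_WEIGHTS = {name: 1.0 for name in DISTANCES + SIMILARITIES}


def composite_score(metrics: Dict[str, float],
                    weights: Optional[Dict[str, float]] = None,
                    critical: Tuple[str, ...] = ("centroid_distance",
                                                 "cka_similarity")) -> float:
    """Weighted average of per-metric goodness values, each in [0, 1]."""
    weights = DEFAULT_WEIGHTS if weights is None else weights
    # Refuse to score when a critical metric is NaN or infinite.
    if any(not math.isfinite(metrics[name]) for name in critical):
        return float("nan")
    total, weight_sum = 0.0, 0.0
    for name, w in weights.items():
        value = metrics[name]
        if not math.isfinite(value):
            continue  # skip non-critical metrics that could not be computed
        if name in DISTANCES:
            goodness = 1.0 / (1.0 + max(value, 0.0))  # 0 distance -> 1.0
        else:
            goodness = min(max(value, 0.0), 1.0)  # clamp similarity to [0, 1]
        total += w * goodness
        weight_sum += w
    return total / weight_sum if weight_sum > 0 else float("nan")


perfect = {**{name: 0.0 for name in DISTANCES},
           **{name: 1.0 for name in SIMILARITIES}}
print(composite_score(perfect))
```

Normalizing by the sum of the weights actually used keeps the result on the 0 to 1 scale even when a non-critical metric is skipped.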