PredictionResultsList Reference
The PredictionResultsList class is a specialized list container that wraps lists of PredictionResult objects returned by the top() method of the Predictions class. It provides additional functionality while maintaining full compatibility with standard Python list operations.
Quick Reference
Get Top Predictions
# Get top predictions (returns PredictionResultsList)
top_models = predictions.top(n=5, rank_metric="mse", aggregate_partitions=True)
Save All Predictions to CSV
top_models.save(path="results", filename="top_5_models.csv")
Get Prediction by ID
prediction = top_models.get("abc123")
if prediction:
    print(f"Found: {prediction.model_name}")
Print Summary Report
print(top_models[0].summary())
Output:
|----------|---------|----------|--------|--------|--------|--------|
|          | Nsample | Nfeature | R²     | RMSE   | MSE    | MAE    |
|----------|---------|----------|--------|--------|--------|--------|
| Cros Val | 50      | 100      | 0.966  | 0.195  | 0.038  | 0.160  |
| Train    | 50      | 100      | 0.944  | 0.231  | 0.053  | 0.191  |
| Test     | 50      | 100      | 0.962  | 0.176  | 0.031  | 0.141  |
|----------|---------|----------|--------|--------|--------|--------|
Standard List Operations
len(top_models)           # Length
top_models[0]             # Indexing
top_models[:3]            # Slicing
for model in top_models:  # Iteration
    ...
Key Features
Extended Functionality
save(path, filename): Save all predictions to a single structured CSV file
get(id): Retrieve a prediction by its unique ID
Standard list operations: indexing, slicing, iteration, length, etc.
Enhanced PredictionResult
summary(): Generate a formatted tabular report with metrics for the train/val/test partitions
save_to_csv(path_or_file, filename): Save an individual prediction to CSV
eval_score(metrics): Calculate metrics for the prediction
Usage Examples
Basic Usage
from nirs4all.data import Predictions
predictions = Predictions()
# Get top 5 models using top() method
top_models = predictions.top(
n=5,
rank_metric="mse",
rank_partition="val",
display_partition="test",
aggregate_partitions=True
)
# Type: PredictionResultsList (extends list)
print(type(top_models)) # <class 'PredictionResultsList'>
print(len(top_models)) # 5
Saving to CSV
The save() method creates a structured CSV:
Line 1: dataset_name
Line 2: model_classname + model_id
Line 3: fold_id
Line 4: partition
Line 5: column headers (y_true_partition, y_pred_partition, ...)
Lines 6+: prediction data
Example:
top_models.save(
path="results",
filename="top_5_models.csv"
)
For aggregated results, the CSV has columns like:
y_true_train_fold0, y_pred_train_fold0
y_true_val_fold0, y_pred_val_fold0
y_true_test, y_pred_test
Common Workflows
Analyze Top Models
# Get top 10 models
top_10 = predictions.top(
n=10,
rank_metric="mse",
aggregate_partitions=True
)
# Save all to CSV
top_10.save(path="results/analysis")
# Print summaries
for i, model in enumerate(top_10, 1):
    print(f"\n{'='*80}")
    print(f"MODEL {i}: {model.model_name} (ID: {model.id})")
    print(f"{'='*80}")
    print(model.summary())
Export Best Model Details
# Get best model
best = predictions.top(n=1, rank_metric="rmse")[0]
# Print summary
print("BEST MODEL PERFORMANCE:")
print(best.summary())
# Save individual prediction
best.save_to_csv("results/best_model.csv")
# Access details
print(f"Model: {best.model_name}")
print(f"Dataset: {best.dataset_name}")
print(f"Fold: {best.fold_id}")
print(f"Score: {best.get('rank_score')}")
Compare Multiple Models
# Get top 5 models
top_5 = predictions.top(n=5, rank_metric="r2", ascending=False)
# Save all predictions to single file
top_5.save(filename="top_5_comparison.csv")
# Compare metrics
for model in top_5:
    scores = model.eval_score(metrics=["rmse", "mae", "r2"])
    print(f"{model.model_name}: {scores}")
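For orientation, the three metrics requested above can be written out by hand. The sketch below is not the library's eval_score() implementation: the function name regression_scores is hypothetical, and the formulas are simply the standard definitions of RMSE, MAE, and R².

```python
import math
from typing import Dict, Sequence


def regression_scores(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Hypothetical helper: standard RMSE, MAE and R² for paired values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    # R² is undefined when y_true is constant; report NaN in that case.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"rmse": math.sqrt(mse), "mae": mae, "r2": r2}


scores = regression_scores([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(scores)
```

Because RMSE is the square root of MSE, ranking by either metric yields the same order; R² is the one metric here where higher is better, which is why the example above passes ascending=False.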
Group By: Top N Per Group
The group_by parameter allows you to get top N results per group instead of N total.
This is useful when comparing models across multiple datasets or configurations.
# Get top 3 models PER DATASET (flat list, sorted by global rank)
top_per_dataset = predictions.top(
n=3,
rank_metric="rmse",
group_by="dataset_name"
)
# Each result includes 'group_key' for easy filtering
for pred in top_per_dataset:
    dataset = pred['group_key'][0]  # group_key is a tuple
    print(f"{dataset}: {pred.model_name} - RMSE: {pred.get('rmse', 0):.4f}")
# Filter results for a specific dataset
wheat_results = [r for r in top_per_dataset if r['group_key'] == ('wheat',)]
Grouped dict output with return_grouped=True:
# Get top 3 models per dataset as a dictionary
grouped = predictions.top(
n=3,
rank_metric="rmse",
group_by="dataset_name",
return_grouped=True
)
# Result: {('dataset1',): [...], ('dataset2',): [...]}
for group_key, results in grouped.items():
    print(f"\n{group_key[0]}: {len(results)} best models")
    for i, pred in enumerate(results, 1):
        print(f"  {i}. {pred.model_name}: RMSE={pred.get('rmse', 0):.4f}")
Multi-column grouping:
# Top 2 per (dataset, model_class) combination
per_combo = predictions.top(
n=2,
rank_metric="rmse",
group_by=["dataset_name", "model_classname"]
)
# Each result has group_key like ('wheat', 'PLSRegression')
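The "top N per group" semantics described in this section can be illustrated with plain dictionaries. This is a minimal sketch, not the library's implementation: top_n_per_group and the sample records are hypothetical, and only the tuple-shaped group keys mirror what the section describes.

```python
from collections import defaultdict
from typing import Any, Dict, List, Sequence, Tuple


def top_n_per_group(
    results: Sequence[Dict[str, Any]],
    n: int,
    rank_metric: str,
    group_by: Sequence[str],
    ascending: bool = True,
) -> Dict[Tuple[Any, ...], List[Dict[str, Any]]]:
    """Hypothetical illustration: keep the n best results within each group."""
    groups = defaultdict(list)
    for result in results:
        # The group key is always a tuple, even for a single column.
        key = tuple(result[column] for column in group_by)
        groups[key].append(result)
    return {
        key: sorted(members, key=lambda r: r[rank_metric], reverse=not ascending)[:n]
        for key, members in groups.items()
    }


# Made-up records for illustration only.
records = [
    {"dataset_name": "wheat", "model_name": "pls_a", "rmse": 0.21},
    {"dataset_name": "wheat", "model_name": "pls_b", "rmse": 0.18},
    {"dataset_name": "wheat", "model_name": "pls_c", "rmse": 0.30},
    {"dataset_name": "corn", "model_name": "pls_a", "rmse": 0.40},
]
grouped = top_n_per_group(records, n=2, rank_metric="rmse", group_by=["dataset_name"])

# Flattening the groups and sorting again gives a single globally ranked list.
flat = sorted((r for group in grouped.values() for r in group), key=lambda r: r["rmse"])
```

Note that with n=2 the flat list holds three entries here, not two: the limit applies inside each group, so a group with fewer than n members contributes everything it has.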
Complete Workflow Example
from nirs4all.data import Predictions
# Load existing predictions
predictions = Predictions.load(
dataset_name="my_dataset",
path="results"
)
# Get top 10 models ranked by MSE on validation set
top_models = predictions.top(
n=10,
rank_metric="mse",
rank_partition="val",
display_partition="test",
aggregate_partitions=True, # Include train/val/test data
ascending=True # Lower MSE is better
)
# Save all predictions to CSV
top_models.save(
path="results/analysis",
filename="top_10_models.csv"
)
# Print summary for best model
print("=" * 80)
print("BEST MODEL SUMMARY")
print("=" * 80)
print(top_models[0].summary())
# Access specific prediction by ID
best_id = top_models[0].id
best_prediction = top_models.get(best_id)
# Iterate through predictions
for i, prediction in enumerate(top_models, 1):
    print(f"\n{i}. {prediction.model_name} (ID: {prediction.id})")
    print(f"   Fold: {prediction.fold_id}")
    print(f"   Rank Score: {prediction.get('rank_score'):.4f}")
    # Save individual prediction
    prediction.save_to_csv(f"results/individual/model_{i}.csv")
API Reference
PredictionResultsList
class PredictionResultsList(list):
    def save(self, path: str = "results", filename: Optional[str] = None) -> None
    def get(self, prediction_id: str) -> Optional[PredictionResult]
Methods:
__init__(predictions=None): Initialize with an optional list of predictions
save(path="results", filename=None): Save all predictions to a structured CSV
get(prediction_id): Retrieve a prediction by ID (returns PredictionResult or None)
All standard list methods: append(), extend(), pop(), remove(), etc.
PredictionResult
class PredictionResult(dict):
    def summary(self) -> str
    def save_to_csv(self, path_or_file: str = "results", filename: Optional[str] = None) -> None
    def eval_score(self, metrics: Optional[List[str]] = None) -> Dict[str, Any]

    @property
    def id(self) -> str
    @property
    def dataset_name(self) -> str
    @property
    def model_name(self) -> str
    @property
    def model_classname(self) -> str
    @property
    def fold_id(self) -> str
    @property
    def config_name(self) -> str
    @property
    def step_idx(self) -> int
    @property
    def op_counter(self) -> int
Notes
Aggregated vs Non-Aggregated Results
Aggregated results (when aggregate_partitions=True):
Contains nested dictionaries for the train, val, and test partitions
Each partition has y_true, y_pred, and score fields
Summary shows metrics for all partitions
Non-aggregated results (single partition):
Contains y_true and y_pred at the root level
Summary shows metrics for that partition only
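Code that has to handle both shapes can branch on whether the partition name is present as a nested dictionary. The sketch below uses plain dicts as stand-ins; the helper partition_values is hypothetical, and the exact key names ("train", "val", "test", "y_true", "y_pred") are assumed from the description above rather than checked against the library.

```python
from typing import Any, Dict, List, Tuple


def partition_values(result: Dict[str, Any], partition: str) -> Tuple[List[float], List[float]]:
    """Hypothetical helper: return (y_true, y_pred) for either result shape."""
    nested = result.get(partition)
    if isinstance(nested, dict):
        # Aggregated shape: values live under the partition name.
        return list(nested["y_true"]), list(nested["y_pred"])
    # Non-aggregated shape: values live at the root level.
    return list(result["y_true"]), list(result["y_pred"])


# Stand-in data; the numbers are illustrative only.
aggregated = {
    "train": {"y_true": [0.5, 0.6], "y_pred": [0.52, 0.58]},
    "test": {"y_true": [0.55], "y_pred": [0.54]},
}
single = {"y_true": [0.55], "y_pred": [0.54]}

agg_true, agg_pred = partition_values(aggregated, "train")
one_true, one_pred = partition_values(single, "test")
```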
CSV File Structure
With aggregation:
dataset_name
model_classname_id
fold_id
partition
y_true_train_foldX,y_pred_train_foldX,y_true_val_foldX,y_pred_val_foldX,y_true_test,y_pred_test
0.5,0.52,0.6,0.58,0.55,0.54
...
Without aggregation:
dataset_name
model_classname_id
fold_id
partition
y_true,y_pred
0.5,0.52
...
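A file in this layout can be read back with the standard csv module. This is a minimal sketch under the assumption that the file holds exactly four metadata lines, one header row, and then numeric rows, as shown above; the function read_prediction_csv and the sample values ("wheat", "PLSRegression_abc123") are hypothetical, and files holding several predictions may be laid out differently.

```python
import csv
import io
from typing import Dict, List, Tuple

# Made-up sample in the non-aggregated layout described above.
SAMPLE = """wheat
PLSRegression_abc123
0
test
y_true,y_pred
0.5,0.52
0.6,0.58
"""


def read_prediction_csv(text: str) -> Tuple[Dict[str, str], Dict[str, List[float]]]:
    """Hypothetical reader: four metadata lines, a header row, then data rows."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if len(rows) < 5:
        raise ValueError("expected four metadata lines and a header row")
    meta_names = ["dataset_name", "model", "fold_id", "partition"]
    metadata = {name: rows[i][0] for i, name in enumerate(meta_names)}
    header = rows[4]
    columns: Dict[str, List[float]] = {name: [] for name in header}
    for row in rows[5:]:
        for name, cell in zip(header, row):
            if cell != "":  # tolerate blank cells in columns of unequal length
                columns[name].append(float(cell))
    return metadata, columns


metadata, columns = read_prediction_csv(SAMPLE)
```

The same reader copes with the aggregated layout, since it takes column names from the header row instead of hard-coding y_true and y_pred.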
Implementation Details
Type: PredictionResultsList extends Python's built-in list class
Compatibility: Fully compatible with all list operations and duck typing
Performance: get() uses linear search (O(n)), suitable for small result sets
Dependencies: Uses TabReportManager for summary generation
Return Type: top() returns PredictionResultsList instead of a plain list
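The list-subclass pattern and the linear-search lookup noted above can be sketched in a few lines. ResultList is a hypothetical stand-in, not the library's class, and it assumes each item is a dict carrying an "id" key.

```python
from typing import Any, Dict, Optional


class ResultList(list):
    """Hypothetical stand-in for a list subclass with an ID lookup."""

    def get(self, prediction_id: str) -> Optional[Dict[str, Any]]:
        # Linear scan: O(n) per lookup, fine for a handful of results.
        for item in self:
            if item.get("id") == prediction_id:
                return item
        return None


# Made-up items for illustration only.
results = ResultList([
    {"id": "abc123", "model_name": "pls_a"},
    {"id": "def456", "model_name": "pls_b"},
])
found = results.get("def456")
missing = results.get("zzz")

# For many repeated lookups, a one-off dict index avoids rescanning the list.
index = {item["id"]: item for item in results}
```

For the result sets that top() typically returns (a few to a few dozen entries), the scan is negligible; the dict index only pays off when the same large list is queried many times.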
Key Points
✅ Backward Compatible: All existing code continues to work
✅ List Compatible: Standard list operations work normally
✅ Flexible: Works with aggregated and non-aggregated results
✅ Type Safe: Properly typed with Union types
See Also
Writing a Pipeline in nirs4all - Pipeline syntax reference
Visualization - Visualization and charts