nirs4all.pipeline.storage.store_queries module

Reusable SQL query builders for WorkspaceStore.

This module provides parameterised SQL constants and helper functions for building dynamic WHERE clauses. All queries use positional parameters in the $1, $2 style, so values are bound by DuckDB at execution time rather than interpolated into the SQL text.

nirs4all.pipeline.storage.store_queries.build_filter_clause(filters: dict[str, object]) → tuple[str, list[object]]

Build a WHERE clause from a dictionary of column filters.

Parameters:

filters – Mapping of column name to value. None values are skipped. String values containing % are treated as LIKE patterns.

Returns:

A (clause, params) tuple where clause is a SQL fragment like "WHERE col1 = $1 AND col2 LIKE $2" and params is the positional parameter list. If no filters apply the clause is an empty string.
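
The behaviour described above can be illustrated with a minimal sketch. This is not the library's implementation: the function name sketch_build_filter_clause and the example column names are hypothetical, and the logic is inferred only from the documented rules (skip None, use LIKE when a string contains %, number the placeholders in order).

```python
def sketch_build_filter_clause(filters: dict[str, object]) -> tuple[str, list[object]]:
    """Hypothetical re-creation of the documented behaviour, for illustration only."""
    conditions: list[str] = []
    params: list[object] = []
    for column, value in filters.items():
        if value is None:
            # None values are skipped, so they consume no placeholder.
            continue
        params.append(value)
        # Strings containing % are treated as LIKE patterns; everything else is equality.
        operator = "LIKE" if isinstance(value, str) and "%" in value else "="
        # The placeholder index is the 1-based position of the value in params.
        conditions.append(f"{column} {operator} ${len(params)}")
    if not conditions:
        # No applicable filters: empty clause and no parameters.
        return "", []
    return "WHERE " + " AND ".join(conditions), params


clause, params = sketch_build_filter_clause(
    {"dataset_name": "corn", "model_class": "PLS%", "fold_id": None}
)
print(clause)  # WHERE dataset_name = $1 AND model_class LIKE $2
print(params)  # ['corn', 'PLS%']
```

Because only values are bound as parameters, a sketch like this still interpolates the column names into the SQL text, so the keys of filters would need to come from trusted code rather than user input.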

nirs4all.pipeline.storage.store_queries.build_prediction_query(*, dataset_name: str | None = None, model_class: str | None = None, partition: str | None = None, fold_id: str | None = None, branch_id: int | None = None, pipeline_id: str | None = None, run_id: str | None = None, limit: int | None = None, offset: int = 0) → tuple[str, list[object]]

Build a full SELECT query for the predictions table.

Supports joining through pipelines when filtering by run_id.

Returns:

(sql, params) ready for conn.execute(sql, params).

nirs4all.pipeline.storage.store_queries.build_top_predictions_query(*, n: int, metric: str = 'val_score', ascending: bool = True, partition: str = 'val', dataset_name: str | None = None, group_by: str | None = None) → tuple[str, list[object]]

Build a ranking query for top-N predictions.

When group_by is set, returns top n per group using a window function.

Parameters:
  • n – Number of top predictions to return.

  • metric – Column name to rank by. Must be a valid prediction column (validated against _PREDICTION_COLUMNS).

  • ascending – Sort direction. If True (the default), rows are ranked from the lowest metric value to the highest.

  • partition – Only consider this partition.

  • dataset_name – Optional dataset filter.

  • group_by – Optional grouping column. Must be a valid prediction column if provided.

Returns:

(sql, params) ready for conn.execute(sql, params).

Raises:

ValueError – If metric or group_by is not a valid column name.