nirs4all.pipeline.bundle.loader module

Bundle Loader - Load and predict from exported bundles.

This module provides the BundleLoader class for loading prediction bundles (.n4a format) and running predictions without needing the original workspace.

The loader supports:
  • Loading bundle metadata and structure

  • Extracting artifacts on-demand

  • Building an artifact provider for prediction

  • Resolving bundles for PredictionResolver

  • Full support for branching pipelines

  • Full support for meta-models (stacking)

  • CV ensemble prediction with fold weights

Example

>>> from nirs4all.pipeline.bundle import BundleLoader
>>>
>>> loader = BundleLoader("model.n4a")
>>> print(loader.metadata.pipeline_uid)
>>> y_pred = loader.predict(X_new)  # X_new: NumPy array of shape (n_samples, n_features)
>>>
>>> # Or via PredictionResolver (import it and define workspace_path beforehand)
>>> resolver = PredictionResolver(workspace_path)
>>> resolved = resolver.resolve("model.n4a")  # Automatically uses BundleLoader
class nirs4all.pipeline.bundle.loader.BundleArtifactProvider(bundle_path: str | Path, artifact_index: Dict[str, str], fold_weights: Dict[int, float] | None = None, step_info: Dict[int, Dict[str, Any]] | None = None, trace: ExecutionTrace | None = None)

Bases: ArtifactProvider

Artifact provider for bundles.

Provides artifacts from a loaded bundle, with lazy loading and caching. Supports branching pipelines and meta-models.

bundle_path

Path to the bundle file

artifact_cache

Cache of loaded artifacts

artifact_index

Index mapping step/fold to artifact filenames

fold_weights

Fold weights for CV ensemble

step_info

Information about each step (operator_type, branch_path, etc.)
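
The lazy loading and caching described above can be pictured with a small stand-alone sketch. It is not the library's implementation: `LazyArtifactCache`, `read_artifact`, and the filename `step_1.bin` are hypothetical names used only for illustration, and how the bundle actually stores and deserializes artifacts is not covered here.

```python
from typing import Any, Callable, Dict, Optional


class LazyArtifactCache:
    """Illustrative cache: each artifact is read once, on first access."""

    def __init__(self, index: Dict[str, str], read_artifact: Callable[[str], Any]):
        self.index = index                  # artifact key -> filename (assumed layout)
        self.read_artifact = read_artifact  # hypothetical deserializer callback
        self.cache: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        filename = self.index.get(key)
        if filename is None:
            return None                     # unknown key: nothing to load
        if key not in self.cache:
            # First access: read the artifact and remember it.
            self.cache[key] = self.read_artifact(filename)
        return self.cache[key]


calls = []


def fake_read(filename: str) -> dict:
    """Stand-in reader that records how often it is called."""
    calls.append(filename)
    return {"loaded_from": filename}


cache = LazyArtifactCache({"step_1": "step_1.bin"}, fake_read)
first = cache.get("step_1")
second = cache.get("step_1")   # served from the cache, no second read
missing = cache.get("step_9")  # not in the index
```

The point of the design is that nothing is read until a step asks for its artifact, and repeated requests for the same artifact return the already-loaded object.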

get_artifact(step_index: int, fold_id: int | None = None) -> Any | None

Get a single artifact for a step.

Parameters:
  • step_index – 1-based step index

  • fold_id – Optional fold ID for fold-specific artifacts

Returns:

Artifact object or None if not found

get_artifacts_for_step(step_index: int, branch_path: List[int] | None = None) -> List[Tuple[str, Any]]

Get all artifacts for a step.

Parameters:
  • step_index – 1-based step index

  • branch_path – Optional branch path filter

Returns:

List of (artifact_id, artifact_object) tuples

get_fold_artifacts(step_index: int, branch_path: List[int] | None = None) -> List[Tuple[int, Any]]

Get all fold-specific artifacts for a step.

Parameters:
  • step_index – 1-based step index

  • branch_path – Optional branch path filter

Returns:

List of (fold_id, artifact_object) tuples, sorted by fold_id

get_fold_weights() -> Dict[int, float]

Get fold weights for CV ensemble.

Returns:

Fold weights dictionary
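
As a rough illustration of how fold artifacts and fold weights can combine into a CV ensemble prediction, the sketch below computes a weighted average of per-fold predictions. It assumes the weights are normalized to sum to one before use; whether nirs4all normalizes them this way is not stated here, and `weighted_fold_average` is a hypothetical helper, not part of the library.

```python
from typing import Dict, List


def weighted_fold_average(fold_preds: Dict[int, List[float]],
                          fold_weights: Dict[int, float]) -> List[float]:
    """Combine per-fold predictions using normalized fold weights."""
    total = sum(fold_weights[fold_id] for fold_id in fold_preds)
    if total <= 0:
        raise ValueError("fold weights must sum to a positive value")
    n_samples = len(next(iter(fold_preds.values())))
    combined = [0.0] * n_samples
    for fold_id in sorted(fold_preds):
        weight = fold_weights[fold_id] / total  # normalization is an assumption
        for i, pred in enumerate(fold_preds[fold_id]):
            combined[i] += weight * pred
    return combined


preds = {0: [1.0, 2.0], 1: [3.0, 4.0]}  # fold_id -> predictions for two samples
weights = {0: 1.0, 1: 3.0}              # fold 1 counts three times as much
ensemble = weighted_fold_average(preds, weights)
```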

get_meta_model_sources(step_index: int) -> List[Tuple[int, str]]

Get source model info for a meta-model step.

Parameters:

step_index – Step index of the meta-model

Returns:

List of (source_step_index, artifact_id) tuples
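
To show why a meta-model step needs to know its source models, here is a generic stacking sketch: each source model's predictions become one feature column for the meta-model. The functions `stacked_predict`, `model_a`, `model_b`, and `meta` are hypothetical stand-ins, and the sketch makes no claim about how the library assembles meta-features internally.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]
Model = Callable[[List[Row]], List[float]]


def stacked_predict(X: List[Row], source_models: List[Model],
                    meta_model: Callable[[List[List[float]]], List[float]]) -> List[float]:
    """Feed source-model predictions to a meta-model as features."""
    # One column of meta-features per source model.
    columns = [model(X) for model in source_models]
    # Transpose so each row holds all source predictions for one sample.
    meta_X = [list(row) for row in zip(*columns)]
    return meta_model(meta_X)


def model_a(X: List[Row]) -> List[float]:
    return [sum(row) for row in X]


def model_b(X: List[Row]) -> List[float]:
    return [max(row) for row in X]


def meta(meta_X: List[List[float]]) -> List[float]:
    # Toy meta-model: plain average of the two source predictions.
    return [0.5 * a + 0.5 * b for a, b in meta_X]


stacked = stacked_predict([[1.0, 2.0], [3.0, 5.0]], [model_a, model_b], meta)
```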

get_step_operator_type(step_index: int) -> str | None

Get the operator type for a step.

Parameters:

step_index – 1-based step index

Returns:

Operator type string or None

has_artifacts_for_step(step_index: int) -> bool

Check if artifacts exist for a step.

Parameters:

step_index – 1-based step index

Returns:

True if artifacts are available for this step

class nirs4all.pipeline.bundle.loader.BundleLoader(bundle_path: str | Path)

Bases: object

Load and use prediction bundles.

Provides functionality for loading .n4a bundles, extracting metadata, and running predictions.

bundle_path

Path to the bundle file

metadata

Bundle metadata

Type:

nirs4all.pipeline.bundle.loader.BundleMetadata | None

trace

Execution trace (if available)

Type:

nirs4all.pipeline.trace.execution_trace.ExecutionTrace | None

pipeline_config

Pipeline configuration

Type:

Dict[str, Any]

fold_weights

Fold weights for CV ensemble

Type:

Dict[int, float]

artifact_provider

Provider for artifacts

Type:

nirs4all.pipeline.bundle.loader.BundleArtifactProvider | None

Example

>>> loader = BundleLoader("model.n4a")
>>> print(f"Pipeline: {loader.metadata.pipeline_uid}")
>>> print(f"Preprocessing: {loader.metadata.preprocessing_chain}")
>>> y_pred = loader.predict(X_new)
artifact_provider: BundleArtifactProvider | None
fold_weights: Dict[int, float]
get_chain_for_artifact(artifact_key: str) -> OperatorChain | None

Get the operator chain for an artifact from the bundle.

Parameters:

artifact_key – Artifact key (e.g., "step_1", "step_4_fold0")

Returns:

OperatorChain for the artifact or None if not found

get_merged_chains(import_context_chain: OperatorChain, step_offset: int = 0) -> Dict[str, OperatorChain]

Get all artifact chains merged with an import context chain.

Used when importing a bundle into another pipeline. Each artifact’s chain is prefixed with the import context chain.

Parameters:
  • import_context_chain – Chain from the importing pipeline context

  • step_offset – Step offset to apply to bundle steps

Returns:

Dict mapping artifact keys to merged chains
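
The prefixing described above can be sketched with plain lists. This is only a conceptual model: the real OperatorChain type is not reproduced, a chain is represented here as a list of `(step_index, operator_name)` tuples, and the way `step_offset` shifts the bundle's step indices is an assumption made for illustration. `merge_chains` and the operator names are hypothetical.

```python
from typing import Dict, List, Tuple

# Stand-in for OperatorChain: (step_index, operator_name) pairs.
Chain = List[Tuple[int, str]]


def merge_chains(bundle_chains: Dict[str, Chain], context_chain: Chain,
                 step_offset: int = 0) -> Dict[str, Chain]:
    """Prefix every bundle chain with the importing pipeline's chain."""
    merged: Dict[str, Chain] = {}
    for key, chain in bundle_chains.items():
        # Assumed behavior: bundle step indices are shifted by step_offset.
        shifted = [(step + step_offset, op) for step, op in chain]
        merged[key] = list(context_chain) + shifted
    return merged


context = [(1, "ImportStep")]
bundle = {
    "step_1": [(1, "Scaler")],
    "step_2": [(1, "Scaler"), (2, "Model")],
}
merged = merge_chains(bundle, context, step_offset=1)
```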

get_partitioner_routing(step_index: int | None = None) -> Dict[str, Any] | None

Get partitioner routing info for a specific step or all steps.

Parameters:

step_index – Specific step index, or None for all

Returns:

Routing info dict or None

get_required_metadata_columns() -> List[str]

Get the metadata columns required for prediction routing.

Returns:

List of column names needed for routing, empty if no routing needed.

get_step_info() -> List[Dict[str, Any]]

Get information about steps in the bundle.

Returns:

List of step info dictionaries

has_partitioner_routing() -> bool

Check if the bundle has metadata partitioner routing info.

Returns:

True if the bundle contains partitioner routing configuration.

import_artifacts_to_registry(registry: ArtifactRegistry, import_context_chain: OperatorChain | None = None, step_offset: int = 0, new_pipeline_id: str | None = None) -> Dict[str, str]

Import bundle artifacts into an artifact registry.

Registers all artifacts from this bundle into the target registry, optionally merging with an import context chain for proper V3 tracking.

Parameters:
  • registry – Target ArtifactRegistry to import into

  • import_context_chain – Optional chain from import context to prefix

  • step_offset – Step offset for imported artifacts

  • new_pipeline_id – New pipeline ID for imported artifacts

Returns:

Dict mapping original artifact keys to new artifact IDs

metadata: BundleMetadata | None
pipeline_config: Dict[str, Any]
predict(X: ndarray, branch_path: List[int] | None = None) -> ndarray

Run prediction on input data.

Applies all preprocessing steps and model(s) from the bundle. Supports branching pipelines, meta-models (stacking), and CV ensembles.

Parameters:
  • X – Input features as numpy array

  • branch_path – Optional branch path filter for multi-branch pipelines

Returns:

Predictions as numpy array

predict_with_metadata(X: ndarray, metadata: Dict[str, ndarray], fallback_branch: int | None = None) -> ndarray

Run prediction with metadata-based sample routing.

For bundles with metadata partitioner branches, this method routes each sample to the appropriate branch based on its metadata value. Each sample is processed by the transformers and models from its matching branch.

Parameters:
  • X – Input features as numpy array (n_samples, n_features)

  • metadata – Dict mapping column names to value arrays. Must include the column used for partitioning during training.

  • fallback_branch – Optional branch ID to use for samples with unknown metadata values. If None, a ValueError is raised for such samples.

Returns:

Predictions as numpy array

Raises:

ValueError – If required metadata column is missing or samples have unknown values without fallback.

Example

>>> import numpy as np
>>> loader = BundleLoader("model.n4a")
>>> X_new = np.random.randn(100, 500)  # 100 samples, 500 features
>>> metadata = {"site": np.array(["A"]*50 + ["B"]*50)}
>>> y_pred = loader.predict_with_metadata(X_new, metadata)
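
The routing idea itself can be sketched without the library. The example below is a simplified, per-sample illustration: it maps each metadata value to a branch, falls back to `fallback_branch` for unknown values, and raises ValueError when no fallback is given. The names `route_and_predict`, `value_to_branch`, and `branch_models` are hypothetical, and an actual implementation would more likely group samples by branch and predict in batches.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence


def route_and_predict(X: List[Sequence[float]], values: Sequence[Any],
                      value_to_branch: Dict[Any, int],
                      branch_models: Dict[int, Callable[[Sequence[float]], float]],
                      fallback_branch: Optional[int] = None) -> List[float]:
    """Send each sample to the branch matching its metadata value."""
    predictions: List[float] = []
    for row, value in zip(X, values):
        branch = value_to_branch.get(value, fallback_branch)
        if branch is None:
            # Unknown value and no fallback: mirror the documented ValueError.
            raise ValueError(f"Unknown metadata value {value!r} and no fallback branch")
        predictions.append(branch_models[branch](row))
    return predictions


models = {0: lambda row: sum(row), 1: lambda row: -sum(row)}  # toy branch models
mapping = {"A": 0, "B": 1}                                    # value -> branch ID
X_demo = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y_demo = route_and_predict(X_demo, ["A", "B", "C"], mapping, models, fallback_branch=0)
```
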
to_resolved_prediction() -> Any

Convert bundle to ResolvedPrediction for use with Predictor.

Returns:

ResolvedPrediction instance

trace: ExecutionTrace | None
class nirs4all.pipeline.bundle.loader.BundleMetadata(bundle_format_version: str = '1.0', nirs4all_version: str = '', created_at: str = '', pipeline_uid: str = '', source_type: str = '', model_step_index: int | None = None, fold_strategy: str = 'weighted_average', preprocessing_chain: str = '', trace_id: str | None = None, original_manifest: Dict[str, Any] = <factory>, partitioner_routing: Dict[str, Any] = <factory>)

Bases: object

Metadata for a prediction bundle.

Contains information about the bundle format, source, and contents.

bundle_format_version

Version of the bundle format

Type:

str

nirs4all_version

Version of nirs4all that created the bundle

Type:

str

created_at

ISO timestamp of bundle creation

Type:

str

pipeline_uid

UID of the source pipeline

Type:

str

source_type

Type of source that was exported

Type:

str

model_step_index

Index of the model step

Type:

int | None

fold_strategy

Strategy for combining fold predictions

Type:

str

preprocessing_chain

Summary of preprocessing steps

Type:

str

trace_id

ID of the execution trace (if available)

Type:

str | None

original_manifest

Subset of original manifest metadata

Type:

Dict[str, Any]

partitioner_routing

Routing info for metadata partitioner branches

Type:

Dict[str, Any]

bundle_format_version: str = '1.0'
created_at: str = ''
fold_strategy: str = 'weighted_average'
classmethod from_dict(data: Dict[str, Any]) -> BundleMetadata

Create BundleMetadata from dictionary.

Parameters:

data – Dictionary from bundle manifest.json

Returns:

BundleMetadata instance
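
A common pattern for this kind of constructor is sketched below on a reduced, hypothetical `MiniMetadata` dataclass. It assumes that keys absent from the dictionary fall back to field defaults and that unknown keys are ignored; whether BundleMetadata.from_dict treats unknown keys this way is not stated on this page. The `field(default_factory=dict)` line is what a generated signature renders as `<factory>`.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, Optional


@dataclass
class MiniMetadata:
    """Reduced stand-in with a few of the documented fields."""
    bundle_format_version: str = "1.0"
    pipeline_uid: str = ""
    model_step_index: Optional[int] = None
    # Mutable defaults need a factory so instances do not share one dict.
    partitioner_routing: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "MiniMetadata":
        known = {f.name for f in fields(cls)}
        # Assumption: keys that are not dataclass fields are dropped.
        return cls(**{k: v for k, v in data.items() if k in known})


meta = MiniMetadata.from_dict({"pipeline_uid": "abc123", "extra": 1})
other = MiniMetadata.from_dict({})
```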

model_step_index: int | None = None
nirs4all_version: str = ''
original_manifest: Dict[str, Any]
partitioner_routing: Dict[str, Any]
pipeline_uid: str = ''
preprocessing_chain: str = ''
source_type: str = ''
trace_id: str | None = None
nirs4all.pipeline.bundle.loader.load_bundle(bundle_path: str | Path) -> BundleLoader

Convenience function to load a bundle.

Parameters:

bundle_path – Path to the .n4a bundle file

Returns:

BundleLoader instance