nirs4all.api.retrain module

Module-level retrain() function for nirs4all.

This module provides a simple interface for retraining nirs4all pipelines on new data. It wraps PipelineRunner.retrain() with ergonomic defaults.

Example

>>> import nirs4all
>>> # Full retrain on new data
>>> result = nirs4all.retrain(
...     source="exports/model.n4a",
...     data=new_data,
...     mode="full"
... )
>>> print(f"New RMSE: {result.best_rmse:.4f}")
nirs4all.api.retrain.retrain(source: Dict[str, Any] | str | Path, data: str | Path | ndarray | Tuple[ndarray, ...] | Dict[str, Any] | SpectroDataset | DatasetConfigs, *, mode: str = 'full', name: str = 'retrain_dataset', new_model: Any | None = None, epochs: int | None = None, session: Session | None = None, verbose: int = 1, save_artifacts: bool = True, **kwargs: Any) -> RunResult

Retrain a pipeline on new data.

This function retrains a previously trained pipeline in one of three modes: full retraining, transfer learning, or fine-tuning.

Parameters:
  • source – Pipeline source to retrain from. Can be:
      - Prediction dict from result.best or result.top()
      - Path to exported bundle: "exports/model.n4a"
      - Path to pipeline config directory

  • data – New dataset to train on. Can be:
      - Path to data folder: "new_data/"
      - Numpy arrays: (X, y)
      - Dict: {"X": X, "y": y}
      - SpectroDataset instance

  • mode – Retrain mode. Default: "full". Options:
      - "full": Train everything from scratch (same pipeline structure)
      - "transfer": Use existing preprocessing, train new model
      - "finetune": Continue training existing model

  • name – Name for the retrain dataset (for logging). Default: "retrain_dataset"

  • new_model – Optional new model for transfer mode. Replaces the original model while keeping preprocessing.

  • epochs – Optional number of epochs for fine-tuning neural networks.

  • session – Optional Session for resource reuse. If provided, uses the session’s runner.

  • verbose – Verbosity level (0=quiet, 1=info, 2=debug). Default: 1

  • save_artifacts – Whether to save retrained artifacts. Default: True

  • **kwargs – Additional retraining parameters:
      - learning_rate: Learning rate for fine-tuning
      - freeze_layers: List of layers to freeze during fine-tuning
      - step_modes: Per-step mode overrides (advanced)
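
The parameter rules above can be checked before a call is made. The sketch below is a hypothetical helper, build_retrain_kwargs, that is not part of nirs4all; it only assembles keyword arguments that could be passed to nirs4all.retrain(). The mode names come from the list above. Rejecting new_model outside transfer mode and epochs outside finetune mode is an assumption of this sketch, not documented library behavior.

```python
from typing import Any, Dict, Optional

# Mode names as listed in the parameter description above.
VALID_MODES = ("full", "transfer", "finetune")


def build_retrain_kwargs(
    mode: str = "full",
    new_model: Optional[Any] = None,
    epochs: Optional[int] = None,
    **extra: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for nirs4all.retrain()."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    # Assumption of this sketch: new_model is only meaningful in transfer mode.
    if new_model is not None and mode != "transfer":
        raise ValueError("new_model is only used with mode='transfer'")
    # Assumption of this sketch: epochs is only meaningful in finetune mode.
    if epochs is not None and mode != "finetune":
        raise ValueError("epochs is only used with mode='finetune'")

    # Extra entries (for example learning_rate) are passed through unchanged.
    kwargs: Dict[str, Any] = {"mode": mode, **extra}
    if new_model is not None:
        kwargs["new_model"] = new_model
    if epochs is not None:
        kwargs["epochs"] = epochs
    return kwargs
```

The returned dict can be unpacked into the call, for example nirs4all.retrain(source="exports/nn_model.n4a", data=new_data, **build_retrain_kwargs("finetune", epochs=10, learning_rate=0.0001)).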

Returns:

RunResult containing:

  • predictions: Predictions from the retrained pipeline

  • per_dataset: Per-dataset execution details

  • best: Best prediction entry

  • best_score: Best model’s primary test score

Return type:

RunResult
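
The fields listed above can be read as plain attributes, as the examples in this module do with result.best. The sketch below uses a SimpleNamespace stand-in rather than a real RunResult so that it runs without nirs4all installed. The helper name summarize_result and all stand-in values (including the "model_name" key) are hypothetical placeholders, not actual library output.

```python
from types import SimpleNamespace
from typing import Any, Dict

# Field names taken from the Returns description above.
RESULT_FIELDS = ("predictions", "per_dataset", "best", "best_score")


def summarize_result(result: Any) -> Dict[str, Any]:
    """Hypothetical helper: collect the documented result fields in one dict.

    Fields missing from the object are reported as None.
    """
    return {name: getattr(result, name, None) for name in RESULT_FIELDS}


# Stand-in object with made-up values, used only to exercise the helper.
fake_result = SimpleNamespace(
    predictions=[],
    per_dataset={},
    best={"model_name": "placeholder"},
    best_score=0.1234,
)

summary = summarize_result(fake_result)
print(f"Best score: {summary['best_score']:.4f}")
```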

Raises:

Examples

Full retrain on new data:

>>> import nirs4all
>>>
>>> # Original training
>>> original = nirs4all.run(pipeline, train_data)
>>>
>>> # Retrain on new data with same pipeline
>>> retrained = nirs4all.retrain(
...     source=original.best,
...     data=new_train_data,
...     mode="full"
... )
>>> print(f"Original: {original.best_rmse:.4f}")
>>> print(f"Retrained: {retrained.best_rmse:.4f}")
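
To compare the two scores printed above on a common scale, one option is the relative change in RMSE. The helper below is hypothetical and not part of nirs4all; it only takes two floats, such as original.best_rmse and retrained.best_rmse.

```python
def relative_rmse_change(original_rmse: float, retrained_rmse: float) -> float:
    """Return the relative change in RMSE.

    Negative values mean the retrained pipeline has a lower (better) RMSE.
    """
    if original_rmse <= 0:
        raise ValueError("original_rmse must be positive")
    return (retrained_rmse - original_rmse) / original_rmse


# Made-up scores for illustration only.
change = relative_rmse_change(0.5, 0.4)
print(f"Relative RMSE change: {change:+.1%}")
```

Whether a given change justifies replacing the deployed model depends on the application and on how comparable the two evaluation sets are.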

Transfer learning with new model:

>>> from sklearn.ensemble import RandomForestRegressor
>>>
>>> result = nirs4all.retrain(
...     source="exports/pls_model.n4a",
...     data=new_data,
...     mode="transfer",
...     new_model=RandomForestRegressor(n_estimators=100)
... )

Fine-tune a neural network:

>>> result = nirs4all.retrain(
...     source="exports/nn_model.n4a",
...     data=new_data,
...     mode="finetune",
...     epochs=10,
...     learning_rate=0.0001
... )

Retrain from an exported bundle:

>>> result = nirs4all.retrain(
...     source="exports/wheat_model.n4a",
...     data="new_wheat_data/",
...     mode="full",
...     verbose=2
... )
>>> result.export("exports/retrained_model.n4a")

See also