
Fine-Tuning

Fine-tuning trains the final model using the prepared dataset and the best hyperparameters from a sweep.

CLI

Grouped CLI
bertnado full-train \
  --output-dir output/train \
  --model-name PoetschLab/GROVER \
  --dataset output/dataset \
  --best-config-path output/sweep/best_sweep_config.json \
  --task-type binary_classification \
  --project-name bertnado
Standalone command
bertnado-train \
  --output-dir output/train \
  --model-name PoetschLab/GROVER \
  --dataset output/dataset \
  --best-config-path output/sweep/best_sweep_config.json \
  --task-type binary_classification \
  --project-name bertnado

Python API

Train from Python
from pathlib import Path

from bertnado.api import train_model

train_model(
    output_dir=Path("output/train"),
    dataset=Path("output/dataset"),
    best_config_path=Path("output/sweep/best_sweep_config.json"),
    project_name="bertnado",
    task_type="binary_classification",
    model_name="PoetschLab/GROVER",
    warmup_ratio=0.03,
    lr_scheduler_type="cosine",
)

Inputs

Input               Description
--dataset           Path to a dataset created by bertnado-data.
--best-config-path  Path to best_sweep_config.json from bertnado-sweep.
--model-name        Hugging Face model name or local model path.
--task-type         One of regression, binary_classification, or multilabel_classification.
--project-name      W&B project used for training logs.

best_sweep_config.json should contain training hyperparameters such as learning_rate, per_device_train_batch_size, per_device_eval_batch_size, epochs, weight_decay, and logging_steps.

It can also contain extra Hugging Face TrainingArguments kwargs such as warmup_ratio, lr_scheduler_type, gradient_accumulation_steps, eval_steps, save_steps, max_grad_norm, fp16, or bf16; BertNado passes any valid extra training-argument keys through automatically. To supply a fixed group of such extras in one place, use a training_args object in the sweep config, as in the example below.
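
For illustration, a config that combines swept hyperparameters with a grouped training_args object might look like the following. The values here are hypothetical; the actual numbers come from your own sweep.

best_sweep_config.json with a training_args block
{
  "learning_rate": 3e-5,
  "per_device_train_batch_size": 16,
  "per_device_eval_batch_size": 32,
  "epochs": 4,
  "weight_decay": 0.01,
  "logging_steps": 50,
  "training_args": {
    "warmup_ratio": 0.03,
    "lr_scheduler_type": "cosine",
    "gradient_accumulation_steps": 2,
    "bf16": true
  }
}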

BertNado manages output_dir, logging_dir, report_to, load_best_model_at_end, metric_for_best_model, and greater_is_better itself.

Best Checkpoint Metric

If the config was produced by bertnado-sweep, the optimization metric is already saved in the config:

best_sweep_config.json excerpt
{
  "metric_for_best_model": "roc_auc",
  "greater_is_better": true,
  "optimization_metric": {
    "name": "eval/roc_auc",
    "goal": "maximize"
  }
}

BertNado passes those values to Hugging Face TrainingArguments, so load_best_model_at_end uses the same metric that selected the best sweep run.
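
In effect, this is roughly equivalent to constructing the TrainingArguments below. This is a simplified sketch of the behavior, not BertNado's actual internals:

Equivalent TrainingArguments (sketch)
from transformers import TrainingArguments

# Simplified sketch: BertNado wires the metric saved by the sweep into
# the checkpoint-selection arguments it manages itself.
args = TrainingArguments(
    output_dir="output/train",
    eval_strategy="steps",  # named evaluation_strategy in older transformers releases
    load_best_model_at_end=True,
    metric_for_best_model="roc_auc",  # from best_sweep_config.json
    greater_is_better=True,           # from best_sweep_config.json
)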

You can override the saved metric:

Override checkpoint metric
bertnado-train \
  --output-dir output/train \
  --model-name PoetschLab/GROVER \
  --dataset output/dataset \
  --best-config-path output/sweep/best_sweep_config.json \
  --task-type binary_classification \
  --project-name bertnado \
  --metric-name eval/roc_auc \
  --metric-goal maximize
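
The Python API presumably exposes the same override. The sketch below assumes train_model accepts metric_name and metric_goal keyword arguments mirroring the CLI flags; treat those names as unverified:

Override checkpoint metric from Python (assumed kwargs)
from pathlib import Path

from bertnado.api import train_model

# Assumption: metric_name / metric_goal mirror the CLI's
# --metric-name / --metric-goal flags.
train_model(
    output_dir=Path("output/train"),
    dataset=Path("output/dataset"),
    best_config_path=Path("output/sweep/best_sweep_config.json"),
    project_name="bertnado",
    task_type="binary_classification",
    model_name="PoetschLab/GROVER",
    metric_name="eval/roc_auc",
    metric_goal="maximize",
)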

What Happens

BertNado:

  1. Loads the prepared dataset from --dataset (see the inspection sketch after this list).
  2. Uses the train split for training.
  3. Uses the validation split for evaluation.
  4. Loads the requested Hugging Face sequence classification model.
  5. Applies the hyperparameters from best_sweep_config.json.
  6. Logs the run to W&B.
  7. Saves the final model under output/train/model.

For binary classification, BertNado can automatically read class_weights.json from the prepared dataset and use it to set a positive class weight.
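
As an illustration, the positive class weight is conventionally the ratio of negative to positive examples. The sketch below assumes a simple class_weights.json layout; BertNado's actual schema and internal handling may differ:

Positive class weight (sketch)
import json
from pathlib import Path

import torch

# Assumed layout: {"pos_weight": 4.2}; adjust to the real file contents.
weights = json.loads(Path("output/dataset/class_weights.json").read_text())

# A common convention: pos_weight = (# negatives) / (# positives),
# applied through BCEWithLogitsLoss for binary classification.
loss_fn = torch.nn.BCEWithLogitsLoss(
    pos_weight=torch.tensor([weights["pos_weight"]])
)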

Outputs

Training output
output/train/
|-- logs/
|-- checkpoint-*/
`-- model/

Use output/train/model as the model directory for prediction and feature extraction.
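
Assuming output/train/model is a standard Hugging Face save directory, it loads directly with transformers:

Load the trained model
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumes the tokenizer was saved alongside the model weights.
model = AutoModelForSequenceClassification.from_pretrained("output/train/model")
tokenizer = AutoTokenizer.from_pretrained("output/train/model")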