Finetuning NVIDIA Nemotron 3 Nano in NeMo Automodel #976
adil-a started this conversation in Show and tell
NVIDIA Nemotron 3 Nano is a reasoning model built on a hybrid Mixture-of-Experts (MoE) architecture, with 3.5B active parameters and 30B parameters in total. The model is the best in its size class on benchmarks such as SWE-Bench and GPQA Diamond.
To get started quickly, we offer recipes for both full-model SFT and PEFT.
Data
We use the SQuAD Q/A dataset in this walkthrough. Below is an example sample:
{ "context": "In the past, the Malays used to call the Portuguese Serani from the Arabic Nasrani, but the term now refers to the modern Kristang creoles of Malaysia.", "question": "What term did the Malays use for the Portuguese Serani?", "answers": {"text": ["Nasrani"], "answer_start": [75]} }Run the Fine-Tune Script
Apply YAML-Based Configuration
NeMo Automodel uses a flexible configuration system that combines YAML configuration files with command-line overrides. This allows you to maintain base configurations while easily experimenting with different parameters.
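As an illustration of the idea only, a minimal base configuration could look like the sketch below. The section and field names (model, dataset, training, the model identifier, and the hyperparameter values) are assumptions made for this walkthrough, not the exact NeMo Automodel schema, so treat the recipe YAMLs shipped with the repository as the source of truth.

```yaml
# Hypothetical base config for SFT on SQuAD.
# All key names here are illustrative assumptions, not the exact NeMo Automodel schema.
model:
  pretrained_model_name_or_path: nvidia/Nemotron-3-Nano   # placeholder model ID

dataset:
  name: squad              # the SQuAD Q/A dataset used in this walkthrough
  split: train

training:
  learning_rate: 2.0e-5    # a common starting point for full SFT
  global_batch_size: 32
  max_steps: 1000
```

Individual values can then be overridden from the command line (for example, a dotted override along the lines of training.learning_rate=1e-4, assuming a dotted-key override style), so the base YAML stays fixed while you sweep parameters.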
PEFT
Configure Model Freezing
In this run, we add LoRA weights to all linear layers, but the configuration can be modified to target specific modules instead.
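As a rough sketch of how this could be expressed in the config, the peft block below attaches LoRA adapters to every linear layer, with a commented-out alternative that restricts them to specific projection modules. The key names (peft, match_all_linear, target_modules, dim, alpha) are assumptions made for illustration; the actual recipe YAML defines the real schema.

```yaml
# Hypothetical PEFT section (key names are illustrative assumptions).
peft:
  peft_scheme: lora
  match_all_linear: true          # add LoRA adapters to all linear layers
  # Alternatively, restrict the adapters to specific projections:
  # target_modules: ["q_proj", "k_proj", "v_proj", "o_proj"]
  dim: 16                         # LoRA rank
  alpha: 32
  dropout: 0.05
```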
Full SFT
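For the full-SFT run, the key difference from the PEFT recipe is that no adapter block is configured, so every parameter of the model is updated. Under the same assumed schema as above, the sketch simply drops the peft section:

```yaml
# Hypothetical full-SFT variant: no peft block, all parameters are trainable.
training:
  learning_rate: 2.0e-5    # full SFT typically uses a lower learning rate than LoRA
```

Because optimizer states and gradients are kept for every weight, expect the memory footprint and checkpoint size to be much larger than in the LoRA run.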
Happy finetuning!
Replies: 1 comment
The diff between Full and PEFT is drastic, do you guys have any more experiments? Thoughts?