Actions: NVIDIA-NeMo/RL

Create PR to main with cherry-pick from release

658 workflow runs

fix: make sft dynamic batch step time check more stable (#1265)
Create PR to main with cherry-pick from release #508: Commit c68b4c2 pushed by terrykong
10s main
fix: remove noisy qwen2 vl nightly test loss check (#1272)
Create PR to main with cherry-pick from release #507: Commit 38125c2 pushed by terrykong
13s main
feat: add valid_tokens_per_sec metric and total_valid_tokens to save …
Create PR to main with cherry-pick from release #506: Commit d0e203c pushed by terrykong
12s main
docs: Update v0.3.0 announcement link (#1269)
Create PR to main with cherry-pick from release #505: Commit 9a909cc pushed by terrykong
9s main
docs: add missing async_grpo.enabled flag to configuration (#1237)
Create PR to main with cherry-pick from release #504: Commit f1bfeb6 pushed by terrykong
11s main
fix: moonlight CI test mem regression (increase cache flush) (#1257)
Create PR to main with cherry-pick from release #503: Commit 557b7ec pushed by terrykong
13s main
fix: fp8 rollout nightly fix check from step 100 to 40 (#1233)
Create PR to main with cherry-pick from release #502: Commit 376e625 pushed by terrykong
12s main
fix: gitignore only the top level datasets directory (#1252)
Create PR to main with cherry-pick from release #501: Commit d653437 pushed by terrykong
14s main
fix: Release gradient memory after policy training (#1147)
Create PR to main with cherry-pick from release #500: Commit 43928aa pushed by terrykong
21s main
fix: Fix gradient clipping of non-float32 params (#1158)
Create PR to main with cherry-pick from release #499: Commit 2df9ea5 pushed by terrykong
13s main
fix: lower steps in smolvlm nightly test (#1239)
Create PR to main with cherry-pick from release #498: Commit 0ad4722 pushed by terrykong
13s main
fix: fix checkpointing when val_period does not divide `save_period…
Create PR to main with cherry-pick from release #497: Commit d82ca75 pushed by terrykong
9s main
feat: Update Theoretical TFLOPS (#1236)
Create PR to main with cherry-pick from release #496: Commit f7645f3 pushed by terrykong
9s main
fix: Fix OOM in validation during colocated training (#1159)
Create PR to main with cherry-pick from release #495: Commit cc8a93e pushed by terrykong
10s main
feat: VLM support via megatron backend (#1115)
Create PR to main with cherry-pick from release #494: Commit b50bfca pushed by terrykong
10s main
docs: async doc update for importance sampling correction (#1222)
Create PR to main with cherry-pick from release #493: Commit 8003918 pushed by terrykong
12s main
feat: Adding perf metrics (#1183)
Create PR to main with cherry-pick from release #492: Commit bc1a027 pushed by parthchadha
11s main
fix: grpo-llama3.1-8b-instruct-1n8g-megatron-fp8-rollouts runs 40 ste…
Create PR to main with cherry-pick from release #491: Commit c2b36f2 pushed by terrykong
13s main
feat: add on policy distillation algorithm (#1006)
Create PR to main with cherry-pick from release #490: Commit 17ea9ab pushed by terrykong
10s main
fix: loosen sft-llama3.2-1b-1n8g-fsdp2tp1.v3.sh step time/loss check …
Create PR to main with cherry-pick from release #489: Commit b445a3a pushed by terrykong
13s main
chore: Update cherry-pick workflow to use v0.63.0 (#1218)
Create PR to main with cherry-pick from release #488: Commit 629a82b pushed by pablo-garay
12s main
fix: nightlies using v1 can't use model_save_format=safetensors (#1226)
Create PR to main with cherry-pick from release #487: Commit ebfa9e2 pushed by chtruong814
14s main
fix: dpo mistral nightly needs more time (#1225)
Create PR to main with cherry-pick from release #486: Commit 0dca729 pushed by chtruong814
21s main
fix: invalid time for fp8 grpo test 300 -> 240 minutes (#1220)
Create PR to main with cherry-pick from release #485: Commit 5166d74 pushed by terrykong
9s main
fix: Handle missing prompts in math HF data processor and add regress…
Create PR to main with cherry-pick from release #484: Commit 4528931 pushed by terrykong
9s main