@zfan2356 zfan2356 commented Jan 7, 2026

Hi! The pre-training setup instructions in the on-policy distillation example README are currently somewhat confusing. I've refined and clarified them so that the preparation commands can be run sequentially and align with the run-qwen3-8B-opd.sh script.

@zfan2356 zfan2356 closed this Jan 8, 2026