GRPO training examples for Qwen2.5

Hi, thank you very much for your great work. I noticed that the current examples mainly focus on the Qwen3 series. Does the project plan to release examples for Qwen2.5 models? When I simply change the model from Qwen3 to Qwen2.5 in the current examples, the reward is always 0.

Best,