allow passing eval_ds parameters through tunix cli, plus bugfixes #1238

Open

andytwigg wants to merge 11 commits into main from atwigg/add_eval_ds_to_cli
Conversation

@andytwigg
Collaborator

@andytwigg andytwigg commented Mar 12, 2026

  • allow passing eval_ds parameters through the CLI (for running an eval dataset alongside GRPO training)

plus bugfixes:

  • use correct qwen3 lora regex in examples/rl/grpo/gsm8k/run_qwen3.sh
  • add model_download_path to the rl/examples scripts to avoid downloading different models to the same path
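
The LoRA regex fix targets the four Qwen3 attention projections. A minimal sketch of how such a module-path pattern selects layers; the pattern below is illustrative, and the exact regex in run_qwen3.sh may differ:

```python
import re

# Hypothetical module-path regex targeting the Qwen3 attention projections
# (q_proj, k_proj, v_proj, o_proj); the exact pattern in the script may differ.
LORA_MODULE_PATH_RE = re.compile(r".*(q_proj|k_proj|v_proj|o_proj).*")

paths = [
    "model.layers.0.self_attn.q_proj",  # attention projection: should match
    "model.layers.0.mlp.gate_proj",     # MLP projection: should not match
]
matched = [p for p in paths if LORA_MODULE_PATH_RE.fullmatch(p)]
print(matched)
```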

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request makes the tunix CLI more flexible by allowing evaluation datasets to be configured directly through command-line parameters, so evaluation can run alongside training. It also fixes bugs in model-specific configuration and model download paths, ensuring correct model behavior and preventing file conflicts during download.

Highlights

  • Evaluation Dataset Parameterization: Introduced new CLI parameters and configuration options to allow specifying evaluation datasets independently, enabling concurrent evaluation during training.
  • Qwen3 LoRA Module Path Fix: Corrected the regular expression used for Qwen3 LoRA module paths in example scripts to target the correct projection layers.
  • HuggingFace Model Download Path Improvement: Modified the model download logic to append the model name to the download path, preventing different HuggingFace models from overwriting each other in the same directory.
  • Dataset Loading Refactoring: Refactored dataset loading and initialization into a dedicated helper function within grpo_main.py for better code organization and reusability.


Changelog
  • examples/rl/grpo/gsm8k/run_qwen3.sh
    • Added new shell variables for num_test_batches, train_split, and eval_split with default values.
    • Updated the actor_model_config.lora_config.module_path regex to correctly target Qwen3 projection layers (q_proj, k_proj, v_proj, o_proj).
    • Modified the grpo_main command to pass the newly introduced evaluation dataset parameters.
  • examples/rl/grpo/gsm8k/run_qwen3_simplereward.sh
    • Adjusted default values for batch_size, num_batches, and introduced num_test_batches.
    • Added new shell variables for train_split and eval_split with default values.
    • Updated the actor_model_config.lora_config.module_path regex to correctly target Qwen3 projection layers.
    • Modified the grpo_main command to pass the newly introduced evaluation dataset parameters.
    • Increased the rl_training_config.eval_every_n_steps from 10 to 100.
  • tunix/cli/base_config.yaml
    • Added a train_split configuration parameter with a default of 'train'.
    • Introduced a new section for eval_dataset configuration, including parameters like eval_data_source, eval_dataset_name, eval_num_batches, and eval_split.
  • tunix/cli/grpo_main.py
    • Implemented a new private helper method _load_and_init_dataset to centralize dataset loading and initialization logic.
    • Refactored the run_grpo_trainer method to utilize the new _load_and_init_dataset for both training and optional evaluation datasets.
    • Modified the grpo_trainer.train call to accept an optional eval_ds argument.
  • tunix/models/automodel.py
    • Modified the download_model function to append the model's name to the model_download_path for Kaggle and HuggingFace sources, preventing directory conflicts.
  • tunix/rl/reward_manager.py
    • Commented out a call to _log_one_example() within the _get_rewards function.
  • tunix/rl/rl_cluster.py
    • Changed the logging level for skipping string metrics in _log_metrics from logging.info to logging.debug.
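
The dataset-loading refactor in the changelog can be sketched as follows. The helper name `_load_and_init_dataset` comes from the changelog, but the signature and config keys below are illustrative assumptions, not the actual tunix API:

```python
def _load_and_init_dataset(data_source, dataset_name, split, num_batches):
    """Load and initialize one dataset; return None when no dataset is
    configured (e.g. no eval dataset was requested). In the real code this
    would fetch and tokenize the data; here it returns a descriptive record."""
    if dataset_name is None:
        return None
    return {"source": data_source, "name": dataset_name,
            "split": split, "num_batches": num_batches}

def run_grpo_trainer(config):
    # The training dataset is always loaded; the eval dataset only when
    # eval parameters are present in the config.
    train_ds = _load_and_init_dataset(
        config["data_source"], config["dataset_name"],
        config.get("train_split", "train"), config["num_batches"])
    eval_ds = _load_and_init_dataset(
        config.get("eval_data_source"), config.get("eval_dataset_name"),
        config.get("eval_split"), config.get("eval_num_batches"))
    return train_ds, eval_ds  # eval_ds is None when no eval dataset is set
```

With this shape, the trainer's optional eval_ds argument simply receives None when evaluation is disabled.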


@gemini-code-assist bot left a comment


Code Review

This pull request introduces the ability to pass evaluation dataset parameters through the Tunix CLI, which is a great enhancement for running evaluations alongside training. It also includes a couple of important bug fixes. My review focuses on a critical syntax error in one of the example scripts, a minor duplication issue, and some code cleanup. Overall, the changes are good, but the identified issues should be addressed.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Removed duplicate echo statements for Train Split and Eval Split.

@s-noghabi s-noghabi left a comment


Thanks for making the changes



3 participants