allow passing eval_ds parameters through tunix cli, plus bugfixes #1238

Open

andytwigg wants to merge 11 commits into main from atwigg/add_eval_ds_to_cli
Conversation

@andytwigg
Collaborator

@andytwigg andytwigg commented Mar 12, 2026

  • allow passing eval_ds parameters through the CLI (for running an eval dataset alongside GRPO training)

plus bugfixes:

  • use correct qwen3 lora regex in examples/rl/grpo/gsm8k/run_qwen3.sh
  • add model_download_path to the rl/examples scripts to avoid downloading different models to the same path
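
The LoRA regex fix targets the four Qwen3 attention projections. A minimal sketch of how such a module-path pattern selects layers; the pattern below is illustrative, and the exact regex in run_qwen3.sh may differ:

```python
import re

# Hypothetical module-path regex targeting the Qwen3 attention projections
# (q_proj, k_proj, v_proj, o_proj); the exact pattern in the script may differ.
LORA_MODULE_PATH_RE = re.compile(r".*(q_proj|k_proj|v_proj|o_proj).*")

paths = [
    "model.layers.0.self_attn.q_proj",  # attention projection: should match
    "model.layers.0.mlp.gate_proj",     # MLP projection: should not match
]
matched = [p for p in paths if LORA_MODULE_PATH_RE.fullmatch(p)]
print(matched)
```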

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request makes the tunix CLI more flexible by allowing evaluation datasets to be configured directly through command-line parameters, so evaluation can run alongside training. It also fixes bugs in model-specific configuration and model download paths, ensuring correct model behavior and preventing file conflicts during download.

Highlights

  • Evaluation Dataset Parameterization: Introduced new CLI parameters and configuration options to allow specifying evaluation datasets independently, enabling concurrent evaluation during training.
  • Qwen3 LoRA Module Path Fix: Corrected the regular expression used for Qwen3 LoRA module paths in example scripts to target the correct projection layers.
  • HuggingFace Model Download Path Improvement: Modified the model download logic to append the model name to the download path, preventing different HuggingFace models from overwriting each other in the same directory.
  • Dataset Loading Refactoring: Refactored dataset loading and initialization into a dedicated helper function within grpo_main.py for better code organization and reusability.


Changelog
  • examples/rl/grpo/gsm8k/run_qwen3.sh
    • Added new shell variables for num_test_batches, train_split, and eval_split with default values.
    • Updated the actor_model_config.lora_config.module_path regex to correctly target Qwen3 projection layers (q_proj, k_proj, v_proj, o_proj).
    • Modified the grpo_main command to pass the newly introduced evaluation dataset parameters.
  • examples/rl/grpo/gsm8k/run_qwen3_simplereward.sh
    • Adjusted default values for batch_size, num_batches, and introduced num_test_batches.
    • Added new shell variables for train_split and eval_split with default values.
    • Updated the actor_model_config.lora_config.module_path regex to correctly target Qwen3 projection layers.
    • Modified the grpo_main command to pass the newly introduced evaluation dataset parameters.
    • Increased the rl_training_config.eval_every_n_steps from 10 to 100.
  • tunix/cli/base_config.yaml
    • Added a train_split configuration parameter with a default of 'train'.
    • Introduced a new section for eval_dataset configuration, including parameters like eval_data_source, eval_dataset_name, eval_num_batches, and eval_split.
  • tunix/cli/grpo_main.py
    • Implemented a new private helper method _load_and_init_dataset to centralize dataset loading and initialization logic.
    • Refactored the run_grpo_trainer method to utilize the new _load_and_init_dataset for both training and optional evaluation datasets.
    • Modified the grpo_trainer.train call to accept an optional eval_ds argument.
  • tunix/models/automodel.py
    • Modified the download_model function to append the model's name to the model_download_path for Kaggle and HuggingFace sources, preventing directory conflicts.
  • tunix/rl/reward_manager.py
    • Commented out a call to _log_one_example() within the _get_rewards function.
  • tunix/rl/rl_cluster.py
    • Changed the logging level for skipping string metrics in _log_metrics from logging.info to logging.debug.
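
The dataset-loading refactor in the changelog can be sketched as follows. The helper name `_load_and_init_dataset` comes from the changelog, but the signature and config keys below are illustrative assumptions, not the actual tunix API:

```python
def _load_and_init_dataset(data_source, dataset_name, split, num_batches):
    """Load and initialize one dataset; return None when no dataset is
    configured (e.g. no eval dataset was requested). In the real code this
    would fetch and tokenize the data; here it returns a descriptive record."""
    if dataset_name is None:
        return None
    return {"source": data_source, "name": dataset_name,
            "split": split, "num_batches": num_batches}

def run_grpo_trainer(config):
    # The training dataset is always loaded; the eval dataset only when
    # eval parameters are present in the config.
    train_ds = _load_and_init_dataset(
        config["data_source"], config["dataset_name"],
        config.get("train_split", "train"), config["num_batches"])
    eval_ds = _load_and_init_dataset(
        config.get("eval_data_source"), config.get("eval_dataset_name"),
        config.get("eval_split"), config.get("eval_num_batches"))
    return train_ds, eval_ds  # eval_ds is None when no eval dataset is set
```

With this shape, the trainer's optional eval_ds argument simply receives None when evaluation is disabled.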


@gemini-code-assist bot left a comment


Code Review

This pull request introduces the ability to pass evaluation dataset parameters through the Tunix CLI, which is a great enhancement for running evaluations alongside training. It also includes a couple of important bug fixes. My review focuses on a critical syntax error in one of the example scripts, a minor duplication issue, and some code cleanup. Overall, the changes are good, but the identified issues should be addressed.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Removed duplicate echo statements for Train Split and Eval Split.

@s-noghabi s-noghabi left a comment


Thanks for making the changes



3 participants