Commit 1c16ed5

Show every distinct GRPO task type in the first N rows (#238)

Previously the GRPO section used a round-robin interleave that split rows into a vLLM bucket and a non-vLLM bucket and walked each bucket in turn. That guaranteed uniqueness within each bucket, but not across the whole section: the top 10 rows were all vLLM (with GSM8K and DAPO repeating because the vLLM bucket only had four distinct types), while the first non-vLLM distinct type (Sudoku) did not show up until row 11. Duplicates were also grouped by bucket instead of ordered by popularity across the whole section.

Replace _interleave_by_task with _unique_types_first:

1. Sort every GRPO row by (popularity, count_ok, has_vllm) descending, so the ordering is popularity first, then count_ok, with vLLM as a minor tiebreaker between rows that tie on both.
2. Walk the sorted list once. The first time a task_type is seen, that row is the representative of that type; every subsequent row with the same task_type is a duplicate.
3. Emit all representatives first (already in popularity order from step 1), then all duplicates (also in popularity order).

Result for the current GRPO pool (11 distinct task types):

    Row 1       GSM8K Math + vLLM (Llama3.1 8B)
    Row 2       Sudoku (NeMo Gym Sudoku)
    Row 3       Multi Environment (NeMo Gym Multi Environment)
    Row 4       2048 Game (gpt oss BF16 20B)
    Row 5       Minesweeper Game (gpt oss 20B)
    Row 6       Auto Kernel Creation (gpt oss 20B)
    Row 7       DAPO Math + vLLM (Qwen3 8B FP8)
    Row 8       ORPO (Llama3 8B)
    Row 9       Wordle + vLLM (Openenv wordle)
    Row 10      Vision Math + vLLM (Qwen2.5 VL 7B)
    Row 11      DPO (Zephyr 7B)
    Rows 12-29  duplicates in popularity order

Rows 1-11 now cover every distinct task type in the section, which was the stated goal. Non-GRPO sections are untouched.
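
The three steps above can be sketched roughly as follows. This is an illustrative reconstruction from the commit message, not the code in the diff: the function name `unique_types_first` (without the leading underscore) and the dict-based row shape are assumptions, and only the keys `popularity`, `count_ok`, `has_vllm`, and `task_type` are taken from the description.

```python
from typing import Any

Row = dict[str, Any]  # assumed row shape; the real code may use a different type


def unique_types_first(rows: list[Row]) -> list[Row]:
    """Order rows so each distinct task_type appears once before any repeats."""
    # Step 1: sort descending by (popularity, count_ok, has_vllm).
    ordered = sorted(
        rows,
        key=lambda r: (r["popularity"], r["count_ok"], r["has_vllm"]),
        reverse=True,
    )

    # Step 2: a single pass splits rows into first-seen representatives
    # and later duplicates of an already-seen task_type.
    seen: set[str] = set()
    representatives: list[Row] = []
    duplicates: list[Row] = []
    for row in ordered:
        if row["task_type"] in seen:
            duplicates.append(row)
        else:
            seen.add(row["task_type"])
            representatives.append(row)

    # Step 3: representatives first, then duplicates; both lists keep
    # the sorted order from step 1.
    return representatives + duplicates
```

Because both output lists are filled while walking the already-sorted input, no second sort is needed to keep the duplicates in popularity order.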
1 parent 15899e0 commit 1c16ed5

2 files changed: +72 -65 lines changed