Commit 5a3193e

Remove debug script
Removed sandbox/grpo_language/debug_reward.py and updated TROUBLESHOOTING.md to remove references to it.
1 parent 0ed798c commit 5a3193e

2 files changed: +1 -103 lines changed

sandbox/grpo_language/TROUBLESHOOTING.md

Lines changed: 1 addition & 14 deletions
@@ -107,20 +107,7 @@ Check these metrics in Weights & Biases:
 - `reward/evaluate_response/avg_MathReward_reward` - should stay reasonably high
 - `reward/evaluate_response/avg_ThinkingReward_reward` - should increase quickly
 
-### 5. Quick Debug Test
-
-Run the debug script to verify the reward function works:
-```bash
-python sandbox/grpo_language/debug_reward.py
-```
-
-Expected output:
-- Japanese text → reward 1.0
-- English text → reward 0.0
-- Multiple Japanese blocks → reward 0.5
-- No blocks but Japanese response → reward 0.2
-
-### 6. Alternative: Start with English, then transition
+### 5. Alternative: Start with English, then transition
 
 If Japanese isn't working, you could:
 

sandbox/grpo_language/debug_reward.py

Lines changed: 0 additions & 89 deletions
This file was deleted.
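
The contents of the deleted script are not shown in this commit, so the following is not a reconstruction of it. It is only a minimal sketch of a check that yields the four expected outputs listed in the removed troubleshooting section. Everything named here is hypothetical: the `<think>...</think>` block delimiter, the character-range heuristic for detecting Japanese (which would also match Chinese text), the `threshold` parameter, and the function names `is_japanese` and `language_reward`.

```python
import re

# Assumption: thinking blocks are delimited by <think>...</think>.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)
# Hiragana, katakana, and CJK unified ideographs: a crude proxy for Japanese.
JAPANESE_CHAR = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff]")


def is_japanese(text: str, threshold: float = 0.5) -> bool:
    """True if at least `threshold` of the non-space characters look Japanese."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return False
    hits = sum(1 for c in chars if JAPANESE_CHAR.match(c))
    return hits / len(chars) >= threshold


def language_reward(response: str) -> float:
    """Hypothetical reward matching the four cases in the removed section."""
    blocks = THINK_BLOCK.findall(response)
    if not blocks:
        # No thinking block: partial credit only if the response is Japanese.
        return 0.2 if is_japanese(response) else 0.0
    if not all(is_japanese(block) for block in blocks):
        return 0.0
    # Exactly one Japanese block earns full reward; several earn half.
    return 1.0 if len(blocks) == 1 else 0.5


if __name__ == "__main__":
    samples = {
        "one Japanese block": "<think>これは日本語の思考です。</think>答えは4です。",
        "English block": "<think>Let me think in English.</think>The answer is 4.",
        "two Japanese blocks": "<think>まず考えます</think><think>次に確認します</think>答えは4",
        "no block, Japanese": "答えは四です",
    }
    for name, text in samples.items():
        print(f"{name}: {language_reward(text)}")
```

A check like this can be run locally without the training stack, which may be enough to confirm that a reward function distinguishes the four cases before looking at the Weights & Biases metrics.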
