
Commit abb653e

Remove sandbox config and reference apps/grpo configs instead

Changes:
- Removed sandbox/grpo_language/qwen3_1_7b.yaml (use configs from apps/grpo/)
- Updated usage comment in main.py to reference apps/grpo/qwen3_1_7b.yaml
- Updated README.md to reference apps/grpo/ configs
- Fixed README.md to use <思考> tags instead of <think> tags
1 parent 4e87a4d commit abb653e

3 files changed, with 5 additions and 154 deletions

sandbox/grpo_language/README.md

Lines changed: 4 additions & 2 deletions
@@ -34,9 +34,11 @@ pip install langid
 ## Usage

 ```bash
-python -m sandbox.grpo_language.main --config sandbox/grpo_language/qwen3_1_7b.yaml
+python -m sandbox.grpo_language.main --config apps/grpo/qwen3_1_7b.yaml
 ```

+You can use any of the config files from `apps/grpo/` (e.g., `qwen3_1_7b.yaml`, `qwen3_8b.yaml`, `qwen3_32b.yaml`).
+
 ## How It Works

 1. The model receives a math problem and is instructed to use `<思考>` tags for reasoning
@@ -70,7 +72,7 @@ To use a different language:

 Over the course of training, the model should learn to:
 1. Solve math problems correctly
-2. Use `<think></think>` tags for its reasoning
+2. Use `<思考></思考>` tags for its reasoning
 3. Write its thinking in Japanese (or the configured target language)

 ## Metrics

sandbox/grpo_language/main.py

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 # This source code is licensed under the BSD-style license found in the
 # LICENSE file in the root directory of this source tree.

-# Usage: python -m sandbox.grpo_language.main --config sandbox/grpo_language/qwen3_1_7b.yaml
+# Usage: python -m sandbox.grpo_language.main --config apps/grpo/qwen3_1_7b.yaml

 import asyncio
 import time

sandbox/grpo_language/qwen3_1_7b.yaml

Lines changed: 0 additions & 151 deletions
This file was deleted.
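
For reviewers, the README fix from `<think>` to `<思考>` matters because any check keyed to one tag will not match the other. The sketch below illustrates this with a small tag extractor. It is an assumption for illustration only: the helper names `extract_thinking` and `uses_expected_tags` are hypothetical and are not taken from `sandbox/grpo_language/main.py`, whose actual reward logic is not shown in this commit.

```python
import re

# Hypothetical constants and helpers for illustration; not from the repository.
THINK_OPEN, THINK_CLOSE = "<思考>", "</思考>"

# Non-greedy match so that several blocks in one response are found separately.
_THINK_RE = re.compile(
    re.escape(THINK_OPEN) + r"(.*?)" + re.escape(THINK_CLOSE),
    re.DOTALL,
)


def extract_thinking(response: str) -> list[str]:
    """Return the stripped text of every <思考>...</思考> block in a response."""
    return [block.strip() for block in _THINK_RE.findall(response)]


def uses_expected_tags(response: str) -> bool:
    """True when the response contains exactly one <思考>...</思考> block."""
    return len(extract_thinking(response)) == 1
```

With these definitions, `uses_expected_tags("<think>2 + 2 = 4</think>")` is false, while a response that wraps its reasoning in `<思考>` and `</思考>` passes, which is the behaviour the corrected README describes.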
