[LLM] Rewrite GSM8K reward function to follow standard GRPO conventions #6397

Triggered via pull request March 5, 2026 11:01
Status Success
Total duration 22m 56s
Artifacts 7

build-wheels-linux.yml

on: pull_request
generate-matrix / generate: 5s
Matrix: pytorch/rl

Artifacts

Produced during runtime
Name                             Size     Digest
pytorch_rl__3.10_cpu_x86_64      1.98 MB  sha256:2d8d40093027500f0c5e52f9547a75f8b511b7b18b714920dc0ec61feac94da8
pytorch_rl__3.10_cu126_x86_64    1.98 MB  sha256:0b7a039f8c8d379789234255ed93a9b4843e290d246c4edad21078db8471c61f
pytorch_rl__3.10_cu128_x86_64    1.98 MB  sha256:3952ef543033b28eac4b79b9da36ee095b12ec6d9b48bc44abd65b3caaee00b7
pytorch_rl__3.10_cu129_x86_64    1.98 MB  sha256:fdf73920924d6b3739f79a50f338dafd2e965093c8be687617c6c2b0e4bb22ca
pytorch_rl__3.10_cu130_x86_64    1.98 MB  sha256:fa4a6c2e0a89639cb0ceb1095ba859e5700d90bed7eac768ee9294e76702b7e9
pytorch_rl__3.10_rocm7.1_x86_64  1.98 MB  sha256:13e6e4804c198bff30385606c684cf2de2a22bde3782c6328026b933bc40ac6c
pytorch_rl__3.10_rocm7.2_x86_64  1.98 MB  sha256:39600685b4f83fae4f5678f2938a65e6145fa2c51fa409a4754b2a9ec4977e09