[LLM] Rewrite GSM8K reward function to follow standard GRPO conventions #6397
build-wheels-linux.yml
on: pull_request
generate-matrix / generate (5s)
Matrix: pytorch/rl
Artifacts
Produced during runtime
| Name | Size | Digest |
|---|---|---|
| pytorch_rl__3.10_cpu_x86_64 | 1.98 MB | sha256:2d8d40093027500f0c5e52f9547a75f8b511b7b18b714920dc0ec61feac94da8 |
| pytorch_rl__3.10_cu126_x86_64 | 1.98 MB | sha256:0b7a039f8c8d379789234255ed93a9b4843e290d246c4edad21078db8471c61f |
| pytorch_rl__3.10_cu128_x86_64 | 1.98 MB | sha256:3952ef543033b28eac4b79b9da36ee095b12ec6d9b48bc44abd65b3caaee00b7 |
| pytorch_rl__3.10_cu129_x86_64 | 1.98 MB | sha256:fdf73920924d6b3739f79a50f338dafd2e965093c8be687617c6c2b0e4bb22ca |
| pytorch_rl__3.10_cu130_x86_64 | 1.98 MB | sha256:fa4a6c2e0a89639cb0ceb1095ba859e5700d90bed7eac768ee9294e76702b7e9 |
| pytorch_rl__3.10_rocm7.1_x86_64 | 1.98 MB | sha256:13e6e4804c198bff30385606c684cf2de2a22bde3782c6328026b933bc40ac6c |
| pytorch_rl__3.10_rocm7.2_x86_64 | 1.98 MB | sha256:39600685b4f83fae4f5678f2938a65e6145fa2c51fa409a4754b2a9ec4977e09 |