An RL environment for training and evaluating LLMs on generating ASCII diagrams.
- environments/ascii_align/: environment, rubric, and tests
- rollout-dashboard/: RL Studio web UI for rollout inspection and reward visualization