8 repositories

TTRL
Public
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
rl reasoning llm
Python • MIT License • 81 • 1.1k • 17 • 0 • Updated Apr 15, 2026
P1-VL
Public
P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads
2 • 15 • 0 • 0 • Updated Feb 11, 2026
RL-Compositionality
Public
FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones
Python • Apache License 2.0 • 6 • 66 • 2 • 0 • Updated Jan 26, 2026
SimpleVLA-RL
Public
[ICLR 2026] SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning
rl vla reasoning
Python • MIT License • 105 • 1.6k • 46 • 1 • Updated Jan 6, 2026
P1
Public
P1: Mastering Physics Olympiads with Reinforcement Learning
4 • 84 • 3 • 0 • Updated Dec 29, 2025
Entropy-Mechanism-of-RL
Public
The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.
rl reasoning llm
Python • 15 • 435 • 2 • 0 • Updated Jul 11, 2025
PRIME
Public
Scalable RL solution for advanced reasoning of language models
rl reasoning llm
Python • Apache License 2.0 • 111 • 1.9k • 8 • 2 • Updated Mar 18, 2025
ImplicitPRM
Public
Repo of paper "Free Process Rewards without Process Labels"
rl prm test-time-scaling
Python • Apache License 2.0 • 11 • 171 • 12 • 0 • Updated Mar 14, 2025
