PhD Candidate @ Tsinghua University
- Tsinghua University
- Beijing, China
- https://shenzhi-wang.netlify.app/
- @ShenzhiWang_THU
- https://huggingface.co/shenzhi-wang
Pinned
- hiyouga/EasyR1 (Public): EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
- Llama3-Chinese-Chat (Public): This is the first Chinese chat model specifically fine-tuned for Chinese through ORPO based on the Meta-Llama-3-8B-Instruct model.
- Beyond-the-80-20-Rule-RLVR (Public): The open-source code for the NeurIPS 2025 paper, "Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning."
- LeapLabTHU/FamO2O (Public): Repository of "Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online Reinforcement Learning" (NeurIPS 2023 Spotlight)

