Shenzhi Wang (Shenzhi-Wang)

PhD Candidate @ Tsinghua University

hiyouga/EasyR1 Public

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python, 4.7k stars, 364 forks

LeapLabTHU/cooragent Public

Official Repository of Cooragent

Python, 2.2k stars, 158 forks

Llama3-Chinese-Chat Public

The first chat model fine-tuned specifically for Chinese through ORPO, based on the Meta-Llama-3-8B-Instruct model.

319 stars, 21 forks

Beyond-the-80-20-Rule-RLVR Public

The open-source code for the NeurIPS 2025 paper, "Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning."

Python, 49 stars, 2 forks

LeapLabTHU/FamO2O Public

Repository of "Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online Reinforcement Learning" (NeurIPS 2023 Spotlight)

Python, 40 stars, 2 forks

recon Public

The official source code for "Boosting LLM Agents with Recursive Contemplation for Effective Deception Handling" (ACL 2024, Findings)

Python, 14 stars