simonucl

Pinned

  1. CHATS-lab/verbalized-sampling (Public)

    Verbalized Sampling, a training-free prompting strategy to mitigate mode collapse in LLMs by requesting responses with probabilities. Achieves 2-3x diversity improvement while maintaining quality. …

    Python, 647 stars, 78 forks

  2. spiral-rl/spiral (Public)

    SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

    Python, 173 stars, 20 forks

  3. ChenmienTan/RL2 (Public)

    Python, 1k stars, 103 forks

  4. LeonGuertler/TextArena (Public)

    A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning

    Python, 334 stars, 77 forks

  5. axon-rl/gem (Public)

    A Gym for Agentic LLMs

    Python, 415 stars, 27 forks

  6. Cohere-Labs-Community/iterative-data-selection (Public)

    Python, 30 stars, 6 forks