We are pleased to submit our recently open-sourced model, Nanbeige4-3B-Thinking-2511, for evaluation in the Chatbot Arena. The model scores 60 on the Arena-Hard V2 benchmark, a result attained through advanced distillation and reinforcement learning optimization.
Model Details:
Despite its small size (3B parameters), Nanbeige4-3B-Thinking-2511 exhibits competitive performance across a broad range of tasks, making it a compelling case study for efficient language modeling. We believe its inclusion in Chatbot Arena will provide valuable insights into the capabilities of parameter-efficient models in real-world, human-preference-driven evaluations.
We are happy to provide any additional support required for evaluation, such as API access.
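For convenience, below is a minimal sketch of how evaluators could load the open-sourced checkpoint with Hugging Face Transformers; the repository ID and the generation settings shown are our assumptions for illustration, not a confirmed serving setup.

```python
# Minimal sketch: loading the open-sourced checkpoint with Hugging Face Transformers.
# The repository ID and generation settings below are assumptions for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Nanbeige/Nanbeige4-3B-Thinking-2511"  # assumed Hugging Face repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Build a single-turn chat prompt and generate a response.
messages = [{"role": "user", "content": "Explain why the sky is blue in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```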
Thank you for your consideration. We look forward to your feedback and the opportunity to contribute to the community.