How to reproduce the benchmark numbers in the RynnBrain technical report?

Thanks for sharing the great work! I followed the [step-by-step instructions](https://github.com/alibaba-damo-academy/RynnScale/tree/main/projects/rynn_brain#evaluation) and used the [Rynn Bench](https://huggingface.co/datasets/Alibaba-DAMO-Academy/RynnBrain-Bench) dataset with a few simple code changes. However, it looks like the benchmark numbers in Table 3 of the [technical report](https://arxiv.org/pdf/2602.14979v1) cannot be reproduced for open-source models such as Qwen3-VL-8B-Instruct and Cosmos-Reason2, especially for RynnBrain-Grounding, Area and Affordance. Are there any important details missing from the instructions, such as hyper-parameters and/or datasets?
