
Conversation

@yanqingliu35-stack (Contributor) commented Oct 27, 2025

This PR introduces native support for evaluating RWKV models with the lm_eval harness.

How to use:

python run_lm_eval.py /path/to/model

@yanqingliu35-stack yanqingliu35-stack requested a review from a team as a code owner October 27, 2025 02:52
@gemini-code-assist (bot) commented:

Summary of Changes

Hello @yanqingliu35-stack, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces native support for evaluating RWKV language models within a standardized framework. By adding a new script and a custom adapter, it enables seamless integration with the lm_eval harness, allowing for comprehensive benchmarking of RWKV models across various NLP tasks and ensuring that evaluation results are consistently recorded.

Highlights

  • New RWKV Evaluation Script: A new Python script, run_lm_eval.py, has been added to facilitate the evaluation of RWKV models.
  • LM Evaluation Harness Integration: The script integrates RWKV models with the lm_eval evaluation harness, providing a standardized framework for benchmarking.
  • Custom Adapter Implementation: An EvalHarnessAdapter class is implemented to bridge RWKV model functionalities, such as tokenization, log-likelihood calculation, and text generation, with the lm_eval API (a minimal sketch of this integration follows this list).
  • Result Persistence: Evaluation results are automatically saved to a JSON file, ensuring easy access and analysis of model performance across tasks.
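To make the integration above concrete, here is a minimal sketch of how an adapter of this kind typically plugs into the lm_eval harness. The class name, stub return values, task names, and output file below are illustrative assumptions, not the PR's actual code.

# Minimal illustrative sketch (not the PR's code): an adapter subclassing
# lm_eval's abstract LM class, then passed to lm_eval.simple_evaluate.
import json
import lm_eval
from lm_eval.api.model import LM

class RWKVAdapterSketch(LM):
    # A real adapter would wrap the RWKV model and tokenizer and compute these
    # quantities from forward passes; the stubs below only show the interface.
    def loglikelihood(self, requests):
        return [(0.0, False) for _ in requests]   # (log-prob, is_greedy) per request

    def loglikelihood_rolling(self, requests):
        return [0.0 for _ in requests]

    def generate_until(self, requests):
        return ["" for _ in requests]             # generated text per request

results = lm_eval.simple_evaluate(
    model=RWKVAdapterSketch(),
    tasks=["lambada_openai", "piqa"],             # example task names
    num_fewshot=0,
)

# Persist the per-task results to JSON (hypothetical file name).
with open("eval_results.json", "w") as f:
    json.dump(results["results"], f, indent=2, default=str)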

@gemini-code-assist (gemini-code-assist bot) left a comment


Code Review

This pull request adds a new script for evaluating RWKV models using lm-eval. The script provides a good starting point by wrapping the RWKV model in an adapter compatible with the evaluation harness. My review includes several suggestions to improve the script's correctness, robustness, and maintainability. The most critical issue is that the script, as written, will not run any evaluation tasks because the task list is empty. I've also pointed out a bug where a global model variable is used instead of the instance's self.model, and suggested improvements like using dependency injection and enhancing command-line argument parsing for better flexibility.

Comment on lines 37 to 40
eval_tasks = [
#'lambada_openai', 'piqa', 'storycloze_2016', 'hellaswag', 'winogrande',
#'arc_challenge', 'arc_easy', 'headqa_en', 'openbookqa', 'sciq',
#'mmlu','glue']

Severity: high

The eval_tasks list is currently empty because all task names are commented out. This will cause the script to run without performing any evaluations. To make the script functional, you should uncomment the tasks you intend to run.

Additionally, there's a typo on line 40: a full-width comma (，) is used instead of a standard comma (,), which would cause a SyntaxError if that line is uncommented. I've corrected it in the suggestion.

eval_tasks = [
    'lambada_openai', 'piqa', 'storycloze_2016', 'hellaswag', 'winogrande',
    # 'arc_challenge', 'arc_easy', 'headqa_en', 'openbookqa', 'sciq',
    # 'mmlu', 'glue',
]

all_tokens = []
state = None

out, state = model.forward(context_tokens, state)

Severity: high

This line uses the global model variable. For consistency and to avoid potential bugs, it should use self.model like in other methods of this class.

            out, state = self.model.forward(context_tokens, state)

break

all_tokens.append(token)
out, state = model.forward([token], state)

Severity: high

This line uses the global model variable. For consistency and to avoid potential bugs, it should use self.model like in other methods of this class.

                out, state = self.model.forward([token], state)

########################################################################################################
# pip install rwkv lm_eval --upgrade

import os, sys, types, json, math, time

Severity: medium

According to the PEP 8 style guide, it's recommended to have one import per line. This improves code readability and maintainability.

import os
import sys
import types
import json
import math
import time

Comment on lines 27 to 41
if len(sys.argv) < 2:
print("Usage: python your_script_name.py /path/to/your/model.pth")
sys.exit(1)

MODEL_NAME = sys.argv[1]

print(f'Loading model - {MODEL_NAME}')
model = RWKV(model=MODEL_NAME, strategy='cuda fp16')
pipeline = PIPELINE(model, "rwkv_vocab_v20230424")

eval_tasks = [
#'lambada_openai', 'piqa', 'storycloze_2016', 'hellaswag', 'winogrande',
#'arc_challenge', 'arc_easy', 'headqa_en', 'openbookqa', 'sciq',
#'mmlu','glue']
num_fewshot = 0

Severity: medium

The script currently uses basic command-line argument parsing for the model path only, and hardcodes many important parameters like strategy, vocab, eval_tasks, and num_fewshot. This makes the script less flexible. Consider using Python's argparse module to allow users to configure these settings from the command line. This would make the script more powerful and easier to use for different evaluation scenarios.
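
As a rough sketch of that suggestion, an argparse-based entry point could look like the following; the flag names and defaults are illustrative assumptions, not prescriptions.

import argparse

def parse_args():
    # Hypothetical CLI surface; flag names and defaults are only illustrative.
    parser = argparse.ArgumentParser(description="Evaluate an RWKV model with lm_eval.")
    parser.add_argument("model_path", help="Path to the RWKV checkpoint (.pth)")
    parser.add_argument("--strategy", default="cuda fp16", help="RWKV loading strategy")
    parser.add_argument("--vocab", default="rwkv_vocab_v20230424", help="Vocabulary name for PIPELINE")
    parser.add_argument("--tasks", nargs="+", default=["lambada_openai"], help="lm_eval task names")
    parser.add_argument("--num-fewshot", type=int, default=0, help="Number of few-shot examples")
    parser.add_argument("--output", default="eval_results.json", help="Where to write the results JSON")
    return parser.parse_args()

if __name__ == "__main__":
    args = parse_args()
    # The script would then build the model and pipeline from these arguments, e.g.:
    # model = RWKV(model=args.model_path, strategy=args.strategy)
    # pipeline = PIPELINE(model, args.vocab)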

Comment on lines +51 to +54
    def __init__(self):
        super().__init__()
        self.tokenizer = pipeline.tokenizer
        self.model = model

Severity: medium

The EvalHarnessAdapter's __init__ method relies on global variables pipeline and model. It's a better practice to pass these dependencies explicitly to the constructor (dependency injection). This improves modularity, reusability, and makes the class easier to test. After this change, you'll also need to update how the adapter is instantiated on line 160 to adapter = EvalHarnessAdapter(model, pipeline.tokenizer).

    def __init__(self, model, tokenizer):
        super().__init__()
        self.tokenizer = tokenizer
        self.model = model
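
A brief usage sketch of the injected form, using the variable names from the review comment above (illustrative, not the PR's exact code):

# Production path: inject the model and pipeline built earlier in the script.
adapter = EvalHarnessAdapter(model, pipeline.tokenizer)

# Test path (purely illustrative): the same class now accepts lightweight fakes,
# so it can be exercised without loading a full RWKV checkpoint.
# adapter = EvalHarnessAdapter(fake_model, fake_tokenizer)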
