verify and support Kimi-K2.5 model by liuyuhang-2025 · Pull Request #7612 · vllm-project/vllm-ascend

liuyuhang-2025 · 2026-03-24T13:36:54Z

What this PR does / why we need it?

This PR introduces the official E2E verification test configuration for the moonshotai/Kimi-K2.5 model (W4A8 quantized version) on the vLLM Ascend backend.

Changes proposed:

Test Cases: Added tests/e2e/models/configs/Kimi-K2.5-W4A8.yaml to automate the verification pipeline based on the existing deployment tutorial.

This is needed to ensure continuous integration (CI) and automated verification for the newly supported Kimi-K2.5 model on Ascend NPU environments (Atlas 800 A2/A3).
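As a rough illustration of what a config-driven accuracy check of this kind does, the sketch below compares measured metrics against expected values within a relative tolerance. The field names (`tasks`, `metrics`, `value`), the task name, the numbers, and the `check_accuracy` helper are all hypothetical; they are not taken from Kimi-K2.5-W4A8.yaml or from the vLLM Ascend test harness.

```python
import math

# Hypothetical expected-accuracy entry. The schema and numbers are
# illustrative only and do NOT come from Kimi-K2.5-W4A8.yaml.
EXPECTED = {
    "model_name": "moonshotai/Kimi-K2.5",
    "tasks": [
        {"name": "example_task", "metrics": [{"name": "accuracy", "value": 0.90}]},
    ],
}


def check_accuracy(expected, measured, rtol=0.05):
    """Return (task, metric, expected, measured) tuples that miss the tolerance.

    A metric that is absent from `measured` is reported with a measured
    value of None.
    """
    failures = []
    for task in expected["tasks"]:
        for metric in task["metrics"]:
            want = metric["value"]
            got = measured.get(task["name"], {}).get(metric["name"])
            if got is None or not math.isclose(got, want, rel_tol=rtol):
                failures.append((task["name"], metric["name"], want, got))
    return failures
```

An empty list means every expected metric was reproduced within tolerance; a CI job built on this idea would fail the run when the list is non-empty.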

Fixes [Feature]: Verify / Support moonshotai/Kimi-K2.5 #6683

Does this PR introduce any user-facing change?

No user-facing APIs or existing documentation were modified. This PR solely adds internal E2E testing configurations for CI validation.

How was this patch tested?

Automated Testing: CI validation via the newly added E2E test configuration Kimi-K2.5-W4A8.yaml. The test parameters were strictly aligned with the existing Kimi-K2.5.md documentation.
vLLM version: v0.18.0
vLLM main: vllm-project/vllm@ed359c4

gemini-code-assist · 2026-03-24T13:37:08Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request establishes automated end-to-end testing for the Kimi-K2.5 model, specifically its W4A8 quantized version, within the vLLM Ascend backend. By integrating a new test configuration, it ensures robust continuous integration and verification for this model on Ascend NPU hardware, enhancing reliability and maintainability without affecting user-facing APIs.

Highlights

New E2E Test Configuration: Introduced an official E2E verification test configuration for the moonshotai/Kimi-K2.5 model (W4A8 quantized version) on the vLLM Ascend backend.
Automated Verification: Added tests/e2e/models/configs/Kimi-K2.5-W4A8.yaml to automate the verification pipeline, ensuring continuous integration and automated verification for the Kimi-K2.5 model on Ascend NPU environments.


Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request adds an end-to-end test configuration for the Kimi-K2.5-W4A8 model. The configuration is consistent with the provided documentation and the changes appear correct. The pull request description is well-written and follows the repository's template. However, the pull request title does not conform to the repository's style guide, which requires a [Branch][Module][Action] prefix. This is a high-priority requirement. A compliant title would be, for example: [Test][Feature] Add E2E test configuration for Kimi-K2.5.
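The title convention mentioned in this review can be checked mechanically. The following is a minimal sketch, assuming a compliant title starts with two or more bracketed tags followed by a description; the review names a three-part [Branch][Module][Action] prefix, but its own example uses two tags, so the sketch accepts either. The pattern and the `has_required_prefix` helper are illustrative and are not the repository's actual check.

```python
import re

# Assumption: a compliant title begins with at least two bracketed tags,
# e.g. "[Test][Feature] ...", followed by whitespace and a description.
TITLE_PATTERN = re.compile(r"^(\[[^\[\]]+\]){2,}\s+\S")


def has_required_prefix(title: str) -> bool:
    """Return True if the title starts with two or more [Tag] groups."""
    return TITLE_PATTERN.match(title) is not None
```

Under this assumption the title suggested in the review passes, while the current title, "verify and support Kimi-K2.5 model", does not.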

github-actions · 2026-03-24T13:53:39Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

verify and support Kimi-K2.5 model

971f318

liuyuhang-2025 requested a review from wangxiyuan as a code owner March 24, 2026 13:36

gemini-code-assist bot reviewed Mar 24, 2026

View reviewed changes

liuyuhang-2025 mentioned this pull request Mar 24, 2026

[Feature]: Verify / Support moonshotai/Kimi-K2.5 #6683

Open

github-actions bot added the module:tests label Mar 24, 2026

liuyuhang-2025 wants to merge 1 commit into vllm-project:main from liuyuhang-2025:support-kimi-k2.5

2 participants
