@l-bat l-bat commented Oct 22, 2025

No description provided.

Copilot AI left a comment

Pull Request Overview

This pull request introduces support for Large Reasoning Model (LRM) optimization by implementing advanced KV-cache token eviction strategies. The main enhancements include:

  • Implementation of RKV algorithm for token importance scoring during both prefill and decode stages
  • Addition of refined selection methods (KVCrush and DiverseKV) to improve token diversity and reduce redundancy
  • Support for Qwen3 and Phi3 model architectures
  • Mathematical reasoning benchmark infrastructure for MATH500 and GSM8K datasets
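
To make the idea of importance scoring combined with redundancy reduction concrete, here is a minimal sketch of budget-limited token selection. It is not the code from `token_eviction.py`: the function names, the `importance_weight` parameter, and the linear combination of importance and mean key similarity are all assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_tokens(importance, keys, budget, importance_weight=0.5):
    """Return the sorted indices of the tokens to keep in the cache.

    Hypothetical scoring rule (an assumption, not the PR's formula):
        score = w * importance - (1 - w) * mean similarity to the other keys
    """
    n = len(importance)
    scores = []
    for i in range(n):
        sims = [cosine(keys[i], keys[j]) for j in range(n) if j != i]
        redundancy = sum(sims) / len(sims) if sims else 0.0
        scores.append(
            importance_weight * importance[i]
            - (1.0 - importance_weight) * redundancy
        )
    # Stable sort: on equal scores the earlier token wins.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return sorted(order[:budget])


# Tokens 0 and 1 have identical keys; token 2 is distinct but less important.
importance = [0.5, 0.5, 0.4]
keys = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

print(select_tokens(importance, keys, budget=2))                         # [0, 2]
print(select_tokens(importance, keys, budget=2, importance_weight=1.0))  # [0, 1]
```

With importance alone, both duplicate tokens survive and the distinct one is evicted. Once redundancy is penalized, one duplicate is dropped in favor of the distinct token, which is the kind of diversity the refined selection methods aim for.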

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `modules/genai_optimizations/setup.py` | Adds dependencies for mathematical reasoning (latex2sympy2, word2number) |
| `modules/genai_optimizations/genai_opt/token_eviction.py` | Implements RKV algorithm, refined selection modes (KVCrush/DiverseKV), adaptive budget allocation, and key similarity computation |
| `modules/genai_optimizations/genai_opt/sparse_attention.py` | Adds forward functions for Qwen3 and Phi3 models, fixes dropout parameter naming |
| `modules/genai_optimizations/genai_opt/__init__.py` | Exports KVCacheRefinedSelection enum |
| `modules/genai_optimizations/benchmarks/utils.py` | Adds CLI arguments for refined selection algorithms |
| `modules/genai_optimizations/benchmarks/reasoning_parser.py` | New file implementing mathematical answer extraction and evaluation logic |
| `modules/genai_optimizations/benchmarks/math500_gsm_bench.py` | New benchmark script for MATH500 and GSM8K datasets |
| `modules/genai_optimizations/benchmarks/README.md` | Documents LRM optimization example |
| `modules/genai_optimizations/README.md` | Documents RKV mode, refined selection algorithms, and supported LRM models |

string = string.replace("infinity", "\\infty")
if "\\infty" not in string:
    string = string.replace("inf", "\\infty")
string = string.replace("+\\inity", "\\infty")

Copilot AI Oct 22, 2025

Corrected spelling of '\inity' to '\infty'. The code appears to be attempting to fix typos but includes a typo itself.

Suggested change
- string = string.replace("+\\inity", "\\infty")
+ string = string.replace("\\inity", "\\infty")
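
To see what the misspelled pattern does in practice, the four reviewed lines can be wrapped in a small function and run on a few inputs. This is a sketch only: the name `normalize_infinity` and the `plus_pattern` parameter are hypothetical and exist just to swap the last pattern. Reading the last line as an attempt to strip a leading plus from `+\infty` is also an assumption, not something the PR states.

```python
def normalize_infinity(string: str, plus_pattern: str = "+\\inity") -> str:
    """Mirror the four reviewed lines; `plus_pattern` is a hypothetical knob
    that defaults to the pattern as written in the PR."""
    string = string.replace("infinity", "\\infty")
    # Guard: "\infty" itself contains "inf", so skip if already normalized.
    if "\\infty" not in string:
        string = string.replace("inf", "\\infty")
    string = string.replace(plus_pattern, "\\infty")
    return string


# As written, the last replace never matches normalized input.
print(normalize_infinity("infinity"))  # \infty
print(normalize_infinity("+inf"))      # +\infty

# The suggested pattern repairs a literal "\inity" in the input.
print(normalize_infinity("\\inity", plus_pattern="\\inity"))  # \infty

# Assumed alternative intent: drop the plus sign in front of "\infty".
print(normalize_infinity("+inf", plus_pattern="+\\infty"))    # \infty
```

As written, `+inf` comes out as `+\infty`, so the last line is effectively a no-op on normalized input. The suggested change and the assumed alternative fix different things, so which one is correct depends on what the line was meant to do.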

praasz and others added 3 commits November 13, 2025 15:57
* Update usage of shared stream buffer

Signed-off-by: Raasz, Pawel <[email protected]>

* Add onnxscript to custom operation req tests

Signed-off-by: Raasz, Pawel <[email protected]>

* Fix FFT custom op mapping to use it instead of ops in OV

Signed-off-by: Raasz, Pawel <[email protected]>

* Fix FFT evaluate seg faults and update not supported configurations

Signed-off-by: Raasz, Pawel <[email protected]>

* Skip calculate_grid test

Signed-off-by: Raasz, Pawel <[email protected]>

* Number of inputs in custom FFT compatible with ONNX DFT

Signed-off-by: Raasz, Pawel <[email protected]>

* Revert input check for FFT and set version for onnxscript

Signed-off-by: Raasz, Pawel <[email protected]>

---------

Signed-off-by: Raasz, Pawel <[email protected]>