Lt/te reasoning #1

l-bat · 2025-10-22T14:24:41Z

No description provided.

Copilot

Pull Request Overview

This pull request introduces support for Large Reasoning Model (LRM) optimization by implementing advanced KV-cache token eviction strategies. The main enhancements include:

- Implementation of RKV algorithm for token importance scoring during both prefill and decode stages
- Addition of refined selection methods (KVCrush and DiverseKV) to improve token diversity and reduce redundancy
- Support for Qwen3 and Phi3 model architectures
- Mathematical reasoning benchmark infrastructure for MATH500 and GSM8K datasets
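
As a rough illustration of how an importance score and a redundancy penalty can be combined when evicting KV-cache entries, here is a minimal pure-Python sketch. It is not the PR's implementation of RKV, KVCrush, or DiverseKV: it uses a generic greedy selection (in the style of maximal marginal relevance), and `select_tokens`, `redundancy_weight`, and the per-token `importance` list are hypothetical names chosen for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def select_tokens(keys, importance, budget, redundancy_weight=0.5):
    """Return sorted indices of the cached tokens to keep.

    keys: one key vector per cached token.
    importance: one non-negative score per token (assumed to come from
        something like accumulated attention; this sketch does not compute it).
    budget: maximum number of tokens to keep.
    redundancy_weight: hypothetical knob; 0.0 means importance only.
    """
    n = len(keys)
    if budget >= n:
        return list(range(n))

    top = max(importance)
    norm = [s / top if top > 0 else 0.0 for s in importance]

    kept = []
    remaining = set(range(n))
    while len(kept) < budget:

        def gain(i):
            # Penalize tokens whose key is close to an already kept key.
            redundancy = max((cosine(keys[i], keys[j]) for j in kept), default=0.0)
            return (1 - redundancy_weight) * norm[i] - redundancy_weight * redundancy

        best = max(sorted(remaining), key=gain)
        kept.append(best)
        remaining.remove(best)
    return sorted(kept)


keys = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
importance = [1.0, 0.9, 0.5, 0.8]
print(select_tokens(keys, importance, budget=2))
print(select_tokens(keys, importance, budget=2, redundancy_weight=0.0))
```

With the redundancy penalty enabled, the duplicate of token 0 loses to the less important but dissimilar token 2; with the penalty disabled, the two highest-importance tokens are kept even though their keys are identical.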

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `modules/genai_optimizations/setup.py` | Adds dependencies for mathematical reasoning (latex2sympy2, word2number) |
| `modules/genai_optimizations/genai_opt/token_eviction.py` | Implements RKV algorithm, refined selection modes (KVCrush/DiverseKV), adaptive budget allocation, and key similarity computation |
| `modules/genai_optimizations/genai_opt/sparse_attention.py` | Adds forward functions for Qwen3 and Phi3 models, fixes dropout parameter naming |
| `modules/genai_optimizations/genai_opt/__init__.py` | Exports KVCacheRefinedSelection enum |
| `modules/genai_optimizations/benchmarks/utils.py` | Adds CLI arguments for refined selection algorithms |
| `modules/genai_optimizations/benchmarks/reasoning_parser.py` | New file implementing mathematical answer extraction and evaluation logic |
| `modules/genai_optimizations/benchmarks/math500_gsm_bench.py` | New benchmark script for MATH500 and GSM8K datasets |
| `modules/genai_optimizations/benchmarks/README.md` | Documents LRM optimization example |
| `modules/genai_optimizations/README.md` | Documents RKV mode, refined selection algorithms, and supported LRM models |

modules/genai_optimizations/genai_opt/token_eviction.py

Copilot · 2025-10-22T14:26:03Z

modules/genai_optimizations/benchmarks/reasoning_parser.py

+    string = string.replace("infinity", "\\infty")
+    if "\\infty" not in string:
+        string = string.replace("inf", "\\infty")
+    string = string.replace("+\\inity", "\\infty")


Corrected spelling of '\inity' to '\infty'. The code appears to be attempting to fix typos but includes a typo itself.

Suggested change

-    string = string.replace("+\\inity", "\\infty")
+    string = string.replace("\\inity", "\\infty")
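
To make the effect of the flagged line concrete, the sketch below runs the four lines as written next to a variant that targets `+\infty`. The variant reflects an assumed intent (stripping a leading plus sign from positive infinity); neither the PR nor the review states that intent, and both function names are hypothetical. Note that the suggested replacement of `"\\inity"` still matches only the misspelled literal, which the preceding lines never produce.

```python
def normalize_infinity_as_written(string):
    """Mirrors the four lines under review, including the misspelled literal."""
    string = string.replace("infinity", "\\infty")
    # "\infty" itself contains "inf", so the guard avoids rewriting it twice.
    if "\\infty" not in string:
        string = string.replace("inf", "\\infty")
    string = string.replace("+\\inity", "\\infty")  # never matches "+\infty"
    return string


def normalize_infinity_assumed_intent(string):
    """Assumption: the last replace was meant to drop the plus in "+\\infty"."""
    string = string.replace("infinity", "\\infty")
    if "\\infty" not in string:
        string = string.replace("inf", "\\infty")
    string = string.replace("+\\infty", "\\infty")
    return string


for raw in ["+infinity", "inf", "-infinity"]:
    print(raw, normalize_infinity_as_written(raw), normalize_infinity_assumed_intent(raw))
```

For the input `+infinity`, the code as written leaves the plus sign in place, while the assumed-intent variant removes it; the other inputs are handled identically by both.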

* Update usage of shared stream buffer
  Signed-off-by: Raasz, Pawel <[email protected]>
* Add onnxscript to custom operation req tests
  Signed-off-by: Raasz, Pawel <[email protected]>
* Fix FFT custom op mapping to use it instead ops in OV
  Signed-off-by: Raasz, Pawel <[email protected]>
* Fix FFT evaluate seg faults and update not supported configurations
  Signed-off-by: Raasz, Pawel <[email protected]>
* Skip calculate_grid test
  Signed-off-by: Raasz, Pawel <[email protected]>
* Number of inputs in custom FFT compatible with ONNX DFT
  Signed-off-by: Raasz, Pawel <[email protected]>
* Revert input check for FFT and set version for onnxscript
  Signed-off-by: Raasz, Pawel <[email protected]>

---------

Signed-off-by: Raasz, Pawel <[email protected]>

l-bat requested a review from Copilot October 22, 2025 14:24

github-actions bot added the dependencies label Oct 22, 2025

Copilot AI reviewed Oct 22, 2025

l-bat force-pushed the lt/te_reasoning branch from f5a3b72 to 00e29dd Compare October 22, 2025 14:36

l-bat force-pushed the lt/te_reasoning branch from 00e29dd to 0d51c55 Compare October 30, 2025 11:24

github-actions bot added the category: custom operations label Nov 7, 2025

praasz and others added 3 commits November 13, 2025 15:57

[GenAI] Support refined algorithms (KVCrush, DiverseKV)

a020792

[GenAI] Support Token Eviction for LRMs

20f7e00

l-bat force-pushed the lt/te_reasoning branch from d643b0c to 20f7e00 Compare November 13, 2025 15:12

github-actions bot added the category: NVIDIA plugin label Nov 13, 2025

minor fixes

7999142
