
Add EXAONE 4.0 reasoning parser #22617


Open · wants to merge 1 commit into main

Conversation

@nuxlear commented Aug 11, 2025

  • Add EXAONE 4.0 reasoning parser
  • Add request parameter for ReasoningParser.extract_reasoning_content_streaming()


Purpose

EXAONE 4.0 currently uses reasoning_parser=deepseek_r1 (see #21718).
However, that parser incorrectly treats all output as reasoning content rather than normal content when no <think> or </think> tags appear and enable_thinking=False.

In non-streaming mode this is easily fixed by adjusting the output of extract_reasoning_content(), but the issue persists in streaming mode.
Adding a request parameter to extract_reasoning_content_streaming() lets the parser determine whether each streamed token is reasoning content or normal content; a minimal sketch of the streaming-side decision follows.
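
As a minimal sketch (not the PR's exact code), the decision the new request parameter enables could look like this. chat_template_kwargs is the request field the review below also references; the helper names here are illustrative only:

def is_thinking_enabled(request) -> bool:
    """Read enable_thinking from the request's chat_template_kwargs (default False)."""
    return bool(
        request is not None
        and getattr(request, "chat_template_kwargs", None)
        and request.chat_template_kwargs.get("enable_thinking")
    )

def classify_delta(seen_think_open: bool, seen_think_close: bool, request) -> str:
    """Label a streamed chunk as 'reasoning' or 'content'."""
    if seen_think_close:
        return "content"      # everything after </think> is the final answer
    if seen_think_open or is_thinking_enabled(request):
        return "reasoning"    # inside <think>, or the request asked to think
    return "content"          # no tags and thinking disabled: normal output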

Test Plan

You can toggle the "stream" option in the requests below to test both modes; a minimal streaming-client sketch follows the curl commands.

  1. Run the server (on port 8850, matching the requests below):
vllm serve LGAI-EXAONE/EXAONE-4.0.1-32B --port 8850 --tool-call-parser hermes --reasoning-parser exaone4
  2. Test a normal request (all output should arrive as content):
curl -X POST http://localhost:8850/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "LGAI-EXAONE/EXAONE-4.0.1-32B",
        "messages": [
            {"role": "user", "content": "Which is bigger, 3.7 or 3.11?"}
        ],
        "max_tokens": 1024,
        "stream": false
    }'
  3. Test a reasoning request (output should start with reasoning_content):
curl -X POST http://localhost:8850/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "LGAI-EXAONE/EXAONE-4.0.1-32B",
        "messages": [
            {"role": "user", "content": "Which is bigger, 3.7 or 3.11?"}
        ],
        "chat_template_kwargs": {"enable_thinking": true},
        "max_tokens": 4096,
        "stream": false
    }'
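
To exercise streaming mode ("stream": true), a minimal client sketch is shown below. It assumes the server started above and that vLLM's OpenAI-compatible streaming delta carries a reasoning_content field alongside content; the api_key value is a placeholder, since vLLM does not check it by default.

# Minimal streaming check: with enable_thinking=true, reasoning_content
# chunks should arrive first, followed by regular content chunks.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8850/v1", api_key="EMPTY")
stream = client.chat.completions.create(
    model="LGAI-EXAONE/EXAONE-4.0.1-32B",
    messages=[{"role": "user", "content": "Which is bigger, 3.7 or 3.11?"}],
    extra_body={"chat_template_kwargs": {"enable_thinking": True}},
    max_tokens=4096,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta
    reasoning = getattr(delta, "reasoning_content", None)
    if reasoning:
        print(reasoning, end="", flush=True)      # reasoning stream
    elif delta.content:
        print(delta.content, end="", flush=True)  # answer stream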

Test Result

{"id":"chatcmpl-717caf53062e4b06ba51ad9e71c9512b","object":"chat.completion","created":1754885297,"model":"LGAI-EXAONE/EXAONE-4.0.1-32B","choices":[{"index":0,"message":{"role":"assistant","content":"To determine which number is bigger between **3.7** and **3.11**, follow these steps:\n\n1. **Compare the Whole Number Parts:**\n   - Both numbers have the same whole number part: **3**.\n\n2. **Compare the Decimal Parts:**\n   - **3.7** can be written as **3.70** to make the comparison easier.\n   - Now, compare **0.70** and **0.11**:\n     - **70** (from 0.70) is greater than **11** (from 0.11).\n\n3. **Conclusion:**\n   - Since **0.70 > 0.11**, it follows that **3.7 > 3.11**.\n\n\\[\n\\boxed{3.7}\n\\]","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning_content":null},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":27,"total_tokens":219,"completion_tokens":192,"prompt_tokens_details":null},"prompt_logprobs":null,"kv_transfer_params":null}
{"id":"chatcmpl-fbaf462004774a73a947f9ff99d1cfb3","object":"chat.completion","created":1754885833,"model":"LGAI-EXAONE/EXAONE-4.0.1-32B","choices":[{"index":0,"message":{"role":"assistant","content":"\n\nTo determine which is bigger between 3.7 and 3.11, compare the decimal values digit by digit, starting from the left.\n\n- Both numbers have the same whole number part (3).\n- In the tenths place, 3.7 has a 7, while 3.11 has a 1. Since 7 is greater than 1, 3.7 is larger at this point.\n- Even without further digits, the tenths place comparison is sufficient to conclude that 3.7 is bigger than 3.11.\n\nTo verify:\n- Rewrite 3.7 as 3.70 for easier comparison: 3.70 vs. 3.11.\n- Compare place values:\n  - Units: 3 = 3\n  - Tenths: 7 > 1\n  - Since the tenths differ, no need to check hundredths.\n- Alternatively, convert to fractions:\n  - 3.7 = 37/10 = 370/100\n  - 3.11 = 311/100\n  - Since 370/100 > 311/100, 3.7 is larger.\n\nSubtraction also confirms:  \n3.70 - 3.11 = 0.59, which is positive, so 3.7 is bigger.\n\nThus, **3.7 is bigger than 3.11**.","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning_content":"I need to determine which is bigger between 3.7 and 3.11. Both are decimals, so I should compare them digit by digit.\n\nFirst, look at the whole number part. Both have 3, so they're equal up to the units place. Now, I need to look at the tenths place.\n\nFor 3.7, the tenths digit is 7. For 3.11, the tenths digit is 1. Since 7 is greater than 1, 3.7 should be larger than 3.11.\n\nBut let me write them with the same number of decimal places to make it easier. 3.7 can be written as 3.70, and 3.11 is already 3.11. So, comparing 3.70 and 3.11:\n\n- Units: both 3\n\n- Tenths: 7 vs 1 → 7 > 1\n\nSince the tenths digit is higher for 3.7, it doesn't matter what comes after; 3.70 is greater than 3.11.\n\nI could also think in terms of fractions. 3.7 is 37/10, which is 370/100. And 3.11 is 311/100. Comparing 370/100 and 311/100, clearly 370 > 311, so 370/100 > 311/100, meaning 3.7 > 3.11.\n\n370/100 is 3.70, and 311/100 is 3.11, yes. So, 3.70 is indeed greater than 3.11.\n\nAnother way: 3.11 is three and eleven hundredths, while 3.7 is three and seven tenths. Seven tenths is seventy hundredths, and seventy hundredths is greater than eleven hundredths, so 3.7 > 3.11.\n\nI think I'm overcomplicating it. The simple comparison shows that since the tenths place is higher in 3.7, it's bigger.\n\nBut just to be thorough, let's consider if there's any trick here. Sometimes people might misread 3.7 as 3.07 or something, but no, 3.7 is clearly three point seven, which is 3.70.\n\nIn some contexts, decimals might be written differently, but standardly, 3.7 means 3.70.\n\nPerhaps the question is about numerical values only, so no tricks.\n\nSo, I think 3.7 is bigger than 3.11.\n\nBut let me confirm with subtraction: 3.7 - 3.11 = ? 3.7 minus 3.11.\n\nTo subtract, align decimals:\n\n  3.70\n\n- 3.11\n\n______\n\nStart from right: 0 - 1, can't do, borrow. 10 - 1 = 9, but since we borrowed, the 7 becomes 6 (because 70 becomes 69? Let's think carefully.\n\nActually, 3.70 minus 3.11:\n\nHundredths place: 0 < 1, so need to borrow from tenths. But tenths place has 7, which is 70 hundredths. So, take one from tenths, so tenths become 6, and hundredths become 10. 
Then 10 - 1 = 9.\n\nNow tenths place: 6 (after borrowing) minus 1 = 5.\n\nUnits place: 3 - 3 = 0.\n\nSo, 0.59, which is positive, meaning 3.70 > 3.11.\n\nYes, difference is 0.59, so 3.7 is bigger.\n\nIf I think on a number line, 3.11 is to the left of 3.7, so smaller.\n\nTherefore, 3.7 is bigger.\n\nThe question is \"which is bigger, 3.7 or 3.11?\" So, answer should be 3.7.\n\nBut just to be complete, is there any context where 3.11 could be larger? I don't think so. Unless it's a different base or something, but it's standard decimal.\n\nPerhaps someone might confuse it with fractions, but 3.7 is 37/10 = 3.7, 3.11 is 311/100 = 3.11, and 37/10 = 370/100 > 311/100.\n\n370/100 vs 311/100, yes.\n\nOr as mixed numbers: 3 7/10 vs 3 11/100. 7/10 = 70/100 > 11/100, so 3 70/100 > 3 11/100.\n\nAll ways confirm.\n\nSo, I think it's clear.\n"},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"service_tier":null,"system_fingerprint":null,"usage":{"prompt_tokens":23,"total_tokens":1572,"completion_tokens":1549,"prompt_tokens_details":null},"prompt_logprobs":null,"kv_transfer_params":null}

(Optional) Documentation Update

- Add `request` parameter for ReasoningParser.extract_reasoning_content_streaming()

Co-authored-by: Junwon Hwang <[email protected]>
Co-authored-by: heyzude <[email protected]>
@nuxlear requested a review from aarnphm as a code owner — August 11, 2025 04:31
@mergify bot added labels: deepseek (Related to DeepSeek models), frontend, qwen (Related to Qwen models) — Aug 11, 2025
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces a new reasoning parser for the EXAONE 4.0 model and refactors the ReasoningParser interface to support it. The core change is the addition of the request parameter to extract_reasoning_content_streaming, which lets parsers access request-specific information such as enable_thinking. The changes are generally well implemented and include comprehensive tests, but there is a critical issue in the non-streaming implementation of the new parser: it fails to use the new request parameter, making its behavior inconsistent with its streaming counterpart and incorrect under certain conditions.

Comment on lines +153 to +156:

if self.end_token not in model_output:
    if model_output_parts[1]:
        return model_output, None
    return None, model_output

critical

The logic for handling model output without an end token in non-streaming mode is inconsistent with the streaming implementation and doesn't correctly handle the enable_thinking flag. When enable_thinking is true, the output should be treated as reasoning content even if <think> and </think> tags are missing. The current implementation only checks for the presence of the <think> tag, which can lead to incorrect parsing of the model's output.

You should use the request object to check chat_template_kwargs.get("enable_thinking"), similar to how it's done in extract_reasoning_content_streaming, to ensure consistent behavior.

Suggested change (before → after):

Before:
if self.end_token not in model_output:
    if model_output_parts[1]:
        return model_output, None
    return None, model_output

After:
if self.end_token not in model_output:
    enable_thinking = (request is not None and
                       request.chat_template_kwargs is not None and
                       request.chat_template_kwargs.get("enable_thinking"))
    if enable_thinking or model_output_parts[1]:
        return model_output, None
    return None, model_output
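
For reference, the suggested branch can be exercised in isolation. The sketch below is illustrative, not the PR's code: FakeRequest is a hypothetical stand-in exposing only the chat_template_kwargs field the suggestion reads, and the start-token membership check stands in for model_output_parts[1].

from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class FakeRequest:
    # Hypothetical stand-in for the real request object.
    chat_template_kwargs: Optional[dict] = None

def extract(model_output: str, request: Optional[FakeRequest],
            start_token: str = "<think>",
            end_token: str = "</think>") -> Tuple[Optional[str], Optional[str]]:
    """Return (reasoning_content, content) following the suggested branch."""
    enable_thinking = bool(request is not None
                           and request.chat_template_kwargs is not None
                           and request.chat_template_kwargs.get("enable_thinking"))
    if end_token not in model_output:
        # Untagged output counts as reasoning only when thinking was
        # requested or a <think> block was opened.
        if enable_thinking or start_token in model_output:
            return model_output, None
        return None, model_output
    reasoning, _, content = model_output.partition(end_token)
    return reasoning.replace(start_token, "", 1), (content or None)

assert extract("3.7 is bigger.", FakeRequest()) == (None, "3.7 is bigger.")
assert extract("3.7 is bigger.", FakeRequest({"enable_thinking": True})) == ("3.7 is bigger.", None)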


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default; only the fastcheck CI runs, covering a small, essential subset of tests to catch errors quickly. You can run additional CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀
