
Conversation


@daniel-lxs daniel-lxs commented Aug 22, 2025

Description

This PR ports the prompt caching support for Kimi K2 on Groq from the upstream Cline repository.

Ported from: cline/cline#5697

Changes

  • Added a `GroqUsage` interface to handle Groq's cached token fields in the response
  • Implemented cost calculation with cache read discounts using the existing cost function
  • Enabled prompt caching for the Kimi K2 model with a 50% discount on cached input tokens
  • Updated tests to verify the caching functionality works correctly

Implementation Details

Groq Handler

  • Added a custom `GroqUsage` interface that extends OpenAI's `CompletionUsage` to include Groq's cached token fields
  • Overrode `createMessage()` to report usage through a custom `yieldUsage()` method
  • The `yieldUsage()` method:
    • Extracts cached token information from Groq's response
    • Calculates costs with the cache read discount applied
    • Reports non-cached input tokens separately from cached tokens
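The flow above can be sketched as a standalone TypeScript fragment. The shapes and names here (`ModelInfo`, `calculateCost`, the price fields, and the illustrative prices) are assumptions for illustration, not the actual Roo Code source:

```typescript
// Assumed shape: Groq reports cached tokens under
// prompt_tokens_details.cached_tokens, mirroring OpenAI's CompletionUsage.
interface GroqUsage {
	prompt_tokens: number
	completion_tokens: number
	prompt_tokens_details?: {
		cached_tokens?: number
	}
}

// Hypothetical pricing shape; field names are not the real schema.
interface ModelInfo {
	inputPrice: number // $ per 1M non-cached input tokens
	outputPrice: number // $ per 1M output tokens
	cacheReadsPrice: number // $ per 1M cached input tokens (50% of inputPrice)
}

// Apply the cache-read discount to cached tokens only.
function calculateCost(info: ModelInfo, usage: GroqUsage): number {
	const cacheReadTokens = usage.prompt_tokens_details?.cached_tokens ?? 0
	// Groq does not track cache writes, so only reads are discounted,
	// and non-cached input is clamped at zero.
	const nonCachedInputTokens = Math.max(0, usage.prompt_tokens - cacheReadTokens)
	return (
		(nonCachedInputTokens * info.inputPrice +
			cacheReadTokens * info.cacheReadsPrice +
			usage.completion_tokens * info.outputPrice) /
		1_000_000
	)
}
```

For example, with an input price of $1/1M, a cache read price of $0.5/1M, and an output price of $3/1M, a response with 100 prompt tokens (30 of them cached) and 50 completion tokens costs (70 × 1 + 30 × 0.5 + 50 × 3) / 1,000,000 dollars.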

Model Configuration

  • Enabled prompt caching for the Kimi K2 model (`moonshotai/kimi-k2-instruct`)
  • Configured a 50% discount on cached input tokens
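A minimal sketch of what the model entry might look like; the field names (`supportsPromptCache`, `cacheReadsPrice`) and the prices are assumptions for illustration, not the actual `groq.ts` schema or real Groq pricing:

```typescript
// Hypothetical model entry: caching enabled, cached reads at half the
// regular input price. Prices are illustrative placeholders.
const kimiK2 = {
	id: "moonshotai/kimi-k2-instruct",
	supportsPromptCache: true, // assumed flag gating the caching path in GroqHandler
	inputPrice: 1.0, // illustrative $/1M non-cached input tokens
	cacheReadsPrice: 0.5, // 50% of inputPrice for cached input tokens
}
```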

Tests

  • Updated existing test to expect the new usage format with cache fields
  • Added new test case for cached token handling

Testing

✅ All tests passing (12/12)
✅ TypeScript compilation successful
✅ ESLint checks pass

Credits

This implementation is based on the original work from the Cline repository PR #5697.


Important

Adds prompt caching support for Kimi K2 on Groq with cost calculation and test updates.

  • Behavior:
    • Enables prompt caching for moonshotai/kimi-k2-instruct model with a 50% discount on cached input tokens in groq.ts.
    • Implements cost calculation with cache read discounts in GroqHandler.
  • Implementation:
    • Adds GroqUsage interface in groq.ts to handle cached token fields.
    • Overrides createMessage() in GroqHandler to yield usage data with cache details.
    • Introduces yieldUsage() in GroqHandler to calculate and yield usage costs.
  • Tests:
    • Updates tests in groq.spec.ts to verify caching functionality and cost calculations.
    • Adds test case for handling cached tokens in usage data.

This description was created by Ellipsis for 8fa6f00.

@daniel-lxs daniel-lxs requested review from cte, jr and mrubens as code owners August 22, 2025 16:15
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. enhancement New feature or request labels Aug 22, 2025
@hannesrudolph hannesrudolph added the Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. label Aug 22, 2025
@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Review] in Roo Code Roadmap Aug 22, 2025
@roomote roomote bot left a comment


Thank you for your contribution! I've reviewed the changes and found some issues that need attention before merging.

// Calculate non-cached input tokens for proper reporting
const nonCachedInputTokens = Math.max(0, inputTokens - cacheReadTokens - cacheWriteTokens)

console.log("usage", {

Debug logging should be removed from production code. Could we remove this console.log statement?

import type { ApiHandlerOptions } from "../../shared/api"
import type { ApiHandlerCreateMessageMetadata } from "../index"
import { ApiStream } from "../transform/stream"
import { convertToOpenAiMessages } from "../transform/openai-format"

Is this import still needed? It appears to be unused since the createMessage method is overridden and doesn't call convertToOpenAiMessages.

}

if (chunk.usage) {
yield* this.yieldUsage(chunk.usage as GroqUsage)

Could we add type validation here to ensure chunk.usage conforms to GroqUsage structure? The type assertion without validation could potentially cause runtime errors if the API response structure changes.


const cacheReadTokens = usage?.prompt_tokens_details?.cached_tokens || 0

// Groq does not track cache writes

Could we expand this comment to provide more context? For example: 'Groq does not track cache writes - only cache reads are reported in the API response. This is a limitation of the Groq API as of [date].'

cacheReadTokens: 30,
})
expect(typeof firstChunk.value.totalCost).toBe("number")
})

Consider adding edge case tests:

  • When prompt_tokens_details is present but cached_tokens is undefined
  • When cached tokens exceed total prompt tokens (error case)
  • Verify actual cost calculation values instead of just checking the type
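The suggested edge cases could be sketched like this, using a standalone cost helper with illustrative prices (`costFor` and its pricing are hypothetical, not the actual `groq.spec.ts` tests):

```typescript
// Standalone helper mirroring the assumed discount logic:
// non-cached input clamped at zero, cached reads at half price.
function costFor(promptTokens: number, cachedTokens: number | undefined, completionTokens: number): number {
	const inputPrice = 1.0 // illustrative $/1M tokens
	const cacheReadsPrice = 0.5
	const outputPrice = 3.0
	const cacheRead = cachedTokens ?? 0
	const nonCached = Math.max(0, promptTokens - cacheRead)
	return (nonCached * inputPrice + cacheRead * cacheReadsPrice + completionTokens * outputPrice) / 1_000_000
}

// 1. prompt_tokens_details present but cached_tokens undefined → treated as 0
console.assert(costFor(100, undefined, 0) === 100 / 1_000_000)

// 2. cached tokens exceed total prompt tokens → non-cached input clamps to 0
console.assert(costFor(100, 150, 0) === (150 * 0.5) / 1_000_000)

// 3. verify an actual cost value instead of just checking the type
console.assert(Math.abs(costFor(100, 30, 50) - 0.000235) < 1e-12)
```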

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Aug 22, 2025
@hannesrudolph hannesrudolph added PR - Needs Review and removed Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. labels Aug 22, 2025
@mrubens mrubens merged commit faab314 into main Aug 22, 2025
35 of 36 checks passed
@mrubens mrubens deleted the feat/groq-kimi-k2-prompt-caching branch August 22, 2025 16:40
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Aug 22, 2025
@github-project-automation github-project-automation bot moved this from PR [Needs Review] to Done in Roo Code Roadmap Aug 22, 2025