
Commit d233bc4

fix(gemini): gemini input token calculation when implicit cache is hit using langchain (#1451)
fix: gemini caching token calculation when using langchain

Currently: when `input_modality_1` contains tokens, the `input` token count is 0. The cached-token logic only subtracts cached tokens from `input`, when they should also be subtracted from `input_modality_1`.

Proposed fix: subtract `cache_tokens_details` from the corresponding `input_modality` key in addition to subtracting it from `input`.
1 parent 478e7e2 commit d233bc4

File tree

1 file changed: +3 lines, -0 lines

langfuse/langchain/CallbackHandler.py

Lines changed: 3 additions & 0 deletions
@@ -1175,6 +1175,9 @@ def _parse_usage_model(usage: Union[pydantic.BaseModel, dict]) -> Any:
             if "input" in usage_model:
                 usage_model["input"] = max(0, usage_model["input"] - value)
 
+            if f"input_modality_{item['modality']}" in usage_model:
+                usage_model[f"input_modality_{item['modality']}"] = max(0, usage_model[f"input_modality_{item['modality']}"] - value)
+
     usage_model = {k: v for k, v in usage_model.items() if isinstance(v, int)}
 
     return usage_model if usage_model else None
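The snippet below is a minimal, runnable sketch (not the actual Langfuse implementation) of the subtraction the diff introduces. The helper name `subtract_cached_tokens`, the assumed shape of `cache_tokens_details` (a list of entries with `modality` and `token_count`), and the sample numbers are illustrative assumptions; only the key names `input` and `input_modality_1` come from the diff above.

```python
def subtract_cached_tokens(usage_model: dict, cache_tokens_details: list) -> dict:
    """Sketch of the fixed cached-token handling (illustrative only)."""
    for item in cache_tokens_details:
        value = item.get("token_count", 0)

        # Previous behaviour: only the aggregate "input" count was reduced.
        if "input" in usage_model:
            usage_model["input"] = max(0, usage_model["input"] - value)

        # Fix: also reduce the matching per-modality bucket, otherwise a
        # response whose prompt tokens live in input_modality_1 keeps
        # reporting the cached tokens there.
        modality_key = f"input_modality_{item['modality']}"
        if modality_key in usage_model:
            usage_model[modality_key] = max(0, usage_model[modality_key] - value)

    # Keep only integer counts, mirroring the final filter in the diff.
    return {k: v for k, v in usage_model.items() if isinstance(v, int)}


# Hypothetical example: 1000 prompt tokens, all in modality 1, of which 800
# were served from Gemini's implicit cache.
usage = {"input": 1000, "input_modality_1": 1000, "output": 42}
details = [{"modality": 1, "token_count": 800}]
print(subtract_cached_tokens(usage, details))
# -> {"input": 200, "input_modality_1": 200, "output": 42}
```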
