[AI Proxy] Incorrect llm_total_tokens_count metric for models with reasoning/hidden tokens #14816

@Fvegini

Description

Is there an existing issue for this?

  • I have searched the existing issues

Kong version ($ kong version)

2.9.1

Current Behavior

Currently, the AI Proxy plugin ignores the explicit total_tokens value returned by the LLM provider when generating Prometheus metrics. Instead, it manually calculates the total by summing prompt and completion tokens. This results in inaccurate monitoring and underreporting of token usage.

Steps to Reproduce

  1. Configure a route with the AI Proxy plugin.
  2. Make a request to a model that uses reasoning tokens (where total != prompt + completion).
  3. Observe the upstream JSON response body:
    "usage": {
        "prompt_tokens": 3,
        "total_tokens": 298,
        "completion_tokens": 7
    }
    (Note the difference: 3 + 7 = 10, but the actual billed total is 298).
  4. Check the Prometheus metric ai_llm_total_tokens_count. It records 10 instead of 298.
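The mismatch in step 3 can be shown in a few lines of Lua (a standalone illustration, not Kong code):

```lua
-- Illustration of how summing visible tokens diverges from the provider's
-- explicit total when reasoning/hidden tokens are billed.
local usage = {
  prompt_tokens = 3,
  completion_tokens = 7,
  total_tokens = 298,  -- includes hidden reasoning tokens
}

-- What the plugin currently reports (sum of visible tokens):
local summed = usage.prompt_tokens + usage.completion_tokens

-- What the provider actually billed:
local explicit = usage.total_tokens

print(summed, explicit)  -- 10   298
```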

Root Cause Analysis

I have identified two places in the source code causing this behavior:

1. The data is not passed to the metrics plugin (kong/llm/drivers/shared.lua)
In the shared.lua driver, specifically around line 837, the code extracts prompt and completion tokens for metrics but omits total_tokens, even though it captures it for the analytics log container.

  -- kong\llm\drivers\shared.lua - Line 863
  if response_object.usage then
    if response_object.usage.prompt_tokens then
      request_analytics_plugin[log_entry_keys.USAGE_CONTAINER][log_entry_keys.PROMPT_TOKENS] = response_object.usage.prompt_tokens
    end
    if response_object.usage.completion_tokens then
      request_analytics_plugin[log_entry_keys.USAGE_CONTAINER][log_entry_keys.COMPLETION_TOKENS] = response_object.usage.completion_tokens
    end
    if response_object.usage.total_tokens then
      request_analytics_plugin[log_entry_keys.USAGE_CONTAINER][log_entry_keys.TOTAL_TOKENS] = response_object.usage.total_tokens
    end

    ai_plugin_o11y.metrics_set("llm_prompt_tokens_count", response_object.usage.prompt_tokens)
    ai_plugin_o11y.metrics_set("llm_completion_tokens_count", response_object.usage.completion_tokens)

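As a standalone sketch of the missing piece (with `ai_plugin_o11y` stubbed out for illustration; this is not Kong's actual module), the same guarded pattern the snippet already uses for prompt and completion tokens can be extended to `total_tokens`:

```lua
-- Sketch: the guarded metric writes from shared.lua, extended to also
-- record the provider's explicit total_tokens. ai_plugin_o11y is stubbed.
local metrics = {}
local ai_plugin_o11y = {
  metrics_set = function(key, value) metrics[key] = value end,
}

local response_object = {
  usage = { prompt_tokens = 3, completion_tokens = 7, total_tokens = 298 },
}

if response_object.usage then
  if response_object.usage.prompt_tokens then
    ai_plugin_o11y.metrics_set("llm_prompt_tokens_count", response_object.usage.prompt_tokens)
  end
  if response_object.usage.completion_tokens then
    ai_plugin_o11y.metrics_set("llm_completion_tokens_count", response_object.usage.completion_tokens)
  end
  -- The missing piece: pass the explicit total through as well.
  if response_object.usage.total_tokens then
    ai_plugin_o11y.metrics_set("llm_total_tokens_count", response_object.usage.total_tokens)
  end
end

print(metrics["llm_total_tokens_count"])  -- 298
```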
2. The total is always recalculated (kong/llm/plugin/observability.lua)
Even if the driver passed the value, the observability plugin forces a manual summation, ignoring any explicit total provided by the driver.

-- kong/llm/plugin/observability.lua - Line 78
    elseif key == "llm_total_tokens_count" then
      return _M.metrics_get("llm_prompt_tokens_count") + _M.metrics_get("llm_completion_tokens_count")
    end
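One way to address this side (a standalone sketch with a hypothetical `_M` table, not Kong's actual module) is to fall back to the summation only when no explicit total was recorded:

```lua
-- Sketch: a metrics_get that prefers an explicit total over summation.
-- _M and its metrics table are hypothetical stand-ins for illustration.
local _M = { metrics = {} }

function _M.metrics_set(key, value)
  _M.metrics[key] = value
end

function _M.metrics_get(key)
  local stored = _M.metrics[key]
  if key == "llm_total_tokens_count" and stored == nil then
    -- Fall back to summation only when the driver did not pass a total.
    return (_M.metrics_get("llm_prompt_tokens_count") or 0)
         + (_M.metrics_get("llm_completion_tokens_count") or 0)
  end
  return stored
end

_M.metrics_set("llm_prompt_tokens_count", 3)
_M.metrics_set("llm_completion_tokens_count", 7)
print(_M.metrics_get("llm_total_tokens_count"))  -- 10 (fallback)

_M.metrics_set("llm_total_tokens_count", 298)
print(_M.metrics_get("llm_total_tokens_count"))  -- 298 (explicit)
```

This keeps the current behavior for providers that omit `total_tokens` while honoring the explicit value when it is present.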

The fix on the driver side should be simple: also set the metric when it exists in the response body:

ai_plugin_o11y.metrics_set("llm_total_tokens_count", response_object.usage.total_tokens)

