AGENT-DRAFT: GKO-2024 — LLM Analytics & Dashboard Enhancements [code+dev-draft] #1319

# Analytics

docs/apim/4.11/guides/analytics/configure-llm-analytics-in-elasticsearch.md (new file, 107 additions)

# Configure LLM Analytics in Elasticsearch

## Dashboard Template

The LLM dashboard template (id: `llm`) replaces the previous AI Gateway template. It provides 10 pre-configured widgets that monitor token consumption, cost trends, and model-specific usage. The template targets APIs with `API_TYPE = 'LLM'` and includes time-series charts for token count and cost, faceted breakdowns by model and HTTP status, and summary statistics for total and average tokens and costs.

## Prerequisites

Before you configure LLM analytics in Elasticsearch, ensure the following:

* Gravitee API Management platform with Elasticsearch-backed analytics
* APIs configured with LLM proxy policies that populate `additional-metrics` fields (tokens-sent, tokens-received, sent-cost, received-cost, model, provider)
* An analytics data retention policy sufficient for your cost tracking requirements
* User permissions:
  * `Environment-dashboard-r`: View dashboards
  * `Environment-dashboard-c`: Create dashboards
  * `Environment-dashboard-u`: Update dashboards
  * `Environment-dashboard-d`: Delete dashboards

## Gateway Configuration

### Elasticsearch Aggregation Scripts

The gateway uses Painless scripts to compute LLM metrics from `additional-metrics` fields. These scripts handle missing fields by defaulting to 0.

**LLM Total Token Script:**

```painless
(doc['additional-metrics.long_llm-proxy_tokens-sent'].size() > 0 ? doc['additional-metrics.long_llm-proxy_tokens-sent'].value : 0) +
(doc['additional-metrics.long_llm-proxy_tokens-received'].size() > 0 ? doc['additional-metrics.long_llm-proxy_tokens-received'].value : 0)
```

**LLM Total Cost Script:**

```painless
(doc['additional-metrics.double_llm-proxy_sent-cost'].size() > 0 ? doc['additional-metrics.double_llm-proxy_sent-cost'].value : 0) +
(doc['additional-metrics.double_llm-proxy_received-cost'].size() > 0 ? doc['additional-metrics.double_llm-proxy_received-cost'].value : 0)
```

Both scripts support the `buildSum` and `buildAvg` aggregation types.
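
As a rough sketch, the `buildSum` case amounts to wrapping the total-token script in a standard Elasticsearch scripted `sum` aggregation. The request shape below is plain Elasticsearch DSL; the aggregation name and helper function are illustrative, not Gravitee internals.

```python
# Sketch only: embeds the total-token Painless script in a scripted "sum"
# aggregation (the buildSum case). Swapping "sum" for "avg" gives buildAvg.

TOTAL_TOKEN_SCRIPT = (
    "(doc['additional-metrics.long_llm-proxy_tokens-sent'].size() > 0 ? "
    "doc['additional-metrics.long_llm-proxy_tokens-sent'].value : 0) + "
    "(doc['additional-metrics.long_llm-proxy_tokens-received'].size() > 0 ? "
    "doc['additional-metrics.long_llm-proxy_tokens-received'].value : 0)"
)

def build_sum_aggregation(name: str, script_source: str) -> dict:
    """Build an Elasticsearch request body with a scripted sum aggregation."""
    return {
        "size": 0,  # return aggregation results only, no document hits
        "aggs": {
            name: {
                "sum": {
                    "script": {"lang": "painless", "source": script_source},
                }
            }
        },
    }

body = build_sum_aggregation("llm_total_tokens", TOTAL_TOKEN_SCRIPT)
```

The same wrapper works for the cost script, since both scripts are plain numeric expressions over `doc` values.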

### Analytics Definition

The `analytics-definition.yaml` configuration includes metric, facet, and filter definitions for LLM analytics.

**Metric Configuration:**

| Property | LLM_PROMPT_TOTAL_TOKEN | LLM_PROMPT_TOKEN_COST |
|:---------|:-----------------------|:----------------------|
| Label    | Total token count      | Total token cost      |
| APIs     | LLM                    | LLM                   |
| Type     | COUNTER                | GAUGE                 |
| Unit     | (none)                 | NUMBER                |
| Measures | COUNT, AVG             | COUNT, AVG            |
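
The table above might map onto `analytics-definition.yaml` entries along these lines; the exact schema and field names are assumptions for illustration, not the actual file format:

```yaml
# Hypothetical sketch -- the real analytics-definition.yaml schema may differ
metrics:
  - name: LLM_PROMPT_TOTAL_TOKEN
    label: Total token count
    apis: [LLM]
    type: COUNTER
    measures: [COUNT, AVG]
  - name: LLM_PROMPT_TOKEN_COST
    label: Total token cost
    apis: [LLM]
    type: GAUGE
    unit: NUMBER
    measures: [COUNT, AVG]
```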

Both metrics support faceting and filtering by: API, APPLICATION, PLAN, GATEWAY, TENANT, ZONE, HTTP_METHOD, HTTP_STATUS_CODE_GROUP, HTTP_STATUS, HTTP_PATH, HTTP_PATH_MAPPING, HTTP_USER_AGENT_OS_NAME, HTTP_USER_AGENT_DEVICE, HOST, GEO_IP_COUNTRY, GEO_IP_REGION, GEO_IP_CITY, GEO_IP_CONTINENT, CONSUMER_IP, LLM_PROXY_MODEL, LLM_PROXY_PROVIDER.

Filters additionally include HTTP_ENDPOINT_RESPONSE_TIME, HTTP_GATEWAY_LATENCY, HTTP_GATEWAY_RESPONSE_TIME.

**Facet/Filter Configuration:**

| Property | Type | Operators | Elasticsearch Field |
|:---------|:-----|:----------|:--------------------|
| LLM_PROXY_MODEL | KEYWORD | EQ, IN | `additional-metrics.keyword_llm-proxy_model` |
| LLM_PROXY_PROVIDER | KEYWORD | EQ, IN | `additional-metrics.keyword_llm-proxy_provider` |

All existing HTTP and LLM metrics now support `LLM_PROXY_MODEL` and `LLM_PROXY_PROVIDER` as additional facets and filters.

## Creating LLM Analytics Queries

To query LLM token or cost metrics, construct a `MetricRequest` with the metric name (`LLM_PROMPT_TOTAL_TOKEN` or `LLM_PROMPT_TOKEN_COST`), the desired measures (`COUNT` or `AVG`), and optional filters and facets.

1. Select the metric and measures in your analytics API request.
2. Add filters to scope the query (e.g., `API_TYPE = 'LLM'`, `LLM_PROXY_MODEL IN ['gpt-4', 'claude-3']`).
3. Specify facets to group results (e.g., by `LLM_PROXY_PROVIDER` or `LLM_PROXY_MODEL`).
4. (Optional) Add sorts using the `SortFilter` interface (measure + order).
5. Submit the request to the analytics API endpoint.
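
The steps above can be sketched as a single request payload. The JSON field names and endpoint path here are illustrative assumptions; consult the analytics API reference for the real `MetricRequest` schema.

```python
# Hypothetical MetricRequest payload following the five steps above.
# Field names and the endpoint path are assumptions for illustration.

metric_request = {
    "metric": "LLM_PROMPT_TOTAL_TOKEN",       # step 1: metric
    "measures": ["COUNT", "AVG"],             # step 1: measures
    "filters": [                              # step 2: scope the query
        {"name": "API_TYPE", "operator": "EQ", "values": ["LLM"]},
        {"name": "LLM_PROXY_MODEL", "operator": "IN",
         "values": ["gpt-4", "claude-3"]},
    ],
    "by": ["LLM_PROXY_PROVIDER"],             # step 3: facet by provider
    "sort": {"measure": "COUNT", "order": "DESC"},  # step 4: SortFilter
}

# Step 5 would submit it, e.g.:
#   requests.post(f"{base_url}/analytics/metrics", json=metric_request)
```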

The response includes aggregated token counts, costs, or averages based on your configuration.

## Creating an LLM Dashboard

To create an LLM dashboard from the template:

1. Navigate to **Observability** > **Dashboards**.
2. Click **Create dashboard** > **Create from template**.
3. Select the **LLM** template in the left panel and click **Use template**.
4. The platform creates the dashboard and redirects you to the new dashboard view.
5. Adjust filters or the timeframe to customize the dashboard view.

### Verification

To verify that the dashboard was created successfully, navigate back to **Observability** > **Dashboards**. The new LLM dashboard appears in the dashboard list.

{% hint style="warning" %}
If the dashboard displays no data, verify that:

* The Elasticsearch backend is running
* LLM APIs are generating traffic and populating `additional-metrics` fields
{% endhint %}

### Next Steps

After creating an LLM dashboard, you can:

* Create additional custom dashboards
* Add filters to focus on specific models, providers, or cost ranges
* Monitor for abnormal behavior, increased errors, or unusual token consumption

# Reference

# LLM Analytics API Reference

For configuration guidance, see [Query LLM Metrics and Configure Dashboards](query-llm-metrics-and-configure-dashboards.md).

## Restrictions

- LLM metrics require Elasticsearch-backed analytics (not available with in-memory or other analytics backends)
- `LLM_PROMPT_TOTAL_TOKEN` and `LLM_PROMPT_TOKEN_COST` only populate when LLM proxy policies write to `additional-metrics` fields
- Facet requests are limited to 3 dimensions (`by` maxItems: 3)
- Time-series requests are limited to 2 grouping dimensions (`by` maxItems: 2)
- `LLM_PROXY_MODEL` and `LLM_PROXY_PROVIDER` filters support only the `EQ` and `IN` operators
- The average measure (`AVG`) no longer displays a unit suffix in stats widgets (changed from "ms" to an empty string)
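
The `by` limits above can be expressed as a small client-side guard. This function is a local sketch under the documented limits, not part of the Gravitee API.

```python
# Illustrative guard for the documented "by" limits: facet requests allow
# at most 3 dimensions, time-series requests at most 2.

BY_LIMITS = {"facet": 3, "timeseries": 2}

def validate_by(dimensions: list, kind: str) -> list:
    """Return the dimensions unchanged, or raise if over the documented limit."""
    limit = BY_LIMITS[kind]
    if len(dimensions) > limit:
        raise ValueError(
            f"{kind} requests allow at most {limit} 'by' dimensions, "
            f"got {len(dimensions)}"
        )
    return dimensions

validate_by(["LLM_PROXY_MODEL", "LLM_PROXY_PROVIDER"], "timeseries")  # within limit
```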

## Related Changes

The HTTP Proxy dashboard template now filters the HTTP_REQUESTS widget by `API_TYPE = 'HTTP_PROXY'` to exclude LLM traffic. Widget titles in the HTTP Proxy template were updated for clarity: "Average Latency" → "Average Latency in ms", "Average Response Time" → "Average Response Time in ms". The stats component formatting changed to remove unit suffixes from `AVG` measures.

# Release Notes

docs/gko/4.11/release-information/release-notes/gko-4.11.md (new file, 21 additions)

# GKO 4.11

## Highlights

## Breaking Changes

## New Features

<!-- PIPELINE:GKO-2024 -->

#### **LLM Analytics and Cost Tracking**

* Track token consumption and costs for LLM-proxied API traffic through new dashboard widgets and analytics queries.
* Monitor total token usage, per-request costs, and model-specific metrics with two new metrics (`LLM_PROMPT_TOTAL_TOKEN` and `LLM_PROMPT_TOKEN_COST`) and facets for provider and model filtering.
* Requires APIs configured with LLM proxy policies that populate `additional-metrics` fields, and Elasticsearch-backed analytics.
* Includes a new LLM dashboard template with 10 pre-configured widgets for token count, cost trends, and model-specific usage analysis.

<!-- /PIPELINE:GKO-2024 -->

## Improvements

## Bug Fixes