docs/en/latest/plugins/ai-proxy-multi.md (+5 −3)
@@ -940,6 +940,8 @@ For verification, the behaviours should be consistent with the verification in [
The following example demonstrates how you can log LLM request-related information in the gateway's access log to improve analytics and auditing. The following variables are available:
+* `request_llm_model`: LLM model name specified in the request.
+* `apisix_upstream_response_time`: Time taken for APISIX to send the request to the upstream service and receive the full response, in milliseconds.
* `request_type`: Type of request, where the value could be `traditional_http`, `ai_chat`, or `ai_stream`.
* `llm_time_to_first_token`: Duration from request sending to the first token received from the LLM service, in milliseconds.
* `llm_model`: LLM model.
@@ -951,7 +953,7 @@ Update the access log format in your configuration file to include additional LL
-The access log entry shows the request type is `ai_chat`, time to first token is `2858` milliseconds, LLM model is `gpt-4`, prompt token usage is `23`, and completion token usage is `8`.
+The access log entry shows the request type is `ai_chat`, APISIX upstream response time is `5765` milliseconds, time to first token is `2858` milliseconds, the requested LLM model is `gpt-4`, the LLM model is `gpt-4`, prompt token usage is `23`, and completion token usage is `8`.
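For reference, the access log format change these hunks describe could be expressed in APISIX's `config.yaml` roughly as follows. This is a hedged sketch, not the exact format string from the docs: the surrounding fields, and the `llm_prompt_tokens`/`llm_completion_tokens` variable names for the token-usage values mentioned in the log entry, are assumptions not shown in this diff.

```yaml
nginx_config:
  http:
    enable_access_log: true
    # JSON-escape values so quoted fields stay machine-parseable
    access_log_format_escape: json
    # Assumed format combining the variables documented above
    access_log_format: '{"remote_addr":"$remote_addr","request":"$request","status":"$status","request_type":"$request_type","request_llm_model":"$request_llm_model","llm_model":"$llm_model","apisix_upstream_response_time":"$apisix_upstream_response_time","llm_time_to_first_token":"$llm_time_to_first_token","llm_prompt_tokens":"$llm_prompt_tokens","llm_completion_tokens":"$llm_completion_tokens"}'
```

With a format like this, each completed request emits one JSON log line containing the LLM-specific fields alongside the usual request metadata.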
docs/en/latest/plugins/ai-proxy.md (+5 −3)
@@ -388,6 +388,8 @@ You should receive a response similar to the following:
The following example demonstrates how you can log LLM request-related information in the gateway's access log to improve analytics and auditing. The following variables are available:
+* `request_llm_model`: LLM model name specified in the request.
+* `apisix_upstream_response_time`: Time taken for APISIX to send the request to the upstream service and receive the full response, in milliseconds.
* `request_type`: Type of request, where the value could be `traditional_http`, `ai_chat`, or `ai_stream`.
* `llm_time_to_first_token`: Duration from request sending to the first token received from the LLM service, in milliseconds.
* `llm_model`: LLM model.
@@ -399,7 +401,7 @@ Update the access log format in your configuration file to include additional LL
-The access log entry shows the request type is `ai_chat`, time to first token is `2858` milliseconds, LLM model is `gpt-4`, prompt token usage is `23`, and completion token usage is `8`.
+The access log entry shows the request type is `ai_chat`, APISIX upstream response time is `5765` milliseconds, time to first token is `2858` milliseconds, the requested LLM model is `gpt-4`, the LLM model is `gpt-4`, prompt token usage is `23`, and completion token usage is `8`.