docs(amazonq): Add latency metric diagrams for amazon q chat #5865

## Amazon Q Chat

### amazonq_chatRoundTrip

Measures sequential response times in Q chat, from user input to message display. Tracks the time intervals between key events: the editor receiving the message, feature processing, and final message rendering.

```mermaid
sequenceDiagram
    participant User
    participant chat as Chat UI
    participant vscode as VSCode
    participant event as Event Recorder
    participant partner as Partner team code
    participant telemetry

    User->>chat: Write chat message and press enter
    chat->>vscode: send message with timestamp
    vscode->>event: record chatMessageSent/editorReceivedMessage timestamps
    vscode->>partner: forward chat message
    partner->>event: record featureReceivedMessage timestamp
    partner->>partner: call backend/get response
    partner->>vscode: forward response contents
    vscode->>chat: display message
    chat->>vscode: send stop-chat-message-telemetry event
    vscode->>event: record messageDisplayed timestamp
    event->>vscode: get the telemetry timestamps
    vscode->>telemetry: emit amazonq_chatRoundTrip with telemetry timestamps
```
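
The timestamp bookkeeping above can be reduced to interval durations in a few lines. This is a minimal sketch; the `RoundTripTimestamps` shape and `toRoundTripIntervals` helper are hypothetical names, not the extension's actual API:

```typescript
// Hypothetical record of the timestamps captured during one chat round trip.
interface RoundTripTimestamps {
    chatMessageSent: number
    editorReceivedMessage: number
    featureReceivedMessage: number
    messageDisplayed: number
}

// Reduce the raw timestamps to the interval durations (in ms) between key events.
function toRoundTripIntervals(t: RoundTripTimestamps) {
    return {
        chatToEditor: t.editorReceivedMessage - t.chatMessageSent,
        editorToFeature: t.featureReceivedMessage - t.editorReceivedMessage,
        featureToDisplay: t.messageDisplayed - t.featureReceivedMessage,
        fullRoundTrip: t.messageDisplayed - t.chatMessageSent,
    }
}

const intervals = toRoundTripIntervals({
    chatMessageSent: 1000,
    editorReceivedMessage: 1005,
    featureReceivedMessage: 1020,
    messageDisplayed: 1520,
})
console.log(intervals.fullRoundTrip) // prints 520
```

Emitting the intervals rather than the raw timestamps keeps the metric independent of wall-clock skew between machines.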

### cwsprChatTimeToFirstChunk

The time between when the conversation stream is created and when the first usable result is received.

```mermaid
sequenceDiagram
    participant user as User
    participant chat as Chat UI
    participant vscode as VSCode extension host
    participant generateResponse as Generate response
    participant backend as Q service backend

    user->>chat: Presses enter with message
    chat->>vscode: Tell VSCode to generate a response
    vscode->>generateResponse: start generating
    generateResponse->>backend: start stream
    backend->>backend: create conversation id
    backend->>generateResponse: get conversation id
    note over backend, generateResponse: cwsprChatTimeToFirstChunk
    rect rgb(230, 230, 230, 0.5)
        backend->>backend: generate first chunk
        backend->>generateResponse: chunk received
    end
    generateResponse->>vscode: send chunk to display
    vscode->>chat: display chunk
    loop for each subsequent chunk
        backend->>backend: generate next chunk
        backend->>generateResponse: chunk received
        generateResponse->>vscode: send chunk to display
        vscode->>chat: display chunk
    end
```
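
A sketch of how this measurement could be taken over a chunked response stream. The stream and clock below are stand-ins (a fake async generator and an injectable `now()`), not the real Q service client:

```typescript
// Measure the time from stream creation to the first received chunk.
// Returns the metric plus the collected chunks.
async function timeToFirstChunk(
    stream: AsyncIterable<string>,
    now: () => number,
): Promise<{ firstChunkMs: number; chunks: string[] }> {
    const start = now() // conversation stream created
    const chunks: string[] = []
    let firstChunkMs = -1
    for await (const chunk of stream) {
        if (firstChunkMs < 0) {
            firstChunkMs = now() - start // cwsprChatTimeToFirstChunk
        }
        chunks.push(chunk)
    }
    return { firstChunkMs, chunks }
}

// Usage with a fake clock that advances 10 ms per call.
async function demo() {
    let t = 0
    const fakeNow = () => (t += 10)
    async function* fakeStream() {
        yield 'Hello'
        yield ' world'
    }
    const result = await timeToFirstChunk(fakeStream(), fakeNow)
    console.log(result.firstChunkMs) // prints 10
}
demo()
```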

### cwsprChatTimeBetweenChunks

An array of time intervals between successive chunks (pieces of the response data) received from the server.

```mermaid
sequenceDiagram
    participant user as User
    participant chat as Chat UI
    participant vscode as VSCode extension host
    participant generateResponse as Generate response
    participant backend as Q service backend

    user->>chat: Presses enter with message
    chat->>vscode: Tell VSCode to generate a response
    vscode->>generateResponse: start generating
    generateResponse->>backend: start stream
    backend->>backend: create conversation id
    backend->>generateResponse: get conversation id

    loop for each subsequent chunk
        note over backend, generateResponse: cwsprChatTimeBetweenChunks
        rect rgb(230, 230, 230, 0.5)
            backend->>backend: generate next chunk
            backend->>generateResponse: chunk received
            generateResponse->>generateResponse: record timestamp
        end
        generateResponse->>vscode: send chunk to display
        vscode->>chat: display chunk
    end
```
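
The per-chunk timestamp recording can be sketched as a small helper that accumulates the deltas between successive chunks (the `makeChunkTimer` name is illustrative):

```typescript
// Accumulates millisecond gaps between successive onChunk() calls,
// producing the array that cwsprChatTimeBetweenChunks describes.
function makeChunkTimer(now: () => number) {
    let last: number | undefined
    const deltas: number[] = []
    return {
        onChunk(): void {
            const t = now()
            if (last !== undefined) deltas.push(t - last)
            last = t
        },
        deltas,
    }
}

// Fake clock: chunks arrive at t = 0, 40, and 100 ms.
const times = [0, 40, 100]
const timer = makeChunkTimer(() => times.shift()!)
timer.onChunk()
timer.onChunk()
timer.onChunk()
console.log(timer.deltas.join(',')) // prints 40,60
```

Note the first chunk produces no entry; n chunks yield n-1 deltas.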

### cwsprChatFullResponseLatency

The time between when the conversation id was created and when the final response from the server was received.

```mermaid
sequenceDiagram
    participant user as User
    participant chat as Chat UI
    participant vscode as VSCode extension host
    participant generateResponse as Generate response
    participant backend as Q service backend

    user->>chat: Presses enter with message
    chat->>vscode: Tell VSCode to generate a response
    vscode->>generateResponse: start generating
    generateResponse->>backend: start stream
    backend->>backend: create conversation id
    backend->>generateResponse: get conversation id

    note over backend, chat: cwsprChatFullResponseLatency
    rect rgb(230, 230, 230, 0.5)
        loop for each subsequent chunk
            backend->>backend: generate next chunk
            backend->>generateResponse: chunk received
            generateResponse->>vscode: send chunk to display
            vscode->>chat: display chunk
        end
        backend->>generateResponse: final chunk received
    end
    generateResponse->>vscode: send chunk to display
    vscode->>chat: display chunk
```

### cwsprChatTimeToFirstUsableChunk

The time between the initial server request, including creating the conversation stream, and the first usable result.

```mermaid
sequenceDiagram
    participant user as User
    participant chat as Chat UI
    participant vscode as VSCode extension host
    participant generateResponse as Generate response
    participant backend as Q service backend

    user->>chat: Presses enter with message
    chat->>vscode: Tell VSCode to generate a response
    vscode->>generateResponse: start generating
    note over backend, generateResponse: cwsprChatTimeToFirstUsableChunk
    rect rgb(230, 230, 230, 0.5)
        generateResponse->>backend: start stream
        backend->>backend: create conversation id
        backend->>generateResponse: get conversation id
        backend->>backend: generate first chunk
        backend->>generateResponse: chunk received
    end
    generateResponse->>vscode: send chunk to display
    vscode->>chat: display chunk
    loop for each subsequent chunk
        backend->>backend: generate next chunk
        backend->>generateResponse: chunk received
        generateResponse->>vscode: send chunk to display
        vscode->>chat: display chunk
    end
```

### cwsprChatFullServerResponseLatency

The time between the initial server request, including creating the conversation stream, and the final response from the server.

```mermaid
sequenceDiagram
    participant user as User
    participant chat as Chat UI
    participant vscode as VSCode extension host
    participant generateResponse as Generate response
    participant backend as Q service backend

    user->>chat: Presses enter with message
    chat->>vscode: Tell VSCode to generate a response
    vscode->>generateResponse: start generating
    note over backend, chat: cwsprChatFullServerResponseLatency
    rect rgb(230, 230, 230, 0.5)
        generateResponse->>backend: start stream
        backend->>backend: create conversation id
        backend->>generateResponse: get conversation id
        loop for each subsequent chunk
            backend->>backend: generate next chunk
            backend->>generateResponse: chunk received
            generateResponse->>vscode: send chunk to display
            vscode->>chat: display chunk
        end
        backend->>generateResponse: final chunk received
    end
    generateResponse->>vscode: send chunk to display
    vscode->>chat: display chunk
```

### cwsprChatTimeToFirstDisplay

The time between the user pressing enter and when the first piece of data is displayed to the user.

```mermaid
sequenceDiagram
    participant user as User
    participant chat as Chat UI
    participant vscode as VSCode extension host
    participant generateResponse as Generate response
    participant backend as Q service backend

    note over backend, user: cwsprChatTimeToFirstDisplay
    rect rgb(230, 230, 230, 0.5)
        user->>chat: Presses enter with message
        chat->>vscode: Tell VSCode to generate a response
        vscode->>generateResponse: start generating
        generateResponse->>backend: start stream
        backend->>backend: create conversation id
        backend->>generateResponse: get conversation id
        backend->>backend: generate first chunk
        backend->>generateResponse: chunk received
        generateResponse->>vscode: send chunk to display
        vscode->>chat: display chunk
    end
    loop for each subsequent chunk
        backend->>backend: generate next chunk
        backend->>generateResponse: chunk received
        generateResponse->>vscode: send chunk to display
        vscode->>chat: display chunk
    end
```
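
Unlike the server-side metrics above, this one is anchored at the user's keypress, so it includes extension-host and UI overhead as well as backend latency. A hypothetical tracker (illustrative names, injectable clock) might look like:

```typescript
// Minimal clock abstraction so the tracker can be tested deterministically.
interface DisplayClock {
    now(): number
}

// Records the delay from user submit to the first rendered chunk.
class FirstDisplayTracker {
    private requestStart: number | undefined
    private firstDisplayMs: number | undefined

    constructor(private clock: DisplayClock) {}

    onUserSubmit(): void {
        this.requestStart = this.clock.now()
    }

    onChunkDisplayed(): void {
        // Only the first displayed chunk sets the metric.
        if (this.requestStart !== undefined && this.firstDisplayMs === undefined) {
            this.firstDisplayMs = this.clock.now() - this.requestStart
        }
    }

    get metric(): number | undefined {
        return this.firstDisplayMs
    }
}

let t = 0
const tracker = new FirstDisplayTracker({ now: () => t })
tracker.onUserSubmit()     // user presses enter at t = 0
t = 250
tracker.onChunkDisplayed() // first chunk rendered at t = 250
t = 300
tracker.onChunkDisplayed() // later chunks do not overwrite the metric
console.log(tracker.metric) // prints 250
```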

### cwsprChatTimeBetweenDisplays

An array of time intervals between successive pieces of the server response being displayed to the user.

```mermaid
sequenceDiagram
    participant user as User
    participant chat as Chat UI
    participant vscode as VSCode extension host
    participant generateResponse as Generate response
    participant backend as Q service backend

    user->>chat: Presses enter with message
    chat->>vscode: Tell VSCode to generate a response
    vscode->>generateResponse: start generating
    generateResponse->>backend: start stream
    backend->>backend: create conversation id
    backend->>generateResponse: get conversation id

    note over backend, chat: cwsprChatTimeBetweenDisplays
    rect rgb(230, 230, 230, 0.5)
        loop for each subsequent chunk
            backend->>backend: generate next chunk
            backend->>generateResponse: chunk received
            generateResponse->>vscode: send chunk to display
            vscode->>chat: display chunk
            chat->>vscode: record display timestamp
        end
    end
```

### cwsprChatFullDisplayLatency

The time between the user pressing enter and the entire response being rendered.

```mermaid
sequenceDiagram
    participant user as User
    participant chat as Chat UI
    participant vscode as VSCode extension host
    participant generateResponse as Generate response
    participant backend as Q service backend

    note over backend, user: cwsprChatFullDisplayLatency
    rect rgb(230, 230, 230, 0.5)
        user->>chat: Presses enter with message
        chat->>vscode: Tell VSCode to generate a response
        vscode->>generateResponse: start generating
        generateResponse->>backend: start stream
        backend->>backend: create conversation id
        backend->>generateResponse: get conversation id
        loop for each subsequent chunk
            backend->>backend: generate next chunk
            backend->>generateResponse: chunk received
            generateResponse->>vscode: send chunk to display
            vscode->>chat: display chunk
        end
    end
```

## Crash Monitoring

We attempt to gather information about when the IDE crashes, then report it to telemetry. The following diagram shows the steps that take place.