
## Amazon Q Chat

### amazonq_chatRoundTrip

Measures sequential response times in Q chat, from user input to message display. It tracks the time intervals between key events: the editor receiving the message, feature processing, and final message rendering.

```mermaid
sequenceDiagram
participant User
participant chat as Chat UI
participant vscode as VSCode
participant event as Event Recorder
participant partner as Partner team code
participant telemetry

User->>chat: Write chat message and press enter
chat->>vscode: send message with timestamp
vscode->>event: record chatMessageSent/editorReceivedMessage timestamps
vscode->>partner: forward chat message
partner->>event: record featureReceivedMessage timestamp
partner->>partner: call backend/get response
partner->>vscode: forward response contents
vscode->>chat: display message
chat->>vscode: send stop-chat-message-telemetry event
vscode->>event: record messageDisplayed timestamp
vscode->>telemetry: emit amazonq_chatRoundTrip
```
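
A minimal sketch of how the recorded timestamps could be turned into the emitted intervals. The `ChatRoundTripTimestamps` shape and the field names below are illustrative assumptions, not the toolkit's actual telemetry schema; only the event names match the diagram above.

```typescript
// Illustrative only: timestamps (epoch ms) recorded by the event recorder at each stage.
interface ChatRoundTripTimestamps {
    chatMessageSent: number        // user pressed enter in the Chat UI
    editorReceivedMessage: number  // VSCode extension host received the message
    featureReceivedMessage: number // feature (partner team) code received the message
    messageDisplayed: number       // response rendered back in the Chat UI
}

// Compute the intervals between successive events; the emitted metric carries
// these deltas (field names here are hypothetical).
function buildChatRoundTripMetric(t: ChatRoundTripTimestamps) {
    return {
        chatMessageSentToEditorReceived: t.editorReceivedMessage - t.chatMessageSent,
        editorReceivedToFeatureReceived: t.featureReceivedMessage - t.editorReceivedMessage,
        featureReceivedToMessageDisplayed: t.messageDisplayed - t.featureReceivedMessage,
        fullRoundTrip: t.messageDisplayed - t.chatMessageSent,
    }
}
```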

### cwsprChatTimeToFirstChunk

The time between when the conversation stream was created and when the first usable result was received

```mermaid
sequenceDiagram
participant user as User
participant chat as Chat UI
participant vscode as VSCode
participant generateResponse as Generate response
participant backend

user->>chat: Presses enter with message
chat->>vscode: Tell VSCode to generate a response
vscode->>generateResponse: start generating
generateResponse->>backend: start stream
backend->>backend: create conversation id
backend->>generateResponse: get conversation id
note over backend, generateResponse: cwsprChatTimeToFirstChunk
rect rgb(230, 230, 230, 0.5)
backend->>backend: generate first chunk
backend->>generateResponse: chunk received
end
generateResponse->>vscode: send chunk to display
vscode->>chat: display chunk
loop for each subsequent chunk
backend->>backend: generate next chunk
backend->>generateResponse: chunk received
generateResponse->>vscode: send chunk to display
vscode->>chat: display chunk
end
```
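
A sketch of the interval being measured, assuming the response generator can observe the backend stream as an async iterable; `measureTimeToFirstChunk` and its signature are hypothetical, not the actual implementation.

```typescript
// Hypothetical helper: clock starts when the conversation stream has been created
// (conversation id returned) and stops when the first chunk arrives.
async function measureTimeToFirstChunk(chunks: AsyncIterable<string>): Promise<number> {
    const streamCreated = performance.now()
    for await (const _chunk of chunks) {
        // cwsprChatTimeToFirstChunk: stream creation -> first chunk received
        return performance.now() - streamCreated
    }
    return -1 // the stream ended without producing a chunk
}
```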

### cwsprChatTimeBetweenChunks

An array of times when successive pieces of data are received from the server

```mermaid
sequenceDiagram
participant user as User
participant chat as Chat UI
participant vscode as VSCode
participant generateResponse as Generate response
participant backend

user->>chat: Presses enter with message
chat->>vscode: Tell VSCode to generate a response
vscode->>generateResponse: start generating
generateResponse->>backend: start stream
backend->>backend: create conversation id
backend->>generateResponse: get conversation id

loop for each subsequent chunk
note over backend, generateResponse: cwsprChatTimeBetweenChunks
rect rgb(230, 230, 230, 0.5)
backend->>backend: generate next chunk
backend->>generateResponse: chunk received
generateResponse->>generateResponse: record timestamp
end

generateResponse->>vscode: send chunk to display
vscode->>chat: display chunk
end
```
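
A sketch of how that array could be accumulated as chunks arrive; the helper name and the async-iterable stream shape are assumptions.

```typescript
// Hypothetical helper: record the gap between successive chunks as they arrive.
async function measureTimeBetweenChunks(chunks: AsyncIterable<string>): Promise<number[]> {
    const gaps: number[] = []
    let last = performance.now() // start of the stream
    for await (const _chunk of chunks) {
        const now = performance.now()
        gaps.push(now - last) // one cwsprChatTimeBetweenChunks entry per chunk
        last = now
    }
    return gaps
}
```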

### cwsprChatFullResponseLatency

The time between when the conversation id was created and when the final response from the server was received

```mermaid
sequenceDiagram
participant user as User
participant chat as Chat UI
participant vscode as VSCode
participant generateResponse as Generate response
participant backend

user->>chat: Presses enter with message
chat->>vscode: Tell VSCode to generate a response
vscode->>generateResponse: start generating
generateResponse->>backend: start stream
backend->>backend: create conversation id
backend->>generateResponse: get conversation id

note over backend, chat: cwsprChatFullResponseLatency
rect rgb(230, 230, 230, 0.5)
loop for each subsequent chunk
backend->>backend: generate next chunk
backend->>generateResponse: chunk received
generateResponse->>vscode: send chunk to display
vscode->>chat: display chunk
end
backend->>generateResponse: final chunk received
end
generateResponse->>vscode: send chunk to display
vscode->>chat: display chunk
```
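
A sketch of the measurement under the same assumed stream shape: the clock starts once the conversation id has been returned and stops only after the stream is drained.

```typescript
// Hypothetical helper: time from conversation stream creation to the final chunk.
async function measureFullResponseLatency(chunks: AsyncIterable<string>): Promise<number> {
    const streamCreated = performance.now() // conversation id returned
    for await (const _chunk of chunks) {
        // drain the stream; only the final chunk's arrival time matters here
    }
    return performance.now() - streamCreated // cwsprChatFullResponseLatency
}
```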

### cwsprChatTimeToFirstUsableChunk

The time between the initial server request, including creating the conversation id, and the first usable result

```mermaid
sequenceDiagram
participant user as User
participant chat as Chat UI
participant vscode as VSCode
participant generateResponse as Generate response
participant backend

user->>chat: Presses enter with message
chat->>vscode: Tell VSCode to generate a response
vscode->>generateResponse: start generating
note over backend, generateResponse: cwsprChatTimeToFirstUsableChunk
rect rgb(230, 230, 230, 0.5)
generateResponse->>backend: start stream
backend->>backend: create conversation id
backend->>generateResponse: get conversation id
backend->>backend: generate first chunk
backend->>generateResponse: chunk received
end
generateResponse->>vscode: send chunk to display
vscode->>chat: display chunk
loop for each subsequent chunk
backend->>backend: generate next chunk
backend->>generateResponse: chunk received
generateResponse->>vscode: send chunk to display
vscode->>chat: display chunk
end
```
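
A sketch under the same assumptions, but with the clock started before the request that creates the conversation id; `startStream` is a hypothetical stand-in for the call that opens the stream.

```typescript
// Hypothetical helper: start the clock before the request that creates the
// conversation id, then stop at the first chunk (cwsprChatTimeToFirstUsableChunk).
async function measureTimeToFirstUsableChunk(
    startStream: () => Promise<AsyncIterable<string>> // opens the stream, creates the conversation id
): Promise<number> {
    const requestStarted = performance.now()
    const chunks = await startStream()
    for await (const _chunk of chunks) {
        return performance.now() - requestStarted
    }
    return -1 // the stream ended without producing a chunk
}
```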

### cwsprChatFullServerResponseLatency

The time between the initial server request, including creating the conversation id, and the final response from the server

```mermaid
sequenceDiagram
participant user as User
participant chat as Chat UI
participant vscode as VSCode
participant generateResponse as Generate response
participant backend

user->>chat: Presses enter with message
chat->>vscode: Tell VSCode to generate a response
vscode->>generateResponse: start generating
note over backend, chat: cwsprChatFullServerResponseLatency
rect rgb(230, 230, 230, 0.5)
generateResponse->>backend: start stream
backend->>backend: create conversation id
backend->>generateResponse: get conversation id
loop for each subsequent chunk
backend->>backend: generate next chunk
backend->>generateResponse: chunk received
generateResponse->>vscode: send chunk to display
vscode->>chat: display chunk
end
backend->>generateResponse: final chunk received
end
generateResponse->>vscode: send chunk to display
vscode->>chat: display chunk
```
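
The same hypothetical setup as above, except the clock stops only once the stream is exhausted rather than at the first chunk.

```typescript
// Hypothetical helper: time from the initial server request (before the
// conversation id exists) to the final chunk (cwsprChatFullServerResponseLatency).
async function measureFullServerResponseLatency(
    startStream: () => Promise<AsyncIterable<string>>
): Promise<number> {
    const requestStarted = performance.now()
    const chunks = await startStream()
    for await (const _chunk of chunks) {
        // drain all chunks, including the final one
    }
    return performance.now() - requestStarted
}
```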

### cwsprChatTimeToFirstDisplay

The time between the user pressing enter and when the first chunk of data is displayed to the user

```mermaid
sequenceDiagram
participant user as User
participant chat as Chat UI
participant vscode as VSCode
participant generateResponse as Generate response
participant backend
note over backend, user: cwsprChatTimeToFirstDisplay
rect rgb(230, 230, 230, 0.5)
user->>chat: Presses enter with message
chat->>vscode: Tell VSCode to generate a response
vscode->>generateResponse: start generating
generateResponse->>backend: start stream
backend->>backend: create conversation id
backend->>generateResponse: get conversation id
backend->>backend: generate first chunk
backend->>generateResponse: chunk received
generateResponse->>vscode: send chunk to display
vscode->>chat: display chunk
end
loop for each subsequent chunk
backend->>backend: generate next chunk
backend->>generateResponse: chunk received
generateResponse->>vscode: send chunk to display
vscode->>chat: display chunk
end
```
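
A sketch of a display-side tracker, assuming the Chat UI can observe both the enter press and each rendered chunk; the class and method names are illustrative, not the actual Chat UI code.

```typescript
// Hypothetical tracker: the enter press opens the interval, the first rendered
// chunk closes it (cwsprChatTimeToFirstDisplay).
class FirstDisplayTracker {
    private enterPressedAt?: number
    private firstDisplayLatency?: number

    onEnterPressed(): void {
        this.enterPressedAt = performance.now()
    }

    onChunkDisplayed(): void {
        if (this.enterPressedAt !== undefined && this.firstDisplayLatency === undefined) {
            this.firstDisplayLatency = performance.now() - this.enterPressedAt
        }
    }

    get result(): number | undefined {
        return this.firstDisplayLatency
    }
}
```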

### cwsprChatTimeBetweenDisplays

An array of times when successive pieces of the server response are displayed to the user

```mermaid
sequenceDiagram
participant user as User
participant chat as Chat UI
participant vscode as VSCode
participant generateResponse as Generate response
participant backend

user->>chat: Presses enter with message
chat->>vscode: Tell VSCode to generate a response
vscode->>generateResponse: start generating
generateResponse->>backend: start stream
backend->>backend: create conversation id
backend->>generateResponse: get conversation id

note over backend, chat: cwsprChatTimeBetweenDisplays
rect rgb(230, 230, 230, 0.5)
loop for each subsequent chunk
backend->>backend: generate next chunk
backend->>generateResponse: chunk received
generateResponse->>vscode: send chunk to display
vscode->>chat: display chunk
chat->>vscode: record display timestamp
end
end
```
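
A sketch of recording the gaps between successive displays on the UI side; again, the tracker shape is an assumption rather than the actual implementation.

```typescript
// Hypothetical tracker: record the time of each rendered chunk and report the
// gaps between them (cwsprChatTimeBetweenDisplays).
class DisplayGapTracker {
    private lastDisplayedAt?: number
    readonly gaps: number[] = []

    onChunkDisplayed(): void {
        const now = performance.now()
        if (this.lastDisplayedAt !== undefined) {
            this.gaps.push(now - this.lastDisplayedAt)
        }
        this.lastDisplayedAt = now
    }
}
```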

### cwsprChatFullDisplayLatency

The time between the user pressing enter and the entire response being rendered

```mermaid
sequenceDiagram
participant user as User
participant chat as Chat UI
participant vscode as VSCode
participant generateResponse as Generate response
participant backend

note over backend, user: cwsprChatFullDisplayLatency
rect rgb(230, 230, 230, 0.5)
user->>chat: Presses enter with message
chat->>vscode: Tell VSCode to generate a response
vscode->>generateResponse: start generating
generateResponse->>backend: start stream
backend->>backend: create conversation id
backend->>generateResponse: get conversation id
loop for each chunk
backend->>backend: generate next chunk
backend->>generateResponse: chunk received
generateResponse->>vscode: send chunk to display
vscode->>chat: display chunk
end
end

```
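
A sketch of the end-to-end display measurement under the same assumed UI hooks; the final value is read only after the stream has ended and the last chunk has been rendered.

```typescript
// Hypothetical tracker: clock starts at the enter press and stops once the last
// chunk has been rendered (cwsprChatFullDisplayLatency).
class FullDisplayTracker {
    private enterPressedAt?: number
    private lastDisplayedAt?: number

    onEnterPressed(): void {
        this.enterPressedAt = performance.now()
    }

    onChunkDisplayed(): void {
        this.lastDisplayedAt = performance.now()
    }

    // Call once the stream has ended and the final chunk has been displayed.
    get fullDisplayLatency(): number | undefined {
        if (this.enterPressedAt === undefined || this.lastDisplayedAt === undefined) {
            return undefined
        }
        return this.lastDisplayedAt - this.enterPressedAt
    }
}
```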

## Crash Monitoring

We attempt to gather information about when the IDE crashes and report it to telemetry. The following diagram shows the steps that take place.