Skip to content

Conversation

@nkomonen-amazon
Copy link
Contributor

@nkomonen-amazon nkomonen-amazon commented Feb 19, 2025

Problem

We noticed that there was an auth dropoff between session_start with a brand new clientId, versus when the auth_userState metric indicated isFirstUse meaning the user is net new. We went from 12.7k for session_start to 9.8k for auth_userState, these should have been basically the same.

Solution

Add in certain metrics to help debug where the discrepancy is coming from:

  • When we determine the user is a first time user, we will also check if the clientId is newly generated. If this is not the case we know there is a discrepancy here
    • The relevant metric will be function_call with a functionName: isFirstUse, result: Failed, and a reason: ClientIdAlreadyExisted
  • When we determine the user is a first time user, if we detected that they had previous auth connections, this will indicate a likely cause for the discrepancy
    • The relevant metric will be function_call with reason: UnexpectedConnections
  • We will emit metrics when the Auth Login page loads since that also had a discrepancy and the telemetry did not exist
    • The relevant metric is webview_load and it will indicate when the Auth Login/Reauth page has actually loaded
    • Previously we were observing the telemetry for the command aws.amazonq.focusChat, but all this did was emit when called and didn't confirm the UI actually loaded.
  • We will also add the isFirstUse metric source value in to some other existing metrics

  • Treat all work as PUBLIC. Private feature/x branches will not be squash-merged at release time.
  • Your code changes must meet the guidelines in CONTRIBUTING.md.
  • License: I confirm that my contribution is made under the terms of the Apache 2.0 license.

Avoid any invalid logic involved with hasExistingConnections()

Signed-off-by: nkomonen-amazon <[email protected]>
Problem:

We were seeing some cases where the ClientId was seen as new, but
ifFirstUse was reporting as not new.

Solution:

Validate that if we have a new ClientId, that isFirstUse MUST be true.
Otherwise we emit an error, that should help us debug certain cases.

Signed-off-by: nkomonen-amazon <[email protected]>
Before, on a users first use, it was technically possible for Q to indicate it was NOT the first use
if somehow there were existing connections detected.

A previous commit removed that case. But we want to now emit
telemetry if we detected that we WOULD have run in to that case when not
expected.

Signed-off-by: nkomonen-amazon <[email protected]>
Problem:

The inital problem is that when we focused chat, it returned succeeded even though it didn't
have any idea if the UI was actually rendered to the user.

Solution:

This new solution will have the UI emit a telemetry metric once per session that indicates
when the UI has loaded. It will also distinguish between the login and reauth page.

The metric is `webview_load` and the `webviewName` field will distinguish between login and reauth

NOTE:

When Q chat itself has its UI ready we already emit a `webview_load` metric. So use all of these metrics
to determine what the user actually saw

Signed-off-by: nkomonen-amazon <[email protected]>
Signed-off-by: nkomonen-amazon <[email protected]>
@nkomonen-amazon nkomonen-amazon marked this pull request as ready for review February 19, 2025 18:43
@nkomonen-amazon nkomonen-amazon requested a review from a team as a code owner February 19, 2025 18:43
@nkomonen-amazon
Copy link
Contributor Author

/runIntegrationTests

}

private didCall: { login: boolean; reauth: boolean } = { login: false, reauth: false }
public setUiReady(state: 'login' | 'reauth') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this something we could just use the once() util for? That way you don't have to keep track of the extra state?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to emit once for login and reauth separately which didn't allow me to use once(). But I think in a future PR we could do some memoize + once to address this


describe('ExtensionUse.isFirstUse()', function () {
let instance: ExtensionUse
const notHasExistingConnections = () => false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be hasNoExistingConnections ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I'll follow up with the name change after. Don't want to have to run CI again unless theres something critical.

}

if (isAmazonQ()) {
this.isFirstUseCurrentSession = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

qq: why is this always true for amazonQ?

Copy link
Contributor Author

@nkomonen-amazon nkomonen-amazon Feb 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was previous logic where if we detected an auth connection in Toolkit that we would not consider it a first time user. But this doesn't make sense for Q.

Also if they get this far in the function it is guaranteed they are a first time user, since they would have otherwise returned earlier

]
},
{
"name": "session_start",
Copy link
Contributor

@justinmk3 justinmk3 Feb 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed? There is already a metric for starting the plugin. Adding similar, semantically-redundant metrics will make it harder to reason about the lifecycle and telemetry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the metric?

Copy link
Contributor

@justinmk3 justinmk3 Feb 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this metric already exists. So different question is can the existing metric be updated instead , in the common repo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I plan to do this. It was just due to time constraints. I'll follow up after this PR

"description": "When the Amazon Q sign in page is opened and focused.",
"metadata": [
{
"type": "source",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm being rushed to get the change out for this release, but I will port this over after

// Our callback won't fire on the first view.
if (webviewView.visible) {
telemetry.auth_signInPageOpened.emit({ result: 'Succeeded', passive: true })
telemetry.auth_signInPageOpened.emit({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we modify this metric and make it more general in the future?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, we need to better structure out telemetry since it is messy with what "opened" means. We can also collapse all webviews in to a single metric probably

Copy link
Contributor

@jpinkney-aws jpinkney-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the time constraints we're under for testing the amazon q changes, I don't see anything majorly blocking. I think we definitely need follow up PRs for things like porting back to telemetry though.

Also, is some of this code going to live on after us finding the problem or will we scrap it afterwords?

@nkomonen-amazon
Copy link
Contributor Author

@jpinkney-aws

Some of these metrics will be removed (mainly the function_call ones to help debug). But the others like session_start and auth_signInPageOpened will be ported. Though I think we may need to look at adding something like source as a global field instead since we have so many cases for it.

I will look in to a fast follow-up.

@nkomonen-amazon nkomonen-amazon merged commit 35502be into aws:master Feb 19, 2025
31 of 32 checks passed
@nkomonen-amazon nkomonen-amazon deleted the authDropoff branch February 19, 2025 19:17
avi-alpert pushed a commit to leigaol/aws-toolkit-vscode that referenced this pull request Feb 19, 2025
## Problem

We noticed that there was an auth dropoff between `session_start` with a
brand new clientId, versus when the `auth_userState` metric indicated
`isFirstUse` meaning the user is net new. We went from 12.7k for
`session_start` to 9.8k for `auth_userState`, these should have been
basically the same.

## Solution

Add in certain metrics to help debug where the discrepancy is coming
from:

- When we determine the user is a first time user, we will also check if
the clientId is newly generated. If this is not the case we know there
is a discrepancy here
- The relevant metric will be `function_call` with a `functionName:
isFirstUse`, `result: Failed`, and a `reason: ClientIdAlreadyExisted`
- When we determine the user is a first time user, if we detected that
they had previous auth connections, this will indicate a likely cause
for the discrepancy
- The relevant metric will be `function_call` with `reason:
UnexpectedConnections`
- We will emit metrics when the Auth Login page loads since that also
had a discrepancy and the telemetry did not exist
- The relevant metric is `webview_load` and it will indicate when the
Auth Login/Reauth page has actually loaded
- Previously we were observing the telemetry for the command
`aws.amazonq.focusChat`, but all this did was emit when called and
didn't confirm the UI actually loaded.
- We will also add the `isFirstUse` metric source value in to some other
existing metrics

---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.

---------

Signed-off-by: nkomonen-amazon <[email protected]>
leigaol pushed a commit to leigaol/aws-toolkit-vscode that referenced this pull request Feb 19, 2025
* fix(appbuilder): pass in the negative version of --use-container when using build quickpick (aws#6603)

## Problem
When users use appbuilder to build their lambda functions, they choose
between using their samconfig file or manually selecting the build
parameters/flags. The problem is that when the user selects build flags
and intentionally doesn't select the ```--use-container``` flag, the
command will still be run with --use-container if the samconfig file has
```use_container``` is set to true.

## Solution
Whenever the user manually selects the build flags and doesn't select
```--use-container```, we add the negative version of
```--use-container```, which is ```--no-use-container```. This serves as
an override if the samconfig file has ```--use-container``` set to true.

---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.

* ci(jscpd): merge target branch in jscpd to avoid false negatives.  (aws#6572)

## Problem
- Follow up to
aws#6564 (review).

## Solution
- It appears that there is an undocumented "feature" that GHA don't run
when there is a merge conflict. See
[here](https://github.com/orgs/community/discussions/11265)
- This means we don't have to handle the failure case where a merge
fails.
- Add fake config identity to mitigate this error: 
<img width="913" alt="image"
src="https://github.com/user-attachments/assets/cd426ec7-e1ca-4d13-a3b1-3985b5593c07"
/>


## Notes
Going to let this sit and make sure it works as changes are merged into
master.


---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.

---------

Co-authored-by: Justin M. Keyes <[email protected]>

* ci: fix and enable post-release notification (aws#6613)

- Enable for prod runs
- Fix script slightly because the way codebuild runs bash and the way my
local runs bash seems to not be the same.
- Tested on dev release pipeline

---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.

* refactor: notify.txt typos (aws#6616)

---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.

* config(amazonq): update polling config for codefix (aws#6617)

## Problem

Increase in codefix timeouts


## Solution

Increase default timeout and lower the polling frequency


---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.

* fix(amazonq): auto-review removes existing issues (aws#6535)

## Problem

Auto-reviews often produce less code issues than manual reviews, but the
current behavior is to remove all the issues in the file when processing
the new ones. This means that issues discovered by manual reviews but
not auto-reviews will silently disappear if auto-reviews is enabled.


## Solution

- Auto-reviews should not clear the previous issues, but instead merge
in the new results to the existing group.
- Fixed a related issue with the `ignoreIssue` command being flaky


---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.

* feat(amazonq): /doc: add support for infrastructure diagrams (aws#6561)

Problem:
- Amazon Q does not have support for infrastructure diagrams

Solution:
- Add support for them



![infra_diagram](https://github.com/user-attachments/assets/79693ab0-d95d-415e-8daf-04d59fed8573)

---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.

Co-authored-by: Viktor Shesternyak <[email protected]>

* refactor(cleanup): remove dead code (aws#6619)

This does nothing anymore


---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.

Signed-off-by: nkomonen-amazon <[email protected]>

* telemetry(auth): Update metrics to better debug Auth dropoff (aws#6625)

## Problem

We noticed that there was an auth dropoff between `session_start` with a
brand new clientId, versus when the `auth_userState` metric indicated
`isFirstUse` meaning the user is net new. We went from 12.7k for
`session_start` to 9.8k for `auth_userState`, these should have been
basically the same.

## Solution

Add in certain metrics to help debug where the discrepancy is coming
from:

- When we determine the user is a first time user, we will also check if
the clientId is newly generated. If this is not the case we know there
is a discrepancy here
- The relevant metric will be `function_call` with a `functionName:
isFirstUse`, `result: Failed`, and a `reason: ClientIdAlreadyExisted`
- When we determine the user is a first time user, if we detected that
they had previous auth connections, this will indicate a likely cause
for the discrepancy
- The relevant metric will be `function_call` with `reason:
UnexpectedConnections`
- We will emit metrics when the Auth Login page loads since that also
had a discrepancy and the telemetry did not exist
- The relevant metric is `webview_load` and it will indicate when the
Auth Login/Reauth page has actually loaded
- Previously we were observing the telemetry for the command
`aws.amazonq.focusChat`, but all this did was emit when called and
didn't confirm the UI actually loaded.
- We will also add the `isFirstUse` metric source value in to some other
existing metrics

---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.

---------

Signed-off-by: nkomonen-amazon <[email protected]>

---------

Signed-off-by: nkomonen-amazon <[email protected]>
Co-authored-by: Frederic Mbea <[email protected]>
Co-authored-by: Hweinstock <[email protected]>
Co-authored-by: Justin M. Keyes <[email protected]>
Co-authored-by: Maxim Hayes <[email protected]>
Co-authored-by: Tai Lai <[email protected]>
Co-authored-by: Viktor Shesternyak <[email protected]>
Co-authored-by: Viktor Shesternyak <[email protected]>
Co-authored-by: Nikolas Komonen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants