@nkomonen-amazon
Contributor

Problem:

getChatAuthState() is called in many places by the Q features simultaneously.
This eventually triggers multiple calls to getToken() and, if needed, refreshToken().

This resulted in refreshToken() being spammed and the Identity team seeing spikes in token refreshes
from clients.

Solution:

Throttle getChatAuthState().

Throttling with leading: true allows us to instantly return
a fresh result, OR a cached result if we are throttled. Debounce, on the
other hand, would cause callers to hang since they have to wait for the debounce to time out.
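
A minimal sketch of the leading-edge behavior described above. `throttleLeading`, `checkAuthState`, the 2000 ms window, and the injectable clock are all hypothetical names and values chosen for illustration. This is a simplified hand-rolled helper, not lodash's `throttle` (which by default can also invoke once more on the trailing edge) and not the code merged in this PR.

```typescript
// Simplified leading-edge throttle (hypothetical helper, not lodash's implementation).
// The first call in each window runs `fn`; later calls in the same window get the cached result.
function throttleLeading<T>(fn: () => T, waitMs: number, now: () => number = Date.now): () => T {
    let windowStart: number | undefined
    let cached: T | undefined
    return () => {
        const t = now()
        if (windowStart === undefined || t - windowStart >= waitMs) {
            windowStart = t
            cached = fn()
        }
        return cached as T
    }
}

// Stand-in for the expensive auth check; the counter tracks how often it really runs.
let authChecks = 0
function checkAuthState(): Promise<string> {
    authChecks++
    return Promise.resolve('connected')
}

// Injectable clock so the example does not depend on real timers.
let fakeNowMs = 0
const getChatAuthStateThrottled = throttleLeading(checkAuthState, 2000, () => fakeNowMs)

// Five features ask at the same instant: only the first call does real work,
// and the other four immediately receive the same cached promise.
const burst = [1, 2, 3, 4, 5].map(() => getChatAuthStateThrottled())
```

No caller waits on a timer here: each one gets a promise back right away, which is the property that makes throttling a better fit than debouncing for this call site.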

Also, we previously put a debounce on getToken() in #6282, but this did not work since a new
SsoAccessToken instance is created each time the offending code flow is triggered. (We could
look to cache the instance instead, which would enable the getToken() debounce to be useful.)
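
To illustrate why a per-instance rate limiter has no effect when a new instance is created on every call, here is a rough sketch. It assumes the limiter state was held on each instance. `TokenProviderSketch`, `oncePerWindow`, and the 100 ms window are hypothetical, and the limiter is a simplified stand-in rather than the debounce from #6282.

```typescript
// Simplified stand-in for a rate limiter (the PR used a debounce; this is not that implementation).
// What matters here is that the limiter's state lives in the closure created per wrapper.
function oncePerWindow<T>(fn: () => T, waitMs: number, now: () => number): () => T {
    let windowStart: number | undefined
    let cached: T | undefined
    return () => {
        const t = now()
        if (windowStart === undefined || t - windowStart >= waitMs) {
            windowStart = t
            cached = fn()
        }
        return cached as T
    }
}

// Hypothetical class for illustration; not the real SsoAccessToken.
class TokenProviderSketch {
    static underlyingCalls = 0
    readonly getToken: () => string

    constructor(now: () => number) {
        // Each instance builds its own limiter, so limiter state is never shared across instances.
        this.getToken = oncePerWindow(
            () => {
                TokenProviderSketch.underlyingCalls++
                return 'token'
            },
            100,
            now
        )
    }
}

// Time never advances, so every call falls inside a single window.
const frozenClock = () => 0

// Pattern described above: a new instance per call, so no limiter ever sees a second call.
function callsWithFreshInstances(n: number): number {
    TokenProviderSketch.underlyingCalls = 0
    for (let i = 0; i < n; i++) {
        new TokenProviderSketch(frozenClock).getToken()
    }
    return TokenProviderSketch.underlyingCalls
}

// Possible alternative: reuse one cached instance so the limiter can take effect.
function callsWithCachedInstance(n: number): number {
    TokenProviderSketch.underlyingCalls = 0
    const provider = new TokenProviderSketch(frozenClock)
    for (let i = 0; i < n; i++) {
        provider.getToken()
    }
    return TokenProviderSketch.underlyingCalls
}
```

With fresh instances, every call reaches the underlying function. With one cached instance, only the first call in the window does.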

Testing

To test the difference after adding the throttle:

  • Add log statements to getToken()
  • Set an expired date in the SSO cache for both token expiration + client registration expiration
  • Use chat

Without the throttle, getChatAuthState() was triggered many times, likely because the connection
became invalid and sent an event to all Q features, causing each of them to call getChatAuthState() at the same time.

With the throttle added, the number of these calls dropped to at most 2.

Signed-off-by: nkomonen-amazon [email protected]


  • Treat all work as PUBLIC. Private feature/x branches will not be squash-merged at release time.
  • Your code changes must meet the guidelines in CONTRIBUTING.md.
  • License: I confirm that my contribution is made under the terms of the Apache 2.0 license.

Since this is not required to fix the original issue, we will
revert this previous change to reduce potential confusion.

I will create a SIM ticket to describe a better overall fix, though.

Signed-off-by: nkomonen-amazon <[email protected]>
@nkomonen-amazon nkomonen-amazon requested review from a team as code owners January 31, 2025 23:16
@github-actions

  • This pull request implements a feat or fix, so it must include a changelog entry (unless the fix is for an unreleased feature). Review the changelog guidelines.
    • Note: beta or "experiment" features that have active users should announce fixes in the changelog.
    • If this is not a feature or fix, use an appropriate type from the title guidelines. For example, telemetry-only changes should use the telemetry type.

import { asStringifiedStack } from '../../shared/telemetry/spans'
import { withTelemetryContext } from '../../shared/telemetry/util'
import { focusAmazonQPanel } from '../../codewhispererChat/commands/registerCommands'
import { throttle } from 'lodash'
Contributor

I believe we have a backlog item to remove dependency on lodash for performance reasons. The work was started here: #5157

Contributor Author

@nkomonen-amazon nkomonen-amazon Feb 3, 2025

After some discussions about our move to the LSP work, we will probably be able to drop this code, which would get rid of the dependency. Also, there is currently no existing throttle implementation, so we would either need to build our own or add a new dependency. The ticket for removing lodash is here: https://taskei.amazon.dev/tasks/IDE-10293

Contributor

FWIW it looks like throttle just uses debounce under the hood: https://github.com/lodash/lodash/blob/npm/throttle.js

Signed-off-by: nkomonen-amazon <[email protected]>
@nkomonen-amazon nkomonen-amazon merged commit 353aa27 into aws:master Feb 7, 2025
25 of 26 checks passed
s7ab059789 pushed a commit to s7ab059789/aws-toolkit-vscode that referenced this pull request Feb 19, 2025