fix(auth): token refresh rapidly called unexpectedly #6479

nkomonen-amazon · 2025-01-31T23:16:36Z

Problem:

getChatAuthState() is called in many places by the Q features simultaneously,
which eventually triggers multiple calls to getToken() and, if needed, refreshToken().

As a result, refreshToken() was being spammed and the Identity team saw spikes in token refreshes
from clients.

Solution:

Throttle getChatAuthState().

Throttling with leading: true lets us instantly return
a fresh result, or a cached result when we are throttled. Debounce, on the
other hand, would cause callers to hang, since they would have to wait for the debounce to time out.
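
To illustrate the leading-edge behavior described above, here is a minimal sketch. It is not the lodash implementation and not the code in this PR: `throttleLeading`, `fakeGetChatAuthState`, the 2000 ms window, and the injected clock are all hypothetical names and values chosen for the example. It also omits the trailing-edge invocation that lodash's throttle enables by default.

```typescript
// Hypothetical helper (an assumption for illustration, not lodash's code):
// runs `fn` immediately on the first call, then returns the cached result
// for any call made inside the throttle window.
function throttleLeading<T>(fn: () => T, waitMs: number, now: () => number = Date.now): () => T {
    let lastRun: number | undefined
    let cached: T | undefined
    return () => {
        const t = now()
        if (lastRun === undefined || t - lastRun >= waitMs) {
            lastRun = t
            cached = fn() // leading edge: invoke right away, no waiting
        }
        return cached as T
    }
}

// Stand-in for getChatAuthState(); counts how often the expensive path runs.
let authStateCalls = 0
async function fakeGetChatAuthState(): Promise<string> {
    authStateCalls++
    return 'connected'
}

// A manual clock keeps the example deterministic.
let clock = 0
const throttledAuthState = throttleLeading(fakeGetChatAuthState, 2000, () => clock)

// Ten features asking for the auth state at the same moment.
const burst = Array.from({ length: 10 }, () => throttledAuthState())
const callsDuringBurst = authStateCalls

// Once the window has elapsed, the next caller gets a fresh result.
clock = 2500
void throttledAuthState()
const callsAfterWindow = authStateCalls
```

Every caller in the burst receives the same promise without waiting, which is the property that makes a leading throttle preferable to a debounce here.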

Also, we previously put a debounce on getToken() in #6282, but it did not work because a new
SsoAccessToken instance is created each time the offending code flow is triggered. (We could
look into caching the instance instead, which would make the getToken() debounce useful.)
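
The per-instance problem above can be sketched as follows. This is a simplified illustration under stated assumptions: `FakeTokenProvider` is a hypothetical stand-in for SsoAccessToken, and a stored in-flight promise stands in for the real debounce. The point is only that state held on an instance cannot deduplicate calls made on different instances.

```typescript
// Counts how many times the expensive token path actually runs.
let tokenFetches = 0

// Hypothetical stand-in for SsoAccessToken; the real class differs.
class FakeTokenProvider {
    // Deduplication state lives on the instance, so it only helps callers
    // that share this exact instance. (Never reset; this is a sketch.)
    private inFlight: Promise<string> | undefined

    getToken(): Promise<string> {
        if (this.inFlight === undefined) {
            tokenFetches++
            this.inFlight = Promise.resolve('token')
        }
        return this.inFlight
    }
}

// Case 1: a new instance per call, as in the offending code flow.
for (let i = 0; i < 5; i++) {
    void new FakeTokenProvider().getToken()
}
const fetchesWithNewInstances = tokenFetches

// Case 2: one cached instance shared by all callers.
tokenFetches = 0
const sharedProvider = new FakeTokenProvider()
for (let i = 0; i < 5; i++) {
    void sharedProvider.getToken()
}
const fetchesWithSharedInstance = tokenFetches
```

With a fresh instance per call, every call reaches the expensive path; with a shared instance, only the first one does.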

Testing

To test the difference after adding the throttle:

- Add log statements to getToken()
- Set an expired date in the SSO cache for both the token expiration and the client registration expiration
- Use chat

Without the throttle, getChatAuthState() was triggered many times, likely because the connection
became invalid and an event was sent to all Q features, causing each of them to call getChatAuthState() at the same time.

With the throttle added, the number of these calls dropped to at most 2.

Signed-off-by: nkomonen-amazon [email protected]

Treat all work as PUBLIC. Private feature/x branches will not be squash-merged at release time.
Your code changes must meet the guidelines in CONTRIBUTING.md.
License: I confirm that my contribution is made under the terms of the Apache 2.0 license.


github-actions · 2025-01-31T23:16:50Z

This pull request implements a feat or fix, so it must include a changelog entry (unless the fix is for an unreleased feature). Review the changelog guidelines.
- Note: beta or "experiment" features that have active users should announce fixes in the changelog.
- If this is not a feature or fix, use an appropriate type from the title guidelines. For example, telemetry-only changes should use the telemetry type.

Hweinstock · 2025-02-03T14:32:13Z

packages/core/src/codewhisperer/util/authUtil.ts

 import { asStringifiedStack } from '../../shared/telemetry/spans'
 import { withTelemetryContext } from '../../shared/telemetry/util'
 import { focusAmazonQPanel } from '../../codewhispererChat/commands/registerCommands'
+import { throttle } from 'lodash'


I believe we have a backlog item to remove dependency on lodash for performance reasons. The work was started here: #5157

After some discussions, with our move to the LSP work we will probably be able to drop this code, which would get rid of the dependency. Also, there is currently no existing throttle implementation, so we would need to either build our own or add a new dependency. The ticket for removing lodash is here: https://taskei.amazon.dev/tasks/IDE-10293

FWIW it looks like throttle just uses debounce under the hood: https://github.com/lodash/lodash/blob/npm/throttle.js


nkomonen-amazon added 2 commits January 31, 2025 18:03

revert getToken() debounce

4228cfa

Since this is not required to fix the original issue, we will revert this previous change to reduce potential confusion. I will create a SIM ticket to describe a better overall fix, though.

Signed-off-by: nkomonen-amazon <[email protected]>

nkomonen-amazon requested review from a team as code owners January 31, 2025 23:16

jpinkney-aws approved these changes Feb 3, 2025


Hweinstock reviewed Feb 3, 2025


remove old test

33e3c5f

Signed-off-by: nkomonen-amazon <[email protected]>

nkomonen-amazon merged commit 353aa27 into aws:master Feb 7, 2025
25 of 26 checks passed
