Skip to content

Commit b60036b

Browse files
authored
fix(amazonq): retry LSP token refresh on recoverable errors only. (aws#7192)
## Problem Some users were having invalid bearer token exceptions when making chat requests through agentic chat. This highlights an issue in how we sync the token. Originally, we thought this was because of a race condition. We check the token every 10 seconds for changes, and if its expired, refresh it and send it to the language server. However, that still leaves a 10 second gap where the language server could use an expired token before we update it. However, we add a buffer of one minute to our `isExpired` check here: https://github.com/aws/aws-toolkit-vscode/blob/db673c9b74b36591bb5642b3da7d4bc7ae2afaf4/packages/core/src/auth/sso/model.ts#L160 Therefore, this causes the token to expire a minute early, meaning there is a full minute, where our checks should detect the expired token and refresh it both locally and on the language server. However, they don't because the checks aren't actually being made. This is because of how we handle token refresh errors. Note the following behavior: - if refresh throws a recoverable error, we throw it. See [here](https://github.com/aws/aws-toolkit-vscode/blob/6dbb21e50e539c5973586714295e3ce066b030ef/packages/core/src/auth/auth.ts#L856-L878). - if refresh throws a non-recoverable error we invalidate the connection. see [here](https://github.com/aws/aws-toolkit-vscode/blob/6dbb21e50e539c5973586714295e3ce066b030ef/packages/core/src/auth/auth.ts#L893-L948). The current `refreshConnection` logic doesn't work with this because we continuously try to refresh until it throws an error. However, based on the behavior above, we do want to retry on refresh errors since that means the error is recoverable! Additionally, we don't want to retry when token refreshes result in an invalid connection because those are nonrecoverable errors. ## Solution - Refactor our token refreshes to log errors thrown and continue to retry (since these are implicitly recoverable errors). - Avoid refreshing when the connection state is invalid (since this implies an unrecoverable error). ## Notes - Flare auth can't come soon enough. This behavior is not obvious to a consumer, and leads to bugs like this. - debugged w/ @nkomonen-amazon --- - Treat all work as PUBLIC. Private `feature/x` branches will not be squash-merged at release time. - Your code changes must meet the guidelines in [CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines). - License: I confirm that my contribution is made under the terms of the Apache 2.0 license.
1 parent 8f3fb44 commit b60036b

File tree

1 file changed

+14
-8
lines changed

1 file changed

+14
-8
lines changed

packages/amazonq/src/lsp/auth.ts

Lines changed: 14 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -66,20 +66,29 @@ export const notificationTypes = {
6666
* Facade over our VSCode Auth that does crud operations on the language server auth
6767
*/
6868
export class AmazonQLspAuth {
69-
constructor(private readonly client: LanguageClient) {}
69+
#logErrorIfChanged = onceChanged((s) => getLogger('amazonqLsp').error(s))
70+
constructor(
71+
private readonly client: LanguageClient,
72+
private readonly authUtil: AuthUtil = AuthUtil.instance
73+
) {}
7074

7175
/**
7276
* @param force bypass memoization, and forcefully update the bearer token
7377
*/
7478
async refreshConnection(force: boolean = false) {
75-
const activeConnection = AuthUtil.instance.auth.activeConnection
76-
if (activeConnection?.type === 'sso') {
79+
const activeConnection = this.authUtil.auth.activeConnection
80+
if (activeConnection?.state === 'valid' && activeConnection?.type === 'sso') {
7781
// send the token to the language server
78-
const token = await AuthUtil.instance.getBearerToken()
82+
const token = await this.authUtil.getBearerToken()
7983
await (force ? this._updateBearerToken(token) : this.updateBearerToken(token))
8084
}
8185
}
8286

87+
async logRefreshError(e: unknown) {
88+
const err = e as Error
89+
this.#logErrorIfChanged(`Unable to update bearer token: ${err.name}:${err.message}`)
90+
}
91+
8392
public updateBearerToken = onceChanged(this._updateBearerToken.bind(this))
8493
private async _updateBearerToken(token: string) {
8594
const request = await this.createUpdateCredentialsRequest({
@@ -93,10 +102,7 @@ export class AmazonQLspAuth {
93102

94103
public startTokenRefreshInterval(pollingTime: number = oneMinute / 2) {
95104
const interval = setInterval(async () => {
96-
await this.refreshConnection().catch((e) => {
97-
getLogger('amazonqLsp').error('Unable to update bearer token: %s', (e as Error).message)
98-
clearInterval(interval)
99-
})
105+
await this.refreshConnection().catch((e) => this.logRefreshError(e))
100106
}, pollingTime)
101107
return interval
102108
}

0 commit comments

Comments
 (0)