Commit b60036b
fix(amazonq): retry LSP token refresh on recoverable errors only. (aws#7192)
## Problem
Some users were hitting invalid bearer token exceptions when making
requests through agentic chat. This points to an issue in how we sync
the token with the language server.
Originally, we thought this was because of a race condition. We check
the token every 10 seconds for changes, and if it's expired, we refresh
it and send it to the language server. That still leaves a 10-second
gap in which the language server could use an expired token before we
update it.
However, we add a buffer of one minute to our `isExpired` check here:
https://github.com/aws/aws-toolkit-vscode/blob/db673c9b74b36591bb5642b3da7d4bc7ae2afaf4/packages/core/src/auth/sso/model.ts#L160
Therefore, the token is treated as expired a minute early, meaning there
is a full minute in which our checks should detect the expired token and
refresh it both locally and on the language server. However, they don't,
because the checks aren't actually being made.
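To make the buffer concrete, here is a minimal sketch of an expiry check with a one-minute margin. The names `TokenLike` and `expirationBufferMs` are assumptions for illustration; the real check is the one linked above in `model.ts` and may differ in detail.

```typescript
// Hypothetical stand-ins; not the toolkit's actual types or constants.
const expirationBufferMs = 60_000 // one-minute buffer, as described above

interface TokenLike {
    readonly expiresAt: Date
}

// Reports the token as expired once it is within the buffer of its real expiry.
function isExpired(token: TokenLike, now: Date = new Date()): boolean {
    return now.getTime() + expirationBufferMs >= token.expiresAt.getTime()
}

const token: TokenLike = { expiresAt: new Date('2025-01-01T12:00:00Z') }

// 90 seconds before the real expiry: still treated as valid.
console.log(isExpired(token, new Date('2025-01-01T11:58:30Z'))) // false

// 30 seconds before the real expiry: already treated as expired, so a
// 10-second polling loop has several chances to refresh before the
// language server would actually be rejected.
console.log(isExpired(token, new Date('2025-01-01T11:59:30Z'))) // true
```

With a buffer like this, the 10-second polling gap alone should not produce an expired token on the language server, which is why the race condition theory does not hold.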
This is because of how we handle token refresh errors. Note the
following behavior:
- If refresh throws a recoverable error, we throw it. See
[here](https://github.com/aws/aws-toolkit-vscode/blob/6dbb21e50e539c5973586714295e3ce066b030ef/packages/core/src/auth/auth.ts#L856-L878).
- If refresh throws a non-recoverable error, we invalidate the
connection. See
[here](https://github.com/aws/aws-toolkit-vscode/blob/6dbb21e50e539c5973586714295e3ce066b030ef/packages/core/src/auth/auth.ts#L893-L948).
The current `refreshConnection` logic doesn't fit this behavior: it
keeps trying to refresh until a refresh throws an error. Based on the
behavior above, a thrown error is exactly the case where we do want to
retry, since it means the error is recoverable. Conversely, we don't
want to retry when a token refresh results in an invalid connection,
because that indicates a non-recoverable error.
## Solution
- Refactor our token refreshes to log errors thrown and continue to
retry (since these are implicitly recoverable errors).
- Avoid refreshing when the connection state is invalid (since this
implies a non-recoverable error).
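The two points above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this commit: `RefreshDeps`, `refreshTick`, `shouldAttemptRefresh`, `startRefreshLoop`, and the `ConnectionState` values are all hypothetical names.

```typescript
// All names below are hypothetical stand-ins, not the toolkit's real API.
type ConnectionState = 'valid' | 'invalid' | 'unauthenticated'

interface RefreshDeps {
    getConnectionState(): ConnectionState
    refreshToken(): Promise<string> // assumed to throw only on recoverable errors
    sendTokenToServer(token: string): Promise<void>
    logError(err: unknown): void
}

// An invalid connection implies a non-recoverable error, so skip the refresh.
function shouldAttemptRefresh(state: ConnectionState): boolean {
    return state !== 'invalid'
}

type TickResult = 'skipped' | 'refreshed' | 'failed'

// One polling step: refresh and forward the token, logging thrown errors
// instead of propagating them so the next tick can retry.
async function refreshTick(deps: RefreshDeps): Promise<TickResult> {
    if (!shouldAttemptRefresh(deps.getConnectionState())) {
        return 'skipped'
    }
    try {
        const token = await deps.refreshToken()
        await deps.sendTokenToServer(token)
        return 'refreshed'
    } catch (err) {
        deps.logError(err) // recoverable: log and let the loop continue
        return 'failed'
    }
}

// Runs a tick on a fixed interval; returns a function that stops the loop.
function startRefreshLoop(deps: RefreshDeps, intervalMs = 10_000): () => void {
    const timer = setInterval(() => void refreshTick(deps), intervalMs)
    return () => clearInterval(timer)
}
```

The key design choice is that a thrown error never stops the loop; only the connection state decides whether a refresh is attempted.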
## Notes
- Flare auth can't come soon enough. This behavior is not obvious to a
consumer, and leads to bugs like this.
- Debugged with @nkomonen-amazon
---
- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.

1 parent 8f3fb44 commit b60036b
1 file changed, +14 -8 lines changed