fix(ui): stop polling on expired/invalid key auth error in PriceDataReload#24350
fix(ui): stop polling on expired/invalid key auth error in PriceDataReload#24350xykong wants to merge 1 commit intoBerriAI:mainfrom
Conversation
…eload When a user's virtual key expires, the JWT cookie may still be valid so they remain logged in. The PriceDataReload component polls /schedule/model_cost_map_reload/status and /model/cost_map/source every 30 seconds, triggering repeated ProxyException (Expired Key, HTTP 400) errors that spam the server logs. Changes: - Extract isAuthError() helper (module-level) to detect HTTP 400/401 responses - Wrap fetchReloadStatus and fetchSourceInfo with useCallback - On auth error, call stopPolling() which clears the interval and sets pollingDisabled=true to prevent further polling - Show an Alert banner when polling is paused, prompting the user to re-login - useEffect now depends on pollingDisabled so it restarts correctly after a fresh login (accessToken change resets pollingDisabled to false) Fixes: repeated ProxyException auth errors from expired virtual keys
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Greptile SummaryThis PR fixes a polling spam issue in Key changes:
Issues found:
Confidence Score: 3/5
|
| Filename | Overview |
|---|---|
| ui/litellm-dashboard/src/components/price_data_reload.tsx | Adds auth-error detection and polling backoff to stop spamming the server on expired keys. Core mechanism is correct but has a logic bug: pollingDisabled is never reset when accessToken changes, so polling won't restart after re-login as claimed. The isAuthError regex is also fragile, and the "expired" alert persists after a successful manual hard-refresh. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A([Component mounts / accessToken changes]) --> B{accessToken present\nAND pollingDisabled === false?}
B -- No --> C([useEffect early return — no polling])
B -- Yes --> D[Initial fetch: fetchReloadStatus + fetchSourceInfo]
D --> E[Start setInterval 30s]
E --> F{Interval fires}
F --> G[fetchReloadStatus]
F --> H[fetchSourceInfo]
G --> I{Auth error\nHTTP 400/401?}
H --> J{Auth error\nHTTP 400/401?}
I -- Yes --> K[stopPolling: clearInterval + setPollingDisabled true]
J -- Yes --> K
I -- No --> L[Update reloadStatus state]
J -- No --> M[Update sourceInfo state]
K --> N[Show Alert warning banner]
N --> O{User re-logs in\nnew accessToken prop?}
O -- Yes --> P["useEffect re-runs BUT pollingDisabled still true ⚠️"]
P --> C
O -- No --> Q([Polling stays stopped])
Reviews (1): Last reviewed commit: "fix(ui): stop polling on expired/invalid..." | Re-trigger Greptile
| useEffect(() => { | ||
| if (!accessToken || pollingDisabled) return; | ||
|
|
||
| fetchReloadStatus(); | ||
| fetchSourceInfo(); | ||
|
|
||
| intervalRef.current = setInterval(() => { | ||
| fetchReloadStatus(); | ||
| fetchSourceInfo(); | ||
| }, 30000); | ||
|
|
||
| return () => { | ||
| if (intervalRef.current !== null) { | ||
| clearInterval(intervalRef.current); | ||
| intervalRef.current = null; | ||
| } | ||
| }; | ||
| }, [accessToken, pollingDisabled, fetchReloadStatus, fetchSourceInfo]); |
There was a problem hiding this comment.
Polling never resumes after re-login
The PR description states that "polling restarts correctly after fresh login (new accessToken)", but this is not the case with the current implementation.
When an auth error stops polling, pollingDisabled is set to true. If the user then re-logs in (causing the parent to pass a new accessToken), the useEffect re-runs (because accessToken and the memoized callbacks changed), but pollingDisabled is still true — so the early-return guard fires and the interval is never restarted.
Polling will only resume if the component is fully unmounted and remounted (e.g., navigating away and back). A simple fix is to reset pollingDisabled to false whenever accessToken changes:
// Reset polling-disabled guard when a new token arrives (e.g. after re-login)
useEffect(() => {
if (accessToken) {
setPollingDisabled(false);
}
}, [accessToken]);Place this before the polling useEffect. This ensures the next run of the polling effect sees pollingDisabled === false and starts the interval again.
| const isAuthError = (error: unknown): boolean => { | ||
| if (error instanceof Error) { | ||
| return /HTTP (400|401)/.test(error.message); | ||
| } | ||
| return false; | ||
| }; |
There was a problem hiding this comment.
isAuthError relies on a fragile error-message substring match
The regex /HTTP (400|401)/ works only if the networking layer formats its errors exactly with that substring (e.g., "... HTTP 400 ..."). If networking.ts ever changes its error serialisation (e.g., Axios wraps it as "Request failed with status code 400", or a future refactor drops the "HTTP" prefix), isAuthError will silently return false, polling will continue indefinitely, and the fix regresses without any obvious failure.
Consider enriching the custom error objects thrown by the networking helpers with a typed status field, and checking that instead:
// In networking.ts (example)
class HttpError extends Error {
constructor(public status: number, message: string) {
super(message);
}
}
// Then here:
const isAuthError = (error: unknown): boolean =>
error instanceof HttpError && (error.status === 400 || error.status === 401);If modifying the networking layer is out of scope for this PR, at minimum document the expected format in a comment so future maintainers don't break the assumption silently.
| {pollingDisabled && ( | ||
| <Alert | ||
| type="warning" | ||
| message="Auto-refresh paused — your session key has expired. Please re-login to restore live status." | ||
| showIcon | ||
| style={{ marginBottom: 16 }} | ||
| /> | ||
| )} |
There was a problem hiding this comment.
Alert shown even after a successful manual hard-refresh
When pollingDisabled is true, the warning banner persists even if the user successfully performs a "Hard Refresh" via the button (which calls fetchReloadStatus / fetchSourceInfo directly). A successful manual call implies the token is now valid again — leaving the "session key has expired" alert up at that point is misleading UX.
Consider clearing pollingDisabled (and restarting polling) on a successful hard-refresh response, or at least hiding the alert if a direct call succeeds:
const handleHardRefresh = async () => {
// ...
try {
const response = await reloadModelCostMap(accessToken);
if (response.status === "success") {
// ...
setPollingDisabled(false); // Clear the auth-error gate on success
}
}
// ...
};
Summary
Fixes #24349
The
PriceDataReloadcomponent polls two admin endpoints every 30 seconds. When the user's virtual key expires (while the JWT cookie remains valid), every poll triggersProxyException: Expired Key(HTTP 400), spamming the server with repeated auth errors indefinitely.Changes:
isAuthError()module-level helper detects HTTP 400/401 responsesfetchReloadStatusandfetchSourceInfowrapped withuseCallback(satisfies React hooks lint rules)stopPolling()clears the interval and setspollingDisabled = true<Alert>banner shown when polling is paused, prompting the user to re-loginuseEffectdepends onpollingDisabledso polling restarts correctly after fresh login (newaccessToken)Tests
isAuthError()edge cases (non-Error throws, non-matching status codes)Manual verification: Deployed to production Kubernetes cluster. Zero
ProxyExceptionauth errors from polling after fix — confirmed via pod logs over 30+ minutes.Checklist
ui/litellm-dashboard/src/components/price_data_reload.tsxonly (no backend changes)useEffectcleanup (clearInterval on unmount) preserved