fix(api): improve error message clarity in evaluation results #3364
Conversation
Enhance error handling in invoke_app to display actual provider error messages instead of generic HTTP error codes. This improves user experience by showing actionable error information (e.g., "OpenAI rate limit exceeded" instead of "HTTP 429: Too Many Requests").

Changes:
- Parse response detail.message/stacktrace from LLM provider errors
- Preserve HTTP status code and full error response for debugging
- Add detailed stacktrace extraction from multiple response formats
- Improve error message prioritization (provider message > generic error)

Previously, the evaluation table showed ambiguous error messages that didn't explain what went wrong or how to fix it. Users can now see the actual provider error and hover for technical details.

Related issue: Agenta-AI#3324 - [UX bug] Misleading error message in evaluation table
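As a rough sketch of the message prioritization and multi-format stacktrace extraction described above (the helper name and response shapes are assumptions for illustration, not the PR's actual code):

```python
from typing import Any, Optional, Tuple


def extract_provider_error(
    body: Any, status: int, reason: str
) -> Tuple[str, Optional[str]]:
    """Return (message, stacktrace), preferring provider detail over HTTP text."""
    detail = body.get("detail") if isinstance(body, dict) else None
    if isinstance(detail, str):
        # Some backends return detail as a plain string.
        return detail, None
    if isinstance(detail, dict):
        # Provider message takes priority over the generic HTTP error text.
        message = detail.get("message") or detail.get("error")
        # Different response formats nest the stacktrace under different keys.
        stacktrace = detail.get("stacktrace") or detail.get("traceback")
        if message:
            return str(message), stacktrace
    # No usable provider detail: fall back to the generic HTTP error.
    return f"HTTP error {status}: {reason}", None
```

For example, calling extract_provider_error({"detail": {"message": "OpenAI rate limit exceeded"}}, 429, "Too Many Requests") yields the provider message rather than "HTTP error 429: Too Many Requests".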
Someone is attempting to deploy a commit to the agenta projects Team on Vercel. A member of the Team first needs to authorize it.
Hi @ashrafchowdury, I've completed the fix for #3324 and verified it with testing. I see the Vercel preview check is currently failing, and as a result I'm not entirely sure about the correct next steps. Please let me know what specific changes or additions are needed from my side, and I'm happy to update accordingly.
Thanks for the contribution, @wsxzei. I believe this issue should be addressed on the frontend, as it appears to be an error in how the message is parsed. We just need to ensure that the UI displays the actual message correctly.
Thanks for your review, @ashrafchowdury. I appreciate your perspective on this issue. I also initially thought this was a frontend issue, but after tracing the request flow during debugging, I found that the problem is actually in the backend error handling. The system has two evaluation modes with different architectures:

- Human Evals (the frontend handles LLM invocation results)
- Auto Evals (the backend handles LLM invocation results)

The issue occurs in Auto Evals mode, in this handler:

```python
except aiohttp.ClientResponseError as e:
    error_message = app_response.get("detail", {}).get(
        "error", f"HTTP error {e.status}: {e.message}"
    )
```

When LLM providers return quota/rate limit errors, the actual error details are in the response body. The current code doesn't extract this provider-specific message, so users see generic text like "HTTP error 429: too many requests" instead of the real error.
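To make the diagnosis concrete, here is a small, hypothetical demonstration (the endpoint is a public test URL, not Agenta code) that aiohttp.ClientResponseError carries only the status line, while the provider's explanation lives in the response body and must be read separately:

```python
import asyncio

import aiohttp


async def main() -> None:
    async with aiohttp.ClientSession() as session:
        # Public test endpoint that always answers 429.
        async with session.get("https://httpbin.org/status/429") as response:
            body = await response.text()  # a provider's detail would live here
            try:
                response.raise_for_status()
            except aiohttp.ClientResponseError as e:
                # e.message is only the HTTP reason phrase, not the body.
                print(f"HTTP error {e.status}: {e.message}")


asyncio.run(main())
```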
Summary
Fix misleading error messages in the evaluation table by capturing and displaying the actual error messages from LLM providers instead of generic HTTP error codes.
Problem
When LLM providers return errors (e.g., rate limits, quota exceeded), the evaluation table only showed generic messages like "HTTP 429: Too Many Requests". This confused users because they couldn't tell what actually went wrong or how to fix it.
Solution
Enhanced error handling in invoke_app (api/oss/src/services/llm_apps_service.py) to parse and display the actual provider error messages:
- Use detail.message from the LLM provider's error response as the primary error message
- Fall back to detail.error when message is not available
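A minimal sketch of that fallback chain at the invocation call site, assuming the response body is read before the status check so provider details survive into the error path (the function, exception class, and variable names are illustrative, not the PR's code):

```python
from typing import Any

import aiohttp


class InvocationError(Exception):
    """Carries the user-facing message plus raw details for debugging."""

    def __init__(self, message: str, status: int, body: Any) -> None:
        super().__init__(message)
        self.status = status
        self.body = body


async def invoke_with_clear_errors(url: str, payload: dict) -> dict:
    async with aiohttp.ClientSession() as session:
        async with session.post(url, json=payload) as response:
            # Read the body first; raise_for_status() would discard it.
            body = await response.json(content_type=None)
            if response.status >= 400:
                detail = body.get("detail") if isinstance(body, dict) else None
                detail = detail if isinstance(detail, dict) else {}
                message = (
                    detail.get("message")  # primary: provider message
                    or detail.get("error")  # fallback: provider error field
                    or f"HTTP error {response.status}: {response.reason}"
                )
                # Preserve the status code and full response for debugging/hover.
                raise InvocationError(message, response.status, body)
            return body
```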
Impact
Before
The evaluation table surfaced only generic text such as "HTTP error 429: too many requests".
After
The table shows the actual provider message (e.g., "OpenAI rate limit exceeded"), and users can hover for technical details.
Changes Made
- Parse detail.message and stacktrace from LLM provider error responses in invoke_app (api/oss/src/services/llm_apps_service.py)
- Preserve the HTTP status code and full error response for debugging
- Prioritize the provider-supplied message over the generic HTTP error text
Related Issues
Closes #3324