Fix RequestResponseIO parseAndThrow to preserve retryable exception types #37341
Improved error messages when user code fails to serialize (pickle) for distributed execution. The original error was too technical and didn't explain the cause or suggest fixes.

Changes:
- Enhanced RuntimeError message with clear explanation of why serialization is required
- Added common causes (lambdas capturing file handles, DB connections, thread locks)
- Provided three concrete fixes: module-level functions, setup() methods, checking closure captures
- Broadened exception catching to include TypeError and other pickling failures (not just RuntimeError)
- Added exception chaining (from e) to preserve original stack trace
- Added test case to verify the new error message content

This significantly improves developer experience when debugging serialization issues, especially for new Apache Beam users.

Fixes apache#37209

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Fix Python formatter precommit check by applying yapf v0.43.0 formatting rules to modified files.
…ypes

Problem: The parseAndThrow method in Call.java was wrapping retryable exceptions (UserCodeTimeoutException, UserCodeRemoteSystemException) in a generic UserCodeExecutionException, which breaks the retry logic that depends on exception.shouldRepeat() returning true.

Solution:
- Scan the full causal chain using Guava's Throwables.getCausalChain()
- Preserve all specific retryable exception types (Quota/Timeout/RemoteSystem)
- Prefer specific types over generic UserCodeExecutionException when both exist in the chain to prevent masking of retryable exceptions
- Handle circular causal chains gracefully by catching IllegalArgumentException

Testing:
- Added 10 new unit tests covering:
  * Direct retryable exceptions (Timeout, RemoteSystem)
  * Nested exceptions (UncheckedExecutionException wrapping)
  * Generic UserCodeExecutionException masking specific types
  * Triple-nested exceptions
  * Circular reference in causal chain
  * Non-UserCode exceptions (RuntimeException)
- All existing tests pass
- Full rrio test suite passes (90 tasks)

Fixes apache#37176

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
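The causal-chain scan described in this commit message might look roughly like the following. This is a minimal sketch, not the code in Call.java. The nested exception classes are hypothetical stand-ins that model only the `shouldRepeat()` contract. The hand-written `causalChain` helper replaces Guava's `Throwables.getCausalChain()` so the example needs no dependencies, and it stops at a repeated throwable instead of throwing `IllegalArgumentException`. The `select` method name is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class CausalChainSketch {
  // Hypothetical stand-in for the generic, non-retryable exception type.
  static class UserCodeExecutionException extends Exception {
    UserCodeExecutionException(String message, Throwable cause) {
      super(message, cause);
    }

    boolean shouldRepeat() {
      return false;
    }
  }

  // Hypothetical stand-in for one specific, retryable subtype.
  static class UserCodeTimeoutException extends UserCodeExecutionException {
    UserCodeTimeoutException(String message, Throwable cause) {
      super(message, cause);
    }

    @Override
    boolean shouldRepeat() {
      return true;
    }
  }

  // Follows getCause() links and stops when a throwable is seen twice (circular chain).
  static List<Throwable> causalChain(Throwable start) {
    List<Throwable> chain = new ArrayList<>();
    Set<Throwable> seen = Collections.newSetFromMap(new IdentityHashMap<Throwable, Boolean>());
    for (Throwable t = start; t != null && seen.add(t); t = t.getCause()) {
      chain.add(t);
    }
    return chain;
  }

  // Picks the exception to rethrow: a specific subtype anywhere in the chain wins over
  // a generic UserCodeExecutionException; anything else is wrapped in a generic one.
  static UserCodeExecutionException select(Throwable thrown) {
    UserCodeExecutionException generic = null;
    for (Throwable t : causalChain(thrown)) {
      if (t instanceof UserCodeExecutionException) {
        UserCodeExecutionException candidate = (UserCodeExecutionException) t;
        if (candidate.getClass() != UserCodeExecutionException.class) {
          return candidate;
        }
        if (generic == null) {
          generic = candidate;
        }
      }
    }
    return generic != null ? generic : new UserCodeExecutionException("wrapped", thrown);
  }

  public static void main(String[] args) {
    Throwable masked =
        new UserCodeExecutionException("generic", new UserCodeTimeoutException("timeout", null));
    System.out.println(select(masked).shouldRepeat());
    System.out.println(select(new RuntimeException("boom")).shouldRepeat());
  }
}
```

In this sketch a timeout hidden behind a generic wrapper is returned as the timeout itself, so `shouldRepeat()` stays true, while a plain `RuntimeException` still ends up in a non-retryable generic wrapper.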
Summary of Changes
Hello @PDGGK, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request addresses a critical bug in the RequestResponseIO component where retryable exceptions were being masked by a generic exception type, preventing the intended retry mechanism from functioning correctly. The changes ensure that the original, specific retryable exception types are preserved and propagated, thereby restoring proper retry behavior and improving the reliability of user code execution within the RRIO framework. Additionally, the Python SDK's error messages for non-serializable user code have been enhanced to provide clearer guidance for developers.
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment
Assigning reviewers:
R: @jrmccluskey for label python.
Note: If you would like to opt out of this review, comment
Available commands:
The PR bot will only process comments in the main thread (not review comments).
Hi! I notice that 2 Python PreCommit checks failed (Python 3.10 and 3.11), but Python 3.12 and 3.13 passed successfully. Since this PR only modifies Java files ( The failure pattern (only 3.10/3.11, not 3.12/3.13) suggests this might be a flaky test or infrastructure issue specific to those Python versions. Could someone please re-run the failed Python PreCommit checks? Thank you! Failed checks:
Closing this PR as it was created from the wrong branch ( Created a new PR #37342 from a clean branch (
Description
Fixes #37176
The `parseAndThrow` method in `Call.java` was wrapping retryable exceptions (`UserCodeTimeoutException`, `UserCodeRemoteSystemException`) in a generic `UserCodeExecutionException`, which breaks the retry logic that depends on `exception.shouldRepeat()` returning true.
Problem
When user code throws a `UserCodeTimeoutException` or `UserCodeRemoteSystemException` (which have `shouldRepeat() = true`), the old implementation would wrap these in a generic `UserCodeExecutionException` (which has `shouldRepeat() = false`), causing the `Repeater` to not retry the operation as intended.
Solution
- Scan the full causal chain using Guava's `Throwables.getCausalChain()`
- Preserve all specific retryable exception types (Quota/Timeout/RemoteSystem)
- Prefer specific types over generic `UserCodeExecutionException` when both exist in the chain to prevent masking of retryable exceptions
- Handle circular causal chains gracefully by catching `IllegalArgumentException`
Changes
Modified Files
- `sdks/java/io/rrio/src/main/java/org/apache/beam/io/requestresponse/Call.java` (+31/-7 lines)
  - `Throwables` import for causal chain traversal
  - `parseAndThrow` method to scan full exception chain
- `sdks/java/io/rrio/src/test/java/org/apache/beam/io/requestresponse/CallTest.java` (+264/-2 lines)
Testing
Added comprehensive test coverage for:
- Direct retryable exceptions (Timeout, RemoteSystem)
- Nested exceptions (`UncheckedExecutionException` wrapping)
- Generic `UserCodeExecutionException` masking specific types (3 scenarios)
- Triple-nested exceptions
- Circular reference in causal chain
- Non-UserCode exceptions (RuntimeException)
Test Results:
- `CallTest`: All tests passing
- Full `rrio` test suite: 90 tasks passing ✅
- `spotlessCheck` passing ✅
Impact
Behavior Change: Code that previously saw a generic `UserCodeExecutionException` may now see the specific subtype (`UserCodeTimeoutException`/`UserCodeRemoteSystemException`). This is the intended fix to restore proper retry behavior.
Performance: Minimal impact - exception chain traversal only occurs on error paths.
Backwards Compatibility: The change improves correctness. Any code that relied on exceptions being wrapped was working around a bug.
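To make the behavior change concrete, here is a small sketch of how a retry loop that consults `shouldRepeat()` reacts to a wrapped versus a preserved timeout. It does not reproduce the actual `Repeater`. The `attempts` helper, the `Attempt` interface, and the nested exception classes are hypothetical names introduced for illustration only.

```java
class RetrySketch {
  // Hypothetical stand-in for the generic, non-retryable exception type.
  static class UserCodeExecutionException extends Exception {
    UserCodeExecutionException(String message) {
      super(message);
    }

    UserCodeExecutionException(String message, Throwable cause) {
      super(message, cause);
    }

    boolean shouldRepeat() {
      return false;
    }
  }

  // Hypothetical stand-in for a retryable subtype.
  static class UserCodeTimeoutException extends UserCodeExecutionException {
    UserCodeTimeoutException(String message) {
      super(message);
    }

    @Override
    boolean shouldRepeat() {
      return true;
    }
  }

  // A unit of user code that may fail.
  interface Attempt {
    void run() throws UserCodeExecutionException;
  }

  // Runs the attempt up to maxAttempts times and returns how many attempts were made.
  // A failure is retried only when the caught exception reports shouldRepeat() == true.
  static int attempts(Attempt attempt, int maxAttempts) {
    int made = 0;
    while (made < maxAttempts) {
      made++;
      try {
        attempt.run();
        return made;
      } catch (UserCodeExecutionException e) {
        if (!e.shouldRepeat()) {
          return made;
        }
      }
    }
    return made;
  }

  public static void main(String[] args) {
    // Old behavior: the timeout is hidden inside a generic exception, so no retry happens.
    Attempt wrapped = () -> {
      throw new UserCodeExecutionException("generic", new UserCodeTimeoutException("timeout"));
    };
    // New behavior: the timeout surfaces directly, so the loop keeps retrying.
    Attempt preserved = () -> {
      throw new UserCodeTimeoutException("timeout");
    };
    System.out.println(attempts(wrapped, 3)); // prints 1
    System.out.println(attempts(preserved, 3)); // prints 3
  }
}
```

The only difference between the two attempts is the type of the thrown exception, which is why preserving the specific subtype is enough to restore retries in this sketch.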
Example
Before:
After:
Checklist