Fix SIGINT handler for StopOnExitJobServer #34748

Merged
lostluck merged 1 commit into apache:master from shunping:prism-sigint-handler
Apr 26, 2025

@shunping
Collaborator

@shunping shunping commented Apr 26, 2025

The previously registered SIGINT handler had an incorrect signature according to the signal module documentation:
https://docs.python.org/3/library/signal.html#signal.signal

The handler is called with two arguments: the signal number and the current stack frame (None or a frame object; for a description of frame objects,

This PR updates the handler to the correct signature. The handler now stops the server gracefully and then calls the original handler so that the application exits properly. This matters especially in a Colab environment, where the kernel sends SIGINT when the stop button of a cell is pressed.
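
For illustration only, here is a minimal sketch of the pattern described above. It is not the actual change in this PR: `DemoServer` and `_sigint_handler` are hypothetical names, and the sketch assumes the handler is installed from the main thread. It shows a handler that accepts the `(signum, frame)` pair passed by the `signal` module, stops the server, and then delegates to whatever handler was registered before it.

```python
import signal


class DemoServer:
    """Hypothetical stand-in for a job server wrapper (not Beam's class)."""

    def __init__(self):
        self.running = False
        self._original_handler = None

    def start(self):
        self.running = True
        # signal.signal() returns the handler that was installed before,
        # so it can be restored and called later.
        self._original_handler = signal.signal(
            signal.SIGINT, self._sigint_handler)

    def stop(self):
        self.running = False

    def _sigint_handler(self, signum, frame):
        # Python calls signal handlers with two arguments:
        # the signal number and the current stack frame (or None).
        self.stop()
        original = self._original_handler
        if original is not None:
            # Put the previous handler back before delegating to it.
            signal.signal(signal.SIGINT, original)
        if callable(original):
            # For the default Python handler this raises KeyboardInterrupt.
            # signal.SIG_DFL and signal.SIG_IGN are not callable and are skipped.
            original(signum, frame)
```

A zero-argument method such as `stop` cannot be registered directly, because Python would call it with two extra positional arguments and raise a `TypeError`.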

@shunping shunping changed the title Fix sigint handler for StopOnExitJobServer Fix SIGINT handler for StopOnExitJobServer Apr 26, 2025
@shunping shunping self-assigned this Apr 26, 2025
@shunping shunping requested a review from lostluck April 26, 2025 02:19
Contributor

@lostluck lostluck left a comment

That definitely explains the exception message. Good find!

@shunping
Collaborator Author

Run Python_Dataframes PreCommit 3.11

@codecov

codecov bot commented Apr 26, 2025

Codecov Report

Attention: Patch coverage is 50.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 56.50%. Comparing base (8cd0a29) to head (530fab6).
Report is 12 commits behind head on master.

Files with missing lines Patch % Lines
...thon/apache_beam/runners/portability/job_server.py 50.00% 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #34748      +/-   ##
============================================
+ Coverage     56.48%   56.50%   +0.02%     
  Complexity     3289     3289              
============================================
  Files          1178     1180       +2     
  Lines        180978   181262     +284     
  Branches       3399     3399              
============================================
+ Hits         102220   102419     +199     
- Misses        75499    75584      +85     
  Partials       3259     3259              
Flag Coverage Δ
python 81.26% <50.00%> (-0.05%) ⬇️

@lostluck
Contributor

Does this help with #33623 in combination with the Singleton experiment?

@github-actions
Contributor

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @claudevdm for label python.

@shunping
Collaborator Author

Run Python_Runners PreCommit 3.9

@shunping
Collaborator Author

shunping commented Apr 26, 2025

> Does this help with #33623 in combination with the Singleton experiment?

Yes, several recent changes address #33623.

'PrismJobServer' is wrapped in 'StopOnExitJobServer', which is then made into a singleton. After this PR, when SIGINT is triggered (e.g., by pressing the stop button in Colab or Ctrl+C in a terminal), the prism server stops gracefully instead of being left defunct.

The 'PrismJobServer' singleton, on the other hand, remains valid (though no prism subprocess is running at that point), so the subprocess can be started again if another pipeline is run in the same Colab.
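
As a rough illustration of that lifecycle (not Beam's implementation), the sketch below uses a hypothetical `RestartableServer` class with a placeholder child process standing in for prism. The wrapper object stays alive as a singleton after `stop()`, and a later `start()` launches a fresh subprocess.

```python
import subprocess
import sys
import threading


class RestartableServer:
    """Hypothetical singleton wrapper around a server subprocess."""

    _instance = None
    _instance_lock = threading.Lock()

    def __init__(self):
        self._lock = threading.Lock()
        self._process = None

    @classmethod
    def instance(cls):
        # Always hand back the same wrapper object.
        with cls._instance_lock:
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance

    def start(self):
        with self._lock:
            # Launch a child only if none is currently alive.
            if self._process is None or self._process.poll() is not None:
                # Placeholder command; a real server binary would go here.
                self._process = subprocess.Popen(
                    [sys.executable, "-c", "import time; time.sleep(60)"])
            return self._process.pid

    def stop(self):
        with self._lock:
            if self._process is not None:
                self._process.terminate()
                # Wait for the child so it is reaped rather than left defunct.
                self._process.wait()
                self._process = None

    def is_running(self):
        with self._lock:
            return self._process is not None and self._process.poll() is None
```

Calling `RestartableServer.instance().start()` again after a `stop()` reuses the same wrapper object and starts a new child process.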

The current behavior is the same in both Colab and a terminal. A possible future improvement is to keep the prism server alive on SIGINT in an interactive environment and, as @lostluck suggested in another thread, enforce an idle timeout.

@lostluck
Copy link
Contributor

Fantastic!

@lostluck lostluck merged commit 930d14a into apache:master Apr 26, 2025
93 of 94 checks passed