[LIVY-786] Interrupt interpreter threads when cancelling statements #495

Open
ArnavBalyan wants to merge 5 commits into apache:master from ArnavBalyan:arnavb/fix-cancel-issue

ArnavBalyan (Member) commented Nov 24, 2025

What changes were proposed in this pull request?

  • Currently, the cancel API doesn't cancel code running directly on the Spark driver
  • Track the interpreter thread for each running statement and interrupt it during cancel
  • Clean up the thread references and reset the interrupt flag after statement completion
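
For readers skimming the thread, the three bullets above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this patch: `CancelSketch`, `execute`, `cancel`, and `isTracked` are hypothetical names, and only the `statementThreads` map name is taken from the review summary further down.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in for the REPL session; not Livy's actual Session class.
object CancelSketch {
  // Maps a statement id to the thread currently executing it.
  private val statementThreads = new ConcurrentHashMap[Int, Thread]()

  /** Runs `body` on the calling thread.
    * Returns true if it completed, false if it was interrupted. */
  def execute(statementId: Int)(body: => Unit): Boolean = {
    statementThreads.put(statementId, Thread.currentThread())
    try {
      body
      true
    } catch {
      case _: InterruptedException => false
    } finally {
      // Drop the reference so a later cancel cannot hit an unrelated statement.
      statementThreads.remove(statementId)
      // Thread.interrupted() returns the interrupt flag and clears it.
      Thread.interrupted()
    }
  }

  /** Interrupts the thread running `statementId`, if one is registered. */
  def cancel(statementId: Int): Unit =
    Option(statementThreads.get(statementId)).foreach(_.interrupt())

  /** True while a thread is registered for `statementId`. */
  def isTracked(statementId: Int): Boolean =
    statementThreads.containsKey(statementId)
}
```

In this sketch, calling `CancelSketch.cancel(1)` from another thread while `CancelSketch.execute(1) { Thread.sleep(60000) }` is running makes `execute` return `false`, and the map entry is gone afterwards.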

How was this patch tested?

  • Unit test added

Closes LIVY-786

ArnavBalyan force-pushed the arnavb/fix-cancel-issue branch from ffd922a to 922215d on November 25, 2025 09:41

codecov-commenter commented Dec 2, 2025

Codecov Report

❌ Patch coverage is 0% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 54.35%. Comparing base (0efa2d7) to head (42aec74).
⚠️ Report is 4 commits behind head on master.

Files with missing lines | Patch % | Lines
.../src/main/scala/org/apache/livy/repl/Session.scala | 0.00% | 15 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##             master     #495       +/-   ##
=============================================
- Coverage     68.38%   54.35%   -14.03%     
+ Complexity     1199      863      -336     
=============================================
  Files           106      106               
  Lines          6711     6717        +6     
  Branches        831      831               
=============================================
- Hits           4589     3651      -938     
- Misses         1657     2621      +964     
+ Partials        465      445       -20     

ArnavBalyan (Member, Author) commented

cc @gyogal @lmccay, could you please take a look when possible? Thanks! :)

Copilot AI left a comment

Pull request overview

This pull request enhances the statement cancellation mechanism in Apache Livy by adding thread interruption capabilities to cancel driver code that doesn't involve Spark jobs. Previously, the cancel API only cancelled Spark jobs but couldn't interrupt code running directly on the driver thread (e.g., Thread.sleep()).

Key Changes:

  • Added thread tracking using a ConcurrentHashMap to map statement IDs to their executing threads
  • Implemented thread interruption during statement cancellation to terminate non-Spark driver code
  • Added cleanup logic to remove thread references and reset interrupt flags after statement completion
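
As background for the last bullet, here is a small sketch of why a leftover interrupt flag matters when the same thread goes on to run more blocking code. It is a generic JVM illustration under that assumption, not code from this PR, and `InterruptFlagSketch` and `secondStepSurvives` are hypothetical names.

```scala
import java.util.concurrent.atomic.AtomicBoolean

object InterruptFlagSketch {
  /** Runs two steps back to back on one thread. Step 1 ends with a pending
    * interrupt; `reset` decides whether the flag is cleared before step 2.
    * Returns true if step 2's blocking call ran without being interrupted. */
  def secondStepSurvives(reset: Boolean): Boolean = {
    val survived = new AtomicBoolean(false)
    val worker = new Thread(new Runnable {
      def run(): Unit = {
        // Step 1: simulate a cancel that arrives late and only sets the flag.
        Thread.currentThread().interrupt()
        // Thread.interrupted() returns the flag and clears it.
        if (reset) Thread.interrupted()
        // Step 2: an unrelated blocking call on the same thread.
        try {
          Thread.sleep(10)
          survived.set(true)
        } catch {
          case _: InterruptedException => survived.set(false)
        }
      }
    })
    worker.start()
    worker.join()
    survived.get()
  }
}
```

With `reset = false` the second step's `Thread.sleep` throws `InterruptedException` immediately, even though nobody asked to cancel it; with `reset = true` it runs normally.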

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File | Description
repl/src/main/scala/org/apache/livy/repl/Session.scala | Added statementThreads map to track interpreter threads, implemented thread interruption in cancel() method, and added cleanup in the statement execution finally block
repl/src/test/scala/org/apache/livy/repl/SparkSessionSpec.scala | Added test case to verify cancellation of driver code without Spark jobs, including validation that interrupted code doesn't complete and variables aren't defined

ArnavBalyan force-pushed the arnavb/fix-cancel-issue branch from bd4750c to 4f99fe5 on December 13, 2025 10:11

ArnavBalyan (Member, Author) commented

Addressed bot comments where applicable

ArnavBalyan (Member, Author) commented

Hi @lmccay, just a gentle bump: could you please take a look? Thanks!

gyogal (Contributor) commented Jan 26, 2026

There is an older PR for this issue, and one of the comments on it may be relevant here as well: #307 (review). I haven't tested it myself.

lmccay (Contributor) left a comment

I've added a couple of questions for clarification. Once they are answered, please update the description with the answers so the intent of this change is clear.
I'm also not very familiar with this particular code and would be more comfortable with another reviewer approving as well.

info(s"Failed to cancel statement $statementId.")
statement.compareAndTransit(StatementState.Cancelling, StatementState.Cancelled)
} else {
Option(statementThreads.get(statementId)).foreach(_.interrupt())

@ArnavBalyan - Can you please verify that this change is only intended to interrupt interruptible operations such as sleep, Object.wait, etc.? As @gyogal mentioned, this will not interrupt threads that are actually doing long-running work.

I am also curious about the fact that this is being called within the while loop. Are we waiting for the state to be successfully set to Cancelled instead of still Cancelling? Then, upon failure, we set it to Cancelled after timing out, so it will not be interrupted again?
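
To make the first question concrete, here is a small standalone sketch of standard JVM interrupt behaviour. It is not code from this PR, and `InterruptibilitySketch` and its members are hypothetical names. A blocking call such as `Thread.sleep` reacts to `interrupt()`, a CPU-bound loop that never checks the flag does not, and a loop that polls `isInterrupted` stops cooperatively.

```scala
object InterruptibilitySketch {
  /** Starts `body` on a new thread, interrupts it right away, and reports
    * whether the thread finished within `waitMs` milliseconds. */
  def stopsWhenInterrupted(waitMs: Long)(body: => Unit): Boolean = {
    val worker = new Thread(new Runnable {
      def run(): Unit = try body catch { case _: InterruptedException => () }
    })
    worker.setDaemon(true) // do not keep the JVM alive for a spinning thread
    worker.start()
    worker.interrupt()
    worker.join(waitMs)
    !worker.isAlive
  }

  /** Blocking call: throws InterruptedException when interrupted. */
  def blocking(): Unit = Thread.sleep(60000)

  /** CPU-bound loop that spins for `runMs` and never looks at the flag. */
  def busyUnchecked(runMs: Long): Unit = {
    val deadline = System.nanoTime() + runMs * 1000000L
    var n = 0L
    while (System.nanoTime() < deadline) n += 1
  }

  /** CPU-bound loop that polls the interrupt flag and exits once it is set. */
  def busyCooperative(): Unit = {
    var n = 0L
    while (!Thread.currentThread().isInterrupted) n += 1
  }
}
```

Under these assumptions, `stopsWhenInterrupted(2000)(blocking())` and `stopsWhenInterrupted(2000)(busyCooperative())` report that the thread stopped, while `stopsWhenInterrupted(200)(busyUnchecked(3000))` reports that it is still running, which matches the limitation @gyogal pointed to.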

4 participants