
fix: handle bdb shutdown interrupts #659

Merged: ato merged 2 commits into master from adam/handle-bdb-shutdown-interrupts on Jun 12, 2025


@adam-miller
Contributor

When CrawlController.requestCrawlStop() is triggered multiple times, Thread.interrupt() is called on each of the toe threads every time. If BDB is accessed while the interrupted status flag is set, the environment is invalidated and all subsequent interactions will fail. This error only occurs with specific timing, but you can trigger it by repeatedly hitting 'Terminate' in the UI.

The BdbFrontier avoids this with a lock that is triggered during shutdown, leaving only a few interactions that occur during shutdown. Preceding these calls with Thread.interrupted() clears the flag and allows the environment to survive the shutdown. A more robust solution may be to move all BDB interactions into a separate thread, but I'm unable to replicate the issue with these checks in place.
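As a rough illustration of the flag-clearing approach described above, here is a minimal sketch. `interruptSensitiveRead()` is a hypothetical stand-in for a BDB access, not a real BDB or Heritrix call; it simply fails if the calling thread's interrupt flag is set. The only real API relied on is `Thread.interrupted()`, which returns the current thread's interrupt status and clears it.

```java
class InterruptGuardDemo {
    // Hypothetical stand-in for a BDB call (assumption, not a real API):
    // it fails when the calling thread's interrupt flag is set.
    static String interruptSensitiveRead() {
        if (Thread.currentThread().isInterrupted()) {
            throw new IllegalStateException("environment invalidated");
        }
        return "ok";
    }

    // Clear the interrupt flag immediately before the sensitive call.
    static String guardedRead() {
        Thread.interrupted(); // returns the flag's value AND resets it to false
        return interruptSensitiveRead();
    }

    public static void main(String[] args) {
        // Simulate a stop request being issued twice.
        Thread.currentThread().interrupt();
        Thread.currentThread().interrupt();
        System.out.println(guardedRead());                          // prints "ok"
        System.out.println(Thread.currentThread().isInterrupted()); // prints "false"
    }
}
```

Note that this discards the interrupt rather than deferring it, and there is still a small window between clearing the flag and the call completing in which a new interrupt could arrive, so it narrows the race rather than eliminating it.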

@ato
Collaborator

ato commented Jun 12, 2025

Tricky. Even if we modify BDB, it looks like it would be hard to handle interrupt() fully safely because according to the javadoc:

If this thread is blocked in an I/O operation upon an InterruptibleChannel then the channel will be closed,

So I guess that leaves trying to mark certain code sections as not safe to be interrupted, either with a lock or by moving them to special threads. However, the whole point of using interrupt is that the normal shutdown got stuck and you're trying to force it, so disallowing interrupts entirely for BDB doesn't seem ideal either, because BDB might be the thing that's stuck.
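The javadoc behaviour quoted above can be observed with a plain `FileChannel`, which is an `InterruptibleChannel`. The sketch below is unrelated to BDB's own internals; the class and method names are made up for illustration. It sets the interrupt flag, attempts a read, and reports whether the channel ended up closed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class InterruptedChannelDemo {
    /** Returns true if reading with the interrupt flag set closed the channel. */
    static boolean interruptClosesChannel() {
        try {
            Path file = Files.createTempFile("interrupt-demo", ".bin");
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                Thread.currentThread().interrupt(); // flag is set before the I/O call
                try {
                    channel.read(ByteBuffer.allocate(16));
                    return false; // read completed normally, channel survived
                } catch (ClosedByInterruptException e) {
                    return !channel.isOpen(); // channel was closed as a side effect
                }
            } finally {
                Thread.interrupted(); // clear the flag so cleanup is unaffected
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("channel closed by interrupt: " + interruptClosesChannel());
    }
}
```

Once the channel is closed it cannot be reopened, which is why an interrupt that lands during storage I/O is hard to recover from in place.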

I think we've hit this with PANDAS (our scheduling tool) inadvertently stopping a job twice in some cases too. I'd be tempted to also change requestCrawlStop() so that calling it twice doesn't interrupt, and instead have a separate API or option for force stop, with the UI showing a different button for it after the normal stop has been issued.
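One possible shape for that split, as a sketch only: `StopController`, `requestStop()` and `forceStop()` are hypothetical names, not the existing CrawlController API. The graceful path is idempotent and never interrupts; only the explicit force path calls `Thread.interrupt()`.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical controller, not Heritrix code.
class StopController {
    private final List<Thread> workers;
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);

    StopController(List<Thread> workers) {
        this.workers = List.copyOf(workers);
    }

    /** Graceful stop: safe to call repeatedly, never interrupts workers.
     *  Returns true only for the call that actually initiated the stop. */
    boolean requestStop() {
        return stopRequested.compareAndSet(false, true);
    }

    /** Force stop: the only path that interrupts worker threads. */
    void forceStop() {
        stopRequested.set(true);
        for (Thread worker : workers) {
            worker.interrupt();
        }
    }

    /** Workers poll this flag to wind down cooperatively. */
    boolean isStopRequested() {
        return stopRequested.get();
    }
}
```

A UI could use the return value of `requestStop()` to decide when to swap the normal stop button for a force-stop one.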

@ato merged commit 5601453 into master Jun 12, 2025
7 checks passed
@ato deleted the adam/handle-bdb-shutdown-interrupts branch June 12, 2025 01:49
ato added a commit that referenced this pull request Jun 12, 2025