Prevent task spawning during unload #1387
Conversation
Force-pushed from cbcc3b2 to 885b6d2.
Sounds good; I'm skimming, but this sounds overall reasonable! It's not the best time for me to properly test just now, but I'm hoping it shouldn't be too hard once I get a good opportunity.
I am not reproducing the issue, even with … Still, I think the description of the change sounds like a good thing, so I am likely to approve if it has solved the issue for you.
Some CI runs show as … Would it be worth rebasing this on …? (EDIT: To be more explicit about my meaning: I am hoping to rebase to get CI runs going, if you don't mind as the PR author. Thank you for considering. EDIT 2: the merge commit works as well, thank you for that!)
Yeah, I think it's just as likely to be a race condition that could've theoretically happened at any time — but which is now happening consistently for me; I can't explain why. If you want to try to dig into it: ultimately, I think this is the line that is triggering the spawning of a task. You could open dev tools, go to the Sources tab, press Cmd-O (with focus in the dev tools), and type … Wouldn't blame you if you didn't care that much, though!
Merge commit: …spawning-during-unload
I merged instead of rebased (muscle memory), but the PR is updated. Let's see how CI likes it.
That one editor test failure is a consistent timeout across PRs, and IIRC it also happens on …
For what it's worth: I didn't hit that breakpoint when I Quit the Pulsar app on the latest Rolling (1.130.2026010721).
HUH. Sorry for comment spam, but I just got one of the zombie processes... (Not with this PR's binary.) Hmmmmmm.
Iiiiinteresting.
Code changes look okay to me.
A fair proportion of the diff consists of fairly trivial refactors: linter stuff, moving lines lower down the file as necessary to keep them outside of an if/else...
I did manage to reproduce the mentioned issue a cluster of times on a Rolling binary, and as far as I can remember I never saw it on this PR's CI-artifact binary...
I did go through the PR body's suggested test steps ("launch/wait 10 seconds/quit") with this PR's binary 10+ times in a row and didn't reproduce the issue... and as a fairness check I also re-tested a Rolling binary in the same manner 10+ times and couldn't reproduce the issue again... 🤷
What does it all mean? /jk
Anyhow, since I never repro'd the issue with this change, and the premise seems okay, and the code seems okay, and the resulting bins work and have no additional issues in CI... LGTM!
Since I started using the 1.131.0 release candidate (the most recent rolling release), I've been consistently having an issue where a “zombie” process named “Pulsar Helper (Renderer)” is left running after Pulsar quits. In github#44 I speculated that it might be related to the worker-process strategy I was seeing in the `github` package. Luckily, it's simpler than that!

The `file-icons` package, of all things, was the source of my woes. We can argue over whether that package is overengineered… but, well, it's certainly engineered. It has many strategies for discerning the “correct” icon to use for a given kind of file; one such strategy involves analyzing a file for modelines, since they're common in `vim` for setting file metadata.

Somewhere in that strategy is some code that gets run during package deactivation. It seems to inadvertently schedule a task — via the Task API — to do something in the background. This is a one-off task that would quickly have returned — but if the package is being deactivated because the editor itself is quitting, then it's scheduling a task to run just as the environment is being torn down. The task then runs in a state of isolated torture — unable to send messages back to the parent process, since that process has been terminated. (It doesn't exit the way you might expect; instead it quickly starts using 100% of CPU, perhaps caught in some sort of exception-handling loop.)
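For illustration, here is a rough sketch of the kind of pattern described above. This is not `file-icons`' actual code; the worker path, file argument, and callback are all made up:

```js
const { Task } = require('atom');

module.exports = {
  activate() {
    // ...
  },

  deactivate() {
    // Scheduling a one-off background task during deactivation. If the
    // package is being deactivated because the editor is quitting, this
    // forks a child process just as the environment is being torn down,
    // and the worker can never report back to a parent that is gone.
    Task.once(require.resolve('./scan-for-modelines'), '/some/file.txt', () => {
      // Even in the best case, the package is already deactivated by the
      // time this completion callback runs.
    });
  }
};
```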
Annoyingly, we don't have great tools to prevent this scenario more comprehensively. Ideally we'd be able to tell when a worker script is caught in this hell and automatically exit it instead of having it get stuck like this… but that sort of thing is largely under the control of the package author. We can probably do more at the margins here, but there's one simple fix we can do now:
We already know when we're about to unload the environment. Once we get past any applicable confirmation dialogs, and the editor-unloading process is past the point of no return, we set `atom.unloading` to `true`, and then immediately trigger package deactivation for all active packages.

So if a package wants to schedule a `Task` when `atom.unloading` is `true`, we can ~~tell it to get bent~~ silently refuse to spawn a child process. This should have no ill effects, because the environment is fated to be destroyed anyway — so nothing would be around to react to the newly spawned task, even if it were spawned for a legitimate reason. (And even that is arguable; package deactivation cannot go async, so the best a package can do during deactivation is schedule some work. By definition it'll already be deactivated by the time that async work is actually performed, so it won't be able to respond to the output of that work.)

This was basically a two-line change; the rest of it is just adding new guards for existing code that assumed `this.childProcess` would always be present.

I must admit that I have no idea why this is suddenly an issue with Electron 30 when it wasn't with Electron 12. But file this under “this code should never have worked” — no reason not to fix it in the present!
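To make that concrete, here is a minimal sketch of the shape of those guards, assuming the spawn logic lives in `Task`'s start path. This is an illustration of the idea, not the actual diff:

```js
const { fork } = require('child_process');

class Task {
  constructor(taskPath) {
    this.taskPath = taskPath;
  }

  start(...args) {
    // The new guard: if the editor is past the point of no return,
    // silently refuse to spawn a child process at all.
    if (atom.unloading) return;

    this.childProcess = fork(this.taskPath);
    // ...existing handshake and event wiring, passing `args` along...
  }

  send(message) {
    // Defensive guard: existing code assumed `this.childProcess` would
    // always be present; that's no longer true once spawning can be refused.
    if (!this.childProcess) return;
    this.childProcess.send(message);
  }

  terminate() {
    if (!this.childProcess) return false;
    this.childProcess.kill();
    return true;
  }
}
```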
Testing
No great way to test this in the specs, since any attempt to replicate the conditions of unloading would actually destroy the environment under which the tests are running. I'd be curious to see if anyone else can replicate the zombie process issue; it's not hard for me to reproduce on macOS on a project of sufficient heft, such as the Pulsar codebase itself.
All you'd need to do is install the `file-icons` package, then relaunch Pulsar, and then quit it after about ten seconds. Then monitor Task Manager (or Activity Monitor, or `top`) and keep an eye out for a zombie Pulsar process.

Failing that, I think it's pretty easy to ascertain that this is a low-risk change, but let me know if you disagree.
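If you'd rather not eyeball Activity Monitor, here's a hypothetical quick check you could run from a terminal after quitting Pulsar (macOS/Linux only; the process name is the one observed above):

```js
// check-zombies.js: run with `node check-zombies.js` after quitting Pulsar.
const { execSync } = require('child_process');

const lines = execSync('ps ax').toString().split('\n');
const zombies = lines.filter((line) => line.includes('Pulsar Helper (Renderer)'));

console.log(
  zombies.length > 0
    ? `Possible zombie process(es):\n${zombies.join('\n')}`
    : 'No zombie Pulsar processes found.'
);
```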