Move task execution into a daemon#263

Open
arcanis wants to merge 10 commits into main from mael/daemon-tasks

Conversation

@arcanis (Member) commented Mar 4, 2026

This PR adds support for long-running tasks, currently annotated with a @long-running attribute.

To support that, task execution has been moved inside a daemon process managed by Yarn Switch. The core logic still lives inside Yarn (not Yarn Switch), with Yarn Switch merely responsible for keeping records of which daemons are in use in which projects.

The daemons are currently accessible through unauthenticated WebSockets listening on localhost. This is somewhat insecure in a multi-user context; authentication should be implemented in a follow-up.

@greptile-apps bot commented Mar 4, 2026

Confidence Score: 1/5

  • Not safe to merge — two Windows build-breaking issues and a daemon-kill orphan bug need to be fixed first.
  • The PR introduces substantial new functionality and the overall daemon architecture is sound, but there are two concrete compilation errors (an unconditional Unix-only import in coordinator.rs, and a missing winapi dependency for the Windows cfg blocks in daemons.rs) that will prevent builds on Windows entirely.
  • There is also a behavioral bug where killing the daemon leaves spawned task children running as orphans, which directly contradicts the intent of the switch daemon --kill command, plus unbounded memory growth in the output buffer for long-running daemons.
  • The security concern (unauthenticated WebSocket) is acknowledged and intentionally deferred, which is acceptable for an initial implementation, but the build blockers and orphan-process bug are showstoppers.
  • packages/zpm/src/daemon/coordinator.rs (Unix-only import), packages/zpm-switch/src/daemons.rs (missing winapi dependency), packages/zpm-switch/src/commands/switch/daemon_kill.rs (orphaned task children on kill), packages/zpm/src/daemon/client.rs (SIGKILL on process group), taskfile (debug entries)

Important Files Changed

  • packages/zpm/src/daemon/coordinator.rs: New coordinator file that manages daemon execution. Contains two critical issues: (1) the unconditional import of std::os::unix::fs::MetadataExt at line 3 will fail to compile on Windows, and (2) unbounded memory growth in the output_buffer HashMap, which accumulates one entry per task ID for the lifetime of the daemon without ever removing entries.
  • packages/zpm-switch/src/daemons.rs: Daemon registry with process lifecycle management. The Windows code (lines 100-153) references winapi crate types, but winapi is not declared as a dependency in Cargo.toml, causing a build failure on Windows.
  • packages/zpm/src/daemon/client.rs: Daemon client with WebSocket-based IPC. StandaloneDaemonHandle::kill() (lines 41-49) uses kill -9 -{pid}, which sends SIGKILL to the entire process group, preventing graceful cleanup of the daemon and its task children.
  • packages/zpm-switch/src/commands/switch/daemon_kill.rs: Daemon kill command. Calls daemons::kill_process, which sends SIGTERM only to the daemon process itself. Task subprocesses spawned by the daemon are not signaled and continue running as orphans, defeating the purpose of the kill command.
  • packages/zpm/src/commands/tasks/push.rs: Task push command reworked to use the daemon WebSocket client. Contains a confusing 0 << 8 expression at line 66 in the success exit-code path that should be simplified to 0.
  • taskfile: Root project taskfile. Contains several debug/test tasks (bar, bar2, x, producer, foo) that appear to be development scratch entries added while testing the new daemon functionality. They serve no project-level purpose and should be removed before merging, or moved to test fixtures if needed by acceptance tests.

Last reviewed commit: 39ef683

@@ -0,0 +1,344 @@
use std::collections::{HashMap, HashSet};
use std::io::Write;
use std::os::unix::fs::MetadataExt;

Unix-only import used unconditionally

std::os::unix::fs::MetadataExt is a Unix-only trait imported unconditionally. This will cause a compile error on Windows. The import and all code that calls .ino() (lines ~45–47) must be gated behind #[cfg(unix)], with the inode-watching block either disabled or replaced with a no-op on non-Unix platforms.
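The requested gating could be sketched roughly like this (the `file_identity` helper name is hypothetical; the actual coordinator wraps `.ino()` in its own watching logic):

```rust
use std::fs;
use std::path::Path;

// Unix-only: MetadataExt provides .ino(); gate both the import and the
// call site behind #[cfg(unix)] so non-Unix targets still compile.
#[cfg(unix)]
fn file_identity(path: &Path) -> Option<u64> {
    use std::os::unix::fs::MetadataExt;
    fs::metadata(path).ok().map(|m| m.ino())
}

// No-op fallback: without inodes there is nothing to compare, so the
// inode-watching block is effectively disabled on these platforms.
#[cfg(not(unix))]
fn file_identity(_path: &Path) -> Option<u64> {
    None
}

fn main() {
    let path = std::env::temp_dir().join("identity-demo.txt");
    fs::write(&path, "contents").unwrap();
    // Prints Some(<inode>) on Unix, None elsewhere.
    println!("{:?}", file_identity(&path));
    let _ = fs::remove_file(&path);
}
```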


Comment on lines +58 to +107
let output_buffer: OutputBuffer
= Arc::new(RwLock::new(HashMap::new()));

let subscription_registry
= Arc::new(SubscriptionRegistry::new());

let long_lived_registry
= Arc::new(LongLivedRegistry::new());

let scheduler_for_loop
= scheduler.clone();

let (loop_event_tx, mut loop_event_rx)
= mpsc::unbounded_channel::<ExecutorEvent>();

let subscription_registry_for_loop
= subscription_registry.clone();

let subscription_registry_for_events
= subscription_registry.clone();

let output_buffer_for_events
= output_buffer.clone();

let long_lived_registry_for_events
= long_lived_registry.clone();

let scheduler_for_events
= scheduler.clone();

tokio::spawn(async move {
while let Some(event) = loop_event_rx.recv().await {
if let ExecutorEvent::Output { task_id, line, stream } = &event {
if let Ok(mut buffer) = output_buffer_for_events.write() {
let lines: &mut Vec<BufferedOutputLine>
= buffer
.entry(task_id.to_string())
.or_insert_with(Vec::new);

lines.push(BufferedOutputLine {
line: line.to_string(),
stream: stream.as_str().to_string(),
});

if lines.len() > OUTPUT_BUFFER_MAX_LINES {
let excess
= lines.len() - OUTPUT_BUFFER_MAX_LINES;

lines.drain(0..excess);
}

Unbounded memory growth in output buffer

The output_buffer HashMap (created at line 58–59) accumulates an entry for every task ID that ever runs in this daemon session. While the per-task line count is capped at OUTPUT_BUFFER_MAX_LINES (1000 lines), the number of task entries in the HashMap is never pruned. For a long-running daemon that processes thousands of short-lived tasks, this will steadily grow the resident memory of the daemon process. Entries for completed tasks (particularly non-long-lived ones whose output has already been retrieved by the client) should be removed once they are no longer needed.
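A minimal sketch of the pruning strategy, using hypothetical names (`OutputBuffer`, `take_completed`) rather than the PR's actual types: cap each task's line list as the PR already does, and additionally remove a task's entry once its output has been delivered.

```rust
use std::collections::HashMap;

const OUTPUT_BUFFER_MAX_LINES: usize = 1000;

// Two bounds: each task's line list is capped, and a completed task's
// entry is dropped so the map does not grow for the daemon's lifetime.
struct OutputBuffer {
    lines: HashMap<String, Vec<String>>,
}

impl OutputBuffer {
    fn new() -> Self {
        Self { lines: HashMap::new() }
    }

    fn push(&mut self, task_id: &str, line: String) {
        let buf = self.lines.entry(task_id.to_string()).or_default();
        buf.push(line);
        if buf.len() > OUTPUT_BUFFER_MAX_LINES {
            let excess = buf.len() - OUTPUT_BUFFER_MAX_LINES;
            buf.drain(0..excess);
        }
    }

    // Called when a non-long-lived task completes and its output has
    // been retrieved by the client: the entry is removed entirely.
    fn take_completed(&mut self, task_id: &str) -> Vec<String> {
        self.lines.remove(task_id).unwrap_or_default()
    }
}

fn main() {
    let mut buf = OutputBuffer::new();
    buf.push("t1", "hello".to_string());
    let out = buf.take_completed("t1");
    println!("drained {} lines, {} entries left", out.len(), buf.lines.len());
}
```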


Comment on lines +41 to +49
pub fn kill(&self) {
#[cfg(unix)]
{
let _ = std::process::Command::new("kill")
.arg("-9")
.arg(format!("-{}", self.pid))
.status();
}
}

SIGKILL on entire process group prevents graceful cleanup

kill -9 -{pid} sends SIGKILL to every process in the daemon's process group. Because SIGKILL cannot be caught or ignored, neither the daemon nor any of its running task children will have a chance to flush buffers, clean up temporary files, or release resources. In addition, if any task subprocess moves itself to a different process group, it will survive this kill.

For the standalone case it may be acceptable to be forceful, but using SIGTERM first (with a timeout and SIGKILL as a fallback) would be safer and more consistent with the SIGTERM used elsewhere in the codebase.
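The TERM-then-KILL escalation could look roughly like this, shelling out to `kill` as the existing handle does (the `graceful_kill` name, grace period, and polling interval are illustrative, not from the PR):

```rust
use std::process::Command;
use std::time::{Duration, Instant};

// Hypothetical escalation helper: SIGTERM first, SIGKILL only if the
// process is still around once the grace period expires.
fn graceful_kill(pid: u32, grace: Duration) {
    let _ = Command::new("kill").args(["-TERM", &pid.to_string()]).status();
    let deadline = Instant::now() + grace;
    while Instant::now() < deadline {
        // `kill -0` probes for existence without delivering a signal.
        // Caveat: a zombie (exited but not yet reaped) still "exists".
        let alive = Command::new("kill")
            .args(["-0", &pid.to_string()])
            .status()
            .map(|s| s.success())
            .unwrap_or(false);
        if !alive {
            return; // exited after SIGTERM; no SIGKILL needed
        }
        std::thread::sleep(Duration::from_millis(50));
    }
    let _ = Command::new("kill").args(["-KILL", &pid.to_string()]).status();
}

fn main() {
    // Demo target: a process that would otherwise run for 30 seconds.
    let mut child = Command::new("sleep").arg("30").spawn().unwrap();
    graceful_kill(child.id(), Duration::from_millis(500));
    let status = child.wait().unwrap();
    println!("terminated, success = {}", status.success());
}
```

The same escalation could also target the process group (negative PID) once group-aware shutdown is in place.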


Comment on lines +100 to +153
#[cfg(windows)]
{
use std::ptr::null_mut;
unsafe {
let handle = winapi::um::processthreadsapi::OpenProcess(
winapi::um::winnt::PROCESS_QUERY_LIMITED_INFORMATION,
0,
pid,
);
if handle.is_null() {
false
} else {
winapi::um::handleapi::CloseHandle(handle);
true
}
}
}

#[cfg(not(any(unix, windows)))]
{
true
}
}

pub fn kill_process(pid: u32) -> bool {
#[cfg(unix)]
{
unsafe { libc::kill(pid as i32, libc::SIGTERM) == 0 }
}

#[cfg(windows)]
{
use std::ptr::null_mut;
unsafe {
let handle = winapi::um::processthreadsapi::OpenProcess(
winapi::um::winnt::PROCESS_TERMINATE,
0,
pid,
);
if handle.is_null() {
false
} else {
let result = winapi::um::processthreadsapi::TerminateProcess(handle, 1) != 0;
winapi::um::handleapi::CloseHandle(handle);
result
}
}
}

#[cfg(not(any(unix, windows)))]
{
false
}
}

winapi crate referenced but not declared as a dependency

Both is_process_alive and kill_process have #[cfg(windows)] branches that reference winapi::um::processthreadsapi, winapi::um::winnt, and winapi::um::handleapi. However, winapi does not appear in the Cargo.toml for zpm-switch (nor in the workspace Cargo.toml changes in this PR). This will produce a compile error on Windows. You need to add:

```toml
[target.'cfg(windows)'.dependencies]
winapi = { version = "0.3", features = ["processthreadsapi", "winnt", "handleapi"] }
```

client.push_tasks(task_subscriptions, parent_task_id, None, None).await?;

Ok(ExitStatus::from_raw(0))
Ok(ExitStatus::from_raw(0 << 8))

0 << 8 is always 0 — confusing no-op

ExitStatus::from_raw(0 << 8) evaluates identically to ExitStatus::from_raw(0). The << 8 shift pattern is used elsewhere to encode non-zero exit codes in the wait-status format, but shifting zero is a no-op and reads as if the author forgot to put a real value in.

Suggested change
Ok(ExitStatus::from_raw(0 << 8))
Ok(ExitStatus::from_raw(0))

Comment on lines +1 to +22
bar:
sleep 5

bar2:
sleep 10

x:
python3 -c "import time; print(f'ts:{int(time.time()*1000)}:line1')"
sleep 1
python3 -c "import time; print(f'ts:{int(time.time()*1000)}:line2')"
sleep 1
python3 -c "import time; print(f'ts:{int(time.time()*1000)}:line3')"

producer:
for x in {1..10}; do
echo "producer: $x"
sleep 1
done

foo: bar& bar2&
echo "foo"


Debug/test tasks left in the repository root taskfile

The tasks bar, bar2, x, producer, and foo appear to be development scratch entries added to test the new daemon functionality. They don't appear to serve any project-level purpose and should be removed before merging, or moved to a test fixture if they are needed for acceptance tests.


Comment on lines +21 to +54
pub async fn execute(&self) -> Result<(), Error> {
let project_cwd = get_final_cwd()?;

let find_result = find_closest_package_manager(&project_cwd)?;

let detected_root = find_result
.detected_root_path
.ok_or(Error::NoProjectFound)?;

let Some(daemon) = daemons::get_daemon(&detected_root)? else {
println!(
"{} No daemon registered for this project",
DataType::Info.colorize("ℹ")
);
return Ok(());
};

if !daemons::is_process_alive(daemon.pid) {
daemons::unregister_daemon(&detected_root)?;
println!(
"{} Daemon was not running (cleaned up stale entry)",
DataType::Info.colorize("ℹ")
);
return Ok(());
}

if daemons::kill_process(daemon.pid) {
daemons::unregister_daemon(&detected_root)?;
println!(
"{} Stopped daemon for {} (PID: {})",
DataType::Success.colorize("✓"),
detected_root.to_print_string(),
daemon.pid
);

Killing the daemon does not terminate its running task children

daemons::kill_process sends SIGTERM only to the daemon process itself (the yarn debug daemon binary). All task subprocesses that the daemon has spawned are in the same session but may be in their own process groups. When the daemon receives SIGTERM it will exit — but because nothing in the daemon's signal handling path terminates the child processes, those tasks continue running as orphans.

This means switch daemon --kill can leave long-running tasks (e.g. @long-lived dev servers) silently running in the background after the user believes they have been stopped. The daemon should either propagate the signal to its children on shutdown, or the kill command should enumerate and terminate task children before sending SIGTERM to the daemon.
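One way to sketch the group-signal approach (the `kill_daemon_group` helper is hypothetical; this assumes the daemon is started in its own process group so task children inherit it, which the PR does not currently do):

```rust
use std::os::unix::process::CommandExt; // for process_group (Rust 1.64+)
use std::process::Command;

// Signal the whole process group: a negative PID after `--` tells
// `kill` to deliver SIGTERM to every member of the group, so the
// daemon and its spawned task children are terminated together.
fn kill_daemon_group(pgid: u32) -> bool {
    Command::new("kill")
        .args(["-s", "TERM", "--", &format!("-{pgid}")])
        .status()
        .map(|s| s.success())
        .unwrap_or(false)
}

fn main() {
    // Stand-in for the daemon: a shell that would run for 30 seconds.
    // process_group(0) puts it in a fresh group whose pgid == its pid,
    // and any tasks it spawns would inherit that group.
    let mut daemon = Command::new("sh")
        .args(["-c", "sleep 30"])
        .process_group(0)
        .spawn()
        .unwrap();

    // What `switch daemon --kill` could do instead of kill_process(pid):
    let signalled = kill_daemon_group(daemon.id());
    let status = daemon.wait().unwrap();
    println!("signalled group: {signalled}, daemon exited cleanly: {}", status.success());
}
```

The alternative mentioned above (enumerating children before killing the daemon) avoids changing the daemon's process group but is more involved and racy.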

