Commit bce09c5

process: kill on drop (meta-pytorch#802)
Summary:
Pull Request resolved: meta-pytorch#802

The test added in D77348271 notwithstanding, child process cleanup is evidently unreliable after D77392241: running `buck2 test fbcode//monarch/hyperactor_mesh:hyperactor_mesh` leaves 100+ orphaned instances of `hyperactor_mesh_test_bootstrap` running on completion.

This diff implements the only fix I've found for the problem. Even reintroducing parent death detection via signal handling does not fix it, which leads me to conclude that it never did, and that the line this diff puts back is in fact the reason it seemed to.

ghstack-source-id: 301707897
exported-using-ghexport

Reviewed By: mariusae

Differential Revision: D79898223

fbshipit-source-id: 74001e5a1a763bd264acb7518f7970d2ccdc1478
1 parent 171bb4c commit bce09c5

1 file changed: +1 -0 lines changed


hyperactor_mesh/src/alloc/process.rs

Lines changed: 1 addition & 0 deletions
@@ -371,6 +371,7 @@ impl ProcessAlloc {
 cmd.env(bootstrap::BOOTSTRAP_LOG_CHANNEL, log_channel.to_string());
 cmd.stdout(Stdio::piped());
 cmd.stderr(Stdio::piped());
+cmd.kill_on_drop(true);
 
 let proc_id = ProcId::Ranked(WorldId(self.name.to_string()), index);
 tracing::debug!("Spawning process {:?}", cmd);
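For readers unfamiliar with the flag, here is a rough sketch of the idea behind the added line. It assumes `cmd` is a `tokio::process::Command`, whose `kill_on_drop(true)` asks for the child to be killed when its handle is dropped. The `KillOnDrop` guard below is a hypothetical, std-only analogue written for illustration. It is not code from this repository and not how tokio implements the flag.

```rust
use std::io;
use std::process::{Child, Command};

/// Hypothetical guard (not part of hyperactor_mesh or tokio): owns a child
/// process and kills it when the guard goes out of scope.
pub struct KillOnDrop {
    child: Child,
}

impl KillOnDrop {
    /// Spawn the command and wrap the resulting child in the guard.
    pub fn spawn(cmd: &mut Command) -> io::Result<Self> {
        Ok(Self { child: cmd.spawn()? })
    }

    /// OS-assigned process id of the child.
    pub fn id(&self) -> u32 {
        self.child.id()
    }
}

impl Drop for KillOnDrop {
    fn drop(&mut self) {
        // Errors are ignored: the child may already have exited.
        let _ = self.child.kill();
        // Reap the child so it does not linger as a zombie.
        let _ = self.child.wait();
    }
}
```

A guard like this only helps when destructors actually run. If the parent is killed with SIGKILL or calls `std::process::exit`, `Drop` is skipped and children can still be orphaned.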
