Commit 30480db

jlebon authored and dustymabe committed
platform/qemu: detect if QEMU process exits unexpectedly
Currently, we only try to detect if the QEMU process exited by actually `wait()`ing for it in the `kola qemuexec` path. We should do it in the kola testing path as well so that it's easy to tell if e.g. it was killed while the test was running.

Note this doesn't actually stop the test early if QEMU exited. That would require some tricky wiring into the harness. But at least what it prints helps diagnose the issue when we see the test time out on SSH. And the QEMU process won't just hang there as defunct.
1 parent 23f0d16 commit 30480db

1 file changed: +14 -1 lines changed
mantle/platform/machine/qemu/cluster.go

Lines changed: 14 additions & 1 deletion
@@ -40,7 +40,8 @@ type Cluster struct {
 	*platform.BaseCluster
 	flight *flight

-	mu sync.Mutex
+	mu          sync.Mutex
+	tearingDown bool
 }

 func (qc *Cluster) NewMachine(userdata *conf.UserData) (platform.Machine, error) {
@@ -231,10 +232,22 @@ func (qc *Cluster) NewMachineWithQemuOptions(userdata *conf.UserData, options pl

 	qc.AddMach(qm)

+	// In this flow, nothing actually Wait()s for the QEMU process. Let's do it here
+	// and print something if it exited unexpectedly. Ideally in the future, this
+	// interface allows the test harness to provide e.g. a channel we can signal on so
+	// it knows to stop the test once QEMU dies.
+	go func() {
+		err := inst.Wait()
+		if err != nil && !qc.tearingDown {
+			plog.Errorf("QEMU process finished abnormally: %v", err)
+		}
+	}()
+
 	return qm, nil
 }

 func (qc *Cluster) Destroy() {
+	qc.tearingDown = true
 	qc.BaseCluster.Destroy()
 	qc.flight.DelCluster(qc)
 }
