Skip to content

Commit 3173ca2

Browse files
committed
socket: fixing possible mixed up response with events
Patch drains possible events on qmp socket until true "qmp_capabilities" response is received. This targets a nasty and rare problem in "Connect(), Run()" sequence, when asynchronous QEMU "event" just after the "qmp_capabilities" request is accepted as response and further any QMP request gets completed with response from the preceding "qmp_capabilities" response. The following QMP protocol sequence is possible: 1 read {"QMP": {"version": {"qemu": {"micro": 0, "minor": 1, "major": 5}, ...}" 2 write {"execute":"qmp_capabilities"} 3 read {"timestamp": {"seconds": xxx, "microseconds": xxx}, "event": "..."} 4 read {"return": {}} the #3 is unexpected by the current Connect() implementation and "event" is considered as a proper response on "qmp_capabilities", in other turn #4 is read in the go.mos.listen() and immediately pushed to the stream channel, so any further QMP command (Run() call) will be immediately completed by an empty response from line #4. The described problem of unexpected empty response line was observed on this code qmp.SocketMonitor sequence: Connect() Run('{"execute":"query-status"}') <<< Returns empty response Disconnect() The problem is very rare and was observed ~5 times over a fairly long period of time (several months), which corresponds to nature of the described rare protocol race. The current patch was tested on modified QEMU, where an aritifical sleep() was introduced in the qmp_marshal_qmp_capabilities() call just right after the qmp_qmp_capabilities() was invoked, so all further events can be accepted by the QMP socket: --- qapi/qapi-commands-control.c 2023-11-08 08:55:16.209007741 +0100 +++ qapi/qapi-commands-control.c.orig 2023-11-08 08:55:13.929005997 +0100 @@ -42,10 +42,6 @@ qmp_qmp_capabilities(arg.has_enable, arg.enable, &err); + printf(">>> BEFORE SLEEP 10\n"); + sleep(10); + printf(">>> AFTER SLEEP 10\n"); + error_propagate(errp, err); out: Any QMP event can be freely received by the QMP client, while the execution flow of the qmp_marshal_qmp_capabilities() was interrupted or scheduled out. The fix of the described is simple: read from the QMP socket until response is received and drop all possible events. Signed-off-by: Roman Penyaev <[email protected]>
1 parent 2e3d018 commit 3173ca2

File tree

1 file changed

+23
-7
lines changed

1 file changed

+23
-7
lines changed

qmp/socket.go

Lines changed: 23 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -137,13 +137,29 @@ func (mon *SocketMonitor) Connect() error {
137137
return err
138138
}
139139

140-
// Check for no error on return
141-
var r response
142-
if err := dec.Decode(&r); err != nil {
143-
return err
144-
}
145-
if err := r.Err(); err != nil {
146-
return err
140+
// Wait for response
141+
scanner := bufio.NewScanner(c)
142+
for scanner.Scan() {
143+
var e Event
144+
145+
b := scanner.Bytes()
146+
if err := json.Unmarshal(b, &e); err != nil {
147+
return err
148+
}
149+
150+
// If not an event then our qmp capabilities response
151+
if e.Event == "" {
152+
var r response
153+
if err = json.Unmarshal(b, &r); err != nil {
154+
return err
155+
}
156+
// Check response on errors
157+
if err := r.Err(); err != nil {
158+
return err
159+
}
160+
break
161+
}
162+
// Drop possible event, continue reading
147163
}
148164

149165
// Initialize socket listener for command responses and asynchronous

0 commit comments

Comments
 (0)