TCTI fails with “Resource temporarily unavailable” (EAGAIN) errors

We have observed the following problem:

When using the `mssim` TCTI, _“Resource temporarily unavailable”_ errors occur regularly. Often 2 out of 3 runs will fail!

For example, it looks like this:

```
tpm2_load -T 'mssim:host=192.168.178.47,port=2323' -C 0x81000006 -P 12345 -u ecc.pub -r ecc.priv -c ecc.ctx
Error message: WARNING:tcti:src/util-io/io.c:66:read_all() read on fd 3 failed with errno 11: Resource temporarily unavailable
ERROR:esys:src/tss2-esys/api/Esys_ContextSave.c:251:Esys_ContextSave_Finish() Received a non-TPM Error
ERROR:esys:src/tss2-esys/api/Esys_ContextSave.c:92:Esys_ContextSave() Esys Finish ErrorCode (0x000a000a)
ERROR: Esys_ContextSave(0xA000A) - tcti:IO failure
```

Note that “Resource temporarily unavailable” is the error message for **`EAGAIN`** (errno 11 on Linux).

---

I think this can happen because of the way `tcti_mssim_receive()` is currently implemented: it first `poll()`s the network socket until it becomes "ready for reading" and then attempts to `recv()` the **_full_** response message. The read is wrapped in the `socket_recv_buf()` function, which just calls the `read_all()` function.

There are, to my understanding, at least two ways this can go wrong:

- If `poll()` signals that the network socket is "ready for reading", it means that **_some_** bytes can be read now, but it does **not** guarantee that the _full_ message is available yet. Nonetheless, the subsequent `read_all()` always attempts to read the _full_ message by repeatedly calling `recv()`. Because the socket was opened in `O_NONBLOCK` mode, `read_all()` does not block and wait when insufficient data is available at the moment; it fails with an **`EAGAIN`** error instead. And that is, I suppose, precisely what we are seeing.
 
- At least on the Linux platform, the `poll()` and `select()` functions may produce a so-called **_"spurious readiness notification"_**: a socket is reported as "ready for reading", but a subsequent `read()` would nevertheless block because the socket is **not** actually ready. In `O_NONBLOCK` mode, `recv()` or `read()` fails with **`EAGAIN`** in this situation.

  For reference, please see the "BUGS" sections at:
  * https://man7.org/linux/man-pages/man2/poll.2.html
  * https://man7.org/linux/man-pages/man2/select.2.html
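
The first failure mode can be reproduced in isolation. The following is only a minimal sketch, assuming a Linux/POSIX system: `naive_read_all()` and `demo_partial_message()` are hypothetical stand-ins written for illustration and are not the tpm2-tss code. A local `socketpair()` plays the role of the simulator connection, and only 4 bytes of a 10-byte "message" are sent before the reader runs.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for a "read everything" helper (NOT the tpm2-tss code).
 * Loops on recv() until `len` bytes have arrived, retrying only on EINTR. */
static ssize_t naive_read_all(int fd, unsigned char *buf, size_t len)
{
    size_t done = 0;

    assert(buf != NULL);
    while (done < len) {
        ssize_t n = recv(fd, buf + done, len - done, 0);
        if (n > 0) {
            done += (size_t)n;
            continue;
        }
        if (n == 0)
            break;              /* peer closed the connection */
        if (errno == EINTR)
            continue;           /* interrupted by a signal: retry */
        return -1;              /* includes EAGAIN on a non-blocking socket */
    }
    return (ssize_t)done;
}

/* Sends only part of a 10-byte message, then polls and reads.
 * Returns the errno seen by the reader (0 if the read succeeded),
 * or -1 if the test setup itself failed. */
static int demo_partial_message(void)
{
    int sv[2];
    unsigned char buf[10];
    struct pollfd pfd;
    int flags, err = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    flags = fcntl(sv[0], F_GETFL);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) != 0 ||
        write(sv[1], "abcd", 4) != 4) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    pfd.fd = sv[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    /* poll() reports readiness as soon as SOME bytes are available ... */
    if (poll(&pfd, 1, 1000) == 1 && (pfd.revents & POLLIN)) {
        /* ... but asking for the full 10 bytes fails once the 4 are consumed. */
        if (naive_read_all(sv[0], buf, sizeof buf) < 0)
            err = errno;
    }
    close(sv[0]);
    close(sv[1]);
    return err;
}
```

In this sketch the reader consumes the 4 available bytes and then fails on the next `recv()`, so `demo_partial_message()` is expected to return `EAGAIN`, which matches the "errno 11" in the log above.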

---

At the core of the problem is that the `TEMP_RETRY` macro does **not** currently handle the `EAGAIN` (and `EWOULDBLOCK`) errors, at least on the Linux platform. It appears there is some handling on FreeBSD already 🤔

The following patch contains a simple workaround that has fixed the “Resource temporarily unavailable” problem for us:
```diff
diff --git a/src/util-io/io.h b/src/util-io/io.h
index 595177d3..dc9a35fa 100644
--- a/src/util-io/io.h
+++ b/src/util-io/io.h
@@ -44,11 +44,12 @@ typedef SSIZE_T ssize_t;
     dest =__ret; }
 #else
 #define TEMP_RETRY(dest, exp) \
-{   int __ret; \
+{   int __ret, __err = 0; \
     do { \
+        if (__err > 0) usleep(100U); \
         __ret = exp; \
-    } while (__ret == SOCKET_ERROR && errno == EINTR); \
-    ((dest)) =__ret; }
+    } while ((__ret == SOCKET_ERROR) && (errno == EINTR || errno == EAGAIN || errno == EWOULDBLOCK) && (++__err < 32767)); \
+    ((dest)) = __ret; }
 #endif
 
 #ifdef __cplusplus
```

I think the preferable solution would be to go back to polling whenever no or insufficient data turns out to be available for reading, while keeping the partial message that has already been read. But that would probably require some more significant changes.
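
To illustrate that idea, here is a rough sketch of a receive loop that returns to `poll()` on `EAGAIN`/`EWOULDBLOCK` and keeps whatever has been read so far. It assumes a POSIX system; `recv_full_with_poll()` and `demo_split_message()` are hypothetical names used only for this example and are not part of tpm2-tss, and the timeout applies to each individual wait rather than to the whole message.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: reads exactly `len` bytes from a non-blocking socket.
 * On EAGAIN/EWOULDBLOCK it waits in poll() and resumes at the same offset,
 * so partial data is never lost. Returns the byte count (short only if the
 * peer closed), or -1 with errno set (ETIMEDOUT if a wait timed out). */
static ssize_t recv_full_with_poll(int fd, unsigned char *buf, size_t len,
                                   int timeout_ms)
{
    size_t done = 0;

    assert(buf != NULL);
    while (done < len) {
        ssize_t n = recv(fd, buf + done, len - done, 0);
        if (n > 0) {
            done += (size_t)n;
            continue;
        }
        if (n == 0)
            break;                          /* peer closed the connection */
        if (errno == EINTR)
            continue;
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                      /* genuine I/O error */

        /* Nothing to read right now (or a spurious wakeup): wait again. */
        struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
        int rc = poll(&pfd, 1, timeout_ms);
        if (rc < 0 && errno != EINTR)
            return -1;
        if (rc == 0) {
            errno = ETIMEDOUT;
            return -1;
        }
    }
    return (ssize_t)done;
}

/* Demo: a child process sends a 10-byte message in two pieces, 50 ms apart. */
static ssize_t demo_split_message(unsigned char *out, size_t len)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    int flags = fcntl(sv[0], F_GETFL);
    pid_t pid = (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) != 0)
                    ? -1 : fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {                         /* child: the "simulator" side */
        close(sv[0]);
        if (write(sv[1], "abcd", 4) != 4)
            _exit(1);
        poll(NULL, 0, 50);                  /* sleep about 50 ms */
        if (write(sv[1], "efghij", 6) != 6)
            _exit(1);
        _exit(0);
    }
    close(sv[1]);
    ssize_t n = recv_full_with_poll(sv[0], out, len, 2000);
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

Unlike the `usleep()`-based retry in the patch above, this loop does not spin: it sleeps in `poll()` until the kernel reports more data, and a stalled peer ends in a timeout instead of a fixed number of retries.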

Regards.
