Skip to content

Conversation

jadalilleboe
Copy link
Contributor

RUST-564

There were three spec tests being skipped due to socketTimeoutMs not being implemented, but on further investigation it seems like other drivers are skipping these tests due to them being flaky. The C# driver is skipping "Ignore network timeout error on find" and "Reset server and pool after network timeout error during authentication" due to flakiness and the Python driver has a BF associated with the spec test "add RetryableWriteError and UnknownTransactionCommitResult labels to connection errors". Therefore, I think these tests being unskipped should be part of a separate ticket and I changed the description of why they are being skipped to "flaky tests".

@abr-egn abr-egn marked this pull request as draft December 12, 2024 19:35
@abr-egn
Copy link
Contributor

abr-egn commented Dec 12, 2024

So it turns out that I'd completely misunderstood what read_timeout/write_timeout do on sockets - when they're in nonblocking mode (which is what async I/O is built on), it boils down to "essentially nothing". They definitely don't cause timeout errors when that duration is passed, at most they cause the future to be polled an extra time.

TL;DR - this PR is exactly correct according to the specifications I gave, but those were wrong :)

I'm looking to see if I can implement a timeout at the level of our existing AsyncStream wrapper without too much hassle. This is made a pain by the very low-level async nature of the stream APIs here, if it gets to be too much effort this probably won't be worth it since it was intended as a quick workaround to the lack of CSOT.

@abr-egn abr-egn removed their request for review December 12, 2024 19:45
@abr-egn
Copy link
Contributor

abr-egn commented Dec 18, 2024

TL;DR - it looks like this won't be worth the engineer time it would take.

Longer version: it would be possible to implement this. However, because of technical details*, doing so in a performant way would be a challenging and nontrivial project. In the process of investigating this, I came across information that led me to revisit our conclusions that CSOT (RUST-582) wouldn't be feasible, and I now believe that is reasonably doable and the better path.

* The Tokio Sleep future doesn't implement the Unpin trait, which would transitively make any stream implementation using it to implement a timeout also not Unpin. However, Self: Unpin is required for most uses of async I/O. The potential solutions for this are:

  • Wrap the Sleep future in a Box, which is always Unpin no matter the contents. However, this means that our hypothetical socket-with-timeout stream would potentially incur a heap allocation on every socket read or write, which doesn't seem great.
  • Implement our own timing machinery that is Unpin. Presumably, if it were easy to do this in a general, performant way Tokio would have done so already

@abr-egn abr-egn closed this Dec 18, 2024
@abr-egn abr-egn mentioned this pull request Dec 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants