Commit 773266a

samlurye authored and facebook-github-bot committed
Allow NetRx to explicitly reject connections from NetTx (meta-pytorch#962)
Summary:
Pull Request resolved: meta-pytorch#962

The purpose of this diff is to handle the following scenario:

1. Process A starts serving a NetRx.
2. Process B creates a NetTx that connects to process A's NetRx.
3. B sends a few messages to A, and the messages are acked.
4. Process A dies/is killed, while B stays alive.
5. A new Process C starts serving a NetRx on the same channel as from step 1.
6. B's NetTx connects to C's NetRx, *with no way of knowing it has connected to a different process than before*.
7. B sends messages to C, starting from where it left off with A.
8. C rejects all of B's messages because of invalid sequence numbers.
9. B's NetTx eventually times out after a long time with no acks.

This diff expedites the `NetTx` failure from step 9 by allowing `NetRx` to explicitly reject a connection when it sees an out-of-sequence message.

Instead of a simple `u64` ack, the `NetRx` response is now an enum with two variants: `Reject` and `Ack(u64)`. The enum is serialized with bincode.

ghstack-source-id: 305073647

Reviewed By: mariusae

Differential Revision: D80640441

fbshipit-source-id: 7a32f6538081091e0e852f86427b63f58301c174
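The scenario and the new response type can be illustrated with a minimal sketch. This is not the code from the diff: the names `NetRxResponse`, `RxState`, `on_message`, and `handle_response` are hypothetical, the handling of duplicates is an assumption, and bincode serialization is left out so the example needs only the standard library.

```rust
/// Response sent from the receiver back to the sender.
/// The two variants mirror the summary: `Reject` and `Ack(u64)`.
/// (Type name is hypothetical; the real code serializes this with bincode.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NetRxResponse {
    Reject,
    Ack(u64),
}

/// Hypothetical receiver-side state: the next sequence number expected.
pub struct RxState {
    next_seq: u64,
}

impl RxState {
    /// A freshly started receiver expects the stream to begin at 0.
    pub fn new() -> Self {
        RxState { next_seq: 0 }
    }

    /// Decide how to respond to a message carrying sequence number `seq`.
    pub fn on_message(&mut self, seq: u64) -> NetRxResponse {
        if seq > self.next_seq {
            // A gap means the sender is continuing a session this receiver
            // never saw (steps 6-8 above), so reject explicitly.
            NetRxResponse::Reject
        } else {
            if seq == self.next_seq {
                self.next_seq += 1;
            }
            // Assumption: a duplicate (seq < next_seq) is simply re-acked
            // with the highest sequence number received so far.
            NetRxResponse::Ack(self.next_seq - 1)
        }
    }
}

/// Hypothetical sender-side handling: fail fast on `Reject` instead of
/// waiting for the ack timeout described in step 9.
pub fn handle_response(resp: NetRxResponse) -> Result<u64, &'static str> {
    match resp {
        NetRxResponse::Ack(seq) => Ok(seq),
        NetRxResponse::Reject => Err("connection rejected by receiver"),
    }
}
```

In this sketch, a new `RxState` that receives sequence number 2 as its first message returns `Reject`, and `handle_response` turns that into an immediate error on the sender side rather than a long wait with no acks.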
1 parent 7fd1028 commit 773266a

1 file changed: +200 -77 lines changed

