prov/efa: Fix 0 byte send/read/write support #11890

Open
a-szegel wants to merge 10 commits into ofiwg:main from a-szegel:handle-0-byte-operations-in-efa-direct-rebased

@a-szegel
Contributor

Aligns efa-proto and efa-direct to support doc updates in #11887.

Allocate and register a 4KB bounce buffer during EFA-direct domain
initialization. This buffer will be used to support 0-byte inject
operations which are required by libfabric semantics but not directly
supported by the EFA device.

The buffer is registered with FI_SEND | FI_RECV | FI_READ | FI_WRITE |
FI_REMOTE_READ | FI_REMOTE_WRITE permissions to support all operation
types. Memory registration uses efa_mr_internal_regv() to integrate
with the provider's MR infrastructure.

Signed-off-by: Seth Zegelstein <szegel@amazon.com>

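
For readers unfamiliar with the pattern, the bounce buffer setup described in this commit can be sketched roughly as below. This is a minimal illustration, not the provider's code: the `sketch_*` names and the flag values are hypothetical stand-ins for the libfabric access bits listed above, and the actual registration through efa_mr_internal_regv() is not shown.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the access bits named in the commit message.
 * The real constants come from libfabric; these values are illustrative. */
enum {
    SKETCH_SEND         = 1 << 0,
    SKETCH_RECV         = 1 << 1,
    SKETCH_READ         = 1 << 2,
    SKETCH_WRITE        = 1 << 3,
    SKETCH_REMOTE_READ  = 1 << 4,
    SKETCH_REMOTE_WRITE = 1 << 5
};

#define SKETCH_BOUNCE_BUF_SIZE 4096u

/* Hypothetical per-domain bounce buffer state. */
struct sketch_bounce_buf {
    void     *addr;    /* page-aligned backing memory */
    size_t    len;     /* always SKETCH_BOUNCE_BUF_SIZE */
    uint64_t  access;  /* access bits requested at registration */
};

/* Allocate a zeroed, 4 KB aligned buffer and record the access bits that
 * would be requested when registering it. Returns 0 on success, -1 if the
 * allocation fails. */
static int sketch_bounce_buf_init(struct sketch_bounce_buf *bb)
{
    void *p = aligned_alloc(SKETCH_BOUNCE_BUF_SIZE, SKETCH_BOUNCE_BUF_SIZE);

    if (p == NULL)
        return -1;
    memset(p, 0, SKETCH_BOUNCE_BUF_SIZE);

    bb->addr = p;
    bb->len = SKETCH_BOUNCE_BUF_SIZE;
    bb->access = SKETCH_SEND | SKETCH_RECV | SKETCH_READ | SKETCH_WRITE |
                 SKETCH_REMOTE_READ | SKETCH_REMOTE_WRITE;
    return 0;
}

/* Release the buffer and clear the state. */
static void sketch_bounce_buf_fini(struct sketch_bounce_buf *bb)
{
    free(bb->addr);
    bb->addr = NULL;
    bb->len = 0;
    bb->access = 0;
}
```

Requesting every access type up front means a single buffer can back the send, read, and write paths alike, which is the design choice the commit message describes.
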
Enable 0-byte RDMA read operations by using the pre-registered bounce
buffer when total_len is 0. The implementation substitutes the bounce
buffer's SGE and lkey into the first entry of the SGE list and sets
iov_count to 1.

This follows the libfabric requirement that providers must support
0-byte operations.

Signed-off-by: Seth Zegelstein <szegel@amazon.com>

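
The SGE substitution described in this commit could look something like the sketch below. The struct layouts and the `sketch_*` names are hypothetical, and keeping the substituted entry's length at 0 is an assumption; the commit message only states that the bounce buffer's SGE and lkey replace the first entry and that iov_count becomes 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scatter-gather entry, loosely modeled on a verbs-style SGE. */
struct sketch_sge {
    uint64_t addr;
    uint32_t length;
    uint32_t lkey;
};

/* Hypothetical view of the pre-registered bounce buffer. */
struct sketch_bounce_buf {
    uint64_t addr;
    uint32_t lkey;
};

/* When total_len is 0, overwrite sge_list[0] with the bounce buffer and
 * return 1 as the new entry count. Otherwise leave the list untouched and
 * return iov_count. sge_list must have room for at least one entry, even
 * when the caller passed iov_count == 0. */
static size_t sketch_fixup_zero_byte(struct sketch_sge *sge_list,
                                     size_t iov_count, size_t total_len,
                                     const struct sketch_bounce_buf *bb)
{
    assert(sge_list != NULL && bb != NULL);

    if (total_len != 0)
        return iov_count;

    sge_list[0].addr = bb->addr;
    sge_list[0].length = 0;   /* assumption: the transfer stays 0 bytes */
    sge_list[0].lkey = bb->lkey;
    return 1;
}
```

With this shape, a caller that passes iov_count == 0 and no local buffer still ends up posting a work request with one valid, registered entry.
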
Enable 0-byte RDMA write operations by using the pre-registered bounce
buffer when total_len is 0. The implementation follows the same pattern
as 0-byte read: substitute the bounce buffer's SGE and lkey into the
first entry of the SGE list and set iov_count to 1.

This ensures compliance with libfabric semantics for 0-byte transfers.

Signed-off-by: Seth Zegelstein <szegel@amazon.com>

Update assert in 0 byte read path to allow msg->iov_count to be
equal to 0.

Signed-off-by: Seth Zegelstein <szegel@amazon.com>

a-szegel requested a review from a team on February 13, 2026 02:48
a-szegel force-pushed the handle-0-byte-operations-in-efa-direct-rebased branch from 5ab67ea to 511d39e on February 13, 2026 03:43

Implement fi_inject_write[data] for 0-byte transfers using the pre-registered
bounce buffer. The function only supports 0-byte operations and returns
-FI_EOPNOTSUPP for non-zero lengths, as inline RDMA write is not yet
supported at the device level.

Update efa_rma_ops to use efa_rma_inject_write[data] instead of
fi_no_rma_inject[data], enabling applications to perform 0-byte inject
writes on EFA-direct.

Signed-off-by: Seth Zegelstein <szegel@amazon.com>

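
The length gate described in this commit can be illustrated with the sketch below. The `sketch_ep` struct and function name are hypothetical, the posting step is reduced to a counter, and `-EOPNOTSUPP` from errno.h stands in for -FI_EOPNOTSUPP; none of this is the actual efa_rma_inject_write code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical endpoint state; it only counts posted 0-byte writes. */
struct sketch_ep {
    unsigned posted_zero_byte_writes;
};

/* Accept only 0-byte inject writes. Any non-zero length is rejected,
 * mirroring the commit's note that inline RDMA write is not yet supported
 * at the device level. */
static int sketch_inject_write(struct sketch_ep *ep, const void *buf,
                               size_t len)
{
    assert(ep != NULL);
    (void)buf;                 /* may be NULL for a 0-byte write */

    if (len != 0)
        return -EOPNOTSUPP;    /* stand-in for -FI_EOPNOTSUPP */

    /* A real provider would post the write using the bounce buffer here. */
    ep->posted_zero_byte_writes++;
    return 0;
}
```
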
Enable 0-byte message send operations by detecting len == 0 in
efa_post_send and using the pre-registered bounce buffer as inline
data. This centralizes the 0-byte handling logic in efa_post_send,
allowing efa_ep_msg_inject to remain simple and consistent.

The implementation sets up the bounce buffer as inline data with
iov_count = 1 and use_inline = true, then continues with the normal
send path.

Signed-off-by: Seth Zegelstein <szegel@amazon.com>

Post the 0 byte RDMA Read with the pkt_entry's pre-registered
bounce buffer and MR, so that the user can pass NULL/0 for the
local buffer and descriptors.

Signed-off-by: Seth Zegelstein <szegel@amazon.com>

Previous comment was incorrect because 0 byte send/tsend/read/write
operations can have iov_count = 0.

Signed-off-by: Seth Zegelstein <szegel@amazon.com>

local_iov_len does not have to equal remote_iov_len for 0
byte write.

Signed-off-by: Seth Zegelstein <szegel@amazon.com>

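
One way to picture the relaxed length check from this commit is the sketch below. It assumes the original check required local_iov_len to equal remote_iov_len and that the fix exempts the 0-byte case; the function name and the -EINVAL return value are hypothetical and not taken from the provider.

```c
#include <errno.h>
#include <stddef.h>

/* Validate local and remote lengths for an RDMA write.
 * A 0-byte write is always accepted, whatever the remote length is.
 * Any other write must have matching local and remote lengths.
 * Returns 0 when acceptable, -EINVAL otherwise (assumed error code). */
static int sketch_check_write_lens(size_t local_iov_len,
                                   size_t remote_iov_len)
{
    if (local_iov_len == 0)
        return 0;

    return local_iov_len == remote_iov_len ? 0 : -EINVAL;
}
```
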
Add unit tests for 0 byte send/read/write for both efa-proto
and efa-direct.

Signed-off-by: Seth Zegelstein <szegel@amazon.com>

a-szegel force-pushed the handle-0-byte-operations-in-efa-direct-rebased branch from 511d39e to a487a46 on February 13, 2026 03:51

@a-szegel
Contributor Author

Looks like there are some real failures in CI that will need to be fixed before this PR can be merged.
