
UCP/FT/RMA: get/offload/zcopy #11162

Merged
evgeny-leksikov merged 1 commit into openucx:master from evgeny-leksikov:ucp_ep_err_mode_failover_get on Feb 5, 2026

@evgeny-leksikov evgeny-leksikov commented Jan 30, 2026

What?

Adds fault tolerance (FT) support for the "get/offload/zcopy" protocol.
Depends on #11155.

Why?

Fault tolerance support

@evgeny-leksikov evgeny-leksikov changed the title from "UCP/FT/RMA: EP infra + put/offload/zcopy" to "UCP/FT/RMA: get/offload/zcopy" on Jan 30, 2026

dpressle commented Feb 4, 2026

/build

@evgeny-leksikov evgeny-leksikov force-pushed the ucp_ep_err_mode_failover_get branch 3 times, most recently from d6ebf68 to 72be564, on February 4, 2026 20:49
@evgeny-leksikov evgeny-leksikov marked this pull request as ready for review February 4, 2026 20:49

Copilot AI left a comment

Pull request overview

This PR adds fault tolerance support for the GET RMA operation with the "offload/zcopy" protocol. It refactors the existing PUT fault tolerance tests to support both PUT and GET operations, and consolidates duplicate reset logic into a shared function.

Changes:

  • Refactored fault tolerance test infrastructure to support both PUT and GET RMA operations
  • Consolidated the zcopy_reset implementation into a shared ucp_proto_offload_zcopy_reset function
  • Added new test cases for GET operations with initiator and target-side failures
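The consolidation described in the list above can be pictured with a minimal sketch. It assumes that a reset only has to rewind per-request progress so the operation can be replayed after a failure. Every name below (`demo_request_t`, `demo_proto_t`, `demo_offload_zcopy_reset`, and the fields) is hypothetical and chosen for illustration; the real signature and body of `ucp_proto_offload_zcopy_reset` are not shown in this conversation and may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request state; NOT the real UCX request layout. */
typedef struct {
    size_t offset; /* bytes already submitted to the transport */
    size_t length; /* total bytes to transfer */
    int    lane;   /* transport lane in use, -1 if none selected yet */
    int    stage;  /* 0 = not started, 1 = in progress */
} demo_request_t;

/* Hypothetical protocol descriptor holding a reset callback. */
typedef struct {
    const char *name;
    void      (*reset)(demo_request_t *req);
} demo_proto_t;

/*
 * Single shared reset (illustrative): rewind the progress state so the
 * whole operation can be restarted, while keeping the total length.
 */
static void demo_offload_zcopy_reset(demo_request_t *req)
{
    assert(req != NULL);
    req->offset = 0;
    req->lane   = -1;
    req->stage  = 0;
}

/* Both protocols point at the same function instead of keeping a copy each. */
static const demo_proto_t demo_put_proto = {
    "put/offload/zcopy", demo_offload_zcopy_reset
};

static const demo_proto_t demo_get_proto = {
    "get/offload/zcopy", demo_offload_zcopy_reset
};
```

In this sketch, calling `demo_get_proto.reset(&req)` on a partially progressed request returns it to its initial state. Because PUT and GET share one function, a fix to the reset logic applies to both protocols at once.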

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| test/gtest/ucp/test_ucp_fault_tolerance.cc | Refactored tests to support both PUT and GET operations with a unified test infrastructure |
| src/ucp/rma/put_offload.c | Removed duplicate reset function and switched to shared implementation |
| src/ucp/rma/get_offload.c | Updated to use shared reset function |
| src/ucp/proto/proto_common.h | Added declaration for shared reset function |
| src/ucp/proto/proto_common.c | Added shared reset function implementation |

@evgeny-leksikov evgeny-leksikov force-pushed the ucp_ep_err_mode_failover_get branch from 72be564 to f80144d Compare February 4, 2026 20:55

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

@evgeny-leksikov evgeny-leksikov enabled auto-merge (squash) February 5, 2026 16:47
@evgeny-leksikov evgeny-leksikov merged commit e2f13b0 into openucx:master Feb 5, 2026
147 checks passed

4 participants