This repository was archived by the owner on Dec 9, 2025. It is now read-only.

Support Rdma ROCE #84

Merged
aojea merged 78 commits into google:main from aojea:rdma_dev
May 25, 2025

@aojea
Collaborator

@aojea aojea commented May 22, 2025

Tested using dranet with GKE A3 Mega RDMA NICs; see examples/demo_gke_rdma/README.md.

The diff is very large because I did not understand well how RDMA worked and made several wrong architectural decisions.
The code is also better organized now, with a clear separation between the operations that need to happen before Pod creation (dra_hooks.go) and during Pod creation (nri_hooks.go). Those files now contain a brief comment explaining the expectations of the code that runs there, especially in relation to the Pod lifecycle.

  • RDMA devices are no longer exposed, only network devices; the ones associated with an RDMA device carry a boolean attribute to indicate it.
  • In both shared and exclusive RDMA mode, the requested network and character devices are moved to the Pod; in exclusive mode the RDMA link device is also moved to the Pod.
  • Add a configuration API for defining interface attributes and the routes associated with each interface.
  • The host interface is moved to the Pod together with its existing attributes, addresses, and associated routes.
  • If the interface does not have any address during prepareResources, a best-effort attempt is made to obtain one via DHCP. This is done in the host namespace because the external DHCP server only sees the interface, regardless of whether it sits in a Pod or in the host namespace. The acquired IP is not configured until the interface is in the Pod namespace.
  • Well-known interface names such as docker and cilium_host are now ignored to avoid publishing interfaces that cannot be used.
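
To illustrate the last bullet, here is a minimal sketch of how such a filter could look. It is not the code in this PR: the function name `isIgnoredInterface`, the `ignoredPrefixes` list, and the choice of prefix matching (so that `docker0` is caught by `docker`) are all assumptions made for the example. Only `docker` and `cilium_host` come from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// ignoredPrefixes is a hypothetical list for illustration. Only "docker"
// and "cilium_host" are named in the PR description; the real list may differ.
var ignoredPrefixes = []string{"docker", "cilium_host"}

// isIgnoredInterface reports whether an interface name starts with one of
// the well-known prefixes and should therefore not be published.
// Prefix matching is an assumption of this sketch.
func isIgnoredInterface(name string) bool {
	for _, prefix := range ignoredPrefixes {
		if strings.HasPrefix(name, prefix) {
			return true
		}
	}
	return false
}

func main() {
	for _, name := range []string{"docker0", "cilium_host", "eth1"} {
		fmt.Printf("%s ignored=%t\n", name, isIgnoredInterface(name))
	}
}
```

With this sketch, `docker0` and `cilium_host` are skipped while `eth1` stays eligible for publishing.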

aojea added 17 commits May 20, 2025 16:56
Change-Id: Id7e51571592d095b33124b43405be99b15d23d36
Change-Id: I263a0d4edb5e17977accdd1d4e30f74c57b351d0
Change-Id: I5531dedb6400b8ed4ad5d7c8a50e9775081ad3e9
Change-Id: I6d913d2920318e60c394c6cf2c6598c56f3f3db5
Change-Id: I6f8012e6577b2eb04c01044facef8abe2a8d01cc
Change-Id: I7c4b2ed69370287b53b6356e7071591412e316bb
Change-Id: If628567e6e12370d56de73c41f7cc53633ec2068
Change-Id: I3ece910eb3dd89d8eabf5f1fe8da07abb1e25af8
Change-Id: Ib55cb65705e11e91e4a0f23819abbed2dd8aeb87
Change-Id: I39a1d2a2bfdf6c9230f17ba5dc2d1330cfdb1ad2
Change-Id: I441773c7dc5f2623d5a71d1ac7a3446abe9af678
Change-Id: I6f8d13277f6be76db9f9e3c8e7e0ccb5b46bff5a
Change-Id: Icbf09c3f7e6a7554492f6b41bb93f20cc9603173
Change-Id: I837f7a892c64a6df8d5501b3e2c8ffdfd0df73ce
Change-Id: I8a4ea2f47efaadd06ec7f1736b74b38054f638be
Change-Id: I1831a4e6348d4408dc15f54860716e1f64f2e6b8
Change-Id: I0b9180a5e5f2c4d8f6a2bc788ca70137886ec9ce
@aojea aojea changed the title Rdma [WIP] Rdma May 22, 2025
Change-Id: I1465860dd5308be756570f2520fddfff9f6e8e98
@aojea aojea changed the title [WIP] Rdma [WIP] Rdma ROCE May 22, 2025
aojea added 10 commits May 22, 2025 19:09
Change-Id: Ib9fb78ef307857663e006655f4a64907a6ce8657
Change-Id: I376172db7a63ccb8e7e624e4726a902adf849956
Change-Id: I1ed25f54cf82a763bed779280f45dda0d3b7a70d
Change-Id: I4d6e8a7b8b843cc1b811bce6a5b47dea25327f18
Change-Id: I09e8ce7b10695116ee4da819aa410b008dcb82e1
Change-Id: I9eef60ef869b79b7767e18db97cfbd44fa682bdc
Change-Id: I26802851e732161e2a708f61ad37aafe585111a6
Change-Id: I8beafaa69add73f8706ffa42ed0001866fdfa314
Change-Id: I72e9b16ab2ca327068d600c1be54c52e89977bad
Change-Id: I73e267281353eb7dcf2297913856e676be504d5c
aojea added 26 commits May 24, 2025 08:55
Change-Id: I5a5eca4036464d5e8f1c3b173ac25ed4fd43c97b
Change-Id: I36854a58ea2e4373cabe97913690c87af566a278
Change-Id: Ib9620948badbb6153d700414fbb1fff8f20d308b
Change-Id: Ie05c27995239fc2fa4c47444b7c81975d2e36b68
Change-Id: I2de7a3fb0db0336547d7b392be880c923bfba4ef
Change-Id: I3f0620b923dfc5cb984dce103863f4be226108e1
Change-Id: I4b4ed0a1a1617030377d4f70cee5a1bd7c218c18
Change-Id: Ic8137edeaf85ca2cf573e0302afd83a4ad32ebff
Change-Id: I9843dd9bf488a1ee10b6ff33f09801281a671051
Change-Id: Iac4de675911bff1663e0e71c6483c471b2424720
Change-Id: Id3fdefd4831bcaf973d2abda49f2b825cc405ec9
Change-Id: I3e3178fce3966d5941c1c4cc8adc6898d2d32230
Change-Id: I02064aa7d086352378db8d044c10c65251ded578
Change-Id: I83595f57e032f9a8b6154bee08cf49754caab559
Change-Id: Ib35b8963cea9c7b1249601a57406ec83f5bc3679
Change-Id: I03366c1db154d67ca35061b620f15974e8c2ce76
Change-Id: I0eccd5cbf33e38c89604600e93438ad1c11b920d
Change-Id: Ifecba7e101c121fbff9506ddab86f8068a843819
Change-Id: If6d2190a26303f252f2f391a47a5155fdf2c2f6b
Change-Id: I79aea64d03590bd7435a67e51e7334cf26c77419
Change-Id: I457a6087f5980abc9ccb086a877457a323ca7eb1
Change-Id: Ib9887d599fde583efe1493c9c6e8e3f15c1114f6
Change-Id: Ifbc6348c3ea962236cb499007caa1fec769295cf
Change-Id: I09507042d9084fe7b16165b23c1b43faf155504e
Change-Id: Idad4c63fa795c65a67b6e17a7fac7d65e2b1a28a
Change-Id: I5b8bcaf0fbdb254c64e01d4ccb6836b90f3d2a89
@aojea aojea changed the title [WIP] Rdma ROCE Support Rdma ROCE May 25, 2025
Change-Id: Iaf2a290fb82bb40202244e9f842aab77488acc87
@aojea aojea merged commit 240d6a0 into google:main May 25, 2025
7 checks passed
@aojea aojea mentioned this pull request May 25, 2025