✨ Bring your own network #1472
In general this approach seems good to me. Thanks a lot for this contribution!
@guettli @batistein what's your opinion?
Thanks a lot again for this PR @johannesfrey. I think we can merge it if you follow the suggestions I gave. It's really good work!
Force-pushed from 3f6b8f6 to b27b859.
Sorry for the long delay 🙏. Thx for the reviews! I hope I addressed your suggestions correctly. PTAL. Thx!
Force-pushed from b27b859 to 028f798.
thanks @johannesfrey ! I went another time over the details and found a few things.
Force-pushed from bdf6ca4 to 4806de2.
thanks @johannesfrey! Do you think anything is missing right now? If not, I'd propose the following path:
That sounds awesome. Thx! The reconciler should then use its internal client (which should be the shared fake one from before) to find the network. But it cannot find it, so either something along the line deleted it, or there is some race when using the fake client, probably related to the usage of the mutexes in there?! I tried some variations of changing the locks in there, but to no avail, so I wasn't able to really deflake the test. It would be really cool if you could take another look there. And if the test does more harm than good, we could also think of removing/changing it. WDYT?
mmh that's an important observation, thanks @johannesfrey. We will have a look. I'm not able to see anything in the code right now.
Hello, what is the plan here? Is this PR still considered?
we just moved this PR into testing.
Looking forward to trying this out. I'm planning to use an existing private network with a NAT Gateway/Bastion Host. One big advantage of a private network is that it doesn't incur overage fees when you take the 10Gbps option.
@tcldr this implementation only covers hcloud not vswitch. Also internal routing even via public IPs is not incurring any costs.
@batistein good to know, thanks. Is there something particular blocking vswitch enabled networks and subnets?
They are basically unstable, and we therefore removed support for private networks for all our customers 3 years ago. Since then, the instabilities on Hetzner's side have not been resolved, which is why we never invested time to support them via caph. As Syself we switched to a zero trust architecture, which also aligns more with our future plans. See: https://syself.com/docs/hetzner/apalla/platform/zero-trust
That's concerning, I'm only just experimenting with this feature now. Do you have particular examples? Would you be open to PRs that add support so that CAPH provides the option for those who are self-managing?
Yes, of course we are open to PRs, these need to be E2E tested and the logic needs to be separated so it doesn't affect the current code.
We were also waiting for this PR, but meanwhile we tried out the Hetzner private network and experienced issues similar to those @batistein described. So we decided not to use it either and to go the zero-trust way.
Thanks for the heads-up @batistein and @bitnik ! Lots to consider there. Appreciate you both taking the time to share your thoughts.
Any news on this PR?
@batistein Thx for all the context around your past experiences with private networks. At the time I initially created this PR I was not aware that you had already dropped support for those 3 years ago (at least for the customers you have been interacting with). So I'm a bit reluctant to "burden" you with even more private network functionality. If the recommended way is zero-trust, I would be totally fine to close this PR. So I guess we would "just" need a decision on how to proceed here (with the option to "close and forget" being totally fine from my side 😉). And I guess this would also relieve all others waiting for this to eventually be merged. Also @janiskemper thx so much for your effort reviewing all of this.
@johannesfrey IMO everyone should decide that on their own. We don't use Hetzner's private networks and have reasons not to. However, others might come to different conclusions. I'd very much like to get this merged. We are very busy internally right now though, which is why the testing hasn't happened yet. I recognize the interest in this and have put it on our agenda. It will still take some time though, I'm afraid. However, that's a good thing, because this way we will be able to ensure the quality of CAPH, even if it is a bit slower.
What this PR does / why we need it:
This PR makes it possible to "adopt" a pre-existing network by passing its ID to hetznerCluster.spec.hcloudNetwork.id instead of the network being created during cluster creation. Furthermore, during cluster deletion it only deletes the attached network if it does not have the owned label attached to it (currently the only way here to discriminate between a CAPH-managed network and an unmanaged one).
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #762
Special notes for your reviewer:
This has been lingering around untouched on my fork for a while, and I decided to rebase it onto the current main branch. Please consider this a first attempt to approach this topic as a whole. I also tried to add some unit tests already. I guess it also might require some e2e tests!? No idea if this is the desired way to do this, or whether there are other side-effects I did not see. So looking forward to feedback or any pointers. And also feel free to push changes to the PR, as I'll be pretty occupied with other things for almost the whole of September. Just wanted to push this out there already for you to take a look at 🙂
The most controversial changes so far:
- hcloudNetwork.id mutually exclusive with cidrBlock, subnetCidrBlock and networkZone
- cidrBlock, subnetCidrBlock and networkZone to be pointers (I guess this could also be done with empty strings, but pointers make it possible to be not shown at all, when not provided)
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
TODOs: