-
-
Notifications
You must be signed in to change notification settings - Fork 36.1k
Clean util/network to further align with stdlib #114941
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
66ac21e to
9fe5820
Compare
bdraco
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I'd like a second set of eyes before merging
|
It looks like there is a merge conflict that needs to be addressed |
9fe5820 to
3a35ed2
Compare
Done. CI failed now, but I don't think it's related. |
|
@bdraco, since I have you: |
They can be deprecated in future PRs similar to how we do other deprecations Line 613 in f452c5b
|
| with pytest.raises(ValueError): | ||
| assert indieauth._parse_client_id("http://255.255.255.255/") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a semantics change. Maybe we should keep a check for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really? 255.255.255.255 is "limited broadcast", and has been reserved from being used for unicast traffic for the past... ~40 years (RFC 919 section 7 dated October 1984). I'm not even sure how usable it'll actually be even if you force your OS to use it, but if someone really managed to, I guess they get to keep the pieces when things break? (As a sidenote, we are already (purposefully) diverging from the indieauth spec, which prohibits bare IP addresses entirely, if this wasn't an edge case enough.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm more concerned about the change to the semantics in is_private, the test here is only highlighting that change.
This PR
>>> from homeassistant.util import network
>>> from ipaddress import ip_address
>>> network.is_private(ip_address("255.255.255.255"))
True
dev
>>> from homeassistant.util import network
>>> from ipaddress import ip_address
>>> network.is_private(ip_address("255.255.255.255"))
False
I'm assuming 255.255.255.255 was chosen for a reason in that test so it seems wrong to remove it without letting the original author of the PR that it was added in have chance to review (#15369). Alternatively, we can keep compatibility with something like:
diff --git a/homeassistant/util/network.py b/homeassistant/util/network.py
index d5830d25b69..3360347b1b4 100644
--- a/homeassistant/util/network.py
+++ b/homeassistant/util/network.py
@@ -7,6 +7,8 @@ import re
import yarl
+BROADCAST = ip_address("255.255.255.255")
+
def is_loopback(address: IPv4Address | IPv6Address) -> bool:
"""Check if an address is a loopback address."""
@@ -16,7 +18,12 @@ def is_loopback(address: IPv4Address | IPv6Address) -> bool:
def is_private(address: IPv4Address | IPv6Address) -> bool:
"""Check if an address is a unique local non-loopback address."""
- return address.is_private and not is_loopback(address) and not address.is_link_local
+ return (
+ address.is_private
+ and not is_loopback(address)
+ and not address.is_link_local
+ and address != BROADCAST
+ )
def is_link_local(address: IPv4Address | IPv6Address) -> bool:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I strongly believe we should not be adding those kind of "special" semantics, especially for reasons that are not clear. 255.255.255.255 is not routable, per RFC 919, section 7, and RFC8190. You literally cannot use it, e.g. try a traceroute from your computer, it will refuse to even route the traffic to your default gateway.
The fact that HA's is_private returned False before, was a bug. Python's is_private is following those RFCs (and IANA's special-purpose registry assignments) and has the correct semantic here.
I can't imagine a scenario where bug-for-bug compatibity is desired, but I acknowledge I may be missing something! @Baloob I've noticed you're the #15369/indieauth author, perhaps you could shed some light on why http://255.255.255.255/ was chosen as a test case?
BTW, the only case in this entire file where I see the need to override stdlib, is with is_loopback, and that is due to a Python stdlib bug. But in this case, I've sent a PR to Python to fix that in stdlib itself and over time remove our custom is_loopback override. I intend to entirely deprecate the rest of these utility functions in a subsequent PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest keeping bug-for-bug compatibility for now as it makes this PR mergable now. It can be removed in a followup when others are available to review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick response & review! Honestly I don't think that makes sense. This whole PR is about resolving long-standing bugs resulting from HA hardcoding IANA-reserved ranges, and not doing it right (either in the first place, or due to drift). Reintroducing some of these bugs without a clear rationale is a slippery slope and kind of defeats the whole point. I have tried to keep semantics in a couple of places where it made sense (see the commit message), but in the 255.255.255.255 example in particular, I don't think it does.
As demonstrated above, 255.2555.255.255 is unroutable in Linux, including in HA's Docker/HassOS, so there is no realistic way one can use this IP in a network, and still exchange traffic with HA. IOW, the test suite tests for something that is just not possible. I hope this demonstration makes the change easier to review, but if you'd like to wait for additional reviewers, I understand :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably best to wait for another reviewer as I don't feel comfortable making that decision in isolation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kinda agree with bdraco, and judging by your comments, you do too:
Is it really? 255.255.255.255 is "limited broadcast", and has been reserved from being used for unicast traffic for the past... ~40 years (RFC 919 section 7 dated October 1984).
Meaning this should have raised on this given test, as this is not a valid address for authing. This test should be re-instated, and the code should be fixed to raise for this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review! Ι'll admit that I'm not super familiar with indieauth, and I may be misunderstanding something. And I'd like to add code that I understand :)
First of all: the indieauth code accepts only "local" addresses for auth. The networking code previously (erroneously) considered this address "globally routable" and thus indieauth did not accept it. The address is definitely not globally routable. Is this all correct?
Second: should 255.255.255.255 be singled out in the indieauth code? Why? What's the real-life
condition under which this address would be used? Basically, under which scenario would this code path be reached (outside of tests)?
|
Note that I see a lot more room for improvement in the future: For example, HA's Moreover, note that several components have opted into using stdlib methods directly instead of using I have a plan to fix all that but I think this should be in a separate PR, and these should wait until this first step of a PR gets merged. The steps I'm thinking of are, in the short-term:
Longer-term:
|
Please note, that our use is not limited to this codebase. There are thousands of integrations in the wild that might rely on these. |
|
There hasn't been any activity on this pull request recently. This pull request has been automatically marked as stale because of that and will be closed if no further activity occurs within 7 days. |
|
I still think this is a good idea and needs to happen, and happy to continue updating the PR as we go. |
|
https://nvd.nist.gov/vuln/detail/CVE-2024-4032 against Python was published recently, recategorizing a few of the blocks referenced here. It came to my attention just yesterday due to a Debian Security Announcement. Home Assistant would probably gain a similar CVE for what is described in this PR here. ("affected the is_private and is_global properties [...] where values wouldn’t be returned in accordance with the latest information from the IANA Special-Purpose Address Registries."). I personally have my doubts over whether this was CVE worthy and don't currently intend on seeking a CVE for it. But I think it establishes that this PR requires a little bit of extra attention. Practically speaking, I also think it speaks to how brittle categorizing IANA ranges can be, and why Home Assistant should not be attempting to do this is_private etc. categorization itself, but rely on stdlib instead. |
|
There hasn't been any activity on this pull request recently. This pull request has been automatically marked as stale because of that and will be closed if no further activity occurs within 7 days. |
|
I'd still like to see this merged! |
|
There hasn't been any activity on this pull request recently. This pull request has been automatically marked as stale because of that and will be closed if no further activity occurs within 7 days. |
|
I still think this needs to happen and I'm happy to merge dev/rebase if/when required, but it'd be good to hear from a maintainer before I put more work into this 😄 Thank you for your time! |
|
There hasn't been any activity on this pull request recently. This pull request has been automatically marked as stale because of that and will be closed if no further activity occurs within 7 days. |
* Remove the constants LINK_LOCAL_NETWORKS and LOOPBACK_NETWORKS, as they are currently unreferenced, due to the changes made by PR#102019. * Replace the PRIVATE_NETWORKS check with stdlib's is_private() method. Besides a reduction of code, this results in a more comprehensive check, as stdlib includes additional address spaces that were not covered before, and is ever evolving (see, for example, python/cpython#113179). * In contrast with CPython, loopbacks and link-locals are NOT included in our definition of "private", so adjust the code to keep the existing semantics * However, this *does* change the semantics for some address spaces to follow Python's interpretation of IANA, such as marking reserved documentation spaces as "local". Invert the test cases for 198.51.100.0/24 (TEST-NET-2) and 2001:db8::/32 (Documentation) accordingly. * We now need a new globally reachable IP space to use as EXTERNAL_ADDRESSES in test_auth. Use AS112's address space, as it is reserved by IANA, but marked as Globally Reachable. * Finally, inline the last remaining constant IPV6_IPV4_LOOPBACK, and leave a comment that work is underway in CPython by yours truly to reduce the need for it and have is_loopback() subsume it. Further work is required to align the semantics of e.g. what "private" means with stdlib, and replace these seldomly used util functions with stdlib properties across the tree (as is already the case in several integrations), but hopefully this provides a stepping stone towards this direction.
29515a0 to
c0864d7
Compare
|
Still not stale. |
|
There hasn't been any activity on this pull request recently. This pull request has been automatically marked as stale because of that and will be closed if no further activity occurs within 7 days. |
Proposed change
Remove the constants LINK_LOCAL_NETWORKS and LOOPBACK_NETWORKS, as they are currently unreferenced, due to the changes made by PR#102019.
Replace the PRIVATE_NETWORKS check with stdlib's is_private() method. Besides a reduction of code, this results in a more comprehensive check, as stdlib includes additional address spaces that were not covered before, and is ever evolving (see, for example, GH-113171: Fix "private" (non-global) IP address ranges python/cpython#113179).
In contrast with CPython, loopbacks and link-locals are NOT included in our definition of "private", so adjust the code to keep the existing semantics
However, this does change the semantics for some address spaces to follow Python's interpretation of IANA, such as marking reserved documentation spaces as "local". Invert the test cases for 198.51.100.0/24 (TEST-NET-2) and 2001:db8::/32 (Documentation) accordingly.
We now need a new globally reachable IP space to use as EXTERNAL_ADDRESSES in test_auth. Use AS112's address space, as it is reserved by IANA, but marked as Globally Reachable.
Finally, inline the last remaining constant IPV6_IPV4_LOOPBACK, and leave a comment that work is underway in CPython by yours truly to reduce the need for it and have is_loopback() subsume it.
Further work is required to align the semantics of e.g. what "private" means with stdlib, and replace these seldomly used util functions with stdlib properties across the tree (as is already the case in several integrations), but hopefully this provides a stepping stone towards this direction.
Type of change
Additional information
Checklist
ruff format homeassistant tests)If user exposed functionality or configuration variables are added/changed:
If the code communicates with devices, web services, or third-party tools:
Updated and included derived files by running:
python3 -m script.hassfest.requirements_all.txt.Updated by running
python3 -m script.gen_requirements_all..coveragerc.To help with the load of incoming pull requests: