@paravoid
Contributor

@paravoid paravoid commented Apr 5, 2024

Proposed change

  • Remove the constants LINK_LOCAL_NETWORKS and LOOPBACK_NETWORKS, as they are currently unreferenced, due to the changes made by PR#102019.

  • Replace the PRIVATE_NETWORKS check with stdlib's is_private property. Besides reducing code, this results in a more comprehensive check, as stdlib covers additional address spaces that were not covered before and keeps evolving (see, for example, python/cpython#113179, "GH-113171: Fix "private" (non-global) IP address ranges").

  • In contrast with CPython, loopbacks and link-locals are NOT included in our definition of "private", so adjust the code to keep the existing semantics.

  • However, this does change the semantics for some address spaces to follow Python's interpretation of IANA, such as marking reserved documentation spaces as "local". Invert the test cases for 198.51.100.0/24 (TEST-NET-2) and 2001:db8::/32 (Documentation) accordingly.

  • We now need a new globally reachable IP space to use as EXTERNAL_ADDRESSES in test_auth. Use AS112's address space, as it is reserved by IANA, but marked as Globally Reachable.

  • Finally, inline the last remaining constant IPV6_IPV4_LOOPBACK, and leave a comment that work is underway in CPython by yours truly to reduce the need for it and have is_loopback() subsume it.

Further work is required to align the semantics of e.g. what "private" means with stdlib, and to replace these seldom-used util functions with stdlib properties across the tree (as is already the case in several integrations), but hopefully this provides a stepping stone in that direction.
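
As a rough illustration of the semantics described above, the following sketch wraps stdlib's is_private property while excluding loopback and link-local addresses. The helper name ha_style_is_private is hypothetical, it uses stdlib's is_loopback rather than Home Assistant's own override, and 192.175.48.1 is only assumed to be a representative AS112 address; the actual code and test fixtures in this PR may differ.

```python
from __future__ import annotations

from ipaddress import IPv4Address, IPv6Address, ip_address


def ha_style_is_private(address: IPv4Address | IPv6Address) -> bool:
    """Return True for stdlib-private addresses that are neither loopback nor link-local.

    Hypothetical helper: it mirrors the idea in this PR, not its exact code.
    """
    return (
        address.is_private
        and not address.is_loopback
        and not address.is_link_local
    )


samples = [
    "10.0.0.1",      # RFC 1918 space
    "127.0.0.1",     # loopback: private per stdlib, excluded by the helper
    "169.254.1.1",   # link-local: private per stdlib, excluded by the helper
    "198.51.100.1",  # TEST-NET-2 documentation space
    "2001:db8::1",   # IPv6 documentation space
    "192.175.48.1",  # assumed AS112 address, used here as a global example
]
for text in samples:
    addr = ip_address(text)
    print(f"{text:>14} private={ha_style_is_private(addr)} global={addr.is_global}")
```

The documentation ranges come out as private here, which is the behaviour change the inverted test cases account for.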

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Deprecation (breaking change to happen in the future)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue:
  • Link to documentation pull request:

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • I have followed the perfect PR recommendations
  • The code has been formatted using Ruff (ruff format homeassistant tests)
  • Tests have been added to verify that the new code works.

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • For the updated dependencies - a link to the changelog, or at minimum a diff between library versions is added to the PR description.
  • Untested files have been added to .coveragerc.

@paravoid paravoid requested a review from a team as a code owner April 5, 2024 13:44
@home-assistant home-assistant bot added the bugfix, cla-signed, core, and small-pr labels Apr 5, 2024
@MartinHjelmare MartinHjelmare changed the title from "util/network: cleanup, further alignment with stdlib" to "Clean util/network to further align with stdlib" Apr 6, 2024
@paravoid paravoid force-pushed the util-network-stdlib branch 4 times, most recently from 66ac21e to 9fe5820 April 11, 2024 15:26
Member

@bdraco bdraco left a comment

LGTM, but I'd like a second set of eyes before merging

@bdraco bdraco added the second-opinion-wanted label Apr 11, 2024
@bdraco
Member

bdraco commented Apr 13, 2024

It looks like there is a merge conflict that needs to be addressed

@paravoid paravoid force-pushed the util-network-stdlib branch from 9fe5820 to 3a35ed2 April 13, 2024 09:07
@paravoid
Contributor Author

> It looks like there is a merge conflict that needs to be addressed

Done.

CI failed now, but I don't think it's related.

@paravoid
Contributor Author

@bdraco, since I have you: is_private() (modified here) and is_invalid() for that matter, are not used anywhere in the tree as far as I can tell. Do we remove them? Deprecate them for the benefit of third-party integrations? In any case, I assume it's better to do that in a subsequent commit, correct?

@bdraco
Member

bdraco commented Apr 13, 2024

> @bdraco, since I have you: is_private() (modified here) and is_invalid() for that matter, are not used anywhere in the tree as far as I can tell. Do we remove them? Deprecate them for the benefit of third-party integrations? In any case, I assume it's better to do that in a subsequent commit, correct?

They can be deprecated in future PRs similar to how we do other deprecations

frame.report(

Comment on lines -86 to -87
with pytest.raises(ValueError):
    assert indieauth._parse_client_id("http://255.255.255.255/")
Member

This is a semantics change. Maybe we should keep a check for this

Contributor Author

Is it really? 255.255.255.255 is "limited broadcast", and has been reserved from being used for unicast traffic for the past... ~40 years (RFC 919 section 7 dated October 1984). I'm not even sure how usable it'll actually be even if you force your OS to use it, but if someone really managed to, I guess they get to keep the pieces when things break? (As a sidenote, we are already (purposefully) diverging from the indieauth spec, which prohibits bare IP addresses entirely, in case this wasn't enough of an edge case already.)

Member

@bdraco bdraco Apr 17, 2024

I'm more concerned about the change to the semantics in is_private, the test here is only highlighting that change.

This PR

>>> from homeassistant.util import network
>>> from ipaddress import ip_address
>>> network.is_private(ip_address("255.255.255.255"))
True

dev

>>> from homeassistant.util import network
>>> from ipaddress import ip_address
>>> network.is_private(ip_address("255.255.255.255"))
False

I'm assuming 255.255.255.255 was chosen for a reason in that test, so it seems wrong to remove it without giving the author of the PR that added it (#15369) a chance to review. Alternatively, we can keep compatibility with something like:

diff --git a/homeassistant/util/network.py b/homeassistant/util/network.py
index d5830d25b69..3360347b1b4 100644
--- a/homeassistant/util/network.py
+++ b/homeassistant/util/network.py
@@ -7,6 +7,8 @@ import re
 
 import yarl
 
+BROADCAST = ip_address("255.255.255.255")
+
 
 def is_loopback(address: IPv4Address | IPv6Address) -> bool:
     """Check if an address is a loopback address."""
@@ -16,7 +18,12 @@ def is_loopback(address: IPv4Address | IPv6Address) -> bool:
 
 def is_private(address: IPv4Address | IPv6Address) -> bool:
     """Check if an address is a unique local non-loopback address."""
-    return address.is_private and not is_loopback(address) and not address.is_link_local
+    return (
+        address.is_private
+        and not is_loopback(address)
+        and not address.is_link_local
+        and address != BROADCAST
+    )
 
 
 def is_link_local(address: IPv4Address | IPv6Address) -> bool:

Contributor Author

I strongly believe we should not be adding this kind of "special" semantics, especially for reasons that are not clear. 255.255.255.255 is not routable, per RFC 919, section 7, and RFC 8190. You literally cannot use it: try a traceroute to it from your computer, for example, and it will refuse to even route the traffic to your default gateway.

The fact that HA's is_private returned False before was a bug. Python's is_private follows those RFCs (and IANA's special-purpose registry assignments) and has the correct semantics here.

I can't imagine a scenario where bug-for-bug compatibility is desired, but I acknowledge I may be missing something! @Baloob I've noticed you're the #15369/indieauth author, perhaps you could shed some light on why http://255.255.255.255/ was chosen as a test case?

BTW, the only case in this entire file where I see the need to override stdlib is is_loopback, and that is due to a Python stdlib bug. In that case, I've sent a PR to Python to fix it in stdlib itself, so that our custom is_loopback override can be removed over time. I intend to entirely deprecate the rest of these utility functions in a subsequent PR.
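
For context, one way to express such a loopback override without a hard-coded network constant is to unwrap IPv4-mapped IPv6 addresses before asking stdlib. This is a minimal sketch, not the code in this PR or in CPython; is_loopback_compat is a hypothetical name, and it relies only on the stdlib ipv4_mapped property of IPv6Address.

```python
from __future__ import annotations

from ipaddress import IPv4Address, IPv6Address, ip_address


def is_loopback_compat(address: IPv4Address | IPv6Address) -> bool:
    """Check for loopback, looking through IPv4-mapped IPv6 addresses.

    Hypothetical helper for illustration only.
    """
    # IPv6Address exposes ipv4_mapped (None unless the address is ::ffff:a.b.c.d);
    # IPv4Address has no such attribute, hence the getattr default.
    mapped = getattr(address, "ipv4_mapped", None)
    if mapped is not None:
        return mapped.is_loopback
    return address.is_loopback


for text in ("127.0.0.1", "::1", "::ffff:127.0.0.1", "::ffff:8.8.8.8", "8.8.8.8"):
    print(text, is_loopback_compat(ip_address(text)))
```

Once the minimum supported Python handles the mapped case itself, a helper like this becomes redundant, which is the point made above.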

Member

I suggest keeping bug-for-bug compatibility for now as it makes this PR mergeable now. It can be removed in a follow-up when others are available to review.

Contributor Author

Thanks for the quick response & review! Honestly I don't think that makes sense. This whole PR is about resolving long-standing bugs resulting from HA hardcoding IANA-reserved ranges, and not doing it right (either in the first place, or due to drift). Reintroducing some of these bugs without a clear rationale is a slippery slope and kind of defeats the whole point. I have tried to keep semantics in a couple of places where it made sense (see the commit message), but in the 255.255.255.255 example in particular, I don't think it does.

As demonstrated above, 255.255.255.255 is unroutable in Linux, including in HA's Docker/HassOS, so there is no realistic way one can use this IP in a network, and still exchange traffic with HA. IOW, the test suite tests for something that is just not possible. I hope this demonstration makes the change easier to review, but if you'd like to wait for additional reviewers, I understand :)

Member

Probably best to wait for another reviewer as I don't feel comfortable making that decision in isolation.

Member

@frenck frenck May 14, 2024

I kinda agree with bdraco, and judging by your comments, you do too:

> Is it really? 255.255.255.255 is "limited broadcast", and has been reserved from being used for unicast traffic for the past... ~40 years (RFC 919 section 7 dated October 1984).

Meaning this should have raised on this given test, as this is not a valid address for authentication. This test should be reinstated, and the code should be fixed to raise for this case.

Contributor Author

Thanks for the review! I'll admit that I'm not super familiar with indieauth, and I may be misunderstanding something. And I'd like to add code that I understand :)

First of all: the indieauth code accepts only "local" addresses for auth. The networking code previously (erroneously) considered this address "globally routable" and thus indieauth did not accept it. The address is definitely not globally routable. Is this all correct?

Second: should 255.255.255.255 be singled out in the indieauth code? Why? What's the real-life condition under which this address would be used? Basically, under which scenario would this code path be reached (outside of tests)?

@frenck frenck self-assigned this Apr 25, 2024
@paravoid
Contributor Author

Note that I see a lot more room for improvement in the future: util.network is both unnecessarily wrapping stdlib in most cases, and using conflicting (and confusing) terminology as compared to stdlib and IANA/IETF at the same time.

For example, HA's is_local is not about link-local or loopback, but rather about private addresses, i.e. stdlib's is_private. HA does have an is_private, but confusingly it is not equivalent to stdlib's is_private; rather, it is an implementation that is arguably not very useful semantically (as is perhaps evidenced by the fact that it is not used anywhere in the tree), and thus has no counterpart in stdlib.

Moreover, note that several components have opted into using stdlib methods directly instead of using homeassistant.util.network. For example, homeassistant.util.network.is_loopback is being used only by homeassistant.helpers.network, but ipaddress.is_loopback is used by dhcp, network(!), shelly, ssdp, systemmonitor, yeelight, zeroconf.

I have a plan to fix all that but I think this should be in a separate PR, and these should wait until this first step of a PR gets merged.

The steps I'm thinking of are, in the short-term:

  • is_invalid() is deprecated (using homeassistant.helpers.deprecation.deprecated_function) in favor of stdlib's is_unspecified property, as it's literally a wrapper around it. There are no users in-tree.
  • is_link_local() is deprecated, in favor of stdlib's is_link_local. There are two users in-tree (axis and lametric).
  • is_private() is deprecated. There are no users in-tree, no equivalent stdlib method.
  • is_local() is deprecated, in favor of stdlib's is_private, as they are (effectively) the same. auth/login_flow, auth/indieauth, emulated_hue, http and webhook are the in-tree users.

Longer-term:

  • My stdlib PR fixing is_loopback for IPv4-mapped IPv6 addresses was merged last week and backported to 3.12.x today, and thus will be part of 3.13 & 3.12.10. Once HA bumps its REQUIRED_PYTHON_VER to a version equal to or higher than either of these, is_loopback() can be deprecated as well and its sole user in the tree (helpers.network) converted over.
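
A deprecation along the lines of the steps above could look roughly like the following. It is a sketch that uses the stdlib warnings module as a stand-in for homeassistant.helpers.deprecation.deprecated_function, whose exact signature is not shown in this thread; the warning text is made up for illustration.

```python
from __future__ import annotations

import warnings
from ipaddress import IPv4Address, IPv6Address, ip_address


def is_invalid(address: IPv4Address | IPv6Address) -> bool:
    """Deprecated thin wrapper around stdlib's is_unspecified (illustrative only)."""
    warnings.warn(
        "is_invalid() is deprecated; use address.is_unspecified instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this wrapper
    )
    return address.is_unspecified


# Callers keep working during the deprecation period, but see a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = is_invalid(ip_address("0.0.0.0"))

print(result, [str(w.message) for w in caught])
```

Keeping the wrapper in place while warning gives third-party integrations time to switch to the stdlib property before the function is removed.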

@frenck frenck marked this pull request as draft May 14, 2024 20:26
@frenck
Member

frenck commented May 14, 2024

> is_private() is deprecated. There are no users in-tree, no equivalent stdlib method.

Please note that our use is not limited to this codebase. There are thousands of integrations in the wild that might rely on these.

@github-actions

There hasn't been any activity on this pull request recently. This pull request has been automatically marked as stale because of that and will be closed if no further activity occurs within 7 days.
If you are the author of this PR, please leave a comment if you want to keep it open. Also, please rebase your PR onto the latest dev branch to ensure that it's up to date with the latest changes.
Thank you for your contribution!

@github-actions github-actions bot added the stale label Jul 13, 2024
@paravoid
Contributor Author

I still think this is a good idea and needs to happen, and I'm happy to continue updating the PR as we go.

@github-actions github-actions bot removed the stale label Jul 14, 2024
@paravoid
Contributor Author

https://nvd.nist.gov/vuln/detail/CVE-2024-4032 against Python was published recently, recategorizing a few of the blocks referenced here. It came to my attention just yesterday due to a Debian Security Announcement.

Home Assistant would probably gain a similar CVE for what is described in this PR. ("affected the is_private and is_global properties [...] where values wouldn’t be returned in accordance with the latest information from the IANA Special-Purpose Address Registries."). I personally have my doubts over whether this was CVE-worthy and don't currently intend to seek a CVE for it. But I think it establishes that this PR requires a little bit of extra attention.

Practically speaking, I also think it speaks to how brittle categorizing IANA ranges can be, and why Home Assistant should not attempt this kind of categorization (is_private etc.) itself, but rely on stdlib instead.

@frenck frenck added this to the 2024.10.0b0 milestone Sep 19, 2024
@github-actions github-actions bot added the stale label Nov 18, 2024
@paravoid
Contributor Author

I'd still like to see this merged!

@github-actions github-actions bot removed the stale label Nov 18, 2024
@github-actions github-actions bot added the stale label Jan 17, 2025
@paravoid
Contributor Author

I still think this needs to happen and I'm happy to merge dev/rebase if/when required, but it'd be good to hear from a maintainer before I put more work into this 😄 Thank you for your time!

@github-actions github-actions bot removed the stale label Jan 21, 2025
@github-actions github-actions bot added the stale label Mar 23, 2025
@paravoid paravoid marked this pull request as ready for review March 24, 2025 17:58
@paravoid paravoid force-pushed the util-network-stdlib branch from 29515a0 to c0864d7 March 24, 2025 18:00
@paravoid
Contributor Author

Still not stale.

@github-actions github-actions bot removed the stale label Mar 25, 2025
@github-actions github-actions bot added the stale label May 24, 2025
@github-actions github-actions bot closed this May 31, 2025
@github-actions github-actions bot locked and limited conversation to collaborators Jun 1, 2025
Labels

bugfix, cla-signed, core, second-opinion-wanted, small-pr, stale

3 participants