Commit a1a69e1

Guard against wide-domain checks
Previously, we would parse the malicious URL example with an authority of 'user:[email protected]\\test.corp.google.com:8080' but we would not parse that into its components because it had invalid characters. So accessing the `host` attribute would result in `None`. That said, someone might still have used the `authority` attribute and been misled.

To avoid misuse by developers, let's parse this similarly to the fix in the blog post.

See also:
- https://bugs.xdavidhu.me/google/2020/03/08/the-unexpected-google-wide-domain-check-bypass/

1 parent 9a87fd6 commit a1a69e1

2 files changed: +13 -1 lines changed

src/rfc3986/abnf_regexp.py

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@
 # than appear in Appendix B for scheme. This will prevent over-eager
 # consuming of items that aren't schemes.
 SCHEME_RE = '[a-zA-Z][a-zA-Z0-9+.-]*'
-_AUTHORITY_RE = '[^/?#]*'
+_AUTHORITY_RE = '[^\\\\/?#]*'
 _PATH_RE = '[^?#]*'
 _QUERY_RE = '[^#]*'
 _FRAGMENT_RE = '.*'
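
The effect of the one-character-class change can be checked with the standard `re` module alone. The following is a minimal sketch, not the library's actual parsing code: the two authority patterns are copied from the diff, while `extract_authority`, the simplified scheme-plus-authority pattern, and the placeholder hosts are assumptions made for illustration.

```python
import re

# Character classes copied from the diff. The surrounding pattern below is a
# simplified stand-in for the library's full URI regex (an assumption).
OLD_AUTHORITY_RE = '[^/?#]*'
NEW_AUTHORITY_RE = '[^\\\\/?#]*'


def extract_authority(url, authority_re):
    """Return the text matched as the authority after '//' (hypothetical helper)."""
    match = re.match(r'[a-zA-Z][a-zA-Z0-9+.-]*://(' + authority_re + ')', url)
    return match.group(1) if match else None


# Illustrative URL with placeholder hosts, not the one used in the test suite.
url = 'https://attacker.example\\trusted.example:8080/path'

old_authority = extract_authority(url, OLD_AUTHORITY_RE)
new_authority = extract_authority(url, NEW_AUTHORITY_RE)
print(old_authority)  # attacker.example\trusted.example:8080
print(new_authority)  # attacker.example
```

With the old pattern the backslash and everything after it stay inside the authority, which is the misleading value the commit message describes. With the new pattern the authority match stops at the backslash.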

tests/test_uri.py

Lines changed: 12 additions & 0 deletions
@@ -351,3 +351,15 @@ def test_empty_querystrings_persist():
     ref = URIReference.from_string(url)
     assert ref.query == ''
     assert ref.unsplit() == url
+
+
+def test_wide_domain_bypass_check():
+    """Verify we properly parse/handle the authority.
+
+    See also:
+    https://bugs.xdavidhu.me/google/2020/03/08/the-unexpected-google-wide-domain-check-bypass/
+    """
+    url = "https://user:[email protected]\\test.corp.google.com:8080/path/to/something?param=value#hash"
+    ref = URIReference.from_string(url)
+    assert ref.scheme == "https"
+    assert ref.host == "xdavidhu.me"
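
To see why the test can now assert a concrete `host`, it may help to split a truncated authority into its components by hand. This is a rough sketch under stated assumptions: `split_authority` is a hypothetical helper rather than part of `rfc3986`, it ignores IPv6 literals and percent-encoding, and the URL uses placeholder credentials and hosts.

```python
import re

# Authority character class after this commit (copied from the diff).
AUTHORITY_RE = '[^\\\\/?#]*'


def split_authority(url):
    """Return (userinfo, host, port) for a URL; a simplified, hypothetical helper."""
    match = re.match(r'[a-zA-Z][a-zA-Z0-9+.-]*://(' + AUTHORITY_RE + ')', url)
    if match is None:
        return None, None, None
    authority = match.group(1)
    # Userinfo is everything before the last '@', if one is present.
    userinfo, at, hostport = authority.rpartition('@')
    # Anything after the first ':' is treated as the port (no IPv6 support here).
    host, colon, port = hostport.partition(':')
    return (userinfo if at else None), host, (port if colon else None)


# Placeholder credentials and hosts, chosen only for illustration.
url = 'https://user:secret@attacker.example\\trusted.example:8080/path?q=1#frag'
userinfo, host, port = split_authority(url)
print(host)  # attacker.example
```

Because the authority now ends at the backslash, the remaining text contains only valid characters and splits cleanly, so the host is the attacker-controlled name rather than `None`. The port after the backslash is no longer part of the authority in this sketch.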
