Commit ffd5a6b

url: relax check to ignore semicolons in URLs
* according to RFC 1738 section 2.2, semicolons are reserved but valid characters in URIs
* `urllib.parse.urlparse()` interprets semicolons as `params`
* thus, the check for no params in `is_url()` returns `False` for some valid URLs
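
To illustrate the parsing behaviour described in the commit message, here is a minimal sketch using only the standard library. The `example.org` URL is a made-up value for illustration and is not taken from the test data in this repository.

```python
from urllib.parse import urlparse

# Hypothetical URL with a semicolon in the last path segment.
url = "https://example.org/record;version=2?download=1"

parts = urlparse(url)

# urlparse() splits the text after ';' in the last path segment into
# `params`, so it no longer appears in `path`.
print("path:  ", parts.path)
print("params:", parts.params)
print("query: ", parts.query)
```

Because `params` is non-empty for such a URL, any check that requires `res.params == ""` rejects it even though the scheme and network location are present.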
1 parent 181b3db commit ffd5a6b

3 files changed: 12 additions, 1 deletion
CHANGES.rst

Lines changed: 4 additions & 0 deletions
@@ -15,6 +15,10 @@
 Changes
 =======
 
+Version <next>
+
+- is_url: allow URL parameters (i.e. semicolon)
+
 Version 1.4.2 (2024-11-01)
 
 - setup: remove pytest-invenio to make imports cleaner

idutils/validators.py

Lines changed: 1 addition & 1 deletion
@@ -172,7 +172,7 @@ def is_purl(val):
 def is_url(val):
     """Test if argument is a URL."""
     res = urlparse(val)
-    return bool(res.scheme and res.netloc and res.params == "")
+    return bool(res.scheme and res.netloc)
 
 
 def is_lsid(val):
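
The effect of this one-line change can be sketched by placing the old and new checks side by side. The function names `is_url_strict` and `is_url_relaxed` are hypothetical labels for this comparison only (the library exposes a single `is_url`), and the sample URLs are made up.

```python
from urllib.parse import urlparse


def is_url_strict(val):
    """Check as it stood before this commit: rejects any URL with params."""
    res = urlparse(val)
    return bool(res.scheme and res.netloc and res.params == "")


def is_url_relaxed(val):
    """Check as it stands after this commit: params are ignored."""
    res = urlparse(val)
    return bool(res.scheme and res.netloc)


# Hypothetical sample values, not taken from the project's test data.
samples = [
    "https://example.org/record",
    "https://example.org/record;version=2",
    "not a url",
]

for sample in samples:
    print(sample, is_url_strict(sample), is_url_relaxed(sample))
```

Only the URL containing a semicolon in its last path segment is treated differently by the two checks; inputs without a scheme and network location are still rejected by both.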

tests/test_idutils.py

Lines changed: 7 additions & 0 deletions
@@ -870,6 +870,13 @@ def test_doi():
     assert not idutils.is_doi("10.1.NOTGOOD.0/123456")
 
 
+def test_url():
+    """Test URL validation."""
+    for i, expected_schemes, normalized_value, url_value in identifiers:
+        if url_value:
+            assert idutils.is_url(url_value)
+
+
 def test_ascl():
     """Test ASCL validation."""
     assert idutils.is_ascl("ascl:1908.011")