Skip to content

URL: Relax check to ignore semicolons in URLs#116

Merged
max-moser merged 1 commit intoinveniosoftware:masterfrom
max-moser:mm/is_url
Jan 28, 2025
Merged

URL: Relax check to ignore semicolons in URLs#116
max-moser merged 1 commit intoinveniosoftware:masterfrom
max-moser:mm/is_url

Conversation

@max-moser
Copy link
Contributor

Closes #115

@tmorrell
Copy link
Contributor

Yup, another piece of evidence this change is needed. Could you add in the test from https://github.com/inveniosoftware/idutils/pull/92/files, and then we can get this merged finally.

@max-moser
Copy link
Contributor Author

Actually, your PR has more goodies than this one. Closing in favor of yours.

@max-moser max-moser closed this Jan 28, 2025
@max-moser max-moser reopened this Jan 28, 2025
@max-moser
Copy link
Contributor Author

@tmorrell I tried rebasing your PR on master, but was denied permissions to push to the branch.
As such, I'll update & merge this PR instead; I interpret your PR being basically the same as an implicit approval :)

* according to RFC 1738 section 2.2, semicolons are reserved but valid
  characters in URIs
* `urllib.parse.urlparse()` interprets semicolons as `params`
* thus, the check for no params in `is_url()` returns `False` for some
  valid URLs
@max-moser max-moser merged commit ffd5a6b into inveniosoftware:master Jan 28, 2025
2 checks passed
@max-moser max-moser deleted the mm/is_url branch January 28, 2025 13:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

URLs with semicolons are reported as invalid

2 participants