@aisk
Contributor

@aisk aisk commented Dec 17, 2023

@erlend-aasland
Contributor

Can you update the PR title to more accurately (and succinctly) describe the change?

@aisk aisk changed the title gh-111788: fix a bug that urllib.robotparser will raise exception whe… gh-111788: Don't treat path in robots.txt as URL in urllib.robotparser Jan 10, 2024
@aisk
Contributor Author

aisk commented Jan 10, 2024

Thanks for the review, updated!

@aisk aisk changed the title gh-111788: Don't treat path in robots.txt as URL in urllib.robotparser gh-111788: Don't treat path in robots.txt as URL in urllib.robotparser Jan 10, 2024
Member

@serhiy-storchaka serhiy-storchaka left a comment

Nice PR, but the path can already be corrupted by applying unquote(), which can decode %3F in the path to a spurious ?. Fixing this requires larger changes. See #138502, which fixes several issues including this one.
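
To illustrate the point about unquote(), here is a minimal sketch using only urllib.parse. It does not reproduce the internals of urllib.robotparser; the sample path `/search%3Fq=1` is a made-up value chosen for illustration. It shows that decoding an escaped `%3F` before splitting produces a `?` that is then read as the start of a query string.

```python
from urllib.parse import unquote, urlsplit

# Hypothetical robots.txt path containing an escaped "?" (%3F).
raw_path = "/search%3Fq=1"

# Decoding first turns the escaped character into a literal "?".
decoded = unquote(raw_path)  # "/search?q=1"

# Splitting the raw path keeps everything in the path component.
parts_raw = urlsplit(raw_path)

# Splitting the decoded path treats the "?" as a query delimiter,
# so the path is cut short and the rest becomes a query string.
parts_decoded = urlsplit(decoded)

print(parts_raw.path, repr(parts_raw.query))
print(parts_decoded.path, repr(parts_decoded.query))
```

In this sketch the raw form keeps `/search%3Fq=1` as the path, while the decoded form yields the path `/search` and the query `q=1`, so the two no longer describe the same rule.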

@serhiy-storchaka
Member

Fixed as a part of #138502.
