Hi,
Let's take a look at the following example from Google's robots.txt documentation:

- robots.txt location: `http://example.com/robots.txt`
- Valid for:
  - `http://example.com/`
  - `http://example.com/folder/file`
- Not valid for:
  - `http://other.example.com/`
  - `https://example.com/`
  - `http://example.com:8181/`
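The scope rule above (scheme, host, and port must all match) can be sketched with the standard library alone; the `in_scope` helper name is mine, not part of reppy:

```python
from urllib.parse import urlsplit

def in_scope(robots_url, page_url):
    """Return True if page_url falls under the authority of robots_url.

    Per the rules quoted above, a robots.txt only governs URLs with the
    same scheme, host, and port as the robots.txt URL itself.
    """
    a, b = urlsplit(robots_url), urlsplit(page_url)
    return (a.scheme, a.hostname, a.port) == (b.scheme, b.hostname, b.port)
```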
For instance, when asked whether a page on `http://other.example.com/` is allowed, reppy returns `False`, even though that host is outside the scope of this robots.txt. It should either return `True` or raise an exception, but definitely not `False`. Returning `False` is incorrect because robots.txt is a blacklist of disallowed paths, not a whitelist: a URL that a robots.txt does not govern cannot be disallowed by it.
Here is an example:

```python
from reppy.robots import Robots

robots_content = 'Disallow: /abc'
robots = Robots.parse('http://example.com/robots.txt', robots_content)

print(robots.allowed('http://example.com/', '*'))
# True (**correct**)
print(robots.allowed('http://other.example.com/', '*'))
# False (**incorrect**)
print(robots.allowed('http://apple.com/', '*'))
# False (**incorrect**)
```
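Until this is fixed, a caller-side workaround is to do the scope check before consulting the parser at all. A minimal sketch of the intended semantics, where `allowed_by_rules` is a hypothetical stand-in for the parser's path-based decision (not a reppy API):

```python
from urllib.parse import urlsplit

def allowed_with_scope(robots_url, page_url, allowed_by_rules):
    """Sketch of the behavior this issue asks for.

    A URL outside the robots.txt's scheme/host/port scope is not
    governed by it, so it must not be reported as disallowed.
    `allowed_by_rules` stands in for the path-based rule check.
    """
    a, b = urlsplit(robots_url), urlsplit(page_url)
    if (a.scheme, a.hostname, a.port) != (b.scheme, b.hostname, b.port):
        return True  # out of scope: robots.txt is not a whitelist
    return allowed_by_rules(b.path)
```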