Commit d6e2e28

fix(uptime): If we fail to fetch robots.txt we should allow the url to be processed. (#74655)
We shouldn't block a URL from detection if we fail to fetch the robots.txt for that site.
1 parent b6f6732 commit d6e2e28

2 files changed: 10 additions, 1 deletion

src/sentry/uptime/detectors/tasks.py

Lines changed: 1 addition & 1 deletion

@@ -267,7 +267,7 @@ def check_url_robots_txt(url: str) -> bool:
         return get_robots_txt_parser(url).can_fetch(UPTIME_USER_AGENT, url)
     except Exception:
         logger.warning("Failed to check robots.txt", exc_info=True)
-        return False
+        return True


 def get_robots_txt_parser(url: str) -> RobotFileParser:
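For context, after this change check_url_robots_txt fails open: any error while fetching or parsing robots.txt is logged and the URL is treated as fetchable. Below is a minimal, self-contained sketch of that behavior; the UPTIME_USER_AGENT value and the body of get_robots_txt_parser are illustrative stand-ins, not the real sentry implementation.

    import logging
    from urllib.robotparser import RobotFileParser

    logger = logging.getLogger(__name__)
    UPTIME_USER_AGENT = "SentryUptimeBot"  # illustrative value, not the real constant


    def get_robots_txt_parser(url: str) -> RobotFileParser:
        # Stand-in for the real helper: point a stdlib parser at the site's robots.txt.
        parser = RobotFileParser()
        parser.set_url(url.rstrip("/") + "/robots.txt")
        parser.read()  # may raise on network errors; caught by the caller below
        return parser


    def check_url_robots_txt(url: str) -> bool:
        try:
            return get_robots_txt_parser(url).can_fetch(UPTIME_USER_AGENT, url)
        except Exception:
            # Fail open: an unreachable or malformed robots.txt should not block detection.
            logger.warning("Failed to check robots.txt", exc_info=True)
            return True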

tests/sentry/uptime/detectors/test_tasks.py

Lines changed: 9 additions & 0 deletions

@@ -307,6 +307,15 @@ def test_no_robots_txt(self):
         ):
             assert process_candidate_url(self.project, 100, url, 50)

+    def test_error_robots_txt(self):
+        # Failing to fetch robots.txt should allow all urls
+        url = "https://sentry.io"
+        with mock.patch(
+            "sentry.uptime.detectors.tasks.get_robots_txt_parser",
+            side_effect=Exception("Robots.txt fetch failed"),
+        ):
+            assert process_candidate_url(self.project, 100, url, 50)
+

 class TestFailedUrl(TestCase):
     def test(self):
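The added test simulates the failure by patching the parser factory to raise via side_effect, then asserts the URL is still processed. The same technique can be reproduced in isolation against the sketch above; uptime_sketch below is a hypothetical module name holding that sketch, not part of the sentry codebase.

    from unittest import mock

    import uptime_sketch  # hypothetical module containing the earlier sketch


    def test_robots_txt_fetch_error_fails_open() -> None:
        # side_effect raises when the patched factory is called,
        # simulating a failed robots.txt fetch.
        with mock.patch(
            "uptime_sketch.get_robots_txt_parser",
            side_effect=Exception("Robots.txt fetch failed"),
        ):
            assert uptime_sketch.check_url_robots_txt("https://sentry.io") is True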
