Open
Description
I am crawling a website to find all the pages that return 404, but the site redirects missing pages (302) to a pretty "sorry for 404" page. Is there a way to detect links that get redirected like this, and to log them?
I was running a small Python script like this:
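import requests

link = 'https://example/1234sdsd'
# Do not follow redirects, so the 302 itself is visible
r = requests.get(link, allow_redirects=False)
# .get() avoids a KeyError when the response has no Location header
print(link, r.status_code, r.headers.get('Location'))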
The printed output looks like this: "https://example/1234sdsd 302 /404.aspx?item=%2f1234sdsd&user=extranet%5cAnonymous&site=website"
I was looking for something like this from the crawler:
"302 - original link (1 of 1669 - 0%)"
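For illustration, here is a minimal standalone sketch of the check I have in mind. It is not the crawler's own API. The names `is_soft_404`, `check_link`, `REDIRECT_CODES` and `SOFT_404_PATH` are hypothetical, and the `/404.aspx` path is an assumption taken from the output above.

```python
from urllib.parse import urlsplit

# Status codes treated as redirects in this sketch.
REDIRECT_CODES = {301, 302, 303, 307, 308}
# Assumption: the pretty 404 page lives at /404.aspx, as in the output above.
SOFT_404_PATH = "/404.aspx"


def is_soft_404(status_code, location):
    """Return True if the response is a redirect to the pretty 404 page."""
    if status_code not in REDIRECT_CODES or not location:
        return False
    # Location may be relative or absolute, so compare only the path part.
    return urlsplit(location).path.lower() == SOFT_404_PATH


def check_link(link):
    """Fetch one link without following redirects.

    Returns (status_code, link, redirected_to_404_page).
    """
    # Third-party dependency, imported here so is_soft_404 works without it.
    import requests

    r = requests.get(link, allow_redirects=False, timeout=10)
    return r.status_code, link, is_soft_404(r.status_code, r.headers.get("Location"))
```

Calling `check_link('https://example/1234sdsd')` on a site that behaves like the one above should give a tuple whose last element is `True`, which could then be written to the log in the "302 - original link" style.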