We found that we hit a lot of image URLs, which we won't store anyway, so we should filter URLs by file type before fetching them.
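
A minimal sketch of what such a filter could look like, assuming the file type can be read from the extension in the URL path. The names `IMAGE_EXTENSIONS`, `is_image_url`, and `filter_urls` are hypothetical, and the extension list is an assumption to adjust to whatever we actually see:

```python
from posixpath import splitext
from urllib.parse import urlsplit

# Assumed set of image extensions to skip; not taken from any existing config.
IMAGE_EXTENSIONS = frozenset(
    {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp", ".svg", ".ico"}
)


def is_image_url(url: str) -> bool:
    """Return True if the URL path ends in a known image extension."""
    # urlsplit drops the query string and fragment, so "a.png?x=1" still matches.
    path = urlsplit(url).path
    _, ext = splitext(path)
    return ext.lower() in IMAGE_EXTENSIONS


def filter_urls(urls):
    """Keep only the URLs that do not look like images, preserving order."""
    return [url for url in urls if not is_image_url(url)]


if __name__ == "__main__":
    candidates = [
        "https://example.com/index.html",
        "https://example.com/img/logo.PNG?v=2",
        "https://example.com/docs/",
    ]
    for url in filter_urls(candidates):
        print(url)
```

This only looks at the URL, so images served from paths without an extension would still get through; catching those would need a check on the response's `Content-Type` header.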