Baidu: not all their IPs pass forward confirmation #4

@fabiankessler

Example IP that works:
123.125.66.120 resolves to baiduspider-123-125-66-120.crawl.baidu.com
and baiduspider-123-125-66-120.crawl.baidu.com points to 123.125.66.120

Example IP that fails:
180.76.15.14 resolves to baiduspider-180-76-15-14.crawl.baidu.com
but baiduspider-180-76-15-14.crawl.baidu.com points to nothing

Baidu's documentation only says to check the reverse DNS, not the forward confirmation; see http://help.baidu.com/question?prod_en=master&class=498&id=1000973

Also see https://en.wikipedia.org/wiki/Forward-confirmed_reverse_DNS
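
For reference, the check behind the two examples above can be sketched in a few lines of Python. This is only an illustration, not this project's code: the function names `reverse_lookup`, `forward_lookup` and `forward_confirmed` are made up for the sketch, and the resolvers are passed in as parameters so the logic can be tried without touching real DNS.

```python
import socket


def reverse_lookup(ip):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def forward_lookup(hostname):
    """Return the set of addresses hostname resolves to (empty if none)."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except OSError:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return {info[4][0] for info in infos}


def forward_confirmed(ip, reverse=reverse_lookup, forward=forward_lookup):
    """True only if ip -> hostname -> addresses leads back to ip."""
    hostname = reverse(ip)
    if hostname is None:
        return False
    return ip in forward(hostname)
```

With the DNS data as I observed it, `forward_confirmed("123.125.66.120")` would pass and `forward_confirmed("180.76.15.14")` would not, because the second hostname has no forward record. A reverse-only check, which is all Baidu describes, would accept both.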

I believe that those 180.76 IPs belong to Baidu, but I'm not sure.
Right now they are classified as IMPERSONATOR.

If the IPs do belong to Baidu, then in my opinion Baidu should fix this on their side.
Their English feedback page http://webmaster.baidu.com/feedback/index currently redirects to a Chinese page.

On our side we could return these as KnownCrawlerResultStatus.FAILED when the hostname does not resolve to any IP.
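
To make the proposal concrete, here is a rough Python sketch of the three-way decision. It is an assumption-laden illustration, not the library's code: the `Status` enum and `classify` function are hypothetical, `VERIFIED` is a placeholder name I made up for the success case, and only `IMPERSONATOR` and `FAILED` correspond to statuses mentioned in this issue.

```python
from enum import Enum


class Status(Enum):
    VERIFIED = "verified"          # placeholder name for the success case
    IMPERSONATOR = "impersonator"  # forward lookup contradicts the claim
    FAILED = "failed"              # forward lookup returned nothing


def classify(ip, forward_ips):
    """Classify ip given the addresses its reverse-DNS hostname resolves to.

    forward_ips is the (possibly empty) set of addresses returned by the
    forward lookup of the hostname that the reverse lookup produced.
    """
    if not forward_ips:
        # Proposed change: no forward record means "could not verify",
        # rather than "impersonator".
        return Status.FAILED
    if ip in forward_ips:
        return Status.VERIFIED
    return Status.IMPERSONATOR
```

One caveat with this design: whoever controls the reverse DNS for an IP can set a PTR record that names a non-existent Baidu host, so FAILED must never be treated as equivalent to verified. It only separates "could not confirm" from "confirmed to be someone else".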

Opinions?
