Commit 3f39571

[Radar] Add crawlers section to glossary
1 parent 349a6ac commit 3f39571

1 file changed

+8
-0
lines changed

src/content/docs/radar/glossary.mdx

Lines changed: 8 additions & 0 deletions
@@ -244,6 +244,14 @@ Each entry on the Verified Bots list exists because a corresponding IP address w
 
 The data displayed on domain-specific geographic traffic patterns is based solely on data from our recursive DNS services. All data displayed is in accordance with our privacy policies and commitments. This data may include attack traffic and cross-origin requests.
 
+## Web Crawlers
+
+[Web crawlers](https://www.cloudflare.com/learning/bots/what-is-a-web-crawler/) are a type of bot that browse the Internet to collect and index website content. They are mainly used by search engines like Google or Bing to make pages discoverable in search results.
+
+They are also used by AI platforms, either to gather content for training large language models, or to retrieve up-to-date information for AI assistants. In both search and AI cases, crawlers work by following links from one page to another, creating a map of online content.
+
+The crawl-to-refer ratio metric is calculated by first mapping crawl requests for HTML pages based on the `User-Agent` header, and referral requests for HTML pages based on the `Referer` header, by platform (e.g., the ratio for Google is based on crawl requests from Googlebot, and referral requests from Google platforms).
+
 ## WHOIS
 
 WHOIS is a standard for publishing the contact and nameserver information for all registered domains. Each registrar maintains their own WHOIS service. Anyone can query the registrar's WHOIS service to reveal the data behind a given domain.
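
The crawl-to-refer entry added in this commit describes only the mapping step: crawl requests are attributed to a platform by `User-Agent`, and referral requests by `Referer`. A minimal Python sketch of that idea follows. The lookup tables, function names, and request shape are hypothetical, and the final division (crawl count over referral count) is an assumption inferred from the metric's name, not something the glossary text states.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical lookup tables; the glossary entry does not list the real ones.
CRAWLER_UA_TOKENS = {"Googlebot": "Google", "bingbot": "Bing"}
REFERRER_HOSTS = {"www.google.com": "Google", "www.bing.com": "Bing"}


def platform_from_user_agent(user_agent):
    """Return the platform whose crawler token appears in the User-Agent, if any."""
    ua = user_agent.lower()
    for token, platform in CRAWLER_UA_TOKENS.items():
        if token.lower() in ua:
            return platform
    return None


def platform_from_referer(referer):
    """Return the platform that owns the Referer host, if any."""
    host = urlparse(referer).hostname  # None for an empty or host-less value
    return REFERRER_HOSTS.get(host)


def crawl_to_refer_ratios(requests):
    """Compute per-platform ratios from HTML page requests.

    `requests` is an iterable of dicts with optional 'user_agent' and
    'referer' keys (an assumed shape, for illustration only).
    """
    crawls, referrals = Counter(), Counter()
    for req in requests:
        crawler = platform_from_user_agent(req.get("user_agent", ""))
        if crawler:
            crawls[crawler] += 1
            continue  # a crawl request is not also counted as a referral
        referrer = platform_from_referer(req.get("referer", ""))
        if referrer:
            referrals[referrer] += 1
    # Assumption: ratio = crawls / referrals; platforms with no referrals are skipped.
    return {p: crawls[p] / referrals[p] for p in crawls if referrals[p]}
```

Platforms with zero referral requests are omitted here to avoid dividing by zero; how the real metric treats that case is not covered by the glossary text.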