Commit b41da31

feat: bot detection (#210)
1 parent 7056369 commit b41da31

22 files changed: +2486 -567 lines changed
Lines changed: 149 additions & 0 deletions
@@ -0,0 +1,149 @@
---
title: Bot Detection
description: Detect and classify bots with server-side header analysis and client-side browser fingerprinting.
---

## Introduction

The modern web is full of [bots](https://nuxtseo.com/learn/controlling-crawlers). Detecting them helps you understand your traffic and optimize your Nuxt app for both human users and automated agents.

Nuxt Robots provides bot detection on both the server and the client, ranging from simple heuristics based on the HTTP `User-Agent` header to advanced client-side detection using [BotD](https://github.com/fingerprintjs/BotD).

## Bot Categories

The module classifies bots into categories based on their intended purpose and trustworthiness. **Trusted bots** are legitimate services that respect robots.txt and provide value to websites, while **untrusted bots** include automation tools, scrapers, and potentially malicious crawlers.

### Search Engine Bots (trusted)
Search engines that index content for public search results and respect standard web protocols.
- **Google**: `googlebot`, `google.com/bot.html`
- **Bing**: `bingbot`, `msnbot`

### Social Media Bots (trusted)
Social platforms that crawl content for link previews, cards, and social sharing features.
- **Facebook**: `facebookexternalhit`, `facebook.com`
- **Twitter**: `twitterbot`, `twitter`

### SEO & Analytics Bots (trusted)
Professional SEO and analytics services that provide legitimate website analysis and insights.
- **Ahrefs**: `ahrefsbot`, `ahrefs.com`
- **Majestic**: `mj12bot`, `majestic12.co.uk/bot`

### AI & ML Bots (trusted)
AI companies and research organizations training models or providing AI-powered services.
- **OpenAI**: `gptbot`, `openai.com`
- **Anthropic**: `anthropic`

### Automation Tools (untrusted)
Browser automation and testing frameworks that may be used for legitimate testing or malicious scraping.
- **Selenium**: `selenium`, `webdriver`
- **Playwright**: `playwright`

### HTTP Tools (untrusted)
Command-line HTTP clients and programmatic request libraries often used for automated data extraction.
- **cURL**: `curl`
- **Python Requests**: `python-requests`, `python`

### Security Scanners (untrusted)
Network scanning and vulnerability assessment tools that may indicate malicious reconnaissance.
- **Nmap**: `nmap`, `insecure.org`
- **Nikto**: `nikto`

### Scraping Tools (untrusted)
Dedicated web scraping frameworks designed for automated data collection.
- **Scrapy**: `scrapy`, `scrapy.org`
- **Generic Scraper**: `scraper`

Missing a bot? Submit a quick PR :)
[View and contribute to bot definitions →](https://github.com/nuxt-modules/robots/blob/main/src/const-bots.ts)
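
To make the pattern lists above more concrete, here is a minimal sketch of how substring patterns could be matched against a `User-Agent` string. It is not the module's implementation: `BOT_DEFINITIONS`, `matchUserAgent`, the entry names, and the three-entry table are hypothetical, and the real definitions live in `src/const-bots.ts`.

```ts
// Hypothetical, trimmed-down pattern table for illustration only.
type BotCategory = 'search-engine' | 'social' | 'seo' | 'ai' | 'automation' | 'http-tool' | 'security-scanner'

interface BotDefinition {
  name: string
  category: BotCategory
  patterns: string[]
}

interface BotMatch {
  name: string
  category: BotCategory
}

const BOT_DEFINITIONS: BotDefinition[] = [
  { name: 'googlebot', category: 'search-engine', patterns: ['googlebot', 'google.com/bot.html'] },
  { name: 'gptbot', category: 'ai', patterns: ['gptbot', 'openai.com'] },
  { name: 'curl', category: 'http-tool', patterns: ['curl'] },
]

// Returns the first definition whose pattern appears in the lowercased User-Agent.
function matchUserAgent(userAgent: string | undefined): BotMatch | undefined {
  if (!userAgent)
    return undefined
  const ua = userAgent.toLowerCase()
  for (const def of BOT_DEFINITIONS) {
    if (def.patterns.some(pattern => ua.includes(pattern)))
      return { name: def.name, category: def.category }
  }
  return undefined
}
```

Substring matching like this only works when a client announces itself, which is why header-based detection cannot catch bots that spoof a browser `User-Agent`.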

## Nitro Bot Detection

Server-side detection relies solely on HTTP headers, so it only catches bots that identify themselves honestly in the `User-Agent` header.

You can detect bots inside a Nitro route, middleware, or API handler.

```ts
import { getBotDetection } from '#robots/server/composables/getBotDetection'

export default defineEventHandler((e) => {
  const detection = getBotDetection(e)

  if (detection.isBot) {
    return { message: `${detection.botName} bot detected`, category: detection.botCategory }
  }

  return { message: 'Human user' }
})
```

For full behavior, please consult the [`getBotDetection`](/docs/robots/nitro-api/get-bot-detection) API docs.
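
One possible use of the detection result is a simple blocking policy. The sketch below assumes only the three fields used in the handler above (`isBot`, `botName`, `botCategory`); the `DetectionResult` interface, `BLOCKED_CATEGORIES`, and `statusForDetection` are hypothetical names, and the choice of blocked categories is an example policy, not a recommendation from the module.

```ts
// Assumed shape: only the fields the handler above reads.
interface DetectionResult {
  isBot: boolean
  botName?: string
  botCategory?: string
}

// Hypothetical policy: categories this app refuses to serve.
const BLOCKED_CATEGORIES = new Set(['automation', 'http-tool', 'security-scanner'])

// Maps a detection result to the HTTP status the handler should respond with.
function statusForDetection(detection: DetectionResult): 200 | 403 {
  if (detection.isBot && detection.botCategory && BLOCKED_CATEGORIES.has(detection.botCategory))
    return 403
  return 200
}
```

Inside a Nitro handler you could call `statusForDetection(getBotDetection(e))` and, on `403`, throw h3's `createError({ statusCode: 403 })`. Keeping the policy in a pure function makes it easy to unit test without a running server.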

## Nuxt Bot Detection

In Nuxt, bot detection uses the `User-Agent` header by default. You can optionally use the [BotD](https://github.com/fingerprintjs/BotD) fingerprinting library to detect advanced automation tools by setting `fingerprint: true`.

```vue
<script setup lang="ts">
import { useBotDetection } from '#robots/app/composables/useBotDetection'

const { isBot, botName, botCategory, trusted } = useBotDetection({
  fingerprint: true, // also detect using BotD
})
</script>

<template>
  <div v-if="isBot">
    Bot detected: {{ botName }} ({{ botCategory }})
  </div>
</template>
```

See the [`useBotDetection()`](/docs/robots/api/use-bot-detection) API docs for full usage details.

## Fingerprinting with BotD

When `fingerprint: true` is set, the composable loads the [BotD](https://github.com/fingerprintjs/BotD) library once the window is idle and performs client-side fingerprinting to detect advanced bots and automation tools.

### Performance Considerations

Fingerprinting is computationally expensive for end users' CPUs, so be mindful of where you enable it. For example, consider enabling it only on sensitive pages where bot detection is critical.

That said, the composable aims to be performant: it caches the result in the user's local storage under the `'__nuxt_robots:botd'` key, so fingerprinting only runs once.

```ts
localStorage.getItem('__nuxt_robots:botd') // returns the cached bot detection result - used internally already
```
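
The run-once behavior described above can be sketched as a small cache wrapper. This is an illustration of the pattern, not the composable's code: `detectOnce`, `KeyValueStore`, and the JSON storage format are assumptions, and `detect` stands in for the expensive fingerprinting call. It is kept synchronous for brevity, whereas real fingerprinting completes later.

```ts
// Minimal subset of the Web Storage API so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem: (key: string) => string | null
  setItem: (key: string, value: string) => void
}

const CACHE_KEY = '__nuxt_robots:botd'

// Returns the cached result if one exists; otherwise runs `detect` and stores its result.
function detectOnce<T>(store: KeyValueStore, detect: () => T): T {
  const cached = store.getItem(CACHE_KEY)
  if (cached !== null) {
    try {
      return JSON.parse(cached) as T
    }
    catch {
      // Unreadable cache entry: fall through and detect again.
    }
  }
  const result = detect()
  store.setItem(CACHE_KEY, JSON.stringify(result))
  return result
}
```

In a browser, `window.localStorage` satisfies `KeyValueStore`, so it can be passed in directly. A consequence of this design is that a visitor keeps their cached verdict until the entry is cleared.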

### Watching For Fingerprinting

The properties returned from the composable are all `ref`s. If you're using fingerprinting, watch them for changes, because the results are not available immediately when the composable is called.

```ts
import { useBotDetection } from '#robots/app/composables/useBotDetection'
import { watch } from 'vue'

const { isBot } = useBotDetection({
  fingerprint: true,
})

watch(isBot, (detected) => {
  if (detected) {
    console.log(`Bot detected!`)
  }
})
```

Alternatively, you can use the `onFingerprintResult` callback to handle the result when fingerprinting completes.

```ts
import { useBotDetection } from '#robots/app/composables/useBotDetection'

const botd = useBotDetection({
  fingerprint: true,
  onFingerprintResult(result) {
    // Fingerprinting completed
    console.log('Detection result:', result)
  },
})
```
Lines changed: 200 additions & 0 deletions
@@ -0,0 +1,200 @@
---
title: useBotDetection()
description: A reactive composable for detecting and classifying bots with optional client-side fingerprinting.
---

## Introduction

**Type:** `function useBotDetection(options?: UseBotDetectionOptions): UseBotDetectionReturn`{lang="ts"}

```ts
import type { UseBotDetectionOptions, UseBotDetectionReturn } from '@nuxtjs/robots/util'
```

Detect and classify bots using server-side header analysis and optional client-side browser fingerprinting.

The composable provides reactive access to bot detection results with automatic caching. Client-side fingerprinting is opt-in due to performance costs.

**🔔 Important:** Bot detection only runs when you use this composable. No automatic bot detection occurs; it is entirely opt-in.

## Usage

**Basic detection:**

```ts
import { useBotDetection } from '#robots/app/composables/useBotDetection'

const { isBot, botName, botCategory, trusted } = useBotDetection()
// isBot: ComputedRef<boolean>
// botName: ComputedRef<BotName | undefined> // 'googlebot', 'facebook', etc.
// botCategory: ComputedRef<BotCategory | undefined> // 'search-engine', 'social', etc.
// trusted: ComputedRef<boolean | undefined>
```

**With fingerprinting:**

```ts
import { useBotDetection } from '#robots/app/composables/useBotDetection'

const { isBot, botName, botCategory, trusted, reset } = useBotDetection({
  fingerprint: true,
  onFingerprintError: (error) => {
    console.error('Fingerprint error:', error)
  },
  onFingerprintResult: (result) => {
    console.log('Fingerprinting completed:', result)
  }
})
```

**Watching for changes:**

```ts
import { useBotDetection } from '#robots/app/composables/useBotDetection'
import { watch } from 'vue'

const { isBot, botName, botCategory } = useBotDetection()

watch(isBot, (detected) => {
  if (detected) {
    console.log(`Bot: ${botName.value} (${botCategory.value})`)
  }
})
```

## Options

```ts
interface UseBotDetectionOptions {
  fingerprint?: boolean
  onFingerprintError?: (error: Error) => void
  onFingerprintResult?: (result: BotDetectionContext | null) => void
}
```

### `fingerprint`

**Type:** `boolean`
**Default:** `false`

Enable automatic client-side fingerprinting when no bot is detected server-side.

### `onFingerprintError`

**Type:** `(error: Error) => void`

Error handler for fingerprinting failures.

### `onFingerprintResult`

**Type:** `(result: BotDetectionContext | null) => void`

Callback that fires when fingerprinting completes, providing the final detection result.
93+
94+
## Return Type
95+
96+
```ts
97+
interface UseBotDetectionReturn {
98+
isBot: ComputedRef<boolean>
99+
botName: ComputedRef<BotName | undefined>
100+
botCategory: ComputedRef<BotCategory | undefined>
101+
trusted: ComputedRef<boolean | undefined>
102+
reset: () => void
103+
}
104+
```
105+
106+
## Return Value
107+
108+
### `isBot`
109+
110+
**Type:** `ComputedRef<boolean>`
111+
112+
Reactive boolean indicating whether a bot was detected.
113+
114+
### `botName`
115+
116+
**Type:** `ComputedRef<BotName | undefined>`
117+
118+
The specific bot identity (e.g., 'googlebot', 'facebook', 'claude', 'selenium'). `undefined` if no bot detected.
119+
120+
### `botCategory`
121+
122+
**Type:** `ComputedRef<BotCategory | undefined>`
123+
124+
The bot category/purpose (e.g., 'search-engine', 'social', 'ai', 'automation'). `undefined` if no bot detected.
125+
126+
### `trusted`
127+
128+
**Type:** `ComputedRef<boolean | undefined>`
129+
130+
Whether the detected bot is considered trusted. `undefined` if no bot detected.
131+
132+
### `reset()`
133+
134+
**Type:** `() => void`
135+
136+
Clear all detection state and cached results.
137+
138+
## Server Side Behavior
139+
140+
On the server, bot detection runs when you use the composable:
141+
142+
```ts
143+
import { useBotDetection } from '#robots/app/composables/useBotDetection'
144+
145+
// Only runs when composable is used
146+
const { isBot, botName, botCategory } = useBotDetection()
147+
148+
if (isBot.value) {
149+
// Bot detected via server-side analysis
150+
console.log('Bot:', botName.value, 'Category:', botCategory.value)
151+
}
152+
```
153+
154+
## Client Side Behavior
155+
156+
Client-side fingerprinting is automatic when enabled:
157+
158+
```ts
159+
import { useBotDetection } from '#robots/app/composables/useBotDetection'
160+
161+
const { isBot, botName, botCategory } = useBotDetection({
162+
fingerprint: true,
163+
onFingerprintError: (error) => {
164+
console.error('Fingerprinting failed:', error)
165+
}
166+
})
167+
168+
// Fingerprinting runs automatically if no server detection occurred
169+
```
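
The condition described above (fingerprint only on the client, only when opted in, and only when the server did not already flag a bot) can be written as a single predicate. This is a sketch under those stated assumptions; `shouldFingerprint` and `ServerDetection` are hypothetical and not part of the module's API.

```ts
// Assumed minimal shape of the server-side result.
interface ServerDetection {
  isBot: boolean
}

// Hypothetical helper: decides whether client-side fingerprinting should start.
function shouldFingerprint(
  options: { fingerprint?: boolean },
  server: ServerDetection,
  isClient: boolean,
): boolean {
  // Needs a browser, must be opted into, and is skipped once the server has flagged a bot.
  return isClient && options.fingerprint === true && !server.isBot
}
```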

## Configuration

### Disabling Bot Detection

You can disable the entire bot detection plugin:

```ts
// nuxt.config.ts
import { defineNuxtConfig } from 'nuxt/config'

export default defineNuxtConfig({
  robots: {
    botDetection: false
  }
})
```

When disabled, the `useBotDetection` composable will not be available.

## Bot Categories

The following bot types are detected:

- **search-engine**: Google, Bing, Yandex crawlers
- **social**: Twitter, Facebook, LinkedIn bots
- **seo**: Ahrefs, SEMrush, Majestic tools
- **ai**: GPT, Claude, Perplexity crawlers
- **automation**: Selenium, Puppeteer, WebDriver
- **security-scanner**: nmap, Nikto, ZGrab
- **http-tool**: curl, wget, Python requests
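
As a rough illustration of how these category identifiers could relate to the `trusted` flag, the sketch below maps each category to a trust value, following the trusted/untrusted grouping in the Bot Detection guide. The `BotCategory` union here is rebuilt from the list above, and `TRUSTED_BY_CATEGORY` and `isTrustedCategory` are hypothetical names; the module's own types and logic may differ.

```ts
// Union rebuilt from the list above; the module's exported type may contain more members.
type BotCategory = 'search-engine' | 'social' | 'seo' | 'ai' | 'automation' | 'security-scanner' | 'http-tool'

// Record<BotCategory, boolean> makes the compiler flag any category left unmapped.
const TRUSTED_BY_CATEGORY: Record<BotCategory, boolean> = {
  'search-engine': true,
  'social': true,
  'seo': true,
  'ai': true,
  'automation': false,
  'security-scanner': false,
  'http-tool': false,
}

// Mirrors the `boolean | undefined` shape of `trusted`: undefined when no bot was detected.
function isTrustedCategory(category: BotCategory | undefined): boolean | undefined {
  return category === undefined ? undefined : TRUSTED_BY_CATEGORY[category]
}
```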
