| 1 | +# Hall Analytics Integration Guide for Astro Documentation Site |
| 2 | + |
| 3 | +This guide explains how to integrate [Hall Analytics](https://docs.usehall.com/) with your Astro documentation site to track web traffic and understand visitor behavior, especially traffic from AI assistants and crawlers. |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +Hall Analytics provides: |
| 8 | +- **AI Traffic Analytics** - Track visits from conversational AI platforms |
| 9 | +- **Crawler Detection** - Identify AI assistants, agents, and crawlers |
| 10 | +- **Referral Analysis** - Understand traffic sources and patterns |
| 11 | +- **Real-time Tracking** - Monitor web traffic as it happens |
| 12 | + |
| 13 | +## Setup Instructions |
| 14 | + |
| 15 | +### 1. Environment Variables |
| 16 | + |
| 17 | +Create a `.env` file in your project root with your Hall API key: |
| 18 | + |
| 19 | +```bash |
| 20 | +# Hall Analytics Configuration |
| 21 | +HALL_API_KEY=your_hall_api_key_here |
| 22 | +``` |
| 23 | + |
| 24 | +### 2. Get Your Hall API Key |
| 25 | + |
| 26 | +1. Sign up at [Hall Dashboard](https://app.usehall.com) |
| 27 | +2. Create a new project |
| 28 | +3. Copy your API key from the dashboard |
| 29 | +4. Add it to your `.env` file |
| 30 | + |
| 31 | +### 3. Implementation |
| 32 | + |
| 33 | +The integration has been implemented in `src/middleware.ts` and will automatically: |
| 34 | + |
| 35 | +- **Track all page visits** to your documentation |
| 36 | +- **Extract IP addresses** from various proxy headers |
| 37 | +- **Capture request headers** for detailed analytics |
| 38 | +- **Send data asynchronously** without blocking page loads |
| 39 | +- **Handle errors gracefully** without affecting user experience |
| 40 | + |
| 41 | +## How It Works |
| 42 | + |
| 43 | +### Analytics Data Collected |
| 44 | + |
| 45 | +For each page visit, the middleware sends the following data to Hall: |
| 46 | + |
| 47 | +```typescript |
| 48 | +{ |
| 49 | + request_path: "/docs/get-started/", |
| 50 | + request_method: "GET", |
| 51 | + request_ip: "192.168.1.1", |
| 52 | + request_headers: { |
| 53 | + "User-Agent": "Mozilla/5.0...", |
| 54 | + "Host": "docs.kinde.com", |
| 55 | + "Referer": "https://google.com", |
| 56 | + "Accept-Language": "en-US,en;q=0.9", |
| 57 | + // ... additional headers |
| 58 | + }, |
| 59 | + request_timestamp: 1703123456789 |
| 60 | +} |
| 61 | +``` |
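To make the payload shape concrete, here is a minimal sketch of how such an object could be assembled from an incoming request. The names `HallVisitPayload`, `RequestLike`, and `buildVisitPayload` are hypothetical and are not taken from the actual `src/middleware.ts`. The header selection simply mirrors the example above.

```typescript
// Minimal structural types, so the sketch works with a standard Request
// as well as with plain objects in tests.
interface HeaderSource {
  get(name: string): string | null;
}

interface RequestLike {
  url: string;
  method: string;
  headers: HeaderSource;
}

// Shape of the visit payload shown in the example above.
interface HallVisitPayload {
  request_path: string;
  request_method: string;
  request_ip: string;
  request_headers: Record<string, string>;
  request_timestamp: number;
}

// Hypothetical helper: builds the payload for one request.
function buildVisitPayload(
  request: RequestLike,
  requestIp: string,
  now: number = Date.now(),
): HallVisitPayload {
  // [label sent to Hall, header name to read]
  const forwarded: Array<[string, string]> = [
    ["User-Agent", "user-agent"],
    ["Host", "host"],
    ["Referer", "referer"],
    ["Accept-Language", "accept-language"],
  ];

  const requestHeaders: Record<string, string> = {};
  for (const [label, name] of forwarded) {
    const value = request.headers.get(name);
    if (value !== null) requestHeaders[label] = value; // omit absent headers
  }

  return {
    request_path: new URL(request.url).pathname,
    request_method: request.method,
    request_ip: requestIp,
    request_headers: requestHeaders,
    request_timestamp: now, // milliseconds since the Unix epoch
  };
}
```

Absent headers are left out rather than sent as `null`, which keeps the payload small. Whether Hall expects missing headers to be omitted or sent empty is something to confirm in Hall's documentation.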
| 62 | + |
| 63 | +### IP Address Detection |
| 64 | + |
| 65 | +The middleware reads the client IP address from whichever of these headers is present, which covers common hosting setups: |
| 66 | + |
| 67 | +- `x-forwarded-for` - De facto standard proxy header; may contain a comma-separated list of addresses |
| 68 | +- `x-real-ip` - Nginx and other proxies |
| 69 | +- `cf-connecting-ip` - Cloudflare |
| 70 | +- Fallback to `127.0.0.1` for local development |
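The header lookup listed above can be sketched as follows. `getClientIp` and `IP_HEADERS` are hypothetical names, and the lookup order is an assumption that follows the order of the list. The real middleware may prioritize the headers differently.

```typescript
// Anything with a Headers-style get() method, including the standard Headers class.
interface HeaderSource {
  get(name: string): string | null;
}

// Assumed lookup order, following the list above.
const IP_HEADERS = ["x-forwarded-for", "x-real-ip", "cf-connecting-ip"] as const;

// Hypothetical helper: returns the first usable client IP, or the fallback.
function getClientIp(headers: HeaderSource, fallback = "127.0.0.1"): string {
  for (const name of IP_HEADERS) {
    const raw = headers.get(name);
    if (!raw) continue;
    // x-forwarded-for can look like "client, proxy1, proxy2";
    // the left-most entry is the address the first proxy saw.
    const first = raw.split(",")[0]?.trim();
    if (first) return first;
  }
  return fallback;
}
```

In the middleware this would be called as `getClientIp(request.headers)`. Clients can set these headers themselves, so the values are only trustworthy when a proxy you control sets or overwrites them.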
| 71 | + |
| 72 | +### Asynchronous Tracking |
| 73 | + |
| 74 | +Analytics tracking happens asynchronously to ensure: |
| 75 | + |
| 76 | +- **Minimal performance impact** on page load times |
| 77 | +- **Non-blocking requests** to Hall's API |
| 78 | +- **Graceful error handling** if Hall is unavailable |
| 79 | +- **User experience remains smooth** |
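The fire-and-forget pattern described above can be sketched like this. `trackInBackground` is a hypothetical helper, not part of the actual middleware. It starts the tracking task without awaiting it and routes any failure to an error handler, so the page response is never held up.

```typescript
// Hypothetical helper: runs a task in the background and returns immediately.
// Both rejected promises and synchronous throws end up in onError.
function trackInBackground(
  task: () => Promise<void>,
  onError: (error: unknown) => void = (error) =>
    console.error("Hall analytics tracking failed:", error),
): void {
  // The task starts on the next microtask tick, after this function
  // has returned, so the caller is never blocked by it.
  Promise.resolve().then(task).catch(onError);
}
```

In the middleware, the call would look like `trackInBackground(() => trackAnalytics(request, url))`, followed directly by `return next()`. On some serverless and edge platforms the runtime may stop once the response is sent, which can cut off background work. If events go missing, check whether your hosting platform offers a way to keep the function alive until the task settles.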
| 80 | + |
| 81 | +## Features |
| 82 | + |
| 83 | +### AI Traffic Detection |
| 84 | + |
| 85 | +Hall specializes in detecting and analyzing traffic from: |
| 86 | + |
| 87 | +- **ChatGPT and other AI assistants** |
| 88 | +- **AI-powered crawlers and bots** |
| 89 | +- **Conversational AI platforms** |
| 90 | +- **AI agents and automation tools** |
| 91 | + |
| 92 | +### Comprehensive Analytics |
| 93 | + |
| 94 | +Track detailed information about: |
| 95 | + |
| 96 | +- **Page visits and navigation patterns** |
| 97 | +- **User agents and device information** |
| 98 | +- **Referral sources and traffic origins** |
| 99 | +- **Geographic and temporal patterns** |
| 100 | +- **AI vs human visitor differentiation** |
| 101 | + |
| 102 | +## Configuration Options |
| 103 | + |
| 104 | +### Custom Headers |
| 105 | + |
| 106 | +You can modify which headers are tracked by editing the `requestHeaders` object in the middleware: |
| 107 | + |
| 108 | +```typescript |
| 109 | +const requestHeaders = { |
| 110 | + 'User-Agent': request.headers.get('user-agent'), |
| 111 | + 'Host': request.headers.get('host'), |
| 112 | + 'Referer': request.headers.get('referer'), |
| 113 | + // Add or remove headers as needed |
| 114 | +}; |
| 115 | +``` |
| 116 | + |
| 117 | +### Selective Tracking |
| 118 | + |
| 119 | +To track only specific pages or exclude certain paths: |
| 120 | + |
| 121 | +```typescript |
| 122 | +// In the middleware function |
| 123 | +if (url.pathname.startsWith('/docs/') && !url.pathname.includes('/private/')) { |
| 124 | + trackAnalytics(request, url).catch(error => { |
| 125 | + console.error('Hall analytics tracking failed:', error); |
| 126 | + }); |
| 127 | +} |
| 128 | +``` |
| 129 | + |
| 130 | +### Custom Analytics Data |
| 131 | + |
| 132 | +Extend the analytics payload with custom data: |
| 133 | + |
| 134 | +```typescript |
| 135 | +body: JSON.stringify({ |
| 136 | + request_path: requestPath, |
| 137 | + request_method: requestMethod, |
| 138 | + request_ip: requestIp, |
| 139 | + request_headers: requestHeaders, |
| 140 | + request_timestamp: Date.now(), |
| 141 | + // Add custom fields |
| 142 | + custom_data: { |
| 143 | + section: getSectionFromPath(requestPath), |
| 144 | + user_type: detectUserType(requestHeaders), |
| 145 | + content_type: getContentType(requestPath) |
| 146 | + } |
| 147 | +}) |
| 148 | +``` |
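The helper functions in the payload above are not defined in this guide. Here is one possible sketch of two of them. Both implementations are hypothetical, and the user-agent pattern is a small illustrative sample rather than a complete list. Hall performs its own classification, so treat `detectUserType` as a coarse local hint only, and check Hall's documentation to confirm that extra fields such as `custom_data` are accepted.

```typescript
// Hypothetical: returns the first path segment after /docs/,
// e.g. "/docs/get-started/intro/" -> "get-started".
function getSectionFromPath(path: string): string {
  const segments = path.split("/").filter(Boolean);
  if (segments[0] !== "docs") return "other";
  return segments[1] ?? "root";
}

// Illustrative, non-exhaustive sample of AI crawler user-agent tokens.
const AI_AGENT_PATTERN = /gptbot|chatgpt-user|claudebot|perplexitybot/i;

// Hypothetical: coarse guess based on the User-Agent header only.
function detectUserType(
  headers: Record<string, string | null>,
): "ai" | "other" {
  const userAgent = headers["User-Agent"] ?? "";
  return AI_AGENT_PATTERN.test(userAgent) ? "ai" : "other";
}
```

`getContentType` could follow the same approach, for example by mapping path prefixes to labels of your choosing.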
| 149 | + |
| 150 | +## Monitoring and Debugging |
| 151 | + |
| 152 | +### Enable Debug Logging |
| 153 | + |
| 154 | +Add debug logging to monitor analytics tracking: |
| 155 | + |
| 156 | +```typescript |
| 157 | +// In the trackAnalytics function |
| 158 | +console.log('Tracking analytics for:', requestPath); |
| 159 | +console.log('IP detected:', requestIp); |
| 160 | +console.log('User agent:', requestHeaders['User-Agent']); |
| 161 | +``` |
| 162 | + |
| 163 | +### Check Analytics Dashboard |
| 164 | + |
| 165 | +Monitor your analytics data in the [Hall Dashboard](https://app.usehall.com): |
| 166 | + |
| 167 | +- **Real-time traffic** from AI platforms |
| 168 | +- **Visitor behavior patterns** |
| 169 | +- **Traffic source analysis** |
| 170 | +- **AI vs human visitor metrics** |
| 171 | + |
| 172 | +## Performance Considerations |
| 173 | + |
| 174 | +### Minimal Impact |
| 175 | + |
| 176 | +The integration is designed for minimal performance impact: |
| 177 | + |
| 178 | +- **Asynchronous tracking** - doesn't block page loads |
| 179 | +- **Lightweight payload** - only essential data sent |
| 180 | +- **Error isolation** - failures don't affect user experience |
| 181 | +- **Efficient headers** - only relevant headers captured |
| 182 | + |
| 183 | +### Caching Considerations |
| 184 | + |
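| 185 | +If you use caching (CDN, etc.), check that analytics still capture the traffic you care about: |
| 186 | + |
| 187 | +- **Analytics run server-side**, so only requests that reach the middleware are tracked |
| 188 | +- **Responses served from a CDN or edge cache** never reach the middleware and are not tracked |
| 189 | +- **Cache headers** on a response do not change tracking for requests that do reach the server |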
| 190 | + |
| 191 | +## Troubleshooting |
| 192 | + |
| 193 | +### Common Issues |
| 194 | + |
| 195 | +1. **Analytics not appearing in dashboard** |
| 196 | + - Verify `HALL_API_KEY` is set correctly |
| 197 | +   - Check server logs for failed requests to Hall (tracking runs server-side, so it does not appear in browser dev tools) |
| 198 | + - Ensure no firewall blocking requests to `analytics.usehall.com` |
| 199 | + |
| 200 | +2. **Performance concerns** |
| 201 | + - Analytics are asynchronous and non-blocking |
| 202 | + - Monitor server logs for any errors |
| 203 | + - Consider rate limiting if needed |
| 204 | + |
| 205 | +3. **Missing IP addresses** |
| 206 | + - Check your hosting provider's proxy headers |
| 207 | + - Verify `x-forwarded-for` or similar headers are set |
| 208 | + - Test with different hosting environments |
| 209 | + |
| 210 | +### Debug Mode |
| 211 | + |
| 212 | +Enable detailed logging for troubleshooting: |
| 213 | + |
| 214 | +```typescript |
| 215 | +// Add to middleware for debugging; remove before production, since this logs every request header (including cookies) |
| 216 | +console.log('Hall API Key present:', !!HALL_API_KEY); |
| 217 | +console.log('Request URL:', url.toString()); |
| 218 | +console.log('Headers:', Object.fromEntries(request.headers.entries())); |
| 219 | +``` |
| 220 | + |
| 221 | +## Production Deployment |
| 222 | + |
| 223 | +### Environment Setup |
| 224 | + |
| 225 | +1. **Set environment variables** in your hosting platform |
| 226 | +2. **Verify API key** is correctly configured |
| 227 | +3. **Test analytics** in staging environment |
| 228 | +4. **Monitor dashboard** for incoming data |
| 229 | + |
| 230 | +### Security Considerations |
| 231 | + |
| 232 | +- **API key security** - keep your Hall API key in server-side environment variables and out of version control |
| 233 | +- **HTTPS required** - ensure all requests use HTTPS |
| 234 | +- **Header sanitization** - sensitive headers are filtered out |
| 235 | +- **Rate limiting** - consider implementing if needed |
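Header sanitization, as mentioned in the list above, can be sketched as a simple deny-list filter. `sanitizeHeaders` and the contents of `SENSITIVE_HEADERS` are assumptions for illustration. The actual middleware may filter a different set of headers, or use an allow-list instead.

```typescript
// Assumed deny-list of header names (lowercase) that should never leave the server.
const SENSITIVE_HEADERS = new Set([
  "authorization",
  "proxy-authorization",
  "cookie",
  "set-cookie",
  "x-api-key",
]);

// Hypothetical helper: copies headers into a plain object, dropping sensitive ones.
function sanitizeHeaders(
  entries: Iterable<[string, string]>,
): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const [name, value] of entries) {
    // Header names are case-insensitive, so compare in lowercase.
    if (!SENSITIVE_HEADERS.has(name.toLowerCase())) safe[name] = value;
  }
  return safe;
}
```

With a standard `Request`, this could be called as `sanitizeHeaders(request.headers.entries())`. An allow-list that forwards only the headers you name is the stricter option if you want to be sure nothing unexpected is sent.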
| 236 | + |
| 237 | +## Support |
| 238 | + |
| 239 | +- **Hall Documentation**: [docs.usehall.com](https://docs.usehall.com/) |
| 240 | +- **Hall Dashboard**: [app.usehall.com](https://app.usehall.com/) |
| 241 | +- **Astro Middleware**: [docs.astro.build/en/guides/middleware/](https://docs.astro.build/en/guides/middleware/) |
| 242 | + |
| 243 | +## Example Analytics Dashboard |
| 244 | + |
| 245 | +Once the integration is live, you'll be able to see the following in your Hall dashboard: |
| 246 | + |
| 247 | +- **AI Traffic Overview** - percentage of AI vs human visitors |
| 248 | +- **Popular Pages** - which documentation pages are most visited |
| 249 | +- **Traffic Sources** - where visitors are coming from |
| 250 | +- **User Behavior** - how visitors navigate your documentation |
| 251 | +- **Real-time Activity** - live traffic monitoring |
| 252 | + |
| 253 | +This integration provides valuable insights into how your documentation is being used, especially by AI platforms and automated tools. |