| 1 | +# HyperDX Claude Agent Guide |
| 2 | + |
| 3 | +This guide helps Claude AI agents understand and work effectively with the |
| 4 | +HyperDX codebase. |
| 5 | + |
| 6 | +## 🏗️ Project Overview |
| 7 | + |
| 8 | +HyperDX is an observability platform built on ClickHouse that helps engineers |
| 9 | +search, visualize, and monitor logs, metrics, traces, and session replays. It's |
| 10 | +designed as an alternative to tools like Kibana but optimized for ClickHouse's |
| 11 | +performance characteristics. |
| 12 | + |
| 13 | +**Core Value Proposition:** |
| 14 | + |
| 15 | +- Unified observability: correlate logs, metrics, traces, and session replays in |
| 16 | + one place |
| 17 | +- ClickHouse-powered: blazing fast searches and visualizations |
| 18 | +- OpenTelemetry native: works out of the box with OTEL instrumentation |
| 19 | +- Schema agnostic: works on top of existing ClickHouse schemas |
| 20 | + |
| 21 | +## 📁 Architecture Overview |
| 22 | + |
| 23 | +HyperDX follows a microservices architecture with clear separation between |
| 24 | +components: |
| 25 | + |
| 26 | +### Core Services |
| 27 | + |
| 28 | +- **HyperDX UI (`packages/app`)**: Next.js frontend serving the user interface |
| 29 | +- **HyperDX API (`packages/api`)**: Node.js/Express backend handling queries and |
| 30 | + business logic |
| 31 | +- **OpenTelemetry Collector**: Receives and processes telemetry data |
| 32 | +- **ClickHouse**: Primary data store for all telemetry (logs, metrics, traces) |
| 33 | +- **MongoDB**: Metadata storage (users, dashboards, alerts, saved searches) |
| 34 | + |
| 35 | +### Data Flow |
| 36 | + |
| 37 | +1. Applications send telemetry via OpenTelemetry → OTel Collector |
| 38 | +2. OTel Collector processes and forwards data → ClickHouse |
| 39 | +3. Users interact with UI → API queries ClickHouse |
| 40 | +4. Configuration/metadata stored in MongoDB |
| 41 | + |
| 42 | +## 🛠️ Technology Stack |
| 43 | + |
| 44 | +### Frontend (`packages/app`) |
| 45 | + |
| 46 | +- **Framework**: Next.js 14 with TypeScript |
| 47 | +- **UI Components**: Mantine UI library + React Bootstrap |
| 48 | +- **State Management**: Jotai for global state, TanStack Query for server state |
| 49 | +- **Charts/Visualization**: Recharts, uPlot |
| 50 | +- **Code Editor**: CodeMirror (for SQL/JSON editing) |
| 51 | +- **Styling**: SCSS + CSS Modules |
| 52 | + |
| 53 | +### Backend (`packages/api`) |
| 54 | + |
| 55 | +- **Runtime**: Node.js 22+ with TypeScript |
| 56 | +- **Framework**: Express.js |
| 57 | +- **Database**: |
| 58 | + - ClickHouse (primary telemetry data) |
| 59 | + - MongoDB (metadata via Mongoose) |
| 60 | +- **Authentication**: Passport.js with local strategy |
| 61 | +- **Validation**: Zod schemas |
| 62 | +- **OpenTelemetry**: Self-instrumented with `@hyperdx/node-opentelemetry` |
| 63 | + |
| 64 | +### Common Utilities (`packages/common-utils`) |
| 65 | + |
| 66 | +- Shared TypeScript utilities for query parsing, ClickHouse operations |
| 67 | +- Zod schemas for data validation |
| 68 | +- SQL formatting and query building helpers |
| 69 | + |
| 70 | +## 🏛️ Key Architectural Patterns |
| 71 | + |
| 72 | +### Database Models (MongoDB) |
| 73 | + |
| 74 | +All models follow consistent patterns with: |
| 75 | + |
| 76 | +- Team-based multi-tenancy (most entities belong to a `team`) |
| 77 | +- ObjectId references between related entities |
| 78 | +- Timestamps for audit trails |
| 79 | +- Zod schema validation |
| 80 | + |
| 81 | +**Key Models:** |
| 82 | + |
| 83 | +- `Team`: Multi-tenant organization unit |
| 84 | +- `User`: Team members with authentication |
| 85 | +- `Source`: ClickHouse data source configuration |
| 86 | +- `Connection`: Database connection settings |
| 87 | +- `SavedSearch`: Saved queries and filters |
| 88 | +- `Dashboard`: Custom dashboard configurations |
| 89 | +- `Alert`: Monitoring alerts with thresholds |
| 90 | + |
| 91 | +### Frontend Architecture |
| 92 | + |
| 93 | +- **Page-level components**: Located in `pages/` (Next.js routing) |
| 94 | +- **Reusable components**: Located in `src/` directory |
| 95 | +- **State management**: |
| 96 | + - Server state via TanStack Query |
| 97 | + - Client state via Jotai atoms |
| 98 | + - URL state via query parameters |
| 99 | +- **API communication**: Custom hooks wrapping TanStack Query |
| 100 | + |
| 101 | +### Backend Architecture |
| 102 | + |
| 103 | +- **Router-based organization**: Separate routers for different API domains |
| 104 | +- **Middleware stack**: Authentication, CORS, error handling |
| 105 | +- **Controller pattern**: Business logic separated from route handlers |
| 106 | +- **Service layer**: Reusable business logic (e.g., `agentService`) |
| 107 | + |
| 108 | +## 🔧 Development Environment |
| 109 | + |
| 110 | +### Setup Commands |
| 111 | + |
| 112 | +```bash |
| 113 | +# Install dependencies and setup hooks |
| 114 | +yarn setup |
| 115 | + |
| 116 | +# Start full development stack (Docker + local services) |
| 117 | +yarn dev |
| 118 | +``` |
| 119 | + |
| 120 | +### Key Development Scripts |
| 121 | + |
| 122 | +- `yarn app:dev`: Start API, frontend, alerts task, and common-utils in watch |
| 123 | + mode |
| 124 | +- `yarn lint`: Run linting across all packages |
| 125 | +- `yarn dev:int`: Run integration tests in watch mode |
| 126 | +- `yarn dev:unit`: Run unit tests in watch mode (per package) |
| 127 | + |
| 128 | +### Lint Fix Commands |
| 129 | + |
| 130 | +To automatically fix linting issues: |
| 131 | + |
| 132 | +```bash |
| 133 | +# Fix linting issues in specific packages |
| 134 | +cd packages/api && yarn lint:fix |
| 135 | +cd packages/app && yarn lint:fix |
| 136 | +cd packages/common-utils && yarn lint:fix |
| 137 | + |
| 138 | +# Or use NX to run lint:fix across packages |
| 139 | +npx nx run-many -t lint:fix |
| 140 | +``` |
| 141 | + |
| 142 | +**Auto-fix on commit**: The project uses `lint-staged` with Husky to |
| 143 | +automatically fix linting issues on commit: |
| 144 | + |
| 145 | +- Prettier formatting for all files |
| 146 | +- ESLint auto-fix for TypeScript files |
| 147 | + |
| 148 | +### Environment Configuration |
| 149 | + |
| 150 | +- `.env.development`: Development environment variables |
| 151 | +- Docker Compose manages ClickHouse, MongoDB, OTel Collector |
| 152 | +- Hot reload enabled for all services in development |
| 153 | + |
| 154 | +## 📝 Code Style & Patterns |
| 155 | + |
| 156 | +### TypeScript Guidelines |
| 157 | + |
| 158 | +- **Strict typing**: Avoid `any` and `as any` type assertions; use proper typing instead |
| 159 | +- **Zod validation**: Use Zod schemas for runtime validation |
| 160 | +- **Interface definitions**: Clear interfaces for all data structures |
| 161 | +- **Error handling**: Proper error boundaries and serialization |
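As one way to follow the strict-typing guideline above, untrusted input can be held as `unknown` and narrowed with a type guard rather than cast through `any`. This is a minimal sketch: `AlertThreshold`, `isAlertThreshold`, and `parseAlertThreshold` are hypothetical names for illustration, not types or helpers from the HyperDX codebase.

```typescript
// Hypothetical shape for illustration; not a type from the HyperDX codebase.
interface AlertThreshold {
  name: string;
  threshold: number;
}

// Type guard: narrows `unknown` to AlertThreshold without using `any`.
// The single cast to Record<string, unknown> is safe because every field
// is checked before the value is treated as an AlertThreshold.
function isAlertThreshold(value: unknown): value is AlertThreshold {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  const threshold = record['threshold'];
  return (
    typeof record['name'] === 'string' &&
    typeof threshold === 'number' &&
    isFinite(threshold)
  );
}

// Parses JSON and rejects anything that does not match the expected shape.
function parseAlertThreshold(json: string): AlertThreshold {
  const parsed: unknown = JSON.parse(json);
  if (!isAlertThreshold(parsed)) {
    throw new Error('Invalid alert threshold payload');
  }
  return parsed;
}
```

In practice a Zod schema would likely replace the hand-written guard; the sketch only shows the narrowing idea without any dependency.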
| 162 | + |
| 163 | +### Component Patterns |
| 164 | + |
| 165 | +- **Functional components**: Use React hooks over class components |
| 166 | +- **Custom hooks**: Extract reusable logic into custom hooks |
| 167 | +- **Props interfaces**: Define clear TypeScript interfaces for component props |
| 168 | +- **File organization**: Keep files under 300 lines, break down large components |
| 169 | + |
| 170 | +### UI Components & Styling |
| 171 | + |
| 172 | +**Prefer Mantine UI**: Use Mantine components as the primary UI library: |
| 173 | + |
| 174 | +```tsx |
| 175 | +// ✅ Good - Use Mantine components |
| 176 | +import { Button, TextInput, Modal, Select } from '@mantine/core'; |
| 177 | + |
| 178 | +// ✅ Good - Mantine hooks for common functionality (`useForm` ships in '@mantine/form') |
| 179 | +import { useDisclosure } from '@mantine/hooks'; |
| 180 | +``` |
| 181 | + |
| 182 | +**Component Hierarchy**: |
| 183 | + |
| 184 | +1. **First choice**: Mantine components (`@mantine/core`, `@mantine/dates`, |
| 185 | + etc.) |
| 186 | +2. **Second choice**: Custom components built on Mantine primitives |
| 187 | +3. **Last resort**: React Bootstrap or custom CSS (only when Mantine doesn't |
| 188 | + provide the functionality) |
| 189 | + |
| 190 | +**Styling Approach**: |
| 191 | + |
| 192 | +- Use Mantine's built-in styling system and theme |
| 193 | +- SCSS modules for component-specific styles when needed |
| 194 | +- Avoid inline styles unless absolutely necessary |
| 195 | +- Leverage Mantine's responsive design utilities |
| 196 | + |
| 197 | +### API Patterns |
| 198 | + |
| 199 | +- **RESTful design**: Clear HTTP methods and resource-based URLs |
| 200 | +- **Middleware composition**: Reusable middleware for auth, validation, etc. |
| 201 | +- **Error handling**: Consistent error response format |
| 202 | +- **Input validation**: Zod schemas for request validation |
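A consistent error response format could look roughly like the sketch below. The envelope shape, the `ApiError` class, and the `toErrorBody` helper are assumptions made for illustration; the actual HyperDX API may serialize errors differently.

```typescript
// Hypothetical error envelope; the real HyperDX response shape may differ.
interface ApiErrorBody {
  error: {
    code: string;
    message: string;
    status: number;
  };
}

// Hypothetical error type carrying an HTTP status and a stable error code.
class ApiError extends Error {
  readonly status: number;
  readonly code: string;

  constructor(status: number, code: string, message: string) {
    super(message);
    this.name = 'ApiError';
    this.status = status;
    this.code = code;
  }
}

// Duck-typed check so errors created in another package copy still match.
function isApiError(err: unknown): err is ApiError {
  if (typeof err !== 'object' || err === null) {
    return false;
  }
  const candidate = err as { status?: unknown; code?: unknown; message?: unknown };
  return (
    typeof candidate.status === 'number' &&
    typeof candidate.code === 'string' &&
    typeof candidate.message === 'string'
  );
}

// Maps any thrown value to the same envelope.
function toErrorBody(err: unknown): ApiErrorBody {
  if (isApiError(err)) {
    return { error: { code: err.code, message: err.message, status: err.status } };
  }
  // Unknown failures are reported generically so internal details are not leaked.
  return { error: { code: 'INTERNAL_ERROR', message: 'Internal server error', status: 500 } };
}
```

An Express error-handling middleware could call `toErrorBody` once, so every router returns the same shape.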
| 203 | + |
| 204 | +## 🧪 Testing Strategy |
| 205 | + |
| 206 | +### Testing Tools |
| 207 | + |
| 208 | +- **Unit Tests**: Jest with TypeScript support |
| 209 | +- **Integration Tests**: Jest with database fixtures |
| 210 | +- **Frontend Testing**: React Testing Library + Jest |
| 211 | +- **E2E Testing**: Custom smoke tests with BATS |
| 212 | + |
| 213 | +### Testing Patterns |
| 214 | + |
| 215 | +- **TDD Approach**: Write tests before implementation for new features |
| 216 | +- **Test organization**: Tests co-located with source files in `__tests__` |
| 217 | + directories |
| 218 | +- **Mocking**: MSW for API mocking in frontend tests |
| 219 | +- **Database testing**: Isolated test databases with fixtures |
| 220 | + |
| 221 | +### CI Testing |
| 222 | + |
| 223 | +For integration testing in CI environments: |
| 224 | + |
| 225 | +```bash |
| 226 | +# Start CI testing stack (ClickHouse, MongoDB, etc.) |
| 227 | +docker compose -p int -f ./docker-compose.ci.yml up -d |
| 228 | + |
| 229 | +# Run integration tests |
| 230 | +yarn dev:int |
| 231 | +``` |
| 232 | + |
| 233 | +**CI Testing Notes:** |
| 234 | + |
| 235 | +- Uses separate Docker Compose configuration optimized for CI |
| 236 | +- Isolated test environment with `-p int` project name |
| 237 | +- Includes all necessary services (ClickHouse, MongoDB, OTel Collector) |
| 238 | +- Tests run against real database instances for accurate integration testing |
| 239 | + |
| 240 | +## 🗄️ Data & Query Patterns |
| 241 | + |
| 242 | +### ClickHouse Integration |
| 243 | + |
| 244 | +- **Query building**: Use `common-utils` for safe query construction |
| 245 | +- **Schema flexibility**: Support for various telemetry schemas via `Source` |
| 246 | + configuration |
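To illustrate what "safe query construction" can mean, the sketch below keeps user input out of the SQL text by using ClickHouse's `{name:Type}` query parameters. `buildLogSearchQuery` is a hypothetical helper, not the `common-utils` API, and the `otel_logs` table with `Timestamp`, `ServiceName`, and `Body` columns is an assumed schema.

```typescript
// Query text plus parameters, kept separate so values are never spliced into SQL.
interface ParameterizedQuery {
  query: string;
  query_params: Record<string, string | number>;
}

// Hypothetical helper for illustration; not the common-utils API.
// Table and column names below are assumptions about the schema.
function buildLogSearchQuery(opts: {
  service: string;
  sinceUnixSeconds: number;
  limit: number;
}): ParameterizedQuery {
  return {
    query: [
      'SELECT Timestamp, ServiceName, Body',
      'FROM otel_logs',
      'WHERE ServiceName = {service:String}',
      '  AND Timestamp >= toDateTime({since:UInt32})',
      'ORDER BY Timestamp DESC',
      'LIMIT {limit:UInt32}',
    ].join('\n'),
    // Values travel separately and are bound by the server, not by string concatenation.
    query_params: {
      service: opts.service,
      since: opts.sinceUnixSeconds,
      limit: opts.limit,
    },
  };
}
```

The returned object is shaped so it could be passed to a ClickHouse client that accepts a query string and a parameter map; whichever client is used, the point is that the SQL text stays constant regardless of user input.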
| 247 | + |
| 248 | +### MongoDB Patterns |
| 249 | + |
| 250 | +- **Multi-tenancy**: All queries filtered by team context |
| 251 | +- **Relationships**: Use ObjectId references with proper population |
| 252 | +- **Indexing**: Strategic indexes for query performance |
| 253 | +- **Migrations**: Versioned migrations for schema changes |
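The multi-tenancy rule above can be enforced in one place with a small helper that adds the team to every filter. This is a sketch under assumptions: `withTeamScope` and the plain-string `teamId` are hypothetical, and a real implementation would use Mongoose ObjectIds and filter types.

```typescript
// Hypothetical filter type; a real implementation would use Mongoose's filter types.
type Filter = { [field: string]: unknown };

// Returns a copy of `filter` that is always restricted to the given team.
function withTeamScope(teamId: string, filter: Filter = {}): Filter {
  if (teamId.length === 0) {
    throw new Error('teamId is required for team-scoped queries');
  }
  // Spread the caller's filter first so the trusted teamId always wins,
  // even if the caller-supplied filter already contains a `team` field.
  return { ...filter, team: teamId };
}
```

A route handler could then pass `withTeamScope(teamId, { name })` to a model's `find` call, so a forgotten team condition becomes harder to write.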
| 254 | + |
| 255 | +## 🚀 Common Development Tasks |
| 256 | + |
| 257 | +### Adding New Features |
| 258 | + |
| 259 | +1. **API First**: Define API endpoints and data models |
| 260 | +2. **Database Models**: Create/update Mongoose schemas and ClickHouse queries |
| 261 | +3. **Frontend Integration**: Build UI components and integrate with API |
| 262 | +4. **Testing**: Add unit and integration tests |
| 263 | +5. **Documentation**: Update relevant docs |
| 264 | + |
| 265 | +### Performance Considerations |
| 266 | + |
| 267 | +- **Frontend rendering**: Use virtualization for large datasets |
| 268 | +- **API responses**: Implement pagination and caching where appropriate |
| 269 | +- **Bundle size**: Monitor and optimize JavaScript bundle sizes |
| 270 | + |
| 271 | +## 🔍 Key Files & Directories |
| 272 | + |
| 273 | +### Configuration |
| 274 | + |
| 275 | +- `packages/api/src/config.ts`: API configuration and environment variables |
| 276 | +- `packages/app/next.config.js`: Next.js configuration |
| 277 | +- `docker-compose.dev.yml`: Development environment setup |
| 278 | + |
| 279 | +### Core Business Logic |
| 280 | + |
| 281 | +- `packages/api/src/models/`: MongoDB data models |
| 282 | +- `packages/api/src/routers/`: API route definitions |
| 283 | +- `packages/api/src/controllers/`: Business logic controllers |
| 284 | +- `packages/common-utils/src/`: Shared utilities and query builders |
| 285 | + |
| 286 | +### Frontend Architecture |
| 287 | + |
| 288 | +- `packages/app/pages/`: Next.js pages and routing |
| 289 | +- `packages/app/src/`: Reusable components and utilities |
| 290 | +- `packages/app/src/useUserPreferences.tsx`: Global user state management |
| 291 | + |
| 292 | +## 🚨 Common Pitfalls & Guidelines |
| 293 | + |
| 294 | +### Security |
| 295 | + |
| 296 | +- **Server-side validation**: Always validate and sanitize on the backend |
| 297 | +- **Team isolation**: Ensure proper team-based access control |
| 298 | +- **API authentication**: Use proper authentication middleware |
| 299 | +- **Environment variables**: Never commit secrets; use `.env` files instead |
| 300 | + |
| 301 | +### Performance |
| 302 | + |
| 303 | +- **React rendering**: Use proper keys and memoization for large lists |
| 304 | +- **API pagination**: Implement cursor-based pagination for large datasets |
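Cursor-based pagination can be sketched as follows, using an in-memory list in place of a database. `Page` and `paginateById` are hypothetical names, and the sketch assumes items are already sorted by a unique, ascending `id`.

```typescript
interface Page<T> {
  items: T[];
  // Id of the last item on this page, or null when there are no further pages.
  nextCursor: string | null;
}

// Assumes `sortedItems` is sorted by unique `id` in ascending order.
// The cursor is the id of the last item the client has already received.
function paginateById<T extends { id: string }>(
  sortedItems: T[],
  limit: number,
  cursor: string | null = null,
): Page<T> {
  const pageItems: T[] = [];
  let hasMore = false;
  for (const item of sortedItems) {
    if (cursor !== null && item.id <= cursor) {
      continue; // already delivered on an earlier page
    }
    if (pageItems.length === limit) {
      hasMore = true; // at least one more item exists beyond this page
      break;
    }
    pageItems.push(item);
  }
  const last = pageItems[pageItems.length - 1];
  return {
    items: pageItems,
    nextCursor: hasMore && last !== undefined ? last.id : null,
  };
}
```

Against a database, the same idea usually becomes a "greater than cursor, ordered by id, limit n + 1" query, which stays stable when rows are inserted between requests, unlike offset-based paging.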
| 305 | + |
| 306 | +### Code Quality |
| 307 | + |
| 308 | +- **Component responsibility**: Single responsibility principle |
| 309 | +- **Error boundaries**: Proper error handling at component boundaries |
| 310 | +- **Type safety**: Prefer compile-time type safety over ad hoc runtime checks |
| 311 | + |
| 312 | +## 🔗 Useful Resources |
| 313 | + |
| 314 | +- **OpenTelemetry Docs**: Understanding telemetry data structures |
| 315 | +- **ClickHouse Docs**: Query optimization and schema design |
| 316 | +- **Mantine UI**: Component library documentation |
| 317 | +- **TanStack Query**: Server state management patterns |
| 318 | + |
| 319 | +## 🤝 Contributing Guidelines |
| 320 | + |
| 321 | +1. **Follow existing patterns**: Maintain consistency with current codebase |
| 322 | +2. **Test coverage**: Add tests for new functionality |
| 323 | +3. **Documentation**: Update relevant documentation |
| 324 | +4. **Code review**: Ensure changes align with architectural principles |
| 325 | +5. **Performance impact**: Consider impact on query performance and bundle size |
| 326 | + |
| 327 | +--- |
| 328 | + |
| 329 | +_This guide should be updated as the codebase evolves and new patterns emerge._ |