Commit 17f6968

feat: add production-ready Prometheus metrics (#39)
1 parent faa99a8 commit 17f6968

7 files changed

+1109
-7
lines changed

README.md

Lines changed: 272 additions & 0 deletions
@@ -219,6 +219,278 @@ See [`compose.yml`](./compose.yml) for Docker Compose configuration.
- `GET /api/reports` - List of reports (paginated)
- `GET /api/reports/:id` - Detailed report view
- `GET /api/top-sources` - Top sending source IPs
- `GET /metrics` - Prometheus metrics endpoint

## Prometheus Metrics & Grafana Integration

Parse DMARC includes production-ready Prometheus metrics for monitoring and alerting. Metrics are enabled by default and exposed at `/metrics`.

### Available Metrics

#### Build Information
| Metric | Type | Description |
|--------|------|-------------|
| `parse_dmarc_build_info` | Gauge | Build information (version, commit, build_date) |

#### Report Processing
| Metric | Type | Description |
|--------|------|-------------|
| `parse_dmarc_reports_fetched_total` | Counter | Total DMARC report emails fetched from IMAP |
| `parse_dmarc_reports_parsed_total` | Counter | Total DMARC reports successfully parsed |
| `parse_dmarc_reports_stored_total` | Counter | Total DMARC reports stored in database |
| `parse_dmarc_reports_parse_errors_total` | Counter | Total parse errors |
| `parse_dmarc_reports_store_errors_total` | Counter | Total storage errors |
| `parse_dmarc_reports_attachments_total` | Counter | Total attachments processed |
| `parse_dmarc_reports_fetch_duration_seconds` | Histogram | Duration of fetch operations |
| `parse_dmarc_reports_last_fetch_timestamp_seconds` | Gauge | Unix timestamp of last successful fetch |
| `parse_dmarc_reports_fetch_cycles_total` | Counter | Total fetch cycles executed |
| `parse_dmarc_reports_fetch_errors_total` | Counter | Total fetch cycle errors |

#### IMAP Connection
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `parse_dmarc_imap_connections_total` | Counter | status | IMAP connection attempts (success/error) |
| `parse_dmarc_imap_connection_duration_seconds` | Histogram | | IMAP connection establishment duration |

#### DMARC Statistics
| Metric | Type | Description |
|--------|------|-------------|
| `parse_dmarc_dmarc_reports_total` | Gauge | Total reports in database |
| `parse_dmarc_dmarc_messages_total` | Gauge | Total messages across all reports |
| `parse_dmarc_dmarc_compliant_messages_total` | Gauge | Total DMARC-compliant messages |
| `parse_dmarc_dmarc_compliance_rate` | Gauge | Overall compliance rate (0-100) |
| `parse_dmarc_dmarc_unique_source_ips` | Gauge | Number of unique source IPs |
| `parse_dmarc_dmarc_unique_domains` | Gauge | Number of unique domains |

#### Per-Domain/Org Metrics
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `parse_dmarc_dmarc_messages_by_domain` | Gauge | domain | Messages per domain |
| `parse_dmarc_dmarc_compliance_rate_by_domain` | Gauge | domain | Compliance rate per domain |
| `parse_dmarc_dmarc_reports_by_org` | Gauge | org_name | Reports per organization |
| `parse_dmarc_dmarc_messages_by_disposition` | Gauge | disposition | Messages by disposition type |

#### Authentication Results
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `parse_dmarc_dmarc_spf_results` | Gauge | result | SPF authentication result counts |
| `parse_dmarc_dmarc_dkim_results` | Gauge | result | DKIM authentication result counts |

#### HTTP Server
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `parse_dmarc_http_requests_total` | Counter | method, path, status | Total HTTP requests |
| `parse_dmarc_http_request_duration_seconds` | Histogram | method, path | HTTP request duration |
| `parse_dmarc_http_requests_in_flight` | Gauge | | Current in-flight requests |

#### Go Runtime (Built-in)
Standard Go runtime metrics are also exposed:
- `go_goroutines` - Number of goroutines
- `go_memstats_*` - Memory statistics
- `go_gc_*` - Garbage collection metrics
- `process_*` - Process metrics (CPU, memory, file descriptors)

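To inspect these metrics without a Prometheus server, the text exposition format served at `/metrics` can be read directly. The following is a minimal sketch, not part of Parse DMARC: `parse_metrics` and `fetch_metrics` are hypothetical helper names, the sample values are made up, and the parser is simplified (it ignores timestamps and does not unescape label values).

```python
import re
from urllib.request import urlopen

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return (name, labels, value) tuples, skipping comments and blank lines."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, raw_value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


def fetch_metrics(url="http://localhost:8080/metrics"):
    """Fetch and parse a live endpoint (assumes an instance is running there)."""
    with urlopen(url, timeout=5) as response:
        return parse_metrics(response.read().decode("utf-8"))


# Made-up sample values, shaped like the metrics listed above.
SAMPLE = """\
# TYPE parse_dmarc_dmarc_compliance_rate gauge
parse_dmarc_dmarc_compliance_rate 97.5
parse_dmarc_dmarc_spf_results{result="pass"} 120
parse_dmarc_dmarc_spf_results{result="fail"} 30
"""

for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

Calling `fetch_metrics()` against a running instance should return the same tuple shape for the live values.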
### Disabling Metrics

To disable the metrics endpoint:

```bash
# Command line
./parse-dmarc --metrics=false

# Environment variable
export PARSE_DMARC_METRICS=false

# Docker
docker run -e PARSE_DMARC_METRICS=false ghcr.io/meysam81/parse-dmarc:latest
```

### Prometheus Configuration

Add Parse DMARC to your `prometheus.yml`:

```yaml
scrape_configs:
  - job_name: 'parse-dmarc'
    static_configs:
      - targets: ['parse-dmarc:8080']
    scrape_interval: 30s
    metrics_path: /metrics
```

For Kubernetes with ServiceMonitor (Prometheus Operator):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: parse-dmarc
  labels:
    app: parse-dmarc
spec:
  selector:
    matchLabels:
      app: parse-dmarc
  endpoints:
    - port: http
      path: /metrics
      interval: 30s
```

### Grafana Dashboard

#### Quick Start

1. Import the dashboard JSON from `grafana/dashboard.json` (if available) or create a new dashboard
2. Add Prometheus as a data source pointing to your Prometheus instance
3. Create panels using the queries below

#### Example Grafana Panels

**Compliance Rate Gauge:**
```promql
parse_dmarc_dmarc_compliance_rate
```

**Messages Over Time:**
```promql
# parse_dmarc_dmarc_messages_total is a gauge, so use delta() rather than rate()
delta(parse_dmarc_dmarc_messages_total[5m])
```

**Compliance Rate by Domain:**
```promql
parse_dmarc_dmarc_compliance_rate_by_domain
```

**SPF/DKIM Pass Rate:**
```promql
# SPF Pass Rate
sum(parse_dmarc_dmarc_spf_results{result="pass"}) / sum(parse_dmarc_dmarc_spf_results) * 100

# DKIM Pass Rate
sum(parse_dmarc_dmarc_dkim_results{result="pass"}) / sum(parse_dmarc_dmarc_dkim_results) * 100
```

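The pass-rate queries above divide the `pass` count by the sum over all `result` values. A minimal sketch of the same arithmetic, using made-up counts and a hypothetical `pass_rate` helper:

```python
def pass_rate(result_counts):
    """Percentage of 'pass' results; None when there are no results at all."""
    total = sum(result_counts.values())
    if total == 0:
        return None  # PromQL would yield NaN for 0 / 0
    return 100 * result_counts.get("pass", 0) / total


# Made-up SPF result counts keyed by the `result` label.
spf_results = {"pass": 120, "fail": 30}
print(pass_rate(spf_results))
```

Returning `None` for an empty input is a choice made for this sketch; a dashboard panel would typically show "No data" in that case.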
**Fetch Success Rate:**
```promql
1 - (rate(parse_dmarc_reports_fetch_errors_total[1h]) / rate(parse_dmarc_reports_fetch_cycles_total[1h]))
```

**IMAP Connection Health:**
```promql
# Aggregate before dividing: series with different status labels do not match in binary operations
sum(rate(parse_dmarc_imap_connections_total{status="success"}[5m])) /
sum(rate(parse_dmarc_imap_connections_total[5m]))
```

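The fetch success rate above is a ratio of two counter rates over the same window, so the time factor cancels and only the counter increases matter. A simplified sketch, assuming two readings of each counter (the function names are hypothetical, and Prometheus's own `rate()` additionally extrapolates to the window boundaries):

```python
def counter_increase(previous, current):
    """Increase between two counter readings, treating a drop as a reset to zero."""
    if current < previous:
        return current
    return current - previous


def fetch_success_ratio(errors_before, errors_after, cycles_before, cycles_after):
    """1 - errors/cycles over the window; None when no cycle ran."""
    cycles = counter_increase(cycles_before, cycles_after)
    if cycles == 0:
        return None
    errors = counter_increase(errors_before, errors_after)
    return 1 - errors / cycles


# Made-up readings: 10 fetch cycles ran in the window and 1 of them failed.
print(fetch_success_ratio(errors_before=2, errors_after=3,
                          cycles_before=10, cycles_after=20))
```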
**HTTP Request Latency (p95):**
```promql
histogram_quantile(0.95, rate(parse_dmarc_http_request_duration_seconds_bucket[5m]))
```

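`histogram_quantile` estimates a quantile from cumulative bucket counts by interpolating linearly inside the bucket that contains the target rank. The sketch below illustrates that idea under simplifying assumptions (bucket bounds and counts are made up, and edge cases are handled more loosely than in Prometheus itself):

```python
import math


def histogram_quantile(q, buckets):
    """Estimate quantile q from (upper_bound, cumulative_count) pairs.

    Buckets must be sorted by upper bound and end with the +Inf bucket.
    """
    if not buckets or buckets[-1][0] != math.inf:
        raise ValueError("buckets must end with the +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for upper, count in buckets:
        if count >= rank:
            if upper == math.inf:
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            if count == prev_count:
                return upper
            # Linear interpolation between the bucket's lower and upper bound.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (upper - prev_bound) * fraction
        prev_bound, prev_count = upper, count
    return math.nan


# Made-up request duration buckets: (le, cumulative count).
buckets = [(0.1, 10), (0.5, 60), (1.0, 90), (math.inf, 100)]
print(histogram_quantile(0.5, buckets))
print(histogram_quantile(0.95, buckets))
```

Because the result is interpolated, its accuracy depends on how closely the bucket boundaries bracket the latencies of interest.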
**Reports by Organization:**
```promql
topk(10, parse_dmarc_dmarc_reports_by_org)
```

#### Alerting Rules

Example Prometheus alerting rules:

```yaml
groups:
  - name: parse-dmarc
    rules:
      - alert: DMARCComplianceLow
        expr: parse_dmarc_dmarc_compliance_rate < 90
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "DMARC compliance rate is below 90%"
          description: "Current compliance rate: {{ $value }}%"

      - alert: DMARCFetchFailures
        expr: rate(parse_dmarc_reports_fetch_errors_total[15m]) > 0
        for: 30m
        labels:
          severity: critical
        annotations:
          summary: "Parse DMARC fetch failures detected"
          description: "IMAP fetch operations are failing"

      - alert: IMAPConnectionErrors
        expr: rate(parse_dmarc_imap_connections_total{status="error"}[5m]) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "IMAP connection errors detected"
          description: "Check IMAP credentials and server connectivity"

      - alert: NoRecentFetch
        expr: time() - parse_dmarc_reports_last_fetch_timestamp_seconds > 600
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "No recent DMARC report fetch"
          description: "Last fetch was {{ humanizeDuration $value }} ago"
```

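Each rule above combines an `expr` with a `for` duration: the alert is pending while the expression holds and fires only once it has held continuously for that long. A rough sketch of that behaviour for `NoRecentFetch`, with hypothetical function names and made-up evaluation times:

```python
def is_fetch_stale(now, last_fetch_timestamp, max_age_seconds=600):
    """Mirror of `time() - last_fetch_timestamp > 600` from the rule above."""
    return now - last_fetch_timestamp > max_age_seconds


def first_firing_time(evaluations, for_seconds):
    """Return the first evaluation time at which the alert fires, or None.

    evaluations: (timestamp, condition_holds) pairs in time order.
    """
    pending_since = None
    for timestamp, condition_holds in evaluations:
        if not condition_holds:
            pending_since = None  # condition cleared, so the pending state resets
            continue
        if pending_since is None:
            pending_since = timestamp
        if timestamp - pending_since >= for_seconds:
            return timestamp
    return None


# Made-up timeline: the last fetch happened at t=0 and rules run every 60 seconds.
evaluations = [(t, is_fetch_stale(t, 0)) for t in range(0, 1201, 60)]
print(first_firing_time(evaluations, for_seconds=300))
```

With these made-up numbers the condition first holds at t=660 (the first evaluation more than 600 seconds after the fetch), so the sketch reports a firing time 300 seconds later.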
### Docker Compose with Prometheus & Grafana

Complete monitoring stack:

```yaml
version: '3.8'

services:
  parse-dmarc:
    image: ghcr.io/meysam81/parse-dmarc:latest
    ports:
      - "8080:8080"
    volumes:
      - ./config.json:/app/config.json
      - ./data:/data

  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana-data:/var/lib/grafana

volumes:
  grafana-data:
```

With `prometheus.yml`:

```yaml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'parse-dmarc'
    static_configs:
      - targets: ['parse-dmarc:8080']
```

Access:
- Parse DMARC Dashboard: http://localhost:8080
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000 (admin/admin)

### Why Parse DMARC vs ParseDMARC?
