6 changes: 6 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,6 @@
---
repos:
  - repo: "https://github.com/adrienverge/yamllint"
    rev: "v1.35.1"
    hooks:
      - id: "yamllint"
1 change: 1 addition & 0 deletions observability/simple-datadog/.gitignore
@@ -0,0 +1 @@
.env
60 changes: 60 additions & 0 deletions observability/simple-datadog/README.md
@@ -0,0 +1,60 @@
# SpiceDB Observability with the Datadog Agent

## Overview
This example demonstrates a configuration of SpiceDB and the Datadog Agent
that sends metrics and traces to Datadog. It is not the only valid configuration
and should be adapted to your use case.

The metrics produced in this configuration are submitted as custom metrics. We're actively working
on an official SpiceDB integration that would make the metrics into standard metrics and simplify
setup and configuration.

### Running in Production
A "real" deployment would use a container runtime of some sort. One approach would be to
run the Datadog Agent as a sidecar; another would be to run a set of agents using
the [Datadog Operator](https://docs.datadoghq.com/getting_started/containers/datadog_operator/)
and then point them at your SpiceDB instances using [annotations](https://docs.datadoghq.com/containers/kubernetes/integrations).
This repository is only intended to communicate the agent check configuration
and the required SpiceDB configuration.

## Running this repo
Rename the placeholder env file:

```
mv placeholder.env .env
```

Define your `DD_API_KEY` in the env file.
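
Before starting the stack, it can help to confirm that the key was actually filled in. The following is a minimal sketch, not part of this repository: the helper name `missing_env_keys` is hypothetical, and it assumes the env file uses plain `KEY=value` lines like `placeholder.env`.

```python
def missing_env_keys(env_text, required=("DD_API_KEY",)):
    """Return the required keys that are absent or empty in env-file text."""
    values = {}
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return [key for key in required if not values.get(key)]


# The untouched placeholder leaves the key empty, so it is reported as missing.
print(missing_env_keys("DD_API_KEY=\n"))
```

To check a real file, pass its contents instead, for example `missing_env_keys(open(".env").read())`.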

Run `docker compose up`.

### Thumper
Thumper is an internal load-testing tool that we built a while back. We use it in this project to
exercise gRPC endpoints so that there are traces and metrics to look at.

## The Dashboard
This is a preview of the dashboard that will be bundled with the SpiceDB Community integration.
It shows throughput, latency, and some basic node CPU and memory metrics. Note that the CPU and memory
metrics may be missing context from the container runtime environment, such as limits provided by Kubernetes.

Also note that the dashboard uses the metrics exported by SpiceDB as histogram metrics, which Datadog then internally
converts to its distribution-style metrics. There's likely some loss in resolution as a result; if this is a concern,
and 100% of traces are being collected, it may make more sense to make the latency graphs reference the trace
distribution supplied by Datadog.
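
To make the resolution concern concrete, here is a minimal sketch of how a percentile is commonly estimated from cumulative histogram buckets, using linear interpolation inside the bucket that contains the target rank (the approach Prometheus's `histogram_quantile` takes). This is an illustration only and not Datadog's internal conversion; the function name and the bucket values are made up.

```python
def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count) pairs sorted by
    upper_bound, with finite bounds only. The lowest bucket is assumed
    to start at 0.
    """
    total = buckets[-1][1]
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, cumulative in buckets:
        if cumulative >= rank:
            in_bucket = cumulative - lower_count
            if in_bucket == 0:
                return upper_bound
            # Assume observations are spread evenly within the bucket.
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, cumulative
    return buckets[-1][0]


# Hypothetical latency buckets in seconds: (upper bound, cumulative count).
latency_buckets = [(0.005, 10), (0.01, 50), (0.025, 90), (0.05, 100)]
print(estimate_quantile(0.95, latency_buckets))
```

The estimate can only land somewhere between two bucket boundaries, so any detail about how latencies are distributed inside a bucket is lost. Trace-based latency does not have this limitation when every request is sampled.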

To use the dashboard, grab `spicedb-dashboard.json` and import it into Datadog.

## Tracing
SpiceDB supports OTLP export of traces. This is configured through environment variables in `docker-compose.yml` on
the `datadog` and `spicedb` services. SpiceDB pushes traces to the Datadog Agent via its OTLP endpoint,
and the Agent then forwards them to Datadog.
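
The two sides only work together if SpiceDB targets the port that the agent's OTLP receiver listens on. A small sketch of that consistency check follows; the function name is hypothetical, and the values are copied from `docker-compose.yml`.

```python
def otlp_ports_match(agent_env, spicedb_env):
    """Return True when SpiceDB's OTLP endpoint targets the agent's receiver port."""
    agent_port = agent_env["DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_ENDPOINT"].rsplit(":", 1)[1]
    spicedb_port = spicedb_env["SPICEDB_OTEL_ENDPOINT"].rsplit(":", 1)[1]
    return agent_port == spicedb_port


# Values as set on the `datadog` and `spicedb` services in this example.
agent_env = {"DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_ENDPOINT": "0.0.0.0:4317"}
spicedb_env = {"SPICEDB_OTEL_ENDPOINT": "datadog:4317"}
print(otlp_ports_match(agent_env, spicedb_env))
```

If either port is changed when adapting this setup, the other needs to change with it.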

## Metrics
SpiceDB exposes a Prometheus metrics endpoint on port 9090 by default. This can be scraped by the Datadog Agent
using its OpenMetrics integration, which is compatible with the Prometheus metrics format. The configuration is
visible in `conf.d/openmetrics.d/conf.yaml`.
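
When extending the `metrics` list, it helps to see which names the endpoint exposes and how they map to config keys. The sketch below uses only the standard library to read `# TYPE` lines from Prometheus text output and strips the `_total` suffix from counters, as the note in `conf.yaml` describes. The function name and the sample payload are illustrative assumptions, not output captured from SpiceDB.

```python
def candidate_metric_keys(exposition_text):
    """Map config-ready metric names to their Prometheus type using '# TYPE' lines."""
    keys = {}
    for line in exposition_text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            name, metric_type = parts[2], parts[3]
            # Counters are listed without the _total suffix in the check config.
            if metric_type == "counter" and name.endswith("_total"):
                name = name[: -len("_total")]
            keys[name] = metric_type
    return keys


# Made-up sample in the Prometheus text exposition format.
SAMPLE = """\
# HELP grpc_server_handled_total Total number of RPCs completed on the server.
# TYPE grpc_server_handled_total counter
grpc_server_handled_total{grpc_code="OK"} 42
# TYPE grpc_server_handling_seconds histogram
grpc_server_handling_seconds_bucket{le="0.005"} 10
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.2e+09
"""
print(candidate_metric_keys(SAMPLE))
```

With the stack running, the same function could be fed the body of `http://localhost:9090/metrics`, since the compose file publishes port 9090.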

### All Available Metrics
The configuration in `conf.d/openmetrics.d/conf.yaml` currently only includes those metrics required to drive the dashboard.
If additional metrics are desired, their names and descriptions can be found in `all_metrics.txt`.

## Logs
SpiceDB writes structured JSON logs to stdout, which can be collected through your normal log collection mechanisms.
788 changes: 788 additions & 0 deletions observability/simple-datadog/all_metrics.txt

Large diffs are not rendered by default.

25 changes: 25 additions & 0 deletions observability/simple-datadog/conf.d/openmetrics.d/conf.yaml
@@ -0,0 +1,25 @@
---
init_config:
  service: "spicedb"

instances:
  - openmetrics_endpoint: "http://spicedb:9090/metrics"
    # Prefixes all of the metrics scraped by DD with `spicedb.`
    namespace: "spicedb"
    metrics:
      - grpc_server_handling_seconds:
          name: "grpc.server.handling"
          type: "histogram"
      - grpc_server_handled:
          name: "grpc.server.handled"
          type: "counter"
      # NOTE: for counter metrics that are suffixed by _total, you need to
      # remove the suffix for Datadog to pick it up correctly.
      - process_cpu_seconds:
          name: "process.cpu.seconds"
          type: "counter"
      - process_virtual_memory_bytes:
          name: "process.virtual_memory_bytes"
          type: "gauge"
    histogram_buckets_as_distributions: true
    min_collection_interval: 5
5 changes: 5 additions & 0 deletions observability/simple-datadog/datadog.yaml
@@ -0,0 +1,5 @@
---
# We also don't care about logs
process_config:
  container_collection:
    enabled: false
90 changes: 90 additions & 0 deletions observability/simple-datadog/docker-compose.yml
@@ -0,0 +1,90 @@
---
name: "metrics-adventure"

services:
  datadog:
    image: "datadog/agent"
    environment:
      DD_API_KEY: "${DD_API_KEY}"
      DD_HOSTNAME: "yetibox"

      # NOTE: you can't set this via the config; it needs to come from environment variables.
      # These are the environment variables that enable collection via OTLP.
      DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_ENDPOINT: "0.0.0.0:4317"
      # Tells Datadog that we want to send along traces.
      # Additional variables would be needed for OTLP logging and metrics
      # if desired.
      DD_OTLP_CONFIG_TRACES_ENABLED: true
    volumes:
      - "./conf.d:/conf.d"
      - "./datadog.yaml:/datadog.yaml"

  spicedb:
    image: "authzed/spicedb:v1.38.1"
    command: "serve"
    restart: "on-failure"
    environment:
      SPICEDB_GRPC_PRESHARED_KEY: "thisisnotasecret"
      SPICEDB_DATASTORE_ENGINE: &datastore_engine "postgres"
      SPICEDB_DATASTORE_CONN_URI: &datastore_conn_uri "postgres://postgres:secret@database:5432/spicedb?sslmode=disable"

      # OTLP flags; this is how traces are sent to the Datadog agent.
      # This matches the receiver endpoint configured on the agent.
      SPICEDB_OTEL_ENDPOINT: "datadog:4317"
      # Connect to the OTLP endpoint without TLS (plain HTTP rather than HTTPS).
      SPICEDB_OTEL_INSECURE: true
      # Use gRPC for submission, since the DD agent supports it. `otlphttp`
      # is a valid option as well.
      SPICEDB_OTEL_PROVIDER: "otlpgrpc"
      # Send all traces. This defaults to 0.01 and should be tuned for your system.
      SPICEDB_OTEL_SAMPLE_RATIO: 1.0
      # The default is w3c, which Datadog doesn't support. To get trace
      # propagation from a Datadog-instrumented application, you'll need to
      # ensure that the APM is configured to attach B3 trace propagation headers.
      SPICEDB_OTEL_TRACE_PROPAGATOR: "b3"
    ports:
      - "9090:9090"
    depends_on:
      - "spicedb-migrate"

  spicedb-migrate:
    image: "authzed/spicedb:v1.38.1"
    command: "migrate head"
    restart: "on-failure"
    environment:
      SPICEDB_DATASTORE_ENGINE: *datastore_engine
      SPICEDB_DATASTORE_CONN_URI: *datastore_conn_uri
    depends_on:
      - "database"

  # Load generation tooling. This is to ensure that there are metrics
  # and traces to look at.
  thumper:
    image: "authzed/thumper"
    command: "run --endpoint spicedb:50051 --token thisisnotasecret /scripts/example.yaml"
    depends_on:
      - "spicedb"
      - "thumper-init"
  # This runs the schema write for the subsequent load generation.
  thumper-init:
    image: "authzed/thumper"
    command: "migrate --endpoint spicedb:50051 --token thisisnotasecret /scripts/schema.yaml"
    # Restarting on failure should mean that the init reattempts until it succeeds
    restart: "on-failure"
    depends_on:
      spicedb:
        condition: "service_started"

  database:
    image: "postgres:16"
    ports:
      - "5432:5432"
    environment:
      - "POSTGRES_PASSWORD=secret"
      - "POSTGRES_DB=spicedb"
    # This keeps postgres's data around when you bring the system down and back up
    volumes:
      - "pgdata:/var/lib/postgresql/data"

volumes:
  pgdata:
1 change: 1 addition & 0 deletions observability/simple-datadog/placeholder.env
@@ -0,0 +1 @@
DD_API_KEY=
1 change: 1 addition & 0 deletions observability/simple-datadog/spicedb-dashboard.json

Large diffs are not rendered by default.
