Commit 257dab6

aldy505, markstory, and BYK authored

docs(self-hosted): explain self-hosted data flow (#13745)

Closes getsentry/self-hosted#3585
Preview here: https://develop-docs-git-fork-aldy505-docs-self-hosteddata-flow.sentry.dev/self-hosted/data-flow/

Co-authored-by: Mark Story <[email protected]>
Co-authored-by: Burak Yigit Kaya <[email protected]>

1 parent 1dffebb commit 257dab6

File tree

1 file changed: +98 −0 lines

---
title: Self-hosted Data Flow
sidebar_title: Data Flow
sidebar_order: 20
description: Learn about the data flow of self-hosted Sentry
---

This diagram shows the data flow of self-hosted Sentry. It is similar to the [Application Architecture](/application-architecture/overview/) diagram, but focuses on the self-hosted components.
```mermaid
graph LR
    kafka@{ shape: cyl, label: "Kafka\n(eventstream)" }
    redis@{ shape: cyl, label: "Redis" }
    postgres@{ shape: cyl, label: "Postgres" }
    memcached@{ shape: cyl, label: "Memcached" }
    clickhouse@{ shape: cyl, label: "Clickhouse" }
    smtp@{ shape: win-pane, label: "SMTP Server" }
    symbol-server@{ shape: win-pane, label: "Public/Private Symbol Servers" }
    internet@{ shape: trap-t, label: "Internet" }

    internet --> nginx

    nginx -- Event submitted by SDKs --> relay
    nginx -- Web UI & API --> web

    subgraph querier [Event Querier]
        snuba-api --> clickhouse
    end

    subgraph processing [Event Processing]
        kafka --> snuba-consumer --> clickhouse
        snuba-consumer --> kafka
        kafka --> snuba-replacer --> clickhouse
        kafka --> snuba-subscription-scheduler --> clickhouse
        kafka --> snuba-subscription-executor --> clickhouse
        redis -- As a celery queue --> sentry-consumer
        kafka --> sentry-consumer --> kafka
        kafka --> sentry-post-process-forwarder --> kafka
        sentry-post-process-forwarder -- Preventing concurrent processing of the same event --> redis

        vroom-blob-storage@{ shape: cyl, label: "Blob Storage\n(default is filesystem)" }

        kafka -- Profiling event processing --> vroom -- Republish to Kafka to be consumed by Snuba --> kafka
        vroom --> snuba-api
        vroom -- Store profiles data --> vroom-blob-storage

        outgoing-monitors@{ shape: win-pane, label: "Outgoing HTTP Monitors" }
        redis -- Fetching uptime configs --> uptime-checker -- Publishing uptime monitoring results --> kafka
        uptime-checker --> outgoing-monitors
    end

    subgraph ui [Web User Interface]
        sentry-blob-storage@{ shape: cyl, label: "Blob Storage\n(default is filesystem)" }

        web --> worker
        web --> postgres
        web -- Caching layer --> memcached
        web -- Queries on event (errors, spans, etc) data (to snuba-api) --> snuba-api
        web -- Avatars, attachments, etc --> sentry-blob-storage
        worker -- As a celery queue --> redis
        worker --> postgres
        worker -- Alert & digest emails --> smtp
        web -- Sending test emails --> smtp
    end

    subgraph ingestion [Event Ingestion]
        relay@{ shape: rect, label: 'Relay' }
        sentry_ingest_consumer[sentry-ingest-consumers]

        relay -- Process envelope into specific types --> kafka --> sentry_ingest_consumer -- Caching event data (to redis) --> redis
        relay -- Register relay instance --> web
        relay -- Fetching project configs (to redis) --> redis
        sentry_ingest_consumer -- Symbolicate stack traces --> symbolicator --> symbol-server
        sentry_ingest_consumer -- Save event payload to Nodestore --> postgres
        sentry_ingest_consumer -- Republish to events topic --> kafka
    end
```

### Event Ingestion Pipeline

1. Events from the SDKs are sent to the `relay` service.
2. Relay parses the incoming envelope and validates the DSN and project ID, reading project config data from `redis`.
3. Relay builds a new payload for the Sentry ingest consumers and sends it to `kafka`.
4. The Sentry `ingest-*` consumers (where `*` is the event type: errors, transactions, profiles, etc.) consume the event, cache it in `redis`, and start the `preprocess_event` task.
5. The `preprocess_event` task symbolicates stack traces with the `symbolicator` service and processes the event according to its event type.
6. The `preprocess_event` task saves the event payload to nodestore (the default nodestore backend is `postgres`).
7. The `preprocess_event` task publishes the event to `kafka` under the `events` topic.
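
The shape of these steps can be sketched with in-memory stand-ins for Kafka and Redis. This is an illustrative sketch only; all function and key names here are hypothetical, not Sentry's actual internal API.

```python
# Illustrative sketch of the ingest flow above. `kafka` and `redis` are
# in-memory stand-ins, and every name is hypothetical.
import json
from collections import defaultdict

kafka = defaultdict(list)   # topic name -> list of raw messages
redis = {}                  # key -> cached value

def relay_publish(envelope: dict) -> None:
    """Relay: route the envelope to a type-specific ingest topic (step 3)."""
    topic = f"ingest-{envelope['type']}"   # e.g. ingest-errors, ingest-profiles
    kafka[topic].append(json.dumps(envelope))

def preprocess_event(event: dict) -> None:
    """Steps 5-7: symbolication and nodestore writes would happen here,
    then the event is republished to the `events` topic."""
    kafka["events"].append(json.dumps(event))

def ingest_consumer(topic: str) -> None:
    """Ingest consumer: cache event data, then kick off preprocessing (step 4)."""
    for raw in kafka[topic]:
        event = json.loads(raw)
        redis[f"event:{event['event_id']}"] = raw   # step 4: cache in redis
        preprocess_event(event)

relay_publish({"type": "errors", "event_id": "abc123"})
ingest_consumer("ingest-errors")
```

The important property the sketch keeps is the hand-off direction: Relay only ever writes forward into Kafka, and the ingest consumer is the first component that touches Redis and the `events` topic.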

### Event Processing Pipeline

1. The `snuba-consumer` service consumes events from the `events` topic and processes them. After the events are written to Clickhouse, Snuba publishes error and transaction events to the `post-process-forwarder` topic.
2. The Sentry `post-process-forwarder` consumer consumes those messages and spawns a `post_process_group` task for each processed error and issue occurrence.
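
The diagram notes that `sentry-post-process-forwarder` uses Redis to prevent concurrent processing of the same event. One way to picture that guard is a set-if-absent flag per event, sketched below with an in-memory dict standing in for Redis; the names are hypothetical, not Sentry's actual implementation.

```python
# Hedged sketch of per-event deduplication in the post-process forwarder,
# modeled on Redis SETNX semantics. All names are illustrative.
redis = {}      # stand-in for Redis
spawned = []    # record of spawned post_process_group tasks

def setnx(key: str) -> bool:
    """Redis SETNX: set the key only if absent; True means we won the race."""
    if key in redis:
        return False
    redis[key] = 1
    return True

def post_process_forwarder(message: dict) -> None:
    # Only the first delivery of an event spawns the task; a duplicate
    # delivery of the same Kafka message is a no-op.
    if setnx(f"post-process:{message['event_id']}"):
        spawned.append(("post_process_group", message["event_id"]))

post_process_forwarder({"event_id": "abc"})
post_process_forwarder({"event_id": "abc"})   # duplicate delivery: skipped
```

Because Kafka delivers at-least-once, some idempotency guard like this is needed so a redelivered message does not trigger duplicate alerts or digests.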

### Web User Interface

1. The `web` service is what you see: the Django web UI and API that serve Sentry's frontend.
2. The `worker` service mainly consumes tasks from `redis`, which acts as a Celery queue. One notable task is sending emails through the SMTP server.
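
The web/worker split can be pictured as a producer and consumer sharing one queue: `web` enqueues a task description, and `worker` pops and executes it (Celery layers retries, routing, and serialization on top of this). The sketch below uses an in-memory deque for the Redis-backed queue and a list for the SMTP server; every name is hypothetical.

```python
# Illustrative sketch of web enqueueing work for the Celery worker.
# The deque stands in for the Redis list Celery uses as a broker.
import json
from collections import deque

queue = deque()     # stand-in for the Redis-backed Celery queue
sent_emails = []    # stand-in for mail handed to the SMTP server

def enqueue(task: str, **kwargs) -> None:
    """web: push a serialized task description onto the queue."""
    queue.append(json.dumps({"task": task, "kwargs": kwargs}))

def worker_tick() -> None:
    """worker: pop one task and run it."""
    job = json.loads(queue.popleft())
    if job["task"] == "send_alert_email":
        # A real worker would connect to the SMTP server here.
        sent_emails.append(job["kwargs"]["to"])

enqueue("send_alert_email", to="oncall@example.com")
worker_tick()
```

This decoupling is why slow work like email delivery and digests never blocks a web request: `web` returns as soon as the task is enqueued.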
