Commit e887fbc

author
Evgeniy Khyst
committed
Update README.md
1 parent 6a58bf2 commit e887fbc

9 files changed

+110
-82
lines changed

README.md

Lines changed: 110 additions & 40 deletions
Original file line numberDiff line numberDiff line change
@@ -17,20 +17,6 @@
1717
* [Drawbacks](#0cfc0523189294ac086e11c8e286ba2d)
1818
* [How to Run the Sample?](#53af957fc9dc9f7083531a00fe3f364e)
1919

20-
## <a name="0b79795d3efc95b9976c7c5b933afce2"></a>Introduction
21-
22-
PostgreSQL is the world's most advanced open source database. Also, PostgreSQL is suitable for Event
23-
Sourcing.
24-
25-
This repository provides a sample of event sourced system that uses PostgreSQL as event store.
26-
27-
![PostgreSQL Logo](img/potgresql-logo.png)
28-
29-
See also
30-
31-
* [Event Sourcing with Kafka and ksqlDB](https://github.com/evgeniy-khist/ksqldb-event-souring)
32-
* [Event Sourcing with EventStoreDB](https://github.com/evgeniy-khist/eventstoredb-event-sourcing)
33-
3420
## <a name="8753dff3c2879207fa06ef1844b1ea4d"></a>Example Domain
3521

3622
The example domain is ride hailing.
@@ -39,6 +25,10 @@ The example domain is ride hailing.
3925
* A driver can accept and complete an order.
4026
* An order can be cancelled before completion.
4127

28+
![Domain use case diagram](img/domain-1.png)
29+
30+
![Domain state diagram](img/domain-2.png)
31+
4232
## <a name="19025f75ca30ec4a46f55a6b9afdeba6"></a>Event Sourcing and CQRS 101
4333

4434
### <a name="436b314e78fec59a76bad8b93b52ee75"></a>State-Oriented Persistence
@@ -57,8 +47,26 @@ Whenever the state of an entity changes, a new event is appended to the list of
5747

5848
Current state of an entity can be restored by replaying all its events.
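
As a minimal sketch of replaying events, the snippet below folds an ordered list of events into the current state of an order. The event type names (`ORDER_PLACED`, `ORDER_ACCEPTED`, and so on), the `Event` class and the payload fields are assumptions made for illustration; they are not taken from the sample's code.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Event:
    version: int
    event_type: str  # hypothetical names, e.g. "ORDER_PLACED"
    data: dict


# Hypothetical mapping from event type to the resulting order status.
STATUS_BY_TYPE = {
    "ORDER_PLACED": "PLACED",
    "ORDER_ACCEPTED": "ACCEPTED",
    "ORDER_COMPLETED": "COMPLETED",
    "ORDER_CANCELLED": "CANCELLED",
}


def apply(state: dict, event: Event) -> dict:
    """Return the next state after applying a single event."""
    return {
        **state,
        **event.data,
        "status": STATUS_BY_TYPE[event.event_type],
        "version": event.version,
    }


def replay(events) -> dict:
    """Restore the current state by applying all events in version order."""
    ordered = sorted(events, key=lambda e: e.version)
    return reduce(apply, ordered, {})


events = [
    Event(1, "ORDER_PLACED", {"rider_id": "r-1"}),
    Event(2, "ORDER_ACCEPTED", {"driver_id": "d-7"}),
    Event(3, "ORDER_COMPLETED", {}),
]
current = replay(events)
```

Because `apply` never mutates its input, replaying only a prefix of the list yields the state the order had at that earlier version.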
5949

50+
Event sourcing is best suited for short-lived entities with a relatively small total number of
51+
events (like orders).
52+
53+
Restoring the state of a short-lived entity by replaying all its events has a negligible
54+
performance impact. Thus, no optimizations for restoring state are required for short-lived
55+
entities.
56+
57+
For long-lived entities (like users or bank accounts) with thousands of events, restoring state
58+
by replaying all events is not optimal, and snapshotting should be considered.
59+
60+
Snapshotting is an optimization technique where a snapshot of the aggregate's state is also saved,
61+
so an application can restore the current state of an aggregate from the snapshot instead of from
62+
scratch.
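
The effect of a snapshot can be sketched as follows. This is an illustration only: the account-like state, the `(version, amount)` event shape and the `restore` helper are assumptions, not part of the sample. The snapshot records the version it was taken at, so only later events need to be applied.

```python
def apply(state, event):
    """Fold one (version, amount) event into an account-like state."""
    version, amount = event
    return {"version": version, "balance": state["balance"] + amount}


def restore(events, snapshot=None):
    """Restore state, starting from a snapshot when one is available.

    Returns the restored state and the number of events that were applied.
    A real store would fetch only events newer than the snapshot version;
    here they are filtered in memory to keep the sketch short.
    """
    state = snapshot or {"version": 0, "balance": 0}
    applied = 0
    for event in events:
        if event[0] > state["version"]:
            state = apply(state, event)
            applied += 1
    return state, applied


events = [(version, 10) for version in range(1, 1001)]

# Without a snapshot every event is replayed.
full_state, full_applied = restore(events)

# A snapshot taken at version 990 leaves only the last 10 events to apply.
snapshot, _ = restore(events[:990])
snap_state, snap_applied = restore(events, snapshot)
```

Both calls end in the same state; the difference is how many events had to be applied to get there.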
63+
64+
![Snapshotting in event sourcing](img/event-sourcing-snapshotting.png)
65+
6066
An entity in event sourcing is also referenced as an aggregate.
6167

68+
A sequence of events for the same aggregate is also referred to as a stream.
69+
6270
### <a name="b2cf9293622451d86574d2973398ca70"></a>CQRS
6371

6472
CQRS (Command-query responsibility segregation) stands for segregating the responsibility between
@@ -69,7 +77,10 @@ A command generates zero or more events or results in an error.
6977

7078
![CQRS](img/cqrs-1.png)
7179

72-
Event sourcing is usually used in conjunction with CQRS.
80+
CQRS is a self-sufficient architectural pattern and doesn't require event sourcing.
81+
82+
Event sourcing is usually used in conjunction with CQRS. The event store is used as the write database and
83+
an SQL or NoSQL database as the read database.
7384

7485
![CQRS](img/cqrs-2.png)
7586

@@ -78,16 +89,18 @@ integration with other bounded contexts. Integration events representing the cur
7889
aggregate should be used for communication between bounded contexts instead of raw event sourcing
7990
change events.
8091

81-
### <a name="d8818c2c5ba0364540a49273f684b85c"></a>Advantages of Event Sourcing and CQRS
92+
### <a name="cc00871be6276415cfb13eb24e97fe48"></a>Advantages of CQRS
93+
94+
* Independent scaling of the read and write databases.
95+
* Optimized data schema for the read database (e.g. the read databases can be denormalized).
96+
* Simpler queries (e.g. complex `JOIN` operations can be avoided).
97+
98+
### <a name="845b7e034fb763fcdf57e9467c0a8707"></a>Advantages of Event Sourcing
8299

83100
* Having a true history of the system (audit and traceability).
84101
* Ability to put the system in any prior state (e.g. for debugging).
85102
* Read-side projections can be created as needed (later) from events. It allows responding to future
86103
needs and new requirements.
87-
* Independent scaling. CQRS allows We can scale the read and write databases independently of each
88-
other.
89-
* Optimized data schema for read database (e.g. the read databases can be denormalized).
90-
* Simpler queries (e.g. complex `JOIN` operations can be avoided).
91104

92105
## <a name="70b356f41293ace9df0d04cd8175ac35"></a>Requirements for Event Store
93106

@@ -102,47 +115,104 @@ change events.
102115

103116
## <a name="9f6302143996033ebb94d536b860acc3"></a>Solution Architecture
104117

105-
TBD
118+
PostgreSQL can be used as an event store. It natively supports appending events, concurrency
119+
control, and reading events. Subscribing to events requires additional implementation.
120+
121+
![PostgreSQL event store ER diagram](img/postgresql-event-store-1.png)
122+
123+
A separate table, `ORDER_AGGREGATE`, keeps track of the latest version of each aggregate. It is
124+
required for optimistic concurrency control.
125+
126+
PostgreSQL doesn't allow subscribing to changes, so the solution is the Transactional outbox pattern. A
127+
service that uses a database inserts events into an *outbox* table as part of the local transaction.
128+
A separate Message Relay process publishes the events inserted into the database to a message broker.
129+
130+
![Transactional outbox pattern](img/transactional-outbox-1.png)
131+
132+
With the event sourcing database model, the classical Transactional outbox pattern can be simplified. An
133+
*outbox* table is used to keep track of handled events. The outbox handler (aka *Message Relay*
134+
and *Polling Publisher*) processes new events by polling the database's *outbox* table.
135+
136+
![Simplified transactional outbox pattern](img/transactional-outbox-2.png)
137+
138+
Event processing includes updating the read model and publishing integration events.
139+
140+
All parts together look like this:
141+
142+
![PostgreSQL event store](img/postgresql-event-store-2.png)
106143

107144
### <a name="205928bf89c3012be2e11d1e5e7ad01f"></a>Permanent Storage
108145

109-
TBD
146+
PostgreSQL stores all data permanently by default.
110147

111148
### <a name="6eec4db0e612f3a70dab7d96c463e8f6"></a>Optimistic concurrency control
112149

113-
TBD
150+
Optimistic concurrency control is done by checking aggregate versions in the `ORDER_AGGREGATE`
151+
table.
152+
153+
Appending an event consists of two SQL statements executed in a single transaction:
154+
155+
1. Check that the actual and expected versions match and increment the version
156+
```sql
157+
UPDATE ORDER_AGGREGATE SET VERSION = VERSION + 1 WHERE ID = ? AND VERSION = ?
158+
```
159+
2. Insert new event
160+
```sql
161+
INSERT INTO ORDER_EVENT(AGGREGATE_ID, VERSION, EVENT_TYPE, JSON_DATA) VALUES(?, ?, ?, ?)
162+
```
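
The two statements above can be exercised end to end with the sketch below. It is not the sample's implementation: Python's built-in `sqlite3` module stands in for PostgreSQL so the snippet runs without a server, the table definitions are an assumed minimal schema, and `append_event` and `ConcurrencyError` are hypothetical names. The sketch also assumes that the inserted event carries the incremented version. The key point is that the `UPDATE` affects zero rows when the expected version is stale, which aborts the transaction.

```python
import sqlite3

# Assumed minimal schema; sqlite3 is only a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ORDER_AGGREGATE (
        ID TEXT PRIMARY KEY,
        VERSION INTEGER NOT NULL
    );
    CREATE TABLE ORDER_EVENT (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        AGGREGATE_ID TEXT NOT NULL REFERENCES ORDER_AGGREGATE(ID),
        VERSION INTEGER NOT NULL,
        EVENT_TYPE TEXT NOT NULL,
        JSON_DATA TEXT NOT NULL,
        UNIQUE (AGGREGATE_ID, VERSION)
    );
    INSERT INTO ORDER_AGGREGATE (ID, VERSION) VALUES ('order-1', 0);
""")


class ConcurrencyError(Exception):
    """Raised when the expected version does not match the stored one."""


def append_event(conn, aggregate_id, expected_version, event_type, json_data):
    # "with conn" commits on success and rolls back if an exception is raised.
    with conn:
        cursor = conn.execute(
            "UPDATE ORDER_AGGREGATE SET VERSION = VERSION + 1 "
            "WHERE ID = ? AND VERSION = ?",
            (aggregate_id, expected_version),
        )
        if cursor.rowcount != 1:
            raise ConcurrencyError(
                f"{aggregate_id}: expected version {expected_version} is stale"
            )
        conn.execute(
            "INSERT INTO ORDER_EVENT(AGGREGATE_ID, VERSION, EVENT_TYPE, JSON_DATA) "
            "VALUES(?, ?, ?, ?)",
            (aggregate_id, expected_version + 1, event_type, json_data),
        )


append_event(conn, "order-1", 0, "ORDER_PLACED", "{}")

# A second writer that still believes the version is 0 is rejected.
try:
    append_event(conn, "order-1", 0, "ORDER_CANCELLED", "{}")
    conflict = False
except ConcurrencyError:
    conflict = True
```

The caller that hits the conflict would typically reload the aggregate and retry the command against the new version.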
114163

115164
### <a name="323effe18de24bcc666f161931c903f3"></a>Loading current state
116165

117-
TBD
166+
The current state of an aggregate can be loaded using a simple query that fetches all aggregate events
167+
ordered by version in ascending order:
168+
169+
```sql
170+
SELECT ID, EVENT_TYPE, JSON_DATA FROM ORDER_EVENT WHERE AGGREGATE_ID = ? ORDER BY VERSION ASC
171+
```
118172

119173
### <a name="784ff5dca3b046266edf61637822bbff"></a>Subscribe to all events by aggregate type
120174

121-
Dual writes problem occur when you need to update the database **and** send the message
122-
reliable/atomically and 2-phase commit (2PC) is not an option.
175+
PostgreSQL doesn't allow subscribing to changes, so the solution is the Transactional outbox pattern or
176+
its variations.
177+
178+
`ORDER_EVENT_OUTBOX` table keeps track of all subscribers (consumer groups) and the last processed
179+
event ID.
180+
181+
The concept of consumer groups is required to deliver events to only one consumer from the group.
182+
This is achieved by acquiring a lock on the same record of the `ORDER_EVENT_OUTBOX` table.
123183
124-
PostgreSQL doesn't allow subscribing on changes, so the solution to the dual writes problem is
125-
Transactional outbox pattern. An *outbox* table is used to keep track of handled events. Outbox
126-
handler (aka *message relay* and *polling publisher*) processes new events by polling the
127-
database's *outbox* table. Event processing includes updating the read model and publishing
128-
integration events.
184+
The outbox handler polls the `ORDER_EVENT_OUTBOX` table every second for new events and processes them:
185+
186+
1. Read the last processed event ID and acquire lock
187+
```sql
188+
SELECT LAST_ID FROM ORDER_EVENT_OUTBOX WHERE SUBSCRIPTION_GROUP = ? FOR UPDATE NOWAIT
189+
```
190+
2. Fetch new events
191+
```sql
192+
SELECT ID, EVENT_TYPE, JSON_DATA FROM ORDER_EVENT WHERE ID > ? ORDER BY ID ASC
193+
```
194+
3. Update the ID of the last event processed by the subscription
195+
```sql
196+
UPDATE ORDER_EVENT_OUTBOX SET LAST_ID = ? WHERE SUBSCRIPTION_GROUP = ?
197+
```
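
A single polling pass over the three steps above can be sketched as follows. As before, `sqlite3` is only a stand-in for PostgreSQL, and the reduced table definitions, the `read-model` group name and the `poll_once` helper are assumptions for illustration. SQLite has no `FOR UPDATE NOWAIT`, so the row lock from step 1 is omitted here; against PostgreSQL the first query would keep that clause.

```python
import sqlite3

# Assumed, reduced schema; sqlite3 is only a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ORDER_EVENT (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        EVENT_TYPE TEXT NOT NULL,
        JSON_DATA TEXT NOT NULL
    );
    CREATE TABLE ORDER_EVENT_OUTBOX (
        SUBSCRIPTION_GROUP TEXT PRIMARY KEY,
        LAST_ID INTEGER NOT NULL
    );
    INSERT INTO ORDER_EVENT_OUTBOX (SUBSCRIPTION_GROUP, LAST_ID)
        VALUES ('read-model', 0);
    INSERT INTO ORDER_EVENT (EVENT_TYPE, JSON_DATA)
        VALUES ('ORDER_PLACED', '{}'), ('ORDER_ACCEPTED', '{}');
""")


def poll_once(conn, group, handle):
    """Process all events newer than the group's checkpoint; return the count."""
    with conn:  # one transaction per polling pass
        # Step 1: read the checkpoint (the row lock is omitted in this stand-in).
        last_id = conn.execute(
            "SELECT LAST_ID FROM ORDER_EVENT_OUTBOX WHERE SUBSCRIPTION_GROUP = ?",
            (group,),
        ).fetchone()[0]
        # Step 2: fetch new events in ID order.
        events = conn.execute(
            "SELECT ID, EVENT_TYPE, JSON_DATA FROM ORDER_EVENT "
            "WHERE ID > ? ORDER BY ID ASC",
            (last_id,),
        ).fetchall()
        for event in events:
            handle(event)
            last_id = event[0]
        # Step 3: advance the checkpoint past the processed events.
        if events:
            conn.execute(
                "UPDATE ORDER_EVENT_OUTBOX SET LAST_ID = ? "
                "WHERE SUBSCRIPTION_GROUP = ?",
                (last_id, group),
            )
    return len(events)


handled = []
first_pass = poll_once(conn, "read-model", handled.append)
second_pass = poll_once(conn, "read-model", handled.append)
```

The second pass finds nothing to do because the checkpoint already points at the last event. A scheduler would call `poll_once` at the chosen interval.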
129198
130199
### <a name="0b584912c4fa746206884e080303ed49"></a>Checkpoints
131200
132-
TBD
201+
The last known position from where the subscription starts getting events is stored in `LAST_ID`
202+
column of `ORDER_EVENT_OUTBOX` table.
133203
134204
### <a name="0cfc0523189294ac086e11c8e286ba2d"></a>Drawbacks
135205
136-
Polling database's *outbox* table for new messages with fixed delay introduces a big lag (delay
137-
between polls) in eventual consistency between the write and read models.
138-
139-
**In case of outbox handler failure duplicates and out of order events are possible.**
206+
1. Polling the database's *outbox* table for new messages with a fixed delay introduces a noticeable lag (the delay
207+
between polls) in eventual consistency between the write and read models.
208+
2. **The Outbox handler might process an event more than once.** It might crash after processing an
209+
event but before recording the fact that it has done so. When it restarts, it will then publish
210+
the message again.
140211

141-
Consumers of integration events should be idempotent and filter duplicates and out of order
142-
integration events.
212+
Consumers of events should be idempotent and filter out duplicate and out-of-order integration events.
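
One way a consumer might do this is sketched below, assuming each integration event carries the aggregate ID and the aggregate version. That event shape and the `IdempotentConsumer` class are assumptions for illustration. Since integration events represent the current state of an aggregate, an event whose version is not newer than the last applied one can simply be skipped.

```python
class IdempotentConsumer:
    """Applies each aggregate's events at most once and never goes backwards."""

    def __init__(self):
        self.last_version = {}  # aggregate ID -> highest version applied
        self.applied = []       # events that actually changed the consumer's state

    def on_event(self, aggregate_id, version, payload):
        """Return True if the event was applied, False if it was skipped."""
        if version <= self.last_version.get(aggregate_id, 0):
            return False  # duplicate or stale (out-of-order) event
        self.last_version[aggregate_id] = version
        self.applied.append((aggregate_id, version, payload))
        return True


consumer = IdempotentConsumer()
results = [
    consumer.on_event("order-1", 1, "PLACED"),
    consumer.on_event("order-1", 2, "ACCEPTED"),
    consumer.on_event("order-1", 2, "ACCEPTED"),  # duplicate delivery
    consumer.on_event("order-1", 1, "PLACED"),    # late, out-of-order delivery
]
```

In a real consumer the `last_version` map would have to be persisted together with the consumer's own state, otherwise a restart would reintroduce duplicates.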
143213

144-
If your system can't accept even small chance of duplicates or unordering, then persistent
145-
subscription listener must be extracted into a separate microservice and run in a single replica (
214+
If your system can't accept even a small chance of duplicates or out-of-order events, then the Message Relay must
215+
be extracted into a separate microservice and run in a single replica (
146216
`.spec.replicas=1` in Kubernetes). This microservice must not be updated using RollingUpdate
147217
Deployment strategy. Recreate Deployment strategy must be used
148218
instead (`.spec.strategy.type=Recreate`

img/domain-1.png

26.9 KB

img/domain-2.png

33.8 KB

126 KB

img/postgresql-event-store-1.png

53.2 KB

img/postgresql-event-store-2.png

124 KB

img/transactional-outbox-1.png

80.7 KB

img/transactional-outbox-2.png

119 KB

queries.sql

Lines changed: 0 additions & 42 deletions
This file was deleted.
