
Commit e141025

committed
docs: add motivation
1 parent 390f89a commit e141025


2 files changed: +236 −6 lines changed


docs/docs/motivation.md

Lines changed: 230 additions & 6 deletions
@@ -1,35 +1,259 @@
---
icon: lucide/brain
---

# Motivation

## The AWS S3 Durability Model

AWS S3 is renowned for providing 11 nines (99.999999999%) of durability. This impressive guarantee is achieved through
a robust architecture that maintains **at least 3 copies of your data across different Availability Zones (AZs)**
within a region. Each AZ consists of one or more physically separate data centers with independent power, cooling,
and networking.

```mermaid
graph TB
    User[User]

    subgraph Region["AWS Region"]
        Endpoint[Regional Endpoint]

        subgraph AZ1["Availability Zone 1"]
            DC1A[Data Center 1A]
            DC1B[Data Center 1B]
        end

        subgraph AZ2["Availability Zone 2"]
            DC2A[Data Center 2A]
            DC2B[Data Center 2B]
        end

        subgraph AZ3["Availability Zone 3"]
            DC3A[Data Center 3A]
            DC3B[Data Center 3B]
        end
    end

    User -->|Request| Endpoint
    Endpoint -->|Write Data| AZ1
    Endpoint -->|Write Data| AZ2
    Endpoint -->|Write Data| AZ3
```

This architecture ensures that if an entire data center experiences a catastrophic failure, your data remains safe and
accessible. For even greater protection, AWS also offers **cross-region replication**, allowing data to be replicated
across geographically distant regions.

## The Budget Provider Model

In contrast, budget-friendly S3-compatible providers like Backblaze B2 typically achieve durability through **erasure
coding within a single data center** rather than replicating complete copies across multiple physical locations.

```mermaid
graph TB
    User[User]

    subgraph DC["Single Data Center"]
        Endpoint[Storage Endpoint]

        subgraph Vault["Backblaze Vault (20 Storage Pods)"]
            Pod1[Pod 1<br/>Shard 1]
            Pod2[Pod 2<br/>Shard 2]
            Pod3[Pod 3<br/>Shard 3]
            PodDots[...]
            Pod17[Pod 17<br/>Shard 17]
            Pod18[Pod 18<br/>Parity 1]
            Pod19[Pod 19<br/>Parity 2]
            Pod20[Pod 20<br/>Parity 3]
        end
    end

    User -->|Request| Endpoint
    Endpoint -->|17 Data Shards| Pod1
    Endpoint -->|+| Pod2
    Endpoint -->|+| Pod3
    Endpoint -->|+| PodDots
    Endpoint -->|+| Pod17
    Endpoint -->|3 Parity Shards| Pod18
    Endpoint -->|+| Pod19
    Endpoint -->|+| Pod20
```

For example, Backblaze's architecture uses **Reed-Solomon erasure coding** (17 data shards + 3 parity shards) to
achieve 11 nines of durability[^3]. Your file is split into 17 pieces, with 3 additional parity pieces calculated
from the original data. The file can be reconstructed from any 17 of the 20 shards, allowing the system to
tolerate up to 3 simultaneous drive or pod failures.
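
To make the "any 17 of 20" claim concrete, here is a back-of-the-envelope sketch of the loss probability for such a scheme. The per-shard failure probability below is an invented placeholder, not a Backblaze figure, and real-world durability also depends on rebuild times and correlated failures:

```python
from math import comb

def loss_probability(n: int = 20, k: int = 17, p: float = 1e-3) -> float:
    """P(data loss) for an (n, k) erasure code: loss occurs only when
    more than n - k shards fail at once, assuming independent failures
    with per-shard probability p."""
    # With n=20, k=17, loss requires at least 4 simultaneous shard failures.
    return sum(comb(n, f) * p**f * (1 - p)**(n - f)
               for f in range(n - k + 1, n + 1))

# With a (made-up) 0.1% per-shard failure probability, the result is
# dominated by the C(20, 4) * p^4 term:
print(f"{loss_probability():.2e}")
```

The point of the sketch is the shape of the math: tolerating `n - k` failures drives the loss probability down by several orders of magnitude per extra parity shard, as long as shards fail independently.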

While this provides excellent protection against individual hardware failures, all shards exist within a **single
physical location**. If the entire data center experiences a catastrophic event, all 20 shards could be lost
simultaneously.

## The Cost vs. Durability Trade-off

While AWS S3 provides exceptional durability, it comes at a premium price. Many S3-compatible storage providers have
emerged offering significantly cheaper alternatives:

- [**Backblaze B2**](https://www.backblaze.com/cloud-storage)
- [**Cloudflare R2**](https://www.cloudflare.com/developer-platform/products/r2/)
- [**Hetzner Object Storage**](https://www.hetzner.com/storage/object-storage/)
- [**OVH Object Storage**](https://www.ovhcloud.com/en-ie/public-cloud/object-storage/)
- [**MinIO**](https://www.min.io/) (allows self-hosting)
- And many others

These providers are often **considerably more affordable** than AWS S3. However, these cost savings come with a
trade-off: **reduced protection against data center-level failures**.

### Single-Location Storage

As shown in the Backblaze example above, budget-friendly S3-compatible providers typically use **erasure coding or
RAID within a single data center** rather than maintaining complete copies across multiple physical locations. While
this provides excellent protection against individual hardware failures, all data remains in one geographic location.

### What It Takes to Lose Data

A **catastrophic failure** means damage severe enough that the stored object data cannot be reconstructed. The
difference in disaster resilience becomes clear when comparing what must fail for permanent data loss to occur:

- **Single Data Center**: if that one DC suffers a catastrophic failure, your data is gone
- **Multi-AZ Architecture**: data loss requires catastrophic failures across **at least 3 different data centers**
  (affecting all 3 AZs)
```mermaid
graph TB
    subgraph SingleDC["Single Data Center Model"]
        DC1["❌ Data Center<br/>(Catastrophic Failure)"]
        style DC1 fill:#ff6b6b,stroke:#c92a2a,stroke-width:4px,color:#fff
        Note1["Data cannot be reconstructed<br/>from anywhere else"]
        style Note1 fill:#fff,stroke:#c92a2a,stroke-width:2px
        DC1 -.-> Note1
    end

    subgraph MultiAZ["Multi-AZ Model"]
        subgraph AZ1M["Availability Zone 1"]
            DC1A["❌ DC 1A<br/>(Catastrophic)"]
            DC1B["DC 1B"]
            style DC1A fill:#ff6b6b,stroke:#c92a2a,stroke-width:3px,color:#fff
            style DC1B fill:#51cf66,stroke:#2f9e44,stroke-width:1px,color:#000
        end
        style AZ1M fill:#ffe0e0,stroke:#c92a2a,stroke-width:2px

        subgraph AZ2M["Availability Zone 2"]
            DC2A["❌ DC 2A<br/>(Catastrophic)"]
            DC2B["DC 2B"]
            style DC2A fill:#ff6b6b,stroke:#c92a2a,stroke-width:3px,color:#fff
            style DC2B fill:#51cf66,stroke:#2f9e44,stroke-width:1px,color:#000
        end
        style AZ2M fill:#ffe0e0,stroke:#c92a2a,stroke-width:2px

        subgraph AZ3M["Availability Zone 3"]
            DC3A["❌ DC 3A<br/>(Catastrophic)"]
            DC3B["DC 3B"]
            style DC3A fill:#ff6b6b,stroke:#c92a2a,stroke-width:3px,color:#fff
            style DC3B fill:#51cf66,stroke:#2f9e44,stroke-width:1px,color:#000
        end
        style AZ3M fill:#ffe0e0,stroke:#c92a2a,stroke-width:2px

        Note2["Catastrophic failures in at least<br/>3 different DCs (one per AZ)<br/>= Data cannot be reconstructed"]
        style Note2 fill:#fff,stroke:#c92a2a,stroke-width:2px
        DC1A -.-> Note2
        DC2A -.-> Note2
        DC3A -.-> Note2
    end
```

With **single-location storage**, a catastrophic failure of one data center means total data loss: there is nowhere
else to reconstruct from. With a **multi-AZ architecture**, your data remains safe even if an entire AZ is destroyed;
data becomes unrecoverable only in the highly unlikely event of simultaneous catastrophic failures across at least
3 different data centers in geographically separated locations.

!!! danger "Risk of Data Loss"
    If the data center hosting your data experiences a catastrophic failure (fire, flood, power loss, etc.), you could
    face **permanent data loss**. Unlike AWS S3's multi-AZ architecture, there are no additional copies in separate
    physical locations to fall back on.

This is not a theoretical risk: in March 2021, a fire at an OVH data center in Strasbourg destroyed servers and
resulted in permanent data loss for customers who did not have off-site backups[^1] [^2].

## Limitations of Native Replication

Some S3-compatible providers do offer native replication features. For example, **Backblaze B2** provides bucket
replication[^4]. However, these solutions have significant limitations:

### Async-Only Replication

Native replication is typically **asynchronous**: there is a delay, potentially of several hours[^4], between when
data is written to the primary location and when it appears in replicas. During this window, you are vulnerable to
data loss if the primary fails.

### Single-Cloud Restriction

Native replication features are **confined to the same cloud provider**. For example, Backblaze can only replicate to
other Backblaze buckets[^4]. You cannot replicate from Backblaze to MinIO, or from Hetzner to OVH.

### No Cross-Cloud Disaster Recovery

If you want to protect against a provider-level failure (e.g., the provider goes out of business, suffers a widespread
service outage, or runs into compliance issues), native replication cannot help you: all copies remain with the same
vendor.

## The Need for Manual Replication

To achieve AWS-like durability with budget storage providers, you need to **manually implement replication as a backup
strategy**. This increases your effective durability by maintaining copies across multiple independent storage
locations or providers.

### Option 1: Dual Writes in Your Application

You can implement replication directly in your application code:

```python
import boto3

# Sketch: the endpoint URLs are placeholders for your two providers.
s3_client_primary = boto3.client('s3', endpoint_url='https://s3.primary.example')
s3_client_backup = boto3.client('s3', endpoint_url='https://s3.backup.example')

def upload_file(file, key):
    # Write every object twice, once per backend (no error handling yet).
    s3_client_primary.put_object(Bucket='primary', Key=key, Body=file)
    s3_client_backup.put_object(Bucket='backup', Key=key, Body=file)
```

**Drawbacks:**

- Requires modifying application code
- Must be implemented consistently across all applications
- Increases application complexity
- Difficult to change replication strategies
- Error handling becomes complicated

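To illustrate the last two drawbacks, here is a sketch of a dual write with even minimal failure handling. `FlakyStore` is a hypothetical in-memory stand-in for an S3 client (not a real library), used only to show how quickly the error paths multiply:

```python
class FlakyStore:
    """In-memory stand-in for an S3 client; can fail on demand."""
    def __init__(self, fail: bool = False):
        self.objects, self.fail = {}, fail

    def put_object(self, Bucket, Key, Body):
        if self.fail:
            raise IOError("backend unavailable")
        self.objects[(Bucket, Key)] = Body

    def delete_object(self, Bucket, Key):
        self.objects.pop((Bucket, Key), None)

primary, backup = FlakyStore(), FlakyStore(fail=True)

def upload_file(body, key):
    primary.put_object(Bucket='primary', Key=key, Body=body)
    try:
        backup.put_object(Bucket='backup', Key=key, Body=body)
    except IOError:
        # The backends have diverged: roll back the first write so they
        # stay consistent, then surface the error. Every application doing
        # dual writes must pick a policy like this and apply it everywhere.
        primary.delete_object(Bucket='primary', Key=key)
        raise
```

Rollback is only one possible policy (a retry queue is another); the point is that each application has to choose and implement one, consistently.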
### Option 2: Use a Transparent Proxy (ReplicaT4)

ReplicaT4 acts as a proxy layer between your application and storage backends:

```python
# No code changes needed: keep using the standard S3 client,
# but point it at the ReplicaT4 endpoint.
s3_client = boto3.client('s3', endpoint_url='http://replicat4:3000')
s3_client.put_object(Bucket='my-bucket', Key=key, Body=file)
# ReplicaT4 automatically replicates to all configured backends
```

**Benefits:**

- **Zero application code changes**: your apps continue using standard S3 APIs
- **Centralized replication logic**: change strategies without touching application code
- **Consistent replication** across all applications automatically
- **Flexible consistency models**: choose between async (fast) and sync (consistent) replication
- **Mix and match providers**: combine different storage backends seamlessly

## Why ReplicaT4?

ReplicaT4 solves these challenges by providing:

- **Provider-agnostic replication**: works with any S3-compatible storage
- **Cross-cloud capability**: replicate across different providers (Backblaze → MinIO → Hetzner)
- **Flexible consistency models**: choose async for speed or sync for strong consistency
- **Application transparency**: no code changes required
- **Unified control**: manage all replication from a single configuration

Whether you're using budget providers to reduce costs or implementing a defense-in-depth strategy against vendor
lock-in, ReplicaT4 enables you to achieve the durability you need without sacrificing flexibility or breaking the bank.

[^1]: [Reddit Discussion: Did OVH customers lose data that shouldn't have been lost?](https://www.reddit.com/r/webhosting/comments/m8e5so/eli5_did_ovh_customers_lose_data_that_shouldnt/)
[^2]: [Techzine: OVH shares overview of data lost in fire](https://www.techzine.eu/news/infrastructure/57005/ovh-share-overview-of-data-lost-in-fire/)
[^3]: [Backblaze Vaults: Zettabyte-Scale Cloud Storage Architecture](https://www.backblaze.com/blog/vault-cloud-storage-architecture/)
[^4]: [Backblaze B2 Cloud Replication](https://www.backblaze.com/docs/cloud-storage-cloud-replication)

docs/zensical.toml

Lines changed: 6 additions & 0 deletions
```diff
@@ -304,6 +304,12 @@ icon = "fontawesome/brands/github"
 link = "https://github.com/barreeeiroo/ReplicaT4"
 
+[project.markdown_extensions.admonition]
+
+[project.markdown_extensions.footnotes]
+
+[project.markdown_extensions.pymdownx.details]
+
 [project.markdown_extensions.pymdownx.superfences]
 custom_fences = [
     { name = "mermaid", class = "mermaid", format = "pymdownx.superfences.fence_code_format" }
```
