Commit 962fe40

Xuanwo authored and soyeric128 committed
Add rfc Disaster Recovery
Signed-off-by: Xuanwo <[email protected]>
1 parent d96cfb6 commit 962fe40

1 file changed: 260 additions & 0 deletions

---
title: Disaster Recovery
description: Enable Databend to recover from disasters involving the loss of either metadata or data.
---

- RFC PR: [datafuselabs/databend#0000](https://github.com/databendlabs/databend/pull/0000)
- Tracking Issue: [datafuselabs/databend#0000](https://github.com/databendlabs/databend/issues/0000)

## Summary

Enable Databend to recover from disasters involving the loss of either metadata or data.

## Motivation

Databend is designed to be highly available and fault-tolerant. Its metadata is served by Databend MetaSrv, which is powered by [OpenRaft](https://github.com/databendlabs/openraft). The data is stored in object storage systems such as S3, GCS, and others, which guarantee 99.99% availability and 99.999999999% durability.

However, this is insufficient for enterprise users who require a robust disaster recovery plan. These users either have significant needs for cross-continent disaster recovery or must comply with stringent regulatory requirements.

For example, [The Health Insurance Portability and Accountability Act (HIPAA)](https://www.hhs.gov/hipaa/index.html) mandates that healthcare organizations develop and implement contingency plans. Such planning ensures that, in the event of a natural or man-made disaster disrupting operations, the business can continue functioning until regular services are restored.

This RFC proposes a solution to enable Databend to recover from disasters involving the loss of metadata or data.
## Guide-level explanation

This RFC is the first step in enabling Databend to recover from disasters involving the loss of metadata or data. We will support `BACKUP` and `RESTORE` commands to back up and restore both metadata and data at the same time.

`BACKUP` and `RESTORE` a table:

```sql
BACKUP TABLE [ <database_name>. ]<table_name>
INTO { internalStage | externalStage | externalLocation };

RESTORE TABLE
FROM { internalStage | externalStage | externalLocation };
```

`BACKUP` and `RESTORE` a database:

```sql
BACKUP DATABASE <database_name>
INTO { internalStage | externalStage | externalLocation };

RESTORE DATABASE
FROM { internalStage | externalStage | externalLocation };
```

For example, users can back up the `test` table to an external stage:

```sql
BACKUP TABLE test INTO @backup_stage/table/test/2025_01_09_08_00_00/;
```
`BACKUP` supports both full and incremental backups. A full backup backs up all metadata and data, while an incremental backup only backs up the changes since the last full or incremental backup.

`BACKUP` performs incremental backups by default. Users can specify the `FULL` keyword to perform a full backup:

```sql
BACKUP TABLE test INTO @backup_stage/table/test/2025_01_09_08_00_00/ FULL;
```

The backup stores all relevant metadata and data in the backup storage, ensuring that users can restore it even if the entire Databend cluster is lost.

Users can restore the `test` table from the external stage in another Databend cluster:

```sql
RESTORE TABLE FROM @backup_stage/table/test/2025_01_09_08_00_00/;
```

`RESTORE` also supports `DRY RUN` to preview the restore operation without actually restoring the metadata and data:

```sql
RESTORE TABLE FROM @backup_stage/table/test/2025_01_09_08_00_00/ DRY RUN;
```

Users can use `DRY RUN` to check and validate the backup without affecting the existing metadata and data.
### Maintenance

Databend will provide a set of system functions to manage backups:

```sql
-- Scan backup manifests in the given location.
SELECT list_backups(
    -- full identifier of the database or table
    'test',
    -- the location to search for backups
    location => '@backup_stage/table/test/'
);

-- Delete the backup in the given location.
SELECT delete_backup(
    -- full identifier of the database or table
    'test',
    -- the location of the backup to delete
    location => '@backup_stage/table/test/2025_01_09_08_00_00/'
);

-- Vacuum backups in the given location to meet the retention policy.
SELECT vacuum_backup(
    -- full identifier of the database or table
    'test',
    -- the location to search for backups
    location => '@backup_stage/table/test',
    -- keep the most recent 30 days of backups
    RETENTION_DAYS => 30,
    -- keep backups for at least 7 days
    MIN_RETENTION_DAYS => 7,
    -- keep at most 5 full backups
    MAX_FULL_BACKUPS => 5,
    -- keep at least 2 full backups
    MIN_FULL_BACKUPS => 2,
    -- keep at most 10 incremental backups
    MAX_INCREMENTAL_BACKUPS => 10
);
```
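
To make the retention parameters above concrete, here is a minimal sketch of how a vacuum pass could pick the backups to delete. The types and the simplified policy (only a subset of the parameters) are hypothetical illustrations, not Databend's implementation; the real function would read `BackupManifest` files from the backup location rather than an in-memory list.

```rust
use chrono::{DateTime, Duration, Utc};

#[derive(Clone, Copy, PartialEq)]
enum BackupType {
    Full,
    Incremental,
}

struct BackupInfo {
    location: String,
    backup_time: DateTime<Utc>,
    backup_type: BackupType,
}

/// A simplified, hypothetical subset of the vacuum_backup retention parameters.
struct RetentionPolicy {
    retention_days: i64,
    min_retention_days: i64,
    max_full_backups: usize,
}

/// Return the locations of backups the vacuum pass may delete: backups older
/// than `retention_days`, or full backups beyond `max_full_backups`, while
/// never touching anything newer than `min_retention_days`.
fn backups_to_delete(
    backups: &[BackupInfo],
    policy: &RetentionPolicy,
    now: DateTime<Utc>,
) -> Vec<String> {
    // Sort newest first so that "keep the most recent N full backups" is a prefix.
    let mut sorted: Vec<&BackupInfo> = backups.iter().collect();
    sorted.sort_by_key(|b| std::cmp::Reverse(b.backup_time));

    let mut full_backups_seen = 0;
    let mut to_delete = Vec::new();
    for backup in sorted {
        let age = now - backup.backup_time;
        if age < Duration::days(policy.min_retention_days) {
            continue; // always keep very recent backups
        }
        let is_full = backup.backup_type == BackupType::Full;
        if is_full {
            full_backups_seen += 1;
        }
        let too_old = age > Duration::days(policy.retention_days);
        let too_many_fulls = is_full && full_backups_seen > policy.max_full_backups;
        if too_old || too_many_fulls {
            to_delete.push(backup.location.clone());
        }
    }
    to_delete
}
```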
Perhaps we could integrate these functions into SQL commands, allowing us to use `VACUUM BACKUP` to clean up backups. However, we need to take the complexity of the SQL commands into account, so let's start with SQL functions first.
### Use cases

The backup and restore functionality can be used in the following scenarios:

#### Disaster Recovery

Users can back up databases or tables to an external location or storage to safeguard against data loss caused by disasters. In case of a disaster, they can restore the metadata and data into a new Databend cluster to resume operations.

#### Dangerous Operations

Users can back up databases or tables before performing dangerous operations such as `VACUUM TABLE` or `ALTER TABLE`. If the operation fails or causes data loss, they can restore the metadata and data from the backup to recover the lost data.

In this case, users can back up the table directly into the internal stage for quick backup and restore:

```sql
BACKUP TABLE test INTO '~/table/test/2025_01_09_08_00_00/';
```
## Reference-level explanation

Databend will introduce a `BackupManifest`, which stores the following:

- metadata of the given backup: backup time, backup location, backup type (full or incremental), etc.
- the locations of the metadata backup: the locations that point to the metadata backup.
- the locations of the data backup: the locations that contain all table data.

```rust
struct BackupManifest {
    meta: BackupMeta,
    table_meta: BackupTableMeta,
    table_data: Vec<BackupTableData>,
    ...
}

struct BackupMeta {
    backup_time: DateTime<Utc>,
    backup_type: BackupType,
    ...
}

struct BackupTableMeta {
    location: String,
    ...
}

struct BackupTableData {
    source_location: String,
    backup_location: String,
    etag: String,
}
```
The `BackupManifest` will be encoded with protobuf and stored in the backup storage along with the backup metadata and data.

During the backup process, Databend reads the existing table snapshots to generate a `BackupManifest` file, dumps the metadata from metasrv, and copies all related data files to the backup storage.

During the restore process, Databend reads the `BackupManifest` file from the backup storage, copies all related data files back to their original locations, and restores the metadata to metasrv.
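
To make the backup flow concrete, here is a minimal sketch of how the `BackupTableData` entries for a full backup could be built. The `SnapshotFile` type, the `backup_root` layout, and the elided `copy_object` call are hypothetical stand-ins, not Databend's actual internals; the restore path is symmetric, copying each `backup_location` back to its `source_location` before re-registering the metadata in metasrv.

```rust
/// A data file referenced by the current table snapshot (hypothetical type).
struct SnapshotFile {
    location: String,
    etag: String,
}

/// One manifest entry, mirroring `BackupTableData` above.
struct BackupTableData {
    source_location: String,
    backup_location: String,
    etag: String,
}

/// Build the manifest entries for a full backup of one table.
fn backup_data_files(snapshot_files: &[SnapshotFile], backup_root: &str) -> Vec<BackupTableData> {
    let mut entries = Vec::with_capacity(snapshot_files.len());
    for file in snapshot_files {
        // Each referenced data file is copied into the backup location; the
        // manifest records the source, the destination, and the etag so that
        // later incremental backups and restores can reuse it.
        let backup_location = format!("{}/data/{}", backup_root, file.etag);
        // copy_object(&file.location, &backup_location)?;  // actual copy elided
        entries.push(BackupTableData {
            source_location: file.location.clone(),
            backup_location,
            etag: file.etag.clone(),
        });
    }
    entries
}
```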
To perform incremental backups, Databend checks the existing `BackupManifest` file and copies only the modified data files to the backup storage.
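
A minimal sketch of that check, using the same hypothetical types as above: a file needs to be copied only if its etag is missing from, or differs from, the previous manifest.

```rust
use std::collections::HashMap;

/// A data file referenced by the current table snapshot (hypothetical type).
struct SnapshotFile {
    location: String,
    etag: String,
}

/// One `BackupTableData` entry recorded in the previous `BackupManifest`.
struct PreviousEntry {
    source_location: String,
    etag: String,
}

/// Select the files that must be copied for an incremental backup: anything
/// not present in the previous manifest, or present with a different etag
/// (i.e. the object has been rewritten since the last backup).
fn files_to_copy<'a>(
    snapshot_files: &'a [SnapshotFile],
    previous_manifest: &[PreviousEntry],
) -> Vec<&'a SnapshotFile> {
    let previous: HashMap<&str, &str> = previous_manifest
        .iter()
        .map(|e| (e.source_location.as_str(), e.etag.as_str()))
        .collect();

    snapshot_files
        .iter()
        .filter(|f| previous.get(f.location.as_str()).copied() != Some(f.etag.as_str()))
        .collect()
}
```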
The protobuf definition of `BackupManifest` will be versioned to ensure both backward and forward compatibility. This will enable Databend Query to restore backups created using different versions of Databend.
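
As an illustration of what versioning could look like, here is a sketch assuming the `prost` crate and a deliberately simplified, hypothetical message layout (not Databend's actual schema). The manifest carries an explicit version number; unknown fields are ignored on protobuf decode, which covers forward compatibility at the field level, while the version check lets a reader reject or migrate manifests it does not understand.

```rust
use prost::Message;

/// Hypothetical manifest message; Databend's real schema may differ.
#[derive(Clone, PartialEq, Message)]
pub struct BackupManifestPb {
    /// Bumped whenever the manifest layout changes incompatibly.
    #[prost(uint32, tag = "1")]
    pub version: u32,
    /// "full" or "incremental".
    #[prost(string, tag = "2")]
    pub backup_type: String,
    /// Locations of the backed-up data files.
    #[prost(string, repeated, tag = "3")]
    pub backup_locations: Vec<String>,
}

const CURRENT_MANIFEST_VERSION: u32 = 1;

fn encode_manifest(manifest: &BackupManifestPb) -> Vec<u8> {
    manifest.encode_to_vec()
}

fn decode_manifest(bytes: &[u8]) -> Result<BackupManifestPb, String> {
    let manifest = BackupManifestPb::decode(bytes).map_err(|e| e.to_string())?;
    // Older readers bail out (or run a migration) on manifests written by a
    // newer Databend version.
    if manifest.version > CURRENT_MANIFEST_VERSION {
        return Err(format!(
            "unsupported manifest version {}, expected <= {}",
            manifest.version, CURRENT_MANIFEST_VERSION
        ));
    }
    Ok(manifest)
}
```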
## Drawbacks

None.
## Rationale and alternatives

### Why not back up and restore the metadata and data directly?

It's simple and feasible to back up and restore both metadata and data directly. For instance, users can export the metadata using our existing tools and then copy the entire bucket to another location to back up the data.

However, this approach has several drawbacks:

**It's manual and error-prone**

Both the backup and restore processes require manual operations and external tools, making them susceptible to errors. For instance, users might forget to back up the metadata or data, or they might accidentally overwrite the backup files.

During emergency recovery, users must manually restore the metadata and data from the backup under highly stressful conditions, increasing the likelihood of mistakes that could result in data loss or prolonged recovery times.

**It's not transaction-aware**

The backup and restore processes are not transaction-aware, making it highly likely that metadata and data will become inconsistent if these processes are not properly coordinated. For example, metadata might be backed up before the corresponding data, resulting in discrepancies between the two.

Alternatively, we could prevent operations on the table while its metadata and data are being backed up, but this could impact system availability.

**It doesn't work in complex scenarios**

Databend offers excellent support for `stage`. It is common for users to store and access data from various cloud locations. Backing up data from different buckets or even across multiple storage vendors is highly challenging and adds significant complexity to the backup and restore process.
## Prior art

### Databricks Clone

Databricks allows users to perform shallow and deep cloning of a table.

For example, use clone for data archiving:

```sql
CREATE OR REPLACE TABLE archive_table CLONE my_prod_table;
```

Or use clone for short-term experiments on a production table:

```sql
-- Perform shallow clone
CREATE OR REPLACE TABLE my_test SHALLOW CLONE my_prod_table;

UPDATE my_test WHERE user_id is null SET invalid=true;
-- Run a bunch of validations. Once happy:

-- This should leverage the update information in the clone to prune to only
-- changed files in the clone if possible
MERGE INTO my_prod_table
USING my_test
ON my_test.user_id <=> my_prod_table.user_id
WHEN MATCHED AND my_test.user_id is null THEN UPDATE *;

DROP TABLE my_test;
```
## Unresolved questions

None.

## Future possibilities

### Task

Once Databend adds native task support, users will be able to perform scheduled, automatic backups for all existing tables as needed.

### Replication

In the future, we could extend the backup and restore functionality to support replication. This would allow users to replicate databases or tables across different Databend clusters for disaster recovery or data distribution purposes.

Databend could also implement a warm standby to ensure high availability and fault tolerance.

### Iceberg

In the future, Databend could support backing up and restoring Iceberg tables.
