Commit b0bbde9

ns-janandaram and claude committed
feat: add custom encrypted GCS backup scripts
Add client-side encryption scripts for uploading backups to GCS using age encryption. This provides an alternative for users who need to encrypt data before it leaves their servers.

Scripts included:
- encrypted-upload.sh: compress, encrypt, and upload to GCS
- encrypted-download.sh: download, decrypt, and extract
- encrypted-list.sh: list remote backups with decrypted metadata
- encrypted-delete.sh: delete backups from GCS
- setup.sh: one-time installation and key generation
- config-encrypted-gcs.yml: sample configuration
- README.md: comprehensive documentation

Co-Authored-By: Claude Opus 4.5 <[email protected]>
1 parent ebec8c0 commit b0bbde9

7 files changed, +667 -0 lines changed

Lines changed: 342 additions & 0 deletions
@@ -0,0 +1,342 @@
# Custom Encrypted GCS Backup for clickhouse-backup

Client-side encryption for ClickHouse backups before uploading to Google Cloud Storage.

This solution uses [age](https://github.com/FiloSottile/age) encryption to encrypt backups locally before uploading to GCS, ensuring data is encrypted at rest with keys you control.

## Features

- **Client-side encryption** - Data is encrypted before leaving your server
- **Age encryption** - Modern, secure, and simple encryption tool
- **Key control** - You manage your own encryption keys
- **Compatible** - Works with standard clickhouse-backup commands

## Prerequisites

| Tool | Installation | Purpose |
|------|--------------|---------|
| [age](https://github.com/FiloSottile/age) | `apt install age` / `brew install age` | Encryption |
| [gsutil](https://cloud.google.com/sdk/docs/install) | Google Cloud SDK | GCS operations |
| [jq](https://stedolan.github.io/jq/) | `apt install jq` / `brew install jq` | JSON processing |

## Quick Start

```bash
# 1. Clone or navigate to the scripts directory
cd scripts/custom-encrypted-gcs

# 2. Run the setup script
./setup.sh

# 3. Configure your GCS bucket (add to ~/.bashrc or /etc/environment)
export GCS_ENCRYPTED_BUCKET=gs://your-bucket/clickhouse-backups

# 4. Authenticate with GCS
gcloud auth application-default login
# Or for service accounts:
# export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

# 5. Test the setup
clickhouse-backup create test_backup
clickhouse-backup upload test_backup
clickhouse-backup list remote
clickhouse-backup download test_backup
clickhouse-backup delete remote test_backup
```

## Installation

### Automated Setup

```bash
./setup.sh
```

This will:
- Check for required dependencies
- Create directories (`/opt/clickhouse-backup/scripts`, `/etc/clickhouse-backup`)
- Generate an encryption key
- Install scripts and configuration

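The body of `setup.sh` is not reproduced in this README. As a rough sketch of what its dependency check could look like, assuming the tools from the Prerequisites table and a hypothetical helper named `check_deps`:

```shell
# Hypothetical helper (not taken from setup.sh): report every required
# tool that is not on PATH. Returns 0 if all are found, 1 otherwise.
check_deps() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    if [ -n "$missing" ]; then
        echo "Missing required tools:$missing" >&2
        return 1
    fi
    return 0
}

# Possible call, using the tools from the Prerequisites table:
#   check_deps age age-keygen gsutil jq || exit 1
```

Collecting all missing tools before failing lets the user install everything in one pass instead of rerunning the check after each install.
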
### Manual Setup

1. Install dependencies:
   ```bash
   # Ubuntu/Debian
   apt install age jq

   # macOS
   brew install age jq

   # Install Google Cloud SDK
   # https://cloud.google.com/sdk/docs/install
   ```

2. Generate encryption key:
   ```bash
   mkdir -p /etc/clickhouse-backup
   age-keygen -o /etc/clickhouse-backup/encryption.key
   chmod 600 /etc/clickhouse-backup/encryption.key
   ```

3. Copy scripts:
   ```bash
   mkdir -p /opt/clickhouse-backup/scripts
   cp encrypted-*.sh /opt/clickhouse-backup/scripts/
   chmod +x /opt/clickhouse-backup/scripts/*.sh
   ```

4. Copy configuration:
   ```bash
   cp config-encrypted-gcs.yml /etc/clickhouse-backup/config.yml
   ```

## Configuration

### clickhouse-backup config (`/etc/clickhouse-backup/config.yml`)

```yaml
general:
  remote_storage: custom
  backups_to_keep_local: 2
  backups_to_keep_remote: 5

custom:
  upload_command: "/opt/clickhouse-backup/scripts/encrypted-upload.sh {{.backup_name}}"
  download_command: "/opt/clickhouse-backup/scripts/encrypted-download.sh {{.backup_name}}"
  list_command: "/opt/clickhouse-backup/scripts/encrypted-list.sh"
  delete_command: "/opt/clickhouse-backup/scripts/encrypted-delete.sh {{.backup_name}}"
  command_timeout: "4h"
```

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `GCS_ENCRYPTED_BUCKET` | `gs://your-bucket/clickhouse-backups` | GCS bucket and path for backups |
| `ENCRYPTION_KEY_FILE` | `/etc/clickhouse-backup/encryption.key` | Path to age encryption key |
| `CLICKHOUSE_BACKUP_PATH` | `/var/lib/clickhouse/backup` | Local backup directory |

Set these in `/etc/environment`, `~/.bashrc`, `/etc/default/clickhouse-backup`, or your systemd unit file. For example:

```bash
# /etc/default/clickhouse-backup
GCS_ENCRYPTED_BUCKET=gs://my-bucket/clickhouse-backups
ENCRYPTION_KEY_FILE=/etc/clickhouse-backup/encryption.key
```

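How the scripts read these variables is not shown in this README. One plausible approach, sketched here under the assumption that each script falls back to the defaults in the table (the function name `load_backup_env` is hypothetical), is shell parameter expansion:

```shell
# Hypothetical helper: apply the documented defaults to any variable
# that is unset or empty, then export all three for child processes.
load_backup_env() {
    : "${GCS_ENCRYPTED_BUCKET:=gs://your-bucket/clickhouse-backups}"
    : "${ENCRYPTION_KEY_FILE:=/etc/clickhouse-backup/encryption.key}"
    : "${CLICKHOUSE_BACKUP_PATH:=/var/lib/clickhouse/backup}"
    export GCS_ENCRYPTED_BUCKET ENCRYPTION_KEY_FILE CLICKHOUSE_BACKUP_PATH
}
```

With this pattern, a value set in the environment always wins, and the placeholder bucket default is used only when nothing else is configured.
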
## Usage

### Create and Upload Backup

```bash
# Create local backup
clickhouse-backup create my_backup

# Upload with encryption
clickhouse-backup upload my_backup

# Or do both in one command
clickhouse-backup create_remote my_backup
```

### List Remote Backups

```bash
clickhouse-backup list remote
```

### Download and Restore

```bash
# Download and decrypt
clickhouse-backup download my_backup

# Restore
clickhouse-backup restore my_backup
```

### Delete Remote Backup

```bash
clickhouse-backup delete remote my_backup
```

## Scripts Reference

### encrypted-upload.sh

Compresses the backup directory, encrypts it with age, and uploads to GCS.

```
Usage: encrypted-upload.sh <backup_name>

Uploads:
- <backup_name>.tar.gz.age           # Encrypted backup archive
- <backup_name>.metadata.json.age    # Encrypted metadata (for list command)
```

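The script body is not included in this README. The compress, encrypt, upload flow described above could be sketched roughly as follows. The function and helper names are hypothetical, and the location of `metadata.json` inside the backup directory is an assumption:

```shell
# Hypothetical helpers: object names for one backup, following the
# naming shown in the usage text.
archive_object()  { printf '%s/%s.tar.gz.age\n' "$GCS_ENCRYPTED_BUCKET" "$1"; }
metadata_object() { printf '%s/%s.metadata.json.age\n' "$GCS_ENCRYPTED_BUCKET" "$1"; }

# Sketch only. The body runs in a subshell so shell options do not leak.
encrypted_upload() (
    set -eu
    # Fail the whole pipeline if any stage fails, where the shell supports it.
    if (set -o pipefail) 2>/dev/null; then set -o pipefail; fi
    name=$1
    # Derive the public recipient from the age identity (key) file.
    recipient=$(age-keygen -y "$ENCRYPTION_KEY_FILE")
    # Stream tar -> age -> gsutil so no plaintext archive is written to disk.
    tar -C "$CLICKHOUSE_BACKUP_PATH" -czf - "$name" |
        age -r "$recipient" |
        gsutil cp - "$(archive_object "$name")"
    # Assumption: the backup directory contains a metadata.json file.
    age -r "$recipient" < "$CLICKHOUSE_BACKUP_PATH/$name/metadata.json" |
        gsutil cp - "$(metadata_object "$name")"
)
```

Streaming through a pipe is also why an interrupted upload has to start over: nothing is staged locally that could be resumed.
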
### encrypted-download.sh

Downloads encrypted backup from GCS, decrypts, and extracts.

```
Usage: encrypted-download.sh <backup_name>

Downloads and extracts to:
/var/lib/clickhouse/backup/<backup_name>/
```

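As with the upload script, the actual implementation is not shown here. A minimal sketch of the reverse pipeline, assuming the archive was created with the backup name as its top-level directory (the function name `encrypted_download` is hypothetical):

```shell
# Sketch only. The body runs in a subshell so shell options do not leak.
encrypted_download() (
    set -eu
    # Fail the whole pipeline if any stage fails, where the shell supports it.
    if (set -o pipefail) 2>/dev/null; then set -o pipefail; fi
    name=$1
    mkdir -p "$CLICKHOUSE_BACKUP_PATH"
    # Stream gsutil -> age -> tar; the archive is assumed to contain
    # a single top-level directory named after the backup.
    gsutil cat "$GCS_ENCRYPTED_BUCKET/$name.tar.gz.age" |
        age --decrypt --identity "$ENCRYPTION_KEY_FILE" |
        tar -C "$CLICKHOUSE_BACKUP_PATH" -xzf -
)
```
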
### encrypted-list.sh

Lists all backups on GCS by reading and decrypting metadata files.

```
Usage: encrypted-list.sh

Output: JSON objects (one per line) compatible with clickhouse-backup
```

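The listing logic is not shown in this README either. One way to produce one JSON object per line is sketched below, assuming each `.metadata.json.age` object decrypts to a single JSON document. The function name `encrypted_list` is hypothetical, and the fields that clickhouse-backup expects in each object are not specified here:

```shell
# Sketch only: decrypt every metadata object and print it as one
# compact JSON line.
encrypted_list() {
    # Quote the wildcard so gsutil, not the shell, expands it.
    gsutil ls "$GCS_ENCRYPTED_BUCKET/*.metadata.json.age" |
    while IFS= read -r object; do
        gsutil cat "$object" |
            age --decrypt --identity "$ENCRYPTION_KEY_FILE" |
            jq -c .
    done
}
```

Each backup costs one `gsutil cat` and one decryption, which is why listing slows down as the number of backups grows.
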
### encrypted-delete.sh

Deletes a backup from GCS (both archive and metadata).

```
Usage: encrypted-delete.sh <backup_name>
```

## GCS Bucket Structure

```
gs://your-bucket/clickhouse-backups/
├── backup_2024_01_15.tar.gz.age           # Encrypted backup archive
├── backup_2024_01_15.metadata.json.age    # Encrypted metadata
├── backup_2024_01_16.tar.gz.age
├── backup_2024_01_16.metadata.json.age
└── ...
```

## Security Considerations

### Key Management

- **Back up your encryption key** - Without it, backups cannot be recovered
- **Restrict key permissions** - `chmod 600 /etc/clickhouse-backup/encryption.key`
- **Consider key rotation** - Periodically generate new keys for new backups, and keep each old key for as long as backups encrypted with it exist
- **Use a secrets manager** - For production, consider HashiCorp Vault, GCP Secret Manager, etc.

### Key Backup

```bash
# Display key for backup
sudo cat /etc/clickhouse-backup/encryption.key

# The output looks like:
# # created: 2024-01-15T10:30:00Z
# # public key: age1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# AGE-SECRET-KEY-1XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
```

Store this securely (password manager, offline storage, etc.).

### GCS Permissions

Minimum required permissions for the service account:
- `storage.objects.create` - Upload backups
- `storage.objects.get` - Download backups
- `storage.objects.list` - List backups
- `storage.objects.delete` - Delete backups

Example IAM role: `roles/storage.objectAdmin` (scoped to the backup bucket)

## Limitations

| Limitation | Description |
|------------|-------------|
| No resume support | If upload fails, you must restart from the beginning |
| No incremental backups | `--diff-from` flag won't work as expected |
| No parallel uploads | Single-stream upload (limited by encryption pipe) |
| Metadata decryption | List command must decrypt each metadata file (slower for many backups) |

## Troubleshooting

### "Encryption key file not found"

```bash
# Check if key exists
ls -la /etc/clickhouse-backup/encryption.key

# Generate if missing
age-keygen -o /etc/clickhouse-backup/encryption.key
chmod 600 /etc/clickhouse-backup/encryption.key
```

### "gsutil: command not found"

```bash
# Install Google Cloud SDK
curl https://sdk.cloud.google.com | bash
exec -l $SHELL
gcloud init
```

### "AccessDeniedException: 403"

```bash
# Check authentication
gcloud auth list

# Re-authenticate
gcloud auth application-default login

# Or check service account permissions
gsutil iam get gs://your-bucket
```

### "age: error: no identity matched any of the recipients"

The configured key is not the one the backup was encrypted with. Ensure you are using the same key that was used to encrypt:

```bash
# Check which key is configured
echo "$ENCRYPTION_KEY_FILE"
# Print only the public key (avoids showing the secret key on screen)
age-keygen -y "$ENCRYPTION_KEY_FILE"
```

### List command returns empty

```bash
# Check if backups exist in GCS
gsutil ls gs://your-bucket/clickhouse-backups/

# Test decryption manually
gsutil cat gs://your-bucket/clickhouse-backups/backup_name.metadata.json.age | \
  age --decrypt --identity /etc/clickhouse-backup/encryption.key
```

### Slow uploads/downloads

For large backups, consider:
- Using `gsutil -o GSUtil:parallel_composite_upload_threshold=150M` for parallel uploads
- Increasing `command_timeout` in config
- Running from a VM in the same GCP region as your bucket

## Alternative: Using GPG Instead of Age

If you prefer GPG over age, modify the scripts to use symmetric encryption with a passphrase file:

```bash
# Generate a random passphrase and restrict access to it
openssl rand -base64 32 > /etc/clickhouse-backup/encryption.key
chmod 600 /etc/clickhouse-backup/encryption.key

# Encrypt (replace age command); GnuPG 2.1+ needs loopback pinentry
# to read the passphrase from a file
gpg --batch --yes --pinentry-mode loopback --passphrase-file "$ENCRYPTION_KEY_FILE" \
  --symmetric --cipher-algo AES256

# Decrypt (replace age command)
gpg --batch --yes --pinentry-mode loopback --passphrase-file "$ENCRYPTION_KEY_FILE" --decrypt
```

## License

These scripts are provided as part of clickhouse-backup examples. Use at your own risk.
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
# clickhouse-backup configuration for encrypted GCS storage
# Copy this to /etc/clickhouse-backup/config.yml and customize

general:
  remote_storage: custom
  backups_to_keep_local: 2
  backups_to_keep_remote: 5
  log_level: info
  allow_empty_backups: false

clickhouse:
  username: default
  password: ""
  host: localhost
  port: 9000
  data_path: /var/lib/clickhouse
  skip_tables:
    - system.*
    - INFORMATION_SCHEMA.*
    - information_schema.*
  timeout: 5m
  secure: false
  skip_verify: false

custom:
  # Path to scripts - adjust based on your installation
  upload_command: "/opt/clickhouse-backup/scripts/encrypted-upload.sh {{.backup_name}}"
  download_command: "/opt/clickhouse-backup/scripts/encrypted-download.sh {{.backup_name}}"
  list_command: "/opt/clickhouse-backup/scripts/encrypted-list.sh"
  delete_command: "/opt/clickhouse-backup/scripts/encrypted-delete.sh {{.backup_name}}"
  command_timeout: "4h"

# Environment variables to set (or export before running clickhouse-backup):
#
# GCS_ENCRYPTED_BUCKET=gs://your-bucket/clickhouse-backups
# ENCRYPTION_KEY_FILE=/etc/clickhouse-backup/encryption.key
# CLICKHOUSE_BACKUP_PATH=/var/lib/clickhouse/backup
#
# You can also set these in /etc/default/clickhouse-backup or your systemd unit file
