Kubernetes backup solution combining instant CSI snapshots with async BorgBackup for production-grade application backups.
Get instant, crash-consistent snapshots of your entire Kubernetes application (databases + data), then asynchronously back up to offsite BorgBackup storage. Perfect for applications like Immich, Nextcloud, or any stateful workload where you need:
- Instant point-in-time recovery via CSI snapshots (seconds)
- Offsite backup to BorgBackup repository (disaster recovery)
- Database consistency via pre/post hooks (PostgreSQL, MySQL, etc.)
- One Helm install per application - complete backup solution
Developed specifically to solve the problem of backing up complex Kubernetes applications with multiple PVCs and databases while maintaining consistency and enabling fast recovery.
- Kubernetes 1.25+
- CSI driver with VolumeSnapshot support
  - For snapshot functionality
  - Examples: Longhorn, Ceph RBD, ZFS, AWS EBS CSI, etc.
  - Storage class must support CSI snapshots (creates `VolumeSnapshot` resources)
- Storage class with snapshot cloning support
  - For backup functionality (creates clone PVCs from snapshots)
  - Clone PVCs must be creatable via `dataSource: VolumeSnapshot`
  - Recommended: use a storage class with a "Delete" reclaim policy for automatic clone cleanup
- VolumeSnapshotClass configured in cluster
- BorgBackup repository (e.g., BorgBase, self-hosted)
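If your cluster does not yet have a VolumeSnapshotClass, a minimal one looks like the sketch below; the `driver` value assumes Longhorn, so substitute your own CSI driver's provisioner name.

```bash
# Minimal VolumeSnapshotClass (sketch). The driver below assumes Longhorn;
# replace it with your CSI driver's provisioner name.
kubectl create -f - <<EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: my-snapshot-class
driver: driver.longhorn.io
deletionPolicy: Delete
EOF
```

You can then verify the prerequisites: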
```bash
# Check CSI driver supports snapshots
kubectl get volumesnapshotclass

# Check if your storage class can create snapshots
kubectl get storageclass <your-class> -o yaml | grep -i snapshot

# Test snapshot creation (optional)
kubectl create -f - <<EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: test-snapshot
spec:
  source:
    persistentVolumeClaimName: <your-pvc>
  volumeSnapshotClassName: <your-snapshot-class>
EOF
```

- CSI VolumeSnapshot Automation - Create and prune snapshots with tiered retention (hourly/daily/weekly/monthly)
- BorgBackup Integration - Backup snapshot clones to offsite BorgBackup repository with deduplication and compression
- Database Consistency Hooks - Pod-exec hooks for PostgreSQL, MySQL, etc. (`pg_backup_start`/`pg_backup_stop`)
- SIGTERM Safety - Guaranteed cleanup and post-hook execution even on pod termination
- Parallel Snapshots - Create multiple snapshots simultaneously for fast backup
- Privileged Container Support - Back up any PVC regardless of ownership (PostgreSQL 70:70, MySQL 999:999, etc.); see the sketch after this list
- Ephemeral Clone PVCs - Temporary PVCs created from snapshots for backup, automatically cleaned up afterwards
- Configurable Timeouts - Per-backup timeouts for clone provisioning and backup execution
- Helm Deployment - Single chart with separate CronJobs for snapshot and backup operations
- Multi-Architecture - ARM64 and AMD64 support
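To illustrate the privileged-container feature, the dynamically spawned Borg pods rely on a pod spec along the lines of the sketch below. This is not the chart's actual manifest; `<clone-pvc>` is a placeholder for the ephemeral clone PVC being backed up.

```bash
# Sketch only (not the chart's actual manifest): privileged + root lets the
# pod read files owned by any UID, e.g. PostgreSQL's 70:70 or MySQL's 999:999.
kubectl create -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: borg-runner-example
spec:
  restartPolicy: Never
  containers:
    - name: borg
      image: alpine:3.20
      command: ["ls", "-ln", "/data"]
      securityContext:
        privileged: true
        runAsUser: 0
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: <clone-pvc>
EOF
```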
Restore functionality is currently in development.
```bash
helm repo add kube-borg-backup https://frederikb96.github.io/kube-borg-backup
helm repo update
```

```bash
# View all available options with inline documentation
helm show values kube-borg-backup/kube-borg-backup

# View complete Immich example with PostgreSQL hooks
curl -O https://raw.githubusercontent.com/frederikb96/kube-borg-backup/main/example/values.yaml
```

```bash
# Create your values.yaml based on the example, then install:
helm install my-backup kube-borg-backup/kube-borg-backup \
  --namespace my-app \
  --values my-values.yaml

# Verify deployment
kubectl get cronjobs -n my-app
kubectl get serviceaccount,role,rolebinding -n my-app | grep borg
```

```bash
# Manually trigger snapshot job
kubectl create job --from=cronjob/my-backup-kube-borg-backup-snapshot \
  snapshot-manual-test -n my-app

# Watch logs
kubectl logs -n my-app -l job-name=snapshot-manual-test -f

# Check created snapshots
kubectl get volumesnapshots -n my-app
```

All configuration is done via Helm values. See the following files for detailed documentation:
- `charts/kube-borg-backup/values.yaml` - All available configuration options with comprehensive inline comments
- `example/values.yaml` - Complete working example for Immich backup with PostgreSQL consistency hooks
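To give a feel for the shape of such a values file, the sketch below is purely illustrative: the key names are invented for this example, and the authoritative schema with all real options is `charts/kube-borg-backup/values.yaml`.

```bash
# Hypothetical values sketch -- these key names are invented for illustration;
# consult charts/kube-borg-backup/values.yaml for the chart's real schema.
cat > my-values.yaml <<EOF
snapshot:
  schedule: "0 * * * *"         # hourly snapshots
  pvcs: [library, postgres]     # PVCs to snapshot together
  retention:                    # tiered retention: keep 1 per bucket
    hourly: 24
    daily: 7
    weekly: 4
    monthly: 6
backup:
  schedule: "30 3 * * *"        # nightly offsite Borg run
  repository: ssh://<user>@<host>/./repo
EOF
```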
- Snapshot Controller runs as a CronJob and creates `VolumeSnapshot` resources for configured PVCs
- Pre-hooks execute sequentially before snapshots (e.g., `pg_backup_start()` to pause PostgreSQL writes)
- Snapshots are created in parallel via the CSI driver for instant point-in-time capture
- Post-hooks execute sequentially after snapshots (e.g., `pg_backup_stop()` to resume writes)
- Tiered retention prunes old snapshots, keeping one per time bucket (hourly/daily/weekly/monthly)
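To make the hooks concrete, the pair of commands below is roughly what a pod-exec hook performs against PostgreSQL 15+ (pod and user names are placeholders; the real hook configuration is shown in `example/values.yaml`):

```bash
# Roughly what the PostgreSQL hooks do; pod/user names are placeholders.
# Pre-hook: put PostgreSQL into backup mode so a file-level snapshot is consistent.
kubectl exec -n my-app postgres-0 -- \
  psql -U postgres -c "SELECT pg_backup_start('kube-borg-backup', true);"

# ...snapshots are created in parallel here...

# Post-hook: end backup mode and resume normal operation.
kubectl exec -n my-app postgres-0 -- \
  psql -U postgres -c "SELECT pg_backup_stop();"
```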
- Backup Controller runs as a separate CronJob and finds the latest snapshots
- Clone PVCs are created from snapshots with a temporary storage class
- Borg pods are spawned dynamically (one per PVC) and run privileged to access any filesystem
- Backups execute sequentially (a Borg limitation) with configurable timeouts and lock handling
- Cleanup happens automatically: borg pods deleted, clone PVCs removed, secrets cleaned up
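The clone PVCs mentioned above use the standard CSI `dataSource` mechanism; a minimal equivalent manifest looks like this (names and size are placeholders):

```bash
# Minimal clone PVC from a VolumeSnapshot (names/size are placeholders);
# this is the standard CSI dataSource mechanism used for the ephemeral clones.
kubectl create -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backup-clone-example
spec:
  storageClassName: <clone-storage-class>
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 10Gi
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: <your-snapshot>
EOF
```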
SIGTERM safety: Both controllers have signal handlers ensuring post-hooks always run and resources are cleaned up even on pod eviction.
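In shell terms, the pattern is equivalent to a trap that always runs cleanup (a sketch with hypothetical helper names; the controllers implement this in Python signal handlers):

```bash
# Sketch of the SIGTERM-safety pattern; helper names are hypothetical.
cleanup() {
  run_post_hooks    # hypothetical: e.g. resume database writes
  remove_resources  # hypothetical: delete borg pods, clone PVCs, secrets
}
trap cleanup TERM EXIT
```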
The tool consists of three components:
- Snapshot CronJob - Python controller that creates/prunes VolumeSnapshots with pod-exec hooks
- Backup CronJob - Python controller that creates clone PVCs and orchestrates Borg pods
- Borg Pods - Ephemeral privileged pods spawned per backup to run the actual `borg create` and `borg prune`
Both CronJobs use the unified `kube-borg-backup/controller` image (Python 3.13) with different entrypoints.
Borg pods use the `kube-borg-backup/backup-runner` image (Alpine + borgbackup + Python 3.13).
Technologies:
- Python 3.13 with the `kubernetes` client library
- BorgBackup for deduplication and compression
- Kubernetes CSI VolumeSnapshot API
- Helm 3 for packaging
Linting and Type Checking:
```bash
# Set up testing venv
cd apps
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements-dev.txt -r controller/requirements.txt

# Run linting
ruff check apps/

# Run type checking
mypy --config-file mypy.ini
```

Contributions welcome! Please:
- Create issues for bugs or feature requests
- Fork and submit pull requests
- Follow existing code style (Ruff + Mypy enforced via CI)
- Update CHANGELOG.md with your changes
- Test in a dedicated Kubernetes namespace before submitting
MIT License - see LICENSE file for details.