Commit ca67665

Renamed doc to docs
1 parent 38f8060 commit ca67665

6 files changed

+1021
-0
lines changed

docs/.ice.yaml

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
uri: http://localhost:8181
s3:
  endpoint: http://localhost:9000
  pathStyleAccess: true
  accessKeyID: minio
  secretAccessKey: minio123
  region: us-east-1

docs/architecture.md

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
# ICE REST Catalog Architecture

![ICE REST Catalog Architecture](ice-rest-catalog-architecture.drawio.png)

## Components

- **ice-rest-catalog**: Stateless REST API service (Kubernetes Deployment)
- **etcd**: Distributed key-value store for catalog state (Kubernetes StatefulSet)
- **Object Storage**: S3-compatible storage for data files
- **Clients**: ClickHouse or other Iceberg-compatible engines

## Design Principles

### Stateless Catalog

The `ice-rest-catalog` service is stateless and runs as a Kubernetes Deployment with multiple replicas.
Because it keeps no state locally and persists all metadata in etcd, it can be scaled horizontally without coordination between replicas.
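
Because the replicas share no local state, scaling out amounts to changing the Deployment's replica count. The following is a minimal sketch, assuming the Deployment is named `ice-rest-catalog` and lives in a namespace called `ice` (both names are illustrative and not taken from the manifests). It prints the `kubectl` command rather than running it, so the command can be reviewed first.

```shell
# Sketch: build the kubectl command that sets the replica count.
# DEPLOYMENT and NAMESPACE are assumed names; adjust them to your cluster.
DEPLOYMENT="${DEPLOYMENT:-ice-rest-catalog}"
NAMESPACE="${NAMESPACE:-ice}"

# Print the scale command for a given replica count (dry run: nothing is executed).
scale_command() {
  replicas="$1"
  # Reject anything that is not a non-negative integer.
  case "$replicas" in
    ''|*[!0-9]*)
      echo "replica count must be a non-negative integer" >&2
      return 1
      ;;
  esac
  printf 'kubectl scale deployment %s --namespace %s --replicas=%s\n' \
    "$DEPLOYMENT" "$NAMESPACE" "$replicas"
}

scale_command 3
```

To apply the change instead of only printing it, the output can be piped to a shell, for example `scale_command 3 | sh`.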
18+
19+
### State Management
20+
21+
All catalog state (namespaces, tables, schemas, snapshots, etc.) is maintained in **etcd**, a distributed, consistent key-value store.
22+
Each etcd instance runs as a StatefulSet pod with persistent storage, ensuring data durability across restarts.

### Service Discovery

`ice-rest-catalog` reaches the etcd cluster through a Kubernetes Service.
The catalog talks to etcd using the [jetcd](https://github.com/etcd-io/jetcd) client library.
etcd replicates the data to every node in the cluster.
The Service distributes connections across the cluster nodes in a round-robin fashion.
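
To illustrate Service-based discovery: inside a cluster, a Kubernetes Service is reachable under a DNS name of the form `<service>.<namespace>.svc.cluster.local`. The sketch below builds an etcd client endpoint from that pattern. The Service name `etcd`, the namespace `ice`, the default cluster domain, and etcd's default client port 2379 are all assumptions here, not values taken from the manifests.

```shell
# Sketch: derive the etcd client endpoint from the Service's in-cluster DNS name.
# The service name, namespace, and port passed in are assumptions.
etcd_endpoint() {
  service="$1"
  namespace="$2"
  port="${3:-2379}"   # 2379 is etcd's default client port
  printf 'http://%s.%s.svc.cluster.local:%s\n' "$service" "$namespace" "$port"
}

etcd_endpoint etcd ice
```

From a pod inside the cluster, the result could be passed to `etcdctl`, for example `etcdctl --endpoints="$(etcd_endpoint etcd ice)" endpoint health`.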

### High Availability

- Multiple `ice-rest-catalog` replicas behind a load balancer
- Multi-node etcd cluster
- Persistent volumes for etcd data
- S3 for durable object storage

## Backup/Recovery
All state information for the catalog is maintained in etcd. To back up the ICE REST Catalog state, you can use standard etcd snapshot tools. The official etcd documentation provides guidance on [snapshotting and recovery](https://etcd.io/docs/v3.5/op-guide/recovery/).

**Backup etcd Example**:
```shell
etcdctl --endpoints=<etcd-endpoint> \
  --cacert=<trusted-ca-file> \
  --cert=<cert-file> \
  --key=<key-file> \
  snapshot save /path/to/backup.db
```

Replace the arguments as appropriate for your deployment (for example, endpoints, authentication, and TLS options).
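
For scheduled backups it helps to give each snapshot a unique file name. The following is a minimal sketch, assuming a backup directory of `/var/backups/etcd` and an `ETCD_ENDPOINT` variable (both hypothetical). The TLS flags from the example above are omitted for brevity and would need to be added for a secured cluster.

```shell
# Sketch: save an etcd snapshot under a UTC-timestamped file name.
# BACKUP_DIR and ETCD_ENDPOINT are assumed values; add the TLS flags your cluster needs.
BACKUP_DIR="${BACKUP_DIR:-/var/backups/etcd}"
ETCD_ENDPOINT="${ETCD_ENDPOINT:-http://localhost:2379}"

# Build a path of the form <BACKUP_DIR>/etcd-<UTC timestamp>.db
snapshot_path() {
  printf '%s/etcd-%s.db\n' "$BACKUP_DIR" "$(date -u +%Y%m%dT%H%M%SZ)"
}

# Take the snapshot and, on success, print where it was written.
backup_etcd() {
  target="$(snapshot_path)"
  etcdctl --endpoints="$ETCD_ENDPOINT" snapshot save "$target" && echo "$target"
}
```

Calling `backup_etcd` from a cron job or a Kubernetes CronJob would then produce one file per run; pruning old snapshots is left out of this sketch.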

**Restore etcd Example**:
```shell
# On etcd v3.5 and later, `etcdutl snapshot restore` is the preferred form;
# `etcdctl snapshot restore` is deprecated there.
etcdctl snapshot restore /path/to/backup.db \
  --data-dir /var/lib/etcd
```

If you restore etcd and point the catalog services at the restored etcd cluster, all catalog state (namespaces, tables, schemas, snapshots) is recovered automatically.
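
Before repointing the catalog at a restored cluster, a quick sanity check is to confirm that the cluster is healthy and that it holds keys at all. This sketch assumes the restored cluster is reachable at `ETCD_ENDPOINT` (a hypothetical variable) and makes no assumption about how the catalog lays out its keys.

```shell
# Sketch: sanity-check a restored etcd cluster.
# ETCD_ENDPOINT is an assumed value; add TLS flags if your cluster requires them.
ETCD_ENDPOINT="${ETCD_ENDPOINT:-http://localhost:2379}"

# Count every key in the keyspace (an empty prefix matches all keys).
# With --keys-only, etcdctl prints blank lines in place of values, so only
# non-empty lines are counted.
count_keys() {
  etcdctl --endpoints="$ETCD_ENDPOINT" get "" --prefix --keys-only \
    | awk 'NF { n++ } END { print n + 0 }'
}

# Succeeds only if the endpoint is healthy and at least one key exists.
verify_restore() {
  etcdctl --endpoints="$ETCD_ENDPOINT" endpoint health || return 1
  keys="$(count_keys)"
  echo "restored cluster holds $keys keys"
  [ "$keys" -gt 0 ]
}
```

A non-zero key count only shows that the snapshot was not empty; listing tables through the catalog itself remains the real confirmation.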

**Note:** Data files themselves (table/Parquet data) are stored in Object Storage (e.g., S3, MinIO) and should be backed up or protected in accordance with your object storage vendor's recommendations.

### Kubernetes Manifest Files

Kubernetes deployment manifests and configuration files are available in the [`examples/eks`](../examples/eks/) folder:

- [`etcd.eks.yaml`](../examples/eks/etcd.eks.yaml) - etcd StatefulSet deployment
- [`ice-rest-catalog.eks.envsubst.yaml`](../examples/eks/ice-rest-catalog.eks.envsubst.yaml) - ice-rest-catalog Deployment (requires envsubst)
- [`eks.envsubst.yaml`](../examples/eks/eks.envsubst.yaml) - Combined EKS deployment template

See the [EKS README](../examples/eks/README.md) for detailed setup instructions.
