  # Service Quality Oracle - Kubernetes Deployment

- This directory contains Kubernetes manifests for deploying the Service Quality Oracle with persistent state management.
+ This directory contains Kubernetes manifests for deploying the Service Quality Oracle in different environments using Kustomize with persistent state management.
+
+ ## Structure
+
+ ```
+ k8s/
+ ├── README.md                    # This file
+ ├── auth.sh                      # Global auth script (configure for your cluster)
+ ├── base/                        # Common base resources
+ │   ├── kustomization.yaml
+ │   ├── namespace.yaml
+ │   ├── deployment.yaml
+ │   ├── service.yaml
+ │   ├── servicemonitor.yaml
+ │   ├── serviceaccount.yaml
+ │   └── podmonitor.yaml
+ └── environments/
+     ├── mainnet/                 # Production environment
+     │   ├── kustomization.yaml
+     │   ├── config.yaml          # Mainnet configuration
+     │   ├── config.secret.yaml   # Mainnet secrets (configure before use)
+     │   ├── persistent-volume-claim.yaml
+     │   ├── auth.sh
+     │   ├── apply.sh
+     │   ├── diff.sh
+     │   └── restart-deployments.sh
+     └── testnet/                 # Staging environment
+         ├── kustomization.yaml
+         ├── config.yaml          # Testnet configuration
+         ├── config.secret.yaml   # Testnet secrets (configure before use)
+         ├── persistent-volume-claim.yaml
+         ├── auth.sh
+         ├── apply.sh
+         ├── diff.sh
+         └── restart-deployments.sh
+ ```

  ## Prerequisites

  - Kubernetes cluster (version 1.19+)
  - `kubectl` configured to access your cluster
+ - Kustomize (built into kubectl v1.14+)
  - Docker image published to `ghcr.io/graphprotocol/service-quality-oracle`
  - **Storage class configured** (see Storage Configuration below)

  ## Quick Start

- ### 1. Create Secrets (Required)
+ ### 1. Configure Cluster Access
+
+ Update `auth.sh` with your GKE cluster details:
+
+ ```bash
+ # Edit auth.sh
+ vim auth.sh
+
+ # Connect to your cluster
+ ./auth.sh
+ ```
+
+ ### 2. Deploy to Testnet
+
+ ```bash
+ cd environments/testnet
+
+ # Configure secrets (replace placeholder values)
+ vim config.secret.yaml
+
+ # Preview changes
+ ./diff.sh
+
+ # Deploy
+ ./apply.sh
+
+ # Monitor
+ kubectl logs -f deployment/service-quality-oracle -n service-quality-oracle
+ ```
+
+ ### 3. Deploy to Mainnet

  ```bash
- # Copy the example secrets file
- cp k8s/secrets.yaml.example k8s/secrets.yaml
+ cd environments/mainnet

- # Edit with your actual credentials
- # IMPORTANT: Never commit secrets.yaml to version control
- nano k8s/secrets.yaml
+ # Configure secrets (replace placeholder values with production keys)
+ vim config.secret.yaml
+
+ # Configure mainnet contract address
+ vim config.yaml
+ # Update BLOCKCHAIN_CONTRACT_ADDRESS with actual mainnet contract
+
+ # Preview changes
+ ./diff.sh
+
+ # Deploy (includes safety checks)
+ ./apply.sh
+
+ # Monitor
+ kubectl logs -f deployment/service-quality-oracle -n service-quality-oracle
  ```
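
The mainnet flow above says `./apply.sh` includes safety checks, but the script itself is not shown in this diff. As a rough sketch of what such a check could look like, the hypothetical helper below refuses to proceed while a secrets file still contains a placeholder marker. The function name `secrets_configured` and the marker text `PLACEHOLDER` are assumptions for illustration, not taken from this repository.

```bash
# Hypothetical pre-deploy check (not the repository's apply.sh).
# Succeeds only if the file exists and no longer contains the placeholder marker.
# The default marker "PLACEHOLDER" is an assumption; pass your own as $2.
secrets_configured() {
    file="$1"
    marker="${2:-PLACEHOLDER}"
    if [ ! -f "$file" ]; then
        echo "missing secrets file: $file" >&2
        return 1
    fi
    # -F treats the marker as a fixed string, -q suppresses output.
    if grep -qF -- "$marker" "$file"; then
        echo "$file still contains placeholder values" >&2
        return 1
    fi
    return 0
}
```

For example, `secrets_configured config.secret.yaml && kubectl apply -k .` would apply the Kustomize overlay in the current directory only after the check passes.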

- **Required secrets:**
+ ## Environment Configuration
+
+ ### Environment Differences
+
+ | Setting | Testnet | Mainnet |
+ |---------|---------|---------|
+ | Chain | Arbitrum Sepolia | Arbitrum One |
+ | Contract | 0x6d5...91f6 | Configure in config.yaml |
+ | Image Tag | testnet-latest | mainnet-latest |
+ | Labels | environment: testnet, variant: staging | environment: mainnet, variant: production |
+
+ ### Secret Configuration
+
+ Before deploying, you must configure the following secrets in each environment's `config.secret.yaml`:
+
  - **`google-credentials`**: Service account JSON for BigQuery access
- - **`blockchain-private-key`**: Private key for Arbitrum Sepolia transactions
+ - **`blockchain-private-key`**: Private key for blockchain transactions (64 chars, no 0x)
+ - **`etherscan-api-key`**: Etherscan API key
  - **`arbitrum-api-key`**: API key for Arbiscan contract verification
- - **`slack-webhook-url`**: Webhook URL for operational notifications
+ - **`studio-api-key`**: The Graph Studio API key
+ - **`studio-deploy-key`**: The Graph Studio deploy key
+ - **`slack-webhook-url`**: Slack webhook for notifications
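
The `blockchain-private-key` format noted in the list above (64 characters, no `0x`) can be checked locally before the value goes into `config.secret.yaml`. This is a minimal sketch that assumes the 64 characters are hexadecimal digits, as is usual for a 32-byte key; `is_valid_private_key` is a hypothetical helper name, not part of this repository.

```bash
# Hypothetical helper: succeeds only for exactly 64 hex characters.
# Assumption: the "64 chars" are hex digits encoding a 32-byte key.
is_valid_private_key() {
    key="$1"
    # Must be exactly 64 characters long (a "0x" prefix would make it 66).
    [ "${#key}" -eq 64 ] || return 1
    # Reject any character outside 0-9, a-f, A-F (this also rejects "x").
    case "$key" in
        *[!0-9a-fA-F]*) return 1 ;;
    esac
    return 0
}
```

A call such as `is_valid_private_key "$KEY" || echo "bad key format" >&2` gives an early warning without ever printing the key itself.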

- ### 2. Configure Storage (Required)
+ ## Storage Configuration

  ```bash
  # Check available storage classes
  kubectl get storageclass

- # If you see a default storage class (marked with *), skip to step 3
- # Otherwise, edit persistent-volume-claim.yaml and uncomment the appropriate storageClassName
+ # The manifests use 'ssd-retain' storage class by default
+ # Edit environments/{mainnet,testnet}/persistent-volume-claim.yaml if needed
  ```

  **Common storage classes by platform:**
  - **AWS EKS**: `gp2`, `gp3`, `ebs-csi`
- - **Google GKE**: `standard`, `ssd`
+ - **Google GKE**: `standard`, `ssd`
  - **Azure AKS**: `managed-premium`, `managed`
  - **Local/Development**: `hostpath`, `local-path`
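
Since the manifests expect the `ssd-retain` class, it can help to confirm the class exists before applying. The sketch below is one way to do that: it reads the output of `kubectl get storageclass -o name` on standard input and checks for an exact match. The helper name `has_storage_class` is hypothetical.

```bash
# Hypothetical helper: reads `kubectl get storageclass -o name` output on stdin
# (lines of the form "storageclass.storage.k8s.io/<name>") and succeeds
# if the class named in $1 is listed.
has_storage_class() {
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF -- "storageclass.storage.k8s.io/$1"
}
```

Typical use would be `kubectl get storageclass -o name | has_storage_class ssd-retain || echo "ssd-retain not found" >&2`. Matching the whole line avoids a false positive when one class name is a prefix of another.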

- ### 3. Deploy to Kubernetes
+ ## Operations
+
+ ### Restart Deployments

  ```bash
- # Apply all manifests
- kubectl apply -f k8s/
+ ./restart-deployments.sh
+ ```
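
The contents of `restart-deployments.sh` are not part of this diff. One plausible shape for it, offered only as a sketch, is a rolling restart followed by a wait for the rollout to finish. The function names `run_or_print` and `restart_oracle` and the `DRY_RUN` variable are assumptions; the deployment and namespace names are the ones used in the `kubectl logs` commands elsewhere in this README.

```bash
# Sketch only; the repository's restart-deployments.sh may work differently.
# With DRY_RUN=1 the commands are printed instead of executed.
run_or_print() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

restart_oracle() {
    ns="${1:-service-quality-oracle}"
    deploy="deployment/service-quality-oracle"
    # Trigger a rolling restart, then block until the new pods are ready.
    run_or_print kubectl rollout restart "$deploy" -n "$ns"
    run_or_print kubectl rollout status "$deploy" -n "$ns"
}
```

Running with `DRY_RUN=1` first shows exactly which commands would be sent to the cluster, which is a cheap safeguard before touching mainnet.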
+
+ ### View Logs

- # Verify deployment
- kubectl get pods -l app=service-quality-oracle
- kubectl get pvc -l app=service-quality-oracle
+ ```bash
+ kubectl logs -f deployment/service-quality-oracle -n service-quality-oracle
  ```

- ### 4. Monitor Deployment
+ ### Check Status

  ```bash
- # Check pod status
- kubectl describe pod -l app=service-quality-oracle
+ kubectl get all -n service-quality-oracle
+ ```

- # View logs
- kubectl logs -l app=service-quality-oracle -f
+ ### Delete Environment

- # Check persistent volumes
- kubectl get pv
+ ```bash
+ kubectl delete -k .
  ```

+ ## Monitoring
+
+ - Prometheus scraping enabled via annotations
+ - ServiceMonitor and PodMonitor configured for metrics collection
+ - Metrics exposed on port 8000 at `/metrics` endpoint
+ - Labels applied for environment-specific alerting
+
  ## Architecture

  ### Persistent Storage

  The service uses **two persistent volumes** to maintain state across pod restarts:

- - **`service-quality-oracle-data` (5GB)**: Circuit breaker state, last run tracking, BigQuery cache, CSV outputs
- - **`service-quality-oracle-logs` (2GB)**: Application logs
+ - **`service-quality-oracle-data` (10GB)**: Circuit breaker state, last run tracking, BigQuery cache, CSV outputs
+ - **`service-quality-oracle-logs` (5GB)**: Application logs

  **Mount points:**
  - `/app/data` → Critical state files (circuit breaker, cache, outputs)
  - `/app/logs` → Application logs

  ### Configuration Management

- **Non-sensitive configuration** → `ConfigMap` (`configmap.yaml`)
- **Sensitive credentials** → `Secret` (`secrets.yaml`)
+ **Non-sensitive configuration** → `ConfigMap` (generated from `config.yaml`)
+ **Sensitive credentials** → `Secret` (generated from `config.secret.yaml`)

  This separation provides:
  - ✅ Easy configuration updates without rebuilding images
@@ -98,7 +201,7 @@ This separation provides:
  - Memory: 512M

  **Limits (maximum):**
- - CPU: 1000m (1.0 core)
+ - CPU: 1000m (1.0 core)
  - Memory: 1G

  ## State Persistence Benefits
@@ -127,7 +230,7 @@ kubectl describe pod -l app=service-quality-oracle

  # Common issues:
  # - Missing secrets
- # - PVC provisioning failures
+ # - PVC provisioning failures
  # - Image pull errors
  ```

@@ -151,13 +254,16 @@ kubectl exec -it deployment/service-quality-oracle -- env | grep -E "(BIGQUERY|B
  kubectl exec -it deployment/service-quality-oracle -- ls -la /etc/secrets
  ```

- ## Security Best Practices
+ ## Security

- ✅ **Secrets never committed** to version control
+ ✅ **Never commit actual secrets** - `config.secret.yaml` files contain placeholders only
+ ✅ **Mainnet deployment safety checks** for production secrets
+ ✅ **Non-root containers** with dropped capabilities
  ✅ **Service account** with minimal BigQuery permissions
  ✅ **Private key** stored in Kubernetes secrets (base64 encoded)
  ✅ **Resource limits** prevent resource exhaustion
- ✅ **Read-only filesystem** where possible
+ ✅ **Workload Identity** configured for secure GCP access
+ ✅ **SSD storage with retention** for data persistence

  ## Production Considerations

@@ -171,7 +277,7 @@ kubectl exec -it deployment/service-quality-oracle -- ls -la /etc/secrets
  ## Next Steps

  1. **Test deployment** in staging environment
- 2. **Verify state persistence** across pod restarts
+ 2. **Verify state persistence** across pod restarts
  3. **Set up monitoring** and alerting
  4. **Configure backup** for persistent volumes
- 5. **Enable quality checking** after successful validation
+ 5. **Enable quality checking** after successful validation