diff --git a/k8s/README.md b/k8s/README.md index ce0a610..9dc3ffa 100644 --- a/k8s/README.md +++ b/k8s/README.md @@ -1,92 +1,198 @@ # Service Quality Oracle - Kubernetes Deployment -This directory contains Kubernetes manifests for deploying the Service Quality Oracle with persistent state management. +This directory contains Kubernetes manifests for deploying the Service Quality Oracle in different environments using Kustomize with persistent state management. + +## Structure + +``` +k8s/ +├── README.md # This file +├── auth.sh # Global auth script (configure for your cluster) +├── base/ # Common base resources +│ ├── kustomization.yaml +│ ├── namespace.yaml +│ ├── deployment.yaml +│ ├── service.yaml +│ ├── servicemonitor.yaml +│ ├── serviceaccount.yaml +│ └── podmonitor.yaml +└── environments/ + ├── mainnet/ # Production environment + │ ├── kustomization.yaml + │ ├── config.yaml # Mainnet configuration + │ ├── config.secret.yaml # Mainnet secrets (configure before use) + │ ├── persistent-volume-claim.yaml + │ ├── auth.sh + │ ├── apply.sh + │ ├── diff.sh + │ └── restart-deployments.sh + └── testnet/ # Staging environment + ├── kustomization.yaml + ├── config.yaml # Testnet configuration + ├── config.secret.yaml # Testnet secrets (configure before use) + ├── persistent-volume-claim.yaml + ├── auth.sh + ├── apply.sh + ├── diff.sh + └── restart-deployments.sh +``` ## Prerequisites - Kubernetes cluster (version 1.19+) - `kubectl` configured to access your cluster -- Docker image published to `ghcr.io/graphprotocol/service-quality-oracle` +- Kustomize (built into kubectl v1.14+) +- Docker image published to `ghcr.io/graphprotocol/rewards-eligibility-oracle` - **Storage class configured** (see Storage Configuration below) ## Quick Start -### 1. Create Secrets (Required) +### 1. Configure Cluster Access + +Update `auth.sh` with your GKE cluster details: + +```bash +# Edit auth.sh +vim auth.sh + +# Connect to your cluster +./auth.sh +``` + +### 2. 
Deploy to Testnet ```bash -# Copy the example secrets file -cp k8s/secrets.yaml.example k8s/secrets.yaml +cd environments/testnet + +# Configure secrets (replace placeholder values) +vim config.secret.yaml + +# Preview changes +./diff.sh + +# Deploy +./apply.sh + +# Monitor +kubectl logs -f deployment/rewards-eligibility-oracle -n rewards-eligibility-oracle +``` + +### 3. Deploy to Mainnet + +```bash +cd environments/mainnet + +# Configure secrets (replace placeholder values with production keys) +vim config.secret.yaml + +# Configure mainnet contract address +vim config.yaml +# Update BLOCKCHAIN_CONTRACT_ADDRESS with actual mainnet contract -# Edit with your actual credentials -# IMPORTANT: Never commit secrets.yaml to version control -nano k8s/secrets.yaml +# Preview changes +./diff.sh + +# Deploy (includes safety checks) +./apply.sh + +# Monitor +kubectl logs -f deployment/rewards-eligibility-oracle -n rewards-eligibility-oracle ``` -**Required secrets:** +## Environment Configuration + +### Environment Differences + +| Setting | Testnet | Mainnet | +|---------|---------|---------| +| Chain | Arbitrum Sepolia | Arbitrum One | +| Contract | 0x6d5...91f6 | Configure in config.yaml | +| Image Tag | testnet-latest | mainnet-latest | +| Labels | environment: testnet, variant: staging | environment: mainnet, variant: production | + +### Secret Configuration + +Before deploying, you must configure the following secrets in each environment's `config.secret.yaml`: + - **`google-credentials`**: Service account JSON for BigQuery access -- **`blockchain-private-key`**: Private key for Arbitrum Sepolia transactions +- **`blockchain-private-key`**: Private key for blockchain transactions (64 chars, no 0x) +- **`etherscan-api-key`**: Etherscan API key - **`arbitrum-api-key`**: API key for Arbiscan contract verification -- **`slack-webhook-url`**: Webhook URL for operational notifications +- **`studio-api-key`**: The Graph Studio API key +- **`studio-deploy-key`**: The Graph 
Studio deploy key +- **`slack-webhook-url`**: Slack webhook for notifications -### 2. Configure Storage (Required) +## Storage Configuration ```bash # Check available storage classes kubectl get storageclass -# If you see a default storage class (marked with *), skip to step 3 -# Otherwise, edit persistent-volume-claim.yaml and uncomment the appropriate storageClassName +# The manifests use 'ssd-retain' storage class by default +# Edit environments/{mainnet,testnet}/persistent-volume-claim.yaml if needed ``` **Common storage classes by platform:** + - **AWS EKS**: `gp2`, `gp3`, `ebs-csi` -- **Google GKE**: `standard`, `ssd` +- **Google GKE**: `standard`, `ssd` - **Azure AKS**: `managed-premium`, `managed` - **Local/Development**: `hostpath`, `local-path` -### 3. Deploy to Kubernetes +## Operations + +### Restart Deployments ```bash -# Apply all manifests -kubectl apply -f k8s/ +./restart-deployments.sh +``` -# Verify deployment -kubectl get pods -l app=service-quality-oracle -kubectl get pvc -l app=service-quality-oracle +### View Logs + +```bash +kubectl logs -f deployment/rewards-eligibility-oracle -n rewards-eligibility-oracle ``` -### 4. Monitor Deployment +### Check Status ```bash -# Check pod status -kubectl describe pod -l app=service-quality-oracle +kubectl get all -n rewards-eligibility-oracle +``` -# View logs -kubectl logs -l app=service-quality-oracle -f +### Delete Environment -# Check persistent volumes -kubectl get pv +```bash +kubectl delete -k . 
``` +## Monitoring + +- Prometheus scraping enabled via annotations +- ServiceMonitor and PodMonitor configured for metrics collection +- Metrics exposed on port 8000 at `/metrics` endpoint +- Labels applied for environment-specific alerting + ## Architecture ### Persistent Storage The service uses **two persistent volumes** to maintain state across pod restarts: -- **`service-quality-oracle-data` (5GB)**: Circuit breaker state, last run tracking, BigQuery cache, CSV outputs -- **`service-quality-oracle-logs` (2GB)**: Application logs +- **`rewards-eligibility-oracle-data` (10GB)**: Circuit breaker state, last run tracking, BigQuery cache, CSV outputs +- **`rewards-eligibility-oracle-logs` (5GB)**: Application logs **Mount points:** + - `/app/data` → Critical state files (circuit breaker, cache, outputs) - `/app/logs` → Application logs ### Configuration Management -**Non-sensitive configuration** → `ConfigMap` (`configmap.yaml`) -**Sensitive credentials** → `Secret` (`secrets.yaml`) +**Non-sensitive configuration** → `ConfigMap` (generated from `config.yaml`) +**Sensitive credentials** → `Secret` (generated from `config.secret.yaml`) This separation provides: + - ✅ Easy configuration updates without rebuilding images - ✅ Secure credential management with base64 encoding - ✅ Clear separation of concerns @@ -94,11 +200,13 @@ This separation provides: ### Resource Allocation **Requests (guaranteed):** + - CPU: 250m (0.25 cores) - Memory: 512M **Limits (maximum):** -- CPU: 1000m (1.0 core) + +- CPU: 1000m (1.0 core) - Memory: 1G ## State Persistence Benefits @@ -114,7 +222,7 @@ With persistent volumes, the service maintains: The deployment uses **file-based health checks** (same as docker-compose): -**Liveness probe:** Checks `/app/healthcheck` file modification time +**Liveness probe:** Checks `/app/healthcheck` file modification time **Readiness probe:** Verifies `/app/healthcheck` file exists ## Troubleshooting @@ -123,7 +231,7 @@ The deployment uses **file-based 
health checks** (same as docker-compose): ```bash # Check events -kubectl describe pod -l app=service-quality-oracle +kubectl describe pod -l app=rewards-eligibility-oracle # Common issues: # - Missing secrets @@ -138,26 +246,29 @@ kubectl describe pod -l app=service-quality-oracle kubectl get pvc # Check if volumes are mounted correctly -kubectl exec -it deployment/service-quality-oracle -- ls -la /app/data +kubectl exec -it deployment/rewards-eligibility-oracle -- ls -la /app/data ``` ### Debug Configuration ```bash # Check environment variables -kubectl exec -it deployment/service-quality-oracle -- env | grep -E "(BIGQUERY|BLOCKCHAIN)" +kubectl exec -it deployment/rewards-eligibility-oracle -- env | grep -E "(BIGQUERY|BLOCKCHAIN)" # Verify secrets are mounted -kubectl exec -it deployment/service-quality-oracle -- ls -la /etc/secrets +kubectl exec -it deployment/rewards-eligibility-oracle -- ls -la /etc/secrets ``` -## Security Best Practices +## Security -✅ **Secrets never committed** to version control -✅ **Service account** with minimal BigQuery permissions -✅ **Private key** stored in Kubernetes secrets (base64 encoded) -✅ **Resource limits** prevent resource exhaustion -✅ **Read-only filesystem** where possible +✅ **Never commit actual secrets** - `config.secret.yaml` files contain placeholders only +✅ **Mainnet deployment safety checks** for production secrets +✅ **Non-root containers** with dropped capabilities +✅ **Service account** with minimal BigQuery permissions +✅ **Private key** stored in Kubernetes secrets (base64 encoded) +✅ **Resource limits** prevent resource exhaustion +✅ **Workload Identity** configured for secure GCP access +✅ **SSD storage with retention** for data persistence ## Production Considerations @@ -171,7 +282,7 @@ kubectl exec -it deployment/service-quality-oracle -- ls -la /etc/secrets ## Next Steps 1. **Test deployment** in staging environment -2. **Verify state persistence** across pod restarts +2. 
**Verify state persistence** across pod restarts 3. **Set up monitoring** and alerting 4. **Configure backup** for persistent volumes -5. **Enable quality checking** after successful validation \ No newline at end of file +5. **Enable quality checking** after successful validation diff --git a/k8s/base/deployment.yaml b/k8s/base/deployment.yaml new file mode 100644 index 0000000..9422c99 --- /dev/null +++ b/k8s/base/deployment.yaml @@ -0,0 +1,119 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: rewards-eligibility-oracle + labels: + app: rewards-eligibility-oracle +spec: + replicas: 1 # Single instance due to state management + selector: + matchLabels: + app: rewards-eligibility-oracle + template: + metadata: + labels: + app: rewards-eligibility-oracle + annotations: + prometheus.io/scrape: "true" + prometheus.io/path: "/metrics" + prometheus.io/port: "8000" + spec: + imagePullSecrets: + - name: docker-registry + containers: + - name: rewards-eligibility-oracle + image: ghcr.io/graphprotocol/rewards-eligibility-oracle:latest + imagePullPolicy: IfNotPresent + ports: + - containerPort: 8000 + name: metrics + envFrom: + # Load all non-sensitive configuration from ConfigMap + - configMapRef: + name: rewards-eligibility-oracle-config + env: + # Secrets from Kubernetes Secret + - name: GOOGLE_APPLICATION_CREDENTIALS + valueFrom: + secretKeyRef: + name: rewards-eligibility-oracle-secrets + key: google-credentials + - name: BLOCKCHAIN_PRIVATE_KEY + valueFrom: + secretKeyRef: + name: rewards-eligibility-oracle-secrets + key: blockchain-private-key + - name: ETHERSCAN_API_KEY + valueFrom: + secretKeyRef: + name: rewards-eligibility-oracle-secrets + key: etherscan-api-key + - name: ARBITRUM_API_KEY + valueFrom: + secretKeyRef: + name: rewards-eligibility-oracle-secrets + key: arbitrum-api-key + - name: STUDIO_API_KEY + valueFrom: + secretKeyRef: + name: rewards-eligibility-oracle-secrets + key: studio-api-key + - name: STUDIO_DEPLOY_KEY + valueFrom: + 
secretKeyRef: + name: rewards-eligibility-oracle-secrets + key: studio-deploy-key + - name: SLACK_WEBHOOK_URL + valueFrom: + secretKeyRef: + name: rewards-eligibility-oracle-secrets + key: slack-webhook-url + volumeMounts: + - name: data-volume + mountPath: /app/data + - name: logs-volume + mountPath: /app/logs + resources: + requests: + memory: "512M" # Match docker-compose reservations + cpu: "250m" + limits: + memory: "1G" # Match docker-compose limits + cpu: "1000m" # Match docker-compose '1.0' cpus + # Use file-based healthcheck like docker-compose (not HTTP) + livenessProbe: + exec: + command: + - python + - -c + - "import os, time; assert os.path.exists('/app/healthcheck') and time.time() - os.path.getmtime('/app/healthcheck') < 300, 'Healthcheck failed'" + initialDelaySeconds: 60 # Match docker-compose start_period + periodSeconds: 120 # Match docker-compose interval (5m -> 300s, but use 2m for faster detection) + timeoutSeconds: 30 # Match docker-compose timeout + failureThreshold: 3 # Match docker-compose retries + readinessProbe: + exec: + command: + - python + - -c + - "import os; assert os.path.exists('/app/healthcheck'), 'Healthcheck file missing'" + initialDelaySeconds: 10 + periodSeconds: 30 + securityContext: + allowPrivilegeEscalation: false + runAsNonRoot: true + runAsUser: 1000 + runAsGroup: 1000 + readOnlyRootFilesystem: false + capabilities: + drop: + - ALL + volumes: + - name: data-volume + persistentVolumeClaim: + claimName: rewards-eligibility-oracle-data + - name: logs-volume + persistentVolumeClaim: + claimName: rewards-eligibility-oracle-logs + serviceAccountName: rewards-eligibility-oracle + restartPolicy: Always diff --git a/k8s/base/kustomization.yaml b/k8s/base/kustomization.yaml new file mode 100644 index 0000000..4a729fb --- /dev/null +++ b/k8s/base/kustomization.yaml @@ -0,0 +1,9 @@ +apiVersion: kustomize.config.k8s.io/v1beta1 +kind: Kustomization + +resources: + - namespace.yaml + - deployment.yaml + - service.yaml + - 
servicemonitor.yaml + - serviceaccount.yaml diff --git a/k8s/base/namespace.yaml b/k8s/base/namespace.yaml new file mode 100644 index 0000000..d316dd9 --- /dev/null +++ b/k8s/base/namespace.yaml @@ -0,0 +1,6 @@ +apiVersion: v1 +kind: Namespace +metadata: + name: rewards-eligibility-oracle + labels: + name: rewards-eligibility-oracle diff --git a/k8s/base/service.yaml b/k8s/base/service.yaml new file mode 100644 index 0000000..0907e9a --- /dev/null +++ b/k8s/base/service.yaml @@ -0,0 +1,15 @@ +apiVersion: v1 +kind: Service +metadata: + name: rewards-eligibility-oracle + labels: + app: rewards-eligibility-oracle +spec: + selector: + app: rewards-eligibility-oracle + ports: + - name: metrics + port: 8000 + targetPort: 8000 + protocol: TCP + type: ClusterIP diff --git a/k8s/base/serviceaccount.yaml b/k8s/base/serviceaccount.yaml new file mode 100644 index 0000000..b0464b1 --- /dev/null +++ b/k8s/base/serviceaccount.yaml @@ -0,0 +1,9 @@ +apiVersion: v1 +kind: ServiceAccount +metadata: + name: rewards-eligibility-oracle + labels: + app: rewards-eligibility-oracle + annotations: + iam.gke.io/gcp-service-account: rewards-eligibility-oracle@graph-mainnet.iam.gserviceaccount.com +automountServiceAccountToken: false diff --git a/k8s/base/servicemonitor.yaml b/k8s/base/servicemonitor.yaml new file mode 100644 index 0000000..4addeac --- /dev/null +++ b/k8s/base/servicemonitor.yaml @@ -0,0 +1,15 @@ +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: rewards-eligibility-oracle + labels: + app: rewards-eligibility-oracle +spec: + selector: + matchLabels: + app: rewards-eligibility-oracle + endpoints: + - port: metrics + path: /metrics + interval: 30s + scrapeTimeout: 10s diff --git a/k8s/configmap.yaml b/k8s/configmap.yaml index 6fa14af..036158a 100644 --- a/k8s/configmap.yaml +++ b/k8s/configmap.yaml @@ -1,9 +1,9 @@ apiVersion: v1 kind: ConfigMap metadata: - name: service-quality-oracle-config + name: rewards-eligibility-oracle-config labels: - app: 
service-quality-oracle + app: rewards-eligibility-oracle data: # BigQuery Configuration BIGQUERY_LOCATION_ID: "US" @@ -52,4 +52,4 @@ data: MIN_CURATION_SIGNAL: "500" # Runtime Configuration - RUN_ON_STARTUP: "true" \ No newline at end of file + RUN_ON_STARTUP: "true" diff --git a/k8s/deployment.yaml b/k8s/deployment.yaml index ff11dd9..ae04d37 100644 --- a/k8s/deployment.yaml +++ b/k8s/deployment.yaml @@ -1,62 +1,62 @@ apiVersion: apps/v1 kind: Deployment metadata: - name: service-quality-oracle + name: rewards-eligibility-oracle labels: - app: service-quality-oracle + app: rewards-eligibility-oracle spec: replicas: 1 # Single instance due to state management selector: matchLabels: - app: service-quality-oracle + app: rewards-eligibility-oracle template: metadata: labels: - app: service-quality-oracle + app: rewards-eligibility-oracle spec: containers: - - name: service-quality-oracle - image: ghcr.io/graphprotocol/service-quality-oracle:latest + - name: rewards-eligibility-oracle + image: ghcr.io/graphprotocol/rewards-eligibility-oracle:latest envFrom: # Load all non-sensitive configuration from ConfigMap - configMapRef: - name: service-quality-oracle-config + name: rewards-eligibility-oracle-config env: # Secrets from Kubernetes Secret - name: GOOGLE_APPLICATION_CREDENTIALS valueFrom: secretKeyRef: - name: service-quality-oracle-secrets + name: rewards-eligibility-oracle-secrets key: google-credentials - name: BLOCKCHAIN_PRIVATE_KEY valueFrom: secretKeyRef: - name: service-quality-oracle-secrets + name: rewards-eligibility-oracle-secrets key: blockchain-private-key - name: ETHERSCAN_API_KEY valueFrom: secretKeyRef: - name: service-quality-oracle-secrets + name: rewards-eligibility-oracle-secrets key: etherscan-api-key - name: ARBITRUM_API_KEY valueFrom: secretKeyRef: - name: service-quality-oracle-secrets + name: rewards-eligibility-oracle-secrets key: arbitrum-api-key - name: STUDIO_API_KEY valueFrom: secretKeyRef: - name: service-quality-oracle-secrets + name: 
rewards-eligibility-oracle-secrets key: studio-api-key - name: STUDIO_DEPLOY_KEY valueFrom: secretKeyRef: - name: service-quality-oracle-secrets + name: rewards-eligibility-oracle-secrets key: studio-deploy-key - name: SLACK_WEBHOOK_URL valueFrom: secretKeyRef: - name: service-quality-oracle-secrets + name: rewards-eligibility-oracle-secrets key: slack-webhook-url volumeMounts: - name: data-volume @@ -68,7 +68,7 @@ spec: memory: "512M" # Match docker-compose reservations cpu: "250m" limits: - memory: "1G" # Match docker-compose limits + memory: "1G" # Match docker-compose limits cpu: "1000m" # Match docker-compose '1.0' cpus # Use file-based healthcheck like docker-compose (not HTTP) livenessProbe: @@ -84,7 +84,7 @@ spec: readinessProbe: exec: command: - - python + - python - -c - "import os; assert os.path.exists('/app/healthcheck'), 'Healthcheck file missing'" initialDelaySeconds: 10 @@ -92,8 +92,8 @@ spec: volumes: - name: data-volume persistentVolumeClaim: - claimName: service-quality-oracle-data + claimName: rewards-eligibility-oracle-data - name: logs-volume persistentVolumeClaim: - claimName: service-quality-oracle-logs - restartPolicy: Always \ No newline at end of file + claimName: rewards-eligibility-oracle-logs + restartPolicy: Always diff --git a/k8s/environments/mainnet/apply.sh b/k8s/environments/mainnet/apply.sh new file mode 100755 index 0000000..ed9467d --- /dev/null +++ b/k8s/environments/mainnet/apply.sh @@ -0,0 +1,32 @@ +#!/bin/bash + +set -e + +# Apply Kubernetes manifests for rewards-eligibility-oracle (mainnet) +echo "Applying rewards-eligibility-oracle mainnet manifests..." + +# Ensure we're authenticated to the correct cluster +echo "Authenticating to cluster..." +./auth.sh + +# Check if config.secret.yaml has been configured +if ! grep -q "your-key-id-here" config.secret.yaml 2>/dev/null; then + echo "✓ Secret configuration appears to be customized" +else + echo "WARNING: config.secret.yaml contains placeholder values." 
+ echo "Please configure your actual secrets before deploying to mainnet!" + read -p "Continue anyway? (y/N) " -n 1 -r + echo + if [[ ! $REPLY =~ ^[Yy]$ ]]; then + exit 1 + fi +fi + +# Apply using kustomize +kubectl apply -k . + +echo "All mainnet manifests applied successfully!" +echo "" +echo "To check status:" +echo " kubectl get all -n rewards-eligibility-oracle" +echo " kubectl logs -f deployment/rewards-eligibility-oracle -n rewards-eligibility-oracle" diff --git a/k8s/environments/mainnet/auth.sh b/k8s/environments/mainnet/auth.sh new file mode 100755 index 0000000..e3ab19b --- /dev/null +++ b/k8s/environments/mainnet/auth.sh @@ -0,0 +1,13 @@ +#!/bin/bash + +# Get cluster credentials for rewards-eligibility-oracle deployment +# Update the following values based on your GKE cluster configuration: +# - PROJECT: Your GCP project ID +# - CLUSTER: Your GKE cluster name +# - ZONE: Your GKE cluster zone/region + +PROJECT="graph-mainnet" +CLUSTER="network" +ZONE="us-central1-a" + +gcloud container clusters get-credentials $CLUSTER --project $PROJECT --zone $ZONE diff --git a/k8s/environments/mainnet/config.secret.yaml b/k8s/environments/mainnet/config.secret.yaml new file mode 100644 index 0000000..10219de --- /dev/null +++ b/k8s/environments/mainnet/config.secret.yaml @@ -0,0 +1,45 @@ +# Mainnet Secrets for Service Quality Oracle# IMPORTANT: This is a TEMPLATE file - DO NOT commit actual secrets! +# +# Usage: +# 1. Replace all placeholder values with your actual mainnet secrets +# 2. 
This file is used by kustomize secretGenerator +# +# Security: Kubernetes automatically base64 encodes stringData values + +# Google Cloud Service Account JSON for BigQuery access +# Create a dedicated service account with BigQuery Data Viewer + Job User roles +# Download JSON key from Google Cloud Console > IAM > Service Accounts + google-credentials: | + { + "type": "service_account", + "project_id": "graph-mainnet", + "private_key_id": "your-key-id-here", + "private_key": "-----BEGIN PRIVATE KEY-----\nYOUR-PRIVATE-KEY-CONTENT-HERE\n-----END PRIVATE KEY-----\n", + "client_email": "rewards-eligibility-oracle@graph-mainnet.iam.gserviceaccount.com", + "client_id": "your-client-id-here", + "auth_uri": "https://accounts.google.com/o/oauth2/auth", + "token_uri": "https://oauth2.googleapis.com/token", + "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs", + "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/rewards-eligibility-oracle%40graph-mainnet.iam.gserviceaccount.com" + } + +# Blockchain private key for Arbitrum Sepolia transactions (without 0x prefix) +# CRITICAL: This key controls blockchain transactions - keep secure + blockchain-private-key: "your-64-character-private-key-here" + +# Etherscan API key for mainnet contract verification (if needed) +# Get from: https://etherscan.io/apis + etherscan-api-key: "your-etherscan-api-key-here" + +# Arbitrum API key for contract verification on Arbitrum networks +# Get from: https://arbiscan.io/apis + arbitrum-api-key: "your-arbitrum-api-key-here" + +# The Graph Studio API credentials +# Get from: https://thegraph.com/studio/apikeys + studio-api-key: "your-studio-api-key-here" + studio-deploy-key: "your-studio-deploy-key-here" + +# Slack webhook URL for operational notifications +# Create webhook: Slack App > Incoming Webhooks > Add New Webhook + slack-webhook-url: "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX" diff --git 
a/k8s/environments/mainnet/config.yaml b/k8s/environments/mainnet/config.yaml new file mode 100644 index 0000000..a463c75 --- /dev/null +++ b/k8s/environments/mainnet/config.yaml @@ -0,0 +1,49 @@ +# Mainnet Configuration for Service Quality Oracle +# BigQuery Configuration +BIGQUERY_LOCATION_ID: "US" +BIGQUERY_PROJECT_ID: "graph-mainnet" +BIGQUERY_DATASET_ID: "internal_metrics" +BIGQUERY_TABLE_ID: "metrics_indexer_attempts" +BIGQUERY_CURATION_TABLE_ID: "metrics_curator_signals" +BIGQUERY_CURATOR_MAINNET_TABLE_ID: "curator_name_signal_dimensions_daily" +BIGQUERY_CURATOR_ARBITRUM_TABLE_ID: "curator_name_signal_dimensions_arbitrum_daily" +BIGQUERY_SUBGRAPH_LOOKUP_TABLE_ID: "subgraph_version_id_lookup" +BIGQUERY_ANALYSIS_PERIOD_DAYS: "28" + +# Blockchain Configuration (Arbitrum One - Mainnet) +BLOCKCHAIN_CONTRACT_ADDRESS: "0x0000000000000000000000000000000000000000" # TODO: Replace with mainnet contract +BLOCKCHAIN_FUNCTION_NAME: "allowIndexers" +BLOCKCHAIN_CHAIN_ID: "42161" +BLOCK_EXPLORER_URL: "https://arbiscan.io" +TX_TIMEOUT_SECONDS: "30" + +# RPC Provider URLs (Arbitrum One - Mainnet) +BLOCKCHAIN_RPC_URL_1: "https://arbitrum.drpc.org" +BLOCKCHAIN_RPC_URL_2: "https://arb1.arbitrum.io/rpc" +BLOCKCHAIN_RPC_URL_3: "https://api.zan.top/node/v1/arb/one" +BLOCKCHAIN_RPC_URL_4: "https://arbitrum-one.gateway.tenderly.co" + +# Scheduling Configuration +SCHEDULED_RUN_TIME: "10:00" + +# Subgraph URLs +SUBGRAPH_URL_PRE_PRODUCTION: "https://api.studio.thegraph.com/query/110664/issuance-eligibility-oracle/v0.1.4" +SUBGRAPH_URL_PRODUCTION: "https://gateway.thegraph.com/api/subgraphs/id/" + +# Processing Configuration +BATCH_SIZE: "125" +MAX_AGE_BEFORE_DELETION: "120" + +# Caching Configuration +CACHE_MAX_AGE_MINUTES: "30" +FORCE_BIGQUERY_REFRESH: "false" + +# Eligibility Criteria +MIN_ONLINE_DAYS: "5" +MIN_SUBGRAPHS: "10" +MAX_LATENCY_MS: "5000" +MAX_BLOCKS_BEHIND: "50000" +MIN_CURATION_SIGNAL: "500" + +# Runtime Configuration +RUN_ON_STARTUP: "true" diff --git 
a/k8s/environments/mainnet/diff.sh b/k8s/environments/mainnet/diff.sh new file mode 100755 index 0000000..5227232 --- /dev/null +++ b/k8s/environments/mainnet/diff.sh @@ -0,0 +1,14 @@ +#!/bin/bash + +set -e + +# Show diff of what would change when applying manifests (mainnet) +echo "Showing diff for rewards-eligibility-oracle mainnet manifests..." + +# Ensure we're authenticated to the correct cluster +echo "Authenticating to cluster..." +./auth.sh + +kubectl diff -k . || true + +echo "Mainnet diff complete." diff --git a/k8s/environments/mainnet/kustomization.yaml b/k8s/environments/mainnet/kustomization.yaml new file mode 100644 index 0000000..86837e8 --- /dev/null +++ b/k8s/environments/mainnet/kustomization.yaml @@ -0,0 +1,27 @@ +apiVersion: kustomize.config.k8s.io/v1beta1 +kind: Kustomization + +namespace: rewards-eligibility-oracle + +resources: + - ../../base + - persistent-volume-claim.yaml + +secretGenerator: + - name: rewards-eligibility-oracle-secrets + files: + - config.secret.yaml + +configMapGenerator: + - name: rewards-eligibility-oracle-config + files: + - config.yaml + +images: + - name: ghcr.io/graphprotocol/rewards-eligibility-oracle + newName: ghcr.io/graphprotocol/rewards-eligibility-oracle + newTag: mainnet-latest + +commonLabels: + environment: mainnet + variant: production diff --git a/k8s/environments/mainnet/persistent-volume-claim.yaml b/k8s/environments/mainnet/persistent-volume-claim.yaml new file mode 100644 index 0000000..18bfc98 --- /dev/null +++ b/k8s/environments/mainnet/persistent-volume-claim.yaml @@ -0,0 +1,28 @@ +apiVersion: v1 +kind: PersistentVolumeClaim +metadata: + name: rewards-eligibility-oracle-data + labels: + app: rewards-eligibility-oracle +spec: + accessModes: + - ReadWriteOnce + resources: + requests: + storage: 10Gi + storageClassName: "ssd-retain" # Production storage class with retention + +--- +apiVersion: v1 +kind: PersistentVolumeClaim +metadata: + name: rewards-eligibility-oracle-logs + labels: + app: 
rewards-eligibility-oracle +spec: + accessModes: + - ReadWriteOnce + resources: + requests: + storage: 5Gi + storageClassName: "ssd-retain" # Production storage class with retention diff --git a/k8s/environments/mainnet/restart-deployments.sh b/k8s/environments/mainnet/restart-deployments.sh new file mode 100755 index 0000000..932be32 --- /dev/null +++ b/k8s/environments/mainnet/restart-deployments.sh @@ -0,0 +1,11 @@ +#!/bin/bash + +set -e + +# Restart deployments for rewards-eligibility-oracle (mainnet) +echo "Restarting rewards-eligibility-oracle mainnet deployment..." + +kubectl rollout restart deployment/rewards-eligibility-oracle -n rewards-eligibility-oracle +kubectl rollout status deployment/rewards-eligibility-oracle -n rewards-eligibility-oracle --timeout=300s + +echo "Mainnet deployment restarted successfully!" diff --git a/k8s/environments/testnet/apply.sh b/k8s/environments/testnet/apply.sh new file mode 100755 index 0000000..41dbd3e --- /dev/null +++ b/k8s/environments/testnet/apply.sh @@ -0,0 +1,27 @@ +#!/bin/bash + +set -e + +# Apply Kubernetes manifests for rewards-eligibility-oracle (testnet) +echo "Applying rewards-eligibility-oracle testnet manifests..." + +# Ensure we're authenticated to the correct cluster +echo "Authenticating to cluster..." +./auth.sh + +# Check if config.secret.yaml has been configured +if ! grep -q "your-key-id-here" config.secret.yaml 2>/dev/null; then + echo "✓ Secret configuration appears to be customized" +else + echo "WARNING: config.secret.yaml contains placeholder values." + echo "Please configure your actual secrets before deploying!" +fi + +# Apply using kustomize +kubectl apply -k . + +echo "All testnet manifests applied successfully!" 
+echo "" +echo "To check status:" +echo " kubectl get all -n rewards-eligibility-oracle" +echo " kubectl logs -f deployment/rewards-eligibility-oracle -n rewards-eligibility-oracle" diff --git a/k8s/environments/testnet/auth.sh b/k8s/environments/testnet/auth.sh new file mode 100755 index 0000000..76c7eba --- /dev/null +++ b/k8s/environments/testnet/auth.sh @@ -0,0 +1,13 @@ +#!/bin/bash + +# Get cluster credentials for rewards-eligibility-oracle deployment +# Update the following values based on your GKE cluster configuration: +# - PROJECT: Your GCP project ID +# - CLUSTER: Your GKE cluster name +# - ZONE: Your GKE cluster zone/region + +PROJECT="graph-mainnet" +CLUSTER="testnet" +ZONE="us-central1-a" + +gcloud container clusters get-credentials $CLUSTER --project $PROJECT --zone $ZONE diff --git a/k8s/environments/testnet/config.secret.yaml b/k8s/environments/testnet/config.secret.yaml new file mode 100644 index 0000000..89b92d6 --- /dev/null +++ b/k8s/environments/testnet/config.secret.yaml @@ -0,0 +1,46 @@ +# Testnet Secrets for Service Quality Oracle +# IMPORTANT: This is a TEMPLATE file - DO NOT commit actual secrets! +# +# Usage: +# 1. Replace all placeholder values with your actual testnet secrets +# 2. 
This file is used by kustomize secretGenerator
+#
+# Security: Kubernetes automatically base64 encodes stringData values
+
+# Google Cloud Service Account JSON for BigQuery access
+# Create a dedicated service account with BigQuery Data Viewer + Job User roles
+# Download JSON key from Google Cloud Console > IAM > Service Accounts
+  google-credentials: |
+    {
+      "type": "service_account",
+      "project_id": "graph-mainnet",
+      "private_key_id": "your-key-id-here",
+      "private_key": "-----BEGIN PRIVATE KEY-----\nYOUR-PRIVATE-KEY-CONTENT-HERE\n-----END PRIVATE KEY-----\n",
+      "client_email": "rewards-eligibility-oracle@graph-mainnet.iam.gserviceaccount.com",
+      "client_id": "your-client-id-here",
+      "auth_uri": "https://accounts.google.com/o/oauth2/auth",
+      "token_uri": "https://oauth2.googleapis.com/token",
+      "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
+      "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/rewards-eligibility-oracle%40graph-mainnet.iam.gserviceaccount.com"
+    }
+
+# Blockchain private key for Arbitrum Sepolia transactions (without 0x prefix)
+# CRITICAL: This key controls blockchain transactions - keep secure
+  blockchain-private-key: "your-64-character-private-key-here"
+
+# Etherscan API key for mainnet contract verification (if needed)
+# Get from: https://etherscan.io/apis
+  etherscan-api-key: "your-etherscan-api-key-here"
+
+# Arbitrum API key for contract verification on Arbitrum networks
+# Get from: https://arbiscan.io/apis
+  arbitrum-api-key: "your-arbitrum-api-key-here"
+
+# The Graph Studio API credentials
+# Get from: https://thegraph.com/studio/apikeys
+  studio-api-key: "your-studio-api-key-here"
+  studio-deploy-key: "your-studio-deploy-key-here"
+
+# Slack webhook URL for operational notifications
+# Create webhook: Slack App > Incoming Webhooks > Add New Webhook
+  slack-webhook-url: "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX"
diff --git a/k8s/environments/testnet/config.yaml b/k8s/environments/testnet/config.yaml
new file mode 100644
index 0000000..c0357ce
--- /dev/null
+++ b/k8s/environments/testnet/config.yaml
@@ -0,0 +1,49 @@
+# Testnet Configuration for Service Quality Oracle
+# BigQuery Configuration
+BIGQUERY_LOCATION_ID: "US"
+BIGQUERY_PROJECT_ID: "graph-mainnet"
+BIGQUERY_DATASET_ID: "internal_metrics"
+BIGQUERY_TABLE_ID: "metrics_indexer_attempts"
+BIGQUERY_CURATION_TABLE_ID: "metrics_curator_signals"
+BIGQUERY_CURATOR_MAINNET_TABLE_ID: "curator_name_signal_dimensions_daily"
+BIGQUERY_CURATOR_ARBITRUM_TABLE_ID: "curator_name_signal_dimensions_arbitrum_daily"
+BIGQUERY_SUBGRAPH_LOOKUP_TABLE_ID: "subgraph_version_id_lookup"
+BIGQUERY_ANALYSIS_PERIOD_DAYS: "28"
+
+# Blockchain Configuration (Arbitrum Sepolia - Testnet)
+BLOCKCHAIN_CONTRACT_ADDRESS: "0x6d5550698F930210c3f50efe744bF51C55D791f6"
+BLOCKCHAIN_FUNCTION_NAME: "allowIndexers"
+BLOCKCHAIN_CHAIN_ID: "421614"
+BLOCK_EXPLORER_URL: "https://sepolia.arbiscan.io"
+TX_TIMEOUT_SECONDS: "30"
+
+# RPC Provider URLs (Arbitrum Sepolia - Testnet)
+BLOCKCHAIN_RPC_URL_1: "https://arbitrum-sepolia.drpc.org"
+BLOCKCHAIN_RPC_URL_2: "https://sepolia-rollup.arbitrum.io/rpc"
+BLOCKCHAIN_RPC_URL_3: "https://api.zan.top/arb-sepolia"
+BLOCKCHAIN_RPC_URL_4: "https://arbitrum-sepolia.gateway.tenderly.co"
+
+# Scheduling Configuration
+SCHEDULED_RUN_TIME: "10:00"
+
+# Subgraph URLs
+SUBGRAPH_URL_PRE_PRODUCTION: "https://api.studio.thegraph.com/query/110664/issuance-eligibility-oracle/v0.1.4"
+SUBGRAPH_URL_PRODUCTION: "https://gateway.thegraph.com/api/subgraphs/id/"
+
+# Processing Configuration
+BATCH_SIZE: "125"
+MAX_AGE_BEFORE_DELETION: "120"
+
+# Caching Configuration
+CACHE_MAX_AGE_MINUTES: "30"
+FORCE_BIGQUERY_REFRESH: "false"
+
+# Eligibility Criteria
+MIN_ONLINE_DAYS: "5"
+MIN_SUBGRAPHS: "10"
+MAX_LATENCY_MS: "5000"
+MAX_BLOCKS_BEHIND: "50000"
+MIN_CURATION_SIGNAL: "500"
+
+# Runtime Configuration
+RUN_ON_STARTUP: "true"
diff --git a/k8s/environments/testnet/diff.sh b/k8s/environments/testnet/diff.sh
new file mode 100755
index 0000000..1314178
--- /dev/null
+++ b/k8s/environments/testnet/diff.sh
@@ -0,0 +1,14 @@
+#!/bin/bash
+
+set -e
+
+# Show diff of what would change when applying manifests (testnet)
+echo "Showing diff for rewards-eligibility-oracle testnet manifests..."
+
+# Ensure we're authenticated to the correct cluster
+echo "Authenticating to cluster..."
+./auth.sh
+
+kubectl diff -k . || true
+
+echo "Testnet diff complete."
diff --git a/k8s/environments/testnet/kustomization.yaml b/k8s/environments/testnet/kustomization.yaml
new file mode 100644
index 0000000..193e906
--- /dev/null
+++ b/k8s/environments/testnet/kustomization.yaml
@@ -0,0 +1,27 @@
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+
+namespace: rewards-eligibility-oracle
+
+resources:
+  - ../../base
+  - persistent-volume-claim.yaml
+
+secretGenerator:
+  - name: rewards-eligibility-oracle-secrets
+    files:
+      - config.secret.yaml
+
+configMapGenerator:
+  - name: rewards-eligibility-oracle-config
+    files:
+      - config.yaml
+
+images:
+  - name: ghcr.io/graphprotocol/rewards-eligibility-oracle
+    newName: ghcr.io/graphprotocol/rewards-eligibility-oracle
+    newTag: testnet-latest
+
+commonLabels:
+  environment: testnet
+  variant: staging
diff --git a/k8s/environments/testnet/persistent-volume-claim.yaml b/k8s/environments/testnet/persistent-volume-claim.yaml
new file mode 100644
index 0000000..18bfc98
--- /dev/null
+++ b/k8s/environments/testnet/persistent-volume-claim.yaml
@@ -0,0 +1,28 @@
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: rewards-eligibility-oracle-data
+  labels:
+    app: rewards-eligibility-oracle
+spec:
+  accessModes:
+    - ReadWriteOnce
+  resources:
+    requests:
+      storage: 10Gi
+  storageClassName: "ssd-retain" # Production storage class with retention
+
+---
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: rewards-eligibility-oracle-logs
+  labels:
+    app: rewards-eligibility-oracle
+spec:
+  accessModes:
+    - ReadWriteOnce
+  resources:
+    requests:
+      storage: 5Gi
+  storageClassName: "ssd-retain" # Production storage class with retention
diff --git a/k8s/environments/testnet/restart-deployments.sh b/k8s/environments/testnet/restart-deployments.sh
new file mode 100755
index 0000000..16ccfd2
--- /dev/null
+++ b/k8s/environments/testnet/restart-deployments.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+
+set -e
+
+# Restart deployments for rewards-eligibility-oracle (testnet)
+echo "Restarting rewards-eligibility-oracle testnet deployment..."
+
+kubectl rollout restart deployment/rewards-eligibility-oracle -n rewards-eligibility-oracle
+kubectl rollout status deployment/rewards-eligibility-oracle -n rewards-eligibility-oracle --timeout=300s
+
+echo "Testnet deployment restarted successfully!"
diff --git a/k8s/persistent-volume-claim.yaml b/k8s/persistent-volume-claim.yaml
index ff79393..593bb71 100644
--- a/k8s/persistent-volume-claim.yaml
+++ b/k8s/persistent-volume-claim.yaml
@@ -1,9 +1,9 @@
 apiVersion: v1
 kind: PersistentVolumeClaim
 metadata:
-  name: service-quality-oracle-data
+  name: rewards-eligibility-oracle-data
   labels:
-    app: service-quality-oracle
+    app: rewards-eligibility-oracle
 spec:
   accessModes:
     - ReadWriteOnce
@@ -13,7 +13,7 @@ spec:
   # Storage class - uncomment and modify based on your cluster:
   # storageClassName: "" # Use default storage class (most common)
   # storageClassName: "gp2" # AWS EKS
-  # storageClassName: "standard" # GKE
+  # storageClassName: "standard" # GKE
   # storageClassName: "managed-premium" # Azure AKS
   # storageClassName: "local-path" # K3s/Rancher
   # storageClassName: "hostpath" # Local development
@@ -22,9 +22,9 @@ spec:
 apiVersion: v1
 kind: PersistentVolumeClaim
 metadata:
-  name: service-quality-oracle-logs
+  name: rewards-eligibility-oracle-logs
   labels:
-    app: service-quality-oracle
+    app: rewards-eligibility-oracle
 spec:
   accessModes:
     - ReadWriteOnce
@@ -37,4 +37,4 @@ spec:
   # storageClassName: "standard" # GKE
   # storageClassName: "managed-premium" # Azure AKS
   # storageClassName: "local-path" # K3s/Rancher
-  # storageClassName: "hostpath" # Local development
\ No newline at end of file
+  # storageClassName: "hostpath" # Local development
diff --git a/k8s/restart-deployments.sh b/k8s/restart-deployments.sh
new file mode 100644
index 0000000..62843f9
--- /dev/null
+++ b/k8s/restart-deployments.sh
@@ -0,0 +1,11 @@
+#!/bin/bash
+
+set -e
+
+# Restart deployments for rewards-eligibility-oracle
+echo "Restarting rewards-eligibility-oracle deployment..."
+
+kubectl rollout restart deployment/rewards-eligibility-oracle -n rewards-eligibility-oracle
+kubectl rollout status deployment/rewards-eligibility-oracle -n rewards-eligibility-oracle --timeout=300s
+
+echo "Deployment restarted successfully!"
diff --git a/k8s/secrets.yaml.example b/k8s/secrets.yaml.example
index 4dbbc53..dddcbc4 100644
--- a/k8s/secrets.yaml.example
+++ b/k8s/secrets.yaml.example
@@ -1,8 +1,8 @@
 # Kubernetes Secrets for Service Quality Oracle
 # IMPORTANT: This is an EXAMPLE file - DO NOT commit actual secrets!
-#
+#
 # Usage:
-# 1. Copy this file to secrets.yaml
+# 1. Copy this file to secrets.yaml
 # 2. Replace all placeholder values with your actual secrets
 # 3. Apply: kubectl apply -f secrets.yaml
 # 4. Add secrets.yaml to .gitignore to prevent accidental commits
@@ -12,9 +12,9 @@
 apiVersion: v1
 kind: Secret
 metadata:
-  name: service-quality-oracle-secrets
+  name: rewards-eligibility-oracle-secrets
   labels:
-    app: service-quality-oracle
+    app: rewards-eligibility-oracle
 type: Opaque
 stringData:
 # Google Cloud Service Account JSON for BigQuery access
@@ -26,12 +26,12 @@ stringData:
       "project_id": "graph-mainnet",
       "private_key_id": "your-key-id-here",
       "private_key": "-----BEGIN PRIVATE KEY-----\nYOUR-PRIVATE-KEY-CONTENT-HERE\n-----END PRIVATE KEY-----\n",
-      "client_email": "service-quality-oracle@graph-mainnet.iam.gserviceaccount.com",
+      "client_email": "rewards-eligibility-oracle@graph-mainnet.iam.gserviceaccount.com",
       "client_id": "your-client-id-here",
       "auth_uri": "https://accounts.google.com/o/oauth2/auth",
       "token_uri": "https://oauth2.googleapis.com/token",
       "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
-      "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/service-quality-oracle%40graph-mainnet.iam.gserviceaccount.com"
+      "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/rewards-eligibility-oracle%40graph-mainnet.iam.gserviceaccount.com"
     }
 
 # Blockchain private key for Arbitrum Sepolia transactions (without 0x prefix)
@@ -53,4 +53,4 @@ stringData:
 
 # Slack webhook URL for operational notifications
 # Create webhook: Slack App > Incoming Webhooks > Add New Webhook
-  slack-webhook-url: "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX"
\ No newline at end of file
+  slack-webhook-url: "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX"
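The usage notes in the secrets example ask for every placeholder to be replaced before applying, and for the blockchain private key to be 64 characters without a `0x` prefix. A minimal pre-flight sketch of those two checks follows, written for POSIX sh. The function names `has_placeholders` and `is_valid_private_key` are hypothetical and not part of this change, the placeholder patterns are taken from the example values shown above, and treating the key as hexadecimal is an assumption.

```shell
#!/bin/sh
# Hypothetical helpers; not part of the repository's scripts.

# Succeeds if the given file still contains example placeholder values
# such as "your-...-here" or the dummy Slack webhook path.
has_placeholders() {
  grep -Eq 'your-[a-z0-9-]+-here|YOUR-PRIVATE-KEY-CONTENT-HERE|T00000000/B00000000' "$1"
}

# Succeeds if the argument is exactly 64 hexadecimal characters
# (assumed key format; a leading "0x" makes the check fail).
is_valid_private_key() {
  key=$1
  [ "${#key}" -eq 64 ] || return 1
  case $key in
    *[!0-9a-fA-F]*) return 1 ;;
  esac
  return 0
}

# Example use before an apply step (file name as used in this change):
#   if has_placeholders config.secret.yaml; then
#     echo "Replace placeholder values before deploying" >&2
#     exit 1
#   fi
```

Such checks only catch obviously unedited values; they do not confirm that a credential is actually valid.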