Commit fc73df8

committed
helm chart
1 parent 4991e59 commit fc73df8

12 files changed: +985 -0 lines changed
.gitignore

Lines changed: 1 addition & 0 deletions
@@ -140,3 +140,4 @@ notes.md
 notes/
 
 stac_search/test.py
+helm-chart/examples

helm-chart/Chart.yaml

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
apiVersion: v2
name: stac-search
description: A Helm chart for STAC Semantic Search application
type: application
version: 0.1.0
appVersion: "0.1.0"
keywords:
  - stac
  - search
  - semantic
  - geospatial
  - satellite
home: https://github.com/your-org/stac-semantic-search
sources:
  - https://github.com/your-org/stac-semantic-search
maintainers:
  - name: Your Name

helm-chart/README.md

Lines changed: 209 additions & 0 deletions
@@ -0,0 +1,209 @@
# STAC Semantic Search Helm Chart

This Helm chart deploys the STAC Semantic Search application, which consists of:
- **API**: FastAPI backend for semantic search of STAC collections and items
- **Frontend**: Streamlit web interface for natural language queries

## Prerequisites

- Kubernetes 1.19+
- Helm 3.2.0+
- Docker images for both API and frontend services

## Installation

### 1. Build and Push Docker Images

First, build and push your Docker images to a container registry:

```bash
# Build API image
docker build -t your-registry/stac-search-api:latest .

# Build Frontend image
docker build -t your-registry/stac-search-frontend:latest ./frontend

# Push images
docker push your-registry/stac-search-api:latest
docker push your-registry/stac-search-frontend:latest
```

### 2. Install the Helm Chart

```bash
# Point the chart at your registry and hostname.
# The ingress key is quoted so the shell does not treat [0] as a glob pattern.
helm install stac-search ./helm-chart \
  --set api.image.repository=your-registry/stac-search-api \
  --set frontend.image.repository=your-registry/stac-search-frontend \
  --set "ingress.hosts[0].host=stac-search.yourdomain.com"
```

Or create a custom values file:

```bash
cp helm-chart/values.yaml my-values.yaml
# Edit my-values.yaml with your configuration
helm install stac-search ./helm-chart -f my-values.yaml
```

### 3. Access the Application

After installation, follow the NOTES output to access your application. Typically:

```bash
# Port forward to access locally
kubectl port-forward service/stac-search-frontend 8501:8501

# Then visit http://localhost:8501
```

## Configuration

### Key Configuration Options

| Parameter | Description | Default |
|-----------|-------------|---------|
| `api.image.repository` | API Docker image repository | `stac-search-api` |
| `api.image.tag` | API Docker image tag | `latest` |
| `api.initContainer.enabled` | Enable init container to pre-load STAC data | `true` |
| `api.initContainer.image.repository` | Init container image repository | `stac-search-api` |
| `api.initContainer.image.tag` | Init container image tag | `latest` |
| `api.initContainer.resources` | Init container resource limits and requests | See values.yaml |
| `frontend.image.repository` | Frontend Docker image repository | `stac-search-frontend` |
| `frontend.image.tag` | Frontend Docker image tag | `latest` |
| `ingress.enabled` | Enable ingress | `true` |
| `ingress.hosts[0].host` | Hostname for ingress | `stac-search.local` |

### Example Custom Values

```yaml
# Custom values.yaml
api:
  image:
    repository: ghcr.io/your-org/stac-search-api
    tag: "v1.0.0"
  resources:
    requests:
      memory: "2Gi"
      cpu: "1000m"

frontend:
  image:
    repository: ghcr.io/your-org/stac-search-frontend
    tag: "v1.0.0"

ingress:
  hosts:
    - host: stac-search.example.com
      paths:
        - path: /
          pathType: Prefix
          service: frontend
  tls:
    - secretName: stac-search-tls
      hosts:
        - stac-search.example.com

  # API subdomain configuration
  api:
    enabled: true
    hosts:
      - host: api.stac-search.example.com
        paths:
          - path: /
            pathType: Prefix
    tls:
      - secretName: stac-search-api-tls
        hosts:
          - api.stac-search.example.com
```

## STAC Data Loading with Init Container

The chart includes an init container that automatically loads STAC catalog data into the vector database before the API starts. This ensures that the API has searchable data available immediately upon startup.

### How It Works

1. **Init Container Execution**: Before the API container starts, the init container runs the `stac_search.load` module
2. **Data Loading**: The init container fetches collections from the configured STAC catalog and generates embeddings
3. **Storage**: The embeddings are stored in ChromaDB in the shared data volume
4. **API Startup**: Once data loading is complete, the API container starts with pre-loaded searchable data

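
To watch the sequence above on a running release, one option is to query the init container status of its pods. The sketch below is a hypothetical helper, not part of the chart. It assumes the release is named `stac-search`, as in the install commands, and that pods carry the `app.kubernetes.io/instance` label that the chart's NOTES output selects on. The `KUBECTL` variable is an illustrative override so the command can be previewed with `echo`.

```bash
# Hypothetical helper (not part of the chart): print each pod of a release
# together with the "ready" flag of its init containers.
# Assumes pods carry the app.kubernetes.io/instance=<release> label.
KUBECTL="${KUBECTL:-kubectl}"

init_status() {
  # $1: release name; defaults to "stac-search" (assumed)
  "$KUBECTL" get pods \
    -l "app.kubernetes.io/instance=${1:-stac-search}" \
    -o 'jsonpath={range .items[*]}{.metadata.name}{"\t"}{.status.initContainerStatuses[*].ready}{"\n"}{end}'
}
```

Running `init_status` should report `false` next to the API pod while the data load is in progress and `true` once the init container has completed. `KUBECTL=echo init_status` prints the command instead of running it.
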
### Configuration

The init container uses the same STAC catalog configuration as the API:

```yaml
api:
  env:
    STAC_CATALOG_URL: "https://planetarycomputer.microsoft.com/api/stac/v1"
    STAC_CATALOG_NAME: "planetarycomputer"

  initContainer:
    enabled: true  # Set to false to disable data pre-loading
    resources:
      limits:
        cpu: 1000m
        memory: 2Gi
      requests:
        cpu: 500m
        memory: 1Gi
```

### Disabling Init Container

If you prefer to load data manually or have pre-existing data, you can disable the init container:

```yaml
api:
  initContainer:
    enabled: false
```

### Multiple Catalogs

To load data from multiple STAC catalogs, you can disable the init container and manually run the load script with different configurations after deployment.

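
One way to do this, sketched under several assumptions, is to run the loader inside the running API pod once per catalog. The Deployment name `stac-search-api` and the `KUBECTL` and `API_DEPLOYMENT` variables are hypothetical. The sketch also assumes that `stac_search.load` reads `STAC_CATALOG_NAME` and `STAC_CATALOG_URL` from the environment, as the init container configuration suggests, and that `python` is on the container's path. Check the module before relying on it.

```bash
# Hypothetical helper (not part of the chart): run the loader in the API pod
# for one catalog. Assumes stac_search.load takes its catalog from the
# STAC_CATALOG_NAME / STAC_CATALOG_URL environment variables.
KUBECTL="${KUBECTL:-kubectl}"
API_DEPLOYMENT="${API_DEPLOYMENT:-deploy/stac-search-api}"  # assumed name

load_catalog() {
  # $1: catalog name, $2: catalog URL
  "$KUBECTL" exec "$API_DEPLOYMENT" -- \
    env "STAC_CATALOG_NAME=$1" "STAC_CATALOG_URL=$2" \
    python -m stac_search.load
}
```

For example, call `load_catalog planetarycomputer https://planetarycomputer.microsoft.com/api/stac/v1`, then call it again with another catalog's name and URL. Because ChromaDB storage is ephemeral, anything loaded this way has to be loaded again after a pod restart.
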
## Architecture

```
┌─────────────────┐    ┌─────────────────┐
│                 │    │                 │
│    Frontend     │────│       API       │
│   (Streamlit)   │    │    (FastAPI)    │
│   Port: 8501    │    │   Port: 8000    │
│                 │    │                 │
└─────────────────┘    └─────────────────┘
         │                      │
         │                      │
    ┌─────────┐            ┌─────────┐
    │ Ingress │            │ChromaDB │
    │         │            │  Data   │
    └─────────┘            └─────────┘
```

**Note**: ChromaDB data is stored in ephemeral storage and will be lost when pods restart. The init container will reload the data automatically on pod startup.

## Development

### Local Development with Helm

```bash
# Render templates locally
helm template stac-search ./helm-chart

# Debug with custom values
helm template stac-search ./helm-chart -f my-values.yaml --debug

# Validate chart
helm lint ./helm-chart
```

### Testing

```bash
# Dry run installation
helm install stac-search ./helm-chart --dry-run

# Test with different values
helm install stac-search ./helm-chart -f test-values.yaml --dry-run
```

helm-chart/templates/NOTES.txt

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
1. Get the application URLs by running these commands:
{{- if .Values.ingress.enabled }}
{{- range $host := .Values.ingress.hosts }}
  {{- range .paths }}
  {{- if eq .service "frontend" }}
  http{{ if $.Values.ingress.tls }}s{{ end }}://{{ $host.host }}{{ .path }}
  {{- end }}
  {{- end }}
{{- end }}
{{- else if contains "NodePort" .Values.frontend.service.type }}
  export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "stac-search.frontend.serviceName" . }})
  export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}")
  echo http://$NODE_IP:$NODE_PORT
{{- else if contains "LoadBalancer" .Values.frontend.service.type }}
     NOTE: It may take a few minutes for the LoadBalancer IP to be available.
           You can watch its status by running 'kubectl get --namespace {{ .Release.Namespace }} svc -w {{ include "stac-search.frontend.serviceName" . }}'
  export SERVICE_IP=$(kubectl get svc --namespace {{ .Release.Namespace }} {{ include "stac-search.frontend.serviceName" . }} --template "{{"{{ range (index .status.loadBalancer.ingress 0) }}{{.}}{{ end }}"}}")
  echo http://$SERVICE_IP:{{ .Values.frontend.service.port }}
{{- else if contains "ClusterIP" .Values.frontend.service.type }}
  export POD_NAME=$(kubectl get pods --namespace {{ .Release.Namespace }} -l "{{ include "stac-search.frontend.selectorLabels" . }}" -o jsonpath="{.items[0].metadata.name}")
  export CONTAINER_PORT=$(kubectl get pod --namespace {{ .Release.Namespace }} $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
  echo "Visit http://127.0.0.1:8080 to use your application"
  kubectl --namespace {{ .Release.Namespace }} port-forward $POD_NAME 8080:$CONTAINER_PORT
{{- end }}

2. Access the API directly:
{{- if .Values.ingress.enabled }}
{{- range $host := .Values.ingress.hosts }}
  {{- range .paths }}
  {{- if eq .service "api" }}
  API URL: http{{ if $.Values.ingress.tls }}s{{ end }}://{{ $host.host }}{{ .path }}
  {{- end }}
  {{- end }}
{{- end }}
{{- else }}
  export POD_NAME=$(kubectl get pods --namespace {{ .Release.Namespace }} -l "{{ include "stac-search.api.selectorLabels" . }}" -o jsonpath="{.items[0].metadata.name}")
  export CONTAINER_PORT=$(kubectl get pod --namespace {{ .Release.Namespace }} $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
  echo "Visit http://127.0.0.1:8000 to access the API"
  kubectl --namespace {{ .Release.Namespace }} port-forward $POD_NAME 8000:$CONTAINER_PORT
{{- end }}

3. Check the status of your deployment:
  kubectl get pods --namespace {{ .Release.Namespace }} -l "app.kubernetes.io/instance={{ .Release.Name }}"

4. View logs:
  # API logs
  kubectl logs --namespace {{ .Release.Namespace }} -l "{{ include "stac-search.api.selectorLabels" . }}" -f

  # Frontend logs
  kubectl logs --namespace {{ .Release.Namespace }} -l "{{ include "stac-search.frontend.selectorLabels" . }}" -f

5. ChromaDB data at /app/data in the API container is ephemeral and will be lost if the pod is restarted.

Visit the Streamlit frontend to start searching STAC collections with natural language queries!
