Commit ed210ba

Add kustomize pr overlay and update README
1 parent 3e03701 commit ed210ba

6 files changed: +172 -1 lines changed

kubernetes/README.md

Lines changed: 64 additions & 1 deletion
@@ -33,6 +33,11 @@ Copy the template and fill in your HuggingFace token (base64-encoded):
 cp base/huggingface-secret.template.yaml base/._huggingface-secret.yaml
 ```
 
+To generate a base64-encoded Hugging Face token:
+```
+echo -n "your_hf_token_here" | base64
+```
+
 ### 2. Sage User Secret
 
 Copy the Sage user secret template and add your Sage account name and password:
@@ -41,7 +46,12 @@ Copy the Sage user secret template and add your Sage account name and password:
 cp base/sage-user-secret.template.yaml base/._sage-user-secret.yaml
 ```
 
-- Encode username and password values as above.
+To generate base64-encoded `SAGE_USER` and `SAGE_PASS` values:
+```
+echo -n "your_username_here" | base64
+echo -n "your_password_here" | base64
+```
+
 - Update the `SAGE_USER` and `SAGE_PASS` fields.
 
 > **Important:**
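
Review note on the two encoding hunks above: a minimal sketch, assuming a POSIX shell with the `base64` utility, of how an encoded value can be round-tripped before it is pasted into a secret manifest. `encode_secret` is a hypothetical helper name, not part of the repository. It uses `printf '%s'` rather than `echo -n` because `echo -n` behaves differently across shells, and it strips newlines because some `base64` implementations wrap long output. The `--decode` long option is assumed to be available, as it is in GNU coreutils and on macOS.

```shell
#!/bin/sh
# Hypothetical helper (not in the repository): base64-encode a value
# without a trailing newline and without line wrapping.
encode_secret() {
    # printf '%s' emits the value with no trailing newline;
    # tr -d '\n' removes any line wrapping base64 adds to long input.
    printf '%s' "$1" | base64 | tr -d '\n'
}

# Round-trip check: decoding should give back the original value.
encoded=$(encode_secret "your_hf_token_here")
decoded=$(printf '%s' "$encoded" | base64 --decode)
if [ "$decoded" = "your_hf_token_here" ]; then
    echo "round trip ok: $encoded"
else
    echo "round trip failed" >&2
fi
```
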
@@ -66,6 +76,59 @@ Or, using kubectl (if it supports native kustomize):
 kubectl apply -k base/
 ```
 
+Deploy all services:
+```
+kubectl kustomize nrp-dev | kubectl apply -f -
+kubectl kustomize nrp-prod | kubectl apply -f -
+```
+
+Delete all services:
+```
+kubectl kustomize nrp-dev | kubectl delete -f -
+kubectl kustomize nrp-prod | kubectl delete -f -
+```
+
+Debugging - output to yaml:
+```
+kubectl kustomize nrp-dev -o hybrid-search-dev.yaml
+kubectl kustomize nrp-prod -o hybrid-search-prod.yaml
+```
+
+## Testing a Pull Request
+To test a pull request (PR), use the [prs](/kubernetes/prs/) overlay. GitHub Actions is set up to build an image for each PR so that an instance of the image search deployed on k8s can be tested manually (and, in the future, automatically).
+
+The following manual steps are required for now:
+- [kubernetes/prs/kustomization.yaml](/kubernetes/prs/kustomization.yaml)
+  - change the `namePrefix` to the name of the PR
+  - change `commonLabels.env` to the name of the PR
+  - change the `newTag` to the name of the PR for each service that needs it
+- port-forward any of the services you want to test (replace `pr` in the service names with the name of the PR):
+  - `kubectl port-forward svc/pr-triton 8001:8001`: triton endpoint for calling the LLM models locally
+  - `kubectl port-forward svc/pr-gradio-ui 7860:7860`: Search UI
+  - `kubectl port-forward svc/pr-weaviate 8080:8080`: Weaviate REST endpoint
+  - `kubectl port-forward svc/pr-weaviate 50051:50051`: Weaviate gRPC endpoint
+  - `kubectl port-forward svc/pr-weavloader-metrics 5555:5555`: Weavloader Flower endpoint
+  - `kubectl port-forward svc/pr-weavloader-metrics 8081:8080`: Weavloader Prometheus endpoint
+
+Deploy:
+```
+kubectl kustomize prs | kubectl apply -f -
+```
+
+Delete all services:
+```
+kubectl kustomize prs | kubectl delete -f -
+```
+
+Debugging - output to yaml:
+```
+kubectl kustomize prs -o hybrid-search-pr.yaml
+```
+
+Notes:
+- Make sure your PR is up to date with `main` so that the `latest` tag reflects the current state of the services the PR does not modify. You can also check this with the local [docker-compose](/docker-compose.yml) deployment (once the PR is up to date with `main`) to confirm that the PR's changes work with the unmodified services.
+- You can also combine this overlay with a local docker compose instance to use a triton instance that has an NVIDIA GPU. This involves commenting out the triton ports in the docker compose manifest and running the `kubectl port-forward` commands described above.
+
 ## Managing and Customizing
 
 You can extend or patch this `base/` deployment using kustomize overlays for different environments, resource limits, or development setups. See included overlays (such as those in benchmark subfolders) for example usage.
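
Review note: the manual `kustomization.yaml` edits listed in the README could be scripted. The following is a minimal sketch, not part of this commit. `retarget_overlay` is a hypothetical helper, and it assumes the file keeps one `namePrefix: <name>-` line and one `env: <name>` label line, as in the overlay added by this commit. It leaves `newTag` alone, since which services need a PR tag depends on what the PR changed.

```shell
#!/bin/sh
# Hypothetical helper: point the prs overlay at a different PR name.
# Usage: retarget_overlay <kustomization.yaml> <pr-name>
# Assumes <pr-name> contains no "/" (it is used inside a sed expression).
retarget_overlay() {
    file=$1
    pr=$2
    tmp=$(mktemp) || return 1
    # Rewrite the top-level namePrefix line and the env label line,
    # preserving whatever indentation the env line already has.
    sed -e "s/^namePrefix: .*/namePrefix: ${pr}-/" \
        -e "s/^\([[:space:]]*\)env: .*/\1env: ${pr}/" \
        "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example against a throwaway copy of the relevant lines.
demo=$(mktemp)
printf 'namePrefix: pr2-\ncommonLabels:\n  env: pr2\n' > "$demo"
retarget_overlay "$demo" pr7
cat "$demo"
rm -f "$demo"
```

A temporary file plus `mv` is used instead of `sed -i` because the in-place flag takes different arguments on GNU and BSD sed.
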

kubernetes/prs/gpus.yaml

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: PLACEHOLDER
+spec:
+  template:
+    spec:
+      affinity:
+        nodeAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+            nodeSelectorTerms:
+              - matchExpressions:
+                  - key: nvidia.com/gpu.product
+                    operator: In
+                    values:
+                      - NVIDIA-A10
+      tolerations:
+        - key: nautilus.io/reservation
+          operator: Equal
+          value: sage
+          effect: NoSchedule

kubernetes/prs/gradio-env.yaml

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: gradio-ui
+spec:
+  template:
+    spec:
+      containers:
+        - name: gradio-ui
+          env:
+            - name: UNALLOWED_NODES
+              value: "W042,N001,V012,W015,W01C,W01E,W024,W026,W02C,W02D,W02E,W02F,W031,W040,W046,W047,W048,W049,W04A,W051,W055,W059,W05A,W05B,W05C,W05D,W05E,W05F,W060,W061,W062,W063,W064,W065,W066,W06E,W072,W073,W074,W075,W076,W077,W078,W079,W07A,W07B,W07D,W07E,W07F,W080,W081,W086,W088,W089,W08A,W08B,W08D,W08E,W08F,W090,W091,W092,W094,W096,W099,W09B,W09E,W0A0,W0A1,W0BB,W0BC"
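
Review note: how `UNALLOWED_NODES` is consumed is not shown in this commit. Assuming the service treats it as a comma-separated list of node IDs and matches whole entries, a membership check can be sketched as follows. `is_unallowed` is a hypothetical name, and the list is shortened for the example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if node ID $1 is an entry in the
# comma-separated list $2 (whole-entry match, not substring match).
is_unallowed() {
    case ",$2," in
        *",$1,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Shortened list for illustration; the patch carries the full value.
UNALLOWED_NODES="W042,N001,V012,W015"

for node in W042 W04; do
    if is_unallowed "$node" "$UNALLOWED_NODES"; then
        echo "$node: unallowed"
    else
        echo "$node: allowed"
    fi
done
```

Wrapping both the list and the candidate in commas keeps a prefix such as `W04` from matching `W042`.
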

kubernetes/prs/kustomization.yaml

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+namespace: sage
+
+namePrefix: pr2-
+commonLabels:
+  env: pr2
+
+resources:
+  - ../base
+
+patches:
+  - path: gpus.yaml
+    target:
+      kind: Deployment
+      labelSelector: "app in (reranker-transformers, triton)"
+  - path: taint.yaml
+    target:
+      kind: Deployment
+      labelSelector: "app in (gradio-ui, weaviate, weavloader)"
+  - path: gradio-env.yaml
+    target:
+      kind: Deployment
+      labelSelector: "app=gradio-ui"
+  - path: weavloader-env.yaml
+    target:
+      kind: Deployment
+      labelSelector: "app=weavloader"
+
+images:
+  - name: semitechnologies/reranker-transformers
+    newTag: cross-encoder-ms-marco-MiniLM-L-6-v2-latest
+  - name: semitechnologies/weaviate
+    newTag: 1.32.0
+  - name: gitlab-registry.nrp-nautilus.io/ndp/sage/nrp-image-search/triton
+    newTag: pr-2
+  - name: gitlab-registry.nrp-nautilus.io/ndp/sage/nrp-image-search/gradio-ui
+    newTag: latest
+  - name: gitlab-registry.nrp-nautilus.io/ndp/sage/nrp-image-search/weavmanage
+    newTag: latest
+  - name: gitlab-registry.nrp-nautilus.io/ndp/sage/nrp-image-search/weavloader
+    newTag: pr-2
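
Review note: when checking this overlay it helps to see at a glance which images are pinned to a PR tag and which still follow `latest`. The following is a minimal sketch using `awk`, assuming each `images` entry is a `- name:` line followed by a `newTag:` line, as in the file above. `list_image_tags` is a hypothetical helper, and the image names in the demo are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print "<image> <tag>" for each images entry
# in a kustomization file.
list_image_tags() {
    awk '
        # "- name: <image>" starts an entry; remember the image name.
        $1 == "-" && $2 == "name:" { name = $3 }
        # "newTag: <tag>" completes the entry.
        $1 == "newTag:" && name != "" { print name, $2; name = "" }
    ' "$1"
}

# Demo on a small stand-in file with placeholder image names.
demo=$(mktemp)
cat > "$demo" <<'EOF'
images:
  - name: example/triton
    newTag: pr-2
  - name: example/gradio-ui
    newTag: latest
EOF
list_image_tags "$demo"
rm -f "$demo"
```
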

kubernetes/prs/taint.yaml

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: PLACEHOLDER
+spec:
+  template:
+    spec:
+      affinity:
+        nodeAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+            nodeSelectorTerms:
+              - matchExpressions:
+                  - key: nautilus.io/reservation
+                    operator: In
+                    values:
+                      - sage
+      tolerations:
+        - key: nautilus.io/reservation
+          operator: Equal
+          value: sage
+          effect: NoSchedule

kubernetes/prs/weavloader-env.yaml

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: weavloader
+spec:
+  template:
+    spec:
+      containers:
+        - name: weavloader
+          env:
+            - name: UNALLOWED_NODES
+              value: "W042,N001,V012,W015,W01C,W01E,W024,W026,W02C,W02D,W02E,W02F,W031,W040,W046,W047,W048,W049,W04A,W051,W055,W059,W05A,W05B,W05C,W05D,W05E,W05F,W060,W061,W062,W063,W064,W065,W066,W06E,W072,W073,W074,W075,W076,W077,W078,W079,W07A,W07B,W07D,W07E,W07F,W080,W081,W086,W088,W089,W08A,W08B,W08D,W08E,W08F,W090,W091,W092,W094,W096,W099,W09B,W09E,W0A0,W0A1,W0BB,W0BC"
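
Review note: `gradio-env.yaml` and `weavloader-env.yaml` appear to carry the same `UNALLOWED_NODES` value, so the two copies can drift apart over time. The following is a minimal sketch of a consistency check, assuming the `value:` line directly follows `- name: UNALLOWED_NODES` as in both patches. `unallowed_value` and `same_unallowed` are hypothetical helpers, and the demo uses shortened stand-in files.

```shell
#!/bin/sh
# Hypothetical helper: print the UNALLOWED_NODES value from a patch file,
# with surrounding double quotes removed.
unallowed_value() {
    awk '
        # The previous line named UNALLOWED_NODES; this one holds its value.
        found && $1 == "value:" { gsub(/"/, "", $2); print $2; exit }
        # Remember whether the current line is the UNALLOWED_NODES name line.
        { found = ($1 == "-" && $2 == "name:" && $3 == "UNALLOWED_NODES") }
    ' "$1"
}

# Hypothetical helper: succeed if both files carry the same value.
same_unallowed() {
    [ "$(unallowed_value "$1")" = "$(unallowed_value "$2")" ]
}

# Demo on two shortened stand-in files.
a=$(mktemp)
b=$(mktemp)
printf '  env:\n    - name: UNALLOWED_NODES\n      value: "W042,N001"\n' > "$a"
printf '  env:\n    - name: UNALLOWED_NODES\n      value: "W042,N001"\n' > "$b"
if same_unallowed "$a" "$b"; then
    echo "UNALLOWED_NODES values match"
else
    echo "UNALLOWED_NODES values differ" >&2
fi
rm -f "$a" "$b"
```
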
