Commit 7cf1a74

feat: update kserve setup info (#141)
1 parent f060ea2 commit 7cf1a74

1 file changed: +57 -13 lines changed


docs/kserve.md

Lines changed: 57 additions & 13 deletions
@@ -52,19 +52,13 @@ Replace `your_namespace` with the desired namespace where you want to install KS

 For more detailed configuration and usage, refer to the [KServe Documentation](https://kserve.github.io/website/docs/admin-guide/kubernetes-deployment).

-
-
-Here's a README section that explains the application of necessary roles and role bindings for managing permissions within the Kubernetes cluster:
-
----
-
 # Role and RoleBinding Configuration for AI DIAL Admin

 This section describes the configuration of necessary roles and role bindings to manage permissions for the AI DIAL Admin components within the Kubernetes cluster. These configurations ensure that the appropriate access controls are in place for managing inference services and related resources.

 ## Role Configuration

-The following Role configuration grants permissions to manage inference services and other resources within the `kserve-models` namespace.
+The following Role configuration grants permissions to manage inference services and other resources within the `<model-namespace>` namespace.

 ### Role Manifest

@@ -73,7 +67,7 @@ apiVersion: rbac.authorization.k8s.io/v1
 kind: Role
 metadata:
   name: ai-dial-admin-deployment-role
-  namespace: kserve-models
+  namespace: <model-namespace>
 rules:
   - apiGroups:
       - 'serving.kserve.io'
@@ -116,7 +110,7 @@ apiVersion: rbac.authorization.k8s.io/v1
 kind: RoleBinding
 metadata:
   name: ai-dial-admin-deployment-role
-  namespace: kserve-models
+  namespace: <model-namespace>
 subjects:
   - kind: ServiceAccount
     name: ai-dial-test-admin-deployment-manager-backend
@@ -135,7 +129,57 @@ To apply these configurations to your Kubernetes cluster, follow these steps:

 2. **Apply the Manifests**: Use the `kubectl` command-line tool to apply the manifests to your cluster. Run the following commands in your terminal:

-```bash
-kubectl apply -f role.yaml
-kubectl apply -f rolebinding.yaml
-```
+```bash
+kubectl apply -f role.yaml
+kubectl apply -f rolebinding.yaml
+```
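
Once the manifests are applied, one way to sanity-check the binding is to ask the API server whether the bound service account can act on inference services. The sketch below wraps `kubectl auth can-i` with impersonation. The function name `can_manage_isvc` is hypothetical, the `inferenceservices` resource is inferred from the description of the Role (the diff does not show the full rule list), and the service account's namespace is a parameter because the diff does not show it.

```bash
# Hypothetical helper (not part of the repository).
# Asks the API server whether a service account may perform VERB on
# KServe InferenceService resources in the model namespace.
# Usage: can_manage_isvc <model-namespace> <sa-namespace> <sa-name> <verb>
can_manage_isvc() {
  local model_ns="$1" sa_ns="$2" sa_name="$3" verb="$4"
  kubectl auth can-i "$verb" inferenceservices.serving.kserve.io \
    --namespace "$model_ns" \
    --as "system:serviceaccount:${sa_ns}:${sa_name}"
}
```

For example, `can_manage_isvc <model-namespace> <sa-namespace> ai-dial-test-admin-deployment-manager-backend get` prints `yes` or `no`, depending on whether the Role and RoleBinding grant that verb. Impersonation with `--as` requires that the caller is itself allowed to impersonate service accounts.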
+
+# Using Hugging Face Token
+
+To deploy models from private Hugging Face repositories, follow these steps:
+
+1. **Update the default `ClusterStorageContainer`**: Remove the `hf://` entry from its `supportedUriFormats` list. This prevents conflicts between the default and custom storage containers when resolving Hugging Face model URIs. Note that `ClusterStorageContainer` is a cluster-scoped resource, so this change applies globally.
+
+2. **Create a Kubernetes secret** with your Hugging Face access token in the namespace where your models will be deployed:
+
+```bash
+kubectl create secret generic hf-secret \
+  --from-literal=HF_TOKEN=<your_hf_token_here> \
+  -n <model-namespace>
+```
+
+3. **Create a custom `ClusterStorageContainer`** that references the secret:
+
+```yaml
+apiVersion: "serving.kserve.io/v1alpha1"
+kind: ClusterStorageContainer
+metadata:
+  name: hf-hub
+spec:
+  container:
+    image: kserve/storage-initializer:v0.16.0
+    name: storage-initializer
+    env:
+      - name: HF_TOKEN
+        valueFrom:
+          secretKeyRef:
+            name: hf-secret
+            key: HF_TOKEN
+            optional: false
+    resources:
+      limits:
+        cpu: "1"
+        memory: 1Gi
+      requests:
+        cpu: 100m
+        memory: 100Mi
+    securityContext:
+      allowPrivilegeEscalation: false
+      capabilities:
+        drop:
+          - ALL
+      privileged: false
+      runAsNonRoot: true
+  supportedUriFormats:
+    - prefix: hf://
+```
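
The three steps above can be checked from the command line roughly as follows. This is a sketch under stated assumptions, not the project's own tooling: the function names are hypothetical, the manifest file name `hf-storage-container.yaml` is an assumed default, and the name of the default `ClusterStorageContainer` is passed as an argument rather than assumed. Only standard `kubectl` subcommands are used.

```bash
# Hypothetical helpers (not part of the repository).

# Step 1 (inspect): list every ClusterStorageContainer with the URI prefixes
# it claims, so a leftover hf:// entry on the default container is visible.
list_storage_prefixes() {
  kubectl get clusterstoragecontainers.serving.kserve.io \
    -o custom-columns='NAME:.metadata.name,PREFIXES:.spec.supportedUriFormats[*].prefix'
}

# Step 1 (edit): open the named container in an editor to remove its hf:// entry.
edit_storage_container() {
  local name="$1"
  kubectl edit clusterstoragecontainers.serving.kserve.io "$name"
}

# Step 2 (verify): confirm the token secret exists in the model namespace.
check_hf_secret() {
  local namespace="$1"
  kubectl get secret hf-secret --namespace "$namespace"
}

# Step 3 (apply): create the custom container from a manifest file.
# The default file name is an assumption; pass another path as $1 if needed.
apply_hf_container() {
  local manifest="${1:-hf-storage-container.yaml}"
  kubectl apply -f "$manifest"
}
```

A typical sequence would be `list_storage_prefixes`, then `edit_storage_container <default-container-name>`, then `check_hf_secret <model-namespace>`, and finally `apply_hf_container`. After that, running `list_storage_prefixes` again should show `hf://` only on `hf-hub`.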
