docs/kserve.md: 57 additions & 13 deletions
@@ -52,19 +52,13 @@ Replace `your_namespace` with the desired namespace where you want to install KS
 
 For more detailed configuration and usage, refer to the [KServe Documentation](https://kserve.github.io/website/docs/admin-guide/kubernetes-deployment).
 
-
-
-Here's a README section that explains the application of necessary roles and role bindings for managing permissions within the Kubernetes cluster:
-
----
-
 # Role and RoleBinding Configuration for AI DIAL Admin
 
 This section describes the configuration of necessary roles and role bindings to manage permissions for the AI DIAL Admin components within the Kubernetes cluster. These configurations ensure that the appropriate access controls are in place for managing inference services and related resources.
 
 ## Role Configuration
 
-The following Role configuration grants permissions to manage inference services and other resources within the `kserve-models` namespace.
+The following Role configuration grants permissions to manage inference services and other resources within the `<model-namespace>` namespace.
@@ -135,7 +129,57 @@ To apply these configurations to your Kubernetes cluster, follow these steps:
 
 2. **Apply the Manifests**: Use the `kubectl` command-line tool to apply the manifests to your cluster. Run the following commands in your terminal:
 
-```bash
-kubectl apply -f role.yaml
-kubectl apply -f rolebinding.yaml
-```
+```bash
+kubectl apply -f role.yaml
+kubectl apply -f rolebinding.yaml
+```
+
+# Using Hugging Face Token
+
+To deploy models from private Hugging Face repositories, follow these steps:
+
+1. **Update the default `ClusterStorageContainer`**: Remove the `hf://` entry from its `supportedUriFormats` list. This prevents conflicts between the default and custom storage containers when resolving Hugging Face model URIs. Note that `ClusterStorageContainer` is a cluster-scoped resource, so this change applies globally.
+
+2. **Create a Kubernetes secret** with your Hugging Face access token in the namespace where your models will be deployed:
+
+```bash
+kubectl create secret generic hf-secret \
+--from-literal=HF_TOKEN=<your_hf_token_here> \
+-n <model-namespace>
+```
+
+3. **Create a custom `ClusterStorageContainer`** that references the secret:
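
As a sanity check for steps 1 and 2 above, the following is a minimal sketch rather than part of the documented procedure. The function names `list_storage_prefixes`, `hf_claimants`, and `hf_token_length` are hypothetical helpers introduced here for illustration. The sketch assumes each `supportedUriFormats` entry uses a `prefix` field (entries that use `regex` instead would not be listed) and reuses the `hf-secret` name and `HF_TOKEN` key from step 2.

```shell
#!/usr/bin/env bash
# Sketch only: the helper names below are illustrative, not part of KServe or kubectl.

# Print every ClusterStorageContainer with the URI prefixes it claims,
# one container per line: "<name><TAB><prefix> <prefix> ...".
# Assumption: entries are declared with a `prefix` field.
list_storage_prefixes() {
  kubectl get clusterstoragecontainers \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.supportedUriFormats[*].prefix}{"\n"}{end}'
}

# Print the names of the containers that still claim hf://.
# More than one name means the conflict described in step 1 is still possible.
hf_claimants() {
  list_storage_prefixes | awk -F'\t' '$2 ~ /hf:\/\// { print $1 }'
}

# Print the byte length of the decoded HF_TOKEN value in the given namespace,
# which confirms the key exists without echoing the token itself.
hf_token_length() {
  kubectl get secret hf-secret -n "$1" -o jsonpath='{.data.HF_TOKEN}' \
    | base64 --decode | wc -c | tr -d '[:space:]'
}
```

After sourcing the file, running `hf_claimants` lists the containers that still handle `hf://`, and `hf_token_length` followed by the model namespace prints a non-zero length if the secret was created with the expected key.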