File tree Expand file tree Collapse file tree 1 file changed +5
-5
lines changed Expand file tree Collapse file tree 1 file changed +5
-5
lines changed Original file line number Diff line number Diff line change @@ -166,12 +166,12 @@ Next you create a container app with the NVIDIA GPU Cloud API key.
166
166
--image $ACR_NAME.azurecr.io/$CONTAINER_AND_TAG \
167
167
--cpu 24 \
168
168
--memory 220 \
169
- --gpu "NVIDIAA100" \
169
+ --target-port 8000 \
170
+ --ingress external \
170
171
--secrets ngc-api-key=<PASTE_NGC_API_KEY_HERE> \
171
172
--env-vars NGC_API_KEY=secretref:ngc-api-key \
172
173
--registry-server $ACR_NAME.azurecr.io \
173
- --registry-username <ACR_USERNAME> \
174
- --registry-password <ACR_PASSWORD> \
174
+ --workload-profile-name LLAMA_PROFILE \
175
175
--query properties.configuration.ingress.fqdn
176
176
```
177
177
@@ -189,8 +189,8 @@ curl -X POST \
189
189
-H ' accept: application/json' \
190
190
-H ' Content-Type: application/json' \
191
191
-d ' {
192
- " model" : " meta/llama3 -8b-instruct" ,
193
- " prompt" : " Once upon a time" ,
192
+ " model" : " meta/llama-3.1 -8b-instruct" ,
193
+ " prompt" : [{ " role " : " user " , " content " : " Once upon a time... " }] ,
194
194
" max_tokens" : 64
195
195
}'
196
196
```
You can’t perform that action at this time.
0 commit comments