1 parent 90dccf7 commit 43a2895
recipes/inference/model_servers/llama-on-prem.md
@@ -145,7 +145,6 @@ Then run the command below to deploy a quantized version of the Llama 3 8b chat
```
docker run --gpus all --shm-size 1g -e HUGGING_FACE_HUB_TOKEN=$token -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:2.0 --model-id $model
-
After this, you'll be able to run the command below on another terminal: