Commit edab9cd

pagezyhf and julien-c authored
model_name raising error because of the dot (#1459)

* model_name raising error because of the .
* Update docs/sagemaker/inference.md

Co-authored-by: Julien Chaumond <[email protected]>
1 parent 88ffdc5 commit edab9cd

File tree

1 file changed: +3 −3 lines


docs/sagemaker/inference.md (3 additions, 3 deletions)

````diff
@@ -358,12 +358,12 @@ You should also define `SM_NUM_GPUS`, which specifies the tensor parallelism deg
 Note that you can optionally reduce the memory and computational footprint of the model by setting the `HF_MODEL_QUANTIZE` environment variable to `true`, but this lower weight precision could affect the quality of the output for some models.

 ```python
-model_name = "llama-3.1-8b-instruct" + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
+model_name = "llama-3-1-8b-instruct" + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())

 hub = {
-	'HF_MODEL_ID':'EleutherAI/gpt-neox-20b',
+	'HF_MODEL_ID':'meta-llama/Llama-3.1-8B-Instruct',
 	'SM_NUM_GPUS':'1',
-	'HUGGING_FACE_HUB_TOKEN': '<REPLACE WITH YOUR TOKEN>'
+	'HUGGING_FACE_HUB_TOKEN': '<REPLACE WITH YOUR TOKEN>',
 }

 assert hub['HUGGING_FACE_HUB_TOKEN'] != '<REPLACE WITH YOUR TOKEN>', "You have to provide a token."
````
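The motivation for dropping the dot: SageMaker model and endpoint names only accept alphanumeric characters and hyphens, so `"llama-3.1-8b-instruct"` is rejected while `"llama-3-1-8b-instruct"` is valid. A minimal sketch of a sanitizing helper (`sanitize_name` is hypothetical, not part of the docs change) that makes an arbitrary model ID safe to use as a name:

```python
import re
import time

def sanitize_name(raw: str) -> str:
    """Replace characters SageMaker names disallow (anything other than
    alphanumerics and hyphens) with hyphens, collapsing runs of hyphens
    and trimming them from both ends."""
    name = re.sub(r"[^a-zA-Z0-9-]+", "-", raw)
    return re.sub(r"-{2,}", "-", name).strip("-")

# "llama-3.1-8b-instruct" contains a dot, which would raise an error;
# sanitizing yields "llama-3-1-8b-instruct", matching the committed fix.
model_name = sanitize_name("llama-3.1-8b-instruct") + time.strftime(
    "%Y-%m-%d-%H-%M-%S", time.gmtime()
)
```

This also handles full Hub IDs such as `meta-llama/Llama-3.1-8B-Instruct`, where both the slash and the dot would otherwise be rejected.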
