| recipe_container_env | string | No | Values of the recipe container init arguments. See the Blueprint Arguments section below for details. Example: `[{"key": "tensor_parallel_size","value": "2"},{"key": "model_name","value": "NousResearch/Meta-Llama-3.1-8B-Instruct"},{"key": "Model_Path","value": "/models/NousResearch/Meta-Llama-3.1-8B-Instruct"}]` |
| skip_capacity_validation | boolean | No | Determines whether shape capacity validation checks are performed before initiating deployment. If your deployment fails validation with capacity errors but you believe capacity is available, set `skip_capacity_validation` to `true` in the recipe JSON to bypass all shape capacity checks. |

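Taken together, these fields might appear in a recipe as in the following minimal sketch (other required recipe fields are omitted; the values are the examples from the table above):

```json
{
  "recipe_container_env": [
    {"key": "tensor_parallel_size", "value": "2"},
    {"key": "model_name", "value": "NousResearch/Meta-Llama-3.1-8B-Instruct"},
    {"key": "Model_Path", "value": "/models/NousResearch/Meta-Llama-3.1-8B-Instruct"}
  ],
  "skip_capacity_validation": true
}
```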
For autoscaling parameters, visit [autoscaling](sample_blueprints/model_serving/auto_scaling/README.md).

For multinode inference parameters, visit [multinode inference](sample_blueprints/model_serving/multi-node-inference/README.md).

For MIG parameters, visit [MIG shared pool configurations](sample_blueprints/model_serving/mig_multi_instance_gpu/mig_inference_single_replica.json), [update MIG configuration](sample_blueprints/model_serving/mig_multi_instance_gpu/mig_inference_single_replica.json), and [MIG recipe configuration](sample_blueprints/model_serving/mig_multi_instance_gpu/mig_inference_single_replica.json).

### Blueprint Container Arguments

There are three blueprints that we provide out of the box. The following example recipe.json snippets can be used to deploy these blueprints quickly for a test run.
|Blueprint|Scenario|Sample JSON|
|----|----|----|
|LLM Inference using NVIDIA shapes and vLLM|Deployment with the default Llama-3.1-8B model using PAR|View sample JSON [here](sample_blueprints/model_serving/llm_inference_with_vllm/vllm-open-hf-model.json)|
|MLCommons Llama-2 Quantized 70B LORA Fine-Tuning on A100|Default deployment with model and dataset ingested using PAR|View sample JSON [here](sample_blueprints/gpu_benchmarking/lora-benchmarking/mlcommons_lora_finetune_nvidia_sample_recipe.json)|
|LORA Fine-Tune Blueprint|Open-access model, open-access dataset downloaded from Hugging Face (no token required)|View sample JSON [here](sample_blueprints/model_fine_tuning/lora-fine-tuning/open_model_open_dataset_hf.backend.json)|
|LORA Fine-Tune Blueprint|Closed-access model, open-access dataset downloaded from Hugging Face (a valid auth token is required)|View sample JSON [here](sample_blueprints/model_fine_tuning/lora-fine-tuning/closed_model_open_dataset_hf.backend.json)|
|LORA Fine-Tune Blueprint|Bucket model, open-access dataset downloaded from Hugging Face (no token required)|View sample JSON [here](sample_blueprints/model_fine_tuning/lora-fine-tuning/bucket_par_open_dataset.backend.json)|
|LORA Fine-Tune Blueprint|Model fetched from a bucket in another region/tenancy using Pre-Authenticated Requests (PAR), open-access dataset downloaded from Hugging Face (no token required)|View sample JSON [here](sample_blueprints/model_fine_tuning/lora-fine-tuning/bucket_model_open_dataset.backend.json)|
|LORA Fine-Tune Blueprint|Bucket model, bucket checkpoint, open-access dataset downloaded from Hugging Face (no token required)|View sample JSON [here](sample_blueprints/model_fine_tuning/lora-fine-tuning/bucket_par_open_dataset.backend.json)|
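The sample recipes above all pass container init arguments through the `recipe_container_env` key/value list described earlier. As a minimal sketch of how that list flattens into an ordinary environment mapping (the `env_as_dict` helper is hypothetical, not part of the platform):

```python
import json

# Example recipe fragment using the key/value list format from the parameter table.
recipe = json.loads("""
{
  "recipe_container_env": [
    {"key": "tensor_parallel_size", "value": "2"},
    {"key": "model_name", "value": "NousResearch/Meta-Llama-3.1-8B-Instruct"}
  ],
  "skip_capacity_validation": true
}
""")

def env_as_dict(recipe: dict) -> dict:
    """Flatten the key/value pair list into a plain mapping (hypothetical helper)."""
    return {e["key"]: e["value"] for e in recipe.get("recipe_container_env", [])}

env = env_as_dict(recipe)
print(env["tensor_parallel_size"])  # "2"
```

Note that values are strings even when they look numeric, matching the examples in the parameter table.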

## Undeploy a Blueprint
