Commit 69041e5
Author: sd109
Decrease readinessProbe period to 10s
1 parent 2c1db4e

2 files changed: +2 −1 lines


README.md

Lines changed: 1 addition & 0 deletions
@@ -44,6 +44,7 @@ The following is a non-exhaustive list of models which have been tested with this
 - [AWQ Quantized Llama 2 70B](https://huggingface.co/TheBloke/Llama-2-70B-Chat-AWQ)
 - [Magicoder 6.7B](https://huggingface.co/ise-uiuc/Magicoder-S-DS-6.7B)
 - [Mistral 7B Instruct v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
+- [WizardCoder Python 34B](https://huggingface.co/WizardLM/WizardCoder-Python-34B-V1.0)
 <!-- - [AWQ Quantized Mixtral 8x7B Instruct v0.1](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ) (Not producing output properly) -->
 
 Due to the combination of [components](##Components) used in this app, some HuggingFace models may not work as expected (usually due to the way in which LangChain formats the prompt messages). Any errors when using new model will appear in the logs for either the web-app pod or the backend API pod. Please open an issue if you would like explicit support for a specific model which is not in the above list.

chart/templates/api/deployment.yml

Lines changed: 1 addition & 1 deletion
@@ -50,7 +50,7 @@ spec:
           httpGet:
             port: 8000
             path: /health
-          periodSeconds: 60
+          periodSeconds: 10
         resources:
           limits:
            nvidia.com/gpu: {{ .Values.api.gpus | int }}
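
For reference, a minimal sketch of the probe block this hunk touches as it would look after the change. Only the port, path, and periodSeconds values come from the chart; the comments about failure behaviour assume Kubernetes' default probe settings (failureThreshold: 3, successThreshold: 1).

    readinessProbe:
      httpGet:
        port: 8000        # backend API container port from the chart
        path: /health     # health endpoint polled by the kubelet
      periodSeconds: 10   # was 60; with the default failureThreshold of 3, an
                          # unhealthy pod now drops out of the Service endpoints
                          # after roughly 30s instead of 180s

The shorter period also means a pod is marked Ready within about 10s of /health starting to respond, at the cost of slightly more frequent probe traffic.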
