You may want to save costs by developing against a local LLM server, such as
[llamafile](https://github.com/Mozilla-Ocho/llamafile/). Note that a local LLM
will generally be slower and not as sophisticated.

Once the local LLM server is running and serving an OpenAI-compatible endpoint, set these environment variables:

```shell
azd env set USE_VECTORS false
azd env set OPENAI_HOST local
azd env set OPENAI_BASE_URL <your local endpoint>
azd env set AZURE_OPENAI_CHATGPT_MODEL local-model-name
```

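If you want to confirm the endpoint answers OpenAI-style requests before going further, a quick request like the sketch below should return a chat completion. The URL and model name are placeholders; substitute the values you set above.

```shell
# Placeholder URL and model name; use the values you set above.
curl <your local endpoint>/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "local-model-name", "messages": [{"role": "user", "content": "Say hello"}]}'
```
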
Then restart the local development server. You should now be able to use the "Ask" tab.

⚠️ Limitations:

- The "Chat" tab will only work if the local language model supports function calling (see the sketch after this list for a quick way to check).
- Your search mode must be text only (no vectors), since the search index is only populated with OpenAI-generated embeddings, and the local OpenAI host can't generate those.
- The conversation history will be truncated using the GPT tokenizers, which may not match the local model's tokenizer, so a long conversation may run into token limit errors.

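One way to check whether your local model handles function calling is to send a request that declares a tool and see whether the response contains a `tool_calls` entry. This is a rough sketch against the OpenAI-compatible chat completions API; the URL, model name, and `get_weather` tool are placeholders:

```shell
# Placeholder URL, model name, and tool definition.
curl <your local endpoint>/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "local-model-name",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'
```

If the model supports function calling, the JSON response should include a `tool_calls` array; otherwise you will typically get a plain-text answer or an error.
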
> [!NOTE]
> You must set `OPENAI_HOST` back to a non-local value ("azure", "azure_custom", or "openai")
> before running `azd up` or `azd provision`, since the deployed backend can't access your local server.

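For example, if you were originally using Azure OpenAI, switch back before provisioning:

```shell
azd env set OPENAI_HOST azure
azd up
```
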
### Using Ollama server

For example, to point at a local Ollama server running the `llama3.1:8b` model:

```shell
azd env set OPENAI_HOST local
azd env set OPENAI_BASE_URL http://localhost:11434/v1
azd env set AZURE_OPENAI_CHATGPT_MODEL llama3.1:8b
azd env set USE_VECTORS false
```

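This assumes the Ollama server is already running on its default port (11434) and that the model has been downloaded. If not, something along these lines should get it going; the exact steps depend on how you installed Ollama:

```shell
# Download the model, then make sure the Ollama server is running.
# (Skip `ollama serve` if Ollama already runs as a background service.)
ollama pull llama3.1:8b
ollama serve
```
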
If you're running the app inside a VS Code Dev Container, use this local URL instead:

```shell
azd env set OPENAI_BASE_URL http://host.docker.internal:11434/v1
```

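To confirm the Dev Container can actually reach the Ollama server on the host, a quick request to the models endpoint from inside the container should list your local models (assuming `curl` is available in the container):

```shell
curl http://host.docker.internal:11434/v1/models
```
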
### Using llamafile server

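If you haven't started the server yet, a llamafile is a single self-contained executable: roughly, you make it executable and run it, and it serves a local web server on port 8080 by default. The filename below is a placeholder for whichever llamafile you downloaded:

```shell
# Placeholder filename; use the llamafile you downloaded.
# By default this starts a local server on http://localhost:8080.
chmod +x your-model.llamafile
./your-model.llamafile
```
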
To point at a local llamafile server running on its default port:

```shell
azd env set OPENAI_HOST local
azd env set OPENAI_BASE_URL http://localhost:8080/v1
azd env set USE_VECTORS false
```

Llamafile does *not* require a model name to be specified.

If you're running the app inside a VS Code Dev Container, use this local URL instead:

```shell
azd env set OPENAI_BASE_URL http://host.docker.internal:8080/v1
```