docs/advanced-usage/local-models.md (5 additions, 101 deletions)
@@ -21,113 +21,17 @@ Roo Code currently supports two main local model providers:
1. **Ollama:** A popular open-source tool for running large language models locally. It supports a wide range of models.
2. **LM Studio:** A user-friendly desktop application that simplifies the process of downloading, configuring, and running local models. It also provides a local server that emulates the OpenAI API.

-## Setting Up Ollama
+## Setting Up Local Models
-1. **Download and Install Ollama:** Download the Ollama installer for your operating system from the [Ollama website](https://ollama.com/). Follow the installation instructions. Make sure Ollama is running.
+For detailed setup instructions, see:

-```bash
-ollama serve
-```
+* [Setting up Ollama](../providers/ollama)
+* [Setting up LM Studio](../providers/lmstudio)
-2. **Download a Model:** Ollama supports many different models. You can find a list of available models on the [Ollama website](https://ollama.com/library). Some recommended models for coding tasks include:
-    * `codellama:7b-code` (good starting point, smaller)
-    * `codellama:13b-code` (better quality, larger)
-    * `codellama:34b-code` (even better quality, very large)
-    * `deepseek-coder:6.7b-base` (good for coding tasks)
-    * `llama3:8b-instruct-q5_1` (good for general tasks)
-To download a model, open your terminal and run:

-```bash
-ollama pull <model_name>
-```

-For example:

-```bash
-ollama pull qwen2.5-coder:32b
-```
-3. **Configure the Model:** By default, Ollama uses a context window size of 2048 tokens, which is too small for Roo Code requests. You need at least a 12k-token context to get decent results, ideally 32k. To configure a model, you need to set its parameters and save a copy of it under a new name.
-##### Using the Ollama runtime
-Load the model (we will use `qwen2.5-coder:32b` as an example):

-```bash
-ollama run qwen2.5-coder:32b
-```
-Change the context size parameter:

-```bash
-/set parameter num_ctx 32768
-```

-Save the model with a new name:

-```bash
-/save your_model_name
-```
-##### Using the Ollama command line
-Alternatively, you can write all your settings into a text file and generate the model on the command line.

-Create a text file with the model settings and save it (for example, `~/qwen2.5-coder-32k.txt`). Here we've only used the `num_ctx` parameter, but you could include more parameters on subsequent lines using the `PARAMETER name value` syntax.

-```text
-FROM qwen2.5-coder:32b
-# sets the context window size to 32768; this controls how many tokens the LLM can use as context to generate the next token
-PARAMETER num_ctx 32768
-```
-Change directory to the `.ollama/models` directory. On most Macs, that's `~/.ollama/models` by default (`%HOMEPATH%\.ollama` on Windows).

-```bash
-cd ~/.ollama/models
-```

-Create your model from the settings text file you created. The syntax is `ollama create <name for the new model> -f <settings file>`.
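Using the settings file and model name from this walkthrough, the command would look something like the sketch below (substitute your own file path and model name):

```bash
# create a new Ollama model called "your_model_name" from the settings file above
ollama create your_model_name -f ~/qwen2.5-coder-32k.txt

# confirm the new model is now available locally
ollama list
```

The Roo Code configuration steps below then point the extension at this saved model.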
-4. **Configure Roo Code:**
-    * Open the Roo Code sidebar (<Codicon name="rocket" /> icon).
-    * Click the settings gear icon (<Codicon name="gear" />).
-    * Select "ollama" as the API Provider.
-    * Enter the model name from the previous step (e.g., `your_model_name`), or choose it from the radio button list that should appear below `Model ID` if Ollama is currently running.
-    * (Optional) You can configure the base URL if you're running Ollama on a different machine. The default is `http://localhost:11434`.
-    * (Optional) Configure the model context size in Advanced settings, so Roo Code knows how to manage its sliding window.
-## Setting Up LM Studio

-1. **Download and Install LM Studio:** Download LM Studio from the [LM Studio website](https://lmstudio.ai/).
-2. **Download a Model:** Use the LM Studio interface to search for and download a model. Some recommended models include those listed above for Ollama. Look for models in the GGUF format.
-3. **Start the Local Server:**
-    * In LM Studio, click the **"Local Server"** tab (the icon looks like `<->`).
-    * Select your downloaded model.
-    * Click **"Start Server"**.
-4. **Configure Roo Code:**
-    * Open the Roo Code sidebar (<Codicon name="rocket" /> icon).
-    * Click the settings gear icon (<Codicon name="gear" />).
-    * Select "lmstudio" as the API Provider.
-    * Enter the Model ID. This should be the name of the model file you loaded in LM Studio (e.g., `codellama-7b.Q4_0.gguf`). LM Studio shows a list of "Currently loaded models" in its UI.
-    * (Optional) You can configure the base URL if you're running LM Studio on a different machine. The default is `http://localhost:1234`.
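Because the LM Studio local server emulates the OpenAI API, you can sanity-check it from a terminal before configuring Roo Code. The sketch below is illustrative rather than part of the original guide; it assumes the default port `1234` and reuses the example model name from above:

```bash
# list the models the LM Studio server is currently serving
curl http://localhost:1234/v1/models

# send a minimal OpenAI-style chat completion request
# (replace "codellama-7b.Q4_0.gguf" with the model you actually loaded)
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "codellama-7b.Q4_0.gguf", "messages": [{"role": "user", "content": "Say hello"}]}'
```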
+Both providers offer similar capabilities but with different user interfaces and workflows. Ollama provides more control through its command-line interface, while LM Studio offers a more user-friendly graphical interface.
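As a rough illustration of that command-line control (not part of the original page; see `ollama --help` for the authoritative command list), day-to-day model management in Ollama looks something like this:

```bash
ollama list                              # show the models available locally
ollama pull qwen2.5-coder:32b            # download or update a model
ollama show your_model_name --modelfile  # print the Modelfile behind a saved model
ollama rm your_model_name                # remove a model you no longer need
```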
## Troubleshooting
-* **"Please check the LM Studio developer logs to debug what went wrong":** This error usually indicates a problem with the model or its configuration in LM Studio. Try the following:
-    * Make sure the LM Studio local server is running and that the correct model is loaded.
-    * Check the LM Studio logs for any error messages.
-    * Try restarting the LM Studio server.
-    * Ensure your chosen model is compatible with Roo Code. Some very small models may not work well.
-    * Some models may require a larger context length.
* **"No connection could be made because the target machine actively refused it":** This usually means that the Ollama or LM Studio server isn't running, or is running on a different port/address than Roo Code is configured to use. Double-check the Base URL setting. A quick terminal check is sketched after this list.
* **Slow Response Times:** Local models can be slower than cloud-based models, especially on less powerful hardware. If performance is an issue, try using a smaller model.
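A quick way to rule out the "connection refused" case above is to query the default base URLs directly; a sketch, assuming the default ports used throughout this page:

```bash
# a running Ollama server answers on its default port (typically with "Ollama is running")
curl http://localhost:11434/

# a running LM Studio server returns a JSON list of its loaded models
curl http://localhost:1234/v1/models
```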
0 commit comments