---
sidebar_label: Arch LLM Gateway
description: The smart edge and AI gateway for agents. Arch is a proxy server that handles the low-level work in building agents like applying guardrails, routing prompts to the right agent, and unifying access to LLMs. It is framework-agnostic, natively understands prompts, and helps you build agents faster.
keywords:
  - archgw
  - roo code
  - api provider
  - unified api
  - openai compatible
  - multi model
  - llm proxy
  - local deployment
  - cost management
  - model routing
  - preference based routing
  - developer preferences
image: /img/social-share.jpg
---

# Using Arch LLM Gateway With Roo Code

[Arch Gateway](https://github.com/katanemo/archgw) unifies access and routing to any LLM, including dynamic routing via [user preferences](https://github.com/katanemo/archgw#Preference-based-Routing). For example, it can direct a query to the model best suited to it, based on the preferences you specify.

Arch LLM Gateway provides a unified interface to many Large Language Models (LLMs) by offering an OpenAI-compatible API. This lets you run a local server that proxies requests to various model providers or serves local models, all through a consistent API endpoint.

**Website:** [github.com/katanemo/archgw](https://github.com/katanemo/archgw) (main project) and [docs.archgw.com](https://docs.archgw.com/) (documentation)

---

## Key Benefits

* 🚦 **Routing to Agents:** Engineered with purpose-built [LLMs](https://huggingface.co/collections/katanemo/arch-function-66f209a693ea8df14317ad68) for fast (<100ms) agent routing and hand-off scenarios.
* 🔗 **Routing to LLMs:** Unifies access and routing to any LLM, including dynamic routing via [preference policies](https://github.com/katanemo/archgw#Preference-based-Routing).
* ⛨ **Guardrails:** Centrally configure guardrails to prevent harmful outcomes and ensure safe user interactions.
* ⚡ **Tool Use:** For common agentic scenarios, Arch instantly clarifies prompts and converts them to tool/API calls.
* 🕵 **Observability:** W3C-compatible request tracing and LLM metrics that plug into popular tools instantly.
* 🧱 **Built on Envoy:** Arch runs alongside app servers as a containerized process and builds on [Envoy's](https://envoyproxy.io/) proven HTTP management and scalability features to handle ingress and egress traffic for prompts and LLMs.

---

## Setting Up Arch LLM Gateway

To use Arch Gateway with Roo Code, you first need to set up and run `archgw` with an `arch_config.yaml` configuration file (see below).

### Installation

1. Install the Arch Gateway prerequisites:
   Follow [these steps](https://github.com/katanemo/archgw?tab=readme-ov-file#prerequisites) to make sure the prerequisites are installed.

### Configuration

2. Create a configuration file (`arch_config.yaml`) to define your models and providers:
   ```yaml
   version: v0.1.0

   listeners:
     egress_traffic:
       address: 0.0.0.0
       port: 12000
       message_format: openai
       timeout: 30s

   llm_providers:

     - model: openai/gpt-4o-mini
       access_key: $OPENAI_API_KEY
       default: true

     - model: openai/gpt-4o
       access_key: $OPENAI_API_KEY
       routing_preferences:
         - name: code understanding
           description: understand and explain existing code snippets, functions, or libraries

     - model: openai/gpt-4.1
       access_key: $OPENAI_API_KEY
       routing_preferences:
         - name: code generation
           description: generating new code snippets, functions, or boilerplate based on user prompts or requirements
   ```

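The `access_key: $OPENAI_API_KEY` entries are resolved from the environment of the shell that launches the gateway, so the variable must be set before startup. A small pre-flight check (a generic sketch, not part of archgw itself) can catch a missing key early:

```python
import os

def check_required_env(names):
    """Return the names from `names` that are unset or empty in the environment."""
    return [n for n in names if not os.environ.get(n)]

# Check the key(s) referenced by arch_config.yaml before running `archgw up`.
missing = check_required_env(["OPENAI_API_KEY"])
if missing:
    print("Set these before starting the gateway:", ", ".join(missing))
```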
### Starting the Arch LLM Gateway

3. Start the LLM Gateway:

   ```bash
   # In foreground mode with arch_config.yaml (recommended)
   OPENAI_API_KEY=some_key archgw up --service archgw --foreground
   ```

4. The proxy runs at `http://0.0.0.0:12000/v1` by default (accessible as `http://localhost:12000/v1`).
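Once the gateway is up, you can sanity-check it with any OpenAI-compatible client. The sketch below builds a standard `/v1/chat/completions` request with Python's standard library; the model name is one from the `arch_config.yaml` above, and the `dummy-key` value is just a placeholder, since archgw holds the real provider credentials:

```python
import json
import urllib.request

BASE_URL = "http://localhost:12000/v1"  # default egress listener from arch_config.yaml

payload = {
    "model": "openai/gpt-4o-mini",  # must match a model listed in arch_config.yaml
    "messages": [{"role": "user", "content": "Say hello"}],
}
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer dummy-key",  # placeholder; archgw handles real keys
    },
)

# Uncomment once the gateway is running:
# with urllib.request.urlopen(req, timeout=30) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```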

Refer to the [Arch Gateway documentation](https://docs.archgw.com/) for detailed instructions on advanced server configuration and features.

---

## Configuration in Roo Code

Once your Arch LLM Gateway server is running, you have two options for configuring it in Roo Code:

### Option 1: Using the Arch LLM Gateway Provider (Recommended)

1. **Open Roo Code Settings:** Click the gear icon (<Codicon name="gear" />) in the Roo Code panel.
2. **Select Provider:** Choose "Arch LLM Gateway" from the "API Provider" dropdown.
3. **Enter Base URL:**
   * Input the URL of your Arch LLM Gateway server.
   * Defaults to `http://localhost:12000/v1` if left blank.
4. **Enter API Key (Optional):**
   * If you've configured an API key for your Arch Gateway, enter it here.
   * If your Arch Gateway doesn't require an API key, enter a dummy value (e.g., `"dummy-key"`), which works fine.
5. **Select Model:**
   * Roo Code attempts to fetch the list of available models from your Arch Gateway by querying the `${baseUrl}/v1/model/info` endpoint.
   * The models displayed in the dropdown are sourced from this endpoint.
   * Use the refresh button to update the model list if you've added new models to your Arch Gateway.
   * If no model is selected, Roo Code defaults to `openai/gpt-4.1` (the `archgwDefaultModelId`). Ensure this model (or your desired default) is configured and available on your Arch LLM Gateway.
6. **Configure Routing:**
   * Select the "use preference based routing" option and use a configuration like the following. Note: each model name must match a model listed in `arch_config.yaml`.
   * If you leave "use preference based routing" unchecked, the routing configuration from `arch_config.yaml` is used instead.

   ```yaml
   - model: openai/gpt-4o
     routing_preferences:
       - name: code understanding
         description: understand and explain code

   - model: openai/gpt-4.1
     routing_preferences:
       - name: code generation
         description: generating new code
   ```
   * At this point you're ready. Fire away your queries and watch the Arch router select models dynamically based on query type.
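You can confirm which models the gateway reports to Roo Code by querying the model-info endpoint yourself. The sketch below assumes the endpoint lives at `/v1/model/info` on the gateway's default address, and guards the call so it is safe to run even when the gateway is down:

```python
import json
import urllib.request
from urllib.error import URLError

# Endpoint Roo Code queries for the model list; the exact path is an assumption
# based on the default gateway address used earlier in this guide.
MODEL_INFO_URL = "http://localhost:12000/v1/model/info"

def list_gateway_models(url=MODEL_INFO_URL, timeout=5):
    """Return the gateway's model-info response, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    except (URLError, OSError):
        return None

models = list_gateway_models()
print(models if models is not None else "gateway not reachable")
```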

### Option 2: Using OpenAI Compatible Provider

Alternatively, you can configure Arch LLM Gateway through the "OpenAI Compatible" provider:

1. **Open Roo Code Settings:** Click the gear icon (<Codicon name="gear" />) in the Roo Code panel.
2. **Select Provider:** Choose "OpenAI Compatible" from the "API Provider" dropdown.
3. **Enter Base URL:** Input your Arch LLM Gateway proxy URL (e.g., `http://localhost:12000/v1`).
4. **Enter API Key:** Use any string as the API key (e.g., `"sk-1234"`), since Arch Gateway handles the actual provider authentication.
5. **Select Model:** Choose a model name you configured in your `arch_config.yaml` file.

---