Compilation depends on [vcpkg](https://github.com/microsoft/vcpkg). The Docker image already has `VCPKG_ROOT` preconfigured. To set it up manually:
```bash
git clone https://gitcode.com/xLLM-AI/vcpkg.git
cd vcpkg && git checkout ffc42e97c866ce9692f5c441394832b86548422c
export VCPKG_ROOT=/your/path/to/vcpkg
```
#### Compilation

Compiling generates the executable `build/xllm/core/server/xllm` under `build/`:

```bash
python setup.py build
```
Or, compile directly with the following command to generate the whl package under `dist/`:

```bash
python setup.py bdist_wheel
```
#### Launch

Run the following command to start the xLLM engine (note that inline comments after a trailing `\` would break bash line continuation, so the flags are explained in the comment block instead):

```bash
# --model: model path (replace with your own)
# --port: service port (here 9977)
# --max_memory_utilization: maximum fraction of device memory to use
./build/xllm/core/server/xllm \
  --model=/path/to/your/llm \
  --port=9977 \
  --max_memory_utilization 0.90
```
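Once the server is running, a client can send it inference requests. The sketch below builds such a request in Python; it assumes an OpenAI-style `/v1/chat/completions` endpoint on `localhost`, which is an assumption not confirmed by this document (only the port 9977 comes from the launch command above).

```python
import json
import urllib.request

def build_request(prompt, model="/path/to/your/llm", port=9977):
    """Build an HTTP POST request for a locally running xLLM server.

    The endpoint path and payload shape follow the common OpenAI-compatible
    convention; adjust them to match the API your xLLM build actually exposes.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"http://localhost:{port}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_request("Hello, xLLM!")
# Send with: urllib.request.urlopen(req) once the server is up.
```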
Please refer to [Quick Start](docs/en/getting_started/quick_start.md) for more details.
---
This project was made possible thanks to the following open-source projects:
- [safetensors](https://github.com/huggingface/safetensors) - xLLM relies on the safetensors C-binding capability.
- [Partial JSON Parser](https://github.com/promplate/partial-json-parser) - xLLM's C++ JSON parser is implemented with insights from the Python and Go implementations.
- [concurrentqueue](https://github.com/cameron314/concurrentqueue) - A fast multi-producer, multi-consumer lock-free concurrent queue for C++11.