Commit 234aa33

Update README.md
fix the GPU device setting for CLI
1 parent 10e7387 commit 234aa33

README.md

Lines changed: 2 additions & 2 deletions
````diff
@@ -97,7 +97,7 @@ We currently support inference in the single GPU and batch size 1 setting, which
 
 You can use the following command for launching a CLI interface:
 ```bash
-python -m medusa.inference.cli --model [path of medusa model]
+CUDA_VISIBLE_DEVICES=0 python -m medusa.inference.cli --model [path of medusa model]
 ```
 You can also pass `--load-in-8bit` or `--load-in-4bit` to load the base model in quantized format.
 
@@ -162,4 +162,4 @@ We also provide some illustrative notebooks in `notebooks/` to help you understa
 We welcome community contributions to Medusa. If you have an idea for how to improve it, please open an issue to discuss it with us. When submitting a pull request, please ensure that your changes are well-tested. Please split each major change into a separate pull request. We also have a [Roadmap](ROADMAP.md) summarizing our future plans for Medusa. Don't hesitate to reach out if you are interested in contributing to any of the items on the roadmap.
 
 ## Acknowledgements
-This codebase is influenced by amazing works from the community, including [FastChat](https://github.com/lm-sys/FastChat), [TinyChat](https://github.com/mit-han-lab/llm-awq/tree/main/), [vllm](https://github.com/vllm-project/vllm) and many others.
+This codebase is influenced by amazing works from the community, including [FastChat](https://github.com/lm-sys/FastChat), [TinyChat](https://github.com/mit-han-lab/llm-awq/tree/main/), [vllm](https://github.com/vllm-project/vllm) and many others.
````
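For context on the fix: `CUDA_VISIBLE_DEVICES` is a standard CUDA environment variable that restricts which GPUs a process can see, so prefixing the command with `CUDA_VISIBLE_DEVICES=0` pins the CLI to the first GPU, matching the single-GPU setting the README describes. A minimal usage sketch is below; the model path is the same placeholder used in the README, and the second command assumes a machine with at least two GPUs.

```bash
# Run the CLI on the first GPU only; CUDA renumbers it as device 0 inside the process.
CUDA_VISIBLE_DEVICES=0 python -m medusa.inference.cli --model [path of medusa model]

# Same command pinned to the second physical GPU instead (assumes one exists).
CUDA_VISIBLE_DEVICES=1 python -m medusa.inference.cli --model [path of medusa model]

# Optionally load the base model quantized, per the README's note on 8-bit loading.
CUDA_VISIBLE_DEVICES=0 python -m medusa.inference.cli --model [path of medusa model] --load-in-8bit
```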
