Commit 165d7ca

add citation
1 parent 09338c5 commit 165d7ca

1 file changed: 10 additions, 1 deletion


README.md

Lines changed: 10 additions & 1 deletion
````diff
@@ -111,7 +111,16 @@ Note that the RAYON_NUM_THREADS environment variable controls the maximum number
 In the examples above, we default to using Vicuna and CodeLlama, but you can in fact use any LLaMA-based model you like by simply changing the "--model-path" argument. You can also build the datastore from any data you like. If you want to use an architecture other than LLaMA, you can modify the file model/modeling_llama_kv.py to match the corresponding model.
 
 ## Citation
-TODO
+```
+@misc{he2023rest,
+      title={REST: Retrieval-Based Speculative Decoding},
+      author={Zhenyu He and Zexuan Zhong and Tianle Cai and Jason D. Lee and Di He},
+      year={2023},
+      eprint={2311.08252},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```
 
 ## Acknowledgements
 The codebase is from [Medusa](https://github.com/FasterDecoding/Medusa) and influenced by remarkable projects from the LLM community, including [FastChat](https://github.com/lm-sys/FastChat), [TinyChat](https://github.com/mit-han-lab/llm-awq/tree/main/), [vllm](https://github.com/vllm-project/vllm) and many others.
````
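
As the context paragraph in the diff notes, switching models only means pointing `--model-path` at a different checkpoint. Below is a minimal sketch of what that looks like; the entry-script name `inference.py` is a placeholder (not the repository's actual entry point), and only the `--model-path` flag itself is confirmed by the README text above:

```bash
# Minimal sketch, under assumptions: "inference.py" is a hypothetical script
# name; only the --model-path flag is documented in the README paragraph above.
# Any LLaMA-based checkpoint (e.g. a Hugging Face model ID) can be supplied.
python inference.py --model-path lmsys/vicuna-7b-v1.5

# Swapping in another LLaMA-based model changes only the flag value:
python inference.py --model-path codellama/CodeLlama-7b-Instruct-hf
```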
