Skip to content

Commit e914e2f

Browse files
authored
Update README.md
1 parent 36ce336 commit e914e2f

File tree

1 file changed

+8
-1
lines changed

1 file changed

+8
-1
lines changed

README.md

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -28,6 +28,12 @@ docker run -p 8000:8000 ghcr.io/codelion/optillm:latest
2828
2024-10-22 07:45:06,293 - INFO - Starting server with approach: auto
2929
```
3030

31+
To use optillm without local inference and only as a proxy you can add the `-proxy` suffix.
32+
33+
```bash
34+
docker pull ghcr.io/codelion/optillm:latest-proxy
35+
```
36+
3137
### Install from source
3238

3339
Clone the repository with `git` and use `pip install` to setup the dependencies.
@@ -299,6 +305,7 @@ When the API key is set, clients must include it in their requests using the `Au
299305
```plain
300306
Authorization: Bearer your_secret_api_key
301307
```
308+
302309
## SOTA results on benchmarks with optillm
303310

304311
### coc-claude-3-5-sonnet-20241022 on AIME 2024 pass@1 (Nov 2024)
@@ -348,7 +355,7 @@ called patchflows. We saw huge performance gains across all the supported patchf
348355

349356
## References
350357

351-
- [Chain of Code: Reasoning with a Language Model-Augmented Code Emulator](https://arxiv.org/abs/2312.04474) - [Implementation](https://github.com/codelion/optillm/blob/main/optillm/plugins/coc_plugin.py)
358+
- [Chain of Code: Reasoning with a Language Model-Augmented Code Emulator](https://arxiv.org/abs/2312.04474) - [Inspired the implementation of coc plugin](https://github.com/codelion/optillm/blob/main/optillm/plugins/coc_plugin.py)
352359
- [Entropy Based Sampling and Parallel CoT Decoding](https://github.com/xjdr-alt/entropix) - [Implementation](https://github.com/codelion/optillm/blob/main/optillm/entropy_decoding.py)
353360
- [Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation](https://arxiv.org/abs/2409.12941) - [Evaluation script](https://github.com/codelion/optillm/blob/main/scripts/eval_frames_benchmark.py)
354361
- [Writing in the Margins: Better Inference Pattern for Long Context Retrieval](https://www.arxiv.org/abs/2408.14906) - [Inspired the implementation of the memory plugin](https://github.com/codelion/optillm/blob/main/optillm/plugins/memory_plugin.py)

0 commit comments

Comments
 (0)