
Conversation

@par4m (Contributor) commented Jul 2, 2025

Implements #659

Since vLLM has no GPU support on macOS, models can only be run on the CPU.

Choose a model with fewer than 1.5B parameters and a context window of at least 2400 tokens. For testing, DeepSeek Coder 1.3B is used on a macOS M4 machine.

Install vLLM (requires Python < 3.13):

pip install vllm

Run the vLLM server:

vllm serve deepseek-ai/deepseek-coder-1.3b-instruct

Then configure the LLM spec in cocoindex to point at it:

cocoindex.LlmSpec(
    api_type=cocoindex.LlmApiType.VLLM,
    model="deepseek-ai/deepseek-coder-1.3b-instruct",
    address="http://127.0.0.1:8000/v1",  # the /v1 suffix is mandatory
)
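To sanity-check the server before wiring it into cocoindex, you can hit its OpenAI-compatible `/chat/completions` route directly. A minimal stdlib-only sketch (the helper name `build_chat_request` is just for illustration; the model name and address mirror the spec above):

```python
import json
import urllib.request

# Address of the local vLLM server; the trailing /v1 matters because the
# OpenAI-compatible routes are mounted under it.
BASE_URL = "http://127.0.0.1:8000/v1"
MODEL = "deepseek-ai/deepseek-coder-1.3b-instruct"

def build_chat_request(prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request for the vLLM server."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    # Requires the vLLM server from the step above to be running.
    req = build_chat_request("Write a hello-world function in Python.")
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```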

vLLM also supports embeddings through the same OpenAI-compatible API; a separate PR for that would be better - https://docs.vllm.ai/en/v0.6.6/serving/openai_compatible_server.html#embeddings-api
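The embeddings route can be exercised the same way. A sketch under the assumption that the server hosts some embedding model (the model name below is hypothetical; substitute whatever your vLLM instance actually serves):

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000/v1"
# Hypothetical embedding model name; swap in the model your server runs.
EMBED_MODEL = "BAAI/bge-small-en-v1.5"

def build_embedding_request(texts: list[str]) -> urllib.request.Request:
    """Build an OpenAI-style /embeddings request for a local vLLM server."""
    payload = {"model": EMBED_MODEL, "input": texts}
    return urllib.request.Request(
        f"{BASE_URL}/embeddings",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
```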

More info: https://docs.vllm.ai/en/v0.6.6/serving/openai_compatible_server.html#

@par4m (Contributor, Author) commented Jul 3, 2025

Not sure why formatting failed; can you please rerun the checks?

@badmonster0 (Member)

> Not sure why formatting failed; can you please rerun the checks?

It should be caused by a formatting issue in some example code; fixed in #687.

The check still fails because this branch isn't merged with the latest main yet, but it's safe to merge. I'll merge now.

@badmonster0 badmonster0 merged commit c1ce446 into cocoindex-io:main Jul 3, 2025
21 of 24 checks passed
@badmonster0 (Member)

Thank you @par4m! The new release note is out and we made a section for you; we love your contribution!
https://cocoindex.io/blogs/cocoindex-changelog-2025-08-18#par4m ❤️
