
Add a llama-cpp Model? #1801

@joy-void-joy

Description

I know it is already possible to use Ollama servers with pydantic-ai via its OpenAI-compatible interface; however, there are a few reasons why I would be interested in a direct llama-cpp binding:

  • Tool use and Hugging Face compatibility: when running Hugging Face models that support tool use (e.g. GLM-4-32B) through Ollama, Ollama reports that the model "does not support tool use", while llama-cpp handles tool calls correctly
  • Standalone file: llama-cpp does not need a separately booted server, which matters for ease of use and portability, since everything can live in a single Python file (see the sketch after this list)
  • Better configurability/nativity? I am unsure about this one, but llama-cpp-python seems to be faster and to have more optimizations built in than the Ollama server
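
To illustrate the standalone point, here is a minimal sketch using llama-cpp-python directly. The model path and tool schema are placeholders, and whether tool calls actually work depends on the model and its chat format:

```python
from llama_cpp import Llama

# Everything runs in-process: no separate server to boot.
llm = Llama(
    model_path="./GLM-4-32B-Q4_K_M.gguf",  # placeholder path, not a recommendation
    n_ctx=8192,
)

# OpenAI-style chat completion with a tool definition.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool for illustration
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
)
print(response["choices"][0]["message"])
```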

This is why I would like to add a llama-cpp Model to pydantic-ai.
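
For concreteness, here is a hypothetical sketch of what the resulting API could look like. `LlamaCppModel`, its module path, and its parameters are all assumptions on my part, not existing pydantic-ai API:

```python
from pydantic_ai import Agent

# LlamaCppModel does not exist in pydantic-ai yet; the module path, class
# name, and parameters below are illustrative assumptions only.
from pydantic_ai.models.llama_cpp import LlamaCppModel  # hypothetical

model = LlamaCppModel(model_path="./GLM-4-32B-Q4_K_M.gguf")  # placeholder path
agent = Agent(model)

result = agent.run_sync("What is the capital of France?")
print(result.output)
```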

Would there be interest in a PR for this? I have already started working on it for a personal project, so it is only a matter of packaging, adding tests, etc.
