Conversation

@RobinPicard
Contributor
This PR is a minimal proposal of an interface for integrating Outlines into Pydantic AI.

The idea is that there would be a single OutlinesModel that takes an Outlines model instance as an initialization argument and is then in charge of interacting with its interface. The class has class methods (transformers, llama_cpp...) so that, instead of first creating the Outlines model oneself, one can directly provide the arguments required for each model.

The proposal only supports JSON schema as an output type, as Pydantic AI is currently focused on that type.
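For illustration, the wrapper pattern described above can be sketched in plain Python. All names here (FakeOutlinesModel, OutlinesModelSketch) are hypothetical stand-ins, not the actual PR code: the point is a single wrapper class holding a backend model instance, plus per-backend classmethod constructors.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeOutlinesModel:
    """Hypothetical stand-in for an Outlines model instance."""
    backend: str
    kwargs: dict[str, Any]


class OutlinesModelSketch:
    """Holds an Outlines model instance and exposes one
    classmethod constructor per supported backend."""

    def __init__(self, model: FakeOutlinesModel):
        self.model = model

    @classmethod
    def transformers(cls, model_name: str, **kwargs: Any) -> 'OutlinesModelSketch':
        # In the real integration this would build an Outlines
        # Transformers model from the given arguments.
        return cls(FakeOutlinesModel('transformers', {'model_name': model_name, **kwargs}))

    @classmethod
    def llama_cpp(cls, repo_id: str, **kwargs: Any) -> 'OutlinesModelSketch':
        return cls(FakeOutlinesModel('llama_cpp', {'repo_id': repo_id, **kwargs}))


m = OutlinesModelSketch.transformers('erwanf/gpt2-mini')
print(m.model.backend)  # transformers
```

This keeps one model class in Pydantic AI's model registry while still letting users skip constructing the Outlines model themselves.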

@RobinPicard RobinPicard force-pushed the mvp_outlines_integration branch 5 times, most recently from 7cb22ca to f118c9f Compare August 20, 2025 15:34
@Kludex Kludex self-assigned this Aug 25, 2025
)
model_settings_dict = dict(model_settings) if model_settings else {}
if isinstance(self.model, OutlinesAsyncBaseModel):
response: str = await self.model(prompt, output_type, None, **model_settings_dict)
Collaborator

I wouldn't expect the keys in our ModelSettings to map to valid kwargs in outlines. Should we do some mapping explicitly?

Contributor Author

Outlines does not standardize inference arguments so it would need to be model specific.

Collaborator

Copying in what I wrote on Slack:

Another option would be to map the existing fields on ModelSettings when we know their equivalent for the model being used (e.g. max_tokens, frequency), and tell users to put additional properties in extra_body: https://ai.pydantic.dev/api/settings/#pydantic_ai.settings.ModelSettings.extra_body (which can take an arbitrary dict). I'd prefer using extra_body over passing the entire ModelSettings and hitting errors for unsupported keys.
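The mapping approach suggested above can be sketched as a small translation function. The key names and SETTINGS_KEY_MAP below are illustrative assumptions (transformers-style kwargs), not the PR's actual implementation:

```python
from typing import Any

# Hypothetical mapping from ModelSettings keys to the kwargs a
# given Outlines backend understands (here: transformers-style names).
SETTINGS_KEY_MAP = {
    'max_tokens': 'max_new_tokens',
    'temperature': 'temperature',
    'top_p': 'top_p',
}


def map_settings(model_settings: dict[str, Any]) -> dict[str, Any]:
    """Translate known ModelSettings fields and merge extra_body verbatim,
    silently dropping keys the backend does not understand."""
    kwargs: dict[str, Any] = {}
    for key, value in model_settings.items():
        if key == 'extra_body':
            kwargs.update(value)  # arbitrary backend-specific kwargs pass through
        elif key in SETTINGS_KEY_MAP:
            kwargs[SETTINGS_KEY_MAP[key]] = value
        # unknown keys are ignored instead of raising on unsupported kwargs
    return kwargs


print(map_settings({'max_tokens': 100, 'seed': 1, 'extra_body': {'do_sample': True}}))
# {'max_new_tokens': 100, 'do_sample': True}
```

Whether unknown keys should be dropped or raise an error is a design choice; dropping matches the "no errors for unsupported keys" goal above.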

chat_template = '{% for message in messages %}{{ message.role }}: {{ message.content }}{% endfor %}'
hf_tokenizer.chat_template = chat_template

model = OutlinesModel.transformers(hf_model, hf_tokenizer, settings=ModelSettings(max_new_tokens=100))
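As an aside on what the Jinja chat template above actually renders: it simply concatenates "role: content" for each message with no separators. A plain-Python equivalent of that rendering, for illustration only:

```python
# Plain-Python equivalent of the minimal Jinja chat template:
# '{% for message in messages %}{{ message.role }}: {{ message.content }}{% endfor %}'
def render_chat(messages: list[dict[str, str]]) -> str:
    return ''.join(f'{msg["role"]}: {msg["content"]}' for msg in messages)


print(render_chat([
    {'role': 'user', 'content': 'Hi'},
    {'role': 'assistant', 'content': 'Hello!'},
]))
# user: Hiassistant: Hello!
```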
Collaborator

Related to my comment below, we should try to use and support the existing fields on ModelSettings

@RobinPicard RobinPicard force-pushed the mvp_outlines_integration branch from 6edfe71 to e34df9f Compare September 5, 2025 16:56
@RobinPicard RobinPicard marked this pull request as ready for review September 5, 2025 17:01
@RobinPicard RobinPicard requested review from DouweM and Kludex September 5, 2025 17:02
@RobinPicard RobinPicard force-pushed the mvp_outlines_integration branch from c2ede74 to a574c05 Compare September 8, 2025 07:58
@RobinPicard RobinPicard changed the title Proposition MVP integration with Outlines Proposition integration with Outlines Sep 8, 2025
@RobinPicard RobinPicard force-pushed the mvp_outlines_integration branch 2 times, most recently from 9f112c7 to 7dbb43a Compare September 8, 2025 12:29
bedrock = ["boto3>=1.39.0"]
huggingface = ["huggingface-hub[inference]>=0.33.5"]
outlines = ["outlines>=1.0.0, <1.3.0"]
outlines-transformers = ["outlines[transformers]>=1.0.0, <1.3.0"]
Collaborator

Why is this necessary? For tests? We should at least document it then.

Collaborator

I'd rather add it as an optional group to pydantic-ai, which should be installed as part of the all-extras CI test

pyproject.toml Outdated
[tool.hatch.metadata.hooks.uv-dynamic-versioning]
dependencies = [
-    "pydantic-ai-slim[openai,vertexai,google,groq,anthropic,mistral,cohere,bedrock,huggingface,cli,mcp,evals,ag-ui,retries,temporal,logfire]=={{ version }}",
+    "pydantic-ai-slim[openai,vertexai,google,groq,anthropic,mistral,cohere,bedrock,huggingface,outlines-transformers,cli,mcp,evals,ag-ui,retries,temporal,logfire]=={{ version }}",
Collaborator

I'd rather not include transformers by default



async def test_request_streaming(outlines_model: 'Transformers') -> None:
# The transformers model does not support streaming,
Collaborator

Can we test with another model that does support streaming?

Contributor Author

We can test with llama_cpp

result = await agent.run('What is the capital of Germany?', message_history=result.all_messages())
assert len(result.output) > 0
all_messages = result.all_messages()
assert len(all_messages) == 4
Collaborator

Let's use inline-snapshot instead of these explicit assertions: assert result.all_messages() == snapshot(), then run the test to get the snapshot filled in. You can replace any variable values with IsStr (see examples across the test suite).

vertex_provider_auth: None,
):
# Skip Outlines examples if loading the transformers model fails
if 'from transformers import' in example.source and 'from pydantic_ai.models.outlines import' in example.source:
Collaborator

I don't love this -- would it be enough to add test="ci_only" to the example so it's only run in CI, where all extras will be installed?

@RobinPicard RobinPicard force-pushed the mvp_outlines_integration branch from 7b3deb2 to 3441849 Compare October 23, 2025 15:33
@RobinPicard RobinPicard force-pushed the mvp_outlines_integration branch 2 times, most recently from f2c1d45 to 2b5e71b Compare October 23, 2025 16:23
@DouweM DouweM changed the title Add OutlinesModel to run Transformers, Llama.cpp, MLXLM, SGLang and vLLM Add OutlinesModel to run Transformers, Llama.cpp, MLXLM, SGLang and vLLM via Outlines Oct 23, 2025
@DouweM DouweM enabled auto-merge (squash) October 23, 2025 17:49
auto-merge was automatically disabled October 24, 2025 12:02

Head branch was pushed to by a user without write access

@RobinPicard RobinPicard force-pushed the mvp_outlines_integration branch 4 times, most recently from 30decd2 to edc23db Compare October 24, 2025 14:50
@DouweM
Collaborator

DouweM commented Oct 24, 2025

@RobinPicard Heh, we're hitting HuggingFace rate limits:

E               429 Client Error: Too Many Requests for url: https://huggingface.co/api/models/erwanf/gpt2-mini/xet-read-token/f12cc7ee54aa4d3e6366597f72ff7acb39b0ab3a (Request ID: Root=1-68fb9343-3b4898b33337e4ed6923263e;4433d677-476f-4296-8e11-8ba47752b357)
E               
E               We had to rate limit your IP (172.183.89.68). To continue using our service, create a HF account or login to your existing account, and make sure you pass a HF_TOKEN if you're using the API.

Is there a way to not do any live requests?

@RobinPicard RobinPicard force-pushed the mvp_outlines_integration branch 4 times, most recently from 9a8f509 to b9005a7 Compare October 25, 2025 14:39
@RobinPicard RobinPicard force-pushed the mvp_outlines_integration branch from b9005a7 to be6bed0 Compare October 25, 2025 15:04
@RobinPicard
Contributor Author

I've made 2 modifications to fix the remaining issues @DouweM

  • Added a commit with a CI step that caches the HF model download. That way models are downloaded only once (instead of over a dozen times) and we no longer hit the rate limit. I think this is best, as it allows us not to mock anything, so we can have more reliable tests.
  • Added the # pyright: reportUnnecessaryTypeIgnoreComment = false escape comment at the top of the files pydantic_ai_slim/pydantic_ai/models/outlines.py and tests/outlines/test_outlines.py. The reason is a conflict between the pyright linting for Python 3.10 and 3.13: one was requesting pyright: ignore while the other would fail if it were included, because of unnecessary ignore comments.

@RobinPicard RobinPicard requested a review from DouweM October 25, 2025 15:37
@DouweM DouweM merged commit a70c2a5 into pydantic:main Oct 27, 2025
31 checks passed
@DouweM
Collaborator

DouweM commented Oct 27, 2025

@RobinPicard Thanks for all your work and patience here; merged! 🎉

Labels

new models Support for new model(s)