
Commit c9898f4

Authored by cpsievert and claude

feat: add `data_model` parameter to `.to_solver()` for structured output in evals (#264)

* feat: add `data_model` parameter to `to_solver()` for structured output in evals

  When a `data_model` (Pydantic model) is provided to `.to_solver()`, the solver uses `.chat_structured_async()` instead of `.chat_async()` to generate responses. The resulting Pydantic model instance is serialized to JSON and set as the completion text in `state.output.completion`. This allows using chatlas for structured data extraction tasks in Inspect AI evaluations, where scorers can parse and validate the JSON output.

* docs: add PR number to CHANGELOG entry

* refactor: consolidate completion logic in `to_solver()`

  Move all completion text determination to one place, right before setting `state.output`.

* Apply suggestions from code review

* Update chatlas/_chat.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <[email protected]>
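The core mechanism the commit message describes — serializing the structured Pydantic result to JSON for use as the completion text — can be sketched in isolation. This is a minimal illustration, not chatlas code: the `Person` model is hypothetical, and only `pydantic` (v2) is assumed to be installed.

```python
# Sketch of how a structured result becomes the completion text.
# `Person` is an illustrative data_model; not part of chatlas.
from pydantic import BaseModel


class Person(BaseModel):
    name: str
    age: int


# Suppose the solver's structured call produced this instance...
structured_result = Person(name="Ada", age=36)

# ...it is then JSON-serialized and used as state.output.completion,
# so an Inspect AI scorer can parse and validate it.
completion = structured_result.model_dump_json()
print(completion)
```

A scorer on the Inspect AI side can then `json.loads(completion)` and validate the fields against the same model.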
1 parent b32db90

File tree

4 files changed: +320 −3 lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions

```diff
@@ -12,6 +12,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### New features

 * `.stream()` and `.stream_async()` now support a `data_model` parameter for structured data extraction while streaming. (#262)
+* `.to_solver()` now supports a `data_model` parameter for structured data extraction in evals. When provided, the solver uses `.chat_structured()` instead of `.chat()` and outputs JSON-serialized data. (#264)

 ## [0.15.0] - 2026-01-06
```
chatlas/_chat.py

Lines changed: 20 additions & 3 deletions

```diff
@@ -836,6 +836,7 @@ def to_solver(
         *,
         include_system_prompt: bool = False,
         include_turns: bool = False,
+        data_model: type[BaseModel] | None = None,
     ):
         """
         Create an InspectAI solver from this chat.
@@ -847,6 +848,11 @@
         Parameters
         ----------
+        data_model
+            A Pydantic model describing the structure of the data to extract.
+            When provided, the solver will use `.chat_structured()` instead of
+            `.chat()` to generate responses, and the output completion will be
+            JSON serialized from the model instance.
         include_system_prompt
             Whether to include the system prompt in the solver's starting
             messages.
@@ -977,8 +983,14 @@ async def solve(state: InspectTaskState, generate):
                 input_content = [input_content]
             input_content = [inspect_content_as_chatlas(x) for x in input_content]

-            # Generate the response (this can generate multiple turns!)
-            await chat_instance.chat_async(*input_content, echo="none")
+            # Generate the response
+            structured_result: BaseModel | None = None
+            if data_model is not None:
+                structured_result = await chat_instance.chat_structured_async(
+                    *input_content, data_model=data_model, echo="none"
+                )
+            else:
+                await chat_instance.chat_async(*input_content, echo="none")

             # Map change in chatlas Turn state back to Inspect message.state
             # (Note: we skip the user prompt turn since it's already included)
@@ -1001,10 +1013,15 @@
                     "Expected the last message in InspectAI state to be an assistant message"
                 )

+            if structured_result is not None:
+                completion = structured_result.model_dump_json()
+            else:
+                completion = turns[-1].text
+
             state.output = imodel.ModelOutput(
                 model=model,
                 choices=[imodel.ChatCompletionChoice(message=last_message)],
-                completion=turns[-1].text,
+                completion=completion,
                 usage=usage,
                 time=time.perf_counter() - start_time,
             )
```
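The branching logic the diff adds to `solve` can be exercised end to end with a stand-in chat object, without Inspect AI or an API key. This is a hedged sketch: `FakeChat`, `Sentiment`, and the canned return values are illustrative stand-ins, not chatlas internals; only `pydantic` (v2) is assumed.

```python
# Stand-in reproduction of the solver's new dispatch:
# chat_structured_async when a data_model is given, chat_async otherwise.
import asyncio

from pydantic import BaseModel


class Sentiment(BaseModel):
    # Hypothetical data_model for illustration.
    label: str
    score: float


class FakeChat:
    # Mimics the two chatlas methods the solver chooses between;
    # a real Chat would call the model provider here.
    async def chat_structured_async(self, prompt, data_model, echo="none"):
        return data_model(label="positive", score=0.9)

    async def chat_async(self, prompt, echo="none"):
        return "free-form text response"


async def solve(prompt, data_model=None):
    chat = FakeChat()
    structured_result = None
    if data_model is not None:
        structured_result = await chat.chat_structured_async(
            prompt, data_model=data_model, echo="none"
        )
    else:
        await chat.chat_async(prompt, echo="none")
    # Mirror the completion logic: JSON-serialize the structured result,
    # otherwise fall back to the last turn's text.
    if structured_result is not None:
        return structured_result.model_dump_json()
    return "last turn text"


print(asyncio.run(solve("Great movie!", data_model=Sentiment)))
```

Consolidating the completion choice into one spot right before `state.output` is set (as the refactor commit does) keeps the two code paths from diverging later.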
