
Commit 15f62f3

Merge pull request #13 from vstorm-co/feature/issue#9
issue #9 resolved
2 parents 5d325fc + 3d5fbbe commit 15f62f3

22 files changed: +1096 / -81 lines

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -43,3 +43,5 @@ build/
 # Custom
 *.db
 .ruff_cache
+ocr_parsing/files/results
+ocr_parsing/files/temp_files

README.md

Lines changed: 185 additions & 59 deletions
@@ -1,5 +1,41 @@
 # PydanticAI Examples
 
+A comprehensive collection of examples demonstrating PydanticAI framework capabilities, from basic model requests to advanced document processing with schema validation.
+
+## Prerequisites
+
+### System Requirements
+
+- Python 3.10+
+- `uv` package manager
+
+### Environment Setup
+
+```bash
+# Install dependencies
+uv sync
+
+# Create .env file with your API key
+echo "OPENAI_API_KEY=your-key-here" > .env
+```
+
+**Note**: Most examples use OpenAI's GPT-5.1. Ensure your API key has appropriate permissions and sufficient quota.
+
+## Learning Path
+
+**Recommended order for learning PydanticAI**:
+
+1. **[Direct Model Requests](direct_model_request/)** - Understand basic LLM API calls
+2. **[Temperature](temperature/)** - Understand model parameters
+3. **[Reasoning Effort](reasoning_effort/)** - See how reasoning effort can change the model's output
+4. **[Basic Sentiment](basic_sentiment/)** - Learn structured outputs with Pydantic
+5. **[Dynamic Classification](dynamic_classification/)** - Runtime schema generation
+6. **[Bielik](bielik_example/)** - Local models and tools
+7. **[History Processor](history_processor/)** - Multi-turn conversations
+8. **[OCR Parsing](ocr_parsing_demo/)** - Complex real-world document processing
+
+## Examples Overview
+
 ### 1. Direct Model Requests
 
 **Location**: `direct_model_request/`
@@ -32,6 +68,7 @@ Demonstrates reasoning_effort parameter for gpt-5.2.
 
 - Control depth of internal reasoning
 - Complex problem-solving examples
+- Trade-off between accuracy and latency
 
 [View Example →](reasoning_effort/)
 
@@ -71,6 +108,36 @@ Learn to use **Bielik**, a Polish language LLM, with PydanticAI running locally
 
 [View Example →](bielik_example/)
 
+### 7. Conversation History Management
+
+**Location**: `history_processor/`
+
+Learn how to manage conversation history in AI agents.
+
+- Basic history handling and inspection
+- Multi-turn conversations with context awareness
+- History persistence (JSON and Database)
+- Advanced filtering and transformation
+- Context window management strategies (fixed, dynamic, and tool-aware)
+- Production-ready database archival
+
+[View Example →](history_processor/)
+
+### 8. OCR Parsing with Data Validation
+
+**Location**: `ocr_parsing/`
+
+Learn how to work with documents using PydanticAI for OCR (Optical Character Recognition).
+
+- **Basic OCR**: Unstructured text extraction from PDFs
+- **Structured Output**: Type-safe document analysis with schema validation
+- **Validation Errors**: Error handling when LLM output doesn't match the schema
+- PDF to image conversion pipeline
+- Parallel async processing with concurrency control
+- Production-ready document processing patterns
+
+[View Example →](ocr_parsing/)
+
 ## Quick Start
 
 ### Setup
@@ -107,82 +174,100 @@ cd dynamic_classification
 uv run dynamic_classifier.py
 
 # Bielik local model examples
+# Note: Requires Ollama setup (see bielik_example/README.md)
 cd bielik_example
-uv run python bielik_basic_inference.py
-uv run python bielik_basic_tools.py
+uv run bielik_basic_inference.py
+uv run bielik_basic_tools.py
+
+# History processor - Run individual examples
+cd history_processor
+uv run 1_basic_history_handling.py
+uv run 2_continuous_history.py
+uv run 3_history_usage.py
+uv run 4_history_filtering.py
+uv run 5a_history_length_fixed.py
+uv run 5b_history_length_dynamic.py
+uv run 5c_history_with_tools.py
+uv run 6_persistent_history.py
+
+# OCR Parsing - Run examples in order
+cd ocr_parsing
+uv run 1_basic_ocr_demo.py
+uv run 2_ocr_with_structured_output.py
+uv run 3_ocr_validation.py  # Uncomment validation line in code first
 ```
 
-### 6. History Processor
+## Key Concepts Demonstrated
 
-**Location**: `history_processor/`
+### Agents
 
-Learn how to manage conversation history in AI agents.
+Most examples use PydanticAI's `Agent` class, which wraps an LLM with:
 
-- Basic history handling and inspection
-- Multi-turn conversations with context
-- History persistence (JSON and Database)
-- Advanced filtering and transformation
-- Context window management strategies (fixed, dynamic, and tool-aware)
-- Production-ready database archival
+- System prompts to guide behavior
+- Output type schemas for structured responses
+- Async/await support for concurrent requests
 
-[View Example →](history_processor/)
+### Tools
 
-## Quick Start
+Since these are examples, most of them are intentionally basic. However, it's easy to add a tool to a given agent. Let's look at the **[OCR Parsing](ocr_parsing/)** code.
 
-### Setup - General
+Currently the Agent does all the work itself - it classifies the document, parses the output, performs the OCR, and so on, in the same way for every document. But what if we'd like different behavior based on the document type?
 
-```bash
-# Install dependencies
-uv sync
+```python
+from pydantic_ai import Agent, RunContext
+from my_schemas import MyDeps, OCRInvoiceOutput, ReportOcrOutput
 
-# Set API key
-echo "OPENAI_API_KEY=your-key" > .env
+# The Agent acts as a router, deciding which tool to call
+# based on the document's visual or textual cues.
+agent = Agent(
+    'openai:gpt-5.1',
+    deps_type=MyDeps,
+    system_prompt="Analyze the document and use the appropriate tool for parsing.",
+)
+
+@agent.tool
+async def parse_invoice(ctx: RunContext[MyDeps], data: bytes) -> OCRInvoiceOutput:
+    """Use this tool when the document is identified as an Invoice."""
+    # Your specialized OCR & validation logic here
+    return await ctx.deps.ocr_service.process(data, schema=OCRInvoiceOutput)
+
+@agent.tool
+async def parse_report(ctx: RunContext[MyDeps], data: bytes) -> ReportOcrOutput:
+    """Use this tool when the document is a multi-page Annual Report."""
+    # Custom logic for complex reports
+    return await ctx.deps.ocr_service.process(data, schema=ReportOcrOutput)
 ```
 
-### Run Examples 1-5
+### Structured Outputs
 
-```bash
-# Direct model requests
-cd direct_model_request
-uv run direct_request_demo.py
+Examples show how to enforce type safety using Pydantic `BaseModel`:
 
-# Model parameters
-cd temperature
-uv run temperature_demo.py
+- Basic classification: `Literal` types
+- Dynamic classification: `create_model()` for runtime schemas
+- OCR parsing: Complex nested schemas with validation

-# Reasoning effort
-cd reasoning_effort
-uv run reasoning_demo.py
+### Async Concurrency
 
-# Basic sentiment classifier
-cd basic_sentiment
-uv run sentiment_classifier.py
+Several examples demonstrate async patterns:
 
-# Dynamic classifier
-cd dynamic_classification
-uv run dynamic_classifier.py
-```
+- Parallel processing with `asyncio.gather()`
+- Semaphore-based rate limiting
+- Efficient handling of multiple documents

-### Run Example 6 - History Processor
+### Context & History
 
-```bash
-cd history_processor
+Learn how to manage conversational context:
 
-# Configure environment
-cp .env.example .env
-# Edit .env and add your OpenAI API key (i.e., with `nano`)
-nano .env
-
-# Run individual examples
-uv run python 1_basic_history_handling.py
-uv run python 2_continuous_history.py
-uv run python 3_history_usage.py
-uv run python 4_history_filtering.py
-uv run python 5a_history_length_fixed.py
-uv run python 5b_history_length_dynamic.py
-uv run python 5c_history_with_tools.py
-uv run python 6_persistent_history.py
-```
+- Persistent history storage
+- Token-aware context windowing
+- History filtering and transformation
+
+### Local Models
+
+The Bielik example shows an alternative to cloud APIs:
+
+- Local model serving with Ollama
+- Custom tool integration
+- Same agent patterns as OpenAI models
 
 ## Project Structure
 
@@ -218,13 +303,54 @@ uv run python 6_persistent_history.py
 │ ├── 5c_history_with_tools.py
 │ ├── 6_persistent_history.py
 │ ├── README.md
-│ ├── .env.example
-│ └── pyproject.toml
+│ └── output_3.json
+├── ocr_parsing/
+│ ├── 1_basic_ocr_demo.py
+│ ├── 2_ocr_with_structured_output.py
+│ ├── 3_ocr_validation.py
+│ ├── README.md
+│ ├── files/
+│ │ ├── samples/     # Sample PDF documents
+│ │ ├── temp_files/  # Temporary image files during processing
+│ │ └── results/     # Output JSON files
 ├── pyproject.toml
 └── README.md
 ```
 
+## Common Issues & Troubleshooting
+
+### API Key Issues
+
+- Ensure `OPENAI_API_KEY` is set in `.env`
+- Verify key has appropriate permissions
+- Check for rate limiting (429 errors)
+
+### Import Errors
+
+- Run `uv sync` to install all dependencies
+- Verify you're using Python 3.10+
+
+### Async Issues
+
+- Some examples require async-compatible event loops
+- On Windows, you may need to set the event loop policy
+
+### OCR-Specific Issues
+
+- **poppler not found**: Install via your package manager (brew/apt/choco)
+- **PDF conversion fails**: Ensure the PDF is valid and readable
+- **Rate limiting**: Reduce the semaphore value in `ocr_parsing/shared_fns.py`
+
+See individual example READMEs for specific setup requirements.
+
 ## Resources
 
-- [Pydantic AI Documentation](https://ai.pydantic.dev/)
-- [Python Documentation](https://docs.python.org/)
+- [Python Documentation](https://docs.python.org/3/)
+- [PydanticAI Documentation](https://ai.pydantic.dev/)
+- [Pydantic Documentation](https://docs.pydantic.dev/)
+- [OpenAI API Reference](https://platform.openai.com/docs/api-reference)
+- [Python asyncio Guide](https://docs.python.org/3/library/asyncio.html)
+
+## Contributing
+
+Found an issue or have an improvement? Feel free to contribute to this example repository.
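The three structured-output styles the README above lists (closed `Literal` label sets, `create_model()` runtime schemas, and nested OCR-style schemas) can be sketched with plain Pydantic. The model and label names below are illustrative, not taken from the repository's code:

```python
from typing import Literal
from pydantic import BaseModel, ValidationError, create_model

# Basic classification: constrain the output to a closed label set
class Sentiment(BaseModel):
    label: Literal["positive", "neutral", "negative"]

# Dynamic classification: build the schema at runtime from a label list
labels = ("invoice", "report", "contract")  # e.g. loaded from config
DocType = create_model("DocType", label=(Literal[labels], ...))

# OCR-style nested schema: validation applies recursively
class LineItem(BaseModel):
    description: str
    amount: float

class Invoice(BaseModel):
    number: str
    items: list[LineItem]

print(Sentiment(label="positive").label)   # positive
print(DocType(label="invoice").label)      # invoice
inv = Invoice(number="F-001", items=[{"description": "OCR service", "amount": 99.0}])
print(inv.items[0].amount)                 # 99.0
try:
    Sentiment(label="sarcastic")           # not in the Literal set
except ValidationError:
    print("rejected")
```

When a model like this serves as an agent's output schema, LLM responses that don't validate raise exactly this kind of `ValidationError` — which is what the `3_ocr_validation.py` example demonstrates.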

bielik_example/README.md

Lines changed: 3 additions & 3 deletions
@@ -83,9 +83,9 @@ For the tools example, you'll need a free API key:
 2. Sign up for a free account
 3. Create a `.env` file in this directory:
 
-```
-WEATHER_API_KEY=your_key_here
-```
+```bash
+WEATHER_API_KEY=your_key_here
+```
 
 ## Running the Examples
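The `.env` files these examples rely on are plain `KEY=VALUE` text. A minimal stdlib loader shows what reading one involves (a sketch only — real projects typically use `python-dotenv`, and unlike this version it does not override variables that are already set):

```python
import os
import tempfile

def load_env(path: str) -> None:
    # Minimal .env loader: one KEY=VALUE pair per line, '#' starts a comment
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip()

# Demo: write a throwaway .env and load it
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("WEATHER_API_KEY=your_key_here\n")
    env_path = fh.name

load_env(env_path)
print(os.environ["WEATHER_API_KEY"])  # your_key_here
```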

history_processor/1_basic_history_handling.py

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@
 def main() -> None:
     """Run basic history inspection example."""
     # Create a basic agent
-    agent = Agent(model="openai:gpt-4o", system_prompt="Be a helpful assistant")
+    agent = Agent(model="openai:gpt-5.1", system_prompt="Be a helpful assistant")
 
     # Run a single inference
     prompt = "Tell me a funny joke. Respond in plain text."

history_processor/2_continuous_history.py

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@
 def main() -> None:
     """Run multi-turn conversation example."""
     # Create agent
-    agent = Agent(model="openai:gpt-4o", system_prompt="Be a helpful assistant")
+    agent = Agent(model="openai:gpt-5.1", system_prompt="Be a helpful assistant")
 
     # First turn: Agent generates a joke
     prompt_1 = "Provide a really, really funny joke. Respond in plain text."

history_processor/3_history_usage.py

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@
 def main() -> None:
     """Run multi-turn conversation with persistence example."""
     # Create agent
-    agent = Agent(model="openai:gpt-4o", system_prompt="Be a helpful assistant")
+    agent = Agent(model="openai:gpt-5.1", system_prompt="Be a helpful assistant")
 
     # Turn 1: Get initial motto
     log.info("\n=== Turn 1 ===")

history_processor/4_history_filtering.py

Lines changed: 2 additions & 2 deletions
@@ -59,13 +59,13 @@ def main() -> None:
 
     # Example 1: Summarize only user messages
     log.info("\n=== Filtering: User Messages Only ===")
-    agent_user = Agent("openai:gpt-4o", history_processors=[user_message_filter])
+    agent_user = Agent("openai:gpt-5.1", history_processors=[user_message_filter])
     result_1 = agent_user.run_sync("Please summarize the whole chat history until now.", message_history=history)
     log.info(f"Summary (user messages only):\n{result_1.output}")
 
     # Example 2: Attempt to filter only model messages (will fail)
     log.info("\n=== Filtering: Model Messages Only ===")
-    agent_model = Agent("openai:gpt-4o", history_processors=[model_message_filter])
+    agent_model = Agent("openai:gpt-5.1", history_processors=[model_message_filter])
     try:
         result_2 = agent_model.run_sync("Please summarize the whole chat history until now.", message_history=history)
         log.info(f"Summary (model messages only):\n{result_2.output}")
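A history processor like the `user_message_filter` passed to `history_processors` above is, at its core, just a function from a message list to a message list. A minimal stand-in using plain dicts (the real code operates on PydanticAI `ModelMessage` objects):

```python
def user_message_filter(messages: list[dict]) -> list[dict]:
    """Keep only user-authored messages before they reach the model."""
    return [m for m in messages if m["role"] == "user"]

history = [
    {"role": "user", "content": "Tell me a joke."},
    {"role": "assistant", "content": "Why did the duck cross the road?"},
    {"role": "user", "content": "Why?"},
]
filtered = user_message_filter(history)
print(len(filtered))             # 2
print(filtered[1]["content"])    # Why?
```

The inverse filter fails at run time, as Example 2 in the diff shows: stripping all user messages leaves a history the model API rejects.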

history_processor/5a_history_length_fixed.py

Lines changed: 1 addition & 1 deletion
@@ -51,7 +51,7 @@ def main() -> None:
 
     # Create agent with message count limiter
     log.info("\n=== Agent with Fixed Message Limit (last 3) ===")
-    agent_1 = Agent("openai:gpt-4o", history_processors=[keep_last_messages])
+    agent_1 = Agent("openai:gpt-5.1", history_processors=[keep_last_messages])
     result_1 = agent_1.run_sync("What were we talking about?", message_history=history)
     log.info(f"Answer (with truncated history):\n{result_1.output}")
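The `keep_last_messages` processor used above implements the fixed-window strategy: discard everything but the N most recent messages. A minimal stand-in (the real code receives `ModelMessage` objects; plain strings suffice to show the windowing logic):

```python
def keep_last_messages(messages: list[str], limit: int = 3) -> list[str]:
    # Fixed window: the model only ever sees the `limit` most recent messages
    return messages[-limit:]

history = [f"message {i}" for i in range(10)]
trimmed = keep_last_messages(history)
print(trimmed)  # ['message 7', 'message 8', 'message 9']
```

Fixed windows are cheap and predictable, but blind to message size — which is what the token-based dynamic strategy in `5b_history_length_dynamic.py` addresses.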

history_processor/5b_history_length_dynamic.py

Lines changed: 3 additions & 3 deletions
@@ -23,7 +23,7 @@
 # `tiktoken` is used for OpenAI models, therefore if you're going to
 # use a different model provider, this bit will need to be changed
 # to the tokenizer corresponding to the model used
-tokenizer = tiktoken.encoding_for_model("gpt-4o")
+tokenizer = tiktoken.encoding_for_model("gpt-5.1")
 
 
 @dataclass
@@ -58,7 +58,7 @@ def estimate_tokens(messages: list[ModelMessage]) -> int:
 # of this example, threshold is set low for the logic to trigger. Usually,
 # this value is much bigger and corresponds to the used model's context
 # window size. To fully utilize model processing capabilities it is best to
-# set this value close to context size. For `gpt-4o` model this value is
+# set this value close to context size. For `gpt-5.1` model this value is
 # equal to 128_000 tokens
 
 
@@ -100,7 +100,7 @@ def main() -> None:
 
     log.info("\n=== Agent with Dynamic Token-Based Context Guard ===")
     agent_2 = Agent(
-        "openai:gpt-4o",
+        "openai:gpt-5.1",
         deps_type=MemoryState,
         history_processors=[context_guard],
         system_prompt="You are a helpful and concise assistant.",
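The `context_guard` wired into `agent_2` above combines the pieces this file defines: estimate the history's token count, then drop the oldest messages until it fits a threshold. A stdlib-only sketch of that loop — a whitespace word count stands in for the `tiktoken` encoding, and the threshold is deliberately tiny, mirroring the file's own comment that production values approach the model's 128_000-token context window:

```python
TOKEN_THRESHOLD = 200  # illustrative; real values approach the context window size

def estimate_tokens(messages: list[str]) -> int:
    # Crude stand-in for a tiktoken-based estimate: count whitespace-separated words
    return sum(len(m.split()) for m in messages)

def context_guard(messages: list[str]) -> list[str]:
    # Drop the oldest messages until the estimate fits under the threshold
    trimmed = list(messages)
    while len(trimmed) > 1 and estimate_tokens(trimmed) > TOKEN_THRESHOLD:
        trimmed.pop(0)
    return trimmed

history = [f"message number {i} with a little extra padding text" for i in range(50)]
kept = context_guard(history)
print(estimate_tokens(kept) <= TOKEN_THRESHOLD)  # True
print(len(kept) < len(history))                  # True
```

Unlike the fixed window in `5a`, this guard adapts to message length: a few long messages trim the history as aggressively as many short ones.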
