
Conversation

@Radu-Raicea (Member) commented Aug 29, 2025

There are two things happening in this PR:

  1. Multiple fixes related to streaming providers and tool calls.
  2. A significant refactoring that applies DRY principles by introducing a converter pattern for the providers (sketched below). The refactoring is not perfect, but it is a big step in the right direction, and we should keep improving on it.
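
For context, the converter idea could take roughly this shape. This is a minimal sketch, not the PR's actual code; the class name and method signatures are illustrative assumptions.

```python
from typing import Any, Protocol


class ProviderConverter(Protocol):
    """Illustrative interface: each provider supplies one converter that
    normalizes its native payloads into the common shapes LLMA captures."""

    def format_input(self, messages: Any) -> list[dict]:
        """Convert provider-native input messages into a common format."""
        ...

    def format_output(self, response: Any) -> list[dict]:
        """Convert a provider response, including any tool calls, into
        the common output-choices format."""
        ...
```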

The fixes are the following:

  • Anthropic's streaming implementation was missing $ai_tools
  • Anthropic's streaming implementation was missing tool calls in $ai_output_choices
  • Gemini's streaming and non-streaming implementations were not sending text messages in a correctly parsable format
  • Gemini's streaming implementation was missing $ai_tools
  • Gemini's streaming implementation was missing tool calls in $ai_output_choices
  • OpenAI Chat Completions' streaming implementation was missing tool calls in $ai_output_choices
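
To make the fixes concrete, a streamed tool-calling generation should now be captured with both properties populated. The sketch below shows the intended event shape; the property names are the ones listed above, but the values and exact nesting are illustrative assumptions, not the PR's actual output.

```python
properties = {
    "$ai_provider": "anthropic",
    "$ai_model": "claude-sonnet-4-20250514",  # illustrative model name
    # Tools offered to the model -- previously missing from streaming captures:
    "$ai_tools": [
        {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "input_schema": {"type": "object", "properties": {"city": {"type": "string"}}},
        }
    ],
    # Output choices must include the tool call, not just the text delta:
    "$ai_output_choices": [
        {
            "role": "assistant",
            "content": [
                {"type": "text", "text": "Let me check the weather."},
                {
                    "type": "function",
                    "function": {"name": "get_weather", "arguments": {"city": "Montreal"}},
                },
            ],
        }
    ],
}
```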

@Radu-Raicea Radu-Raicea changed the title from "Fix/llma streaming providers with tool calls" to "fix(llma): streaming providers with tool calls" on Sep 2, 2025
Resolved conflicts by:
- Keeping both StreamingEventData approach and new sanitization imports
- Applying sanitization to formatted inputs before passing to StreamingEventData
- Ensuring privacy mode and special token fields are handled by capture_streaming_event
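
A rough illustration of that flow: StreamingEventData and capture_streaming_event are named in the conflict-resolution note above, but every field and the sanitize helper below are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Any


def sanitize(value: Any) -> Any:
    """Stand-in for the sanitization helpers this PR imports."""
    return value  # the real helpers strip sensitive/non-serializable data


@dataclass
class StreamingEventData:  # name from the note above; fields are assumed
    provider: str
    model: str
    formatted_input: Any


# Sanitization is applied to the formatted input *before* it is passed
# to StreamingEventData, as the note describes.
event_data = StreamingEventData(
    provider="openai",
    model="gpt-4o",
    formatted_input=sanitize([{"role": "user", "content": "Hi"}]),
)
```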
@Radu-Raicea Radu-Raicea marked this pull request as ready for review September 2, 2025 17:55
@Radu-Raicea Radu-Raicea requested a review from a team September 2, 2025 17:56
@greptile-apps (bot) left a comment

17 files reviewed, 3 comments

@carlos-marchal-ph (Contributor) left a comment

I really like the direction of this PR. One of my first thoughts when I started working on this repo was that the code was more complex and repetitive than it probably needed to be. This PR starts addressing that in a reasonable way.

The PR itself looks good to me, but when testing it locally I ran into some issues with OpenAI. I tried reproducing them in master, and the first one is also there:

  1. OpenAI Responses streaming reports 0 input/output tokens and is missing the assistant response after a tool call.
[Screenshots attached]
  2. OpenAI Chat Completions is missing the assistant response after a tool call.
[Screenshots attached]

Other than that, I have some more comments on the direction we want to head in; I'll add them as a separate comment so we can discuss. They are non-blocking, as they probably belong in separate PRs.

@carlos-marchal-ph (Contributor) commented

In terms of strategy moving forward, I still think there's quite a bit of room for improvement. I don't think it belongs in this PR, but I'm just writing it out to sync on it.

There is still quite a bit of unnecessary code repetition in the repo. I think some of these utilities could probably be shared with the non-streaming implementation. The data types are the first that come to mind, but I'm sure there are many transformations that could be reused.

There's also still some code that's repeated in the sync and async implementations, such as the core event listening loop. Maybe we would benefit from having a class handling the entire event loop, which we could reuse across implementations, and which could hold some data that we are currently passing around every time.
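
Purely as an illustration of that idea (nothing here comes from the PR; the class, method names, and the converter's parse_chunk contract are all hypothetical), such a shared event loop might look like:

```python
from typing import Any, Callable, Iterable


class StreamEventLoop:
    """Hypothetical shared loop: accumulates streamed chunks and holds the
    state that is currently threaded through each provider's generators."""

    def __init__(self, converter: Any) -> None:
        self.converter = converter  # provider-specific chunk parser
        self.text_parts: list[str] = []
        self.tool_calls: list[dict] = []

    def handle_chunk(self, chunk: Any) -> None:
        # Delegate provider-specific parsing to the converter; assume it
        # returns (text_delta, tool_calls) for every chunk.
        text, tools = self.converter.parse_chunk(chunk)
        if text:
            self.text_parts.append(text)
        self.tool_calls.extend(tools)

    def run_sync(self, stream: Iterable[Any], on_done: Callable[[dict], None]) -> None:
        for chunk in stream:
            self.handle_chunk(chunk)
        on_done(self.result())

    async def run_async(self, stream: Any, on_done: Callable[[dict], None]) -> None:
        async for chunk in stream:
            self.handle_chunk(chunk)
        on_done(self.result())

    def result(self) -> dict:
        return {"text": "".join(self.text_parts), "tool_calls": self.tool_calls}
```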

More generally, I think some of these utilities should probably be decomposed into smaller classes in smaller files, with a clearer separation of concerns. The current approach of exporting a bunch of functions from a single file is not helping with readability or testability.

Again, these are all unordered thoughts, which I'm sure you've also had at some point. We can tackle this in other PRs down the road as we work on other stuff. Just wanted to write this down to see if you agree or if you think otherwise on some of these points.

@Radu-Raicea (Member, Author) commented

> In terms of strategy moving forward, I still think there's quite a bit of room for improvement. [...] Just wanted to write this down to see if you agree or if you think otherwise on some of these points.

Completely agreed! If you have an opportunity to take steps in those directions, like I did in this PR, I will gladly review those PRs :D

@Radu-Raicea (Member, Author) commented

@carlos-marchal-ph

The LLM returns the tool call, but the handling of the tool call (and therefore the response with the weather data) is not sent to LLMA unless you create a span event or add it to the input of the next LLM call.
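
For example, in OpenAI-style message terms (illustrative values; this is the standard chat-message flow, not code from the PR), the tool result becomes visible to LLMA once it is part of the next call's input:

```python
# After the model returns a tool call, the application executes the tool
# and feeds the result back as input to the next LLM call; that follow-up
# call's input is what makes the tool handling visible to LLMA.
messages = [
    {"role": "user", "content": "What's the weather in Montreal?"},
    {
        "role": "assistant",
        "tool_calls": [
            {
                "id": "call_123",
                "type": "function",
                "function": {"name": "get_weather", "arguments": '{"city": "Montreal"}'},
            }
        ],
    },
    # The tool's result, which would otherwise never reach LLMA:
    {"role": "tool", "tool_call_id": "call_123", "content": '{"temp_c": 21}'},
]
# next_response = client.chat.completions.create(model="gpt-4o", messages=messages)
```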

@carlos-marchal-ph (Contributor) left a comment

Ah gotcha, missing context on my end then. Any idea about the 0 input / 0 output tokens thing? In any case, since it's happening on master too, it's probably unrelated to this PR, so approving 👏

@Radu-Raicea (Member, Author) commented

Fixed the Responses API streaming token counts, nice catch!

@Radu-Raicea Radu-Raicea enabled auto-merge (squash) September 3, 2025 18:31
@Radu-Raicea Radu-Raicea merged commit 08b11cb into master Sep 3, 2025
10 checks passed
@Radu-Raicea Radu-Raicea deleted the fix/llma-streaming-providers-with-tool-calls branch September 3, 2025 20:02