@oliverb123
Contributor

This change:

  • Makes distinct_id an optional argument to most event-capturing functions
  • Changes the order of arguments to most event-capturing functions
  • Uses kwargs to make it impossible to upgrade to 6.0.0 without modifying how you call these functions
  • Changes these functions to return the UUID of the captured event, rather than the event contents
  • Deletes identify, page and screen - identify is set with a misleading name, page and screen are just capture
  • Deletes unmaintained exception-capture integrations (users should prefer the general purpose django middleware, and we should prefer to build general purpose integrations)
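As a rough illustration of what the signature change means for callers, here is a minimal sketch. The `capture` function below is a hypothetical stand-in written for this example, not the SDK's own code; it only mimics the shape described above (event name first, everything else by keyword, event UUID returned).

```python
import uuid
from typing import Any, Dict, Optional


def capture(
    event: str,
    *,
    distinct_id: Optional[str] = None,
    properties: Optional[Dict[str, Any]] = None,
) -> Optional[str]:
    """Hypothetical stand-in: event first, all other arguments keyword-only."""
    event_uuid = str(uuid.uuid4())
    # A real client would enqueue the event here; this sketch only returns its ID.
    return event_uuid


# New-style call: distinct_id is optional and passed by keyword.
event_id = capture("test_event", distinct_id="user1", properties={"original": "value"})

# Old-style positional call (distinct_id first) fails loudly instead of
# silently sending an event named "user1".
try:
    capture("user1", "test_event", {"original": "value"})
    old_call_rejected = False
except TypeError:
    old_call_rejected = True
```

The keyword-only arguments are what make a silent upgrade impossible: an old positional call raises a `TypeError` rather than swapping the event name and the distinct ID.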

Generally, the goal of this PR is to move the Python SDK towards a place where we can use it as the "template" for all server-side SDKs: focusing on semantic, scope-based state management, simplifying interfaces, and exposing fewer of our libraries' internals, which reduces the cost of future refactoring and improvements for our users. It's the culmination of the previous PRs I've made re: nested contexts, context-based identity and session management, etc.

Contributor

@greptile-apps greptile-apps bot left a comment

PR Summary

Major overhaul of the PostHog Python SDK for version 6.0.0, focusing on simplifying the API and improving maintainability through scope-based state management.

  • Removal of identify, page, and screen functions in favor of more semantically accurate alternatives (set and capture)
  • Changed all event-capturing functions to return event UUIDs instead of event contents
  • Made distinct_id optional and moved to kwargs in event-capturing functions
  • Removed unmaintained exception-capture integrations in favor of general-purpose Django middleware
  • Enforced breaking changes through kwargs-only parameters to ensure intentional upgrades

20 files reviewed, 6 comments

oliverb123 and others added 5 commits June 24, 2025 13:47
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@oliverb123
Contributor Author

Tagged recent reviewers on this code, as well as ET.

Contributor

Might be worth adding a migration guide somewhere telling people the main things needed to migrate from v4 / v5 to v6

This release contains a number of major breaking changes:
- feat: make distinct_id an optional parameter in posthog.capture and related functions
- feat: make capture and related functions return `Optional[str]`, which is the UUID of the sent event, if it was sent
- fix: remove `identify` (prefer `posthog.set()`), and `page` and `screen` (prefer `posthog.capture()`)
Contributor

What's the thinking behind removing these methods?

  • I get identify is pretty pointless because it doesn't actually persist across captures in the same way it does for the JS Web SDK. Can't you set the user id with contexts now? I'm guessing that's a better way than setting the distinct_id on every capture
  • Worth keeping the page method so that people don't have to familiarize themselves with the $pageview property? Screen I'm less concerned with because it doesn't really make sense on the web and is only used in the mobile SDKs afaik

Contributor Author

Yeah, identify is removed because it's semantically confusing - we should push people towards identifying at the context level always.

Re: page, my impression is pageview is generally captured by the frontend?

Member

@lricoy lricoy Jun 24, 2025

Several people prefer backend-only instrumentation, either through a framework integration for a non-SPA website (e.g. only Django/Flask views, not using our utilities) or by doing it all themselves on their backend.

I think that using context + capture is enough for them; it is also a more focused API, so I personally prefer it.
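To make the "context + capture" idea concrete, here is a minimal sketch of context-level identity built on `contextvars`. The helpers `new_context`, `identify_context` and `capture` are simplified stand-ins written for this example (their behavior here is an assumption, not the SDK's implementation), and the `/pricing` path is made up.

```python
import contextvars
import uuid
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List, Optional

# Holds the distinct ID for whatever scope is currently active.
_distinct_id = contextvars.ContextVar("distinct_id", default=None)

# Stand-in for the client's outgoing queue, so the sketch is observable.
sent_events: List[Dict[str, Any]] = []


@contextmanager
def new_context() -> Iterator[None]:
    """Open a scope; identity set inside it is discarded on exit."""
    token = _distinct_id.set(_distinct_id.get())
    try:
        yield
    finally:
        _distinct_id.reset(token)


def identify_context(distinct_id: str) -> None:
    """Attach a distinct ID to the current scope."""
    _distinct_id.set(distinct_id)


def capture(
    event: str,
    *,
    distinct_id: Optional[str] = None,
    properties: Optional[Dict[str, Any]] = None,
) -> Optional[str]:
    """Use the explicit distinct_id if given, else the one from the scope."""
    event_uuid = str(uuid.uuid4())
    sent_events.append(
        {
            "uuid": event_uuid,
            "event": event,
            "distinct_id": distinct_id or _distinct_id.get(),
            "properties": properties or {},
        }
    )
    return event_uuid


# A backend-only pageview: identify once per request, then capture.
with new_context():
    identify_context("user1")
    capture("$pageview", properties={"$current_url": "/pricing"})

# Outside the scope, the identity no longer applies.
capture("outside_context")
```

With this shape, a dedicated `page` helper adds little: a pageview is a `capture` call with the `$pageview` event name inside an identified scope.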

Comment on lines -85 to -88
"ip": headers.get("X-Forwarded-For"),
"user_agent": headers.get("User-Agent"),
"traceparent": traceparent,
"$request_path": self.request.path,
Contributor

Same with ip, user agent and request path. Should we be adding them to the Django integration or are they there already?

Contributor Author

The request path is set as $current_url. IP, user_agent and traceparent aren't set by the middleware currently. Re: should we be adding them, I guess probably? I can do so in a follow-up PR.
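For reference, a follow-up along those lines might look roughly like the sketch below. The `request_tags` helper is hypothetical, and reading `traceparent` straight from the headers is an assumption; the property names are taken from the removed lines quoted above and from `$current_url`.

```python
from typing import Dict, Mapping, Optional


def request_tags(headers: Mapping[str, str], path: str) -> Dict[str, Optional[str]]:
    """Hypothetical helper: collect request metadata to tag onto events."""
    return {
        "$current_url": path,
        # Raw header value, as in the removed integration; may list several hops.
        "ip": headers.get("X-Forwarded-For"),
        "user_agent": headers.get("User-Agent"),
        # Assumes the W3C trace context header is passed through unchanged.
        "traceparent": headers.get("traceparent"),
    }


tags = request_tags(
    {"X-Forwarded-For": "203.0.113.7", "User-Agent": "curl/8.0"},
    "/checkout",
)
```

A plain `dict` is case-sensitive, whereas framework header objects usually are not, so a real integration would rely on the framework's own header lookup.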

Member

@lricoy lricoy left a comment

LGTM and I like the direction 😄

I tried the bits I am most aware of locally and they are working nicely.

Just left some comments for context.

Contributor

@hpouillot hpouillot left a comment

I added a few comments, but it looks good to me. There is also a comment saying the proxy methods have not been used in 5-6 months, and I think your capture signature will not be used with the recommended way of using posthog. Might be a good opportunity to clean them up?

before_send=my_before_send,
sync_mode=True,
)
msg_uuid = client.capture("user1", "test_event", {"original": "value"})
Contributor

Shouldn't we change this capture call?

Contributor Author

Change it to what? The client object itself already supported optional distinct IDs (it already being a keyword arg), so only the proxy functions needed to change.

Contributor

Sorry, I got confused: the proxy method and the client method don't have the same signature. The proxy capture method accepts event as its first argument while the client method takes distinct_id?

Contributor Author

Switched to using Unpack[TypedDict] based kwargs everywhere it seemed to make sense
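For readers unfamiliar with the pattern, here is a minimal sketch of `Unpack[TypedDict]`-typed kwargs. `CaptureArgs` and its two fields are illustrative assumptions, not the SDK's actual definition. `Unpack` is imported only for type checkers, so the snippet runs on interpreters that lack it in `typing`.

```python
from __future__ import annotations

import uuid
from typing import TYPE_CHECKING, Any, Dict, List, Optional, TypedDict

if TYPE_CHECKING:
    # Only type checkers need Unpack; annotations are not evaluated at runtime.
    from typing_extensions import Unpack


class CaptureArgs(TypedDict, total=False):
    """Illustrative optional keyword arguments (hypothetical field set)."""

    distinct_id: str
    properties: Dict[str, Any]


recorded: List[Dict[str, Any]] = []


def capture(event: str, **kwargs: Unpack[CaptureArgs]) -> Optional[str]:
    """Event name is positional; everything else must be passed by keyword."""
    event_uuid = str(uuid.uuid4())
    recorded.append(
        {
            "uuid": event_uuid,
            "event": event,
            "distinct_id": kwargs.get("distinct_id"),
            "properties": kwargs.get("properties", {}),
        }
    )
    return event_uuid


new_id = capture("test_event", distinct_id="user1", properties={"original": "value"})

# A second positional argument is rejected at runtime, and a type checker
# would also flag unknown or mistyped keywords against CaptureArgs.
try:
    capture("user1", "test_event")
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

One benefit of this approach is that the proxy function and the client method can share a single `TypedDict`, so their accepted keywords cannot drift apart.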


self.capture_exception_fn(sys.exc_info(), extra_props)

signals.got_request_exception.connect(_got_request_exception)
Contributor

Are we sure the exceptions this previously caught are handled by the try / catch in the new_context scope?

Contributor Author

The right way to intercept exceptions on requests is using the middleware, not django signals. If someone has the middleware in their middleware stack, and hasn't disabled exception capture in it, it'll catch their exceptions.

Contributor Author

Our regular autocapture still works the same way, to be clear
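The general idea of catching exceptions at the scope level, rather than through a framework signal, can be sketched as follows. This is a simplified stand-in written for this thread: the `capture_exceptions` flag, the `captured` list and `handle_request` are assumptions for illustration, not the middleware's actual code.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List

# Stand-in for "exception events sent to the backend".
captured: List[BaseException] = []


@contextmanager
def new_context(capture_exceptions: bool = True) -> Iterator[None]:
    """Record any exception escaping the scope, then re-raise it unchanged."""
    try:
        yield
    except Exception as exc:
        if capture_exceptions:
            captured.append(exc)
        raise


def handle_request(view: Callable[[], str]) -> str:
    """Hypothetical request handler: run the view inside a fresh scope."""
    with new_context():
        return view()


def failing_view() -> str:
    raise ValueError("boom")


try:
    handle_request(failing_view)
    propagated = False
except ValueError:
    propagated = True
```

Because the exception is re-raised, the framework's normal error handling still runs; the scope only observes the failure on its way out.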

@oliverb123 oliverb123 requested review from daibhin and hpouillot June 26, 2025 13:07
Contributor

@daibhin daibhin left a comment

Looks reasonable. I assume you've tested it and are happy. Just a changelog / section in the readme about the upgrade steps would be great

"""
Add a tag to the current context.
Add a tag to the current context. All tags are added as properties to any event, including exceptions, captured
within the context.
Contributor

👌

@oliverb123 oliverb123 merged commit 37bd301 into master Jun 27, 2025
7 checks passed
@oliverb123 oliverb123 deleted the err/fix-arguments branch June 27, 2025 11:57
5 participants