Conversation

@Twixes (Member) commented Nov 13, 2025

Changes

An early draft. HN, let's face off.

vercel bot commented Nov 13, 2025

| Project | Deployment | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| posthog | Ready | Preview | Nov 25, 2025 8:53pm |

@cleo-pleurodon (Contributor) commented:

Related art request

@Twixes Twixes force-pushed the draft-intelligence branch from 8417ccd to fd95efe Compare November 21, 2025 23:43
@Twixes Twixes changed the title Draft AI launch post PostHog AI launch post Nov 21, 2025
@cleo-pleurodon (Contributor) commented Nov 23, 2025

Unfiltered feedback as requested @Twixes

  1. Who is this post for? Why should they care what you're excited about? I feel the passion in your writing, but right now it reads like pure marketing.

  2. Too much setup. IMO you can cut the entire “software has eaten the world” preamble, and cut the 'knowledge curse' and 'missing link' sections by 60%. This sentence is really good:

The information age has been great – it's just also been too much. Too much information captured: business events, clicks, pageviews, feedback, entities, errors, entire user sessions.

The reader already knows the rest of what you presented just by being alive in the modern business world. Tie it to the visual metaphor of knitting data noodles into structured, useful outputs, and that's all you need to set up the meat of the post.

Example: "Product data is usually a bowl of noodles — long, tangled strands of events, sessions, errors and clicks. Humans have to spend hours untangling them just to understand the plot. PostHog AI pulls the strands apart, then knits them into a coherent chain you can act on immediately."

  3. If this is meant to be more than a marketing piece, tell the reader something they can't get from the docs or our website. For example, what was the dev timeline of events like?

1. ReAct gave us the architecture
2. LangChain built the infrastructure
3. GPT-4 provided the intelligence
4. AutoGPT proved it worked
5. Function Calling made it native
6. MCP standardized everything

  • Where does PostHog AI fit into this timeline?
  • What did we learn in beta?
  • Where did things break, go wrong, get delayed?
  • Did you change your opinion about anything as you built it?
  • WHY NOW? Why is PostHog AI ready to come out of beta?
  4. Tell me about the gold nuggets of AI at PostHog. Use this as a CTA for AI engineers to join our team.
  • Tool gating
  • Anchored state
  • Page-aware agent state
  • Self-correction loops
  • What else is benchmark vs cutting edge?


## How it works

Our number one learning: subagents are dangerous, as full context makes all the difference. We're rolling out PostHog AI as a single-loop agent that instead switches modes depending on the set of tools needed for the moment.
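The single-loop, mode-switching shape described here can be sketched in a few lines. Everything below (the `MODES` registry, the `switch_mode` tool, the scripted stand-in model) is an illustrative assumption, not PostHog AI's actual implementation:

```python
from dataclasses import dataclass
from typing import Optional

# Minimal single-loop agent: one loop, one shared message history.
# "Modes" only change which tools are exposed to the model, so full
# context is never lost at a subagent boundary. All names are invented.

MODES = {
    "analysis": ["run_query", "switch_mode"],
    "replay": ["search_recordings", "switch_mode"],
}

@dataclass
class ToolCall:
    name: str
    args: dict

@dataclass
class Step:
    tool_call: Optional[ToolCall] = None
    text: str = ""

def run_agent(llm_step, user_message, execute_tool):
    mode = "analysis"
    messages = [{"role": "user", "content": user_message}]
    while True:
        step = llm_step(messages, tools=MODES[mode])
        if step.tool_call is None:
            return step.text  # final answer ends the loop
        if step.tool_call.name == "switch_mode":
            mode = step.tool_call.args["mode"]  # swap tools, keep history
            result = f"switched to {mode}"
        else:
            result = execute_tool(step.tool_call)
        # Tool results feed straight back into the same context window.
        messages.append({"role": "tool", "content": result})

# Scripted stand-in for a real model, just to show the control flow.
script = iter([
    Step(tool_call=ToolCall("run_query", {"q": "pageviews"})),
    Step(tool_call=ToolCall("switch_mode", {"mode": "replay"})),
    Step(text="Here's what I found."),
])

answer = run_agent(
    llm_step=lambda messages, tools: next(script),
    user_message="Why did signups drop?",
    execute_tool=lambda call: f"ran {call.name}",
)
print(answer)  # Here's what I found.
```

The point of the sketch is the control flow: switching modes is just another tool call inside the same loop, so nothing has to be handed off between agents.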
Contributor:

This 'how it works' section is the part I want most as a reader.
Add the real timeline and scars.

Example:
"v1: we tried specialized subagents (analysis, SQL, experiment-setup → Succeeded in isolation, failed in stitching"
“Task completion went from 23% → 81% after switching to single-loop.”
"When Anthropic released X model, it changed our workflow because..."


The two model-related step changes in our implementation were:

- introduction of cost-effective reasoning with o4-mini – this significantly improved and simplified creation of complex queries, especially those requiring data exploration to get right. ReAct, begone!
@cleo-pleurodon (Contributor), Nov 24, 2025:

introduction of cost-effective reasoning with OpenAI's o4-mini


![](https://res.cloudinary.com/dmukukwp6/image/upload/Screenshot_2025_11_24_at_21_22_20_ddf57a0dd4.png)

The truth is, graphs are a terrible way of orchestrating free-form work. In a graph, the LLM can't self-correct and context is all too easily lost. Today, thanks to the advances in model capabilities, the PostHog AI architecture is oddly straightforward:
Contributor:

oddly pleasingly straightforward


The elegance of LiteLLM or Vercel AI crumbles when AI providers add new models or new features. Just check out the mess that web search results are, as OpenAI and Anthropic format them completely differently, while the frameworks try to maintain one facade.
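To make the facade problem concrete, here is a toy adapter. Both payload shapes below are invented simplifications for illustration, not OpenAI's or Anthropic's real schemas:

```python
# Why one facade over many providers is fragile: each provider returns
# web search results in its own shape, so the adapter must grow a new
# branch for every provider and every feature. Shapes are invented.

def normalize_search_results(provider: str, payload: dict) -> list[dict]:
    if provider == "openai":
        return [
            {"url": r["url"], "title": r["title"]}
            for r in payload["annotations"]
        ]
    if provider == "anthropic":
        return [
            {"url": r["source"], "title": r["page_title"]}
            for r in payload["web_search_results"]
        ]
    raise ValueError(f"no adapter for {provider}")  # new provider, new code

print(normalize_search_results(
    "anthropic",
    {"web_search_results": [
        {"source": "https://posthog.com", "page_title": "PostHog"},
    ]},
))  # [{'url': 'https://posthog.com', 'title': 'PostHog'}]
```

Every new model feature forces a change in every branch, which is where the single-facade elegance breaks down.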

But never, _ever_ store your state with an opinionated framework. Roll your own state. The orchestration layer of a framework like LangGraph – fine. Perhaps it's [obsolete now](#agents-beat-workflows), but it's just a way of calling functions. The state management layer of LangGraph – little-death that brings total obliteration. A custom format that relies on Python's pickling format is the mind-killer. These sorts of frameworks are fun at first, then a pain to migrate from.
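A minimal sketch of the "roll your own state" alternative, assuming a plain JSON schema with an explicit version field (all field names here are invented):

```python
import json
from dataclasses import dataclass, field, asdict

# Owning your state means: plain data, an explicit schema version, and
# a format (JSON) that survives library upgrades. Pickled framework
# internals break the moment the framework's classes change shape.

STATE_VERSION = 2

@dataclass
class AgentState:
    version: int = STATE_VERSION
    mode: str = "analysis"
    messages: list = field(default_factory=list)

def dump_state(state: AgentState) -> str:
    return json.dumps(asdict(state))

def load_state(raw: str) -> AgentState:
    data = json.loads(raw)
    if data.get("version", 1) < STATE_VERSION:
        data.setdefault("mode", "analysis")  # explicit, testable migration
        data["version"] = STATE_VERSION
    return AgentState(**data)

old = '{"version": 1, "messages": [{"role": "user", "content": "hi"}]}'
state = load_state(old)
print(state.mode, state.version)  # analysis 2
```

Because the schema is yours, migrations are ordinary code you can test, rather than a fight with whatever a framework happened to pickle.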
Contributor:

I get a little lost in this paragraph. Can you explain a bit further for the AI newcomer reader like me?

Contributor:

Same, I think this paragraph either explains more technically what is happening with serde and pickling, or it will be too obscure to a reader who doesn't know the internals of LangGraph.

Member Author:

Refocused this paragraph on the orchestration aspect of LangGraph, as the serde details seemed a bit too specific to go into here.

@kappa90 (Contributor) left a comment:

Awesome! Just a few comments


The truth is, graphs are a terrible way of orchestrating free-form work. In a graph, the LLM can't self-correct and context is all too easily lost. Today, thanks to the advances in model capabilities, the PostHog AI architecture is oddly straightforward:

![](https://res.cloudinary.com/dmukukwp6/image/upload/Screenshot_2025_11_24_at_21_23_25_edeef723b9.png)
Contributor:

The Switch Mode tool will be confusing, because I think we don't explain it anywhere, and it is not an industry standard.

Member Author:

Added an aside that it's our not-so-secret sauce we'll talk more about in an upcoming post, similar to tool search!



## 6. Evals are not nearly all you need

[Some](https://x.com/gdb/status/1733553161884127435) say evals can be all you need. For foundation models: certainly, you should curate your datasets. For agents: you should curate datasets too! We've found evals important for making meaningful changes with base confidence.
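A minimal sketch of the kind of curated-dataset eval this refers to. The cases, the checker, and the echo "agent" are all invented for illustration:

```python
# Tiny eval harness: run an agent over a curated dataset and report a
# pass rate, so a change can be compared against a baseline run.

def contains_expected(output: str, expected: str) -> bool:
    return expected.lower() in output.lower()

def run_evals(agent, cases):
    passed = sum(
        contains_expected(agent(case["input"]), case["expected"])
        for case in cases
    )
    return passed / len(cases)

cases = [
    {"input": "weekly signups trend", "expected": "trend"},
    {"input": "top pages by views", "expected": "pageview"},
]

# A stand-in "agent" that just echoes; a real run would call the model.
score = run_evals(lambda prompt: f"insight: {prompt} (pageview data)", cases)
print(score)  # 1.0
```

Even a harness this small gives a number to watch before and after a change, which is the "base confidence" part; it says nothing about the qualities a pass rate can't capture.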
Contributor:

This reads like this:

  • Some say evals are all you need (hinting that it's not this way)
  • This is certainly true for foundational models (thesis)
  • and it's true for agents too (missed antithesis, here I would have expected you to say: "but for agents, it's a bit more nuanced", and then move to the next paragraph where you explain the synthesis)

Member Author:

Good point, the thesis wasn't clear. I hope it's clarified now

@tatoalo (Contributor) left a comment:

A few comments but looks great, thanks a ton Michael!

@sortafreel (Contributor) left a comment:

LGTM, added a couple of comments, and I think it would add value to have a few more screenshots.

@Twixes Twixes enabled auto-merge (squash) November 25, 2025 20:37
@Twixes Twixes merged commit 693fb79 into master Nov 25, 2025
12 checks passed
@Twixes Twixes deleted the draft-intelligence branch November 25, 2025 20:53
6 participants