-### Q4 2025 objectives
+### Q1 2026 objectives

-#### Goal 1: Evals
+#### Goal 1: Feature GA Releases

-*Description*: AI systems are inherently unpredictable; outputs often vary even when inputs are the same. Evals will give teams a structured way to measure output quality, detect regressions, and prioritize improvements as their AI applications evolve.
+*Description*: Move several features currently in alpha/beta to general availability.

*What we will ship*:
-- Support for **LLM-as-a-judge** evaluations during AI event ingestion
-- **Semantic Clustering and Labeling** to group similar outputs and spot patterns
-- Ability to run Evals in CI pipelines to catch issues before production
-- Tools for creating and managing **Datasets** for consistent, repeatable evaluations
+- **Evaluations** to GA - online LLM-as-a-Judge evaluations for measuring AI output quality
+- **Prompts** to GA - prompt management directly in PostHog
+- **Clustering** to GA - automatic grouping of similar traces and outputs
+- **Errors** to GA - grouped error tracking for LLM applications
+- **Sessions** to GA - session-level observability
+- **Playground** to GA - interactive testing environment for prompts and models
+- **LLM translation** to GA - translation of non-English LLM traces to English
+- **LLM trace/session summarization** to GA - AI-generated summaries for quick understanding

-#### Goal 2: New Ingestion Pipeline
+#### Goal 2: OpenTelemetry & SDK

-*Description*: A dedicated ingestion pipeline built specifically for LLM events, enabling better performance and new capabilities for LLM tracking. Without this, models with large context windows might exceed our current event size limits, preventing proper tracking. This pipeline is also a prerequisite for multimodal support and compliance requirements.
+*Description*: Add OpenTelemetry support and rethink our SDK architecture to better serve diverse integration patterns.

*What we will ship*:
-- Independent ingestion endpoint optimized for LLM event data
-- Better scaling for high-volume customers
+- **OpenTelemetry support** - native OTel integration for teams using OpenTelemetry instrumentation (see the sketch after this list)
+- **SDK improvements** - rethink and improve our SDK architecture for better developer experience

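As a rough illustration of the OpenTelemetry item above, here is a minimal sketch of exporting an LLM call's span over OTLP. The ingestion endpoint URL, the auth header, and the GenAI attribute names are placeholders and assumptions, not a documented PostHog integration:

```python
# Illustrative sketch only: the OTLP endpoint, auth header, and attribute names
# below are placeholders, not a documented PostHog integration.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

exporter = OTLPSpanExporter(
    endpoint="https://us.i.posthog.com/otlp/v1/traces",      # hypothetical endpoint
    headers={"Authorization": "Bearer <project_api_key>"},   # hypothetical auth scheme
)
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("my-llm-app")

def call_model(prompt: str) -> str:
    # Stand-in for your real OpenAI/Anthropic/etc. client call.
    return "example completion"

def generate_answer(prompt: str) -> str:
    # Wrap the LLM call in a span; attribute names loosely follow the
    # (still-incubating) OTel GenAI semantic conventions.
    with tracer.start_as_current_span("llm.generation") as span:
        span.set_attribute("gen_ai.request.model", "gpt-4o-mini")
        span.set_attribute("gen_ai.prompt", prompt)
        completion = call_model(prompt)
        span.set_attribute("gen_ai.completion", completion)
        return completion

print(generate_answer("What is LLM observability?"))
```

The appeal of native OTel support is that teams already emitting spans like this would not need a separate SDK; the same exporter configuration could simply point at PostHog.
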
-#### Goal 3: Query Performance Improvements
+#### Goal 3: Evals

-*Description*: Faster, more reliable queries in the LLM Analytics dashboard.
+*Description*: Continue building out our evaluations system to help teams measure AI output quality at scale. This includes expanding how evaluations can be created and run, and improving how results surface actionable insights.

*What we will ship*:
-- Optimized LLM event storage for fast querying, even for customers with large datasets
+- **Code-based evaluations** for evaluating LLM outputs using user-defined code evaluators instead of LLM-as-a-Judge (see the sketch after this list)
+- **Offline evaluations** based on datasets for consistent, repeatable testing
+- **Alerts and surfacing** to proactively notify teams of evaluation issues via alerts, news feed, and other channels

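To illustrate the code-based and offline evaluation items above, here is a minimal sketch of a user-defined evaluator run over a small dataset. The evaluator signature, the `EvalResult` shape, and the required response keys are assumptions for illustration, not PostHog's actual evaluation API:

```python
# Illustrative sketch only: signature and result shape are assumptions,
# not PostHog's actual code-based evaluation API.
import json
from dataclasses import dataclass

@dataclass
class EvalResult:
    passed: bool
    score: float
    reason: str

def valid_json_evaluator(input_text: str, output_text: str) -> EvalResult:
    """Deterministic check: does the model output parse as JSON with the expected keys?"""
    try:
        payload = json.loads(output_text)
    except json.JSONDecodeError as exc:
        return EvalResult(False, 0.0, f"output is not valid JSON: {exc}")
    missing = [key for key in ("answer", "sources") if key not in payload]  # example keys
    if missing:
        return EvalResult(False, 0.5, f"missing keys: {missing}")
    return EvalResult(True, 1.0, "well-formed response")

# Offline evaluation: run the evaluator over a small dataset of recorded cases.
dataset = [
    {"input": "Summarise the doc", "output": '{"answer": "...", "sources": ["doc.md"]}'},
    {"input": "Summarise the doc", "output": "Sorry, I can't do that."},
]

for case in dataset:
    result = valid_json_evaluator(case["input"], case["output"])
    print(result.passed, result.score, result.reason)
```

Unlike an LLM-as-a-Judge evaluation, an evaluator like this is deterministic and cheap to run on every trace or dataset row.
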
-#### Goal 4: Multi-Modal Messages
+#### Goal 4: Ingestion Pipeline

-*Description*: Support for including images, audio, and other media types alongside text in LLM events.
+*Description*: Complete our new ingestion pipeline optimized for LLM events, enabling better performance and new capabilities.

*What we will ship*:
-- Allow sending and storing non-text data (images, audio, video) in LLM events
+- **New ingestion pipeline** launch - dedicated pipeline optimized for LLM events
+- **Multimodal support** - ingest and store images, audio, and other media in LLM events

-#### Goal 5: Prompt Management
+#### Goal 5: Docs, Onboarding & Wizard

-*Description*: Manage and optimize your prompts directly in PostHog, with your application fetching the latest versions programmatically.
+*Description*: Make it easier for new users to get started with LLM Analytics across different frameworks and tools.

*What we will ship*:
-- Create and edit prompts in the PostHog UI
-- Prompt version control
-- Performance metrics for different prompt versions
+- **Framework guides** - documentation for every major LLM framework
+- **Wizard support** - add LLM Analytics to the PostHog setup wizard for seamless onboarding
+
+#### Goal 6: PostHog AI Integration
+
+*Description*: Integrate LLM Analytics capabilities with PostHog AI to enable powerful search and insights across traces.
+
+*What we will ship*:
+- **Agentic search** - find specific traces using natural language queries and get insights
+- **Find traces via evals** - search for traces based on evaluation results (passing or failing)
+- **Trace/session summarization** via AI - leverage PostHog AI for generating summaries
+- **Trace translation** via AI - use PostHog AI for translating traces into English