Agenta-AI
diff --git a/api/oss/src/core/auth/supertokens/overrides.py b/api/oss/src/core/auth/supertokens/overrides.py
Lines changed: 13 additions & 0 deletions
diff --git a/docs/blog/entries/projects-within-organizations.mdx b/docs/blog/entries/projects-within-organizations.mdx
Lines changed: 1 addition & 1 deletion
diff --git a/docs/blog/entries/testset-versioning.mdx b/docs/blog/entries/testset-versioning.mdx
Lines changed: 103 additions & 0 deletions
diff --git a/docs/blog/main.mdx b/docs/blog/main.mdx
Lines changed: 12 additions & 0 deletions
diff --git a/docs/docs/misc/01-opensource.mdx b/docs/docs/misc/01-opensource.mdx
Lines changed: 2 additions & 1 deletion
diff --git a/docs/docs/misc/02-faq.mdx b/docs/docs/misc/02-faq.mdx
Lines changed: 0 additions & 66 deletions
diff --git a/docs/docs/misc/faq/_category_.json b/docs/docs/misc/faq/_category_.json
Lines changed: 10 additions & 0 deletions
diff --git a/docs/docs/misc/faq/index.mdx b/docs/docs/misc/faq/index.mdx
Lines changed: 43 additions & 0 deletions
diff --git a/docs/docs/misc/faq/integrations/_category_.json b/docs/docs/misc/faq/integrations/_category_.json
Lines changed: 10 additions & 0 deletions
diff --git a/docs/docs/misc/faq/integrations/llm-providers.mdx b/docs/docs/misc/faq/integrations/llm-providers.mdx
Lines changed: 15 additions & 0 deletions
@@ -258,6 +258,19 @@ async def _create_account(email: str, uid: str) -> bool:
 
         payload["organization_id"] = str(organization_db.id)
         await create_accounts(payload)
+
+    if env.posthog.enabled and env.posthog.api_key:
+        try:
+            posthog.capture(
+                distinct_id=email,
+                event="user_signed_up_v1",
+                properties={
+                    "source": "auth",
+                    "is_ee": is_ee(),
+                },
+            )
+        except Exception:
+            log.error("[AUTH] Failed to capture PostHog signup event", exc_info=True)
     log.info("[AUTH] _create_account done", email=email, uid=uid)
     return True
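
Note on the hunk above: the added block follows a best-effort telemetry pattern, where the capture is gated on configuration and any failure is logged instead of propagated, so a signup cannot fail because of analytics. A minimal, self-contained sketch of that pattern is shown here. `capture_signup_event`, `AnalyticsClient`, and `FakeClient` are hypothetical names for illustration only; the stub stands in for the real `posthog` module, and nothing here is taken from the rest of `overrides.py`.

```python
import logging
from typing import Any, Dict, List, Optional, Protocol

log = logging.getLogger("auth")


class AnalyticsClient(Protocol):
    """Anything exposing a PostHog-style capture method (assumption for this sketch)."""

    def capture(self, distinct_id: str, event: str, properties: Dict[str, Any]) -> None: ...


def capture_signup_event(
    client: AnalyticsClient,
    *,
    enabled: bool,
    api_key: Optional[str],
    email: str,
    is_ee: bool,
) -> bool:
    """Send the signup event if analytics is configured; never raise."""
    if not (enabled and api_key):
        return False  # analytics disabled or not configured
    try:
        client.capture(
            distinct_id=email,
            event="user_signed_up_v1",
            properties={"source": "auth", "is_ee": is_ee},
        )
        return True
    except Exception:
        # Telemetry failures are logged, not propagated to the signup flow.
        log.error("[AUTH] Failed to capture PostHog signup event", exc_info=True)
        return False


class FakeClient:
    """Hypothetical stub used only to exercise the helper."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.events: List[Dict[str, Any]] = []

    def capture(self, distinct_id: str, event: str, properties: Dict[str, Any]) -> None:
        if self.fail:
            raise RuntimeError("simulated network failure")
        self.events.append({"distinct_id": distinct_id, "event": event, **properties})
```

Returning a boolean instead of raising keeps the caller's control flow unchanged whether or not the event was delivered, which matches the hunk's behavior of always reaching `return True`.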
 
 
@@ -36,4 +36,4 @@ Projects work well when you need to:
 
 If you're managing complex AI initiatives across multiple products, projects give you the structure to keep everything organized. You can create your first project from the sidebar and start organizing your prompts and evaluations.
 
-For questions about projects or organizational structure, check the [FAQ](/docs/misc/faq) or reach out through our [support channels](/docs/misc/getting_support).
+For questions about projects or organizational structure, check the [FAQ](/docs/faq) or reach out through our [support channels](/docs/misc/getting_support).
@@ -0,0 +1,103 @@
+---
+title: "Test Set Versioning and New Test Set UI"
+slug: testset-versioning
+date: 2026-01-20
+tags: [v0.74.0]
+description: "Track test set changes with versioning and link evaluations to specific versions. Plus a completely rebuilt test set UI that scales to hundreds of thousands of rows."
+---
+
+# Test Set Versioning and New Test Set UI
+
+## Overview
+
+When you compare evaluation results from last week to today, how do you know the test data didn't change? You don't. Until now.
+
+Test set versioning tracks every change to your test sets. Each edit, upload, or programmatic update creates a new version. Evaluations link to specific versions, so you can trust your comparisons.
+
+We also rebuilt the test set UI from scratch. It handles hundreds of thousands of rows without slowing down. Editing is faster, especially for chat messages and complex JSON data.
+
+<div style={{display: 'flex', justifyContent: 'center', marginTop: "20px", marginBottom: "20px", flexDirection: 'column', alignItems: 'center'}}>
+  <iframe
+    width="100%"
+    height="500"
+    src="https://www.youtube.com/embed/hh1OHhzak6Q"
+    title="Test Set Versioning Demo"
+    frameBorder="0"
+    allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
+    allowFullScreen
+  ></iframe>
+</div>
+
+## Test Set Versioning
+
+Every change to a test set creates a new version. You can see the version history, compare versions, and revert to previous versions.
+
+**What gets versioned:**
+- Adding, editing, or deleting test cases
+- Uploading new data (CSV, JSON)
+- Programmatic updates via SDK or API
+- Column changes
+
+**Evaluation linking:**
+When you run an evaluation, it links to the specific test set version used. This means:
+- You can compare evaluations knowing they used the same test data
+- If someone updates the test set, your historical evaluations still reference the original version
+- You can filter evaluations by test set version
+
+**Programmatic versioning:**
+Upload test sets via the SDK or API. The system detects changes and creates new versions automatically.
+
+```python
+import agenta as ag
+
+# Upload a test set - creates a new version if content changed
+testset = ag.testsets.upload(
+    name="my-test-set",
+    data=test_cases,  # Your test case data
+)
+
+# The testset object includes version information
+print(f"Version: {testset.version}")
+```
+
+## New Test Set UI
+
+The test set view is completely rebuilt. It uses virtualized rendering, so it stays fast with large datasets.
+
+**What's new:**
+- **Scale**: Handle 100,000+ rows without performance issues
+- **JSON support**: View and edit complex JSON directly. Toggle between raw JSON and formatted views
+- **String or JSON columns**: Choose how each column stores data. Use JSON for structured data like chat messages
+
+**Chat message editing:**
+Test cases with chat messages (like `[{"role": "user", "content": "..."}]`) now have a dedicated editor. Add, remove, or reorder messages. Edit content with proper formatting.
+
+**Upload options:**
+- Upload CSV or JSON files
+- Create test sets in the UI
+- Create programmatically via SDK
+- Add spans from observability to test sets
+
+## Traceability
+
+Everything connects. When you view a trace in observability:
+- See which test case it came from
+- See which test set version
+- Filter traces by test case or test set
+
+When you view an evaluation:
+- See the exact test set version used
+- Compare only evaluations that used the same version
+- Navigate to the test set to see the data
+
+## Getting Started
+
+Test set versioning is automatic. Any change creates a new version.
+
+To use versioned test sets in evaluations:
+1. Create or upload a test set
+2. Make your edits (each save creates a version)
+3. Run an evaluation (it links to the current version)
+4. Later, compare evaluations knowing they used the same test data
+
+For programmatic access, check the [test sets documentation](/evaluation/evaluation-from-sdk/managing-testsets).
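
Note on the blog entry above: the post says the system "detects changes and creates new versions automatically" but does not describe how. One common way to get that behavior is to fingerprint the test set content and bump the version only when the fingerprint changes. The sketch below illustrates that idea under stated assumptions; `VersionedTestSets` is a hypothetical in-memory class, not Agenta's implementation or SDK.

```python
import hashlib
import json
from typing import Any, Dict, List


class VersionedTestSets:
    """Hypothetical in-memory store: one list of content hashes per test set name."""

    def __init__(self) -> None:
        self._history: Dict[str, List[str]] = {}

    @staticmethod
    def _fingerprint(rows: List[Dict[str, Any]]) -> str:
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def upload(self, name: str, rows: List[Dict[str, Any]]) -> int:
        """Return the 1-based version number; bump it only if the content changed."""
        digest = self._fingerprint(rows)
        history = self._history.setdefault(name, [])
        if not history or history[-1] != digest:
            history.append(digest)
        return len(history)


store = VersionedTestSets()
rows = [{"input": "hi", "expected": "hello"}]

v1 = store.upload("my-test-set", rows)
v1_again = store.upload("my-test-set", rows)  # unchanged content, same version
v2 = store.upload("my-test-set", rows + [{"input": "bye", "expected": "goodbye"}])

# An evaluation record pins the version it ran against.
evaluation = {"testset": "my-test-set", "testset_version": v1}
```

Because the evaluation record stores the version number and not a reference to the latest data, later uploads do not change what a historical evaluation points at.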
@@ -12,6 +12,18 @@ import Image from "@theme/IdealImage";
 
 
 
+### [Test Set Versioning and New Test Set UI](/changelog/testset-versioning)
+
+_20 January 2026_
+
+**v0.74.0**
+
+Test sets now have versioning. Every edit, upload, or programmatic update creates a new version. Evaluations link to specific versions, so you can compare results knowing they used the same test data.
+
+The test set UI is completely rebuilt. It handles hundreds of thousands of rows without slowing down. Editing is much easier, especially for chat messages. You can view and edit complex JSON directly, toggle between raw and formatted views, and choose whether columns store strings or JSON.
+
+---
+
 ### [Playground UX Improvements](/changelog/playground-ux-improvements-jan-2026)
 
 _13 January 2026_
 
@@ -17,6 +17,7 @@ Open source matters for an LLMOps platform like Agenta because it gives you:
 1. **Flexibility**: Modify and customize the software to fit your needs.
 2. **Independence**: Avoid vendor lock-in, ensuring long-term project continuity.
 3. **Community Support**: Benefit from contributions and shared improvements from the community.
+
 ### License Information
 
 The Agenta open-core is licensed under the MIT license. Everything in our public repositories is open source, with no closed-source components included. All SDKs, client libraries, and APIs are fully open source under MIT. You can self-host it, modify it, and use it in commercial projects without restrictions.
@@ -77,4 +78,4 @@ Community support is available via GitHub and Discord. Professional support requ
 ### How do I upgrade from self-hosted open source to commercial features?
 You have two options:
 1. Switch to our cloud offering by [creating an account](https://app.agenta.ai)
-2. [Contact our team](mailto:[email protected]) or [book a call](https://cal.com/mahmoud-mabrouk-ogzgey/demo) for enterprise self-hosting
+2. [Contact our team](mailto:[email protected]) or [book a call](https://cal.com/mahmoud-mabrouk-ogzgey/demo) for enterprise self-hosting
@@ -0,0 +1,10 @@
+{
+  "label": "FAQ",
+  "position": 2,
+  "collapsed": false,
+  "collapsible": true,
+  "link": {
+    "type": "doc",
+    "id": "misc/faq/index"
+  }
+}
@@ -0,0 +1,43 @@
+---
+title: "Frequently Asked Questions"
+description: "Answers to common questions about Agenta"
+slug: /faq
+---
+
+import Link from '@docusaurus/Link';
+
+# Frequently Asked Questions
+
+These FAQs cover common questions about Agenta.
+
+## Platform
+
+<div className="faq-cards">
+  <Link to="/faq/platform/api-rate-limits" className="faq-card">
+    <span className="faq-card-title">What are the API rate limits?</span>
+    <span className="faq-card-desc">Rate limits by endpoint type and plan</span>
+  </Link>
+  <Link to="/faq/platform/data-retention" className="faq-card">
+    <span className="faq-card-title">What is the data retention period?</span>
+    <span className="faq-card-desc">Data retention by plan</span>
+  </Link>
+</div>
+
+## Integrations
+
+<div className="faq-cards">
+  <Link to="/faq/integrations/typescript" className="faq-card">
+    <span className="faq-card-title">Does Agenta work with TypeScript?</span>
+    <span className="faq-card-desc">How to use Agenta from TypeScript and other languages</span>
+  </Link>
+  <Link to="/faq/integrations/llm-providers" className="faq-card">
+    <span className="faq-card-title">What LLM providers does Agenta support?</span>
+    <span className="faq-card-desc">Supported model providers and custom endpoints</span>
+  </Link>
+</div>
+
+<div className="faq-tip-spacer"></div>
+
+:::tip
+If you cannot find your answer, open an issue on [GitHub](https://github.com/agenta-ai/agenta/issues).
+:::
@@ -0,0 +1,10 @@
+{
+  "label": "Integrations",
+  "position": 2,
+  "link": {
+    "type": "generated-index",
+    "title": "Integrations",
+    "description": "Questions about language support, LLM providers, and third party integrations.",
+    "slug": "/faq/integrations"
+  }
+}
@@ -0,0 +1,15 @@
+---
+title: "What LLM providers does Agenta support?"
+description: "Supported model providers and custom endpoints"
+slug: /faq/integrations/llm-providers
+---
+
+Agenta works with almost any provider.
+
+Common providers include OpenAI, Anthropic, Cohere, OpenRouter, Perplexity AI, TogetherAI, DeepInfra, Groq, Gemini, Mistral AI, and Ollama.
+
+Agenta also supports AWS Bedrock, Azure OpenAI, and Vertex AI.
+
+You can add any OpenAI-compatible endpoint, including self-hosted models and custom deployments.
+
+See [custom providers](/prompt-engineering/playground/custom-providers).
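
Note on the FAQ entry above: "OpenAI-compatible" generally means the server accepts the same chat completions request shape as the OpenAI API. The sketch below builds such a request with the standard library and does not send it. The base URL, API key, and model name are placeholders, and `build_chat_request` is a hypothetical helper, not part of Agenta.

```python
import json
import urllib.request
from typing import Dict, List


def build_chat_request(
    base_url: str,
    api_key: str,
    model: str,
    messages: List[Dict[str, str]],
) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completions request."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    base_url="http://localhost:8000/v1",  # placeholder for a self-hosted endpoint
    api_key="sk-placeholder",             # placeholder credential
    model="my-model",                     # placeholder model name
    messages=[{"role": "user", "content": "Hello"}],
)
```

An endpoint that accepts this request shape is the kind of deployment the entry refers to.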