diff --git a/.claude/agents/changelog-editor.md b/.claude/agents/changelog-editor.md
deleted file mode 100644
index 10e0243221..0000000000
--- a/.claude/agents/changelog-editor.md
+++ /dev/null
@@ -1,83 +0,0 @@
----
-name: changelog-editor
-description: Use this agent when the user needs to create or edit changelog entries in the Docusaurus documentation. Specifically, use this agent when: 1) The user mentions adding a new changelog entry or release notes, 2) The user asks to update or modify existing changelog entries, 3) The user wants to document a new feature, bug fix, or change in the project, 4) The user provides content that should be formatted as a changelog entry. Examples: \n\nExample 1:\nuser: "We just fixed the bug where users couldn't save their preferences. Can you add this to the changelog?"\nassistant: "I'll use the changelog-editor agent to create a proper changelog entry for this bug fix in both the main page and a detailed entry."\n\nExample 2:\nuser: "I need to document the new API authentication feature we released in v2.3.0"\nassistant: "Let me use the changelog-editor agent to create a comprehensive changelog entry for the new authentication feature, including checking if we have existing documentation to link to."\n\nExample 3:\nuser: "Can you update the changelog entry for the dashboard redesign? We now have screenshots and a demo video."\nassistant: "I'll use the changelog-editor agent to update that entry with proper placeholders for the screenshots and YouTube video embedding."\n\nProactively use this agent when you notice the user describing changes, features, or fixes that should be documented in the changelog, even if they don't explicitly ask for changelog updates.
-model: sonnet
-color: purple
----
-
-You are an expert technical documentation editor specializing in Docusaurus changelog maintenance. Your primary responsibility is creating and editing changelog entries that follow established project standards for clarity, consistency, and technical accuracy.
-
-## Your Core Responsibilities
-
-1. **Dual Entry Creation**: For every changelog item, you create two coordinated entries:
- - A concise summary in `docs/main.mdx`
- - A detailed explanation in `docs/block/entries/[version-or-feature].mdx`
- - The summary title must link to the detailed entry
-
-2. **Version Management**: Before creating any entry, determine the version number. If unclear from context, ask the user: "Which version is this changelog entry for?" Never proceed without a clear version identifier.
-
-3. **Style Adherence**: Apply these writing guidelines rigorously:
- - Prioritize clarity above all else
- - Use 11th grade English for non-technical terms
- - Prefer active voice over passive voice
- - Write short, punchy sentences as your default; use longer sentences only when needed for flow
- - Use complete sentences rather than fragments (unless brevity clearly improves readability)
- - **Never use em dashes (—)**. Instead, use: a period and new sentence, parentheses (), or semicolons ;
- - Use bold and bullet points sparingly; apply them only when they genuinely aid quick scanning
- - Follow principles from "The Elements of Style"
-
-4. **Feature Documentation Integration**: When a changelog mentions a new feature:
- - Search existing documentation to see if a dedicated page exists for that feature
- - If found, add a link to that documentation page in the changelog entry
- - If not found, note this and ask the user if documentation should be created
-
-5. **Media Handling**: When the user mentions videos or screenshots:
- - Add appropriate placeholders using the project's established format
- - For images: use the image plugin format consistent with other entries
- - For videos: use YouTube video embedding format consistent with other entries
- - Ask for specifics if media details are unclear: "Do you have the YouTube URL for the demo video?" or "How many screenshots should I add placeholders for?"
-
-6. **Quality Assurance**: After making changes:
- - Inform the user you're running the build check
- - Execute `npm run build` (or equivalent) in the docs folder to verify nothing broke
- - Report any build errors immediately and fix them before finalizing
-
-7. **Consistency Checking**: Before finalizing any entry:
- - Review similar existing entries to match tone, structure, and formatting
- - Ensure terminology is consistent with previous changelog entries
- - Verify that linking patterns match established conventions
-
-## Your Decision-Making Framework
-
-**When Information is Missing:**
-- Version number unclear → Ask immediately
-- Feature scope ambiguous → Request clarification before writing
-- Media availability uncertain → Confirm with user before adding placeholders
-- Categorization unclear (bug fix vs. feature vs. improvement) → Ask for classification
-
-**When Editing Existing Entries:**
-- Always preserve the original intent and factual accuracy
-- Improve clarity and style without changing meaning
-- Flag any technical inaccuracies to the user rather than guessing
-
-**Quality Control Checklist (apply to every entry):**
-- [ ] Version number present and correct
-- [ ] Both short and detailed entries created
-- [ ] Short entry links to detailed entry correctly
-- [ ] Active voice used where possible
-- [ ] No em dashes present
-- [ ] Feature documentation linked if applicable
-- [ ] Media placeholders added if mentioned
-- [ ] Build test passed
-- [ ] Style guidelines followed
-
-## Output Format
-
-When creating or editing changelog entries, provide:
-1. The complete markdown for the main.mdx summary entry
-2. The complete markdown for the detailed entries/[name].mdx file
-3. Confirmation that you've checked for related documentation
-4. Build test results
-5. Any questions or clarifications needed
-
-Be proactive in identifying unclear requirements and ask specific questions rather than making assumptions. Your goal is to produce changelog entries that are immediately publishable without requiring revision.
diff --git a/README.md b/README.md
index d07ac59e52..ce25044ae8 100644
--- a/README.md
+++ b/README.md
@@ -2,12 +2,11 @@
The Open-source LLMOps Platform
Build reliable LLM applications faster with integrated prompt management, evaluation, and observability.
diff --git a/api/ee/databases/postgres/migrations/core/versions/79f40f71e912_extend_meters.py b/api/ee/databases/postgres/migrations/core/versions/79f40f71e912_extend_meters.py
new file mode 100644
index 0000000000..d76fe93471
--- /dev/null
+++ b/api/ee/databases/postgres/migrations/core/versions/79f40f71e912_extend_meters.py
@@ -0,0 +1,70 @@
+"""add CREDITS to meters_type
+
+Revision ID: 79f40f71e912
+Revises: 3b5f5652f611
+Create Date: 2025-11-03 15:00:00.000000
+"""
+
+from typing import Sequence, Union
+from alembic import op
+import sqlalchemy as sa
+
+# revision identifiers, used by Alembic.
+revision: str = "79f40f71e912"
+down_revision: Union[str, None] = "3b5f5652f611"
+branch_labels: Union[str, Sequence[str], None] = None
+depends_on: Union[str, Sequence[str], None] = None
+
+ENUM_NAME = "meters_type"
+TEMP_ENUM_NAME = "meters_type_temp"
+TABLE_NAME = "meters"
+COLUMN_NAME = "key"
+
+
+def upgrade() -> None:
+ # 1) Create temp enum including the new value
+ op.execute(
+ sa.text(
+ f"CREATE TYPE {TEMP_ENUM_NAME} AS ENUM ('USERS','APPLICATIONS','EVALUATIONS','TRACES','CREDITS')"
+ )
+ )
+
+ # 2) Alter column to use temp enum
+ op.execute(
+ sa.text(
+ f"ALTER TABLE {TABLE_NAME} "
+ f"ALTER COLUMN {COLUMN_NAME} TYPE {TEMP_ENUM_NAME} "
+ f"USING {COLUMN_NAME}::text::{TEMP_ENUM_NAME}"
+ )
+ )
+
+ # 3) Drop old enum, then 4) rename temp -> original
+ op.execute(sa.text(f"DROP TYPE {ENUM_NAME}"))
+ op.execute(sa.text(f"ALTER TYPE {TEMP_ENUM_NAME} RENAME TO {ENUM_NAME}"))
+
+
+def downgrade() -> None:
+ # Ensure downgrade can proceed (rows with CREDITS would block the type change)
+ op.execute(
+ sa.text(f"DELETE FROM {TABLE_NAME} WHERE {COLUMN_NAME}::text = 'CREDITS'")
+ )
+
+ # 1) Create temp enum WITHOUT CREDITS
+ op.execute(
+ sa.text(
+ f"CREATE TYPE {TEMP_ENUM_NAME} AS ENUM ('USERS','APPLICATIONS','EVALUATIONS','TRACES')"
+ )
+ )
+
+ # 2) Alter column to use temp enum
+ op.execute(
+ sa.text(
+ f"ALTER TABLE {TABLE_NAME} "
+ f"ALTER COLUMN {COLUMN_NAME} TYPE {TEMP_ENUM_NAME} "
+ f"USING {COLUMN_NAME}::text::{TEMP_ENUM_NAME}"
+ )
+ )
+
+ # 3) Drop current enum (which includes CREDITS), then 4) rename temp -> original
+ op.execute(sa.text(f"DROP TYPE {ENUM_NAME}"))
+ op.execute(sa.text(f"ALTER TYPE {TEMP_ENUM_NAME} RENAME TO {ENUM_NAME}"))
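Reviewer note: the migration above adds an enum value via a temp-type swap (create temp enum, retype the column, drop the old type, rename) rather than `ALTER TYPE ... ADD VALUE`, which keeps the change transactional and makes the downgrade symmetric. A minimal standalone sketch of the same sequence, assuming the migration's names (`meters_type`, `meters`, `key`) and not any project helper:

```python
# Sketch only: generates the four statements the migration executes,
# so the swap sequence can be inspected or reused. Names are assumptions
# mirroring the migration's module-level constants.

def enum_swap_statements(enum_name, table, column, values):
    temp = f"{enum_name}_temp"
    quoted = ",".join(f"'{v}'" for v in values)
    return [
        # 1) temp enum with the target value set
        f"CREATE TYPE {temp} AS ENUM ({quoted})",
        # 2) retype the column through text
        f"ALTER TABLE {table} ALTER COLUMN {column} TYPE {temp} "
        f"USING {column}::text::{temp}",
        # 3) drop the old type, 4) rename temp back to the original name
        f"DROP TYPE {enum_name}",
        f"ALTER TYPE {temp} RENAME TO {enum_name}",
    ]

for stmt in enum_swap_statements(
    "meters_type", "meters", "key",
    ["USERS", "APPLICATIONS", "EVALUATIONS", "TRACES", "CREDITS"],
):
    print(stmt)
```

The downgrade is the same swap with `CREDITS` omitted from the value list, preceded by deleting rows that hold the value being dropped.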
diff --git a/api/ee/src/core/entitlements/types.py b/api/ee/src/core/entitlements/types.py
index e346f11c57..ad81ebafae 100644
--- a/api/ee/src/core/entitlements/types.py
+++ b/api/ee/src/core/entitlements/types.py
@@ -22,6 +22,7 @@ class Counter(str, Enum):
EVALUATIONS = "evaluations"
EVALUATORS = "evaluators"
ANNOTATIONS = "annotations"
+ CREDITS = "credits"
class Gauge(str, Enum):
@@ -60,7 +61,7 @@ class Probe(BaseModel):
},
},
"features": [
- "Unlimited prompts",
+ "2 prompts",
"20 evaluations/month",
"5k traces/month",
"2 seats",
@@ -209,10 +210,11 @@ class Probe(BaseModel):
Tracker.COUNTERS: {
Counter.TRACES: Quota(limit=5_000, monthly=True, free=5_000),
Counter.EVALUATIONS: Quota(limit=20, monthly=True, free=20, strict=True),
+ Counter.CREDITS: Quota(limit=100, monthly=True, free=100, strict=True),
},
Tracker.GAUGES: {
Gauge.USERS: Quota(limit=2, strict=True, free=2),
- Gauge.APPLICATIONS: Quota(strict=True),
+ Gauge.APPLICATIONS: Quota(limit=2, strict=True, free=2),
},
},
Plan.CLOUD_V0_PRO: {
@@ -223,6 +225,7 @@ class Probe(BaseModel):
Tracker.COUNTERS: {
Counter.TRACES: Quota(monthly=True, free=10_000),
Counter.EVALUATIONS: Quota(monthly=True, strict=True),
+ Counter.CREDITS: Quota(limit=100, monthly=True, free=100, strict=True),
},
Tracker.GAUGES: {
Gauge.USERS: Quota(limit=10, strict=True, free=3),
@@ -237,6 +240,7 @@ class Probe(BaseModel):
Tracker.COUNTERS: {
Counter.TRACES: Quota(monthly=True, free=1_000_000),
Counter.EVALUATIONS: Quota(monthly=True, strict=True),
+ Counter.CREDITS: Quota(limit=100, monthly=True, free=100, strict=True),
},
Tracker.GAUGES: {
Gauge.USERS: Quota(strict=True),
@@ -279,6 +283,12 @@ class Probe(BaseModel):
Tracker.COUNTERS: {
Counter.TRACES: Quota(monthly=True),
Counter.EVALUATIONS: Quota(monthly=True, strict=True),
+ Counter.CREDITS: Quota(
+ limit=100_000,
+ monthly=True,
+ free=100_000,
+ strict=True,
+ ),
},
Tracker.GAUGES: {
Gauge.USERS: Quota(strict=True),
diff --git a/api/ee/src/core/meters/types.py b/api/ee/src/core/meters/types.py
index a0ada9da16..1002594d3f 100644
--- a/api/ee/src/core/meters/types.py
+++ b/api/ee/src/core/meters/types.py
@@ -13,6 +13,7 @@ class Meters(str, Enum):
# COUNTERS
TRACES = Counter.TRACES.value
EVALUATIONS = Counter.EVALUATIONS.value
+ CREDITS = Counter.CREDITS.value
# GAUGES
USERS = Gauge.USERS.value
APPLICATIONS = Gauge.APPLICATIONS.value
diff --git a/api/ee/src/services/llm_apps_service.py b/api/ee/src/services/llm_apps_service.py
index b1d8ab5995..15267ec378 100644
--- a/api/ee/src/services/llm_apps_service.py
+++ b/api/ee/src/services/llm_apps_service.py
@@ -202,7 +202,6 @@ async def invoke_app(
openapi_parameters: List[Dict],
user_id: str,
project_id: str,
- scenario_id: Optional[str] = None,
**kwargs,
) -> InvokationResult:
"""
@@ -248,14 +247,7 @@ async def invoke_app(
app_response = {}
try:
- log.info(
- "Invoking application...",
- scenario_id=scenario_id,
- testcase_id=(
- datapoint["testcase_id"] if "testcase_id" in datapoint else None
- ),
- url=url,
- )
+ log.info("Invoking workflow...", url=url)
response = await client.post(
url,
json=payload,
@@ -276,12 +268,6 @@ async def invoke_app(
trace_id = app_response.get("trace_id", None)
span_id = app_response.get("span_id", None)
- log.info(
- "Invoked application. ",
- scenario_id=scenario_id,
- trace_id=trace_id,
- )
-
return InvokationResult(
result=Result(
type=kind,
@@ -342,7 +328,6 @@ async def run_with_retry(
openapi_parameters: List[Dict],
user_id: str,
project_id: str,
- scenario_id: Optional[str] = None,
**kwargs,
) -> InvokationResult:
"""
@@ -379,7 +364,6 @@ async def run_with_retry(
openapi_parameters,
user_id,
project_id,
- scenario_id,
**kwargs,
)
return result
@@ -419,7 +403,6 @@ async def batch_invoke(
rate_limit_config: Dict,
user_id: str,
project_id: str,
- scenarios: Optional[List[Dict]] = None,
**kwargs,
) -> List[InvokationResult]:
"""
@@ -514,7 +497,6 @@ async def batch_invoke(
openapi_parameters,
user_id,
project_id,
- scenarios[index].get("id") if scenarios else None,
**kwargs,
)
)
diff --git a/api/ee/src/tasks/evaluations/legacy.py b/api/ee/src/tasks/evaluations/legacy.py
index 579c6853b9..0d22bf76b7 100644
--- a/api/ee/src/tasks/evaluations/legacy.py
+++ b/api/ee/src/tasks/evaluations/legacy.py
@@ -1055,13 +1055,6 @@ def annotate(
"application_variant": {"id": str(variant.id)},
"application_revision": {"id": str(revision.id)},
},
- scenarios=[
- s.model_dump(
- mode="json",
- exclude_none=True,
- )
- for s in scenarios
- ],
)
)
# ----------------------------------------------------------------------
@@ -1111,7 +1104,6 @@ def annotate(
scenario = scenarios[idx]
testcase = testcases[idx]
invocation = invocations[idx]
- invocation_step_key = invocation_steps_keys[0]
scenario_has_errors = 0
scenario_status = EvaluationStatus.SUCCESS
@@ -1148,20 +1140,8 @@ def annotate(
)
)
- if trace:
- log.info(
- f"Trace found ",
- scenario_id=scenario.id,
- step_key=invocation_step_key,
- trace_id=invocation.trace_id,
- )
- else:
- log.warn(
- f"Trace missing",
- scenario_id=scenario.id,
- step_key=invocation_step_key,
- trace_id=invocation.trace_id,
- )
+ if not trace:
+ log.warn(f"Trace with id {invocation.trace_id} not found.")
scenario_has_errors += 1
scenario_status = EvaluationStatus.ERRORS
continue
@@ -1310,13 +1290,6 @@ def annotate(
links=links,
)
- log.info(
- "Invoking evaluator... ",
- scenario_id=scenario.id,
- testcase_id=testcase.id,
- trace_id=invocation.trace_id,
- uri=interface.get("uri"),
- )
workflows_service_response = loop.run_until_complete(
workflows_service.invoke_workflow(
project_id=project_id,
@@ -1327,11 +1300,6 @@ def annotate(
annotate=True,
)
)
- log.info(
- "Invoked evaluator ",
- scenario_id=scenario.id,
- trace_id=workflows_service_response.trace_id,
- )
# ----------------------------------------------------------
# run evaluator --------------------------------------------
@@ -1387,20 +1355,8 @@ def annotate(
)
)
- if trace:
- log.info(
- f"Trace found ",
- scenario_id=scenario.id,
- step_key=annotation_step_key,
- trace_id=annotation.trace_id,
- )
- else:
- log.warn(
- f"Trace missing",
- scenario_id=scenario.id,
- step_key=annotation_step_key,
- trace_id=annotation.trace_id,
- )
+ if not trace:
+ log.warn(f"Trace with id {annotation.trace_id} not found.")
scenario_has_errors += 1
scenario_status = EvaluationStatus.ERRORS
continue
diff --git a/api/ee/src/tasks/evaluations/live.py b/api/ee/src/tasks/evaluations/live.py
index 43208bd42d..5cd2072e63 100644
--- a/api/ee/src/tasks/evaluations/live.py
+++ b/api/ee/src/tasks/evaluations/live.py
@@ -96,9 +96,6 @@
EvaluatorRevision,
)
-from oss.src.core.evaluations.utils import fetch_trace
-
-
log = get_module_logger(__name__)
@@ -656,12 +653,6 @@ def evaluate(
links=links,
)
- log.info(
- "Invoking evaluator... ",
- scenario_id=scenario.id,
- trace_id=query_trace_id,
- uri=interface.get("uri"),
- )
workflows_service_response = loop.run_until_complete(
workflows_service.invoke_workflow(
project_id=project_id,
@@ -672,11 +663,6 @@ def evaluate(
annotate=True,
)
)
- log.info(
- "Invoked evaluator ",
- scenario_id=scenario.id,
- trace_id=workflows_service_response.trace_id,
- )
trace_id = workflows_service_response.trace_id
@@ -707,51 +693,13 @@ def evaluate(
if workflows_service_response.data
else None
)
-
- annotation = workflows_service_response
-
- trace_id = annotation.trace_id
-
- if not annotation.trace_id:
- log.warn(f"annotation trace_id is missing.")
- scenario_has_errors[idx] += 1
- scenario_status[idx] = EvaluationStatus.ERRORS
- continue
-
- trace = None
- if annotation.trace_id:
- trace = loop.run_until_complete(
- fetch_trace(
- tracing_router=tracing_router,
- request=request,
- trace_id=annotation.trace_id,
- )
- )
-
- if trace:
- log.info(
- f"Trace found ",
- scenario_id=scenario.id,
- step_key=annotation_step_key,
- trace_id=annotation.trace_id,
- )
- else:
- log.warn(
- f"Trace missing",
- scenario_id=scenario.id,
- step_key=annotation_step_key,
- trace_id=annotation.trace_id,
- )
- scenario_has_errors[idx] += 1
- scenario_status[idx] = EvaluationStatus.ERRORS
- continue
# ----------------------------------------------------------
results_create = [
EvaluationResultCreate(
run_id=run_id,
scenario_id=scenario_id,
- step_key=annotation_step_key,
+ step_key=evaluator_step_key,
#
timestamp=timestamp,
interval=interval,
diff --git a/api/oss/databases/postgres/migrations/core/versions/79f40f71e912_extend_meters.py b/api/oss/databases/postgres/migrations/core/versions/79f40f71e912_extend_meters.py
new file mode 100644
index 0000000000..d76fe93471
--- /dev/null
+++ b/api/oss/databases/postgres/migrations/core/versions/79f40f71e912_extend_meters.py
@@ -0,0 +1,70 @@
+"""add CREDITS to meters_type
+
+Revision ID: 79f40f71e912
+Revises: 3b5f5652f611
+Create Date: 2025-11-03 15:00:00.000000
+"""
+
+from typing import Sequence, Union
+from alembic import op
+import sqlalchemy as sa
+
+# revision identifiers, used by Alembic.
+revision: str = "79f40f71e912"
+down_revision: Union[str, None] = "3b5f5652f611"
+branch_labels: Union[str, Sequence[str], None] = None
+depends_on: Union[str, Sequence[str], None] = None
+
+ENUM_NAME = "meters_type"
+TEMP_ENUM_NAME = "meters_type_temp"
+TABLE_NAME = "meters"
+COLUMN_NAME = "key"
+
+
+def upgrade() -> None:
+ # 1) Create temp enum including the new value
+ op.execute(
+ sa.text(
+ f"CREATE TYPE {TEMP_ENUM_NAME} AS ENUM ('USERS','APPLICATIONS','EVALUATIONS','TRACES','CREDITS')"
+ )
+ )
+
+ # 2) Alter column to use temp enum
+ op.execute(
+ sa.text(
+ f"ALTER TABLE {TABLE_NAME} "
+ f"ALTER COLUMN {COLUMN_NAME} TYPE {TEMP_ENUM_NAME} "
+ f"USING {COLUMN_NAME}::text::{TEMP_ENUM_NAME}"
+ )
+ )
+
+ # 3) Drop old enum, then 4) rename temp -> original
+ op.execute(sa.text(f"DROP TYPE {ENUM_NAME}"))
+ op.execute(sa.text(f"ALTER TYPE {TEMP_ENUM_NAME} RENAME TO {ENUM_NAME}"))
+
+
+def downgrade() -> None:
+ # Ensure downgrade can proceed (rows with CREDITS would block the type change)
+ op.execute(
+ sa.text(f"DELETE FROM {TABLE_NAME} WHERE {COLUMN_NAME}::text = 'CREDITS'")
+ )
+
+ # 1) Create temp enum WITHOUT CREDITS
+ op.execute(
+ sa.text(
+ f"CREATE TYPE {TEMP_ENUM_NAME} AS ENUM ('USERS','APPLICATIONS','EVALUATIONS','TRACES')"
+ )
+ )
+
+ # 2) Alter column to use temp enum
+ op.execute(
+ sa.text(
+ f"ALTER TABLE {TABLE_NAME} "
+ f"ALTER COLUMN {COLUMN_NAME} TYPE {TEMP_ENUM_NAME} "
+ f"USING {COLUMN_NAME}::text::{TEMP_ENUM_NAME}"
+ )
+ )
+
+ # 3) Drop current enum (which includes CREDITS), then 4) rename temp -> original
+ op.execute(sa.text(f"DROP TYPE {ENUM_NAME}"))
+ op.execute(sa.text(f"ALTER TYPE {TEMP_ENUM_NAME} RENAME TO {ENUM_NAME}"))
diff --git a/api/oss/src/apis/fastapi/observability/opentelemetry/otlp.py b/api/oss/src/apis/fastapi/observability/opentelemetry/otlp.py
index f7ef6cd6a4..20b3098739 100644
--- a/api/oss/src/apis/fastapi/observability/opentelemetry/otlp.py
+++ b/api/oss/src/apis/fastapi/observability/opentelemetry/otlp.py
@@ -135,12 +135,6 @@ def parse_otlp_stream(otlp_stream: bytes) -> List[OTelSpanDTO]:
s_span_id = "0x" + span.span_id.hex()
s_context = OTelContextDTO(trace_id=s_trace_id, span_id=s_span_id)
- # log.debug(
- # "[SPAN] [PARSE] ",
- # trace_id=s_trace_id[2:],
- # span_id=s_span_id[2:],
- # )
-
# SPAN PARENT CONTEXT
s_parent_id = span.parent_span_id.hex()
s_parent_id = "0x" + s_parent_id if s_parent_id else None
diff --git a/api/oss/src/apis/fastapi/observability/router.py b/api/oss/src/apis/fastapi/observability/router.py
index 2c96b9a000..b95b115997 100644
--- a/api/oss/src/apis/fastapi/observability/router.py
+++ b/api/oss/src/apis/fastapi/observability/router.py
@@ -296,13 +296,6 @@ async def otlp_receiver(
)
# -------------------------------------------------------------------- #
- # for otel_span in otel_spans:
- # log.debug(
- # "Receiving trace... ",
- # project_id=request.state.project_id,
- # trace_id=str(UUID(otel_span.context.trace_id[2:])),
- # )
-
span_dtos = None
try:
# ---------------------------------------------------------------- #
diff --git a/api/oss/src/apis/fastapi/observability/utils/serialization.py b/api/oss/src/apis/fastapi/observability/utils/serialization.py
index 5d697953da..b2f6cd4ce9 100644
--- a/api/oss/src/apis/fastapi/observability/utils/serialization.py
+++ b/api/oss/src/apis/fastapi/observability/utils/serialization.py
@@ -58,7 +58,7 @@ def decode_value(
value = loads(encoded)
return value
try:
- value = value
+ value = loads(value)
except JSONDecodeError:
pass
return value
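Reviewer note: the hunk above fixes a no-op (`value = value`) that left JSON-encoded strings unparsed. A hedged sketch of the corrected decode path, assuming a simplified signature (the real `decode_value` takes more context):

```python
# Simplified sketch of the fixed behavior: strings holding JSON are
# parsed; non-JSON strings and non-strings pass through unchanged.
from json import loads, JSONDecodeError

def decode_value(value):
    if not isinstance(value, str):
        return value
    try:
        # the fix: actually parse, instead of the no-op `value = value`
        return loads(value)
    except JSONDecodeError:
        return value
```

With the no-op, the `try` could never raise, so serialized payloads stayed as raw strings downstream.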
diff --git a/api/oss/src/core/evaluations/utils.py b/api/oss/src/core/evaluations/utils.py
index eb4f899ff7..ab823c647b 100644
--- a/api/oss/src/core/evaluations/utils.py
+++ b/api/oss/src/core/evaluations/utils.py
@@ -131,7 +131,7 @@ async def fetch_trace(
request,
#
trace_id: str,
- max_retries: int = 15,
+ max_retries: int = 5,
delay: float = 1.0,
) -> Optional[OTelSpansTree]:
for attempt in range(max_retries):
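Reviewer note: this hunk tightens `fetch_trace`'s retry budget from 15 to 5 attempts. A sketch of the poll-with-delay loop it implements, assuming a stand-in `fetch_once` coroutine in place of the real tracing-router call:

```python
# Hedged sketch of the retry shape (not the project's actual function):
# poll until a result appears or the retry budget runs out.
import asyncio

async def fetch_with_retry(fetch_once, max_retries: int = 5, delay: float = 1.0):
    for attempt in range(max_retries):
        result = await fetch_once()
        if result is not None:
            return result
        if attempt < max_retries - 1:
            await asyncio.sleep(delay)  # wait before the next poll
    return None
```

With `delay=1.0`, the change caps the wait for a missing trace at roughly 4 seconds instead of 14 before the caller records an error.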
diff --git a/api/oss/src/core/evaluators/service.py b/api/oss/src/core/evaluators/service.py
index 6e547addfe..37cf6606eb 100644
--- a/api/oss/src/core/evaluators/service.py
+++ b/api/oss/src/core/evaluators/service.py
@@ -1,6 +1,5 @@
from typing import Optional, List
from uuid import UUID, uuid4
-from json import loads
from oss.src.utils.helpers import get_slug_from_name_and_id
from oss.src.services.db_manager import fetch_evaluator_config
@@ -1435,52 +1434,46 @@ def _transfer_evaluator_revision_data(
else None
)
headers = None
- outputs_schema = None
- if str(old_evaluator.evaluator_key) == "auto_ai_critique":
- json_schema = old_evaluator.settings_values.get("json_schema", None)
- if json_schema and isinstance(json_schema, dict):
- outputs_schema = json_schema.get("schema", None)
- if not outputs_schema:
- properties = (
- {"score": {"type": "number"}, "success": {"type": "boolean"}}
- if old_evaluator.evaluator_key
- in (
- "auto_levenshtein_distance",
- "auto_semantic_similarity",
- "auto_similarity_match",
- "auto_json_diff",
- "auto_webhook_test",
- "auto_custom_code_run",
- "auto_ai_critique",
- "rag_faithfulness",
- "rag_context_relevancy",
- )
- else {"success": {"type": "boolean"}}
- )
- required = (
- list(properties.keys())
- if old_evaluator.evaluator_key
- not in (
- "auto_levenshtein_distance",
- "auto_semantic_similarity",
- "auto_similarity_match",
- "auto_json_diff",
- "auto_webhook_test",
- "auto_custom_code_run",
- "auto_ai_critique",
- "rag_faithfulness",
- "rag_context_relevancy",
- )
- else []
+ properties = (
+ {"score": {"type": "number"}, "success": {"type": "boolean"}}
+ if old_evaluator.evaluator_key
+ in (
+ "auto_levenshtein_distance",
+ "auto_semantic_similarity",
+ "auto_similarity_match",
+ "auto_json_diff",
+ "auto_webhook_test",
+ "auto_custom_code_run",
+ "auto_ai_critique",
+ "rag_faithfulness",
+ "rag_context_relevancy",
)
- outputs_schema = {
+ else {"success": {"type": "boolean"}}
+ )
+ schemas = {
+ "outputs": {
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": properties,
- "required": required,
+ "required": (
+ list(properties.keys())
+ if old_evaluator.evaluator_key
+ not in (
+ "auto_levenshtein_distance",
+ "auto_semantic_similarity",
+ "auto_similarity_match",
+ "auto_json_diff",
+ "auto_webhook_test",
+ "auto_custom_code_run",
+ "auto_ai_critique",
+ "rag_faithfulness",
+ "rag_context_relevancy",
+ )
+ else []
+ ),
"additionalProperties": False,
}
- schemas = {"outputs": outputs_schema}
+ }
script = (
{
"content": old_evaluator.settings_values.get("code", None),
diff --git a/api/oss/src/models/api/evaluation_model.py b/api/oss/src/models/api/evaluation_model.py
index d79d124921..ef25bfa140 100644
--- a/api/oss/src/models/api/evaluation_model.py
+++ b/api/oss/src/models/api/evaluation_model.py
@@ -14,7 +14,6 @@ class LegacyEvaluator(BaseModel):
name: str
key: str
direct_use: bool
- settings_presets: Optional[list[dict]] = None
settings_template: dict
description: Optional[str] = None
oss: Optional[bool] = False
diff --git a/api/oss/src/resources/evaluators/evaluators.py b/api/oss/src/resources/evaluators/evaluators.py
index 53a2d48542..760ab550a7 100644
--- a/api/oss/src/resources/evaluators/evaluators.py
+++ b/api/oss/src/resources/evaluators/evaluators.py
@@ -205,78 +205,6 @@
"key": "auto_ai_critique",
"direct_use": False,
"requires_llm_api_keys": True,
- "settings_presets": [
- {
- "key": "default",
- "name": "Default",
- "values": {
- "prompt_template": [
- {
- "role": "system",
- "content": "You are an expert evaluator grading model outputs. Your task is to grade the responses based on the criteria and requirements provided below. \n\nGiven the model output and inputs (and any other data you might get) assign a grade to the output. \n\n## Grading considerations\n- Evaluate the overall value provided in the model output\n- Verify all claims in the output meticulously\n- Differentiate between minor errors and major errors\n- Evaluate the outputs based on the inputs and whether they follow the instruction in the inputs if any\n- Give the highst and lowest score for cases where you have complete certainty about correctness and value\n\n## Scoring Criteria\n- The score should be a decimal value between 0.0 and 1.0\n- A score of 1.0 means that the answer is perfect. This is the highest (best) score \n- A score of 0.0 means that the answer does not meet any of the criteria. This is the lowest possible score you can give.\n\n## output format\nANSWER ONLY THE SCORE. DO NOT USE MARKDOWN. DO NOT PROVIDE ANYTHING OTHER THAN THE NUMBER\n",
- },
- {
- "role": "user",
- "content": "## Model inputs\n{{inputs}}\n## Model outputs\n{{outputs}}",
- },
- ],
- "model": "gpt-4o-mini",
- "response_type": "json_schema",
- "json_schema": {
- "name": "schema",
- "schema": {
- "title": "extract",
- "description": "Extract information from the user's response.",
- "type": "object",
- "properties": {
- "correctness": {
- "type": "boolean",
- "description": "The grade results",
- }
- },
- "required": ["correctness"],
- "strict": True,
- },
- },
- "version": "4",
- },
- },
- {
- "key": "hallucination",
- "name": "Hallucination Detection",
- "values": {
- "prompt_template": [
- {
- "role": "system",
- "content": "You are an expert evaluator grading model outputs for hallucinations. Your task is to identify if the responses contain any hallucinated information based on the criteria and requirements provided below. \n\nGiven the model output and inputs (and any other data you might get) determine if the output contains hallucinations. \n\n## Hallucination considerations\n- Verify all factual claims in the output meticulously against the input data\n- Identify any information that is fabricated or not supported by the input data\n- Differentiate between minor inaccuracies and major hallucinations\n\n## Output format\nANSWER ONLY 'true' IF THE OUTPUT CONTAINS HALLUCINATIONS, OTHERWISE ANSWER 'false'. DO NOT USE MARKDOWN. DO NOT PROVIDE ANYTHING OTHER THAN 'true' OR 'false'\n",
- },
- {
- "role": "user",
- "content": "## Model inputs\n{{inputs}}\n## Model outputs\n{{outputs}}",
- },
- ],
- "model": "gpt-4o-mini",
- "response_type": "json_schema",
- "json_schema": {
- "name": "schema",
- "schema": {
- "title": "extract",
- "description": "Extract information from the user's response.",
- "type": "object",
- "properties": {
- "correctness": {
- "type": "boolean",
- "description": "The hallucination detection result",
- }
- },
- "required": ["correctness"],
- "strict": True,
- },
- },
- "version": "4",
- },
- },
- ],
"settings_template": {
"prompt_template": {
"label": "Prompt Template",
@@ -323,39 +251,10 @@
"advanced": True, # Tells the frontend that this setting is advanced and should be hidden by default
"description": "The LLM model to use for the evaluation",
},
- "response_type": {
- "label": "Response Type",
- "default": "json_schema",
- "type": "hidden",
- "advanced": True,
- "description": "The format of the response from the LLM",
- },
- "json_schema": {
- "label": "Feedback Configuration",
- "default": {
- "name": "schema",
- "schema": {
- "title": "extract",
- "description": "Extract information from the user's response.",
- "type": "object",
- "properties": {
- "correctness": {
- "type": "boolean",
- "description": "The grade results",
- }
- },
- "required": ["correctness"],
- "strict": True,
- },
- },
- "type": "llm_response_schema",
- "advanced": False,
- "description": "Select a response format to structure how your evaluation results are returned.",
- },
"version": {
"label": "Version",
"type": "hidden",
- "default": "4",
+ "default": "3",
"description": "The version of the evaluator", # ignore by the FE
"advanced": False, # ignore by the FE
},
diff --git a/api/oss/src/routers/app_router.py b/api/oss/src/routers/app_router.py
index 338aff4f62..c0b4bd6935 100644
--- a/api/oss/src/routers/app_router.py
+++ b/api/oss/src/routers/app_router.py
@@ -261,7 +261,7 @@ async def create_app(
return CreateAppOutput(app_id=str(app_db.id), app_name=str(app_db.app_name))
-@router.get("/{app_id}/", response_model=ReadAppOutput, operation_id="create_app")
+@router.get("/{app_id}/", response_model=ReadAppOutput, operation_id="read_app")
async def read_app(
request: Request,
app_id: str,
diff --git a/api/oss/src/routers/permissions_router.py b/api/oss/src/routers/permissions_router.py
index 7c1be922fe..5dbf10a0a3 100644
--- a/api/oss/src/routers/permissions_router.py
+++ b/api/oss/src/routers/permissions_router.py
@@ -1,5 +1,5 @@
+from typing import Optional, Union
from uuid import UUID
-from typing import Optional
from fastapi.responses import JSONResponse
from fastapi import Request, Query, HTTPException
@@ -12,6 +12,7 @@
if is_ee():
from ee.src.models.shared_models import Permission
from ee.src.utils.permissions import check_action_access
+ from ee.src.utils.entitlements import check_entitlements, Counter
router = APIRouter()
@@ -69,6 +70,7 @@ async def verify_permissions(
log.warn("Missing required parameters: action, resource_type")
raise Deny()
allow = await get_cache(
project_id=request.state.project_id,
user_id=request.state.user_id,
@@ -83,6 +85,7 @@ async def verify_permissions(
raise Deny()
# CHECK PERMISSION 1/3: SCOPE
+ # log.debug("Checking scope access...")
allow_scope = await check_scope_access(
# organization_id=request.state.organization_id,
workspace_id=request.state.workspace_id,
@@ -102,16 +105,35 @@ async def verify_permissions(
)
raise Deny()
- if is_ee():
- # CHECK PERMISSION 1/2: ACTION
- allow_action = await check_action_access(
+ # CHECK PERMISSION 2/3: ACTION
+ # log.debug("Checking action access...")
+ allow_action = await check_action_access(
+ project_id=request.state.project_id,
+ user_uid=request.state.user_id,
+ permission=Permission(action),
+ )
+
+ if not allow_action:
+ log.warn("Action access denied")
+ await set_cache(
project_id=request.state.project_id,
- user_uid=request.state.user_id,
- permission=Permission(action),
+ user_id=request.state.user_id,
+ namespace="verify_permissions",
+ key=cache_key,
+ value="deny",
)
+ raise Deny()
+
+ # CHECK PERMISSION 3/3: RESOURCE
+ # log.debug("Checking resource access...")
+ allow_resource = await check_resource_access(
+ organization_id=request.state.organization_id,
+ resource_type=resource_type,
+ )
- if not allow_action:
- log.warn("Action access denied")
+ if isinstance(allow_resource, bool):
+ if allow_resource is False:
+ log.warn("Resource access denied")
await set_cache(
project_id=request.state.project_id,
user_id=request.state.user_id,
@@ -121,30 +143,40 @@ async def verify_permissions(
)
raise Deny()
- # CHECK PERMISSION 3/3: RESOURCE
- allow_resource = await check_resource_access(
- resource_type=resource_type,
- )
+ if allow_resource is True:
+ await set_cache(
+ project_id=request.state.project_id,
+ user_id=request.state.user_id,
+ namespace="verify_permissions",
+ key=cache_key,
+ value="allow",
+ )
+ return Allow(request.state.credentials)
- if not allow_resource:
- log.warn("Resource access denied")
- await set_cache(
- project_id=request.state.project_id,
- user_id=request.state.user_id,
- namespace="verify_permissions",
- key=cache_key,
- value="deny",
- )
- raise Deny()
+ elif isinstance(allow_resource, int):
+ if allow_resource <= 0:
+ log.warn("Resource access denied")
+ await set_cache(
+ project_id=request.state.project_id,
+ user_id=request.state.user_id,
+ namespace="verify_permissions",
+ key=cache_key,
+ value="deny",
+ )
+ raise Deny()
+ else:
+ return Allow(request.state.credentials)
+        # Fallback: allow_resource is neither bool nor int, so deny by default
+ log.warn("Resource access denied")
await set_cache(
project_id=request.state.project_id,
user_id=request.state.user_id,
namespace="verify_permissions",
key=cache_key,
- value="allow",
+ value="deny",
)
- return Allow(request.state.credentials)
+ raise Deny()
except Exception as exc: # pylint: disable=bare-except
log.warn(exc)
@@ -180,11 +212,27 @@ async def check_scope_access(
async def check_resource_access(
+ organization_id: UUID,
resource_type: Optional[str] = None,
-) -> bool:
+) -> Union[bool, int]:
allow_resource = False
if resource_type == "service":
allow_resource = True
+ if resource_type == "local_secrets":
+ check, meter, _ = await check_entitlements(
+ organization_id=organization_id,
+ key=Counter.CREDITS,
+ delta=1,
+ )
+
+ if not check:
+ return False
+
+ if not meter or not meter.value:
+ return False
+
+ return meter.value
+
return allow_resource
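The change to `check_resource_access` above widens its return type to `Union[bool, int]`, and the caller in `verify_permissions` branches on `isinstance` accordingly. A minimal standalone sketch of that contract, with a hypothetical `credits` parameter standing in for the entitlements meter:

```python
from typing import Union


def check_resource_access(resource_type: str, credits: int) -> Union[bool, int]:
    """Hypothetical stand-in: bool is a plain allow/deny, int is a metered balance."""
    if resource_type == "service":
        return True
    if resource_type == "local_secrets":
        return credits  # <= 0 means the meter is exhausted
    return False


def is_allowed(result: Union[bool, int]) -> bool:
    # bool must be tested first: isinstance(True, int) is also True in Python,
    # so the int branch would otherwise swallow the bool case.
    if isinstance(result, bool):
        return result
    if isinstance(result, int):
        return result > 0
    return False  # unexpected type: deny by default
```

This is why the patch checks `isinstance(allow_resource, bool)` before `isinstance(allow_resource, int)`: `bool` is a subclass of `int`, so reversing the order would misclassify `True`/`False` as meter values.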
diff --git a/api/oss/src/routers/testset_router.py b/api/oss/src/routers/testset_router.py
index e560c06198..dffad517af 100644
--- a/api/oss/src/routers/testset_router.py
+++ b/api/oss/src/routers/testset_router.py
@@ -252,9 +252,7 @@ async def import_testset(
) from error
-@router.post(
- "/", response_model=TestsetSimpleResponse, operation_id="create_legacy_testset"
-)
+@router.post("/", response_model=TestsetSimpleResponse, operation_id="create_testset")
async def create_testset(
csvdata: NewTestset,
request: Request,
diff --git a/api/oss/src/services/analytics_service.py b/api/oss/src/services/analytics_service.py
index 1903eed0ce..1d7d5fce79 100644
--- a/api/oss/src/services/analytics_service.py
+++ b/api/oss/src/services/analytics_service.py
@@ -39,7 +39,7 @@
if POSTHOG_API_KEY:
posthog.api_key = POSTHOG_API_KEY
posthog.host = POSTHOG_HOST
- log.info("PostHog initialized with host %s", POSTHOG_HOST)
+        log.info("PostHog initialized with host: %s", POSTHOG_HOST)
else:
log.warn("PostHog API key not found in environment variables")
diff --git a/api/oss/src/services/evaluators_service.py b/api/oss/src/services/evaluators_service.py
index 5ff93cabb0..8b5ea9eb74 100644
--- a/api/oss/src/services/evaluators_service.py
+++ b/api/oss/src/services/evaluators_service.py
@@ -1,7 +1,7 @@
import re
import json
import traceback
-from typing import Any, Dict, Union, List, Optional
+from typing import Any, Dict, Union, List
import litellm
import httpx
@@ -515,153 +515,6 @@ async def auto_ai_critique(
)
-import json
-import re
-from typing import Any, Dict, Iterable, Tuple, Optional
-
-try:
- import jsonpath # ✅ use module API
- from jsonpath import JSONPointer # pointer class is fine to use
-except Exception:
- jsonpath = None
- JSONPointer = None
-
-# ========= Scheme detection =========
-
-
-def detect_scheme(expr: str) -> str:
- """Return 'json-path', 'json-pointer', or 'dot-notation' based on the placeholder prefix."""
- if expr.startswith("$"):
- return "json-path"
- if expr.startswith("/"):
- return "json-pointer"
- return "dot-notation"
-
-
-# ========= Resolvers =========
-
-
-def resolve_dot_notation(expr: str, data: dict) -> object:
- if "[" in expr or "]" in expr:
- raise KeyError(f"Bracket syntax is not supported in dot-notation: {expr!r}")
-
- # First, check if the expression exists as a literal key (e.g., "topic.story" as a single key)
- # This allows users to use dots in their variable names without nested access
- if expr in data:
- return data[expr]
-
- # If not found as a literal key, try to parse as dot-notation path
- cur = data
- for token in (p for p in expr.split(".") if p):
- if isinstance(cur, list) and token.isdigit():
- cur = cur[int(token)]
- else:
- if not isinstance(cur, dict):
- raise KeyError(
- f"Cannot access key {token!r} on non-dict while resolving {expr!r}"
- )
- if token not in cur:
- raise KeyError(f"Missing key {token!r} while resolving {expr!r}")
- cur = cur[token]
- return cur
-
-
-def resolve_json_path(expr: str, data: dict) -> object:
- if jsonpath is None:
- raise ImportError("python-jsonpath is required for json-path ($...)")
-
- if not (expr == "$" or expr.startswith("$.") or expr.startswith("$[")):
- raise ValueError(
- f"Invalid json-path expression {expr!r}. "
- "Must start with '$', '$.' or '$[' (no implicit normalization)."
- )
-
- # Use package-level APIf
- results = jsonpath.findall(expr, data) # always returns a list
- return results[0] if len(results) == 1 else results
-
-
-def resolve_json_pointer(expr: str, data: Dict[str, Any]) -> Any:
- """Resolve a JSON Pointer; returns a single value."""
- if JSONPointer is None:
- raise ImportError("python-jsonpath is required for json-pointer (/...)")
- return JSONPointer(expr).resolve(data)
-
-
-def resolve_any(expr: str, data: Dict[str, Any]) -> Any:
- """Dispatch to the right resolver based on detected scheme."""
- scheme = detect_scheme(expr)
- if scheme == "json-path":
- return resolve_json_path(expr, data)
- if scheme == "json-pointer":
- return resolve_json_pointer(expr, data)
- return resolve_dot_notation(expr, data)
-
-
-# ========= Placeholder & coercion helpers =========
-
-_PLACEHOLDER_RE = re.compile(r"\{\{\s*(.*?)\s*\}\}")
-
-
-def extract_placeholders(template: str) -> Iterable[str]:
- """Yield the inner text of all {{ ... }} occurrences (trimmed)."""
- for m in _PLACEHOLDER_RE.finditer(template):
- yield m.group(1).strip()
-
-
-def coerce_to_str(value: Any) -> str:
- """Pretty stringify values for embedding into templates."""
- if isinstance(value, (dict, list)):
- return json.dumps(value, ensure_ascii=False)
- return str(value)
-
-
-def build_replacements(
- placeholders: Iterable[str], data: Dict[str, Any]
-) -> Tuple[Dict[str, str], set]:
- """
- Resolve all placeholders against data.
- Returns (replacements, unresolved_placeholders).
- """
- replacements: Dict[str, str] = {}
- unresolved: set = set()
- for expr in set(placeholders):
- try:
- val = resolve_any(expr, data)
- # Escape backslashes to avoid regex replacement surprises
- replacements[expr] = coerce_to_str(val).replace("\\", "\\\\")
- except Exception:
- unresolved.add(expr)
- return replacements, unresolved
-
-
-def apply_replacements(template: str, replacements: Dict[str, str]) -> str:
- """Replace {{ expr }} using a callback to avoid regex-injection issues."""
-
- def _repl(m: re.Match) -> str:
- expr = m.group(1).strip()
- return replacements.get(expr, m.group(0))
-
- return _PLACEHOLDER_RE.sub(_repl, template)
-
-
-def compute_truly_unreplaced(original: set, rendered: str) -> set:
- """Only count placeholders that were in the original template and remain."""
- now = set(extract_placeholders(rendered))
- return original & now
-
-
-def missing_lib_hints(unreplaced: set) -> Optional[str]:
- """Suggest installing python-jsonpath if placeholders indicate json-path or json-pointer usage."""
- if any(expr.startswith("$") or expr.startswith("/") for expr in unreplaced) and (
- jsonpath is None or JSONPointer is None
- ):
- return (
- "Install python-jsonpath to enable json-path ($...) and json-pointer (/...)"
- )
- return None
-
-
def _format_with_template(
content: str,
format: str,
@@ -677,36 +530,41 @@ def _format_with_template(
try:
return Template(content).render(**kwargs)
- except TemplateError:
+        except TemplateError:
return content
elif format == "curly":
- original_placeholders = set(extract_placeholders(content))
- replacements, _unresolved = build_replacements(
- original_placeholders,
- kwargs,
- )
+ # Extract variables that exist in the original template before replacement
+ # This allows us to distinguish template variables from {{}} in user input values
+ original_variables = set(re.findall(r"\{\{(.*?)\}\}", content))
- result = apply_replacements(content, replacements)
+ result = content
+ for key, value in kwargs.items():
+ pattern = r"\{\{" + re.escape(key) + r"\}\}"
+ # Escape backslashes in the replacement string to prevent regex interpretation
+ escaped_value = str(value).replace("\\", "\\\\")
+ result = re.sub(pattern, escaped_value, result)
- truly_unreplaced = compute_truly_unreplaced(original_placeholders, result)
+ # Only check if ORIGINAL template variables remain unreplaced
+ # Don't error on {{}} that came from user input values
+ unreplaced_matches = set(re.findall(r"\{\{(.*?)\}\}", result))
+ truly_unreplaced = original_variables & unreplaced_matches
if truly_unreplaced:
- hint = missing_lib_hints(truly_unreplaced)
- suffix = f" Hint: {hint}" if hint else ""
raise ValueError(
- f"Template variables not found or unresolved: "
- f"{', '.join(sorted(truly_unreplaced))}.{suffix}"
+ f"Template variables not found in inputs: {', '.join(sorted(truly_unreplaced))}"
)
return result
- return content
except Exception as e:
- log.error(f"Error during template formatting: {str(e)}")
return content
+ return content
+
async def ai_critique(input: EvaluatorInputInterface) -> EvaluatorOutputInterface:
openai_api_key = input.credentials.get("OPENAI_API_KEY", None)
@@ -721,10 +579,7 @@ async def ai_critique(input: EvaluatorInputInterface) -> EvaluatorOutputInterfac
)
# Validate prompt variables if there's a prompt in the inputs
- if input.settings.get("prompt_template") and input.settings.get("version") not in [
- "3",
- "4",
- ]:
+ if input.settings.get("prompt_template") and input.settings.get("version") != "3":
try:
validate_prompt_variables(
prompt=input.settings.get("prompt_template", []),
@@ -734,200 +589,6 @@ async def ai_critique(input: EvaluatorInputInterface) -> EvaluatorOutputInterfac
raise e
if (
- input.settings.get("version") == "4"
- ) and ( # this check is used when running in the background (celery)
- type(input.settings.get("prompt_template", "")) is not str
- ): # this check is used when running in the frontend (since in that case we'll alway have version 2)
- try:
- parameters = input.settings or dict()
-
- if not isinstance(parameters, dict):
- parameters = dict()
-
- inputs = input.inputs or None
-
- if not isinstance(inputs, dict):
- inputs = dict()
-
- outputs = input.inputs.get("prediction") or None
-
- if "ground_truth" in inputs:
- del inputs["ground_truth"]
- if "prediction" in inputs:
- del inputs["prediction"]
-
- # ---------------------------------------------------------------- #
-
- correct_answer_key = parameters.get("correct_answer_key")
-
- prompt_template: List = parameters.get("prompt_template") or list()
-
- template_version = parameters.get("version") or "3"
-
- default_format = "fstring" if template_version == "2" else "curly"
-
- template_format = parameters.get("template_format") or default_format
-
- response_type = input.settings.get("response_type") or "text"
-
- json_schema = input.settings.get("json_schema") or None
-
- json_schema = json_schema if response_type == "json_schema" else None
-
- response_format = dict(type=response_type)
-
- if response_type == "json_schema":
- response_format["json_schema"] = json_schema
-
- model = parameters.get("model") or "gpt-4o-mini"
-
- correct_answer = None
-
- if inputs and isinstance(inputs, dict) and correct_answer_key:
- correct_answer = inputs[correct_answer_key]
-
- secrets = await SecretsManager.retrieve_secrets()
-
- openai_api_key = None # secrets.get("OPENAI_API_KEY")
- anthropic_api_key = None # secrets.get("ANTHROPIC_API_KEY")
- openrouter_api_key = None # secrets.get("OPENROUTER_API_KEY")
- cohere_api_key = None # secrets.get("COHERE_API_KEY")
- azure_api_key = None # secrets.get("AZURE_API_KEY")
- groq_api_key = None # secrets.get("GROQ_API_KEY")
-
- for secret in secrets:
- if secret.get("kind") == "provider_key":
- secret_data = secret.get("data", {})
- if secret_data.get("kind") == "openai":
- provider_data = secret_data.get("provider", {})
- openai_api_key = provider_data.get("key") or openai_api_key
- if secret_data.get("kind") == "anthropic":
- provider_data = secret_data.get("provider", {})
- anthropic_api_key = (
- provider_data.get("key") or anthropic_api_key
- )
- if secret_data.get("kind") == "openrouter":
- provider_data = secret_data.get("provider", {})
- openrouter_api_key = (
- provider_data.get("key") or openrouter_api_key
- )
- if secret_data.get("kind") == "cohere":
- provider_data = secret_data.get("provider", {})
- cohere_api_key = provider_data.get("key") or cohere_api_key
- if secret_data.get("kind") == "azure":
- provider_data = secret_data.get("provider", {})
- azure_api_key = provider_data.get("key") or azure_api_key
- if secret_data.get("kind") == "groq":
- provider_data = secret_data.get("provider", {})
- groq_api_key = provider_data.get("key") or groq_api_key
-
- threshold = parameters.get("threshold") or 0.5
-
- score = None
- success = None
-
- litellm.openai_key = openai_api_key
- litellm.anthropic_key = anthropic_api_key
- litellm.openrouter_key = openrouter_api_key
- litellm.cohere_key = cohere_api_key
- litellm.azure_key = azure_api_key
- litellm.groq_key = groq_api_key
-
- context: Dict[str, Any] = dict()
-
- if parameters:
- context.update(
- **{
- "parameters": parameters,
- }
- )
-
- if correct_answer:
- context.update(
- **{
- "ground_truth": correct_answer,
- "correct_answer": correct_answer,
- "reference": correct_answer,
- }
- )
-
- if outputs:
- context.update(
- **{
- "prediction": outputs,
- "outputs": outputs,
- }
- )
-
- if inputs:
- context.update(**inputs)
- context.update(
- **{
- "inputs": inputs,
- }
- )
-
- formatted_prompt_template = [
- {
- "role": message["role"],
- "content": _format_with_template(
- content=message["content"],
- format=template_format,
- kwargs=context,
- ),
- }
- for message in prompt_template
- ]
-
- try:
- response = await litellm.acompletion(
- model=model,
- messages=formatted_prompt_template,
- temperature=0.01,
- response_format=response_format,
- )
-
- _outputs = response.choices[0].message.content.strip() # type: ignore
-
- except litellm.AuthenticationError as e: # type: ignore
- e.message = e.message.replace(
- "litellm.AuthenticationError: AuthenticationError: ", ""
- )
- raise e
-
- except Exception as e:
- raise ValueError(f"AI Critique evaluation failed: {str(e)}") from e
- # --------------------------------------------------------------------------
-
- try:
- _outputs = json.loads(_outputs)
- except:
- pass
-
- if isinstance(_outputs, (int, float)):
- return EvaluatorOutputInterface(
- outputs={
- "score": _outputs,
- "success": _outputs >= threshold,
- },
- )
-
- if isinstance(_outputs, bool):
- return EvaluatorOutputInterface(
- outputs={
- "success": _outputs,
- },
- )
-
- if isinstance(_outputs, dict):
- return EvaluatorOutputInterface(
- outputs=_outputs,
- )
-
- raise ValueError(f"Could not parse output: {_outputs}")
- except Exception as e:
- raise RuntimeError(f"Evaluation failed: {str(e)}")
- elif (
input.settings.get("version") == "3"
) and ( # this check is used when running in the background (celery)
type(input.settings.get("prompt_template", "")) is not str
@@ -1067,23 +728,19 @@ async def ai_critique(input: EvaluatorInputInterface) -> EvaluatorOutputInterfac
messages=formatted_prompt_template,
temperature=0.01,
)
- outputs = response.choices[0].message.content.strip()
- try:
- score = float(outputs)
- success = score >= threshold
+ score = response.choices[0].message.content.strip()
+
+ score = float(score)
+
+ success = score >= threshold
+
+ # ---------------------------------------------------------------- #
+
+ return EvaluatorOutputInterface(
+ outputs={"score": score, "success": success},
+ )
- return EvaluatorOutputInterface(
- outputs={"score": score, "success": success},
- )
- except ValueError:
- # if the output is not a float, we try to extract a float from the text
- match = re.search(r"[-+]?\d*\.\d+|\d+", outputs)
- if match:
- score = float(match.group())
- return EvaluatorOutputInterface(outputs={"score": score})
- else:
- raise ValueError(f"Could not parse output as float: {outputs}")
except Exception as e:
raise RuntimeError(f"Evaluation failed: {str(e)}")
elif (
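The new `curly` branch in `_format_with_template` above replaces the removed resolver helpers with direct `re.sub` substitution: backslashes in values are doubled before substitution, and only variables that existed in the original template are flagged as missing. A standalone sketch of that behavior (simplified: no Jinja fallback, no logging):

```python
import re


def format_curly(content: str, **kwargs) -> str:
    # Record the variables present BEFORE substitution, so that {{...}}
    # arriving inside a substituted value is not mistaken for a missing variable.
    original_variables = set(re.findall(r"\{\{(.*?)\}\}", content))
    result = content
    for key, value in kwargs.items():
        pattern = r"\{\{" + re.escape(key) + r"\}\}"
        # Double backslashes so re.sub does not treat them as group escapes
        result = re.sub(pattern, str(value).replace("\\", "\\\\"), result)
    unreplaced = set(re.findall(r"\{\{(.*?)\}\}", result)) & original_variables
    if unreplaced:
        raise ValueError(
            f"Template variables not found in inputs: {', '.join(sorted(unreplaced))}"
        )
    return result
```

As in the patch, the pattern matches `{{name}}` without surrounding whitespace, a value that itself contains `{{...}}` passes through untouched, and only a genuinely unresolved template variable raises.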
diff --git a/api/oss/src/utils/logging.py b/api/oss/src/utils/logging.py
index 67c53d4369..fa424ce74c 100644
--- a/api/oss/src/utils/logging.py
+++ b/api/oss/src/utils/logging.py
@@ -4,9 +4,18 @@
import logging
from typing import Any, Optional
+# from datetime import datetime
+# from logging.handlers import RotatingFileHandler
+
import structlog
from structlog.typing import EventDict, WrappedLogger, Processor
+# from opentelemetry.trace import get_current_span
+# from opentelemetry._logs import set_logger_provider
+# from opentelemetry.sdk._logs import LoggingHandler, LoggerProvider
+# from opentelemetry.sdk._logs.export import BatchLogRecordProcessor
+# from opentelemetry.exporter.otlp.proto.http._log_exporter import OTLPLogExporter
+
from oss.src.utils.env import env
@@ -33,6 +42,15 @@ def bound_logger_trace(self, *args, **kwargs):
AGENTA_LOG_CONSOLE_ENABLED = env.AGENTA_LOG_CONSOLE_ENABLED
AGENTA_LOG_CONSOLE_LEVEL = env.AGENTA_LOG_CONSOLE_LEVEL
+# AGENTA_LOG_OTLP_ENABLED = env.AGENTA_LOG_OTLP_ENABLED
+# AGENTA_LOG_OTLP_LEVEL = env.AGENTA_LOG_OTLP_LEVEL
+
+# AGENTA_LOG_FILE_ENABLED = env.AGENTA_LOG_FILE_ENABLED
+# AGENTA_LOG_FILE_LEVEL = env.AGENTA_LOG_FILE_LEVEL
+# AGENTA_LOG_FILE_BASE = env.AGENTA_LOG_FILE_PATH
+# LOG_FILE_DATE = datetime.utcnow().strftime("%Y-%m-%d")
+# AGENTA_LOG_FILE_PATH = f"{AGENTA_LOG_FILE_BASE}-{LOG_FILE_DATE}.log"
+
# COLORS
LEVEL_COLORS = {
"TRACE": "\033[97m",
@@ -72,6 +90,15 @@ def process_positional_args(_, __, event_dict: EventDict) -> EventDict:
return event_dict
+# def add_trace_context(_, __, event_dict: EventDict) -> EventDict:
+# span = get_current_span()
+# if span and span.get_span_context().is_valid:
+# ctx = span.get_span_context()
+# event_dict["TraceId"] = format(ctx.trace_id, "032x")
+# event_dict["SpanId"] = format(ctx.span_id, "016x")
+# return event_dict
+
+
def add_logger_info(
logger: WrappedLogger, method_name: str, event_dict: EventDict
) -> EventDict:
@@ -88,7 +115,6 @@ def add_logger_info(
event_dict["SeverityNumber"] = SEVERITY_NUMBERS.get(level, 9)
event_dict["LoggerName"] = logger.name
event_dict["MethodName"] = method_name
- event_dict["pid"] = os.getpid()
return event_dict
@@ -103,7 +129,6 @@ def colored_console_renderer() -> Processor:
}
def render(_, __, event_dict: EventDict) -> str:
- pid = event_dict.pop("pid", None)
ts = event_dict.pop("Timestamp", "")[:23] + "Z"
level = event_dict.pop("level", "INFO")
msg = event_dict.pop("event", "")
@@ -120,69 +145,102 @@ def render(_, __, event_dict: EventDict) -> str:
return render
+# def plain_renderer() -> Processor:
+# hidden = {
+# "SeverityText",
+# "SeverityNumber",
+# "MethodName",
+# "logger_factory",
+# "LoggerName",
+# "level",
+# }
+
+# def render(_, __, event_dict: EventDict) -> str:
+# ts = event_dict.pop("Timestamp", "")[:23] + "Z"
+# level = event_dict.get("level", "")
+# msg = event_dict.pop("event", "")
+# padded = f"[{level:<5}]"
+# logger = f"[{event_dict.pop('logger', '')}]"
+# extras = " ".join(f"{k}={v}" for k, v in event_dict.items() if k not in hidden)
+# return f"{ts} {padded} {msg} {logger} {extras}"
+
+# return render
+
+
+# def json_renderer() -> Processor:
+# return structlog.processors.JSONRenderer()
+
+
SHARED_PROCESSORS: list[Processor] = [
structlog.processors.TimeStamper(fmt="iso", utc=True, key="Timestamp"),
process_positional_args,
+ # add_trace_context,
add_logger_info,
structlog.processors.format_exc_info,
structlog.processors.dict_tracebacks,
]
-# Guard against double initialization
-_LOGGING_CONFIGURED = False
+def create_struct_logger(
+ processors: list[Processor], name: str
+) -> structlog.stdlib.BoundLogger:
+ logger = logging.getLogger(name)
+ logger.setLevel(TRACE_LEVEL)
+ return structlog.wrap_logger(
+ logger,
+ processors=SHARED_PROCESSORS + processors,
+ wrapper_class=structlog.stdlib.BoundLogger,
+ logger_factory=structlog.stdlib.LoggerFactory(),
+ cache_logger_on_first_use=True,
+ )
-# ensure no duplicate sinks via root
-_root = logging.getLogger()
-_root.handlers.clear()
-_root.propagate = False
# CONFIGURE HANDLERS AND STRUCTLOG LOGGERS
+handlers = []
loggers = []
-if AGENTA_LOG_CONSOLE_ENABLED and not _LOGGING_CONFIGURED:
- _LOGGING_CONFIGURED = True
-
- # Create a single handler for console output
- console_handler = logging.StreamHandler(sys.stdout)
- console_handler.setLevel(getattr(logging, AGENTA_LOG_CONSOLE_LEVEL, TRACE_LEVEL))
- console_handler.setFormatter(logging.Formatter("%(message)s"))
-
- # Configure the structlog console logger
- console_logger = logging.getLogger("agenta_console")
- console_logger.handlers.clear()
- console_logger.addHandler(console_handler)
- console_logger.setLevel(TRACE_LEVEL)
- console_logger.propagate = False
-
- loggers.append(
- structlog.wrap_logger(
- console_logger,
- processors=SHARED_PROCESSORS + [colored_console_renderer()],
- wrapper_class=structlog.stdlib.BoundLogger,
- logger_factory=structlog.stdlib.LoggerFactory(),
- cache_logger_on_first_use=False, # Don't cache to avoid stale state
- )
- )
+if AGENTA_LOG_CONSOLE_ENABLED:
+ h = logging.StreamHandler(sys.stdout)
+ h.setLevel(getattr(logging, AGENTA_LOG_CONSOLE_LEVEL, TRACE_LEVEL))
+ h.setFormatter(logging.Formatter("%(message)s"))
+
+ # Console logger (your app logs)
+ logger = logging.getLogger("console")
+ logger.handlers.clear()
+ logger.addHandler(h)
+ logger.propagate = False
+ loggers.append(create_struct_logger([colored_console_renderer()], "console"))
- # Configure uvicorn/gunicorn loggers with separate handlers
+ # Gunicorn/Uvicorn loggers
for name in ("uvicorn.access", "uvicorn.error", "gunicorn.error"):
- uh = logging.StreamHandler(sys.stdout)
- uh.setLevel(getattr(logging, AGENTA_LOG_CONSOLE_LEVEL, TRACE_LEVEL))
- uh.setFormatter(logging.Formatter("%(message)s"))
- server_logger = logging.getLogger(name)
- server_logger.handlers.clear()
- server_logger.setLevel(logging.INFO)
- server_logger.addHandler(uh)
- server_logger.propagate = False
-
- # Intercept agenta SDK loggers to prevent duplicate output
- for sdk_name in ("agenta", "agenta.sdk"):
- sdk_logger = logging.getLogger(sdk_name)
- sdk_logger.handlers.clear()
- sdk_logger.addHandler(console_handler) # Use our handler
- sdk_logger.setLevel(logging.INFO)
- sdk_logger.propagate = False
+ gunicorn_logger = logging.getLogger(name)
+        gunicorn_logger.handlers.clear()  # drop inherited handlers to avoid duplicate log lines
+ gunicorn_logger.setLevel(logging.INFO)
+ gunicorn_logger.addHandler(h)
+ gunicorn_logger.propagate = False
+
+# if AGENTA_LOG_FILE_ENABLED:
+# h = RotatingFileHandler(AGENTA_LOG_FILE_PATH, maxBytes=10 * 1024 * 1024, backupCount=5)
+# h.setLevel(getattr(logging, AGENTA_LOG_FILE_LEVEL, logging.WARNING))
+# h.setFormatter(logging.Formatter("%(message)s"))
+# logger = logging.getLogger("file")
+# logger.addHandler(h)
+# logger.propagate = False # 👈 PREVENT propagation to root (avoids Celery duplicate)
+# loggers.append(create_struct_logger([plain_renderer()], "file"))
+
+# if AGENTA_LOG_OTLP_ENABLED:
+# provider = LoggerProvider()
+# exporter = OTLPLogExporter()
+# provider.add_log_record_processor(BatchLogRecordProcessor(exporter))
+# set_logger_provider(provider)
+# h = LoggingHandler(
+# level=getattr(logging, AGENTA_LOG_OTLP_LEVEL, logging.INFO), logger_provider=provider
+# )
+# h.setFormatter(logging.Formatter("%(message)s"))
+# logger = logging.getLogger("otel")
+# logger.addHandler(h)
+#         logger.propagate = False  # 👈 PREVENT propagation to root (avoids Celery duplicate)
+# loggers.append(create_struct_logger([json_renderer()], "otel"))
class MultiLogger:
@@ -221,8 +279,11 @@ def bind(self, **kwargs):
return MultiLogger(*(l.bind(**kwargs) for l in self._loggers))
+multi_logger = MultiLogger(*loggers)
+
+
def get_logger(name: Optional[str] = None) -> MultiLogger:
- return MultiLogger(*loggers).bind(logger=name)
+ return multi_logger.bind(logger=name)
def get_module_logger(path: str) -> MultiLogger:
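The refactor above builds `multi_logger` once at import time and has `get_logger` merely `bind` a name onto it, instead of constructing a fresh `MultiLogger` per call. A simplified stdlib-only sketch of why that is safe, assuming (as in the patch) that `bind` returns a new wrapper rather than mutating shared state:

```python
import logging
import sys


class MultiLogger:
    """Fans one call out to several sinks; the context dict stands in for
    structlog's bound fields (a simplified stand-in for the real class)."""

    def __init__(self, sinks, context=None):
        self._sinks = sinks
        self._context = dict(context or {})

    def bind(self, **kwargs):
        # Returns a NEW MultiLogger: the shared module-level instance is
        # never mutated, so building it once and rebinding per caller is safe.
        return MultiLogger(self._sinks, {**self._context, **kwargs})

    def info(self, msg):
        suffix = " ".join(f"{k}={v}" for k, v in self._context.items())
        for sink in self._sinks:
            sink.info(f"{msg} {suffix}".rstrip())


_handler = logging.StreamHandler(sys.stdout)
_handler.setFormatter(logging.Formatter("%(message)s"))
_console = logging.getLogger("console")
_console.handlers.clear()
_console.addHandler(_handler)
_console.setLevel(logging.INFO)
_console.propagate = False

# Built once at import time; callers only bind.
multi_logger = MultiLogger([_console])


def get_logger(name=None):
    return multi_logger.bind(logger=name)
```

`get_logger("worker")` and `get_logger("api")` each get an independent bound context while sharing the same underlying handlers, so no handler or wrapper is re-created per call.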
diff --git a/api/poetry.lock b/api/poetry.lock
index 737d1cb26b..7844e01e97 100644
--- a/api/poetry.lock
+++ b/api/poetry.lock
@@ -1347,15 +1347,15 @@ files = [
[[package]]
name = "fsspec"
-version = "2025.10.0"
+version = "2025.9.0"
description = "File-system specification"
optional = false
python-versions = ">=3.9"
groups = ["main"]
markers = "python_version == \"3.11\" or python_version >= \"3.12\""
files = [
- {file = "fsspec-2025.10.0-py3-none-any.whl", hash = "sha256:7c7712353ae7d875407f97715f0e1ffcc21e33d5b24556cb1e090ae9409ec61d"},
- {file = "fsspec-2025.10.0.tar.gz", hash = "sha256:b6789427626f068f9a83ca4e8a3cc050850b6c0f71f99ddb4f542b8266a26a59"},
+ {file = "fsspec-2025.9.0-py3-none-any.whl", hash = "sha256:530dc2a2af60a414a832059574df4a6e10cce927f6f4a78209390fe38955cfb7"},
+ {file = "fsspec-2025.9.0.tar.gz", hash = "sha256:19fd429483d25d28b65ec68f9f4adc16c17ea2c7c7bf54ec61360d478fb19c19"},
]
[package.extras]
@@ -1401,15 +1401,15 @@ files = [
[[package]]
name = "google-auth"
-version = "2.42.1"
+version = "2.42.0"
description = "Google Authentication Library"
optional = false
python-versions = ">=3.7"
groups = ["main"]
markers = "python_version == \"3.11\" or python_version >= \"3.12\""
files = [
- {file = "google_auth-2.42.1-py2.py3-none-any.whl", hash = "sha256:eb73d71c91fc95dbd221a2eb87477c278a355e7367a35c0d84e6b0e5f9b4ad11"},
- {file = "google_auth-2.42.1.tar.gz", hash = "sha256:30178b7a21aa50bffbdc1ffcb34ff770a2f65c712170ecd5446c4bef4dc2b94e"},
+ {file = "google_auth-2.42.0-py2.py3-none-any.whl", hash = "sha256:f8f944bcb9723339b0ef58a73840f3c61bc91b69bf7368464906120b55804473"},
+ {file = "google_auth-2.42.0.tar.gz", hash = "sha256:9bbbeef3442586effb124d1ca032cfb8fb7acd8754ab79b55facd2b8f3ab2802"},
]
[package.dependencies]
@@ -3517,22 +3517,6 @@ files = [
{file = "python_http_client-3.3.7.tar.gz", hash = "sha256:bf841ee45262747e00dec7ee9971dfb8c7d83083f5713596488d67739170cea0"},
]
-[[package]]
-name = "python-jsonpath"
-version = "2.0.1"
-description = "JSONPath, JSON Pointer and JSON Patch for Python."
-optional = false
-python-versions = ">=3.8"
-groups = ["main"]
-markers = "python_version == \"3.11\" or python_version >= \"3.12\""
-files = [
- {file = "python_jsonpath-2.0.1-py3-none-any.whl", hash = "sha256:ebd518b7c883acc5b976518d76b6c96288405edec7d9ef838641869c1e1a5eb7"},
- {file = "python_jsonpath-2.0.1.tar.gz", hash = "sha256:32a84ebb2dc0ec1b42a6e165b0f9174aef8310bad29154ad9aee31ac37cca18f"},
-]
-
-[package.extras]
-strict = ["iregexp-check (>=0.1.4)", "regex"]
-
[[package]]
name = "python-multipart"
version = "0.0.20"
@@ -3638,102 +3622,105 @@ files = [
[[package]]
name = "rapidfuzz"
-version = "3.14.2"
+version = "3.14.1"
description = "rapid fuzzy string matching"
optional = false
python-versions = ">=3.10"
groups = ["main"]
markers = "python_version == \"3.11\" or python_version >= \"3.12\""
files = [
- {file = "rapidfuzz-3.14.2-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:37ddc4cc3eafe29ec8ba451fcec5244af441eeb53b4e7b4d1d886cd3ff3624f4"},
- {file = "rapidfuzz-3.14.2-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:654be63b17f3da8414968dfdf15c46c8205960ec8508cbb9d837347bf036dc0b"},
- {file = "rapidfuzz-3.14.2-cp310-cp310-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:75866e9fa474ccfe6b77367fb7c10e6f9754fb910d9b110490a6fad25501a039"},
- {file = "rapidfuzz-3.14.2-cp310-cp310-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:fd915693a8d441e5f277bef23065275a2bb492724b5ccf64e38e60edd702b0fb"},
- {file = "rapidfuzz-3.14.2-cp310-cp310-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:e702e76a6166bff466a33888902404209fffd83740d24918ef74514542f66367"},
- {file = "rapidfuzz-3.14.2-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:78f84592f3a2f2773d6f411b755d683b1ce7f05adff4c12c0de923d5f2786e51"},
- {file = "rapidfuzz-3.14.2-cp310-cp310-manylinux_2_31_armv7l.whl", hash = "sha256:36d43c9f1b88322ad05b22fa80b6b4a95d2b193d392d3aa7bee652c144cfb1d9"},
- {file = "rapidfuzz-3.14.2-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:69d6f93916717314209f4e8701d203876baeadf8c9dcaee961b8afeba7435643"},
- {file = "rapidfuzz-3.14.2-cp310-cp310-musllinux_1_2_armv7l.whl", hash = "sha256:e262958d3ca723c1ce32030384a1626e3d43ba7465e01a3e2b633f4300956150"},
- {file = "rapidfuzz-3.14.2-cp310-cp310-musllinux_1_2_ppc64le.whl", hash = "sha256:26b5e6e0d39337431ab1b36faf604873cb1f0de9280e0703f61c6753c8fa1f7f"},
- {file = "rapidfuzz-3.14.2-cp310-cp310-musllinux_1_2_s390x.whl", hash = "sha256:2aad09712e1ffbc00ac25f12646c7065b84496af7cd0a70b1d5aff6318405732"},
- {file = "rapidfuzz-3.14.2-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:f10dbbafa3decee704b7a02ffe7914d7dfbbd3d1fce7f37ed2c3d6c3a7c9a8e6"},
- {file = "rapidfuzz-3.14.2-cp310-cp310-win32.whl", hash = "sha256:6c3dab8f9d4271e32c8746461a58412871ebb07654f77aa6121961e796482d30"},
- {file = "rapidfuzz-3.14.2-cp310-cp310-win_amd64.whl", hash = "sha256:5386ce287e5b71db4fd71747a23ae0ca5053012dc959049e160857c5fdadf6cd"},
- {file = "rapidfuzz-3.14.2-cp310-cp310-win_arm64.whl", hash = "sha256:c78d6f205b871f2d41173f82ded66bcef2f692e1b90c0f627cc8035b72898f35"},
- {file = "rapidfuzz-3.14.2-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:3969670d4b85e589564d6a75638ec2372a4375b7e68e747f3bd37b507cf843e4"},
- {file = "rapidfuzz-3.14.2-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:061884b23a8c5eea9443e52acf02cbd533aff93a5439b0e90b5586a0638b8720"},
- {file = "rapidfuzz-3.14.2-cp311-cp311-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:6fc2bc48a219c171deb8529bfcc90ca6663fbcaa42b54ef202858976078f858a"},
- {file = "rapidfuzz-3.14.2-cp311-cp311-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:cfa62729ac2d77a50a240b6331e9fffb5e070625e97e8f7e50fa882b3ea396ad"},
- {file = "rapidfuzz-3.14.2-cp311-cp311-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:2d001aaf47a500083b189140df16eaefd675bf06c818a71ae9f687b0d6f804f8"},
- {file = "rapidfuzz-3.14.2-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c95eeaa7f2a990757826aa34e7375b50d49172da5ca7536dc461b1d197e0de9b"},
- {file = "rapidfuzz-3.14.2-cp311-cp311-manylinux_2_31_armv7l.whl", hash = "sha256:30af5e015462f89408d7b3bbdd614c739adc386e3d47bd565b53ffb670266021"},
- {file = "rapidfuzz-3.14.2-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:35f12b07d58b932ef95b5f66b40c9efc60c5201bccd3c5ddde4a87df19d0aba8"},
- {file = "rapidfuzz-3.14.2-cp311-cp311-musllinux_1_2_armv7l.whl", hash = "sha256:0aa67110e016d2cdce3e5a3330d09fb1dba3cf83350f6eb46a6b9276cbafd094"},
- {file = "rapidfuzz-3.14.2-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:b13dc4743a5d222600d98fb4a0345e910829ef4f286e81b34349627355884c87"},
- {file = "rapidfuzz-3.14.2-cp311-cp311-musllinux_1_2_s390x.whl", hash = "sha256:b16c40709f22c8fc16ca49a5484a468fe0a95f08f29c68043f46f8771e2c37e2"},
- {file = "rapidfuzz-3.14.2-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:ac2bd7c74523f952a66536f72b3f68260427e2a6954f1f03d758f01bbbf60564"},
- {file = "rapidfuzz-3.14.2-cp311-cp311-win32.whl", hash = "sha256:37d7045dc0ab4cab49d7cca66b651b44939e18e098a2f55466082e173b1aa452"},
- {file = "rapidfuzz-3.14.2-cp311-cp311-win_amd64.whl", hash = "sha256:9a55ff35536662028563f22e0eadab47c7e94c8798239fe25d3ceca5ab156fd8"},
- {file = "rapidfuzz-3.14.2-cp311-cp311-win_arm64.whl", hash = "sha256:b2f0e1310f7cb1c0c0033987d0a0e85b4fd51a1c4882f556f082687d519f045d"},
- {file = "rapidfuzz-3.14.2-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:0418f6ac1da7adf7e6e469876508f63168e80d3265a9e7ab9a2e999020577bfa"},
- {file = "rapidfuzz-3.14.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:f6028090b49015fc9ff0df3c06751078fe300a291e933a378a7c37b78c4d6a3e"},
- {file = "rapidfuzz-3.14.2-cp312-cp312-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:21aa299985d1bbdb3ccf8a8214e7daee72bb7e8c8fb25a520f015dc200a57816"},
- {file = "rapidfuzz-3.14.2-cp312-cp312-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:e247612909876f36e6132265deef34efcaaf490e1857022204b206ff76578076"},
- {file = "rapidfuzz-3.14.2-cp312-cp312-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:9cf077475cd4118a5b846a72749d54b520243be6baddba1dd1446f3b1dbab29c"},
- {file = "rapidfuzz-3.14.2-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a5e7e02fb51f9a78e32f4fb8b5546d543e1fb637409cb682a6b8cb12e0c3015c"},
- {file = "rapidfuzz-3.14.2-cp312-cp312-manylinux_2_31_armv7l.whl", hash = "sha256:b1febabf4a4a664a2b6025830d93d7703f1cd9dcbe656ed7159053091b4d9389"},
- {file = "rapidfuzz-3.14.2-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:766d133f11888c48497f26a1722afc697a5fbad05bbfec3a41a4bc04fd21af9d"},
- {file = "rapidfuzz-3.14.2-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:2a851a7c6660b6e47723378ca7692cd42700660a8783e4e7d07254a984d63ec8"},
- {file = "rapidfuzz-3.14.2-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:686594bd7f7132cb85900a4cc910e9acb9d39466412b8a275f3d4bc37faba23c"},
- {file = "rapidfuzz-3.14.2-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:e1d412122de3c5c492acfcde020f543b9b529e2eb115f875e2fd7470e44ab441"},
- {file = "rapidfuzz-3.14.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:2611b1f6464dddf900bffeee2aa29a9aa1039317cbb226e18d3a5f029d4cf303"},
- {file = "rapidfuzz-3.14.2-cp312-cp312-win32.whl", hash = "sha256:e6968b6db188fbb4c7a18aac25e075940a8204434a2a0d6bddb0a695d7f0c898"},
- {file = "rapidfuzz-3.14.2-cp312-cp312-win_amd64.whl", hash = "sha256:1a6d43683c04ffb4270bb1498951a39e9c200eb326f933fd5d608c19485049b8"},
- {file = "rapidfuzz-3.14.2-cp312-cp312-win_arm64.whl", hash = "sha256:4ecd3ab9aebb17becb462eac19151bd143abc614e3d2a0351a72171371ac3f4b"},
- {file = "rapidfuzz-3.14.2-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:f1f5a2566af7409d11f11b0b4e9f76a0ac64577737b821c64a2a6afc971c1c25"},
- {file = "rapidfuzz-3.14.2-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:810863f3a98d09392e5fb481aef9d82597df6ee06f7f11ceafe6077585c4e018"},
- {file = "rapidfuzz-3.14.2-cp313-cp313-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:2e8c0d16c0724dab7c7dc4099c1ec410679b2d11c1650b069d15d4ab4370f1cc"},
- {file = "rapidfuzz-3.14.2-cp313-cp313-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:004f04356d84660feffbf8c26975cb0db0e010b2225d6e21b3d84dd8df764652"},
- {file = "rapidfuzz-3.14.2-cp313-cp313-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:b3c2aea6b1db03a8abd62bb157161d7a65b896c9f85d5efc2f1bb444a107c47a"},
- {file = "rapidfuzz-3.14.2-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:8bef63704b7851ad1adf5d7ceb7f1b3136b78ee0b34240c14ab85ea775f6caa7"},
- {file = "rapidfuzz-3.14.2-cp313-cp313-manylinux_2_31_armv7l.whl", hash = "sha256:52e8e37566313ac60bfa80754c4c0367eec65b3ef52bb8cc409b88e878b03182"},
- {file = "rapidfuzz-3.14.2-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:b3fad0fb5ac44944ad8f81e729ec45f65a85efb7d7ea4cf67343799c0ea9874b"},
- {file = "rapidfuzz-3.14.2-cp313-cp313-musllinux_1_2_armv7l.whl", hash = "sha256:d027842a956b86aa9706b836c48186da405413d03957afaccda2fbe414bc3912"},
- {file = "rapidfuzz-3.14.2-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:27dcb45427b1966fb43c904d19c841c3e6da147931959cf05388ecef9c5a1e8d"},
- {file = "rapidfuzz-3.14.2-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:1aab0676884e91282817b5710933efc4ea9466d2ba5703b5a7541468695d807a"},
- {file = "rapidfuzz-3.14.2-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:ef36c21ecb7f4bad7e4e119fe746a787ad684eaf1c383c17a2aff5d75b20fa58"},
- {file = "rapidfuzz-3.14.2-cp313-cp313-win32.whl", hash = "sha256:ed3af4fa0dbd6d1964f171ac6fff82ed9e76c737eb34ae3daf926c4aefc2ce9b"},
- {file = "rapidfuzz-3.14.2-cp313-cp313-win_amd64.whl", hash = "sha256:3fc2e7c3ab006299366b1c8256e452f00eb1659d0e4790b140633627c7d947b7"},
- {file = "rapidfuzz-3.14.2-cp313-cp313-win_arm64.whl", hash = "sha256:def48d5010ddcd2a80b44f14bf0172c29bfc27906d13c0ea69a6e3c00e6f225c"},
- {file = "rapidfuzz-3.14.2-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:a39952b8e033758ee15b2de48a5b0689c83ea6bd93c8df3635f2fbf21e52fd25"},
- {file = "rapidfuzz-3.14.2-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:f786811555869b5961b3718b007179e87d73c47414afee5fb882ae1b9b174c0c"},
- {file = "rapidfuzz-3.14.2-cp313-cp313t-win32.whl", hash = "sha256:6c0a25490a99c4b73f1deca3efae004df5f2b254760d98cac8d93becf41260d4"},
- {file = "rapidfuzz-3.14.2-cp313-cp313t-win_amd64.whl", hash = "sha256:e5af2dab8ec5a180d9ff24fbb5b25e589848b93cccb755eceb0bf0e3cfed7e5c"},
- {file = "rapidfuzz-3.14.2-cp313-cp313t-win_arm64.whl", hash = "sha256:8cf2aefb0d246d540ea83b4648db690bd7e25d34a7c23c5f250dcba2e4989192"},
- {file = "rapidfuzz-3.14.2-cp314-cp314-macosx_10_15_x86_64.whl", hash = "sha256:ace3a6b108679888833cdceea9a6231e406db202b8336eaf68279fe71a1d2ac4"},
- {file = "rapidfuzz-3.14.2-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:32c7cc978447202ba592e197228767b230d85e52e5ef229e2b22e51c8e3d06ad"},
- {file = "rapidfuzz-3.14.2-cp314-cp314-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5a479a824cbf6a646bcec1c34fbbfb85393d03eb2811657e3a6536298d435f76"},
- {file = "rapidfuzz-3.14.2-cp314-cp314-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:3a3bc0c8b65dcd1e55a1cc42a7c7b34e93ad5d4bd1501dc998f4625042e1b110"},
- {file = "rapidfuzz-3.14.2-cp314-cp314-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:217b46bf096818df16c0e2c43202aa8352e67c4379b1d5f25e98c5d1c7f5414d"},
- {file = "rapidfuzz-3.14.2-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:07d3e8afeeb81044873644e505e56ba06d8bdcc291ef7e26ac0f54c58309267d"},
- {file = "rapidfuzz-3.14.2-cp314-cp314-manylinux_2_31_armv7l.whl", hash = "sha256:b7832c8707bfa4f9b081def64aa49954d4813cff7fc9ff4a0b184a4e8697147f"},
- {file = "rapidfuzz-3.14.2-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:35581ba6981e016333063c52719c0b0b1bef0f944e641ad0f4ea34e0b39161f3"},
- {file = "rapidfuzz-3.14.2-cp314-cp314-musllinux_1_2_armv7l.whl", hash = "sha256:fbd5152169dc3f6c894c24fc04813f50bf9b929d137f2b965ac926e03329ceba"},
- {file = "rapidfuzz-3.14.2-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:98a119c3f9b152e9b62ec43520392669bd8deae9df269f30569f1c87bf6055a4"},
- {file = "rapidfuzz-3.14.2-cp314-cp314-musllinux_1_2_s390x.whl", hash = "sha256:9e84164e7a68f9c3523c5d104dda6601202b39bae0aac1b73a4f119d387275c4"},
- {file = "rapidfuzz-3.14.2-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:64c67402b86a073666f92c2807811e3817a17fedfe505fe89a9f93eea264481c"},
- {file = "rapidfuzz-3.14.2-cp314-cp314-win32.whl", hash = "sha256:58d79f4df3e4332b31e671f9487f0c215856cf1f2d9ac3848ac10c27262fd723"},
- {file = "rapidfuzz-3.14.2-cp314-cp314-win_amd64.whl", hash = "sha256:dc6fe7a27ad9e233c155e89b7e1d9b6d13963e3261ea5b30f3e79c3556c49bc9"},
- {file = "rapidfuzz-3.14.2-cp314-cp314-win_arm64.whl", hash = "sha256:bb4e96d80de7e6364850a2e168e899b8e85ab80ce19827cc4fbe0aa3c57f8124"},
- {file = "rapidfuzz-3.14.2-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:c7d4d0927a6b1ef2529a8cc57adf2ce965f7aaef324a4d1ae826d0de43ab4f82"},
- {file = "rapidfuzz-3.14.2-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:c0fae06e7fb4be18e86eb51e77f0d441975a3ba9ef963f957d750a2a41536ba1"},
- {file = "rapidfuzz-3.14.2-cp314-cp314t-win32.whl", hash = "sha256:d1d3ef72665d460b7b3e61d3dff4341a195dcb3250b4471eef71db23fca2d91a"},
- {file = "rapidfuzz-3.14.2-cp314-cp314t-win_amd64.whl", hash = "sha256:3a0960c5c11a34e8129a3062f1b1cbb371fad364e2195ebe46a88a9d5eeec0f1"},
- {file = "rapidfuzz-3.14.2-cp314-cp314t-win_arm64.whl", hash = "sha256:ed29600e55d7df104d5778d499678c305e32e3ccfa873489a7c8304489c5f8f3"},
- {file = "rapidfuzz-3.14.2-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:172630396d8bdbb5ea1a58e82afc489c8e18076e1f2b2edea20cb30f8926325a"},
- {file = "rapidfuzz-3.14.2-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:6cff0d6749fac8dd7fdf26d0604d8a47c5ee786061972077d71ec7ac0fb7ced2"},
- {file = "rapidfuzz-3.14.2-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:f558bc2ee3a0bb5d7238ed10a0b76455f2d28c97e93564a1f7855cea4096ef1c"},
- {file = "rapidfuzz-3.14.2.tar.gz", hash = "sha256:69bf91e66aeb84a104aea35e1b3f6b3aa606faaee6db1cfc76950f2a6a828a12"},
+ {file = "rapidfuzz-3.14.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:489440e4b5eea0d150a31076eb183bed0ec84f934df206c72ae4fc3424501758"},
+ {file = "rapidfuzz-3.14.1-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:eff22cc938c3f74d194df03790a6c3325d213b28cf65cdefd6fdeae759b745d5"},
+ {file = "rapidfuzz-3.14.1-cp310-cp310-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:e0307f018b16feaa36074bcec2496f6f120af151a098910296e72e233232a62f"},
+ {file = "rapidfuzz-3.14.1-cp310-cp310-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:bc133652da143aca1ab72de235446432888b2b7f44ee332d006f8207967ecb8a"},
+ {file = "rapidfuzz-3.14.1-cp310-cp310-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:e9e71b3fe7e4a1590843389a90fe2a8684649fc74b9b7446e17ee504ddddb7de"},
+ {file = "rapidfuzz-3.14.1-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:6c51519eb2f20b52eba6fc7d857ae94acc6c2a1f5d0f2d794b9d4977cdc29dd7"},
+ {file = "rapidfuzz-3.14.1-cp310-cp310-manylinux_2_31_armv7l.whl", hash = "sha256:fe87d94602624f8f25fff9a0a7b47f33756c4d9fc32b6d3308bb142aa483b8a4"},
+ {file = "rapidfuzz-3.14.1-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:2d665380503a575dda52eb712ea521f789e8f8fd629c7a8e6c0f8ff480febc78"},
+ {file = "rapidfuzz-3.14.1-cp310-cp310-musllinux_1_2_armv7l.whl", hash = "sha256:c0f0dd022b8a7cbf3c891f6de96a80ab6a426f1069a085327816cea749e096c2"},
+ {file = "rapidfuzz-3.14.1-cp310-cp310-musllinux_1_2_ppc64le.whl", hash = "sha256:bf1ba22d36858b265c95cd774ba7fe8991e80a99cd86fe4f388605b01aee81a3"},
+ {file = "rapidfuzz-3.14.1-cp310-cp310-musllinux_1_2_s390x.whl", hash = "sha256:ca1c1494ac9f9386d37f0e50cbaf4d07d184903aed7691549df1b37e9616edc9"},
+ {file = "rapidfuzz-3.14.1-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:9e4b12e921b0fa90d7c2248742a536f21eae5562174090b83edd0b4ab8b557d7"},
+ {file = "rapidfuzz-3.14.1-cp310-cp310-win32.whl", hash = "sha256:5e1c1f2292baa4049535b07e9e81feb29e3650d2ba35ee491e64aca7ae4cb15e"},
+ {file = "rapidfuzz-3.14.1-cp310-cp310-win_amd64.whl", hash = "sha256:59a8694beb9a13c4090ab3d1712cabbd896c6949706d1364e2a2e1713c413760"},
+ {file = "rapidfuzz-3.14.1-cp310-cp310-win_arm64.whl", hash = "sha256:e94cee93faa792572c574a615abe12912124b4ffcf55876b72312914ab663345"},
+ {file = "rapidfuzz-3.14.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:4d976701060886a791c8a9260b1d4139d14c1f1e9a6ab6116b45a1acf3baff67"},
+ {file = "rapidfuzz-3.14.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:5e6ba7e6eb2ab03870dcab441d707513db0b4264c12fba7b703e90e8b4296df2"},
+ {file = "rapidfuzz-3.14.1-cp311-cp311-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:1e532bf46de5fd3a1efde73a16a4d231d011bce401c72abe3c6ecf9de681003f"},
+ {file = "rapidfuzz-3.14.1-cp311-cp311-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:f9b6a6fb8ed9b951e5f3b82c1ce6b1665308ec1a0da87f799b16e24fc59e4662"},
+ {file = "rapidfuzz-3.14.1-cp311-cp311-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:5b6ac3f9810949caef0e63380b11a3c32a92f26bacb9ced5e32c33560fcdf8d1"},
+ {file = "rapidfuzz-3.14.1-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e52e4c34fd567f77513e886b66029c1ae02f094380d10eba18ba1c68a46d8b90"},
+ {file = "rapidfuzz-3.14.1-cp311-cp311-manylinux_2_31_armv7l.whl", hash = "sha256:2ef72e41b1a110149f25b14637f1cedea6df192462120bea3433980fe9d8ac05"},
+ {file = "rapidfuzz-3.14.1-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:fb654a35b373d712a6b0aa2a496b2b5cdd9d32410cfbaecc402d7424a90ba72a"},
+ {file = "rapidfuzz-3.14.1-cp311-cp311-musllinux_1_2_armv7l.whl", hash = "sha256:2b2c12e5b9eb8fe9a51b92fe69e9ca362c0970e960268188a6d295e1dec91e6d"},
+ {file = "rapidfuzz-3.14.1-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:4f069dec5c450bd987481e752f0a9979e8fdf8e21e5307f5058f5c4bb162fa56"},
+ {file = "rapidfuzz-3.14.1-cp311-cp311-musllinux_1_2_s390x.whl", hash = "sha256:4d0d9163725b7ad37a8c46988cae9ebab255984db95ad01bf1987ceb9e3058dd"},
+ {file = "rapidfuzz-3.14.1-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:db656884b20b213d846f6bc990c053d1f4a60e6d4357f7211775b02092784ca1"},
+ {file = "rapidfuzz-3.14.1-cp311-cp311-win32.whl", hash = "sha256:4b42f7b9c58cbcfbfaddc5a6278b4ca3b6cd8983e7fd6af70ca791dff7105fb9"},
+ {file = "rapidfuzz-3.14.1-cp311-cp311-win_amd64.whl", hash = "sha256:e5847f30d7d4edefe0cb37294d956d3495dd127c1c56e9128af3c2258a520bb4"},
+ {file = "rapidfuzz-3.14.1-cp311-cp311-win_arm64.whl", hash = "sha256:5087d8ad453092d80c042a08919b1cb20c8ad6047d772dc9312acd834da00f75"},
+ {file = "rapidfuzz-3.14.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:809515194f628004aac1b1b280c3734c5ea0ccbd45938c9c9656a23ae8b8f553"},
+ {file = "rapidfuzz-3.14.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:0afcf2d6cb633d0d4260d8df6a40de2d9c93e9546e2c6b317ab03f89aa120ad7"},
+ {file = "rapidfuzz-3.14.1-cp312-cp312-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5c1c3d07d53dcafee10599da8988d2b1f39df236aee501ecbd617bd883454fcd"},
+ {file = "rapidfuzz-3.14.1-cp312-cp312-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:6e9ee3e1eb0a027717ee72fe34dc9ac5b3e58119f1bd8dd15bc19ed54ae3e62b"},
+ {file = "rapidfuzz-3.14.1-cp312-cp312-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:70c845b64a033a20c44ed26bc890eeb851215148cc3e696499f5f65529afb6cb"},
+ {file = "rapidfuzz-3.14.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:26db0e815213d04234298dea0d884d92b9cb8d4ba954cab7cf67a35853128a33"},
+ {file = "rapidfuzz-3.14.1-cp312-cp312-manylinux_2_31_armv7l.whl", hash = "sha256:6ad3395a416f8b126ff11c788531f157c7debeb626f9d897c153ff8980da10fb"},
+ {file = "rapidfuzz-3.14.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:61c5b9ab6f730e6478aa2def566223712d121c6f69a94c7cc002044799442afd"},
+ {file = "rapidfuzz-3.14.1-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:13e0ea3d0c533969158727d1bb7a08c2cc9a816ab83f8f0dcfde7e38938ce3e6"},
+ {file = "rapidfuzz-3.14.1-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:6325ca435b99f4001aac919ab8922ac464999b100173317defb83eae34e82139"},
+ {file = "rapidfuzz-3.14.1-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:07a9fad3247e68798424bdc116c1094e88ecfabc17b29edf42a777520347648e"},
+ {file = "rapidfuzz-3.14.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:f8ff5dbe78db0a10c1f916368e21d328935896240f71f721e073cf6c4c8cdedd"},
+ {file = "rapidfuzz-3.14.1-cp312-cp312-win32.whl", hash = "sha256:9c83270e44a6ae7a39fc1d7e72a27486bccc1fa5f34e01572b1b90b019e6b566"},
+ {file = "rapidfuzz-3.14.1-cp312-cp312-win_amd64.whl", hash = "sha256:e06664c7fdb51c708e082df08a6888fce4c5c416d7e3cc2fa66dd80eb76a149d"},
+ {file = "rapidfuzz-3.14.1-cp312-cp312-win_arm64.whl", hash = "sha256:6c7c26025f7934a169a23dafea6807cfc3fb556f1dd49229faf2171e5d8101cc"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:8d69f470d63ee824132ecd80b1974e1d15dd9df5193916901d7860cef081a260"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:6f571d20152fc4833b7b5e781b36d5e4f31f3b5a596a3d53cf66a1bd4436b4f4"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:61d77e09b2b6bc38228f53b9ea7972a00722a14a6048be9a3672fb5cb08bad3a"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:8b41d95ef86a6295d353dc3bb6c80550665ba2c3bef3a9feab46074d12a9af8f"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:0591df2e856ad583644b40a2b99fb522f93543c65e64b771241dda6d1cfdc96b"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:f277801f55b2f3923ef2de51ab94689a0671a4524bf7b611de979f308a54cd6f"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313-manylinux_2_31_armv7l.whl", hash = "sha256:893fdfd4f66ebb67f33da89eb1bd1674b7b30442fdee84db87f6cb9074bf0ce9"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:fe2651258c1f1afa9b66f44bf82f639d5f83034f9804877a1bbbae2120539ad1"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313-musllinux_1_2_armv7l.whl", hash = "sha256:ace21f7a78519d8e889b1240489cd021c5355c496cb151b479b741a4c27f0a25"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:cb5acf24590bc5e57027283b015950d713f9e4d155fda5cfa71adef3b3a84502"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:67ea46fa8cc78174bad09d66b9a4b98d3068e85de677e3c71ed931a1de28171f"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:44e741d785de57d1a7bae03599c1cbc7335d0b060a35e60c44c382566e22782e"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313-win32.whl", hash = "sha256:b1fe6001baa9fa36bcb565e24e88830718f6c90896b91ceffcb48881e3adddbc"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313-win_amd64.whl", hash = "sha256:83b8cc6336709fa5db0579189bfd125df280a554af544b2dc1c7da9cdad7e44d"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313-win_arm64.whl", hash = "sha256:cf75769662eadf5f9bd24e865c19e5ca7718e879273dce4e7b3b5824c4da0eb4"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:d937dbeda71c921ef6537c6d41a84f1b8112f107589c9977059de57a1d726dd6"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:7a2d80cc1a4fcc7e259ed4f505e70b36433a63fa251f1bb69ff279fe376c5efd"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313t-win32.whl", hash = "sha256:40875e0c06f1a388f1cab3885744f847b557e0b1642dfc31ff02039f9f0823ef"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313t-win_amd64.whl", hash = "sha256:876dc0c15552f3d704d7fb8d61bdffc872ff63bedf683568d6faad32e51bbce8"},
+ {file = "rapidfuzz-3.14.1-cp313-cp313t-win_arm64.whl", hash = "sha256:61458e83b0b3e2abc3391d0953c47d6325e506ba44d6a25c869c4401b3bc222c"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:e84d9a844dc2e4d5c4cabd14c096374ead006583304333c14a6fbde51f612a44"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:40301b93b99350edcd02dbb22e37ca5f2a75d0db822e9b3c522da451a93d6f27"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:fedd5097a44808dddf341466866e5c57a18a19a336565b4ff50aa8f09eb528f6"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:2e3e61c9e80d8c26709d8aa5c51fdd25139c81a4ab463895f8a567f8347b0548"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:da011a373722fac6e64687297a1d17dc8461b82cb12c437845d5a5b161bc24b9"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:5967d571243cfb9ad3710e6e628ab68c421a237b76e24a67ac22ee0ff12784d6"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314-manylinux_2_31_armv7l.whl", hash = "sha256:474f416cbb9099676de54aa41944c154ba8d25033ee460f87bb23e54af6d01c9"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:ae2d57464b59297f727c4e201ea99ec7b13935f1f056c753e8103da3f2fc2404"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314-musllinux_1_2_armv7l.whl", hash = "sha256:57047493a1f62f11354c7143c380b02f1b355c52733e6b03adb1cb0fe8fb8816"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:4acc20776f225ee37d69517a237c090b9fa7e0836a0b8bc58868e9168ba6ef6f"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314-musllinux_1_2_s390x.whl", hash = "sha256:4373f914ff524ee0146919dea96a40a8200ab157e5a15e777a74a769f73d8a4a"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:37017b84953927807847016620d61251fe236bd4bcb25e27b6133d955bb9cafb"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314-win32.whl", hash = "sha256:c8d1dd1146539e093b84d0805e8951475644af794ace81d957ca612e3eb31598"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314-win_amd64.whl", hash = "sha256:f51c7571295ea97387bac4f048d73cecce51222be78ed808263b45c79c40a440"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314-win_arm64.whl", hash = "sha256:01eab10ec90912d7d28b3f08f6c91adbaf93458a53f849ff70776ecd70dd7a7a"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:60879fcae2f7618403c4c746a9a3eec89327d73148fb6e89a933b78442ff0669"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:f94d61e44db3fc95a74006a394257af90fa6e826c900a501d749979ff495d702"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314t-win32.whl", hash = "sha256:93b6294a3ffab32a9b5f9b5ca048fa0474998e7e8bb0f2d2b5e819c64cb71ec7"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314t-win_amd64.whl", hash = "sha256:6cb56b695421538fdbe2c0c85888b991d833b8637d2f2b41faa79cea7234c000"},
+ {file = "rapidfuzz-3.14.1-cp314-cp314t-win_arm64.whl", hash = "sha256:7cd312c380d3ce9d35c3ec9726b75eee9da50e8a38e89e229a03db2262d3d96b"},
+ {file = "rapidfuzz-3.14.1-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:673ce55a9be5b772dade911909e42382c0828b8a50ed7f9168763fa6b9f7054d"},
+ {file = "rapidfuzz-3.14.1-pp310-pypy310_pp73-macosx_11_0_arm64.whl", hash = "sha256:45c62ada1980ebf4c64c4253993cc8daa018c63163f91db63bb3af69cb74c2e3"},
+ {file = "rapidfuzz-3.14.1-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:4d51efb29c0df0d4f7f64f672a7624c2146527f0745e3572098d753676538800"},
+ {file = "rapidfuzz-3.14.1-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:4a21ccdf1bd7d57a1009030527ba8fae1c74bf832d0a08f6b67de8f5c506c96f"},
+ {file = "rapidfuzz-3.14.1-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:589fb0af91d3aff318750539c832ea1100dbac2c842fde24e42261df443845f6"},
+ {file = "rapidfuzz-3.14.1-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:a4f18092db4825f2517d135445015b40033ed809a41754918a03ef062abe88a0"},
+ {file = "rapidfuzz-3.14.1.tar.gz", hash = "sha256:b02850e7f7152bd1edff27e9d584505b84968cacedee7a734ec4050c655a803c"},
]
[package.extras]
@@ -4981,4 +4968,4 @@ type = ["pytest-mypy"]
[metadata]
lock-version = "2.1"
python-versions = "^3.11"
-content-hash = "41981e274e958a70f5034827967fc28998561ae42040776fa79113456f26c156"
+content-hash = "363083a370e795ce42acda8ee8f07d55f59c8f36f8cd5e21c623a780d425cda0"
diff --git a/api/pyproject.toml b/api/pyproject.toml
index b5d1f69d63..b978989d2a 100644
--- a/api/pyproject.toml
+++ b/api/pyproject.toml
@@ -1,6 +1,6 @@
[project]
name = "api"
-version = "0.60.2"
+version = "0.60.0"
description = "Agenta API"
authors = [
{ name = "Mahmoud Mabrouk", email = "mahmoud@agenta.ai" },
@@ -46,7 +46,6 @@ watchdog = { extras = ["watchmedo"], version = "^3.0.0" }
sqlalchemy-json = "^0.7.0"
python-multipart = "^0.0.20"
gunicorn = "^23.0.0"
-python-jsonpath = "^2.0.0"
# opentelemetry-api = "^1.36.0"
# opentelemetry-sdk = "^1.36.0"
diff --git a/docs/blog/entries/documentation-architecture-overhaul.mdx b/docs/blog/entries/documentation-architecture-overhaul.mdx
deleted file mode 100644
index a19bf29d67..0000000000
--- a/docs/blog/entries/documentation-architecture-overhaul.mdx
+++ /dev/null
@@ -1,40 +0,0 @@
----
-title: "Documentation Architecture Overhaul"
-slug: documentation-architecture-overhaul
-date: 2025-11-03
-tags: [v0.59.10]
----
-
-import Image from "@theme/IdealImage";
-
-We've completely rewritten and restructured our documentation with a new architecture. This is one of the largest updates we've made to the documentation, involving a near-complete rewrite of existing content and adding substantial new material.
-
-### Diataxis Framework Implementation
-
-We've reorganized all documentation using the [Diataxis framework](https://diataxis.fr/).
-
-### Expanded Observability Documentation
-
-One of the biggest gaps in our previous documentation was observability. We've added comprehensive documentation covering:
-
-- [Tracing with OpenTelemetry](/observability/trace-with-opentelemetry/getting-started)
-- [Tracing LLM applications with JS/TS](/observability/quick-start-opentelemetry)
-- [Using the Metrics API to fetch metrics](/observability/query-data/analytics-data)
-- [Using the Query API to fetch traces](/observability/query-data/query-api)
-
-### JavaScript/TypeScript Support
-
-Documentation now includes JavaScript and TypeScript examples alongside Python wherever applicable. This makes it easier for JavaScript developers to integrate Agenta into their applications.
-
-### Ask AI Feature
-
-We've added a new "Ask AI" feature that lets you ask questions directly to the documentation. Get instant answers to your questions without searching through pages.
-
-
-
-
----
diff --git a/docs/blog/main.mdx b/docs/blog/main.mdx
index e55eed8a9c..208ffed0bf 100644
--- a/docs/blog/main.mdx
+++ b/docs/blog/main.mdx
@@ -10,24 +10,6 @@ import Image from "@theme/IdealImage";
-### [Documentation Overhaul](/changelog/documentation-architecture-overhaul)
-
-_3 November 2025_
-
-**v0.59.10**
-
-
-
-We've completely rewritten and restructured our documentation with a new architecture. This is one of the largest updates we've made, involving a near-complete rewrite of existing content.
-
-Key improvements include:
-- **[Diataxis Framework](https://diataxis.fr/)**: Organized content into Tutorials, How-to Guides, Reference, and Explanation sections for better discoverability
-- **[Expanded Observability Docs](/observability/overview)**: Added missing documentation for tracing, annotations, and observability features
-- **[JavaScript/TypeScript Support](/observability/quick-start-opentelemetry)**: Added code examples and documentation for JavaScript developers alongside Python
-- **Ask AI Feature**: Ask questions directly to the documentation for instant answers
-
----
-
### [Vertex AI Provider Support](/changelog/vertex-ai-provider-support)
_24 October 2025_
diff --git a/docs/docs/getting-started/01-introduction.mdx b/docs/docs/getting-started/01-introduction.mdx
index d7e9e591c8..7c51b57d8b 100644
--- a/docs/docs/getting-started/01-introduction.mdx
+++ b/docs/docs/getting-started/01-introduction.mdx
@@ -23,30 +23,20 @@ Agenta covers the entire LLM development lifecycle: **prompt management**, **eva
### Prompt Engineering and Management
-Teams often struggle with prompt collaboration. They keep prompts in code where subject matter experts cannot edit them. Or they use spreadsheets in an unreliable process.
+Agenta enables product teams to experiment with prompts, push them to production, run evaluations, and annotate their results.
-Agenta organizes prompts for your team. Subject matter experts can collaborate with developers without touching the codebase. Developers can version prompts and deploy them to production.
-
-The playground lets teams experiment with prompts. You can load traces and test sets. You can test prompts side by side.
### Evaluation
-Most teams lack a systematic evaluation process. They make random prompt changes based on vibes. Some changes improve quality but break other cases because LLMs are stochastic.
-
-Agenta provides one place to evaluate systematically. Teams can run three types of evaluation:
+Agenta enables product teams to experiment with prompts, push them to production, run evaluations, and annotate their results.
-- Automatic evaluation with LLMs at scale before production
-- Human annotation where subject matter experts review results and provide feedback to AI engineers
-- Online evaluation for applications already in production
-Both subject matter experts and engineers can run evaluations from the UI.
### Observability
-Agenta helps you understand what happens in production. You can capture user feedback through an API (thumbs up or implicit signals). You can debug agents and applications with tracing to see what happens inside them.
+Agenta enables product teams to experiment with prompts, push them to production, run evaluations, and annotate their results.
-Track costs over time. Find edge cases where things fail. Add those cases to your test sets. Have subject matter experts annotate the results.
## Why Agenta?
diff --git a/docs/docs/self-host/02-configuration.mdx b/docs/docs/self-host/02-configuration.mdx
index 5879d32f47..adde1c0333 100644
--- a/docs/docs/self-host/02-configuration.mdx
+++ b/docs/docs/self-host/02-configuration.mdx
@@ -37,8 +37,6 @@ Configuration for Docker and database connections:
| Variable | Description | Default |
|----------|-------------|---------|
-| `AGENTA_WEB_IMAGE_TAG` | Docker image tag for the web frontend | `latest` |
-| `AGENTA_API_IMAGE_TAG` | Docker image tag for the API backend | `latest` |
| `DOCKER_NETWORK_MODE` | Docker networking mode | `_(empty)_` (which falls back to `bridge`) |
| `POSTGRES_PASSWORD` | PostgreSQL database password | `password` |
| `POSTGRES_USERNAME` | PostgreSQL database username | `username` |
diff --git a/docs/docs/self-host/99-faq.mdx b/docs/docs/self-host/99-faq.mdx
deleted file mode 100644
index 42a8a52075..0000000000
--- a/docs/docs/self-host/99-faq.mdx
+++ /dev/null
@@ -1,16 +0,0 @@
----
-title: Frequently Asked Questions
-sidebar_label: FAQ
-description: Self-hosting Agenta FAQ. Learn how to lock Agenta to a specific version, configure Docker images, and troubleshoot common deployment issues.
----
-
-## How do I lock Agenta to a specific version?
-
-Use the `AGENTA_WEB_IMAGE_TAG` and `AGENTA_API_IMAGE_TAG` environment variables.
-
-```bash
-AGENTA_WEB_IMAGE_TAG=v0.15.0
-AGENTA_API_IMAGE_TAG=v0.15.0
-```
-
-These are set to `latest` by default.
diff --git a/docs/docusaurus.config.ts b/docs/docusaurus.config.ts
index 4d89dcdcfd..f33bea025d 100644
--- a/docs/docusaurus.config.ts
+++ b/docs/docusaurus.config.ts
@@ -84,8 +84,8 @@ const config: Config = {
navbar: {
logo: {
alt: "agenta-ai",
- src: "images/Agenta-logo-full-light.png",
- srcDark: "images/Agenta-logo-full-dark-accent.png",
+ src: "images/light-complete-transparent-CROPPED.png",
+ srcDark: "images/dark-complete-transparent-CROPPED.png",
},
hideOnScroll: false,
items: [
diff --git a/docs/static/images/Agenta-logo-full-dark-accent.png b/docs/static/images/Agenta-logo-full-dark-accent.png
deleted file mode 100644
index a270afc094..0000000000
Binary files a/docs/static/images/Agenta-logo-full-dark-accent.png and /dev/null differ
diff --git a/docs/static/images/Agenta-logo-full-light.png b/docs/static/images/Agenta-logo-full-light.png
deleted file mode 100644
index bddc2359bd..0000000000
Binary files a/docs/static/images/Agenta-logo-full-light.png and /dev/null differ
diff --git a/docs/static/images/changelog/agenta_askai.png b/docs/static/images/changelog/agenta_askai.png
deleted file mode 100644
index eee2ba8ff0..0000000000
Binary files a/docs/static/images/changelog/agenta_askai.png and /dev/null differ
diff --git a/docs/static/images/dark-complete-transparent-CROPPED.png b/docs/static/images/dark-complete-transparent-CROPPED.png
new file mode 100644
index 0000000000..bc73ad84e2
Binary files /dev/null and b/docs/static/images/dark-complete-transparent-CROPPED.png differ
diff --git a/docs/static/images/dark-logo.svg b/docs/static/images/dark-logo.svg
new file mode 100644
index 0000000000..6cb8ef3330
--- /dev/null
+++ b/docs/static/images/dark-logo.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/docs/static/images/favicon.ico b/docs/static/images/favicon.ico
index dad02fe072..4dc8619b1d 100644
Binary files a/docs/static/images/favicon.ico and b/docs/static/images/favicon.ico differ
diff --git a/docs/static/images/light-complete-transparent-CROPPED.png b/docs/static/images/light-complete-transparent-CROPPED.png
new file mode 100644
index 0000000000..de9bbd9aca
Binary files /dev/null and b/docs/static/images/light-complete-transparent-CROPPED.png differ
diff --git a/docs/static/images/light-logo.svg b/docs/static/images/light-logo.svg
new file mode 100644
index 0000000000..9c795f8e88
--- /dev/null
+++ b/docs/static/images/light-logo.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/docs/static/images/social-card.png b/docs/static/images/social-card.png
index 49fe2b893e..d62f2f99b9 100644
Binary files a/docs/static/images/social-card.png and b/docs/static/images/social-card.png differ
diff --git a/sdk/README.md b/sdk/README.md
index d07ac59e52..df44027a55 100644
--- a/sdk/README.md
+++ b/sdk/README.md
@@ -2,12 +2,11 @@
-
-
+
+
-
The Open-source LLMOps Platform
Build reliable LLM applications faster with integrated prompt management, evaluation, and observability.
@@ -182,7 +181,7 @@ We welcome contributions of all kinds — from filing issues and sharing ideas t
## Contributors ✨
-[](#contributors-)
+[](#contributors-)
Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):
diff --git a/sdk/agenta/sdk/agenta_init.py b/sdk/agenta/sdk/agenta_init.py
index f6ffa7c73b..9740b2283b 100644
--- a/sdk/agenta/sdk/agenta_init.py
+++ b/sdk/agenta/sdk/agenta_init.py
@@ -68,7 +68,7 @@ def init(
"""
- log.info("Agenta - SDK ver: %s", version("agenta"))
+ log.info("Agenta - SDK version: %s", version("agenta"))
config = {}
if config_fname:
@@ -94,7 +94,7 @@ def init(
log.error(f"Failed to parse host URL '{_host}': {e}")
raise
- log.info("Agenta - API URL: %s/api", self.host)
+ log.info("Agenta - Host: %s", self.host)
self.api_key = api_key or getenv("AGENTA_API_KEY") or config.get("api_key")
diff --git a/sdk/agenta/sdk/contexts/tracing.py b/sdk/agenta/sdk/contexts/tracing.py
index 8425350ceb..ab5718fb42 100644
--- a/sdk/agenta/sdk/contexts/tracing.py
+++ b/sdk/agenta/sdk/contexts/tracing.py
@@ -11,7 +11,7 @@ class TracingContext(BaseModel):
#
credentials: Optional[str] = None
#
- script: Optional[dict] = None
+ script: Optional[str] = None
parameters: Optional[dict] = None
#
flags: Optional[dict] = None
diff --git a/sdk/agenta/sdk/decorators/running.py b/sdk/agenta/sdk/decorators/running.py
index 9c13423454..9eb77d43f8 100644
--- a/sdk/agenta/sdk/decorators/running.py
+++ b/sdk/agenta/sdk/decorators/running.py
@@ -133,7 +133,7 @@ def __init__(
]
] = None,
# -------------------------------------------------------------------- #
- script: Optional[dict] = None,
+ script: Optional[str] = None,
parameters: Optional[dict] = None,
#
configuration: Optional[
diff --git a/sdk/agenta/sdk/middleware/vault.py b/sdk/agenta/sdk/middleware/vault.py
index 52c02fa186..0b3056a586 100644
--- a/sdk/agenta/sdk/middleware/vault.py
+++ b/sdk/agenta/sdk/middleware/vault.py
@@ -19,11 +19,22 @@
import agenta as ag
+AGENTA_RUNTIME_PREFIX = getenv("AGENTA_RUNTIME_PREFIX", "")
+
+_ALWAYS_ALLOW_LIST = [
+ f"{AGENTA_RUNTIME_PREFIX}/health",
+ f"{AGENTA_RUNTIME_PREFIX}/openapi.json",
+]
+
_PROVIDER_KINDS = []
for provider_kind in StandardProviderKind.__args__[0].__args__: # type: ignore
_PROVIDER_KINDS.append(provider_kind)
+_AUTH_ENABLED = (
+ getenv("AGENTA_SERVICE_MIDDLEWARE_AUTH_ENABLED", "true").lower() in TRUTHY
+)
+
_CACHE_ENABLED = (
getenv("AGENTA_SERVICE_MIDDLEWARE_CACHE_ENABLED", "true").lower() in TRUTHY
)
@@ -31,12 +42,27 @@
_cache = TTLLRUCache()
+class DenyException(Exception):
+ def __init__(
+ self,
+ status_code: int = 403,
+ content: str = "Forbidden",
+ ) -> None:
+ super().__init__()
+
+ self.status_code = status_code
+ self.content = content
+
+
class VaultMiddleware(BaseHTTPMiddleware):
def __init__(self, app: FastAPI):
super().__init__(app)
self.host = ag.DEFAULT_AGENTA_SINGLETON_INSTANCE.host
+ self.scope_type = ag.DEFAULT_AGENTA_SINGLETON_INSTANCE.scope_type
+ self.scope_id = ag.DEFAULT_AGENTA_SINGLETON_INSTANCE.scope_id
+
async def dispatch(
self,
request: Request,
@@ -74,8 +100,12 @@ async def _get_secrets(self, request: Request) -> Optional[Dict]:
return secrets
local_secrets: List[Dict[str, Any]] = []
+ allow_secrets = True
try:
+ if request.url.path not in _ALWAYS_ALLOW_LIST:
+ await self._allow_local_secrets(credentials)
+
for provider_kind in _PROVIDER_KINDS:
provider = provider_kind
key_name = f"{provider.upper()}_API_KEY"
@@ -93,6 +123,9 @@ async def _get_secrets(self, request: Request) -> Optional[Dict]:
)
local_secrets.append(secret.model_dump())
+ except DenyException as e:
+ print(e.status_code, e.content)
+ allow_secrets = False
except: # pylint: disable=bare-except
display_exception("Vault: Local Secrets Exception")
@@ -133,6 +166,155 @@ async def _get_secrets(self, request: Request) -> Optional[Dict]:
secrets = standard_secrets + custom_secrets
- _cache.put(_hash, {"secrets": secrets})
+ if not allow_secrets:
+ _cache.put(_hash, {"secrets": secrets})
return secrets
+
+ async def _allow_local_secrets(self, credentials):
+ try:
+ if not _AUTH_ENABLED:
+ return
+
+ if not credentials:
+ raise DenyException(
+ status_code=401,
+ content="Invalid credentials. Please check your credentials or login again.",
+ )
+
+ # HEADERS
+ headers = {"Authorization": credentials}
+ # PARAMS
+ params = {}
+ ## SCOPE
+ if self.scope_type and self.scope_id:
+ params["scope_type"] = self.scope_type
+ params["scope_id"] = self.scope_id
+ ## ACTION
+ params["action"] = "view_secret"
+ ## RESOURCE
+ params["resource_type"] = "local_secrets"
+
+ _hash = dumps(
+ {
+ "headers": headers,
+ "params": params,
+ },
+ sort_keys=True,
+ )
+
+ access = None
+
+ if _CACHE_ENABLED:
+ access = _cache.get(_hash)
+
+ if isinstance(access, Exception):
+ raise access
+
+ try:
+ async with httpx.AsyncClient() as client:
+ try:
+ response = await client.get(
+ f"{self.host}/api/permissions/verify",
+ headers=headers,
+ params=params,
+ timeout=30.0,
+ )
+ except httpx.TimeoutException as exc:
+ # log.debug(f"Timeout error while verifying secrets access: {exc}")
+ raise DenyException(
+ status_code=504,
+ content=f"Could not verify secrets access: connection to {self.host} timed out. Please check your network connection.",
+ ) from exc
+ except httpx.ConnectError as exc:
+ # log.debug(f"Connection error while verifying secrets access: {exc}")
+ raise DenyException(
+ status_code=503,
+ content=f"Could not verify secrets access: connection to {self.host} failed. Please check if agenta is available.",
+ ) from exc
+ except httpx.NetworkError as exc:
+ # log.debug(f"Network error while verifying secrets access: {exc}")
+ raise DenyException(
+ status_code=503,
+ content=f"Could not verify secrets access: connection to {self.host} failed. Please check your network connection.",
+ ) from exc
+ except httpx.HTTPError as exc:
+ # log.debug(f"HTTP error while verifying secrets access: {exc}")
+ raise DenyException(
+ status_code=502,
+ content=f"Could not verify secrets access: connection to {self.host} failed. Please check if agenta is available.",
+ ) from exc
+
+ if response.status_code == 401:
+ # log.debug("Agenta returned 401 - Invalid credentials")
+ raise DenyException(
+ status_code=401,
+ content="Invalid credentials. Please check your credentials or login again.",
+ )
+ elif response.status_code == 403:
+ # log.debug("Agenta returned 403 - Permission denied")
+ raise DenyException(
+ status_code=403,
+ content="Out of credits. Please set your LLM provider API keys or contact support.",
+ )
+ elif response.status_code != 200:
+ # log.debug(
+ # f"Agenta returned {response.status_code} - Unexpected status code"
+ # )
+ raise DenyException(
+ status_code=500,
+ content=f"Could not verify secrets access: {self.host} returned unexpected status code {response.status_code}. Please try again later or contact support if the issue persists.",
+ )
+
+ try:
+ auth = response.json()
+ except ValueError as exc:
+ # log.debug(f"Agenta returned invalid JSON response: {exc}")
+ raise DenyException(
+ status_code=500,
+ content=f"Could not verify secrets access: {self.host} returned an invalid JSON response. Please try again later or contact support if the issue persists.",
+ ) from exc
+
+ if not isinstance(auth, dict):
+ # log.debug(
+ # f"Agenta returned invalid response format: {type(auth)}"
+ # )
+ raise DenyException(
+ status_code=500,
+ content=f"Could not verify secrets access: {self.host} returned an invalid response format. Please try again later or contact support if the issue persists.",
+ )
+
+ effect = auth.get("effect")
+
+ access = effect == "allow"
+
+ if effect != "allow":
+ # log.debug("Access denied by Agenta - effect: {effect}")
+ raise DenyException(
+ status_code=403,
+ content="Out of credits. Please set your LLM provider API keys or contact support.",
+ )
+
+ return
+
+ except DenyException as deny:
+ _cache.put(_hash, deny)
+
+ raise deny
+ except Exception as exc: # pylint: disable=broad-except
+ # log.debug(
+ # f"Unexpected error while verifying credentials (remote): {exc}"
+ # )
+ raise DenyException(
+ status_code=500,
+ content=f"Could not verify credentials: unexpected error - {str(exc)}. Please try again later or contact support if the issue persists.",
+ ) from exc
+
+ except DenyException as deny:
+ raise deny
+ except Exception as exc:
+ # log.debug(f"Unexpected error while verifying credentials (local): {exc}")
+ raise DenyException(
+ status_code=500,
+ content=f"Could not verify credentials: unexpected error - {str(exc)}. Please try again later or contact support if the issue persists.",
+ ) from exc
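The added `_allow_local_secrets` flow verifies credentials against `/api/permissions/verify` and caches `DenyException` instances so repeated requests with the same key fail fast instead of re-calling the API. A minimal standalone sketch of that deny-and-cache pattern (the `TTLCache` stand-in and function names are illustrative, not the SDK's exact API):

```python
import time


class DenyException(Exception):
    """Carries an HTTP-style status code and message for a denied request."""

    def __init__(self, status_code: int = 403, content: str = "Forbidden"):
        super().__init__(content)
        self.status_code = status_code
        self.content = content


class TTLCache:
    """Minimal stand-in for the SDK's TTLLRUCache (hypothetical, for illustration)."""

    def __init__(self, ttl: float = 60.0):
        self._ttl = ttl
        self._store = {}

    def get(self, key):
        hit = self._store.get(key)
        if hit and time.monotonic() - hit[1] < self._ttl:
            return hit[0]
        return None

    def put(self, key, value):
        self._store[key] = (value, time.monotonic())


_cache = TTLCache()


def verify_access(key: str, check) -> None:
    """Run `check()` once per TTL window; replay cached denials without re-calling."""
    cached = _cache.get(key)
    if isinstance(cached, Exception):
        raise cached  # replay the cached denial
    if cached == "allow":
        return  # cached success, skip the remote call
    try:
        check()  # e.g. a GET to /api/permissions/verify
        _cache.put(key, "allow")
    except DenyException as deny:
        _cache.put(key, deny)  # cache the denial so retries fail fast
        raise
```

Caching the exception object itself, as the diff does, means the second request raises the identical status code and message without another round trip.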
diff --git a/sdk/agenta/sdk/tracing/exporters.py b/sdk/agenta/sdk/tracing/exporters.py
index e0d3fc4511..5fd2006b7c 100644
--- a/sdk/agenta/sdk/tracing/exporters.py
+++ b/sdk/agenta/sdk/tracing/exporters.py
@@ -1,7 +1,6 @@
-from typing import Sequence, Dict, List, Optional, Any
+from typing import Sequence, Dict, List, Optional
from threading import Thread
from os import environ
-from uuid import UUID
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace.export import (
@@ -24,7 +23,7 @@
log = get_module_logger(__name__)
-_ASYNC_EXPORT = environ.get("AGENTA_OTLP_ASYNC_EXPORT", "false").lower() in TRUTHY
+_ASYNC_EXPORT = environ.get("AGENTA_OTLP_ASYNC_EXPORT", "true").lower() in TRUTHY
class InlineTraceExporter(SpanExporter):
@@ -51,8 +50,6 @@ def export(
self._registry[trace_id].append(span)
- return
-
def shutdown(self) -> None:
self._shutdown = True
@@ -92,17 +89,17 @@ def __init__(
self.credentials = credentials
def export(self, spans: Sequence[ReadableSpan]) -> SpanExportResult:
- grouped_spans: Dict[Optional[str], List[ReadableSpan]] = dict()
+ grouped_spans: Dict[Optional[str], List[ReadableSpan]] = {}
for span in spans:
trace_id = span.get_span_context().trace_id
credentials = None
if self.credentials:
- credentials = str(self.credentials.get(trace_id))
+ credentials = self.credentials.get(trace_id)
if credentials not in grouped_spans:
- grouped_spans[credentials] = list()
+ grouped_spans[credentials] = []
grouped_spans[credentials].append(span)
@@ -114,16 +111,6 @@ def export(self, spans: Sequence[ReadableSpan]) -> SpanExportResult:
credentials=credentials,
)
):
- for _span in _spans:
- trace_id = _span.get_span_context().trace_id
- span_id = _span.get_span_context().span_id
-
- # log.debug(
- # "[SPAN] [EXPORT]",
- # trace_id=UUID(int=trace_id).hex,
- # span_id=UUID(int=span_id).hex[-16:],
- # )
-
serialized_spans.append(super().export(_spans))
if all(serialized_spans):
@@ -140,31 +127,17 @@ def _export(self, serialized_data: bytes, timeout_sec: Optional[float] = None):
def __export():
with suppress():
- resp = None
if timeout_sec is not None:
- resp = super(OTLPExporter, self)._export(
- serialized_data,
- timeout_sec,
+ return super(OTLPExporter, self)._export(
+ serialized_data, timeout_sec
)
else:
- resp = super(OTLPExporter, self)._export(
- serialized_data,
- )
-
- # log.debug(
- # "[SPAN] [_EXPORT]",
- # data=serialized_data,
- # resp=resp,
- # )
+ return super(OTLPExporter, self)._export(serialized_data)
if _ASYNC_EXPORT is True:
thread = Thread(target=__export, daemon=True)
thread.start()
else:
- # log.debug(
- # "[SPAN] [__XPORT]",
- # data=serialized_data,
- # )
return __export()
except Exception as e:
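This hunk flips the `AGENTA_OTLP_ASYNC_EXPORT` default to `true`, so serialized spans are shipped from a fire-and-forget daemon thread. A sketch of that toggle (the env var mirrors the diff; the `export` helper and its `send` callback are illustrative):

```python
import os
from threading import Thread

# Mirrors the AGENTA_OTLP_ASYNC_EXPORT toggle whose default this diff flips to "true".
TRUTHY = {"1", "true", "yes", "on"}
_ASYNC_EXPORT = os.environ.get("AGENTA_OTLP_ASYNC_EXPORT", "true").lower() in TRUTHY


def export(serialized_data: bytes, send, use_async: bool = _ASYNC_EXPORT):
    """Ship serialized spans via `send`; in a daemon thread when async export is on."""
    if use_async:
        # Fire-and-forget: a daemon thread never blocks interpreter shutdown,
        # so spans still buffered at exit may be dropped.
        thread = Thread(target=send, args=(serialized_data,), daemon=True)
        thread.start()
        return thread
    send(serialized_data)
    return None
```

The trade-off is latency for durability: callers return immediately, but there is no backpressure or flush-on-exit guarantee in this sketch.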
diff --git a/sdk/agenta/sdk/tracing/processors.py b/sdk/agenta/sdk/tracing/processors.py
index 88c2f40a12..83f5bd5ab9 100644
--- a/sdk/agenta/sdk/tracing/processors.py
+++ b/sdk/agenta/sdk/tracing/processors.py
@@ -1,7 +1,6 @@
from typing import Optional, Dict, List
from threading import Lock
from json import dumps
-from uuid import UUID
from opentelemetry.baggage import get_all as get_baggage
from opentelemetry.context import Context
@@ -55,15 +54,6 @@ def on_start(
span: Span,
parent_context: Optional[Context] = None,
) -> None:
- trace_id = span.context.trace_id
- span_id = span.context.span_id
-
- # log.debug(
- # "[SPAN] [START] ",
- # trace_id=UUID(int=trace_id).hex,
- # span_id=UUID(int=span_id).hex[-16:],
- # )
-
for key in self.references.keys():
span.set_attribute(f"ag.refs.{key}", self.references[key])
@@ -165,12 +155,6 @@ def on_end(
trace_id = span.context.trace_id
span_id = span.context.span_id
- # log.debug(
- # "[SPAN] [END] ",
- # trace_id=UUID(int=trace_id).hex,
- # span_id=UUID(int=span_id).hex[-16:],
- # )
-
self._spans.setdefault(trace_id, []).append(span)
self._registry.setdefault(trace_id, {})
self._registry[trace_id].pop(span_id, None)
diff --git a/sdk/agenta/sdk/types.py b/sdk/agenta/sdk/types.py
index cbdf684dc3..356faeb28d 100644
--- a/sdk/agenta/sdk/types.py
+++ b/sdk/agenta/sdk/types.py
@@ -387,7 +387,7 @@ class ModelConfig(BaseModel):
"""Configuration for model parameters"""
model: str = MCField(
- default="gpt-4o-mini",
+ default="gpt-3.5-turbo",
choices=supported_llm_models,
)
@@ -462,154 +462,6 @@ def __init__(self, message: str, original_error: Optional[Exception] = None):
super().__init__(message)
-import json
-import re
-from typing import Any, Dict, Iterable, Tuple, Optional
-
-# --- Optional dependency: python-jsonpath (provides JSONPath + JSON Pointer) ---
-try:
- import jsonpath # ✅ use module API
- from jsonpath import JSONPointer # pointer class is fine to use
-except Exception:
- jsonpath = None
- JSONPointer = None
-
-# ========= Scheme detection =========
-
-
-def detect_scheme(expr: str) -> str:
- """Return 'json-path', 'json-pointer', or 'dot-notation' based on the placeholder prefix."""
- if expr.startswith("$"):
- return "json-path"
- if expr.startswith("/"):
- return "json-pointer"
- return "dot-notation"
-
-
-# ========= Resolvers =========
-
-
-def resolve_dot_notation(expr: str, data: dict) -> object:
- if "[" in expr or "]" in expr:
- raise KeyError(f"Bracket syntax is not supported in dot-notation: {expr!r}")
-
- # First, check if the expression exists as a literal key (e.g., "topic.story" as a single key)
- # This allows users to use dots in their variable names without nested access
- if expr in data:
- return data[expr]
-
- # If not found as a literal key, try to parse as dot-notation path
- cur = data
- for token in (p for p in expr.split(".") if p):
- if isinstance(cur, list) and token.isdigit():
- cur = cur[int(token)]
- else:
- if not isinstance(cur, dict):
- raise KeyError(
- f"Cannot access key {token!r} on non-dict while resolving {expr!r}"
- )
- if token not in cur:
- raise KeyError(f"Missing key {token!r} while resolving {expr!r}")
- cur = cur[token]
- return cur
-
-
-def resolve_json_path(expr: str, data: dict) -> object:
- if jsonpath is None:
- raise ImportError("python-jsonpath is required for json-path ($...)")
-
- if not (expr == "$" or expr.startswith("$.") or expr.startswith("$[")):
- raise ValueError(
- f"Invalid json-path expression {expr!r}. "
- "Must start with '$', '$.' or '$[' (no implicit normalization)."
- )
-
- # Use package-level APIf
- results = jsonpath.findall(expr, data) # always returns a list
- return results[0] if len(results) == 1 else results
-
-
-def resolve_json_pointer(expr: str, data: Dict[str, Any]) -> Any:
- """Resolve a JSON Pointer; returns a single value."""
- if JSONPointer is None:
- raise ImportError("python-jsonpath is required for json-pointer (/...)")
- return JSONPointer(expr).resolve(data)
-
-
-def resolve_any(expr: str, data: Dict[str, Any]) -> Any:
- """Dispatch to the right resolver based on detected scheme."""
- scheme = detect_scheme(expr)
- if scheme == "json-path":
- return resolve_json_path(expr, data)
- if scheme == "json-pointer":
- return resolve_json_pointer(expr, data)
- return resolve_dot_notation(expr, data)
-
-
-# ========= Placeholder & coercion helpers =========
-
-_PLACEHOLDER_RE = re.compile(r"\{\{\s*(.*?)\s*\}\}")
-
-
-def extract_placeholders(template: str) -> Iterable[str]:
- """Yield the inner text of all {{ ... }} occurrences (trimmed)."""
- for m in _PLACEHOLDER_RE.finditer(template):
- yield m.group(1).strip()
-
-
-def coerce_to_str(value: Any) -> str:
- """Pretty stringify values for embedding into templates."""
- if isinstance(value, (dict, list)):
- return json.dumps(value, ensure_ascii=False)
- return str(value)
-
-
-def build_replacements(
- placeholders: Iterable[str], data: Dict[str, Any]
-) -> Tuple[Dict[str, str], set]:
- """
- Resolve all placeholders against data.
- Returns (replacements, unresolved_placeholders).
- """
- replacements: Dict[str, str] = {}
- unresolved: set = set()
- for expr in set(placeholders):
- try:
- val = resolve_any(expr, data)
- # Escape backslashes to avoid regex replacement surprises
- replacements[expr] = coerce_to_str(val).replace("\\", "\\\\")
- except Exception:
- unresolved.add(expr)
- return replacements, unresolved
-
-
-def apply_replacements(template: str, replacements: Dict[str, str]) -> str:
- """Replace {{ expr }} using a callback to avoid regex-injection issues."""
-
- def _repl(m: re.Match) -> str:
- expr = m.group(1).strip()
- return replacements.get(expr, m.group(0))
-
- return _PLACEHOLDER_RE.sub(_repl, template)
-
-
-def compute_truly_unreplaced(original: set, rendered: str) -> set:
- """Only count placeholders that were in the original template and remain."""
- now = set(extract_placeholders(rendered))
- return original & now
-
-
-def missing_lib_hints(unreplaced: set) -> Optional[str]:
- """Suggest installing python-jsonpath if placeholders indicate json-path or json-pointer usage."""
- if any(expr.startswith("$") or expr.startswith("/") for expr in unreplaced) and (
- jsonpath is None or JSONPointer is None
- ):
- return (
- "Install python-jsonpath to enable json-path ($...) and json-pointer (/...)"
- )
- return None
-
-
class PromptTemplate(BaseModel):
"""A template for generating prompts with formatting capabilities"""
@@ -656,7 +508,6 @@ def _format_with_template(self, content: str, kwargs: Dict[str, Any]) -> str:
try:
if self.template_format == "fstring":
return content.format(**kwargs)
-
elif self.template_format == "jinja2":
from jinja2 import Template, TemplateError
@@ -667,33 +518,35 @@ def _format_with_template(self, content: str, kwargs: Dict[str, Any]) -> str:
f"Jinja2 template error in content: '{content}'. Error: {str(e)}",
original_error=e,
)
-
elif self.template_format == "curly":
- original_placeholders = set(extract_placeholders(content))
-
- replacements, _unresolved = build_replacements(
- original_placeholders, kwargs
- )
+ import re
+
+ # Extract variables that exist in the original template before replacement
+ # This allows us to distinguish template variables from {{}} in user input values
+ original_variables = set(re.findall(r"\{\{(.*?)\}\}", content))
+
+ result = content
+ for key, value in kwargs.items():
+ # Escape backslashes in the replacement string to prevent regex interpretation
+ escaped_value = str(value).replace("\\", "\\\\")
+ result = re.sub(
+ r"\{\{" + re.escape(key) + r"\}\}", escaped_value, result
+ )
- result = apply_replacements(content, replacements)
+ # Only check if ORIGINAL template variables remain unreplaced
+ # Don't error on {{}} that came from user input values
+ unreplaced_matches = set(re.findall(r"\{\{(.*?)\}\}", result))
+ truly_unreplaced = original_variables & unreplaced_matches
- truly_unreplaced = compute_truly_unreplaced(
- original_placeholders, result
- )
if truly_unreplaced:
- hint = missing_lib_hints(truly_unreplaced)
- suffix = f" Hint: {hint}" if hint else ""
raise TemplateFormatError(
- f"Unreplaced variables in curly template: {sorted(truly_unreplaced)}.{suffix}"
+ f"Unreplaced variables in curly template: {sorted(truly_unreplaced)}"
)
-
return result
-
else:
raise TemplateFormatError(
f"Unknown template format: {self.template_format}"
)
-
except KeyError as e:
key = str(e).strip("'")
raise TemplateFormatError(
@@ -701,8 +554,7 @@ def _format_with_template(self, content: str, kwargs: Dict[str, Any]) -> str:
)
except Exception as e:
raise TemplateFormatError(
- f"Error formatting template '{content}': {str(e)}",
- original_error=e,
+ f"Error formatting template '{content}': {str(e)}", original_error=e
)
def _substitute_variables(self, obj: Any, kwargs: Dict[str, Any]) -> Any:
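The rewritten `curly` branch in `PromptTemplate._format_with_template` first records which `{{...}}` placeholders exist in the original template, so braces arriving inside user-supplied values are never reported as unreplaced. The same logic as a standalone sketch (hypothetical helper name):

```python
import re


def format_curly(template: str, **kwargs) -> str:
    """Replace {{key}} placeholders; only placeholders present in the ORIGINAL
    template count as errors if left unreplaced, so values may contain {{...}}."""
    original = set(re.findall(r"\{\{(.*?)\}\}", template))
    result = template
    for key, value in kwargs.items():
        # Escape backslashes so re.sub does not reinterpret them in the replacement.
        escaped = str(value).replace("\\", "\\\\")
        result = re.sub(r"\{\{" + re.escape(key) + r"\}\}", escaped, result)
    remaining = set(re.findall(r"\{\{(.*?)\}\}", result)) & original
    if remaining:
        raise ValueError(f"Unreplaced variables: {sorted(remaining)}")
    return result
```

Intersecting against `original` is the key move: a value like `"{{not_a_var}}"` survives substitution verbatim without tripping the unreplaced-variable check.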
diff --git a/sdk/agenta/sdk/utils/logging.py b/sdk/agenta/sdk/utils/logging.py
index cc4789b93c..1091ceefd0 100644
--- a/sdk/agenta/sdk/utils/logging.py
+++ b/sdk/agenta/sdk/utils/logging.py
@@ -8,6 +8,15 @@
import structlog
from structlog.typing import EventDict, WrappedLogger, Processor
+# from datetime import datetime
+# from logging.handlers import RotatingFileHandler
+
+# from opentelemetry.trace import get_current_span
+# from opentelemetry._logs import set_logger_provider
+# from opentelemetry.sdk._logs import LoggingHandler, LoggerProvider
+# from opentelemetry.sdk._logs.export import BatchLogRecordProcessor
+# from opentelemetry.exporter.otlp.proto.http._log_exporter import OTLPLogExporter
+
TRACE_LEVEL = 1
logging.TRACE = TRACE_LEVEL
logging.addLevelName(TRACE_LEVEL, "TRACE")
@@ -31,6 +40,15 @@ def bound_logger_trace(self, *args, **kwargs):
AGENTA_LOG_CONSOLE_ENABLED = os.getenv("AGENTA_LOG_CONSOLE_ENABLED", "true") == "true"
AGENTA_LOG_CONSOLE_LEVEL = os.getenv("AGENTA_LOG_CONSOLE_LEVEL", "TRACE").upper()
+# AGENTA_LOG_OTLP_ENABLED = os.getenv("AGENTA_LOG_OTLP_ENABLED", "false") == "true"
+# AGENTA_LOG_OTLP_LEVEL = os.getenv("AGENTA_LOG_OTLP_LEVEL", "INFO").upper()
+
+# AGENTA_LOG_FILE_ENABLED = os.getenv("AGENTA_LOG_FILE_ENABLED", "true") == "true"
+# AGENTA_LOG_FILE_LEVEL = os.getenv("AGENTA_LOG_FILE_LEVEL", "WARNING").upper()
+# AGENTA_LOG_FILE_BASE = os.getenv("AGENTA_LOG_FILE_PATH", "error")
+# LOG_FILE_DATE = datetime.utcnow().strftime("%Y-%m-%d")
+# AGENTA_LOG_FILE_PATH = f"{AGENTA_LOG_FILE_BASE}-{LOG_FILE_DATE}.log"
+
# COLORS
LEVEL_COLORS = {
"TRACE": "\033[97m",
@@ -70,6 +88,15 @@ def process_positional_args(_, __, event_dict: EventDict) -> EventDict:
return event_dict
+# def add_trace_context(_, __, event_dict: EventDict) -> EventDict:
+# span = get_current_span()
+# if span and span.get_span_context().is_valid:
+# ctx = span.get_span_context()
+# event_dict["TraceId"] = format(ctx.trace_id, "032x")
+# event_dict["SpanId"] = format(ctx.span_id, "016x")
+# return event_dict
+
+
def add_logger_info(
logger: WrappedLogger, method_name: str, event_dict: EventDict
) -> EventDict:
@@ -116,9 +143,36 @@ def render(_, __, event_dict: EventDict) -> str:
return render
+# def plain_renderer() -> Processor:
+# hidden = {
+# "SeverityText",
+# "SeverityNumber",
+# "MethodName",
+# "logger_factory",
+# "LoggerName",
+# "level",
+# }
+
+# def render(_, __, event_dict: EventDict) -> str:
+# ts = event_dict.pop("Timestamp", "")[:23] + "Z"
+# level = event_dict.get("level", "")
+# msg = event_dict.pop("event", "")
+# padded = f"[{level:<5}]"
+# logger = f"[{event_dict.pop('logger', '')}]"
+# extras = " ".join(f"{k}={v}" for k, v in event_dict.items() if k not in hidden)
+# return f"{ts} {padded} {msg} {logger} {extras}"
+
+# return render
+
+
+# def json_renderer() -> Processor:
+# return structlog.processors.JSONRenderer()
+
+
SHARED_PROCESSORS: list[Processor] = [
structlog.processors.TimeStamper(fmt="iso", utc=True, key="Timestamp"),
process_positional_args,
+ # add_trace_context,
add_logger_info,
structlog.processors.format_exc_info,
structlog.processors.dict_tracebacks,
@@ -139,30 +193,36 @@ def create_struct_logger(
)
-# Guard against double initialization
-_LOGGING_CONFIGURED = False
-
# CONFIGURE HANDLERS AND STRUCTLOG LOGGERS
handlers = []
loggers = []
-if AGENTA_LOG_CONSOLE_ENABLED and not _LOGGING_CONFIGURED:
- _LOGGING_CONFIGURED = True
-
- # Check if console logger already has handlers (from OSS module)
- console_logger = logging.getLogger("console")
-
- if not console_logger.handlers:
- # Only add handler if it doesn't exist yet
- h = logging.StreamHandler(sys.stdout)
- h.setLevel(getattr(logging, AGENTA_LOG_CONSOLE_LEVEL, TRACE_LEVEL))
- h.setFormatter(logging.Formatter("%(message)s"))
- console_logger.addHandler(h)
- console_logger.setLevel(TRACE_LEVEL)
- console_logger.propagate = False
-
+if AGENTA_LOG_CONSOLE_ENABLED:
+ h = logging.StreamHandler(sys.stdout)
+ h.setLevel(getattr(logging, AGENTA_LOG_CONSOLE_LEVEL, TRACE_LEVEL))
+ h.setFormatter(logging.Formatter("%(message)s"))
+ logging.getLogger("console").addHandler(h)
loggers.append(create_struct_logger([colored_console_renderer()], "console"))
+# if AGENTA_LOG_FILE_ENABLED:
+# h = RotatingFileHandler(AGENTA_LOG_FILE_PATH, maxBytes=10 * 1024 * 1024, backupCount=5)
+# h.setLevel(getattr(logging, AGENTA_LOG_FILE_LEVEL, logging.WARNING))
+# h.setFormatter(logging.Formatter("%(message)s"))
+# logging.getLogger("file").addHandler(h)
+# loggers.append(create_struct_logger([plain_renderer()], "file"))
+
+# if AGENTA_LOG_OTLP_ENABLED:
+# provider = LoggerProvider()
+# exporter = OTLPLogExporter()
+# provider.add_log_record_processor(BatchLogRecordProcessor(exporter))
+# set_logger_provider(provider)
+# h = LoggingHandler(
+# level=getattr(logging, AGENTA_LOG_OTLP_LEVEL, logging.INFO), logger_provider=provider
+# )
+# h.setFormatter(logging.Formatter("%(message)s"))
+# logging.getLogger("otel").addHandler(h)
+# loggers.append(create_struct_logger([json_renderer()], "otel"))
+
class MultiLogger:
def __init__(self, *loggers: structlog.stdlib.BoundLogger):
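`MultiLogger`, shown in context above, fans each logging call out to every configured structlog logger (console today; the file and OTLP sinks are commented out). A minimal sketch of the fan-out pattern, assuming duck-typed logger objects:

```python
class MultiLogger:
    """Fan each method call out to several underlying loggers (illustrative sketch)."""

    def __init__(self, *loggers):
        self._loggers = loggers

    def __getattr__(self, name):
        # Called only for attributes not found on MultiLogger itself,
        # so info/debug/warning/etc. all dispatch through here.
        def fan_out(*args, **kwargs):
            for logger in self._loggers:
                getattr(logger, name)(*args, **kwargs)
        return fan_out
```

Each sink keeps its own renderer and level, so one `log.info(...)` can land colored on the console and as plain text or JSON elsewhere.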
diff --git a/sdk/agenta/sdk/workflows/handlers.py b/sdk/agenta/sdk/workflows/handlers.py
index 2a72fe20eb..cbee22d305 100644
--- a/sdk/agenta/sdk/workflows/handlers.py
+++ b/sdk/agenta/sdk/workflows/handlers.py
@@ -76,153 +76,6 @@ def _compute_similarity(embedding_1: List[float], embedding_2: List[float]) -> f
return dot / (norm1 * norm2)
-import json
-import re
-from typing import Any, Dict, Iterable, Tuple, Optional
-
-try:
- import jsonpath # ✅ use module API
- from jsonpath import JSONPointer # pointer class is fine to use
-except Exception:
- jsonpath = None
- JSONPointer = None
-
-# ========= Scheme detection =========
-
-
-def detect_scheme(expr: str) -> str:
- """Return 'json-path', 'json-pointer', or 'dot-notation' based on the placeholder prefix."""
- if expr.startswith("$"):
- return "json-path"
- if expr.startswith("/"):
- return "json-pointer"
- return "dot-notation"
-
-
-# ========= Resolvers =========
-
-
-def resolve_dot_notation(expr: str, data: dict) -> object:
- if "[" in expr or "]" in expr:
- raise KeyError(f"Bracket syntax is not supported in dot-notation: {expr!r}")
-
- # First, check if the expression exists as a literal key (e.g., "topic.story" as a single key)
- # This allows users to use dots in their variable names without nested access
- if expr in data:
- return data[expr]
-
- # If not found as a literal key, try to parse as dot-notation path
- cur = data
- for token in (p for p in expr.split(".") if p):
- if isinstance(cur, list) and token.isdigit():
- cur = cur[int(token)]
- else:
- if not isinstance(cur, dict):
- raise KeyError(
- f"Cannot access key {token!r} on non-dict while resolving {expr!r}"
- )
- if token not in cur:
- raise KeyError(f"Missing key {token!r} while resolving {expr!r}")
- cur = cur[token]
- return cur
-
-
-def resolve_json_path(expr: str, data: dict) -> object:
- if jsonpath is None:
- raise ImportError("python-jsonpath is required for json-path ($...)")
-
- if not (expr == "$" or expr.startswith("$.") or expr.startswith("$[")):
- raise ValueError(
- f"Invalid json-path expression {expr!r}. "
- "Must start with '$', '$.' or '$[' (no implicit normalization)."
- )
-
- # Use package-level APIf
- results = jsonpath.findall(expr, data) # always returns a list
- return results[0] if len(results) == 1 else results
-
-
-def resolve_json_pointer(expr: str, data: Dict[str, Any]) -> Any:
- """Resolve a JSON Pointer; returns a single value."""
- if JSONPointer is None:
- raise ImportError("python-jsonpath is required for json-pointer (/...)")
- return JSONPointer(expr).resolve(data)
-
-
-def resolve_any(expr: str, data: Dict[str, Any]) -> Any:
- """Dispatch to the right resolver based on detected scheme."""
- scheme = detect_scheme(expr)
- if scheme == "json-path":
- return resolve_json_path(expr, data)
- if scheme == "json-pointer":
- return resolve_json_pointer(expr, data)
- return resolve_dot_notation(expr, data)
-
-
-# ========= Placeholder & coercion helpers =========
-
-_PLACEHOLDER_RE = re.compile(r"\{\{\s*(.*?)\s*\}\}")
-
-
-def extract_placeholders(template: str) -> Iterable[str]:
- """Yield the inner text of all {{ ... }} occurrences (trimmed)."""
- for m in _PLACEHOLDER_RE.finditer(template):
- yield m.group(1).strip()
-
-
-def coerce_to_str(value: Any) -> str:
- """Pretty stringify values for embedding into templates."""
- if isinstance(value, (dict, list)):
- return json.dumps(value, ensure_ascii=False)
- return str(value)
-
-
-def build_replacements(
- placeholders: Iterable[str], data: Dict[str, Any]
-) -> Tuple[Dict[str, str], set]:
- """
- Resolve all placeholders against data.
- Returns (replacements, unresolved_placeholders).
- """
- replacements: Dict[str, str] = {}
- unresolved: set = set()
- for expr in set(placeholders):
- try:
- val = resolve_any(expr, data)
- # Escape backslashes to avoid regex replacement surprises
- replacements[expr] = coerce_to_str(val).replace("\\", "\\\\")
- except Exception:
- unresolved.add(expr)
- return replacements, unresolved
-
-
-def apply_replacements(template: str, replacements: Dict[str, str]) -> str:
- """Replace {{ expr }} using a callback to avoid regex-injection issues."""
-
- def _repl(m: re.Match) -> str:
- expr = m.group(1).strip()
- return replacements.get(expr, m.group(0))
-
- return _PLACEHOLDER_RE.sub(_repl, template)
-
-
-def compute_truly_unreplaced(original: set, rendered: str) -> set:
- """Only count placeholders that were in the original template and remain."""
- now = set(extract_placeholders(rendered))
- return original & now
-
-
-def missing_lib_hints(unreplaced: set) -> Optional[str]:
- """Suggest installing python-jsonpath if placeholders indicate json-path or json-pointer usage."""
- if any(expr.startswith("$") or expr.startswith("/") for expr in unreplaced) and (
- jsonpath is None or JSONPointer is None
- ):
- return (
- "Install python-jsonpath to enable json-path ($...) and json-pointer (/...)"
- )
- return None
-
-
def _format_with_template(
content: str,
format: str,
@@ -237,24 +90,33 @@ def _format_with_template(
try:
return Template(content).render(**kwargs)
- except TemplateError:
+ except TemplateError:
return content
elif format == "curly":
- original_placeholders = set(extract_placeholders(content))
+ import re
- replacements, _unresolved = build_replacements(original_placeholders, kwargs)
+ # Extract variables that exist in the original template before replacement
+ # This allows us to distinguish template variables from {{}} in user input values
+ original_variables = set(re.findall(r"\{\{(.*?)\}\}", content))
- result = apply_replacements(content, replacements)
+ result = content
+ for key, value in kwargs.items():
+ pattern = r"\{\{" + re.escape(key) + r"\}\}"
+ # Escape backslashes in the replacement string to prevent regex interpretation
+ escaped_value = str(value).replace("\\", "\\\\")
+ result = re.sub(pattern, escaped_value, result)
- truly_unreplaced = compute_truly_unreplaced(original_placeholders, result)
+ # Only check if ORIGINAL template variables remain unreplaced
+ # Don't error on {{}} that came from user input values
+ unreplaced_matches = set(re.findall(r"\{\{(.*?)\}\}", result))
+ truly_unreplaced = original_variables & unreplaced_matches
if truly_unreplaced:
- hint = missing_lib_hints(truly_unreplaced)
- suffix = f" Hint: {hint}" if hint else ""
+ log.info(f"WORKFLOW Found unreplaced variables: {truly_unreplaced}")
raise ValueError(
- f"Template variables not found or unresolved: "
- f"{', '.join(sorted(truly_unreplaced))}.{suffix}"
+ f"Template variables not found in inputs: {', '.join(sorted(truly_unreplaced))}"
)
return result
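The `curly` branch above can be exercised in isolation. Below is a minimal standalone sketch of the same replacement strategy (the helper name `render_curly` is hypothetical, not part of the SDK): only variables captured from the template *before* substitution are checked afterwards, which is why `{{...}}` arriving inside input values is never flagged as unreplaced.

```python
import re


def render_curly(template: str, inputs: dict) -> str:
    # Variables present in the template BEFORE replacement; any {{...}}
    # introduced by input values must not count as a template variable.
    original_variables = set(re.findall(r"\{\{(.*?)\}\}", template))
    result = template
    for key, value in inputs.items():
        pattern = r"\{\{" + re.escape(key) + r"\}\}"
        # Escape backslashes so re.sub does not interpret them as
        # group references in the replacement string.
        escaped_value = str(value).replace("\\", "\\\\")
        result = re.sub(pattern, escaped_value, result)
    # Only error on variables that existed in the original template.
    unreplaced = original_variables & set(re.findall(r"\{\{(.*?)\}\}", result))
    if unreplaced:
        raise ValueError(
            f"Template variables not found in inputs: {', '.join(sorted(unreplaced))}"
        )
    return result
```

For example, `render_curly("Hello {{name}}", {"name": "Ada"})` yields `"Hello Ada"`, while a value such as `"keep {{b}} literal"` passes through untouched and a genuinely missing variable raises `ValueError`, mirroring the error message in the diff.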
@@ -776,31 +638,6 @@ async def auto_ai_critique_v0(
got=model,
)
- response_type = parameters.get("response_type") or "text"
-
- if not response_type in ["text", "json_object", "json_schema"]:
- raise InvalidConfigurationParameterV0Error(
- path="response_type",
- expected=["text", "json_object", "json_schema"],
- got=response_type,
- )
-
- json_schema = parameters.get("json_schema") or None
-
- json_schema = json_schema if response_type == "json_schema" else None
-
- if response_type == "json_schema" and not isinstance(json_schema, dict):
- raise InvalidConfigurationParameterV0Error(
- path="json_schema",
- expected="dict",
- got=json_schema,
- )
-
- response_format: dict = dict(type=response_type)
-
- if response_type == "json_schema":
- response_format["json_schema"] = json_schema
-
correct_answer = None
if inputs:
@@ -854,6 +691,13 @@ async def auto_ai_critique_v0(
got=threshold,
)
+ if not 0.0 < threshold <= 1.0:
+ raise InvalidConfigurationParameterV0Error(
+ path="threshold",
+            expected="float(0.0, 1.0]",
+ got=threshold,
+ )
+
_outputs = None
# --------------------------------------------------------------------------
@@ -921,7 +765,6 @@ async def auto_ai_critique_v0(
model=model,
messages=formatted_prompt_template,
temperature=0.01,
- response_format=response_format,
)
_outputs = response.choices[0].message.content.strip() # type: ignore
@@ -945,20 +788,31 @@ async def auto_ai_critique_v0(
pass
if isinstance(_outputs, (int, float)):
- return {
- "score": _outputs,
- "success": _outputs >= threshold,
- }
+ return {"score": _outputs, "success": _outputs >= threshold}
if isinstance(_outputs, bool):
- return {
- "success": _outputs,
- }
+ return {"success": _outputs}
if isinstance(_outputs, dict):
- return _outputs
+ if "score" in _outputs and "success" in _outputs:
+ return {
+ "score": _outputs["score"],
+ "success": _outputs["success"],
+ }
+
+ elif "score" in _outputs:
+ return {
+ "score": _outputs["score"],
+ "success": _outputs["score"] >= threshold,
+ }
+
+ elif "success" in _outputs:
+                return {"success": _outputs["success"]}
+
+ else:
+ return _outputs
- raise InvalidOutputsV0Error(expected=["dict", "str", "int", "float"], got=_outputs)
+ raise InvalidOutputsV0Error(expected=["dict", "int", "float"], got=_outputs)
@instrument(annotate=True)
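The reshaped return-value handling in `auto_ai_critique_v0` can be summarized as a small normalization function. The sketch below is a hypothetical `normalize_critique_output` helper that assumes the same `score`/`success` conventions as the diff (and raises a plain `ValueError` in place of `InvalidOutputsV0Error`); it additionally guards the numeric branch against `bool`, since `isinstance(True, (int, float))` is true in Python and would otherwise shadow the boolean branch.

```python
def normalize_critique_output(outputs, threshold: float) -> dict:
    # bool is a subclass of int, so exclude it from the numeric branch.
    if isinstance(outputs, (int, float)) and not isinstance(outputs, bool):
        return {"score": outputs, "success": outputs >= threshold}
    if isinstance(outputs, bool):
        return {"success": outputs}
    if isinstance(outputs, dict):
        if "score" in outputs and "success" in outputs:
            return {"score": outputs["score"], "success": outputs["success"]}
        if "score" in outputs:
            # Derive success from the configured threshold.
            return {"score": outputs["score"],
                    "success": outputs["score"] >= threshold}
        if "success" in outputs:
            return {"success": outputs["success"]}
        return outputs
    raise ValueError(f"unexpected output type: {type(outputs)!r}")
```

Usage: `normalize_critique_output(0.8, 0.5)` produces `{"score": 0.8, "success": True}`, and a dict carrying only `score` has its `success` derived from the threshold.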
diff --git a/sdk/poetry.lock b/sdk/poetry.lock
index 1541ee64cf..b383c0d85e 100644
--- a/sdk/poetry.lock
+++ b/sdk/poetry.lock
@@ -228,18 +228,18 @@ files = [
[[package]]
name = "boto3"
-version = "1.40.63"
+version = "1.40.62"
description = "The AWS SDK for Python"
optional = false
python-versions = ">=3.9"
groups = ["dev"]
files = [
- {file = "boto3-1.40.63-py3-none-any.whl", hash = "sha256:f15d4abf1a6283887c336f660cdfc2162a210d2d8f4d98dbcbcef983371c284d"},
- {file = "boto3-1.40.63.tar.gz", hash = "sha256:3bf4b034900c87a6a9b3b3b44c4aec26e96fc73bff2505f0766224b7295178ce"},
+ {file = "boto3-1.40.62-py3-none-any.whl", hash = "sha256:f422d4ae3b278832ba807059aafa553164bce2c464cd65b24c9ea8fb8a6c4192"},
+ {file = "boto3-1.40.62.tar.gz", hash = "sha256:3dbe7e1e7dc9127a4b1f2020a14f38ffe64fad84df00623e8ab6a5d49a82ea28"},
]
[package.dependencies]
-botocore = ">=1.40.63,<1.41.0"
+botocore = ">=1.40.62,<1.41.0"
jmespath = ">=0.7.1,<2.0.0"
s3transfer = ">=0.14.0,<0.15.0"
@@ -248,14 +248,14 @@ crt = ["botocore[crt] (>=1.21.0,<2.0a0)"]
[[package]]
name = "botocore"
-version = "1.40.63"
+version = "1.40.62"
description = "Low-level, data-driven core of boto 3."
optional = false
python-versions = ">=3.9"
groups = ["dev"]
files = [
- {file = "botocore-1.40.63-py3-none-any.whl", hash = "sha256:83657b3ee487268fccc9ba022cba572ba657b9ece8cddd1fa241e2c6a49c8c14"},
- {file = "botocore-1.40.63.tar.gz", hash = "sha256:0324552c3c800e258cbcb8c22b495a2e2e0260a7408d08016196e46fa0d1b587"},
+ {file = "botocore-1.40.62-py3-none-any.whl", hash = "sha256:780f1d476d4b530ce3b12fd9f7112156d97d99ebdbbd9ef60635b0432af9d3a5"},
+ {file = "botocore-1.40.62.tar.gz", hash = "sha256:1e8e57c131597dc234d67428bda1323e8f0a687ea13ea570253159ab9256fa28"},
]
[package.dependencies]
@@ -744,14 +744,14 @@ files = [
[[package]]
name = "fsspec"
-version = "2025.10.0"
+version = "2025.9.0"
description = "File-system specification"
optional = false
python-versions = ">=3.9"
groups = ["main"]
files = [
- {file = "fsspec-2025.10.0-py3-none-any.whl", hash = "sha256:7c7712353ae7d875407f97715f0e1ffcc21e33d5b24556cb1e090ae9409ec61d"},
- {file = "fsspec-2025.10.0.tar.gz", hash = "sha256:b6789427626f068f9a83ca4e8a3cc050850b6c0f71f99ddb4f542b8266a26a59"},
+ {file = "fsspec-2025.9.0-py3-none-any.whl", hash = "sha256:530dc2a2af60a414a832059574df4a6e10cce927f6f4a78209390fe38955cfb7"},
+ {file = "fsspec-2025.9.0.tar.gz", hash = "sha256:19fd429483d25d28b65ec68f9f4adc16c17ea2c7c7bf54ec61360d478fb19c19"},
]
[package.extras]
@@ -784,14 +784,14 @@ tqdm = ["tqdm"]
[[package]]
name = "google-auth"
-version = "2.42.1"
+version = "2.42.0"
description = "Google Authentication Library"
optional = false
python-versions = ">=3.7"
groups = ["main"]
files = [
- {file = "google_auth-2.42.1-py2.py3-none-any.whl", hash = "sha256:eb73d71c91fc95dbd221a2eb87477c278a355e7367a35c0d84e6b0e5f9b4ad11"},
- {file = "google_auth-2.42.1.tar.gz", hash = "sha256:30178b7a21aa50bffbdc1ffcb34ff770a2f65c712170ecd5446c4bef4dc2b94e"},
+ {file = "google_auth-2.42.0-py2.py3-none-any.whl", hash = "sha256:f8f944bcb9723339b0ef58a73840f3c61bc91b69bf7368464906120b55804473"},
+ {file = "google_auth-2.42.0.tar.gz", hash = "sha256:9bbbeef3442586effb124d1ca032cfb8fb7acd8754ab79b55facd2b8f3ab2802"},
]
[package.dependencies]
@@ -2217,21 +2217,6 @@ files = [
[package.extras]
cli = ["click (>=5.0)"]
-[[package]]
-name = "python-jsonpath"
-version = "2.0.1"
-description = "JSONPath, JSON Pointer and JSON Patch for Python."
-optional = false
-python-versions = ">=3.8"
-groups = ["main"]
-files = [
- {file = "python_jsonpath-2.0.1-py3-none-any.whl", hash = "sha256:ebd518b7c883acc5b976518d76b6c96288405edec7d9ef838641869c1e1a5eb7"},
- {file = "python_jsonpath-2.0.1.tar.gz", hash = "sha256:32a84ebb2dc0ec1b42a6e165b0f9174aef8310bad29154ad9aee31ac37cca18f"},
-]
-
-[package.extras]
-strict = ["iregexp-check (>=0.1.4)", "regex"]
-
[[package]]
name = "pyyaml"
version = "6.0.3"
@@ -3188,4 +3173,4 @@ type = ["pytest-mypy"]
[metadata]
lock-version = "2.1"
python-versions = "^3.11"
-content-hash = "e6413824b6ec2fa2e89002d58d6c3432772dc3279619b8f54e4818abaa3b44a7"
+content-hash = "14edf246a0775b4245d1b8d10d33092e474aadd3458b78b72d2a13d2bbdae975"
diff --git a/sdk/pyproject.toml b/sdk/pyproject.toml
index dd3bf54cf0..48c56139a8 100644
--- a/sdk/pyproject.toml
+++ b/sdk/pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "agenta"
-version = "0.60.2"
+version = "0.60.0"
description = "The SDK for agenta is an open-source LLMOps platform."
readme = "README.md"
authors = [
@@ -34,7 +34,6 @@ pyyaml = "^6.0.2"
toml = "^0.10.2"
litellm = "==1.78.7"
jinja2 = "^3.1.6"
-python-jsonpath = "^2.0.0"
opentelemetry-api = "^1.27.0"
opentelemetry-sdk = "^1.27.0"
opentelemetry-instrumentation = ">=0.56b0"
diff --git a/web/ee/package.json b/web/ee/package.json
index 01946d1ff3..3a7d0209d2 100644
--- a/web/ee/package.json
+++ b/web/ee/package.json
@@ -1,6 +1,6 @@
{
"name": "@agenta/ee",
- "version": "0.60.2",
+ "version": "0.62.0",
"private": true,
"engines": {
"node": ">=18"
diff --git a/web/ee/public/assets/Agenta-logo-full-dark-accent.png b/web/ee/public/assets/Agenta-logo-full-dark-accent.png
deleted file mode 100644
index c14833dab1..0000000000
Binary files a/web/ee/public/assets/Agenta-logo-full-dark-accent.png and /dev/null differ
diff --git a/web/ee/public/assets/Agenta-logo-full-light.png b/web/ee/public/assets/Agenta-logo-full-light.png
deleted file mode 100644
index 4c9b31a813..0000000000
Binary files a/web/ee/public/assets/Agenta-logo-full-light.png and /dev/null differ
diff --git a/web/ee/public/assets/dark-complete-transparent-CROPPED.png b/web/ee/public/assets/dark-complete-transparent-CROPPED.png
new file mode 100644
index 0000000000..7d134ac59a
Binary files /dev/null and b/web/ee/public/assets/dark-complete-transparent-CROPPED.png differ
diff --git a/web/ee/public/assets/dark-complete-transparent_white_logo.png b/web/ee/public/assets/dark-complete-transparent_white_logo.png
new file mode 100644
index 0000000000..8685bbf981
Binary files /dev/null and b/web/ee/public/assets/dark-complete-transparent_white_logo.png differ
diff --git a/web/ee/public/assets/dark-logo.svg b/web/ee/public/assets/dark-logo.svg
new file mode 100644
index 0000000000..6cb8ef3330
--- /dev/null
+++ b/web/ee/public/assets/dark-logo.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/web/ee/public/assets/favicon.ico b/web/ee/public/assets/favicon.ico
index dad02fe072..4dc8619b1d 100644
Binary files a/web/ee/public/assets/favicon.ico and b/web/ee/public/assets/favicon.ico differ
diff --git a/web/ee/public/assets/light-complete-transparent-CROPPED.png b/web/ee/public/assets/light-complete-transparent-CROPPED.png
new file mode 100644
index 0000000000..6be2e99e08
Binary files /dev/null and b/web/ee/public/assets/light-complete-transparent-CROPPED.png differ
diff --git a/web/ee/public/assets/light-logo.svg b/web/ee/public/assets/light-logo.svg
new file mode 100644
index 0000000000..9c795f8e88
--- /dev/null
+++ b/web/ee/public/assets/light-logo.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/web/ee/src/components/EvalRunDetails/AutoEvalRun/components/EvalRunFocusDrawer/assets/FocusDrawerContent/index.tsx b/web/ee/src/components/EvalRunDetails/AutoEvalRun/components/EvalRunFocusDrawer/assets/FocusDrawerContent/index.tsx
index 0df5ba5646..d0704f2feb 100644
--- a/web/ee/src/components/EvalRunDetails/AutoEvalRun/components/EvalRunFocusDrawer/assets/FocusDrawerContent/index.tsx
+++ b/web/ee/src/components/EvalRunDetails/AutoEvalRun/components/EvalRunFocusDrawer/assets/FocusDrawerContent/index.tsx
@@ -23,6 +23,7 @@ import {focusScenarioAtom} from "@/oss/components/EvalRunDetails/state/focusScen
import {urlStateAtom} from "@/oss/components/EvalRunDetails/state/urlState"
import MetricDetailsPopover from "@/oss/components/HumanEvaluations/assets/MetricDetailsPopover"
import {formatMetricValue} from "@/oss/components/HumanEvaluations/assets/MetricDetailsPopover/assets/utils"
+import {getMetricsFromEvaluator} from "@/oss/components/pages/observability/drawer/AnnotateDrawer/assets/transforms"
import {getStatusLabel} from "@/oss/lib/constants/statusLabels"
import {
evaluationRunStateFamily,
@@ -45,19 +46,6 @@ import FocusDrawerContentSkeleton from "../Skeletons/FocusDrawerContentSkeleton"
import RunOutput, {fallbackPrimitive, resolveOnlineOutput} from "./assets/RunOutput"
import RunTraceHeader from "./assets/RunTraceHeader"
-import {
- getFromAnnotationOutputs,
- resolveEvaluatorMetricsMap,
- SCENARIO_METRIC_ALIASES,
- asEvaluatorArray,
- extractEvaluatorSlug,
- extractEvaluatorName,
- findAnnotationStepKey,
- collectSlugCandidates,
- collectEvaluatorIdentifiers,
- pickString,
- buildDrawerMetricDefinition,
-} from "./lib/helpers"
const EMPTY_COMPARISON_RUN_IDS: string[] = []
@@ -72,7 +60,139 @@ const emptyMetricDataAtom = atom<{value: any; distInfo?: any}>({
distInfo: undefined,
})
-export interface DrawerMetricValueCellProps {
+const SCENARIO_METRIC_ALIASES: Record<string, string[]> = {
+ "attributes.ag.metrics.costs.cumulative.total": ["totalCost", "costs.total", "cost"],
+ "attributes.ag.metrics.duration.cumulative": ["duration.total", "duration"],
+ "attributes.ag.metrics.tokens.cumulative.total": ["totalTokens", "tokens.total", "tokens"],
+ "attributes.ag.metrics.errors.cumulative": ["errors"],
+ totalCost: ["attributes.ag.metrics.costs.cumulative.total", "costs.total", "cost"],
+ "duration.total": ["attributes.ag.metrics.duration.cumulative", "duration"],
+ totalTokens: ["attributes.ag.metrics.tokens.cumulative.total", "tokens.total", "tokens"],
+ promptTokens: ["attributes.ag.metrics.tokens.cumulative.total", "tokens", "tokens.prompt"],
+ completionTokens: [
+ "attributes.ag.metrics.tokens.cumulative.total",
+ "tokens",
+ "tokens.completion",
+ ],
+ errors: ["attributes.ag.metrics.errors.cumulative"],
+}
+
+const asEvaluatorArray = (input: any): any[] => {
+ if (!input) return []
+ if (Array.isArray(input)) return input
+ if (typeof input === "object") return Object.values(input)
+ return []
+}
+
+const pickString = (candidate: unknown): string | undefined => {
+ if (typeof candidate === "string") {
+ const trimmed = candidate.trim()
+ if (trimmed.length > 0) return trimmed
+ }
+ return undefined
+}
+
+const collectEvaluatorIdentifiers = (entry: any): string[] => {
+ if (!entry || typeof entry !== "object") return []
+    const ids = new Set<string>()
+ ;[
+ entry.slug,
+ entry.id,
+ entry.key,
+ entry.uid,
+ entry.evaluator_key,
+ entry?.data?.slug,
+ entry?.data?.id,
+ entry?.data?.key,
+ entry?.data?.evaluator_key,
+ entry?.meta?.slug,
+ entry?.meta?.id,
+ entry?.meta?.key,
+ entry?.flags?.slug,
+ entry?.flags?.id,
+ entry?.flags?.key,
+ entry?.flags?.evaluator_key,
+ entry?.references?.slug,
+ entry?.references?.id,
+ entry?.references?.key,
+ ].forEach((candidate) => {
+ const value = pickString(candidate)
+ if (value) ids.add(value)
+ })
+ return Array.from(ids)
+}
+
+const extractEvaluatorSlug = (entry: any): string | undefined => {
+ if (!entry || typeof entry !== "object") return undefined
+ const candidates = collectEvaluatorIdentifiers(entry)
+ if (candidates.length) return candidates[0]
+ return undefined
+}
+
+const extractEvaluatorName = (entry: any): string | undefined => {
+ if (!entry || typeof entry !== "object") return undefined
+ const candidates = [
+ entry?.name,
+ entry?.displayName,
+ entry?.display_name,
+ entry?.title,
+ entry?.label,
+ entry?.meta?.displayName,
+ entry?.meta?.display_name,
+ entry?.meta?.name,
+ entry?.flags?.display_name,
+ entry?.flags?.name,
+ entry?.data?.display_name,
+ entry?.data?.name,
+ ]
+ for (const candidate of candidates) {
+ const value = pickString(candidate)
+ if (value) return value
+ }
+ return undefined
+}
+
+const asRecord = (value: any): Record<string, any> | undefined => {
+ if (!value || typeof value !== "object" || Array.isArray(value)) return undefined
+ const entries = Object.entries(value)
+ if (!entries.length) return undefined
+    return value as Record<string, any>
+}
+
+const extractSchemaProperties = (entry: any): Record<string, any> | undefined => {
+ if (!entry || typeof entry !== "object") return undefined
+ const candidates = [
+ entry?.data?.schemas?.outputs?.properties,
+ entry?.data?.schemas?.output?.properties,
+ entry?.data?.service?.format?.properties?.outputs?.properties,
+ entry?.data?.service?.properties?.outputs?.properties,
+ entry?.data?.output_schema?.properties,
+ entry?.data?.outputs_schema?.properties,
+ entry?.output_schema?.properties,
+ entry?.schema?.properties,
+ ]
+ for (const candidate of candidates) {
+ const record = asRecord(candidate)
+ if (record) return record
+ }
+ return undefined
+}
+
+const resolveEvaluatorMetricsMap = (entry: any): Record<string, any> | undefined => {
+ if (!entry || typeof entry !== "object") return undefined
+ const direct = asRecord(entry.metrics)
+ if (direct) return direct
+
+ const schemaProps = extractSchemaProperties(entry)
+ if (schemaProps) return schemaProps
+
+ const derived = asRecord(getMetricsFromEvaluator(entry as any))
+ if (derived) return derived
+
+ return undefined
+}
+
+interface DrawerMetricValueCellProps {
runId: string
scenarioId?: string
evaluatorSlug: string
@@ -96,13 +216,172 @@ interface EvaluatorContext {
errorStep?: EvaluatorFailure
}
-export interface DrawerEvaluatorMetric {
+interface DrawerEvaluatorMetric {
id: string
displayName: string
metricKey?: string
fallbackKeys?: string[]
}
+const normalizeMetricPrimaryKey = (slug: string | undefined, rawKey: string): string => {
+ const normalizedSlug = slug && slug.trim().length > 0 ? slug.trim() : undefined
+ const trimmed = rawKey.trim()
+ if (!trimmed) return normalizedSlug ?? ""
+ if (normalizedSlug) {
+ const prefix = `${normalizedSlug}.`
+ if (trimmed.startsWith(prefix)) return trimmed
+ }
+ if (trimmed.includes(".")) return trimmed
+ return normalizedSlug ? `${normalizedSlug}.${trimmed}` : trimmed
+}
+
+const collectMetricFallbackKeys = (
+ slug: string | undefined,
+ rawKey: string,
+ primaryKey: string,
+ meta: any,
+): string[] => {
+    const set = new Set<string>()
+ const normalizedSlug = slug && slug.trim().length > 0 ? slug.trim() : undefined
+ const push = (value?: string) => {
+ if (!value) return
+ const trimmed = String(value).trim()
+ if (!trimmed) return
+ if (trimmed.includes(".") || !normalizedSlug) {
+ set.add(trimmed)
+ } else {
+ set.add(`${normalizedSlug}.${trimmed}`)
+ }
+ }
+
+ push(rawKey)
+
+ const aliases = Array.isArray(meta?.aliases)
+ ? meta?.aliases
+ : meta?.aliases
+ ? [meta.aliases]
+ : meta?.alias
+ ? [meta.alias]
+ : []
+ aliases.forEach(push)
+
+ const extraKeys = [
+ meta?.metricKey,
+ meta?.metric_key,
+ meta?.key,
+ meta?.path,
+ meta?.fullKey,
+ meta?.full_key,
+ meta?.canonicalKey,
+ meta?.canonical_key,
+ meta?.statsKey,
+ meta?.stats_key,
+ meta?.metric,
+ ]
+ extraKeys.forEach(push)
+
+ const fallbackKeys = Array.from(set).filter((value) => value !== rawKey && value !== primaryKey)
+ return fallbackKeys
+}
+
+const buildDrawerMetricDefinition = (
+ slug: string | undefined,
+ rawKey: string,
+ meta: any,
+): DrawerEvaluatorMetric => {
+ const normalizedSlug = slug && slug.trim().length > 0 ? slug.trim() : undefined
+ const normalizedDisplay =
+ normalizedSlug && rawKey.startsWith(`${normalizedSlug}.`)
+ ? rawKey.slice(normalizedSlug.length + 1)
+ : rawKey
+ const primaryKey = normalizeMetricPrimaryKey(slug, rawKey)
+ const fallbackKeys = collectMetricFallbackKeys(slug, rawKey, primaryKey, meta)
+ const id = canonicalizeMetricKey(primaryKey) || primaryKey
+
+ return {
+ id,
+ displayName: normalizedDisplay || primaryKey,
+ metricKey: primaryKey,
+ fallbackKeys: fallbackKeys.length ? fallbackKeys : undefined,
+ }
+}
+
+const collectCandidateSteps = (data?: UseEvaluationRunScenarioStepsFetcherResult): any[] => {
+ if (!data) return []
+ const buckets: any[] = []
+ if (Array.isArray(data.annotationSteps)) buckets.push(...(data.annotationSteps as any[]))
+ if (Array.isArray(data.steps)) buckets.push(...(data.steps as any[]))
+ if (Array.isArray(data.invocationSteps)) buckets.push(...(data.invocationSteps as any[]))
+ return buckets
+}
+
+const collectSlugCandidates = (
+ data: UseEvaluationRunScenarioStepsFetcherResult | undefined,
+ evaluatorSlug: string,
+): string[] => {
+    const set = new Set<string>()
+ const push = (value?: string | null) => {
+ if (!value) return
+ const normalized = String(value).trim()
+ if (!normalized) return
+ set.add(normalized)
+ }
+
+ push(evaluatorSlug)
+
+ const steps = collectCandidateSteps(data)
+ steps.forEach((step) => {
+ if (!step) return
+ const ref: any = step?.references?.evaluator
+ push(step?.stepKey as any)
+ push(step?.stepkey as any)
+ push(step?.step_key as any)
+ push(ref?.slug)
+ push(ref?.key)
+ push(ref?.id)
+ })
+
+ return Array.from(set)
+}
+
+const findAnnotationStepKey = (
+ data: UseEvaluationRunScenarioStepsFetcherResult | undefined,
+ slugCandidates: string[],
+): string | undefined => {
+ if (!data) return undefined
+
+ const steps = collectCandidateSteps(data)
+ if (!steps.length) return undefined
+
+ const loweredCandidates = slugCandidates
+ .map((slug) => String(slug).toLowerCase())
+ .filter((slug) => slug.length > 0)
+
+ const matched = steps.find((step) => {
+ if (!step) return false
+ const possible: string[] = [
+ (step as any)?.stepKey,
+ (step as any)?.stepkey,
+ (step as any)?.step_key,
+ (step as any)?.references?.evaluator?.slug,
+ (step as any)?.references?.evaluator?.key,
+ (step as any)?.references?.evaluator?.id,
+ ]
+
+ return possible
+ .filter(Boolean)
+ .map((value) => String(value).toLowerCase())
+ .some((candidate) => loweredCandidates.includes(candidate))
+ })
+
+ return (
+ (matched as any)?.stepKey ??
+ (matched as any)?.stepkey ??
+ (matched as any)?.step_key ??
+ undefined
+ )
+}
+
const EvaluatorFailureDisplay = ({
status,
error,
@@ -270,8 +549,6 @@ const DrawerMetricValueCell = ({
const bareMetricData = useAtomValue(runScopedAtoms.bare)
const canonicalBareMetricData = useAtomValue(runScopedAtoms.canonicalBare)
- const runScopedStats = useAtomValue(runMetricsStatsCacheFamily(runId))
-
const runScopedResult = useMemo(() => {
const candidates = [
{key: normalizedPrimaryKey, data: primaryMetricData},
@@ -481,41 +758,8 @@ const DrawerMetricValueCell = ({
evaluatorSlug,
])
- // Prefer run-scoped/metrics-map value; if it is missing or schema-like, fallback to annotation outputs
- const annotationFallback = useMemo(() => {
- const v = resolution.rawValue
- const isSchemaLike =
- v &&
- typeof v === "object" &&
- !Array.isArray(v) &&
- Object.keys(v as any).length <= 2 &&
- "type" in (v as any)
-
- const unusable =
- v === undefined ||
- v === null ||
- (typeof v === "string" && !v.trim()) ||
- (typeof v === "number" && Number.isNaN(v)) ||
- isSchemaLike
-
- if (!unusable) return undefined
-
- return getFromAnnotationOutputs({
- scenarioStepsResult,
- slugCandidates,
- evaluatorSlug,
- expandedCandidates,
- })
- }, [
- resolution.rawValue,
- scenarioStepsResult,
- slugCandidates,
- evaluatorSlug,
- expandedCandidates,
- ])
-
- const rawValue = annotationFallback?.value ?? resolution.rawValue
- const matchedKey = annotationFallback?.matchedKey ?? resolution.matchedKey
+ const rawValue = resolution.rawValue
+ const matchedKey = resolution.matchedKey
const distInfo = useMemo(() => {
if (resolution.distInfo !== undefined) return resolution.distInfo
@@ -640,7 +884,6 @@ const DrawerMetricValueCell = ({
const editorKey = `${runId}-${scenarioId}-${evaluatorSlug}-${metricName}`
return (
{}}
initialValue={display}
@@ -663,10 +906,9 @@ const DrawerMetricValueCell = ({
const tagNode = (
{display}
@@ -1852,7 +2094,7 @@ const FocusDrawerContent = () => {
const isPrevOpen = !!(prevSlug && activeKeys.includes(prevSlug))
const metricMap = new Map()
- const metricHelper = (meta: any, rawKey: string) => {
+ Object.entries(metrics || {}).forEach(([rawKey, meta]) => {
const definition = buildDrawerMetricDefinition(
evaluator.slug,
String(rawKey),
@@ -1875,16 +2117,6 @@ const FocusDrawerContent = () => {
: undefined,
})
}
- }
-
- Object.entries(metrics || {}).forEach(([rawKey, meta]) => {
- if (meta.properties) {
- Object.entries(meta.properties).forEach(([propKey, propMeta]) => {
- metricHelper(propMeta, `${rawKey}.${propKey}`)
- })
- } else {
- metricHelper(meta, rawKey)
- }
})
const metricDefs = Array.from(metricMap.values())
@@ -1919,7 +2151,7 @@ const FocusDrawerContent = () => {
error: scenarioStepsError,
}}
sectionId={`section-${evaluator.slug}`}
- metricRowClassName="flex flex-col items-start gap-1 mb-3 w-full"
+ metricRowClassName="flex flex-col items-start gap-1 mb-3"
/>
),
}
diff --git a/web/ee/src/components/EvalRunDetails/AutoEvalRun/components/EvalRunFocusDrawer/assets/FocusDrawerContent/lib/helpers.ts b/web/ee/src/components/EvalRunDetails/AutoEvalRun/components/EvalRunFocusDrawer/assets/FocusDrawerContent/lib/helpers.ts
deleted file mode 100644
index a84d3d1d55..0000000000
--- a/web/ee/src/components/EvalRunDetails/AutoEvalRun/components/EvalRunFocusDrawer/assets/FocusDrawerContent/lib/helpers.ts
+++ /dev/null
@@ -1,401 +0,0 @@
-import {UseEvaluationRunScenarioStepsFetcherResult} from "@/oss/lib/hooks/useEvaluationRunScenarioSteps/types"
-import {DrawerEvaluatorMetric, DrawerMetricValueCellProps} from ".."
-import {canonicalizeMetricKey} from "@/oss/lib/metricUtils"
-import {getMetricsFromEvaluator} from "@/oss/components/pages/observability/drawer/AnnotateDrawer/assets/transforms"
-
-export const SCENARIO_METRIC_ALIASES: Record<string, string[]> = {
- "attributes.ag.metrics.costs.cumulative.total": ["totalCost", "costs.total", "cost"],
- "attributes.ag.metrics.duration.cumulative": ["duration.total", "duration"],
- "attributes.ag.metrics.tokens.cumulative.total": ["totalTokens", "tokens.total", "tokens"],
- "attributes.ag.metrics.errors.cumulative": ["errors"],
- totalCost: ["attributes.ag.metrics.costs.cumulative.total", "costs.total", "cost"],
- "duration.total": ["attributes.ag.metrics.duration.cumulative", "duration"],
- totalTokens: ["attributes.ag.metrics.tokens.cumulative.total", "tokens.total", "tokens"],
- promptTokens: ["attributes.ag.metrics.tokens.cumulative.total", "tokens", "tokens.prompt"],
- completionTokens: [
- "attributes.ag.metrics.tokens.cumulative.total",
- "tokens",
- "tokens.completion",
- ],
- errors: ["attributes.ag.metrics.errors.cumulative"],
-}
-
-export const asEvaluatorArray = (input: any): any[] => {
- if (!input) return []
- if (Array.isArray(input)) return input
- if (typeof input === "object") return Object.values(input)
- return []
-}
-
-export const pickString = (candidate: unknown): string | undefined => {
- if (typeof candidate === "string") {
- const trimmed = candidate.trim()
- if (trimmed.length > 0) return trimmed
- }
- return undefined
-}
-
-export const collectEvaluatorIdentifiers = (entry: any): string[] => {
- if (!entry || typeof entry !== "object") return []
-    const ids = new Set<string>()
- ;[
- entry.slug,
- entry.id,
- entry.key,
- entry.uid,
- entry.evaluator_key,
- entry?.data?.slug,
- entry?.data?.id,
- entry?.data?.key,
- entry?.data?.evaluator_key,
- entry?.meta?.slug,
- entry?.meta?.id,
- entry?.meta?.key,
- entry?.flags?.slug,
- entry?.flags?.id,
- entry?.flags?.key,
- entry?.flags?.evaluator_key,
- entry?.references?.slug,
- entry?.references?.id,
- entry?.references?.key,
- ].forEach((candidate) => {
- const value = pickString(candidate)
- if (value) ids.add(value)
- })
- return Array.from(ids)
-}
-
-export const extractEvaluatorSlug = (entry: any): string | undefined => {
- if (!entry || typeof entry !== "object") return undefined
- const candidates = collectEvaluatorIdentifiers(entry)
- if (candidates.length) return candidates[0]
- return undefined
-}
-
-export const extractEvaluatorName = (entry: any): string | undefined => {
- if (!entry || typeof entry !== "object") return undefined
- const candidates = [
- entry?.name,
- entry?.displayName,
- entry?.display_name,
- entry?.title,
- entry?.label,
- entry?.meta?.displayName,
- entry?.meta?.display_name,
- entry?.meta?.name,
- entry?.flags?.display_name,
- entry?.flags?.name,
- entry?.data?.display_name,
- entry?.data?.name,
- ]
- for (const candidate of candidates) {
- const value = pickString(candidate)
- if (value) return value
- }
- return undefined
-}
-
-export const asRecord = (value: any): Record<string, any> | undefined => {
- if (!value || typeof value !== "object" || Array.isArray(value)) return undefined
- const entries = Object.entries(value)
- if (!entries.length) return undefined
-    return value as Record<string, any>
-}
-
-export const extractSchemaProperties = (entry: any): Record<string, any> | undefined => {
- if (!entry || typeof entry !== "object") return undefined
- const candidates = [
- entry?.data?.schemas?.outputs?.properties,
- entry?.data?.schemas?.output?.properties,
- entry?.data?.service?.format?.properties?.outputs?.properties,
- entry?.data?.service?.properties?.outputs?.properties,
- entry?.data?.output_schema?.properties,
- entry?.data?.outputs_schema?.properties,
- entry?.output_schema?.properties,
- entry?.schema?.properties,
- ]
- for (const candidate of candidates) {
- const record = asRecord(candidate)
- if (record) return record
- }
- return undefined
-}
-
-export const resolveEvaluatorMetricsMap = (entry: any): Record<string, any> | undefined => {
- if (!entry || typeof entry !== "object") return undefined
- const direct = asRecord(entry.metrics)
- if (direct) return direct
-
- const schemaProps = extractSchemaProperties(entry)
- if (schemaProps) return schemaProps
-
- const derived = asRecord(getMetricsFromEvaluator(entry as any))
- if (derived) return derived
-
- return undefined
-}
-
-export const normalizeMetricPrimaryKey = (slug: string | undefined, rawKey: string): string => {
- const normalizedSlug = slug && slug.trim().length > 0 ? slug.trim() : undefined
- const trimmed = rawKey.trim()
- if (!trimmed) return normalizedSlug ?? ""
- if (normalizedSlug) {
- const prefix = `${normalizedSlug}.`
- if (trimmed.startsWith(prefix)) return trimmed
- }
- if (trimmed.includes(".")) return trimmed
- return normalizedSlug ? `${normalizedSlug}.${trimmed}` : trimmed
-}
-
-export const collectMetricFallbackKeys = (
- slug: string | undefined,
- rawKey: string,
- primaryKey: string,
- meta: any,
-): string[] => {
-    const set = new Set<string>()
- const normalizedSlug = slug && slug.trim().length > 0 ? slug.trim() : undefined
- const push = (value?: string) => {
- if (!value) return
- const trimmed = String(value).trim()
- if (!trimmed) return
- if (trimmed.includes(".") || !normalizedSlug) {
- set.add(trimmed)
- } else {
- set.add(`${normalizedSlug}.${trimmed}`)
- }
- }
-
- push(rawKey)
-
- const aliases = Array.isArray(meta?.aliases)
- ? meta?.aliases
- : meta?.aliases
- ? [meta.aliases]
- : meta?.alias
- ? [meta.alias]
- : []
- aliases.forEach(push)
-
- const extraKeys = [
- meta?.metricKey,
- meta?.metric_key,
- meta?.key,
- meta?.path,
- meta?.fullKey,
- meta?.full_key,
- meta?.canonicalKey,
- meta?.canonical_key,
- meta?.statsKey,
- meta?.stats_key,
- meta?.metric,
- ]
- extraKeys.forEach(push)
-
- const fallbackKeys = Array.from(set).filter((value) => value !== rawKey && value !== primaryKey)
- return fallbackKeys
-}
-
-export const buildDrawerMetricDefinition = (
- slug: string | undefined,
- rawKey: string,
- meta: any,
-): DrawerEvaluatorMetric => {
- const normalizedSlug = slug && slug.trim().length > 0 ? slug.trim() : undefined
- const normalizedDisplay =
- normalizedSlug && rawKey.startsWith(`${normalizedSlug}.`)
- ? rawKey.slice(normalizedSlug.length + 1)
- : rawKey
- const primaryKey = normalizeMetricPrimaryKey(slug, rawKey)
- const fallbackKeys = collectMetricFallbackKeys(slug, rawKey, primaryKey, meta)
- const id = canonicalizeMetricKey(primaryKey) || primaryKey
-
- return {
- id,
- displayName: normalizedDisplay || primaryKey,
- metricKey: primaryKey,
- fallbackKeys: fallbackKeys.length ? fallbackKeys : undefined,
- }
-}
-
-export const collectCandidateSteps = (data?: UseEvaluationRunScenarioStepsFetcherResult): any[] => {
- if (!data) return []
- const buckets: any[] = []
- if (Array.isArray(data.annotationSteps)) buckets.push(...(data.annotationSteps as any[]))
- if (Array.isArray(data.steps)) buckets.push(...(data.steps as any[]))
- if (Array.isArray(data.invocationSteps)) buckets.push(...(data.invocationSteps as any[]))
- return buckets
-}
-
-export const collectSlugCandidates = (
- data: UseEvaluationRunScenarioStepsFetcherResult | undefined,
- evaluatorSlug: string,
-): string[] => {
-    const set = new Set<string>()
- const push = (value?: string | null) => {
- if (!value) return
- const normalized = String(value).trim()
- if (!normalized) return
- set.add(normalized)
- }
-
- push(evaluatorSlug)
-
- const steps = collectCandidateSteps(data)
- steps.forEach((step) => {
- if (!step) return
- const ref: any = step?.references?.evaluator
- push(step?.stepKey as any)
- push(step?.stepkey as any)
- push(step?.step_key as any)
- push(ref?.slug)
- push(ref?.key)
- push(ref?.id)
- })
-
- return Array.from(set)
-}
-
-export const findAnnotationStepKey = (
- data: UseEvaluationRunScenarioStepsFetcherResult | undefined,
- slugCandidates: string[],
-): string | undefined => {
- if (!data) return undefined
-
- const steps = collectCandidateSteps(data)
- if (!steps.length) return undefined
-
- const loweredCandidates = slugCandidates
- .map((slug) => String(slug).toLowerCase())
- .filter((slug) => slug.length > 0)
-
- const matched = steps.find((step) => {
- if (!step) return false
- const possible: string[] = [
- (step as any)?.stepKey,
- (step as any)?.stepkey,
- (step as any)?.step_key,
- (step as any)?.references?.evaluator?.slug,
- (step as any)?.references?.evaluator?.key,
- (step as any)?.references?.evaluator?.id,
- ]
-
- return possible
- .filter(Boolean)
- .map((value) => String(value).toLowerCase())
- .some((candidate) => loweredCandidates.includes(candidate))
- })
-
- return (
- (matched as any)?.stepKey ??
- (matched as any)?.stepkey ??
- (matched as any)?.step_key ??
- undefined
- )
-}
-
-/** Return the best primitive/array value from annotationSteps[].annotation.data.outputs */
-export const getFromAnnotationOutputs = ({
- scenarioStepsResult,
- slugCandidates,
- evaluatorSlug,
- expandedCandidates,
-}: {
- scenarioStepsResult?: DrawerMetricValueCellProps["scenarioStepsResult"]
- slugCandidates: string[]
- evaluatorSlug: string
- expandedCandidates: string[]
-}): {value: any; matchedKey?: string} | undefined => {
- const data = scenarioStepsResult?.data
- if (!data || !Array.isArray(data.annotationSteps)) return undefined
-
- // choose only annotation steps that belong to any of our slug candidates
- const pool = new Set(slugCandidates.map((s) => String(s).toLowerCase()))
- const steps = (data.annotationSteps as any[]).filter((s) => {
- const sk = s?.stepKey ?? s?.stepkey ?? s?.step_key
- const ref = s?.references?.evaluator
- const ids = [sk, ref?.slug, ref?.key, ref?.id]
- .filter(Boolean)
- .map((x) => String(x).toLowerCase())
- return ids.some((id) => pool.has(id))
- })
-
- if (!steps.length) return undefined
-
- // outputs pockets we’re allowed to read as fallback
- const outputsOf = (s: any) =>
- [s?.annotation?.data?.outputs, s?.data?.outputs, s?.outputs].filter(
- (o) => o && typeof o === "object",
- ) as Record<string, any>[]
-
- const isPrimitive = (v: unknown) =>
- v === null || ["string", "number", "boolean"].includes(typeof v)
-
- const stripPfx = (k: string) => {
- const PFX = [
- "attributes.ag.data.outputs.",
- "ag.data.outputs.",
- "outputs.",
- `${evaluatorSlug}.`,
- ]
- for (const p of PFX) if (k.startsWith(p)) return k.slice(p.length)
- return k
- }
-
- const pathGet = (obj: any, path: string) =>
- path.split(".").reduce((acc, k) => (acc == null ? acc : acc[k]), obj)
-
- // 1) exact/bare path tries inside outputs
- for (const s of steps) {
- for (const outs of outputsOf(s)) {
- for (const cand of expandedCandidates) {
- const bare = stripPfx(cand)
- for (const v of new Set([stripPfx(cand), bare, `extra.${bare}`])) {
- const val = pathGet(outs, v)
- if (val !== undefined && (isPrimitive(val) || Array.isArray(val))) {
- return {value: val, matchedKey: v}
- }
- }
- }
- }
- }
-
- // 2) fuzzy DFS through outputs (skip schema objects like { type: ... })
- const canonical = (s?: string) =>
- typeof s === "string" ? s.toLowerCase().replace(/[^a-z0-9]+/g, "") : ""
-
- const terminals = new Set(
- expandedCandidates.map((k) => stripPfx(k).split(".").pop()!).map(canonical),
- )
-
- const looksLikeSchema = (o: any) =>
- o &&
- typeof o === "object" &&
- !Array.isArray(o) &&
- Object.keys(o).length <= 2 &&
- "type" in o &&
- (Object.keys(o).length === 1 || "description" in o)
-
- const dfs = (obj: any, path: string[] = []): {value: any; matchedKey: string} | undefined => {
- if (!obj || typeof obj !== "object") return
- for (const [k, v] of Object.entries(obj)) {
- const p = [...path, k]
- if (isPrimitive(v) || Array.isArray(v)) {
- const hit = terminals.has(canonical(k)) || terminals.has(canonical(p[p.length - 1]))
- if (hit) return {value: v, matchedKey: p.join(".")}
- } else if (!looksLikeSchema(v)) {
- const h = dfs(v, p)
- if (h) return h
- }
- }
- }
-
- for (const s of steps) {
- for (const outs of outputsOf(s)) {
- const hit = dfs(outs)
- if (hit) return hit
- }
- }
-
- return undefined
-}
diff --git a/web/ee/src/components/EvalRunDetails/components/EvalRunOverviewViewer/index.tsx b/web/ee/src/components/EvalRunDetails/components/EvalRunOverviewViewer/index.tsx
index fc434a83b1..68763db144 100644
--- a/web/ee/src/components/EvalRunDetails/components/EvalRunOverviewViewer/index.tsx
+++ b/web/ee/src/components/EvalRunDetails/components/EvalRunOverviewViewer/index.tsx
@@ -756,13 +756,8 @@ const EvalRunOverviewViewer = ({type = "auto"}: {type: "auto" | "online"}) => {
{hasMetrics ? (
<>
- {combinedMetricEntries
- .filter(
- (entry) =>
- entry.metric?.type !== "string" &&
- entry.metric?.type !== "json",
- )
- .map(({fullKey, metric, evaluatorSlug, metricKey}, idx) => {
+ {combinedMetricEntries.map(
+ ({fullKey, metric, evaluatorSlug, metricKey}, idx) => {
if (!metric || !Object.keys(metric || {}).length) return null
const isBooleanMetric =
@@ -1171,7 +1166,8 @@ const EvalRunOverviewViewer = ({type = "auto"}: {type: "auto" | "online"}) => {
placeholderDescription={placeholderCopy?.description}
/>
)
- })}
+ },
+ )}
{placeholderCards.length ? placeholderCards : null}
>
) : placeholderCards.length ? (
diff --git a/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/ScenarioTable.tsx b/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/ScenarioTable.tsx
index 8de842f3a6..a7d10a41c3 100644
--- a/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/ScenarioTable.tsx
+++ b/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/ScenarioTable.tsx
@@ -22,7 +22,7 @@ import {
type QueryWindowingPayload,
} from "../../../../services/onlineEvaluations/api"
import {EvalRunTestcaseTableSkeleton} from "../../AutoEvalRun/components/EvalRunTestcaseViewer/assets/EvalRunTestcaseViewerSkeleton"
-import type {TableRow} from "./types"
+import type {TableRow} from "../types"
import useScrollToScenario from "./hooks/useScrollToScenario"
import useTableDataSource from "./hooks/useTableDataSource"
diff --git a/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/CellComponents.tsx b/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/CellComponents.tsx
index b433a2c116..96452ec02e 100644
--- a/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/CellComponents.tsx
+++ b/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/CellComponents.tsx
@@ -442,8 +442,6 @@ export const InputSummaryCell = memo(
const deepMerge = (target: Record<string, any>, source?: Record<string, any>) => {
if (!source || typeof source !== "object") return target
Object.entries(source).forEach(([key, rawValue]) => {
- // Prevent prototype pollution by excluding dangerous keys
- if (key === "__proto__" || key === "constructor" || key === "prototype") return
const parsed = tryParseJson(rawValue)
const value = parsed
if (value && typeof value === "object" && !Array.isArray(value)) {
diff --git a/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/MetricCell/MetricCell.tsx b/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/MetricCell/MetricCell.tsx
index 28268c68eb..1a6b7d210f 100644
--- a/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/MetricCell/MetricCell.tsx
+++ b/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/MetricCell/MetricCell.tsx
@@ -1,4 +1,4 @@
-import {type ReactNode, memo, useCallback, useMemo} from "react"
+import {type ReactNode, memo, useCallback, useEffect, useMemo} from "react"
import {Tag, Tooltip} from "antd"
import clsx from "clsx"
@@ -27,7 +27,7 @@ import {EvaluationStatus} from "@/oss/lib/Types"
import {STATUS_COLOR_TEXT} from "../../../EvalRunScenarioStatusTag/assets"
import {CellWrapper} from "../CellComponents" // CellWrapper is default export? need to check.
-import {resolveAnnotationMetricValue, resolveStepFailure} from "./helpers"
+import {EvaluatorFailure, hasFailureStatus, resolveAnnotationMetricValue} from "./helpers"
import {AnnotationValueCellProps, MetricCellProps, MetricValueCellProps} from "./types"
/*
@@ -159,22 +159,8 @@ const MetricCell = memo(
formatted = String(value)
}
- // 1) Detect string by the actual value, not by metricType
- const isPlainString = typeof value === "string"
-
- // 2) When string, render as a wrapped block (no popover)
- if (isPlainString) {
- return (
-
-
- {value as string}
-
-
- )
- }
-
- // 3) Only show popover for non-strings
- if (distInfo && !isPlainString) {
+ // Wrap in popover when distInfo present
+ if (distInfo && metricType !== "string") {
return (
data.name in x,
- )?.[data.name]?.metricType
- const kind: ColumnDef["kind"] =
- type === "string" ? "annotation" : "metric"
-
return {
...data,
name: data.name,
key: `${metricKey}.${data.name}`,
title: `${formattedName} ${isMean ? "(mean)" : ""}`.trim(),
- kind,
+ kind: "metric",
path: fullPath,
fallbackPath: legacyPath,
stepKey: "metric",
- metricType: type,
+ metricType: metricsFromEvaluators[metricKey]?.find(
+ (x) => data.name in x,
+ )?.[data.name]?.metricType,
}
}
return undefined
@@ -310,17 +298,15 @@ export function buildScenarioTableData({
return undefined
}
const formattedName = formatColumnTitle(metricName)
- const type = def?.[metricName]?.metricType
- const kind: ColumnDef["kind"] = type === "string" ? "annotation" : "metric"
return {
name: metricName,
key: `${metricKey}.${metricName}`,
title: formattedName,
- kind,
- path: fullPath,
- fallbackPath: fullPath,
+ kind: "metric" as const,
+ path: `${metricKey}.${metricName}`,
+ fallbackPath: `${metricKey}.${metricName}`,
stepKey: "metric",
- metricType: type,
+ metricType: def?.[metricName]?.metricType,
}
})
.filter(Boolean) as any[]
diff --git a/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/utils.tsx b/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/utils.tsx
index 93df472fcf..f5cc20bd06 100644
--- a/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/utils.tsx
+++ b/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/utils.tsx
@@ -35,24 +35,7 @@ import {AnnotationValueCell, EvaluatorFailureCell, MetricValueCell} from "./Metr
import TimestampCell from "./TimestampCell"
import {BaseColumn, TableColumn} from "./types"
-// ---------------- Helpers to detect/normalize annotation-like metric paths ----------------
-const OUT_PREFIX = "attributes.ag.data.outputs."
-const IN_PREFIX = "attributes.ag.data.inputs."
-
-/** A “metric” column that actually points inside the annotation payload. */
-const isAnnotationLikeMetricPath = (p?: string) =>
- typeof p === "string" && (p.includes(OUT_PREFIX) || p.includes(IN_PREFIX))
-
-/** Strip the run-scoped prefix to the field path used by AnnotationValueCell helpers. */
-const toAnnotationFieldPath = (p: string) =>
- p.includes(OUT_PREFIX)
- ? p.slice(OUT_PREFIX.length)
- : p.includes(IN_PREFIX)
- ? p.slice(IN_PREFIX.length)
- : p
-// ------------------------------------------------------------------------------------------
-
-// Helper to compare metric/annotation primitives across scenarios (used for sorting metrics)
+// Helper to compare metric/annotation primitives across scenarios
function scenarioMetricPrimitive(recordKey: string, column: any, runId: string) {
const st = evalAtomStore()
let raw: any = column.values?.[recordKey]
@@ -119,6 +102,9 @@ function scenarioMetricSorter(column: any, runId: string) {
/**
* Transforms a list of scenario metrics into a map of scenarioId -> metrics, merging
* nested metrics under `outputs` into the same level.
+ *
+ * @param {{scenarioMetrics: any[]}} props - The props object containing the metrics.
* @returns {Record<string, Record<string, any>>} - A map of scenarioId -> metrics.
*/
export const getScenarioMetricsMap = ({scenarioMetrics}: {scenarioMetrics: any[]}) => {
const map: Record<string, Record<string, any>> = {}
@@ -128,9 +114,11 @@ export const getScenarioMetricsMap = ({scenarioMetrics}: {scenarioMetrics: any[]
const sid = m.scenarioId
if (!sid) return
+ // Clone the data object to avoid accidental mutations
const data: Record<string, any> =
m && typeof m === "object" && m.data && typeof m.data === "object" ? {...m.data} : {}
+ // If metrics are nested under `outputs`, merge them into the same level
if (data.outputs && typeof data.outputs === "object") {
Object.assign(data, data.outputs)
delete data.outputs
@@ -157,7 +145,6 @@ const generateColumnTitle = (col: BaseColumn) => {
if (col.kind === "annotation") return titleCase(col.name)
return titleCase(col.title ?? col.name)
}
-
const generateColumnWidth = (col: BaseColumn) => {
if (col.kind === "meta") return 80
if (col.kind === "input") return COLUMN_WIDTHS.input
@@ -166,7 +153,6 @@ const generateColumnWidth = (col: BaseColumn) => {
if (col.kind === "invocation") return COLUMN_WIDTHS.response
return 20
}
-
const orderRank = (def: EnhancedColumnType): number => {
if (def.key === "#") return 0
if (def.key === "timestamp") return 1
@@ -177,6 +163,7 @@ const orderRank = (def: EnhancedColumnType): number => {
if (def.key?.includes("evaluators")) return 6
if (def.key === "__metrics_group__") return 7
if (def.key === "errors") return 9 // ensure errors column stays at the end of metrics group
+
return 8
}
@@ -262,13 +249,8 @@ export function buildAntdColumns(
width: generateColumnWidth(c),
__editLabel: editLabel,
}
-
- // Sorting:
- // - keep sorting for true numeric/boolean/string metrics
- // - disable sorting for annotation-like metric paths (their values come from annotations, not metrics atoms)
const sortable =
(c.kind === "metric" || c.kind === "annotation") &&
- !isAnnotationLikeMetricPath(c.path) &&
isSortableMetricType(c.metricType)
const sorter = sortable ? scenarioMetricSorter(c, runId) : undefined
@@ -416,6 +398,7 @@ export function buildAntdColumns(
width: 120,
minWidth: 120,
render: (_: any, record: TableRow) => {
+ // Use runId from record data instead of function parameter
const effectiveRunId = (record as any).runId || runId
return (
{
+ // Use runId from record data instead of function parameter
const effectiveRunId = (record as any).runId || runId
-
+ // if (record.isSkeleton) return
switch (c.kind) {
case "input": {
const inputStepKey = resolveStepKeyForRun(c, effectiveRunId)
@@ -725,25 +708,6 @@ export function buildAntdColumns(
)
}
case "metric": {
- // If this “metric” is actually pointing inside annotations, render via AnnotationValueCell
- if (isAnnotationLikeMetricPath(c.path)) {
- const annotationStepKey = resolveStepKeyForRun(c, effectiveRunId)
- const fieldPath = toAnnotationFieldPath(c.path)
- return (
-
- )
- }
-
const scenarioId = record.scenarioId || record.key
const evaluatorSlug = (c as any).evaluatorSlug as string | undefined
const groupIndex = (c as any).evaluatorColumnIndex ?? 0
diff --git a/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/assets/utils.ts b/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/assets/utils.ts
index 2cd4ffbc56..f08ccb95ff 100644
--- a/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/assets/utils.ts
+++ b/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/assets/utils.ts
@@ -146,50 +146,25 @@ export const format3Sig = (num: number | string): string => {
* Format a metric value using the mapping above.
* Falls back to the raw value when the metric has no formatter or value is non-numeric.
*/
-export function formatMetricValue(metricKey: string, value: unknown): string {
- if (value == null) {
- return ""
- }
-
- if (Array.isArray(value)) {
- return value.map((v) => formatMetricValue(metricKey, v)).join(", ")
- }
-
- if (typeof value === "boolean") {
- return value ? "true" : "false"
- }
-
- if (typeof value === "object") {
- try {
- return JSON.stringify(value, null, 2)
- } catch (error) {
- return String(value)
- }
- }
-
- if (typeof value !== "string" && typeof value !== "number") {
- return String(value)
- }
-
+export function formatMetricValue(metricKey: string, value: number | string): string {
const fmt = METRIC_FORMATTERS[metricKey] || {
decimals: 2,
}
- if (fmt?.format) {
- return fmt.format(value)
+ if (Array.isArray(value)) {
+ return value.map((v) => {
+ return formatMetricValue(metricKey, v)
+ })
}
+ if (!fmt) return String(value)
- if (typeof value !== "number") {
- const numericValue = Number(value)
- if (Number.isNaN(numericValue)) {
- return String(value)
- }
- const adjusted = fmt.multiplier ? numericValue * fmt.multiplier : numericValue
- const rounded = Number.isFinite(adjusted) ? format3Sig(adjusted) : format3Sig(value)
- return `${fmt.prefix ?? ""}${rounded}${fmt.suffix ?? ""}`
+ if (fmt.format) {
+ return fmt.format(value)
}
- const adjusted = fmt.multiplier ? value * fmt.multiplier : value
- const rounded = Number.isFinite(adjusted) ? format3Sig(adjusted) : format3Sig(value)
+ let num = typeof value === "number" ? value : Number(value)
+ num = fmt.multiplier ? num * fmt.multiplier : num
+ const rounded =
+ Number.isFinite(num) && fmt.decimals !== undefined ? format3Sig(num) : format3Sig(value)
return `${fmt.prefix ?? ""}${rounded}${fmt.suffix ?? ""}`
}
diff --git a/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/index.tsx b/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/index.tsx
index 32870ec074..1dce19c1f0 100644
--- a/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/index.tsx
+++ b/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/index.tsx
@@ -287,9 +287,6 @@ export const MetricDetailsPopoverWrapper = memo(
const summary = useMemo(() => {
if (!stats) return "N/A"
- if (resolvedMetricType === "string" || resolvedMetricType === "object") {
- return "N/A"
- }
// Numeric metrics → mean
if (typeof (stats as any).mean === "number") {
return format3Sig(Number((stats as any).mean))
diff --git a/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/types.ts b/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/types.ts
index ae12092aad..1d06b45a69 100644
--- a/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/types.ts
+++ b/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/types.ts
@@ -6,7 +6,7 @@ export interface MetricDetailsPopoverProps {
primaryValue?: number | string
extraDimensions: Record<string, any>
/** Value to highlight (bin/bar will be inferred from this value) */
- highlightValue?: number | string | boolean | Array<number | string | boolean>
+ highlightValue?: number | string
/** Hide primitives key‒value table; useful for lightweight popovers */
hidePrimitiveTable?: boolean
/** Force using edge-axis (for debugging) */
diff --git a/web/ee/src/components/PostSignupForm/PostSignupForm.tsx b/web/ee/src/components/PostSignupForm/PostSignupForm.tsx
index d481fc3350..ef9c38778c 100644
--- a/web/ee/src/components/PostSignupForm/PostSignupForm.tsx
+++ b/web/ee/src/components/PostSignupForm/PostSignupForm.tsx
@@ -345,7 +345,7 @@ const PostSignupForm = () => {
<>
= ({settings, selectedTe
)}
>
- {settings
- .filter((field) => field.type !== "hidden")
- .map((field) => {
- const rules = [
- {required: field.required ?? true, message: "This field is required"},
- ]
+ {settings.map((field) => {
+ const rules = [
+ {required: field.required ?? true, message: "This field is required"},
+ ]
- return (
-
- {field.label}
- {field.description && (
-
-
-
- )}
-
- }
- initialValue={field.default}
- rules={rules}
- >
- {(field.type === "string" || field.type === "regex") &&
- selectedTestcase.testcase ? (
-
- option!.value
- .toUpperCase()
- .indexOf(inputValue.toUpperCase()) !== -1
- }
- />
- ) : field.type === "string" || field.type === "regex" ? (
-
- ) : field.type === "number" ? (
-
- ) : field.type === "boolean" || field.type === "bool" ? (
-
- ) : field.type === "text" ? (
-
- ) : field.type === "code" ? (
-
- ) : field.type === "multiple_choice" ? (
- ({
- label: option,
- value: option,
- }))}
- />
- ) : field.type === "object" ? (
-
- ) : null}
-
- )
- })}
+ return (
+
+ {field.label}
+ {field.description && (
+
+
+
+ )}
+
+ }
+ initialValue={field.default}
+ rules={rules}
+ >
+ {(field.type === "string" || field.type === "regex") &&
+ selectedTestcase.testcase ? (
+
+ option!.value
+ .toUpperCase()
+ .indexOf(inputValue.toUpperCase()) !== -1
+ }
+ />
+ ) : field.type === "string" || field.type === "regex" ? (
+
+ ) : field.type === "number" ? (
+
+ ) : field.type === "boolean" || field.type === "bool" ? (
+
+ ) : field.type === "text" ? (
+
+ ) : field.type === "code" ? (
+
+ ) : field.type === "multiple_choice" ? (
+ ({
+ label: option,
+ value: option,
+ }))}
+ />
+ ) : field.type === "object" ? (
+
+ ) : null}
+
+ )
+ })}
)
diff --git a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/DebugSection.tsx b/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/DebugSection.tsx
index 681db73a89..423eb57c45 100644
--- a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/DebugSection.tsx
+++ b/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/DebugSection.tsx
@@ -78,12 +78,12 @@ import {appSchemaAtom, appUriInfoAtom} from "@/oss/state/variant/atoms/fetcher"
import EvaluatorTestcaseModal from "./EvaluatorTestcaseModal"
import EvaluatorVariantModal from "./EvaluatorVariantModal"
import {buildVariantFromRevision} from "./variantUtils"
-
interface DebugSectionProps {
selectedTestcase: {
testcase: Record<string, any> | null
}
selectedVariant: EnhancedVariant
+ // NonNullable["variants"]>[number]
testsets: testset[] | null
traceTree: {
trace: Record<string, any> | string | null
@@ -108,17 +108,6 @@ interface DebugSectionProps {
}
const useStyles = createUseStyles((theme: JSSTheme) => ({
- "@global": {
- /* Make selection modal fit viewport/container with scrollable body */
- ".ant-modal .ant-modal-content": {
- maxHeight: "80vh",
- display: "flex",
- flexDirection: "column",
- },
- ".ant-modal .ant-modal-body": {
- overflow: "auto",
- },
- },
title: {
fontSize: theme.fontSizeLG,
fontWeight: theme.fontWeightMedium,
@@ -156,9 +145,6 @@ const useStyles = createUseStyles((theme: JSSTheme) => ({
},
}))
-const LAST_APP_KEY = "agenta:lastAppId"
-const LAST_VARIANT_KEY = "agenta:lastVariantId"
-
const DebugSection = ({
selectedTestcase,
selectedVariant: _selectedVariant,
@@ -209,10 +195,10 @@ const DebugSection = ({
const selectedVariant = useMemo(() => {
const revs = _selectedVariant?.revisions || []
+ // find the most recent revision by looking at the updatedAtTimestamp
const variant = revs?.sort((a, b) => b.updatedAtTimestamp - a.updatedAtTimestamp)[0]
return variant
}, [_selectedVariant])
-
const fallbackVariant = useMemo(() => {
if (_selectedVariant || !defaultAppId) return null
const revisionLists = Object.values(defaultRevisionMap || {})
@@ -230,60 +216,6 @@ const DebugSection = ({
return []
}, [variants, fallbackVariant])
- // Resolve current application object for display
- const selectedApp = useMemo(() => {
- const id = _selectedVariant?.appId || defaultAppId
- return availableApps.find((a: any) => a.app_id === id)
- }, [_selectedVariant?.appId, defaultAppId, availableApps])
-
- // Initialize from localStorage (remember last app/variant) with fallbacks
- useEffect(() => {
- // if parent already set a specific variant, respect it
- if (_selectedVariant) return
-
- const storedAppId =
- typeof window !== "undefined" ? localStorage.getItem(LAST_APP_KEY) : null
- const storedVariantId =
- typeof window !== "undefined" ? localStorage.getItem(LAST_VARIANT_KEY) : null
-
- let nextVariant: Variant | null = null
-
- // 1) Try to find an existing variant matching stored ids among provided or fallback variants
- const searchPool: Variant[] = [...(variants || []), ...(derivedVariants || [])].filter(
- Boolean,
- ) as Variant[]
-
- if (storedVariantId) {
- nextVariant = searchPool.find((v) => (v as any)?.variantId === storedVariantId) || null
- }
-
- // 2) If not found by variant, but we have an app id, try first variant under that app
- if (!nextVariant && storedAppId) {
- nextVariant = searchPool.find((v) => (v as any)?.appId === storedAppId) || null
- }
-
- // 3) Finally fall back to first available variant in our computed list
- if (!nextVariant && searchPool.length > 0) {
- nextVariant = searchPool[0]
- }
-
- if (nextVariant) {
- setSelectedVariant(nextVariant)
- }
- }, [_selectedVariant, variants, derivedVariants, setSelectedVariant])
-
- // Persist whenever the working selectedVariant changes
- useEffect(() => {
- const v = _selectedVariant as any
- if (!v) return
- try {
- if (v.appId) localStorage.setItem(LAST_APP_KEY, v.appId)
- if (v.variantId) localStorage.setItem(LAST_VARIANT_KEY, v.variantId)
- } catch {
- // ignore storage errors (private mode, etc.)
- }
- }, [_selectedVariant])
-
useEffect(() => {
if (_selectedVariant) return
if (derivedVariants.length > 0) {
@@ -354,34 +286,8 @@ const DebugSection = ({
setIsLoadingResult(true)
const settingsValues = form.getFieldValue("settings_values") || {}
- let normalizedSettings = {...settingsValues}
-
- if (typeof normalizedSettings.json_schema === "string") {
- try {
- const parsed = JSON.parse(normalizedSettings.json_schema)
- if (!parsed || typeof parsed !== "object" || Array.isArray(parsed)) {
- throw new Error()
- }
- normalizedSettings.json_schema = parsed
- } catch {
- message.error("JSON schema must be a valid JSON object")
- setEvalOutputStatus({success: false, error: true})
- setIsLoadingResult(false)
- return
- }
- } else if (
- normalizedSettings.json_schema &&
- (typeof normalizedSettings.json_schema !== "object" ||
- Array.isArray(normalizedSettings.json_schema))
- ) {
- message.error("JSON schema must be a valid JSON object")
- setEvalOutputStatus({success: false, error: true})
- setIsLoadingResult(false)
- return
- }
-
const {testcaseObj, evalMapObj} = mapTestcaseAndEvalValues(
- normalizedSettings,
+ settingsValues,
selectedTestcase.testcase,
)
@@ -406,6 +312,7 @@ const DebugSection = ({
? correctAnswerKey.split(".")[1]
: correctAnswerKey
+ // Normalize ground_truth and prediction to compact, comparable strings
const normalizeCompact = (val: any) => {
try {
if (val === undefined || val === null) return ""
@@ -426,7 +333,9 @@ const DebugSection = ({
outputs = {
...outputs,
+ // Include all testcase fields so evaluators can access them directly (e.g., {{topic}})
...selectedTestcase.testcase,
+ // Set both ground_truth and the specific correct answer key for compatibility
ground_truth,
[groundTruthKey]: ground_truth,
prediction,
@@ -436,7 +345,7 @@ const DebugSection = ({
const runResponse = await createEvaluatorRunExecution(selectedEvaluator.key, {
inputs: outputs,
- settings: transformTraceKeysInSettings(normalizedSettings),
+ settings: transformTraceKeysInSettings(settingsValues),
...(selectedEvaluator.requires_llm_api_keys || settingsValues?.requires_llm_api_keys
? {credentials: apiKeyObject(secrets)}
: {}),
@@ -447,7 +356,8 @@ const DebugSection = ({
} catch (error: any) {
console.error(error)
setEvalOutputStatus({success: false, error: true})
- if (error.response?.data?.detail) {
+ if (error.response.data.detail) {
+ // Handle both string and object error details properly
const errorDetail =
typeof error.response.data.detail === "string"
? error.response.data.detail
@@ -491,8 +401,12 @@ const DebugSection = ({
const isCustomBySchema = Boolean(spec) && !hasInputsProp && !hasMessagesProp
const isCustom = Boolean(flags?.isCustom) || isCustomBySchema
+ // Build effective input keys
let effectiveKeys: string[] = []
if (isCustom) {
+ // For custom workflows, use top-level schema keys
+ // Do not strip "context" here; callVariant will attach project context under root
+ // and any variable keys are handled by transformToRequestBody.
effectiveKeys = spec ? extractInputKeysFromSchema(spec, routePath) : []
} else {
const fromParams = (() => {
@@ -515,8 +429,10 @@ const DebugSection = ({
).filter((k) => k && k !== "chat")
}
+ // Parameter definitions: mark as non-input so callVariant nests under inputs for non-custom
params.inputs = (effectiveKeys || []).map((name) => ({name, input: false}))
+ // Optional parameters/body extras: prefer stable transform snapshot
const baseParameters = isPlainObject(stableTransformedParams)
? {...stableTransformedParams}
: transformToRequestBody({
@@ -530,6 +446,7 @@ const DebugSection = ({
routePath,
) || []
: [],
+ // Keep request shape aligned with OpenAPI schema
isChat: hasMessagesProp,
isCustom,
customProperties: isCustom ? customProps : undefined,
@@ -552,7 +469,7 @@ const DebugSection = ({
: undefined
if (variantAgConfig) {
- ;(baseParameters as any).ag_config = variantAgConfig
+ baseParameters.ag_config = variantAgConfig
}
}
@@ -579,6 +496,7 @@ const DebugSection = ({
params.isCustom = selectedVariant?.isCustom
}
+ // Filter testcase down to allowed keys only (exclude chat)
const testcaseDict = selectedTestcase.testcase
const allowed = new Set((params.inputs || []).map((p) => p.name))
const filtered = Object.fromEntries(
@@ -615,7 +533,7 @@ const DebugSection = ({
const {trace, tree, data} = result
setVariantResult(getStringOrJson(data))
- if (trace?.spans) {
+ if (trace && trace?.spans) {
setTraceTree({
trace: transformTraceTreeToJson(
fromBaseResponseToTraceSpanType(trace.spans, trace.trace_id)[0],
@@ -624,16 +542,17 @@ const DebugSection = ({
}
if (tree) {
- const tTree = tree.nodes
+ const traceTree = tree.nodes
.flatMap((node: AgentaNodeDTO) => buildNodeTree(node))
.flatMap((item: any) => observabilityTransformer(item))
.map((item) => {
const {key, children, ...trace} = item
+
return trace
})[0]
setTraceTree({
- trace: buildNodeTreeV3(tTree),
+ trace: buildNodeTreeV3(traceTree),
})
}
setVariantStatus({success: true, error: false})
@@ -644,7 +563,7 @@ const DebugSection = ({
if (!controller.signal.aborted) {
console.error("error: ", error)
message.error(error.message)
- if (error.response?.data?.detail) {
+ if (error.response.data.detail) {
setVariantResult(getStringOrJson(error.response.data.detail))
} else {
setVariantResult("Error occured")
@@ -690,10 +609,6 @@ const DebugSection = ({
}
}
- // Helper to print "App / Variant" nicely
- const appName = selectedApp?.name || selectedApp?.app_name || "app"
- const variantName = selectedVariant?.variantName || "variant"
-
return (
@@ -811,8 +726,7 @@ const DebugSection = ({
{
key: "change_variant",
icon:
,
- // Updated copy
- label: "Change application",
+ label: "Change Variant",
onClick: () => setOpenVariantModal(true),
},
],
@@ -827,8 +741,8 @@ const DebugSection = ({
}
>
- {/* Show "App / Variant" */}
- Run application ({appName}/{variantName})
+ {/* Adding key above ensures React re-renders this label when variant changes */}
+ Run {selectedVariant?.variantName || "variant"}
)}
@@ -976,15 +890,7 @@ const DebugSection = ({
variants={derivedVariants}
open={openVariantModal}
onCancel={() => setOpenVariantModal(false)}
- setSelectedVariant={(v) => {
- setSelectedVariant(v)
- // eager persist on selection from modal
- try {
- if ((v as any)?.appId) localStorage.setItem(LAST_APP_KEY, (v as any).appId)
- if ((v as any)?.variantId)
- localStorage.setItem(LAST_VARIANT_KEY, (v as any).variantId)
- } catch {}
- }}
+ setSelectedVariant={setSelectedVariant}
selectedVariant={selectedVariant}
selectedTestsetId={selectedTestset}
/>
diff --git a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/DynamicFormField.tsx b/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/DynamicFormField.tsx
index 52e34ce6ae..1d1a9d2761 100644
--- a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/DynamicFormField.tsx
+++ b/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/DynamicFormField.tsx
@@ -11,7 +11,6 @@ import {isValidRegex} from "@/oss/lib/helpers/validators"
import {generatePaths} from "@/oss/lib/transformers"
import {EvaluationSettingsTemplate, JSSTheme} from "@/oss/lib/Types"
-import {JSONSchemaEditor} from "./JSONSchema"
import {Messages} from "./Messages"
type DynamicFormFieldProps = EvaluationSettingsTemplate & {
@@ -108,8 +107,6 @@ export const DynamicFormField: React.FC = ({
const classes = useStyles()
const {token} = theme.useToken()
- const watched = Form.useWatch(name as any, form)
- const savedValue = watched ?? defaultVal
const handleValueChange = useCallback(
(next: string) => {
if (form) {
@@ -122,7 +119,6 @@ export const DynamicFormField: React.FC = ({
)
const rules: Rule[] = [{required: required ?? true, message: "This field is required"}]
-
if (type === "regex")
rules.push({
validator: (_, value) =>
@@ -205,16 +201,6 @@ export const DynamicFormField: React.FC = ({
value={settingsValue}
onChange={handleValueChange}
/>
- ) : type === "llm_response_schema" ? (
-     <JSONSchemaEditor form={form} name={name} defaultValue={savedValue} />
) : null}
)}
diff --git a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/JSONSchemaEditor.tsx b/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/JSONSchemaEditor.tsx
deleted file mode 100644
index e17f145e42..0000000000
--- a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/JSONSchemaEditor.tsx
+++ /dev/null
@@ -1,449 +0,0 @@
-import {useCallback, useEffect, useMemo, useRef, useState} from "react"
-import {
- Button,
- Checkbox,
- Flex,
- Form,
- FormInstance,
- Input,
- InputNumber,
- Select,
- Space,
- Typography,
- Alert,
- Tooltip,
- Modal,
-} from "antd"
-import {DeleteOutlined, InfoCircleOutlined, PlusOutlined} from "@ant-design/icons"
-import {createUseStyles} from "react-jss"
-import {useLocalStorage} from "usehooks-ts"
-
-import SharedEditor from "@/oss/components/Playground/Components/SharedEditor"
-import {JSSTheme} from "@/oss/lib/Types"
-
-import {
- generateJSONSchema,
- isSchemaCompatibleWithBasicMode,
- parseJSONSchema,
-} from "./JSONSchemaGenerator"
-import {CategoricalOption, ResponseFormatType, SchemaConfig} from "./types"
-
-interface JSONSchemaEditorProps {
- form: FormInstance
- name: string | string[]
- defaultValue?: string
-}
-
-const createDefaultCategories = (): CategoricalOption[] => [
- {name: "good", description: "The response is good"},
- {name: "bad", description: "The response is bad"},
-]
-
-const useStyles = createUseStyles((theme: JSSTheme) => ({
- editor: {
- border: `1px solid ${theme.colorBorder}`,
- borderRadius: theme.borderRadius,
- overflow: "hidden",
- "& .monaco-editor": {
- width: "0 !important",
- },
- },
- categoryItem: {
- display: "flex",
- gap: theme.marginXS,
- alignItems: "flex-start",
- marginBottom: theme.marginXS,
- },
-}))
-
-export const JSONSchemaEditor: React.FC<JSONSchemaEditorProps> = ({form, name, defaultValue}) => {
- const [modal, contextHolder] = Modal.useModal()
- const classes = useStyles()
- const [mode, setMode] = useState<"basic" | "advanced">("basic")
-
- // Basic mode state
- const [responseFormat, setResponseFormat] = useState<ResponseFormatType>("boolean")
- const [includeReasoning, setIncludeReasoning] = useState(false)
- const [minValue, setMinValue] = useState(0)
- const [maxValue, setMaxValue] = useState(10)
- const [categories, setCategories] = useState(createDefaultCategories())
-
- // Advanced mode state
- const [rawSchema, setRawSchema] = useState(defaultValue ?? "")
- const [supportsBasicMode, setSupportsBasicMode] = useState(() => {
- if (!defaultValue) {
- return true
- }
-
- return isSchemaCompatibleWithBasicMode(defaultValue)
- })
-
- const lastSyncedValueRef = useRef<string | undefined>(undefined)
-
- const namePath = useMemo(() => (Array.isArray(name) ? name : [name]), [name])
-
- const applyParsedConfig = useCallback((parsed: SchemaConfig) => {
- setResponseFormat(parsed.responseFormat)
- setIncludeReasoning(parsed.includeReasoning)
-
- if (parsed.continuousConfig) {
- setMinValue(parsed.continuousConfig.minimum)
- setMaxValue(parsed.continuousConfig.maximum)
- }
-
- if (parsed.categoricalOptions && parsed.categoricalOptions.length > 0) {
- setCategories(parsed.categoricalOptions)
- } else {
- setCategories(createDefaultCategories())
- }
- }, [])
-
- const syncFormValue = useCallback(
- (value: string) => {
- const current = form.getFieldValue(namePath)
- if (current === value && lastSyncedValueRef.current === value) return
-
- form.setFieldValue(namePath, value)
- lastSyncedValueRef.current = value
- },
- [form, namePath],
- )
-
- const getDefaultConfig = useCallback((): SchemaConfig => {
- return {
- responseFormat: "boolean",
- includeReasoning: false,
- continuousConfig: {minimum: 0, maximum: 10},
- categoricalOptions: createDefaultCategories(),
- }
- }, [])
-
- const applyConfigAndSync = useCallback(
- (config: SchemaConfig) => {
- applyParsedConfig(config)
- const schemaString = JSON.stringify(generateJSONSchema(config), null, 2)
- setRawSchema(schemaString)
- syncFormValue(schemaString)
- setSupportsBasicMode(true)
- },
- [applyParsedConfig, syncFormValue],
- )
-
- // Initialize from default value
- useEffect(() => {
- if (!defaultValue) {
- setSupportsBasicMode(true)
- setRawSchema("")
- return
- }
-
- if (lastSyncedValueRef.current === defaultValue) {
- return
- }
-
- const parsed = parseJSONSchema(defaultValue)
- if (parsed) applyParsedConfig(parsed)
-
- setSupportsBasicMode(isSchemaCompatibleWithBasicMode(defaultValue))
- setRawSchema(defaultValue)
- }, [defaultValue, applyParsedConfig])
-
- useEffect(() => {
- if (!supportsBasicMode && mode !== "advanced") {
- setMode("advanced")
- }
- }, [supportsBasicMode, mode])
-
- // Update form when basic mode changes
- useEffect(() => {
- if (mode === "basic" && supportsBasicMode) {
- const config: SchemaConfig = {
- responseFormat,
- includeReasoning,
- continuousConfig: {minimum: minValue, maximum: maxValue},
- categoricalOptions: categories,
- }
- const schema = generateJSONSchema(config)
- const schemaString = JSON.stringify(schema, null, 2)
-
- syncFormValue(schemaString)
- }
- }, [
- mode,
- responseFormat,
- includeReasoning,
- minValue,
- maxValue,
- categories,
- supportsBasicMode,
- syncFormValue,
- ])
-
- const handleModeSwitch = (newMode: "basic" | "advanced") => {
- if (newMode === mode) {
- return
- }
-
- if (newMode === "advanced" && mode === "basic") {
- const config: SchemaConfig = {
- responseFormat,
- includeReasoning,
- continuousConfig: {minimum: minValue, maximum: maxValue},
- categoricalOptions: categories,
- }
- const schema = generateJSONSchema(config)
- const schemaString = JSON.stringify(schema, null, 2)
- setRawSchema(schemaString)
- syncFormValue(schemaString)
- setSupportsBasicMode(true)
- setMode("advanced")
- return
- }
-
- if (newMode === "basic" && mode === "advanced") {
- if (!supportsBasicMode) {
- modal.confirm({
- title: "Switch to basic mode?",
- content:
- "Switching to basic mode will reset your advanced configuration. Are you sure?",
- okText: "Switch",
- cancelText: "Cancel",
- onOk: () => {
- const parsed = parseJSONSchema(rawSchema)
- const config = parsed ?? getDefaultConfig()
- applyConfigAndSync(config)
- setMode("basic")
- },
- })
- return
- }
-
- const parsed = parseJSONSchema(rawSchema)
- const config = parsed ?? getDefaultConfig()
- applyConfigAndSync(config)
- setMode("basic")
- return
- }
-
- setMode(newMode)
- }
-
- const addCategory = () => {
- setCategories([...categories, {name: "", description: ""}])
- }
-
- const removeCategory = (index: number) => {
- setCategories(categories.filter((_, i) => i !== index))
- }
-
- const updateCategory = (index: number, field: "name" | "description", value: string) => {
- const updated = [...categories]
- updated[index][field] = value
- setCategories(updated)
- }
-
- if (mode === "advanced") {
- return (
- <>
-
-
- Configuration (Advanced Mode)
-
- handleModeSwitch("basic")}>
- Basic Mode
-
-
-
-
- {
- if (value !== undefined) {
- setRawSchema(value)
- setSupportsBasicMode(
- value ? isSchemaCompatibleWithBasicMode(value) : false,
- )
-
- if (Array.isArray(name)) {
- form.setFieldValue(name, value)
- } else {
- form.setFieldValue([name], value)
- }
- }
- }}
- editorProps={{
- codeOnly: true,
- language: "json",
- }}
- syncWithInitialValueChanges={true}
- />
-
- {contextHolder}
- >
- )
- }
-
- // Basic Mode
- return (
- <>
-
-
- Feedback Configuration
-
- handleModeSwitch("advanced")}>
- Advanced Mode
-
-
-
-
-
- {/* Response Format */}
-
-
- Response Format
-
-
-
-
-
setResponseFormat(value)}
- options={[
- {label: "Boolean (True/False)", value: "boolean"},
- {label: "Continuous (Numeric Range)", value: "continuous"},
- {label: "Categorical (Predefined Options)", value: "categorical"},
- ]}
- />
-
-
- {/* Conditional fields based on response format */}
- {responseFormat === "boolean" && (
-
- )}
-
- {responseFormat === "continuous" && (
-
-
-
- Minimum
-
-
-
-
-
setMinValue(value ?? 0)}
- />
-
-
-
- Maximum
-
-
-
-
-
setMaxValue(value ?? 10)}
- />
-
-
- )}
-
- {responseFormat === "categorical" && (
-
-
- Categories
-
-
-
-
- {categories.map((category, index) => (
-
-
- updateCategory(index, "name", e.target.value)
- }
- style={{width: 150}}
- />
-
- updateCategory(index, "description", e.target.value)
- }
- style={{flex: 1}}
- />
- }
- onClick={() => removeCategory(index)}
- disabled={categories.length <= 1}
- />
-
- ))}
-
}
- onClick={addCategory}
- style={{width: "100%"}}
- >
- Add Category
-
-
- )}
-
- {/* Include Reasoning */}
-
- setIncludeReasoning(e.target.checked)}
- >
- Include reasoning
-
-
-
-
-
-
-
- {contextHolder}
- >
- )
-}
diff --git a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/JSONSchemaGenerator.ts b/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/JSONSchemaGenerator.ts
deleted file mode 100644
index b6acddb008..0000000000
--- a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/JSONSchemaGenerator.ts
+++ /dev/null
@@ -1,152 +0,0 @@
-import deepEqual from "fast-deep-equal"
-import {GeneratedJSONSchema, SchemaConfig} from "./types"
-
-export function isSchemaCompatibleWithBasicMode(schemaString: string): boolean {
- const config = parseJSONSchema(schemaString)
-
- if (!config) {
- return false
- }
-
- try {
- const parsed = JSON.parse(schemaString)
- const normalizedOriginalSchema = parsed.schema || parsed
- const regeneratedSchema = generateJSONSchema(config).schema
-
- return deepEqual(normalizedOriginalSchema, regeneratedSchema)
- } catch {
- return false
- }
-}
-
-export function generateJSONSchema(config: SchemaConfig): GeneratedJSONSchema {
- const {responseFormat, includeReasoning, continuousConfig, categoricalOptions} = config
-
- const properties: Record<string, any> = {}
- const required: string[] = ["correctness"]
-
- // Base description is always "The grade results"
- const baseDescription = "The grade results"
-
- // Add the main correctness field based on response format
- switch (responseFormat) {
- case "continuous":
- properties.correctness = {
- type: "number",
- description: baseDescription,
- minimum: continuousConfig?.minimum ?? 0,
- maximum: continuousConfig?.maximum ?? 10,
- }
- break
-
- case "boolean":
- properties.correctness = {
- type: "boolean",
- description: baseDescription,
- }
- break
-
- case "categorical":
- if (categoricalOptions && categoricalOptions.length > 0) {
- const enumValues = categoricalOptions.map((opt) => opt.name)
- const categoryDescriptions = categoricalOptions
- .map((opt) => `"${opt.name}": ${opt.description}`)
- .join("| ")
-
- properties.correctness = {
- type: "string",
- description: `${baseDescription}. Categories: ${categoryDescriptions}`,
- enum: enumValues,
- }
- } else {
- // Fallback if no categories defined
- properties.correctness = {
- type: "string",
- description: baseDescription,
- }
- }
- break
- }
-
- // Add reasoning field if requested
- if (includeReasoning) {
- properties.comment = {
- type: "string",
- description: "Reasoning for the score",
- }
- required.push("comment")
- }
-
- return {
- name: "schema",
- schema: {
- title: "extract",
- description: "Extract information from the user's response.",
- type: "object",
- properties,
- required,
- strict: true,
- },
- }
-}
-
-export function parseJSONSchema(schemaString: string): SchemaConfig | null {
- try {
- const parsed = JSON.parse(schemaString)
-
- // Handle both old format (direct schema) and new format (with name wrapper)
- const schema = parsed.schema || parsed
-
- if (!schema.properties || !schema.properties.correctness) {
- return null
- }
-
- const correctness = schema.properties.correctness
- const hasReasoning = !!schema.properties.comment
-
- let responseFormat: SchemaConfig["responseFormat"] = "boolean"
- let continuousConfig: SchemaConfig["continuousConfig"]
- let categoricalOptions: SchemaConfig["categoricalOptions"]
-
- if (correctness.type === "number") {
- responseFormat = "continuous"
- continuousConfig = {
- minimum: correctness.minimum ?? 0,
- maximum: correctness.maximum ?? 10,
- }
- } else if (correctness.type === "boolean") {
- responseFormat = "boolean"
- } else if (correctness.type === "string" && correctness.enum) {
- responseFormat = "categorical"
-
- // Parse category descriptions from the description field
- const desc = correctness.description || ""
- const categoriesMatch = desc.match(/Categories: (.+)/)
-
- if (categoriesMatch) {
- const categoriesStr = categoriesMatch[1]
- const categoryPairs = categoriesStr.split("| ")
-
- categoricalOptions = correctness.enum.map((name: string) => {
- const pair = categoryPairs.find((p: string) => p.startsWith(`"${name}":`))
- const description = pair ? pair.split(": ")[1] || "" : ""
- return {name, description}
- })
- } else {
- categoricalOptions = correctness.enum.map((name: string) => ({
- name,
- description: "",
- }))
- }
- }
-
- return {
- responseFormat,
- includeReasoning: hasReasoning,
- continuousConfig,
- categoricalOptions,
- }
- } catch {
- return null
- }
-}
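The generator/parser pair removed above decided basic-mode compatibility via a round trip: parse the schema string, regenerate a schema from the parsed config, and deep-compare it with the original. A minimal sketch of that idea, simplified to the boolean case and using JSON stringification as a stand-in for `fast-deep-equal` (stricter, since it is key-order sensitive); the names here are illustrative, not the shipped API:

```typescript
type SchemaConfig = {responseFormat: "boolean"; includeReasoning: boolean}

// Build the JSON schema for a given basic-mode config.
function generateSchema(config: SchemaConfig) {
    const properties: Record<string, unknown> = {
        correctness: {type: "boolean", description: "The grade results"},
    }
    const required = ["correctness"]
    if (config.includeReasoning) {
        properties.comment = {type: "string", description: "Reasoning for the score"}
        required.push("comment")
    }
    return {type: "object", properties, required}
}

// Recover a config from a schema string; null means "not parseable".
function parseSchema(schemaString: string): SchemaConfig | null {
    try {
        const parsed = JSON.parse(schemaString)
        if (parsed?.properties?.correctness?.type !== "boolean") return null
        return {responseFormat: "boolean", includeReasoning: !!parsed.properties.comment}
    } catch {
        return null
    }
}

// Compatible iff parsing then regenerating reproduces the original schema.
function isCompatibleWithBasicMode(schemaString: string): boolean {
    const config = parseSchema(schemaString)
    if (!config) return false
    return JSON.stringify(JSON.parse(schemaString)) === JSON.stringify(generateSchema(config))
}

const roundTrip = JSON.stringify(generateSchema({responseFormat: "boolean", includeReasoning: true}))
console.log(isCompatibleWithBasicMode(roundTrip)) // true
console.log(isCompatibleWithBasicMode('{"type":"object","properties":{"score":{"type":"number"}}}')) // false
```

Because the comparison is exact, any hand edit the generator cannot reproduce flips the check to false, which matches the deleted component's behavior of locking into advanced mode until the user confirms a reset.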
diff --git a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/index.ts b/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/index.ts
deleted file mode 100644
index 9447df2662..0000000000
--- a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/index.ts
+++ /dev/null
@@ -1,3 +0,0 @@
-export {JSONSchemaEditor} from "./JSONSchemaEditor"
-export * from "./types"
-export * from "./JSONSchemaGenerator"
diff --git a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/types.ts b/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/types.ts
deleted file mode 100644
index 7b758b77a3..0000000000
--- a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/types.ts
+++ /dev/null
@@ -1,38 +0,0 @@
-export type ResponseFormatType = "continuous" | "boolean" | "categorical"
-
-export interface ContinuousConfig {
- minimum: number
- maximum: number
-}
-
-export interface CategoricalOption {
- name: string
- description: string
-}
-
-export interface SchemaConfig {
- responseFormat: ResponseFormatType
- includeReasoning: boolean
- continuousConfig?: ContinuousConfig
- categoricalOptions?: CategoricalOption[]
-}
-
-export interface JSONSchemaProperty {
- type: string
- description: string
- minimum?: number
- maximum?: number
- enum?: string[]
-}
-
-export interface GeneratedJSONSchema {
- name: string
- schema: {
- title: string
- description: string
- type: "object"
- properties: Record<string, JSONSchemaProperty>
- required: string[]
- strict: boolean
- }
-}
diff --git a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/index.tsx b/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/index.tsx
index a3ab293953..7c27c58fae 100644
--- a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/index.tsx
+++ b/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/index.tsx
@@ -6,14 +6,7 @@ import dynamic from "next/dynamic"
import {createUseStyles} from "react-jss"
import {useAppId} from "@/oss/hooks/useAppId"
-import {
- EvaluationSettingsTemplate,
- Evaluator,
- EvaluatorConfig,
- JSSTheme,
- testset,
- Variant,
-} from "@/oss/lib/Types"
+import {Evaluator, EvaluatorConfig, JSSTheme, testset, Variant} from "@/oss/lib/Types"
import {
CreateEvaluationConfigData,
createEvaluatorConfig,
@@ -123,145 +116,55 @@ const ConfigureEvaluator = ({
trace: null,
})
- const evaluatorVersionNumber = useMemo(() => {
- const raw =
- editEvalEditValues?.settings_values?.version ??
- selectedEvaluator?.settings_template?.version?.default ??
- 3
-
- if (typeof raw === "number") return raw
- // extract leading number (e.g., "4", "4.1", "v4")
- const match = String(raw).match(/\d+(\.\d+)?/)
- return match ? parseFloat(match[0]) : 3
- }, [editEvalEditValues?.settings_values?.version, selectedEvaluator])
-
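The deleted `evaluatorVersionNumber` memo above tolerates several version encodings before falling back to 3. The same extraction logic, pulled out as a pure helper (the helper name is ours, the logic mirrors the deleted memo):

```typescript
// Extract a leading numeric version from values like 4, "4.1", or "v4";
// fall back when nothing numeric is found.
const toVersionNumber = (raw: unknown, fallback = 3): number => {
    if (typeof raw === "number") return raw
    const match = String(raw).match(/\d+(\.\d+)?/)
    return match ? parseFloat(match[0]) : fallback
}

console.log(toVersionNumber("v4.1")) // 4.1
console.log(toVersionNumber(2))      // 2
console.log(toVersionNumber("beta")) // 3
```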
- const evalFields = useMemo(() => {
- const templateEntries = Object.entries(selectedEvaluator?.settings_template || {})
- const allowStructuredOutputs = evaluatorVersionNumber >= 4
-
- return templateEntries.reduce(
- (acc, [key, field]) => {
- const f = field as Partial<EvaluationSettingsTemplate> | undefined
- if (!f?.type) return acc
- if (!allowStructuredOutputs && (key === "json_schema" || key === "response_type")) {
- return acc
- }
- acc.push({
+ const evalFields = useMemo(
+ () =>
+ Object.keys(selectedEvaluator?.settings_template || {})
+ .filter((key) => !!selectedEvaluator?.settings_template[key]?.type)
+ .map((key) => ({
key,
- ...(f as EvaluationSettingsTemplate),
- advanced: Boolean((f as any)?.advanced),
- })
- return acc
- },
- [] as Array<EvaluationSettingsTemplate & {key: string; advanced: boolean}>,
- )
- }, [selectedEvaluator, evaluatorVersionNumber])
+ ...selectedEvaluator?.settings_template[key]!,
+ advanced: selectedEvaluator?.settings_template[key]?.advanced || false,
+ })),
+ [selectedEvaluator],
+ )
const advancedSettingsFields = evalFields.filter((field) => field.advanced)
const basicSettingsFields = evalFields.filter((field) => !field.advanced)
- const onSubmit = async (values: CreateEvaluationConfigData) => {
+ const onSubmit = (values: CreateEvaluationConfigData) => {
try {
setSubmitLoading(true)
if (!selectedEvaluator.key) throw new Error("No selected key")
const settingsValues = values.settings_values || {}
- const jsonSchemaFieldPath: Array<string> = ["settings_values", "json_schema"]
- const hasJsonSchema = Object.prototype.hasOwnProperty.call(
- settingsValues,
- "json_schema",
- )
-
- if (hasJsonSchema) {
- form.setFields([{name: jsonSchemaFieldPath, errors: []}])
-
- if (typeof settingsValues.json_schema === "string") {
- try {
- const parsed = JSON.parse(settingsValues.json_schema)
- if (!parsed || typeof parsed !== "object" || Array.isArray(parsed)) {
- throw new Error()
- }
- settingsValues.json_schema = parsed
- } catch {
- form.setFields([
- {
- name: jsonSchemaFieldPath,
- errors: ["Enter a valid JSON object"],
- },
- ])
- throw new Error("JSON schema must be a valid JSON object")
- }
- } else if (
- settingsValues.json_schema &&
- (typeof settingsValues.json_schema !== "object" ||
- Array.isArray(settingsValues.json_schema))
- ) {
- form.setFields([
- {
- name: jsonSchemaFieldPath,
- errors: ["Enter a valid JSON object"],
- },
- ])
- throw new Error("JSON schema must be a valid JSON object")
- }
- }
-
const data = {
...values,
evaluator_key: selectedEvaluator.key,
settings_values: settingsValues,
}
- if (editMode) {
- await updateEvaluatorConfig(editEvalEditValues?.id!, data)
-
- setEditEvalEditValues((previous) =>
- previous
- ? {
- ...previous,
- ...data,
- settings_values: settingsValues,
- }
- : previous,
- )
- } else {
- const response = await createEvaluatorConfig(appId, data)
- const createdConfig = response?.data
-
- if (createdConfig) {
- setEditEvalEditValues(createdConfig)
- setEditMode(true)
- }
- }
-
- onSuccess()
+ ;(editMode
+ ? updateEvaluatorConfig(editEvalEditValues?.id!, data)
+ : createEvaluatorConfig(appId, data)
+ )
+ .then(onSuccess)
+ .catch(console.error)
+ .finally(() => setSubmitLoading(false))
} catch (error: any) {
- if (error?.errorFields) return
+ setSubmitLoading(false)
console.error(error)
message.error(error.message)
- } finally {
- setSubmitLoading(false)
}
}
useEffect(() => {
- // Reset form before loading new values so there are no stale values
form.resetFields()
-
- if (editMode && editEvalEditValues) {
- // Load all values including nested settings_values
- form.setFieldsValue({
- ...editEvalEditValues,
- settings_values: editEvalEditValues.settings_values || {},
- })
- } else if (cloneConfig && editEvalEditValues) {
- // When cloning, copy only settings_values and clear the name so user provides a new name
- form.setFieldsValue({
- settings_values: editEvalEditValues.settings_values || {},
- name: "",
- })
+ if (editMode) {
+ form.setFieldsValue(editEvalEditValues)
+ } else if (cloneConfig) {
+ form.setFieldValue("settings_values", editEvalEditValues?.settings_values)
}
- }, [editMode, cloneConfig, editEvalEditValues, form])
+ }, [editMode, cloneConfig])
return (
diff --git a/web/ee/src/components/pages/settings/Billing/index.tsx b/web/ee/src/components/pages/settings/Billing/index.tsx
index 3a5ec92157..fec538eac0 100644
--- a/web/ee/src/components/pages/settings/Billing/index.tsx
+++ b/web/ee/src/components/pages/settings/Billing/index.tsx
@@ -104,7 +104,7 @@ const Billing = () => {
{Object.entries(usage)
- ?.filter(([key]) => (key !== "users" && key !== "applications"))
+ ?.filter(([key]) => key !== "users")
?.map(([key, info]) => {
return (
(typeof msg === "string" ? msg : JSON.stringify(msg)))
- .join("\n")
- : (value?.toString() ?? "-")
- case "multiple_choice":
- return Array.isArray(value) ? value.join(", ") : (value?.toString() ?? "-")
- case "hidden":
- return "-"
default:
- return value?.toString() ?? "-"
+ return value?.toString()
}
}
diff --git a/web/ee/src/lib/hooks/useEvaluationRunData/refreshLiveRun.ts b/web/ee/src/lib/hooks/useEvaluationRunData/refreshLiveRun.ts
index 2d9a63f16b..dca3a7eab1 100644
--- a/web/ee/src/lib/hooks/useEvaluationRunData/refreshLiveRun.ts
+++ b/web/ee/src/lib/hooks/useEvaluationRunData/refreshLiveRun.ts
@@ -84,7 +84,7 @@ export const refreshLiveEvaluationRun = async (runId: string): Promise (o ? o[key] : undefined), obj)
}
export function computeInputsAndGroundTruth({
diff --git a/web/ee/src/state/observability/dashboard.ts b/web/ee/src/state/observability/dashboard.ts
index e8c0215a71..6d040b22d7 100644
--- a/web/ee/src/state/observability/dashboard.ts
+++ b/web/ee/src/state/observability/dashboard.ts
@@ -3,9 +3,9 @@ import {eagerAtom} from "jotai-eager"
import {atomWithQuery} from "jotai-tanstack-query"
import {GenerationDashboardData} from "@/oss/lib/types_ee"
+import {fetchGenerationsDashboardData} from "@/oss/services/tracing/api"
import {routerAppIdAtom} from "@/oss/state/app/atoms/fetcher"
import {projectIdAtom} from "@/oss/state/project"
-import {fetchGenerationsDashboardData} from "@/oss/services/tracing/api"
const DEFAULT_RANGE = "30_days"
diff --git a/web/oss/package.json b/web/oss/package.json
index 6d4e247c86..c773e5dccd 100644
--- a/web/oss/package.json
+++ b/web/oss/package.json
@@ -1,6 +1,6 @@
{
"name": "@agenta/oss",
- "version": "0.60.2",
+ "version": "0.60.0",
"private": true,
"engines": {
"node": ">=18"
diff --git a/web/oss/public/assets/Agenta-logo-full-dark-accent.png b/web/oss/public/assets/Agenta-logo-full-dark-accent.png
deleted file mode 100644
index c14833dab1..0000000000
Binary files a/web/oss/public/assets/Agenta-logo-full-dark-accent.png and /dev/null differ
diff --git a/web/oss/public/assets/Agenta-logo-full-light.png b/web/oss/public/assets/Agenta-logo-full-light.png
deleted file mode 100644
index 4c9b31a813..0000000000
Binary files a/web/oss/public/assets/Agenta-logo-full-light.png and /dev/null differ
diff --git a/web/oss/public/assets/dark-complete-transparent-CROPPED.png b/web/oss/public/assets/dark-complete-transparent-CROPPED.png
new file mode 100644
index 0000000000..7d134ac59a
Binary files /dev/null and b/web/oss/public/assets/dark-complete-transparent-CROPPED.png differ
diff --git a/web/oss/public/assets/dark-complete-transparent_white_logo.png b/web/oss/public/assets/dark-complete-transparent_white_logo.png
new file mode 100644
index 0000000000..8685bbf981
Binary files /dev/null and b/web/oss/public/assets/dark-complete-transparent_white_logo.png differ
diff --git a/web/oss/public/assets/dark-logo.svg b/web/oss/public/assets/dark-logo.svg
new file mode 100644
index 0000000000..6cb8ef3330
--- /dev/null
+++ b/web/oss/public/assets/dark-logo.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/web/oss/public/assets/favicon.ico b/web/oss/public/assets/favicon.ico
index dad02fe072..4dc8619b1d 100644
Binary files a/web/oss/public/assets/favicon.ico and b/web/oss/public/assets/favicon.ico differ
diff --git a/web/oss/public/assets/light-complete-transparent-CROPPED.png b/web/oss/public/assets/light-complete-transparent-CROPPED.png
new file mode 100644
index 0000000000..6be2e99e08
Binary files /dev/null and b/web/oss/public/assets/light-complete-transparent-CROPPED.png differ
diff --git a/web/oss/public/assets/light-logo.svg b/web/oss/public/assets/light-logo.svg
new file mode 100644
index 0000000000..9c795f8e88
--- /dev/null
+++ b/web/oss/public/assets/light-logo.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/web/oss/src/components/Filters/EditColumns/assets/helper.ts b/web/oss/src/components/Filters/EditColumns/assets/helper.ts
index 511eabc8f6..3cbfc58d73 100644
--- a/web/oss/src/components/Filters/EditColumns/assets/helper.ts
+++ b/web/oss/src/components/Filters/EditColumns/assets/helper.ts
@@ -30,5 +30,5 @@ export const formatColumnTitle = (text: string) => {
return text
.replace(/_/g, " ")
.replace(/([a-z])([A-Z])/g, "$1 $2")
- .replace(/\b\w/g, (c) => c)
+ .replace(/\b\w/g, (c) => c.toUpperCase())
}
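The one-line fix above restores title-casing in `formatColumnTitle`: the previous callback returned each matched character unchanged, so column titles were never capitalized. A quick sketch of the restored behavior:

```typescript
// Mirrors the fixed helper: snake_case and camelCase both become spaced Title Case.
const formatColumnTitle = (text: string) =>
    text
        .replace(/_/g, " ")                       // underscores -> spaces
        .replace(/([a-z])([A-Z])/g, "$1 $2")      // split camelCase boundaries
        .replace(/\b\w/g, (c) => c.toUpperCase()) // capitalize each word

console.log(formatColumnTitle("expected_output")) // "Expected Output"
console.log(formatColumnTitle("createdAt"))       // "Created At"
```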
diff --git a/web/oss/src/components/Logo/Logo.tsx b/web/oss/src/components/Logo/Logo.tsx
index 1c3b6447d5..ddb1c133f9 100644
--- a/web/oss/src/components/Logo/Logo.tsx
+++ b/web/oss/src/components/Logo/Logo.tsx
@@ -5,8 +5,8 @@ import Image from "next/image"
import {useAppTheme} from "../Layout/ThemeContextProvider"
const LOGOS = {
- dark: "/assets/Agenta-logo-full-dark-accent.png",
- light: "/assets/Agenta-logo-full-light.png",
+ dark: "/assets/dark-complete-transparent-CROPPED.png",
+ light: "/assets/light-complete-transparent-CROPPED.png",
}
const Logo: React.FC> & {isOnlyIconLogo?: boolean}> = (
diff --git a/web/oss/src/components/TestsetTable/TestsetTable.tsx b/web/oss/src/components/TestsetTable/TestsetTable.tsx
index 17fa756fa9..4275a1e49d 100644
--- a/web/oss/src/components/TestsetTable/TestsetTable.tsx
+++ b/web/oss/src/components/TestsetTable/TestsetTable.tsx
@@ -1,13 +1,5 @@
// @ts-nocheck
-import {
- type FC,
- type ChangeEvent,
- ReactNode,
- useEffect,
- useState,
- useMemo,
- useCallback,
-} from "react"
+import {type FC, type ChangeEvent, ReactNode, useEffect, useState, useMemo, useCallback} from "react"
import {type IHeaderParams} from "@ag-grid-community/core"
import {CheckCircleFilled} from "@ant-design/icons"
@@ -417,7 +409,6 @@ const TestsetTable: FC = ({mode}) => {
onRowSelected={onRowSelectedOrDeselected}
onRowDataUpdated={onRowSelectedOrDeselected}
className="ph-no-capture"
- suppressFieldDotNotation={true}
/>
diff --git a/web/oss/src/components/pages/app-management/modals/MaxAppModal.tsx b/web/oss/src/components/pages/app-management/modals/MaxAppModal.tsx
index 330ad640cc..8ee1f9dfc2 100644
--- a/web/oss/src/components/pages/app-management/modals/MaxAppModal.tsx
+++ b/web/oss/src/components/pages/app-management/modals/MaxAppModal.tsx
@@ -43,7 +43,7 @@ const MaxAppModal: React.FC = ({...props}) => {
- const normalizeTags = (input: unknown): Record<string, any> | null => {
- if (input == null) return null
- if (Array.isArray(input)) return {}
- if (typeof input === "object") return input as Record<string, any>
- return {}
- }
-
const onFinish = useCallback(
async (values: any) => {
try {
@@ -202,9 +195,7 @@ const CreateEvaluator = ({
is_custom: false,
},
meta: evaluatorWithMeta.meta || {},
- ...(evaluatorWithMeta.tags
- ? {tags: normalizeTags(evaluatorWithMeta.tags)}
- : {}),
+ ...(evaluatorWithMeta.tags ? {tags: evaluatorWithMeta.tags} : {}),
},
}
diff --git a/web/oss/src/components/pages/observability/drawer/TestsetDrawer/TestsetDrawer.tsx b/web/oss/src/components/pages/observability/drawer/TestsetDrawer/TestsetDrawer.tsx
index af04837b4a..9c796b31c3 100644
--- a/web/oss/src/components/pages/observability/drawer/TestsetDrawer/TestsetDrawer.tsx
+++ b/web/oss/src/components/pages/observability/drawer/TestsetDrawer/TestsetDrawer.tsx
@@ -37,7 +37,6 @@ import {useTestsetsData} from "@/oss/state/testset"
import {useStyles} from "./assets/styles"
import {Mapping, Preview, TestsetColumn, TestsetDrawerProps, TestsetTraceData} from "./assets/types"
-import {getValueAtPath} from "./assets/helpers"
const TestsetDrawer = ({
onClose,
@@ -358,7 +357,8 @@ const TestsetDrawer = ({
continue // Skip duplicate columns for now
}
- const value = getValueAtPath(item, mapping.data)
+ const keys = mapping.data.split(".")
+ const value = keys.reduce((acc: any, key) => acc?.[key], item)
formattedItem[targetKey] =
value === undefined || value === null
diff --git a/web/oss/src/components/pages/observability/drawer/TestsetDrawer/assets/helpers.ts b/web/oss/src/components/pages/observability/drawer/TestsetDrawer/assets/helpers.ts
deleted file mode 100644
index 9ffb36365a..0000000000
--- a/web/oss/src/components/pages/observability/drawer/TestsetDrawer/assets/helpers.ts
+++ /dev/null
@@ -1,32 +0,0 @@
-const splitPath = (path: string) => path.split(/(?<!\\)\./).map((p) => p.replace(/\\\./g, "."))
-
-export const getValueAtPath = (obj: any, rawPath: string) => {
- if (obj == null || !rawPath) return undefined
-
- // quick direct hit (entire path is a literal key on the current object)
- if (Object.prototype.hasOwnProperty.call(obj, rawPath)) return obj[rawPath]
-
- const parts = splitPath(rawPath)
- let cur: any = obj
-
- for (let i = 0; i < parts.length; i++) {
- if (cur == null) return undefined
-
- const key = parts[i]
-
- if (Object.prototype.hasOwnProperty.call(cur, key)) {
- cur = cur[key]
- continue
- }
-
- // fallback: treat the remaining segments as one literal key containing dots
- const remainder = parts.slice(i).join(".")
- if (Object.prototype.hasOwnProperty.call(cur, remainder)) {
- return cur[remainder]
- }
-
- return undefined
- }
-
- return cur
-}
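The deleted `getValueAtPath` helper and the plain `split(".")` reduce that replaces it in `TestsetDrawer` differ on keys that themselves contain dots: the helper split only on unescaped dots and fell back to treating the remaining segments as one literal dotted key. A self-contained sketch of that traversal (the sample data is made up for illustration):

```typescript
// Split on dots that are not backslash-escaped, then unescape.
const splitPath = (path: string) =>
    path.split(/(?<!\\)\./).map((p) => p.replace(/\\\./g, "."))

const getValueAtPath = (obj: any, rawPath: string): any => {
    if (obj == null || !rawPath) return undefined
    // The whole path as a single literal key wins first.
    if (Object.prototype.hasOwnProperty.call(obj, rawPath)) return obj[rawPath]
    const parts = splitPath(rawPath)
    let cur: any = obj
    for (let i = 0; i < parts.length; i++) {
        if (cur == null) return undefined
        const key = parts[i]
        if (Object.prototype.hasOwnProperty.call(cur, key)) {
            cur = cur[key]
            continue
        }
        // Fallback: the remaining segments may be one literal key containing dots.
        const remainder = parts.slice(i).join(".")
        return Object.prototype.hasOwnProperty.call(cur, remainder) ? cur[remainder] : undefined
    }
    return cur
}

const item = {data: {"inputs.country": "France"}, span: {name: "root"}}
console.log(getValueAtPath(item, "data.inputs.country")) // "France" via the literal-key fallback
console.log(getValueAtPath(item, "span.name"))           // "root"
// The naive split(".") reduce misses the dotted key:
console.log("data.inputs.country".split(".").reduce((a: any, k) => a?.[k], item)) // undefined
```

This is the concrete behavioral change the diff makes: trace fields whose mapped path contains a literal dot resolve with the helper but come back undefined with the inline reduce.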
diff --git a/web/oss/src/lib/Types.ts b/web/oss/src/lib/Types.ts
index 1a2c86c43b..54f92ef345 100644
--- a/web/oss/src/lib/Types.ts
+++ b/web/oss/src/lib/Types.ts
@@ -831,16 +831,9 @@ export interface StyleProps {
themeMode: "dark" | "light"
}
-export interface SettingsPreset {
- key: string;
- name: string;
- values: Record;
-}
-
export interface Evaluator {
name: string
key: string
- settings_presets?: SettingsPreset[]
settings_template: Record
icon_url?: string | StaticImageData
color?: string
@@ -983,7 +976,6 @@ type ValueTypeOptions =
| "hidden"
| "messages"
| "multiple_choice"
- | "llm_response_schema"
export interface EvaluationSettingsTemplate {
type: ValueTypeOptions
diff --git a/web/oss/src/pages/auth/[[...path]].tsx b/web/oss/src/pages/auth/[[...path]].tsx
index 2d93bdfcf7..ef1ed9af39 100644
--- a/web/oss/src/pages/auth/[[...path]].tsx
+++ b/web/oss/src/pages/auth/[[...path]].tsx
@@ -123,7 +123,7 @@ const Auth = () => {
)}
>
{
const [csvVersion, setCsvVersion] = useState(0)
const [isValidating, setIsValidating] = useState(false)
- // Keep a ref in sync so the effect can read the latest fallback columns without re-running.
- const columnsFallbackRef = useRef(columnsFallback)
- useEffect(() => {
- columnsFallbackRef.current = columnsFallback
- }, [columnsFallback])
-
- // Track ids that we already attempted (success or non-cancel failure) and those in flight.
- const triedRef = useRef<Set<string>>(new Set())
- const inFlightRef = useRef<Set<string>>(new Set())
-
// Extract CSV columns from the TanStack Query cache for any testset
const cachedColumnsByTestsetId = useMemo(() => {
if (!enabled) return {}
@@ -43,9 +33,9 @@ export const useTestsetsData = ({enabled = true} = {}) => {
const source =
firstRow &&
typeof firstRow === "object" &&
- (firstRow as any).data &&
- typeof (firstRow as any).data === "object"
- ? ((firstRow as any).data as Record<string, any>)
+ firstRow.data &&
+ typeof firstRow.data === "object"
+ ? (firstRow.data as Record<string, any>)
: (firstRow as Record)
result[ts._id] = Object.keys(source)
} else {
@@ -53,10 +43,9 @@ export const useTestsetsData = ({enabled = true} = {}) => {
}
})
return result
- }, [queryClient, testsets, csvVersion, enabled])
+ }, [queryClient, testsets, csvVersion])
// Merge cache with fallback (from preview single testcase query)
- // Depend on `columnsFallback` so consumers re-render when we infer columns.
const columnsByTestsetId = useMemo(() => {
if (!enabled) return {}
const merged: Record<string, string[]> = {...cachedColumnsByTestsetId}
@@ -66,133 +55,89 @@ export const useTestsetsData = ({enabled = true} = {}) => {
}
})
return merged
- }, [cachedColumnsByTestsetId, enabled, columnsFallback])
+ }, [cachedColumnsByTestsetId, columnsFallback])
- // Background fill: for testsets without cached columns, fetch a single testcase to infer columns.
+ // Background fill: for testsets without cached columns, fetch a single testcase to infer columns
+ const triedRef = useRef<Set<string>>(new Set())
useEffect(() => {
if (!enabled) return
if (isPending || isLoading) return
const controller = new AbortController()
-
- const getPending = () => {
- const fallback = columnsFallbackRef.current
- const pending = (testsets ?? []).filter((ts: any) => {
+ const tried = triedRef.current
+ const run = async () => {
+ if (!Array.isArray(testsets) || testsets.length === 0) return
+ const pending = testsets.filter((ts: any) => {
const id = ts?._id
if (!id) return false
- // If cache already has columns, skip
- if (cachedColumnsByTestsetId[id]?.length) return false
- // If fallback already has columns, skip
- if (fallback[id]?.length) return false
- // Avoid double-starting work
- if (inFlightRef.current.has(id)) return false
- // Skip ids we already tried (success or hard failure)
- if (triedRef.current.has(id)) return false
+ if (columnsByTestsetId[id]) return false
+ if (columnsFallback[id]) return false
+ if (tried.has(id)) return false
return true
})
- return pending
- }
-
- const BATCH = 6
- let stopped = false
-
- const run = async () => {
- // Process as many batches as needed in one effect run to avoid re-run storms.
- setIsValidating(true)
- try {
- while (!stopped && !controller.signal.aborted) {
- const pending = getPending()
- if (pending.length === 0) break
-
- const toFetch = pending.slice(0, BATCH)
-
- await Promise.all(
- toFetch.map(async (ts: any) => {
- const id = ts._id
- if (!id) return
-
- // Mark as in-flight before firing the request
- inFlightRef.current.add(id)
- try {
- const url = `${getAgentaApiUrl()}/preview/testcases/query`
- const {data} = await axios.post(
- url,
- {
- testset_id: id,
- windowing: {limit: 1},
- },
- {signal: controller.signal},
- )
-
- const rows: any[] = Array.isArray(data?.testcases)
- ? data.testcases
- : Array.isArray(data)
- ? data
- : []
- const first = rows[0]
- const dataObj =
- first?.data && typeof first.data === "object" ? first.data : {}
- const cols = Object.keys(dataObj as Record<string, any>)
-
- if (cols.length) {
- setColumnsFallback((prev) => {
- const next = {...prev, [id]: cols}
- // Keep ref in sync immediately for this loop
- columnsFallbackRef.current = next
- return next
- })
- }
- // Mark as tried after a completed call (success or empty)
- triedRef.current.add(id)
- } catch (e: any) {
- // If aborted or axios-cancelled, allow retry in a future pass
- const isCancelled =
- e?.name === "CanceledError" ||
- e?.name === "AbortError" ||
- e?.code === "ERR_CANCELED" ||
- (typeof (axios as any).isCancel === "function" &&
- (axios as any).isCancel(e))
- if (!isCancelled) {
- // Hard failure: mark as tried to avoid hot loops
- triedRef.current.add(id)
- }
- } finally {
- inFlightRef.current.delete(id)
- }
- }),
- )
-
- // Yield between batches so React can paint and we do not hog the tab
- await new Promise((r) => setTimeout(r, 0))
- }
- } finally {
- setIsValidating(false)
- }
+ if (pending.length === 0) return
+ // Limit concurrent fetches
+ const BATCH = 6
+ const toFetch = pending.slice(0, BATCH)
+ await Promise.all(
+ toFetch.map(async (ts: any) => {
+ try {
+ setIsValidating(true)
+
+ const url = `${getAgentaApiUrl()}/preview/testcases/query`
+ const {data} = await axios.post(
+ url,
+ {
+ testset_id: ts._id,
+ windowing: {limit: 1},
+ },
+ {signal: controller.signal},
+ )
+ // Response shape:
+ // { count: number, testcases: [{ data: { ...columns }, ... }] }
+ const rows: any[] = Array.isArray(data?.testcases)
+ ? data.testcases
+ : Array.isArray(data)
+ ? data
+ : []
+ const first = rows[0]
+ const dataObj =
+ first?.data && typeof first.data === "object" ? first.data : {}
+ const cols = Object.keys(dataObj as Record<string, any>)
+ if (cols.length) {
+ setColumnsFallback((prev) => ({...prev, [ts._id]: cols}))
+ // Also hydrate the primary cache so all consumers see columns immediately
+ // queryClient.setQueryData(["testsetCsvData", ts._id], [dataObj])
+ } else {
+ tried.add(ts._id)
+ }
+ } catch (e) {
+ // swallow; keep fallback empty for this id
+ tried.add(ts._id)
+ // console.warn("Failed to infer columns for testset", ts?._id, e)
+ } finally {
+ setIsValidating(false)
+ }
+ }),
+ )
+
}
-
run()
- return () => {
- stopped = true
- controller.abort()
- }
- // Re-run only when inputs truly change (not on fallback writes)
- }, [enabled, isPending, isLoading, testsets, cachedColumnsByTestsetId])
+ return () => controller.abort()
+ }, [testsets, columnsByTestsetId, columnsFallback, isPending, isLoading])
- // Scoped csvVersion bumps: only bump for testset ids we care about
+ // When any testsetCsvData query updates, bump csvVersion
useEffect(() => {
- if (!enabled) return
- const ids = new Set((testsets ?? []).map((t: any) => t?._id).filter(Boolean))
-
- const unsubscribe = queryClient.getQueryCache().subscribe((event: any) => {
- if (event?.type !== "updated") return
- const q = event?.query
- if (q?.queryKey?.[0] !== "testsetCsvData") return
- const id = q?.queryKey?.[1]
- if (!ids.has(id)) return
- setCsvVersion((v) => v + 1)
+ const unsubscribe = queryClient.getQueryCache().subscribe((event) => {
+ // Only react to updates of our csv data queries
+ const q = (event as any)?.query
+ const key0 = q?.queryKey?.[0]
+ if (key0 === "testsetCsvData") {
+ setCsvVersion((v) => v + 1)
+ }
})
return unsubscribe
- }, [enabled, queryClient, testsets])
+ }, [queryClient])
return {
testsets: testsets ?? [],
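The rewritten effect fetches a single testcase per testset to infer its columns, capping concurrency at a small batch and recording ids that already failed so they are not retried in a hot loop. A framework-free sketch of that batch-limited backfill pattern (names such as `fetchFirstRowColumns` and `backfillColumns` are illustrative, not the hook's actual API):

```typescript
type Fetcher = (id: string, signal: AbortSignal) => Promise<string[]>

// Fill `columns` for ids with no cached value yet, at most BATCH requests at a time.
// Ids that fail for a non-abort reason are added to `tried` so they are skipped later,
// mirroring the triedRef bookkeeping in the hook above.
async function backfillColumns(
    ids: string[],
    columns: Record<string, string[]>,
    tried: Set<string>,
    fetchFirstRowColumns: Fetcher,
    signal: AbortSignal,
    BATCH = 6,
): Promise<void> {
    const pending = ids.filter((id) => !columns[id]?.length && !tried.has(id))
    for (let i = 0; i < pending.length && !signal.aborted; i += BATCH) {
        await Promise.all(
            pending.slice(i, i + BATCH).map(async (id) => {
                try {
                    const cols = await fetchFirstRowColumns(id, signal)
                    if (cols.length) columns[id] = cols
                    else tried.add(id) // empty testset: remember, do not retry
                } catch {
                    if (!signal.aborted) tried.add(id) // hard failure: do not retry
                }
            }),
        )
    }
}
```

Usage with a mocked fetcher: `backfillColumns(["a", "b"], cols, tried, async (id) => (id === "a" ? ["col1"] : []), new AbortController().signal)` leaves `cols["a"]` populated and `"b"` marked as tried.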
diff --git a/web/package.json b/web/package.json
index 9228ee9355..f06cb54ce3 100644
--- a/web/package.json
+++ b/web/package.json
@@ -1,6 +1,6 @@
{
"name": "agenta-web",
- "version": "0.60.2",
+ "version": "0.62.0",
"workspaces": [
"ee",
"oss",