diff --git a/.claude/agents/changelog-editor.md b/.claude/agents/changelog-editor.md deleted file mode 100644 index 10e0243221..0000000000 --- a/.claude/agents/changelog-editor.md +++ /dev/null @@ -1,83 +0,0 @@ ---- -name: changelog-editor -description: Use this agent when the user needs to create or edit changelog entries in the Docusaurus documentation. Specifically, use this agent when: 1) The user mentions adding a new changelog entry or release notes, 2) The user asks to update or modify existing changelog entries, 3) The user wants to document a new feature, bug fix, or change in the project, 4) The user provides content that should be formatted as a changelog entry. Examples: \n\nExample 1:\nuser: "We just fixed the bug where users couldn't save their preferences. Can you add this to the changelog?"\nassistant: "I'll use the changelog-editor agent to create a proper changelog entry for this bug fix in both the main page and a detailed entry."\n\nExample 2:\nuser: "I need to document the new API authentication feature we released in v2.3.0"\nassistant: "Let me use the changelog-editor agent to create a comprehensive changelog entry for the new authentication feature, including checking if we have existing documentation to link to."\n\nExample 3:\nuser: "Can you update the changelog entry for the dashboard redesign? We now have screenshots and a demo video."\nassistant: "I'll use the changelog-editor agent to update that entry with proper placeholders for the screenshots and YouTube video embedding."\n\nProactively use this agent when you notice the user describing changes, features, or fixes that should be documented in the changelog, even if they don't explicitly ask for changelog updates. -model: sonnet -color: purple ---- - -You are an expert technical documentation editor specializing in Docusaurus changelog maintenance. 
Your primary responsibility is creating and editing changelog entries that follow established project standards for clarity, consistency, and technical accuracy. - -## Your Core Responsibilities - -1. **Dual Entry Creation**: For every changelog item, you create two coordinated entries: - - A concise summary in `docs/main.mdx` - - A detailed explanation in `docs/block/entries/[version-or-feature].mdx` - - The summary title must link to the detailed entry - -2. **Version Management**: Before creating any entry, determine the version number. If unclear from context, ask the user: "Which version is this changelog entry for?" Never proceed without a clear version identifier. - -3. **Style Adherence**: Apply these writing guidelines rigorously: - - Prioritize clarity above all else - - Use 11th grade English for non-technical terms - - Prefer active voice over passive voice - - Write short, punchy sentences as your default; use longer sentences only when needed for flow - - Use complete sentences rather than fragments (unless brevity clearly improves readability) - - **Never use em dashes (—)**. Instead, use: a period and new sentence, parentheses (), or semicolons ; - - Use bold and bullet points sparingly; apply them only when they genuinely aid quick scanning - - Follow principles from "The Elements of Style" - -4. **Feature Documentation Integration**: When a changelog mentions a new feature: - - Search existing documentation to see if a dedicated page exists for that feature - - If found, add a link to that documentation page in the changelog entry - - If not found, note this and ask the user if documentation should be created - -5. 
**Media Handling**: When the user mentions videos or screenshots: - - Add appropriate placeholders using the project's established format - - For images: use the image plugin format consistent with other entries - - For videos: use YouTube video embedding format consistent with other entries - - Ask for specifics if media details are unclear: "Do you have the YouTube URL for the demo video?" or "How many screenshots should I add placeholders for?" - -6. **Quality Assurance**: After making changes: - - Inform the user you're running the build check - - Execute `npm run build` (or equivalent) in the docs folder to verify nothing broke - - Report any build errors immediately and fix them before finalizing - -7. **Consistency Checking**: Before finalizing any entry: - - Review similar existing entries to match tone, structure, and formatting - - Ensure terminology is consistent with previous changelog entries - - Verify that linking patterns match established conventions - -## Your Decision-Making Framework - -**When Information is Missing:** -- Version number unclear → Ask immediately -- Feature scope ambiguous → Request clarification before writing -- Media availability uncertain → Confirm with user before adding placeholders -- Categorization unclear (bug fix vs. feature vs. 
improvement) → Ask for classification - -**When Editing Existing Entries:** -- Always preserve the original intent and factual accuracy -- Improve clarity and style without changing meaning -- Flag any technical inaccuracies to the user rather than guessing - -**Quality Control Checklist (apply to every entry):** -- [ ] Version number present and correct -- [ ] Both short and detailed entries created -- [ ] Short entry links to detailed entry correctly -- [ ] Active voice used where possible -- [ ] No em dashes present -- [ ] Feature documentation linked if applicable -- [ ] Media placeholders added if mentioned -- [ ] Build test passed -- [ ] Style guidelines followed - -## Output Format - -When creating or editing changelog entries, provide: -1. The complete markdown for the main.mdx summary entry -2. The complete markdown for the detailed entries/[name].mdx file -3. Confirmation that you've checked for related documentation -4. Build test results -5. Any questions or clarifications needed - -Be proactive in identifying unclear requirements and ask specific questions rather than making assumptions. Your goal is to produce changelog entries that are immediately publishable without requiring revision. diff --git a/README.md b/README.md index d07ac59e52..ce25044ae8 100644 --- a/README.md +++ b/README.md @@ -2,12 +2,11 @@

[logo `<picture>`/`<img>` tags stripped in extraction; the hunk swaps the logo image markup, new img alt text: "Shows the logo of agenta"]

The Open-source LLMOps Platform

Build reliable LLM applications faster with integrated prompt management, evaluation, and observability. diff --git a/api/ee/databases/postgres/migrations/core/versions/79f40f71e912_extend_meters.py b/api/ee/databases/postgres/migrations/core/versions/79f40f71e912_extend_meters.py new file mode 100644 index 0000000000..d76fe93471 --- /dev/null +++ b/api/ee/databases/postgres/migrations/core/versions/79f40f71e912_extend_meters.py @@ -0,0 +1,70 @@ +"""add CREDITS to meters_type + +Revision ID: 79f40f71e912 +Revises: 3b5f5652f611 +Create Date: 2025-11-03 15:00:00.000000 +""" + +from typing import Sequence, Union +from alembic import op +import sqlalchemy as sa + +# revision identifiers, used by Alembic. +revision: str = "79f40f71e912" +down_revision: Union[str, None] = "3b5f5652f611" +branch_labels: Union[str, Sequence[str], None] = None +depends_on: Union[str, Sequence[str], None] = None + +ENUM_NAME = "meters_type" +TEMP_ENUM_NAME = "meters_type_temp" +TABLE_NAME = "meters" +COLUMN_NAME = "key" + + +def upgrade() -> None: + # 1) Create temp enum including the new value + op.execute( + sa.text( + f"CREATE TYPE {TEMP_ENUM_NAME} AS ENUM ('USERS','APPLICATIONS','EVALUATIONS','TRACES','CREDITS')" + ) + ) + + # 2) Alter column to use temp enum + op.execute( + sa.text( + f"ALTER TABLE {TABLE_NAME} " + f"ALTER COLUMN {COLUMN_NAME} TYPE {TEMP_ENUM_NAME} " + f"USING {COLUMN_NAME}::text::{TEMP_ENUM_NAME}" + ) + ) + + # 3) Drop old enum, then 4) rename temp -> original + op.execute(sa.text(f"DROP TYPE {ENUM_NAME}")) + op.execute(sa.text(f"ALTER TYPE {TEMP_ENUM_NAME} RENAME TO {ENUM_NAME}")) + + +def downgrade() -> None: + # Ensure downgrade can proceed (rows with CREDITS would block the type change) + op.execute( + sa.text(f"DELETE FROM {TABLE_NAME} WHERE {COLUMN_NAME}::text = 'CREDITS'") + ) + + # 1) Create temp enum WITHOUT CREDITS + op.execute( + sa.text( + f"CREATE TYPE {TEMP_ENUM_NAME} AS ENUM ('USERS','APPLICATIONS','EVALUATIONS','TRACES')" + ) + ) + + # 2) Alter column 
to use temp enum + op.execute( + sa.text( + f"ALTER TABLE {TABLE_NAME} " + f"ALTER COLUMN {COLUMN_NAME} TYPE {TEMP_ENUM_NAME} " + f"USING {COLUMN_NAME}::text::{TEMP_ENUM_NAME}" + ) + ) + + # 3) Drop current enum (which includes CREDITS), then 4) rename temp -> original + op.execute(sa.text(f"DROP TYPE {ENUM_NAME}")) + op.execute(sa.text(f"ALTER TYPE {TEMP_ENUM_NAME} RENAME TO {ENUM_NAME}")) diff --git a/api/ee/src/core/entitlements/types.py b/api/ee/src/core/entitlements/types.py index e346f11c57..ad81ebafae 100644 --- a/api/ee/src/core/entitlements/types.py +++ b/api/ee/src/core/entitlements/types.py @@ -22,6 +22,7 @@ class Counter(str, Enum): EVALUATIONS = "evaluations" EVALUATORS = "evaluators" ANNOTATIONS = "annotations" + CREDITS = "credits" class Gauge(str, Enum): @@ -60,7 +61,7 @@ class Probe(BaseModel): }, }, "features": [ - "Unlimited prompts", + "2 prompts", "20 evaluations/month", "5k traces/month", "2 seats", @@ -209,10 +210,11 @@ class Probe(BaseModel): Tracker.COUNTERS: { Counter.TRACES: Quota(limit=5_000, monthly=True, free=5_000), Counter.EVALUATIONS: Quota(limit=20, monthly=True, free=20, strict=True), + Counter.CREDITS: Quota(limit=100, monthly=True, free=100, strict=True), }, Tracker.GAUGES: { Gauge.USERS: Quota(limit=2, strict=True, free=2), - Gauge.APPLICATIONS: Quota(strict=True), + Gauge.APPLICATIONS: Quota(limit=2, strict=True, free=2), }, }, Plan.CLOUD_V0_PRO: { @@ -223,6 +225,7 @@ class Probe(BaseModel): Tracker.COUNTERS: { Counter.TRACES: Quota(monthly=True, free=10_000), Counter.EVALUATIONS: Quota(monthly=True, strict=True), + Counter.CREDITS: Quota(limit=100, monthly=True, free=100, strict=True), }, Tracker.GAUGES: { Gauge.USERS: Quota(limit=10, strict=True, free=3), @@ -237,6 +240,7 @@ class Probe(BaseModel): Tracker.COUNTERS: { Counter.TRACES: Quota(monthly=True, free=1_000_000), Counter.EVALUATIONS: Quota(monthly=True, strict=True), + Counter.CREDITS: Quota(limit=100, monthly=True, free=100, strict=True), }, Tracker.GAUGES: { 
Gauge.USERS: Quota(strict=True), @@ -279,6 +283,12 @@ class Probe(BaseModel): Tracker.COUNTERS: { Counter.TRACES: Quota(monthly=True), Counter.EVALUATIONS: Quota(monthly=True, strict=True), + Counter.CREDITS: Quota( + limit=100_000, + monthly=True, + free=100_000, + strict=True, + ), }, Tracker.GAUGES: { Gauge.USERS: Quota(strict=True), diff --git a/api/ee/src/core/meters/types.py b/api/ee/src/core/meters/types.py index a0ada9da16..1002594d3f 100644 --- a/api/ee/src/core/meters/types.py +++ b/api/ee/src/core/meters/types.py @@ -13,6 +13,7 @@ class Meters(str, Enum): # COUNTERS TRACES = Counter.TRACES.value EVALUATIONS = Counter.EVALUATIONS.value + CREDITS = Counter.CREDITS.value # GAUGES USERS = Gauge.USERS.value APPLICATIONS = Gauge.APPLICATIONS.value diff --git a/api/ee/src/services/llm_apps_service.py b/api/ee/src/services/llm_apps_service.py index b1d8ab5995..15267ec378 100644 --- a/api/ee/src/services/llm_apps_service.py +++ b/api/ee/src/services/llm_apps_service.py @@ -202,7 +202,6 @@ async def invoke_app( openapi_parameters: List[Dict], user_id: str, project_id: str, - scenario_id: Optional[str] = None, **kwargs, ) -> InvokationResult: """ @@ -248,14 +247,7 @@ async def invoke_app( app_response = {} try: - log.info( - "Invoking application...", - scenario_id=scenario_id, - testcase_id=( - datapoint["testcase_id"] if "testcase_id" in datapoint else None - ), - url=url, - ) + log.info("Invoking workflow...", url=url) response = await client.post( url, json=payload, @@ -276,12 +268,6 @@ async def invoke_app( trace_id = app_response.get("trace_id", None) span_id = app_response.get("span_id", None) - log.info( - "Invoked application. 
", - scenario_id=scenario_id, - trace_id=trace_id, - ) - return InvokationResult( result=Result( type=kind, @@ -342,7 +328,6 @@ async def run_with_retry( openapi_parameters: List[Dict], user_id: str, project_id: str, - scenario_id: Optional[str] = None, **kwargs, ) -> InvokationResult: """ @@ -379,7 +364,6 @@ async def run_with_retry( openapi_parameters, user_id, project_id, - scenario_id, **kwargs, ) return result @@ -419,7 +403,6 @@ async def batch_invoke( rate_limit_config: Dict, user_id: str, project_id: str, - scenarios: Optional[List[Dict]] = None, **kwargs, ) -> List[InvokationResult]: """ @@ -514,7 +497,6 @@ async def batch_invoke( openapi_parameters, user_id, project_id, - scenarios[index].get("id") if scenarios else None, **kwargs, ) ) diff --git a/api/ee/src/tasks/evaluations/legacy.py b/api/ee/src/tasks/evaluations/legacy.py index 579c6853b9..0d22bf76b7 100644 --- a/api/ee/src/tasks/evaluations/legacy.py +++ b/api/ee/src/tasks/evaluations/legacy.py @@ -1055,13 +1055,6 @@ def annotate( "application_variant": {"id": str(variant.id)}, "application_revision": {"id": str(revision.id)}, }, - scenarios=[ - s.model_dump( - mode="json", - exclude_none=True, - ) - for s in scenarios - ], ) ) # ---------------------------------------------------------------------- @@ -1111,7 +1104,6 @@ def annotate( scenario = scenarios[idx] testcase = testcases[idx] invocation = invocations[idx] - invocation_step_key = invocation_steps_keys[0] scenario_has_errors = 0 scenario_status = EvaluationStatus.SUCCESS @@ -1148,20 +1140,8 @@ def annotate( ) ) - if trace: - log.info( - f"Trace found ", - scenario_id=scenario.id, - step_key=invocation_step_key, - trace_id=invocation.trace_id, - ) - else: - log.warn( - f"Trace missing", - scenario_id=scenario.id, - step_key=invocation_step_key, - trace_id=invocation.trace_id, - ) + if not trace: + log.warn(f"Trace with id {invocation.trace_id} not found.") scenario_has_errors += 1 scenario_status = EvaluationStatus.ERRORS continue @@ -1310,13 
+1290,6 @@ def annotate( links=links, ) - log.info( - "Invoking evaluator... ", - scenario_id=scenario.id, - testcase_id=testcase.id, - trace_id=invocation.trace_id, - uri=interface.get("uri"), - ) workflows_service_response = loop.run_until_complete( workflows_service.invoke_workflow( project_id=project_id, @@ -1327,11 +1300,6 @@ def annotate( annotate=True, ) ) - log.info( - "Invoked evaluator ", - scenario_id=scenario.id, - trace_id=workflows_service_response.trace_id, - ) # ---------------------------------------------------------- # run evaluator -------------------------------------------- @@ -1387,20 +1355,8 @@ def annotate( ) ) - if trace: - log.info( - f"Trace found ", - scenario_id=scenario.id, - step_key=annotation_step_key, - trace_id=annotation.trace_id, - ) - else: - log.warn( - f"Trace missing", - scenario_id=scenario.id, - step_key=annotation_step_key, - trace_id=annotation.trace_id, - ) + if not trace: + log.warn(f"Trace with id {annotation.trace_id} not found.") scenario_has_errors += 1 scenario_status = EvaluationStatus.ERRORS continue diff --git a/api/ee/src/tasks/evaluations/live.py b/api/ee/src/tasks/evaluations/live.py index 43208bd42d..5cd2072e63 100644 --- a/api/ee/src/tasks/evaluations/live.py +++ b/api/ee/src/tasks/evaluations/live.py @@ -96,9 +96,6 @@ EvaluatorRevision, ) -from oss.src.core.evaluations.utils import fetch_trace - - log = get_module_logger(__name__) @@ -656,12 +653,6 @@ def evaluate( links=links, ) - log.info( - "Invoking evaluator... 
", - scenario_id=scenario.id, - trace_id=query_trace_id, - uri=interface.get("uri"), - ) workflows_service_response = loop.run_until_complete( workflows_service.invoke_workflow( project_id=project_id, @@ -672,11 +663,6 @@ def evaluate( annotate=True, ) ) - log.info( - "Invoked evaluator ", - scenario_id=scenario.id, - trace_id=workflows_service_response.trace_id, - ) trace_id = workflows_service_response.trace_id @@ -707,51 +693,13 @@ def evaluate( if workflows_service_response.data else None ) - - annotation = workflows_service_response - - trace_id = annotation.trace_id - - if not annotation.trace_id: - log.warn(f"annotation trace_id is missing.") - scenario_has_errors[idx] += 1 - scenario_status[idx] = EvaluationStatus.ERRORS - continue - - trace = None - if annotation.trace_id: - trace = loop.run_until_complete( - fetch_trace( - tracing_router=tracing_router, - request=request, - trace_id=annotation.trace_id, - ) - ) - - if trace: - log.info( - f"Trace found ", - scenario_id=scenario.id, - step_key=annotation_step_key, - trace_id=annotation.trace_id, - ) - else: - log.warn( - f"Trace missing", - scenario_id=scenario.id, - step_key=annotation_step_key, - trace_id=annotation.trace_id, - ) - scenario_has_errors[idx] += 1 - scenario_status[idx] = EvaluationStatus.ERRORS - continue # ---------------------------------------------------------- results_create = [ EvaluationResultCreate( run_id=run_id, scenario_id=scenario_id, - step_key=annotation_step_key, + step_key=evaluator_step_key, # timestamp=timestamp, interval=interval, diff --git a/api/oss/databases/postgres/migrations/core/versions/79f40f71e912_extend_meters.py b/api/oss/databases/postgres/migrations/core/versions/79f40f71e912_extend_meters.py new file mode 100644 index 0000000000..d76fe93471 --- /dev/null +++ b/api/oss/databases/postgres/migrations/core/versions/79f40f71e912_extend_meters.py @@ -0,0 +1,70 @@ +"""add CREDITS to meters_type + +Revision ID: 79f40f71e912 +Revises: 3b5f5652f611 +Create Date: 
2025-11-03 15:00:00.000000 +""" + +from typing import Sequence, Union +from alembic import op +import sqlalchemy as sa + +# revision identifiers, used by Alembic. +revision: str = "79f40f71e912" +down_revision: Union[str, None] = "3b5f5652f611" +branch_labels: Union[str, Sequence[str], None] = None +depends_on: Union[str, Sequence[str], None] = None + +ENUM_NAME = "meters_type" +TEMP_ENUM_NAME = "meters_type_temp" +TABLE_NAME = "meters" +COLUMN_NAME = "key" + + +def upgrade() -> None: + # 1) Create temp enum including the new value + op.execute( + sa.text( + f"CREATE TYPE {TEMP_ENUM_NAME} AS ENUM ('USERS','APPLICATIONS','EVALUATIONS','TRACES','CREDITS')" + ) + ) + + # 2) Alter column to use temp enum + op.execute( + sa.text( + f"ALTER TABLE {TABLE_NAME} " + f"ALTER COLUMN {COLUMN_NAME} TYPE {TEMP_ENUM_NAME} " + f"USING {COLUMN_NAME}::text::{TEMP_ENUM_NAME}" + ) + ) + + # 3) Drop old enum, then 4) rename temp -> original + op.execute(sa.text(f"DROP TYPE {ENUM_NAME}")) + op.execute(sa.text(f"ALTER TYPE {TEMP_ENUM_NAME} RENAME TO {ENUM_NAME}")) + + +def downgrade() -> None: + # Ensure downgrade can proceed (rows with CREDITS would block the type change) + op.execute( + sa.text(f"DELETE FROM {TABLE_NAME} WHERE {COLUMN_NAME}::text = 'CREDITS'") + ) + + # 1) Create temp enum WITHOUT CREDITS + op.execute( + sa.text( + f"CREATE TYPE {TEMP_ENUM_NAME} AS ENUM ('USERS','APPLICATIONS','EVALUATIONS','TRACES')" + ) + ) + + # 2) Alter column to use temp enum + op.execute( + sa.text( + f"ALTER TABLE {TABLE_NAME} " + f"ALTER COLUMN {COLUMN_NAME} TYPE {TEMP_ENUM_NAME} " + f"USING {COLUMN_NAME}::text::{TEMP_ENUM_NAME}" + ) + ) + + # 3) Drop current enum (which includes CREDITS), then 4) rename temp -> original + op.execute(sa.text(f"DROP TYPE {ENUM_NAME}")) + op.execute(sa.text(f"ALTER TYPE {TEMP_ENUM_NAME} RENAME TO {ENUM_NAME}")) diff --git a/api/oss/src/apis/fastapi/observability/opentelemetry/otlp.py b/api/oss/src/apis/fastapi/observability/opentelemetry/otlp.py index 
f7ef6cd6a4..20b3098739 100644 --- a/api/oss/src/apis/fastapi/observability/opentelemetry/otlp.py +++ b/api/oss/src/apis/fastapi/observability/opentelemetry/otlp.py @@ -135,12 +135,6 @@ def parse_otlp_stream(otlp_stream: bytes) -> List[OTelSpanDTO]: s_span_id = "0x" + span.span_id.hex() s_context = OTelContextDTO(trace_id=s_trace_id, span_id=s_span_id) - # log.debug( - # "[SPAN] [PARSE] ", - # trace_id=s_trace_id[2:], - # span_id=s_span_id[2:], - # ) - # SPAN PARENT CONTEXT s_parent_id = span.parent_span_id.hex() s_parent_id = "0x" + s_parent_id if s_parent_id else None diff --git a/api/oss/src/apis/fastapi/observability/router.py b/api/oss/src/apis/fastapi/observability/router.py index 2c96b9a000..b95b115997 100644 --- a/api/oss/src/apis/fastapi/observability/router.py +++ b/api/oss/src/apis/fastapi/observability/router.py @@ -296,13 +296,6 @@ async def otlp_receiver( ) # -------------------------------------------------------------------- # - # for otel_span in otel_spans: - # log.debug( - # "Receiving trace... 
", - # project_id=request.state.project_id, - # trace_id=str(UUID(otel_span.context.trace_id[2:])), - # ) - span_dtos = None try: # ---------------------------------------------------------------- # diff --git a/api/oss/src/apis/fastapi/observability/utils/serialization.py b/api/oss/src/apis/fastapi/observability/utils/serialization.py index 5d697953da..b2f6cd4ce9 100644 --- a/api/oss/src/apis/fastapi/observability/utils/serialization.py +++ b/api/oss/src/apis/fastapi/observability/utils/serialization.py @@ -58,7 +58,7 @@ def decode_value( value = loads(encoded) return value try: - value = value + value = loads(value) except JSONDecodeError: pass return value diff --git a/api/oss/src/core/evaluations/utils.py b/api/oss/src/core/evaluations/utils.py index eb4f899ff7..ab823c647b 100644 --- a/api/oss/src/core/evaluations/utils.py +++ b/api/oss/src/core/evaluations/utils.py @@ -131,7 +131,7 @@ async def fetch_trace( request, # trace_id: str, - max_retries: int = 15, + max_retries: int = 5, delay: float = 1.0, ) -> Optional[OTelSpansTree]: for attempt in range(max_retries): diff --git a/api/oss/src/core/evaluators/service.py b/api/oss/src/core/evaluators/service.py index 6e547addfe..37cf6606eb 100644 --- a/api/oss/src/core/evaluators/service.py +++ b/api/oss/src/core/evaluators/service.py @@ -1,6 +1,5 @@ from typing import Optional, List from uuid import UUID, uuid4 -from json import loads from oss.src.utils.helpers import get_slug_from_name_and_id from oss.src.services.db_manager import fetch_evaluator_config @@ -1435,52 +1434,46 @@ def _transfer_evaluator_revision_data( else None ) headers = None - outputs_schema = None - if str(old_evaluator.evaluator_key) == "auto_ai_critique": - json_schema = old_evaluator.settings_values.get("json_schema", None) - if json_schema and isinstance(json_schema, dict): - outputs_schema = json_schema.get("schema", None) - if not outputs_schema: - properties = ( - {"score": {"type": "number"}, "success": {"type": "boolean"}} - if 
old_evaluator.evaluator_key - in ( - "auto_levenshtein_distance", - "auto_semantic_similarity", - "auto_similarity_match", - "auto_json_diff", - "auto_webhook_test", - "auto_custom_code_run", - "auto_ai_critique", - "rag_faithfulness", - "rag_context_relevancy", - ) - else {"success": {"type": "boolean"}} - ) - required = ( - list(properties.keys()) - if old_evaluator.evaluator_key - not in ( - "auto_levenshtein_distance", - "auto_semantic_similarity", - "auto_similarity_match", - "auto_json_diff", - "auto_webhook_test", - "auto_custom_code_run", - "auto_ai_critique", - "rag_faithfulness", - "rag_context_relevancy", - ) - else [] + properties = ( + {"score": {"type": "number"}, "success": {"type": "boolean"}} + if old_evaluator.evaluator_key + in ( + "auto_levenshtein_distance", + "auto_semantic_similarity", + "auto_similarity_match", + "auto_json_diff", + "auto_webhook_test", + "auto_custom_code_run", + "auto_ai_critique", + "rag_faithfulness", + "rag_context_relevancy", ) - outputs_schema = { + else {"success": {"type": "boolean"}} + ) + schemas = { + "outputs": { "$schema": "https://json-schema.org/draft/2020-12/schema", "type": "object", "properties": properties, - "required": required, + "required": ( + list(properties.keys()) + if old_evaluator.evaluator_key + not in ( + "auto_levenshtein_distance", + "auto_semantic_similarity", + "auto_similarity_match", + "auto_json_diff", + "auto_webhook_test", + "auto_custom_code_run", + "auto_ai_critique", + "rag_faithfulness", + "rag_context_relevancy", + ) + else [] + ), "additionalProperties": False, } - schemas = {"outputs": outputs_schema} + } script = ( { "content": old_evaluator.settings_values.get("code", None), diff --git a/api/oss/src/models/api/evaluation_model.py b/api/oss/src/models/api/evaluation_model.py index d79d124921..ef25bfa140 100644 --- a/api/oss/src/models/api/evaluation_model.py +++ b/api/oss/src/models/api/evaluation_model.py @@ -14,7 +14,6 @@ class LegacyEvaluator(BaseModel): name: str key: str 
direct_use: bool - settings_presets: Optional[list[dict]] = None settings_template: dict description: Optional[str] = None oss: Optional[bool] = False diff --git a/api/oss/src/resources/evaluators/evaluators.py b/api/oss/src/resources/evaluators/evaluators.py index 53a2d48542..760ab550a7 100644 --- a/api/oss/src/resources/evaluators/evaluators.py +++ b/api/oss/src/resources/evaluators/evaluators.py @@ -205,78 +205,6 @@ "key": "auto_ai_critique", "direct_use": False, "requires_llm_api_keys": True, - "settings_presets": [ - { - "key": "default", - "name": "Default", - "values": { - "prompt_template": [ - { - "role": "system", - "content": "You are an expert evaluator grading model outputs. Your task is to grade the responses based on the criteria and requirements provided below. \n\nGiven the model output and inputs (and any other data you might get) assign a grade to the output. \n\n## Grading considerations\n- Evaluate the overall value provided in the model output\n- Verify all claims in the output meticulously\n- Differentiate between minor errors and major errors\n- Evaluate the outputs based on the inputs and whether they follow the instruction in the inputs if any\n- Give the highst and lowest score for cases where you have complete certainty about correctness and value\n\n## Scoring Criteria\n- The score should be a decimal value between 0.0 and 1.0\n- A score of 1.0 means that the answer is perfect. This is the highest (best) score \n- A score of 0.0 means that the answer does not meet any of the criteria. This is the lowest possible score you can give.\n\n## output format\nANSWER ONLY THE SCORE. DO NOT USE MARKDOWN. 
DO NOT PROVIDE ANYTHING OTHER THAN THE NUMBER\n", - }, - { - "role": "user", - "content": "## Model inputs\n{{inputs}}\n## Model outputs\n{{outputs}}", - }, - ], - "model": "gpt-4o-mini", - "response_type": "json_schema", - "json_schema": { - "name": "schema", - "schema": { - "title": "extract", - "description": "Extract information from the user's response.", - "type": "object", - "properties": { - "correctness": { - "type": "boolean", - "description": "The grade results", - } - }, - "required": ["correctness"], - "strict": True, - }, - }, - "version": "4", - }, - }, - { - "key": "hallucination", - "name": "Hallucination Detection", - "values": { - "prompt_template": [ - { - "role": "system", - "content": "You are an expert evaluator grading model outputs for hallucinations. Your task is to identify if the responses contain any hallucinated information based on the criteria and requirements provided below. \n\nGiven the model output and inputs (and any other data you might get) determine if the output contains hallucinations. \n\n## Hallucination considerations\n- Verify all factual claims in the output meticulously against the input data\n- Identify any information that is fabricated or not supported by the input data\n- Differentiate between minor inaccuracies and major hallucinations\n\n## Output format\nANSWER ONLY 'true' IF THE OUTPUT CONTAINS HALLUCINATIONS, OTHERWISE ANSWER 'false'. DO NOT USE MARKDOWN. 
DO NOT PROVIDE ANYTHING OTHER THAN 'true' OR 'false'\n", - }, - { - "role": "user", - "content": "## Model inputs\n{{inputs}}\n## Model outputs\n{{outputs}}", - }, - ], - "model": "gpt-4o-mini", - "response_type": "json_schema", - "json_schema": { - "name": "schema", - "schema": { - "title": "extract", - "description": "Extract information from the user's response.", - "type": "object", - "properties": { - "correctness": { - "type": "boolean", - "description": "The hallucination detection result", - } - }, - "required": ["correctness"], - "strict": True, - }, - }, - "version": "4", - }, - }, - ], "settings_template": { "prompt_template": { "label": "Prompt Template", @@ -323,39 +251,10 @@ "advanced": True, # Tells the frontend that this setting is advanced and should be hidden by default "description": "The LLM model to use for the evaluation", }, - "response_type": { - "label": "Response Type", - "default": "json_schema", - "type": "hidden", - "advanced": True, - "description": "The format of the response from the LLM", - }, - "json_schema": { - "label": "Feedback Configuration", - "default": { - "name": "schema", - "schema": { - "title": "extract", - "description": "Extract information from the user's response.", - "type": "object", - "properties": { - "correctness": { - "type": "boolean", - "description": "The grade results", - } - }, - "required": ["correctness"], - "strict": True, - }, - }, - "type": "llm_response_schema", - "advanced": False, - "description": "Select a response format to structure how your evaluation results are returned.", - }, "version": { "label": "Version", "type": "hidden", - "default": "4", + "default": "3", "description": "The version of the evaluator", # ignore by the FE "advanced": False, # ignore by the FE }, diff --git a/api/oss/src/routers/app_router.py b/api/oss/src/routers/app_router.py index 338aff4f62..c0b4bd6935 100644 --- a/api/oss/src/routers/app_router.py +++ b/api/oss/src/routers/app_router.py @@ -261,7 +261,7 @@ async 
def create_app( return CreateAppOutput(app_id=str(app_db.id), app_name=str(app_db.app_name)) -@router.get("/{app_id}/", response_model=ReadAppOutput, operation_id="read_app") +@router.get("/{app_id}/", response_model=ReadAppOutput, operation_id="create_app") async def read_app( request: Request, app_id: str, diff --git a/api/oss/src/routers/permissions_router.py b/api/oss/src/routers/permissions_router.py index 7c1be922fe..5dbf10a0a3 100644 --- a/api/oss/src/routers/permissions_router.py +++ b/api/oss/src/routers/permissions_router.py @@ -1,5 +1,5 @@ +from typing import Optional, Union from uuid import UUID -from typing import Optional from fastapi.responses import JSONResponse from fastapi import Request, Query, HTTPException @@ -12,6 +12,7 @@ if is_ee(): from ee.src.models.shared_models import Permission from ee.src.utils.permissions import check_action_access + from ee.src.utils.entitlements import check_entitlements, Counter router = APIRouter() @@ -69,6 +70,7 @@ async def verify_permissions( log.warn("Missing required parameters: action, resource_type") raise Deny() + # allow = None allow = await get_cache( project_id=request.state.project_id, user_id=request.state.user_id, @@ -83,6 +85,7 @@ async def verify_permissions( raise Deny() # CHECK PERMISSION 1/3: SCOPE + # log.debug("Checking scope access...") allow_scope = await check_scope_access( # organization_id=request.state.organization_id, workspace_id=request.state.workspace_id, @@ -102,16 +105,35 @@ async def verify_permissions( ) raise Deny() - if is_ee(): - # CHECK PERMISSION 1/2: ACTION - allow_action = await check_action_access( + # CHECK PERMISSION 1/2: ACTION + # log.debug("Checking action access...") + allow_action = await check_action_access( + project_id=request.state.project_id, + user_uid=request.state.user_id, + permission=Permission(action), + ) + + if not allow_action: + log.warn("Action access denied") + await set_cache( project_id=request.state.project_id, - user_uid=request.state.user_id, 
- permission=Permission(action), + user_id=request.state.user_id, + namespace="verify_permissions", + key=cache_key, + value="deny", ) + raise Deny() + + # CHECK PERMISSION 3/3: RESOURCE + # log.debug("Checking resource access...") + allow_resource = await check_resource_access( + organization_id=request.state.organization_id, + resource_type=resource_type, + ) - if not allow_action: - log.warn("Action access denied") + if isinstance(allow_resource, bool): + if allow_resource is False: + log.warn("Resource access denied") await set_cache( project_id=request.state.project_id, user_id=request.state.user_id, @@ -121,30 +143,40 @@ async def verify_permissions( ) raise Deny() - # CHECK PERMISSION 3/3: RESOURCE - allow_resource = await check_resource_access( - resource_type=resource_type, - ) + if allow_resource is True: + await set_cache( + project_id=request.state.project_id, + user_id=request.state.user_id, + namespace="verify_permissions", + key=cache_key, + value="allow", + ) + return Allow(request.state.credentials) - if not allow_resource: - log.warn("Resource access denied") - await set_cache( - project_id=request.state.project_id, - user_id=request.state.user_id, - namespace="verify_permissions", - key=cache_key, - value="deny", - ) - raise Deny() + elif isinstance(allow_resource, int): + if allow_resource <= 0: + log.warn("Resource access denied") + await set_cache( + project_id=request.state.project_id, + user_id=request.state.user_id, + namespace="verify_permissions", + key=cache_key, + value="deny", + ) + raise Deny() + else: + return Allow(request.state.credentials) + # else: + log.warn("Resource access denied") await set_cache( project_id=request.state.project_id, user_id=request.state.user_id, namespace="verify_permissions", key=cache_key, - value="allow", + value="deny", ) - return Allow(request.state.credentials) + raise Deny() except Exception as exc: # pylint: disable=bare-except log.warn(exc) @@ -180,11 +212,27 @@ async def check_scope_access( async 
def check_resource_access( + organization_id: UUID, resource_type: Optional[str] = None, -) -> bool: +) -> Union[bool, int]: allow_resource = False if resource_type == "service": allow_resource = True + if resource_type == "local_secrets": + check, meter, _ = await check_entitlements( + organization_id=organization_id, + key=Counter.CREDITS, + delta=1, + ) + + if not check: + return False + + if not meter or not meter.value: + return False + + return meter.value + return allow_resource diff --git a/api/oss/src/routers/testset_router.py b/api/oss/src/routers/testset_router.py index e560c06198..dffad517af 100644 --- a/api/oss/src/routers/testset_router.py +++ b/api/oss/src/routers/testset_router.py @@ -252,9 +252,7 @@ async def import_testset( ) from error -@router.post( - "/", response_model=TestsetSimpleResponse, operation_id="create_legacy_testset" -) +@router.post("/", response_model=TestsetSimpleResponse, operation_id="create_testset") async def create_testset( csvdata: NewTestset, request: Request, diff --git a/api/oss/src/services/analytics_service.py b/api/oss/src/services/analytics_service.py index 1903eed0ce..1d7d5fce79 100644 --- a/api/oss/src/services/analytics_service.py +++ b/api/oss/src/services/analytics_service.py @@ -39,7 +39,7 @@ if POSTHOG_API_KEY: posthog.api_key = POSTHOG_API_KEY posthog.host = POSTHOG_HOST - log.info("PostHog initialized with host %s", POSTHOG_HOST) + log.info("PostHog initialized with host %s:", POSTHOG_HOST) else: log.warn("PostHog API key not found in environment variables") diff --git a/api/oss/src/services/evaluators_service.py b/api/oss/src/services/evaluators_service.py index 5ff93cabb0..8b5ea9eb74 100644 --- a/api/oss/src/services/evaluators_service.py +++ b/api/oss/src/services/evaluators_service.py @@ -1,7 +1,7 @@ import re import json import traceback -from typing import Any, Dict, Union, List, Optional +from typing import Any, Dict, Union, List import litellm import httpx @@ -515,153 +515,6 @@ async def 
auto_ai_critique( ) -import json -import re -from typing import Any, Dict, Iterable, Tuple, Optional - -try: - import jsonpath # ✅ use module API - from jsonpath import JSONPointer # pointer class is fine to use -except Exception: - jsonpath = None - JSONPointer = None - -# ========= Scheme detection ========= - - -def detect_scheme(expr: str) -> str: - """Return 'json-path', 'json-pointer', or 'dot-notation' based on the placeholder prefix.""" - if expr.startswith("$"): - return "json-path" - if expr.startswith("/"): - return "json-pointer" - return "dot-notation" - - -# ========= Resolvers ========= - - -def resolve_dot_notation(expr: str, data: dict) -> object: - if "[" in expr or "]" in expr: - raise KeyError(f"Bracket syntax is not supported in dot-notation: {expr!r}") - - # First, check if the expression exists as a literal key (e.g., "topic.story" as a single key) - # This allows users to use dots in their variable names without nested access - if expr in data: - return data[expr] - - # If not found as a literal key, try to parse as dot-notation path - cur = data - for token in (p for p in expr.split(".") if p): - if isinstance(cur, list) and token.isdigit(): - cur = cur[int(token)] - else: - if not isinstance(cur, dict): - raise KeyError( - f"Cannot access key {token!r} on non-dict while resolving {expr!r}" - ) - if token not in cur: - raise KeyError(f"Missing key {token!r} while resolving {expr!r}") - cur = cur[token] - return cur - - -def resolve_json_path(expr: str, data: dict) -> object: - if jsonpath is None: - raise ImportError("python-jsonpath is required for json-path ($...)") - - if not (expr == "$" or expr.startswith("$.") or expr.startswith("$[")): - raise ValueError( - f"Invalid json-path expression {expr!r}. " - "Must start with '$', '$.' or '$[' (no implicit normalization)." 
- ) - - # Use package-level API - results = jsonpath.findall(expr, data) # always returns a list - return results[0] if len(results) == 1 else results - - -def resolve_json_pointer(expr: str, data: Dict[str, Any]) -> Any: - """Resolve a JSON Pointer; returns a single value.""" - if JSONPointer is None: - raise ImportError("python-jsonpath is required for json-pointer (/...)") - return JSONPointer(expr).resolve(data) - - -def resolve_any(expr: str, data: Dict[str, Any]) -> Any: - """Dispatch to the right resolver based on detected scheme.""" - scheme = detect_scheme(expr) - if scheme == "json-path": - return resolve_json_path(expr, data) - if scheme == "json-pointer": - return resolve_json_pointer(expr, data) - return resolve_dot_notation(expr, data) - - -# ========= Placeholder & coercion helpers ========= - -_PLACEHOLDER_RE = re.compile(r"\{\{\s*(.*?)\s*\}\}") - - -def extract_placeholders(template: str) -> Iterable[str]: - """Yield the inner text of all {{ ... }} occurrences (trimmed).""" - for m in _PLACEHOLDER_RE.finditer(template): - yield m.group(1).strip() - - -def coerce_to_str(value: Any) -> str: - """Pretty stringify values for embedding into templates.""" - if isinstance(value, (dict, list)): - return json.dumps(value, ensure_ascii=False) - return str(value) - - -def build_replacements( - placeholders: Iterable[str], data: Dict[str, Any] -) -> Tuple[Dict[str, str], set]: - """ - Resolve all placeholders against data. - Returns (replacements, unresolved_placeholders).
- """ - replacements: Dict[str, str] = {} - unresolved: set = set() - for expr in set(placeholders): - try: - val = resolve_any(expr, data) - # Escape backslashes to avoid regex replacement surprises - replacements[expr] = coerce_to_str(val).replace("\\", "\\\\") - except Exception: - unresolved.add(expr) - return replacements, unresolved - - -def apply_replacements(template: str, replacements: Dict[str, str]) -> str: - """Replace {{ expr }} using a callback to avoid regex-injection issues.""" - - def _repl(m: re.Match) -> str: - expr = m.group(1).strip() - return replacements.get(expr, m.group(0)) - - return _PLACEHOLDER_RE.sub(_repl, template) - - -def compute_truly_unreplaced(original: set, rendered: str) -> set: - """Only count placeholders that were in the original template and remain.""" - now = set(extract_placeholders(rendered)) - return original & now - - -def missing_lib_hints(unreplaced: set) -> Optional[str]: - """Suggest installing python-jsonpath if placeholders indicate json-path or json-pointer usage.""" - if any(expr.startswith("$") or expr.startswith("/") for expr in unreplaced) and ( - jsonpath is None or JSONPointer is None - ): - return ( - "Install python-jsonpath to enable json-path ($...) 
and json-pointer (/...)" - ) - return None - - def _format_with_template( content: str, format: str, @@ -677,36 +530,41 @@ def _format_with_template( try: return Template(content).render(**kwargs) - except TemplateError: + except TemplateError as e: return content elif format == "curly": - original_placeholders = set(extract_placeholders(content)) + import re - replacements, _unresolved = build_replacements( - original_placeholders, - kwargs, - ) + # Extract variables that exist in the original template before replacement + # This allows us to distinguish template variables from {{}} in user input values + original_variables = set(re.findall(r"\{\{(.*?)\}\}", content)) - result = apply_replacements(content, replacements) + result = content + for key, value in kwargs.items(): + pattern = r"\{\{" + re.escape(key) + r"\}\}" + old_result = result + # Escape backslashes in the replacement string to prevent regex interpretation + escaped_value = str(value).replace("\\", "\\\\") + result = re.sub(pattern, escaped_value, result) - truly_unreplaced = compute_truly_unreplaced(original_placeholders, result) + # Only check if ORIGINAL template variables remain unreplaced + # Don't error on {{}} that came from user input values + unreplaced_matches = set(re.findall(r"\{\{(.*?)\}\}", result)) + truly_unreplaced = original_variables & unreplaced_matches if truly_unreplaced: - hint = missing_lib_hints(truly_unreplaced) - suffix = f" Hint: {hint}" if hint else "" raise ValueError( - f"Template variables not found or unresolved: " - f"{', '.join(sorted(truly_unreplaced))}.{suffix}" + f"Template variables not found in inputs: {', '.join(sorted(truly_unreplaced))}" ) return result - return content except Exception as e: - log.error(f"Error during template formatting: {str(e)}") return content + return content + async def ai_critique(input: EvaluatorInputInterface) -> EvaluatorOutputInterface: openai_api_key = input.credentials.get("OPENAI_API_KEY", None) @@ -721,10 +579,7 @@ async def 
ai_critique(input: EvaluatorInputInterface) -> EvaluatorOutputInterfac ) # Validate prompt variables if there's a prompt in the inputs - if input.settings.get("prompt_template") and input.settings.get("version") not in [ - "3", - "4", - ]: + if input.settings.get("prompt_template") and input.settings.get("version") != "3": try: validate_prompt_variables( prompt=input.settings.get("prompt_template", []), @@ -734,200 +589,6 @@ async def ai_critique(input: EvaluatorInputInterface) -> EvaluatorOutputInterfac raise e if ( - input.settings.get("version") == "4" - ) and ( # this check is used when running in the background (celery) - type(input.settings.get("prompt_template", "")) is not str - ): # this check is used when running in the frontend (since in that case we'll alway have version 2) - try: - parameters = input.settings or dict() - - if not isinstance(parameters, dict): - parameters = dict() - - inputs = input.inputs or None - - if not isinstance(inputs, dict): - inputs = dict() - - outputs = input.inputs.get("prediction") or None - - if "ground_truth" in inputs: - del inputs["ground_truth"] - if "prediction" in inputs: - del inputs["prediction"] - - # ---------------------------------------------------------------- # - - correct_answer_key = parameters.get("correct_answer_key") - - prompt_template: List = parameters.get("prompt_template") or list() - - template_version = parameters.get("version") or "3" - - default_format = "fstring" if template_version == "2" else "curly" - - template_format = parameters.get("template_format") or default_format - - response_type = input.settings.get("response_type") or "text" - - json_schema = input.settings.get("json_schema") or None - - json_schema = json_schema if response_type == "json_schema" else None - - response_format = dict(type=response_type) - - if response_type == "json_schema": - response_format["json_schema"] = json_schema - - model = parameters.get("model") or "gpt-4o-mini" - - correct_answer = None - - if 
inputs and isinstance(inputs, dict) and correct_answer_key: - correct_answer = inputs[correct_answer_key] - - secrets = await SecretsManager.retrieve_secrets() - - openai_api_key = None # secrets.get("OPENAI_API_KEY") - anthropic_api_key = None # secrets.get("ANTHROPIC_API_KEY") - openrouter_api_key = None # secrets.get("OPENROUTER_API_KEY") - cohere_api_key = None # secrets.get("COHERE_API_KEY") - azure_api_key = None # secrets.get("AZURE_API_KEY") - groq_api_key = None # secrets.get("GROQ_API_KEY") - - for secret in secrets: - if secret.get("kind") == "provider_key": - secret_data = secret.get("data", {}) - if secret_data.get("kind") == "openai": - provider_data = secret_data.get("provider", {}) - openai_api_key = provider_data.get("key") or openai_api_key - if secret_data.get("kind") == "anthropic": - provider_data = secret_data.get("provider", {}) - anthropic_api_key = ( - provider_data.get("key") or anthropic_api_key - ) - if secret_data.get("kind") == "openrouter": - provider_data = secret_data.get("provider", {}) - openrouter_api_key = ( - provider_data.get("key") or openrouter_api_key - ) - if secret_data.get("kind") == "cohere": - provider_data = secret_data.get("provider", {}) - cohere_api_key = provider_data.get("key") or cohere_api_key - if secret_data.get("kind") == "azure": - provider_data = secret_data.get("provider", {}) - azure_api_key = provider_data.get("key") or azure_api_key - if secret_data.get("kind") == "groq": - provider_data = secret_data.get("provider", {}) - groq_api_key = provider_data.get("key") or groq_api_key - - threshold = parameters.get("threshold") or 0.5 - - score = None - success = None - - litellm.openai_key = openai_api_key - litellm.anthropic_key = anthropic_api_key - litellm.openrouter_key = openrouter_api_key - litellm.cohere_key = cohere_api_key - litellm.azure_key = azure_api_key - litellm.groq_key = groq_api_key - - context: Dict[str, Any] = dict() - - if parameters: - context.update( - **{ - "parameters": parameters, - 
} - ) - - if correct_answer: - context.update( - **{ - "ground_truth": correct_answer, - "correct_answer": correct_answer, - "reference": correct_answer, - } - ) - - if outputs: - context.update( - **{ - "prediction": outputs, - "outputs": outputs, - } - ) - - if inputs: - context.update(**inputs) - context.update( - **{ - "inputs": inputs, - } - ) - - formatted_prompt_template = [ - { - "role": message["role"], - "content": _format_with_template( - content=message["content"], - format=template_format, - kwargs=context, - ), - } - for message in prompt_template - ] - - try: - response = await litellm.acompletion( - model=model, - messages=formatted_prompt_template, - temperature=0.01, - response_format=response_format, - ) - - _outputs = response.choices[0].message.content.strip() # type: ignore - - except litellm.AuthenticationError as e: # type: ignore - e.message = e.message.replace( - "litellm.AuthenticationError: AuthenticationError: ", "" - ) - raise e - - except Exception as e: - raise ValueError(f"AI Critique evaluation failed: {str(e)}") from e - # -------------------------------------------------------------------------- - - try: - _outputs = json.loads(_outputs) - except: - pass - - if isinstance(_outputs, (int, float)): - return EvaluatorOutputInterface( - outputs={ - "score": _outputs, - "success": _outputs >= threshold, - }, - ) - - if isinstance(_outputs, bool): - return EvaluatorOutputInterface( - outputs={ - "success": _outputs, - }, - ) - - if isinstance(_outputs, dict): - return EvaluatorOutputInterface( - outputs=_outputs, - ) - - raise ValueError(f"Could not parse output: {_outputs}") - except Exception as e: - raise RuntimeError(f"Evaluation failed: {str(e)}") - elif ( input.settings.get("version") == "3" ) and ( # this check is used when running in the background (celery) type(input.settings.get("prompt_template", "")) is not str @@ -1067,23 +728,19 @@ async def ai_critique(input: EvaluatorInputInterface) -> EvaluatorOutputInterfac 
messages=formatted_prompt_template, temperature=0.01, ) - outputs = response.choices[0].message.content.strip() - try: - score = float(outputs) - success = score >= threshold + score = response.choices[0].message.content.strip() + + score = float(score) + + success = score >= threshold + + # ---------------------------------------------------------------- # + + return EvaluatorOutputInterface( + outputs={"score": score, "success": success}, + ) - return EvaluatorOutputInterface( - outputs={"score": score, "success": success}, - ) - except ValueError: - # if the output is not a float, we try to extract a float from the text - match = re.search(r"[-+]?\d*\.\d+|\d+", outputs) - if match: - score = float(match.group()) - return EvaluatorOutputInterface(outputs={"score": score}) - else: - raise ValueError(f"Could not parse output as float: {outputs}") except Exception as e: raise RuntimeError(f"Evaluation failed: {str(e)}") elif ( diff --git a/api/oss/src/utils/logging.py b/api/oss/src/utils/logging.py index 67c53d4369..fa424ce74c 100644 --- a/api/oss/src/utils/logging.py +++ b/api/oss/src/utils/logging.py @@ -4,9 +4,18 @@ import logging from typing import Any, Optional +# from datetime import datetime +# from logging.handlers import RotatingFileHandler + import structlog from structlog.typing import EventDict, WrappedLogger, Processor +# from opentelemetry.trace import get_current_span +# from opentelemetry._logs import set_logger_provider +# from opentelemetry.sdk._logs import LoggingHandler, LoggerProvider +# from opentelemetry.sdk._logs.export import BatchLogRecordProcessor +# from opentelemetry.exporter.otlp.proto.http._log_exporter import OTLPLogExporter + from oss.src.utils.env import env @@ -33,6 +42,15 @@ def bound_logger_trace(self, *args, **kwargs): AGENTA_LOG_CONSOLE_ENABLED = env.AGENTA_LOG_CONSOLE_ENABLED AGENTA_LOG_CONSOLE_LEVEL = env.AGENTA_LOG_CONSOLE_LEVEL +# AGENTA_LOG_OTLP_ENABLED = env.AGENTA_LOG_OTLP_ENABLED +# AGENTA_LOG_OTLP_LEVEL = 
env.AGENTA_LOG_OTLP_LEVEL + +# AGENTA_LOG_FILE_ENABLED = env.AGENTA_LOG_FILE_ENABLED +# AGENTA_LOG_FILE_LEVEL = env.AGENTA_LOG_FILE_LEVEL +# AGENTA_LOG_FILE_BASE = env.AGENTA_LOG_FILE_PATH +# LOG_FILE_DATE = datetime.utcnow().strftime("%Y-%m-%d") +# AGENTA_LOG_FILE_PATH = f"{AGENTA_LOG_FILE_BASE}-{LOG_FILE_DATE}.log" + # COLORS LEVEL_COLORS = { "TRACE": "\033[97m", @@ -72,6 +90,15 @@ def process_positional_args(_, __, event_dict: EventDict) -> EventDict: return event_dict +# def add_trace_context(_, __, event_dict: EventDict) -> EventDict: +# span = get_current_span() +# if span and span.get_span_context().is_valid: +# ctx = span.get_span_context() +# event_dict["TraceId"] = format(ctx.trace_id, "032x") +# event_dict["SpanId"] = format(ctx.span_id, "016x") +# return event_dict + + def add_logger_info( logger: WrappedLogger, method_name: str, event_dict: EventDict ) -> EventDict: @@ -88,7 +115,6 @@ def add_logger_info( event_dict["SeverityNumber"] = SEVERITY_NUMBERS.get(level, 9) event_dict["LoggerName"] = logger.name event_dict["MethodName"] = method_name - event_dict["pid"] = os.getpid() return event_dict @@ -103,7 +129,6 @@ def colored_console_renderer() -> Processor: } def render(_, __, event_dict: EventDict) -> str: - pid = event_dict.pop("pid", None) ts = event_dict.pop("Timestamp", "")[:23] + "Z" level = event_dict.pop("level", "INFO") msg = event_dict.pop("event", "") @@ -120,69 +145,102 @@ def render(_, __, event_dict: EventDict) -> str: return render +# def plain_renderer() -> Processor: +# hidden = { +# "SeverityText", +# "SeverityNumber", +# "MethodName", +# "logger_factory", +# "LoggerName", +# "level", +# } + +# def render(_, __, event_dict: EventDict) -> str: +# ts = event_dict.pop("Timestamp", "")[:23] + "Z" +# level = event_dict.get("level", "") +# msg = event_dict.pop("event", "") +# padded = f"[{level:<5}]" +# logger = f"[{event_dict.pop('logger', '')}]" +# extras = " ".join(f"{k}={v}" for k, v in event_dict.items() if k not in hidden) +# return 
f"{ts} {padded} {msg} {logger} {extras}" + +# return render + + +# def json_renderer() -> Processor: +# return structlog.processors.JSONRenderer() + + SHARED_PROCESSORS: list[Processor] = [ structlog.processors.TimeStamper(fmt="iso", utc=True, key="Timestamp"), process_positional_args, + # add_trace_context, add_logger_info, structlog.processors.format_exc_info, structlog.processors.dict_tracebacks, ] -# Guard against double initialization -_LOGGING_CONFIGURED = False +def create_struct_logger( + processors: list[Processor], name: str +) -> structlog.stdlib.BoundLogger: + logger = logging.getLogger(name) + logger.setLevel(TRACE_LEVEL) + return structlog.wrap_logger( + logger, + processors=SHARED_PROCESSORS + processors, + wrapper_class=structlog.stdlib.BoundLogger, + logger_factory=structlog.stdlib.LoggerFactory(), + cache_logger_on_first_use=True, + ) -# ensure no duplicate sinks via root -_root = logging.getLogger() -_root.handlers.clear() -_root.propagate = False # CONFIGURE HANDLERS AND STRUCTLOG LOGGERS +handlers = [] loggers = [] -if AGENTA_LOG_CONSOLE_ENABLED and not _LOGGING_CONFIGURED: - _LOGGING_CONFIGURED = True - - # Create a single handler for console output - console_handler = logging.StreamHandler(sys.stdout) - console_handler.setLevel(getattr(logging, AGENTA_LOG_CONSOLE_LEVEL, TRACE_LEVEL)) - console_handler.setFormatter(logging.Formatter("%(message)s")) - - # Configure the structlog console logger - console_logger = logging.getLogger("agenta_console") - console_logger.handlers.clear() - console_logger.addHandler(console_handler) - console_logger.setLevel(TRACE_LEVEL) - console_logger.propagate = False - - loggers.append( - structlog.wrap_logger( - console_logger, - processors=SHARED_PROCESSORS + [colored_console_renderer()], - wrapper_class=structlog.stdlib.BoundLogger, - logger_factory=structlog.stdlib.LoggerFactory(), - cache_logger_on_first_use=False, # Don't cache to avoid stale state - ) - ) +if AGENTA_LOG_CONSOLE_ENABLED: + h = 
logging.StreamHandler(sys.stdout) + h.setLevel(getattr(logging, AGENTA_LOG_CONSOLE_LEVEL, TRACE_LEVEL)) + h.setFormatter(logging.Formatter("%(message)s")) + + # Console logger (your app logs) + logger = logging.getLogger("console") + logger.handlers.clear() + logger.addHandler(h) + logger.propagate = False + loggers.append(create_struct_logger([colored_console_renderer()], "console")) - # Configure uvicorn/gunicorn loggers with separate handlers + # Gunicorn/Uvicorn loggers for name in ("uvicorn.access", "uvicorn.error", "gunicorn.error"): - uh = logging.StreamHandler(sys.stdout) - uh.setLevel(getattr(logging, AGENTA_LOG_CONSOLE_LEVEL, TRACE_LEVEL)) - uh.setFormatter(logging.Formatter("%(message)s")) - server_logger = logging.getLogger(name) - server_logger.handlers.clear() - server_logger.setLevel(logging.INFO) - server_logger.addHandler(uh) - server_logger.propagate = False - - # Intercept agenta SDK loggers to prevent duplicate output - for sdk_name in ("agenta", "agenta.sdk"): - sdk_logger = logging.getLogger(sdk_name) - sdk_logger.handlers.clear() - sdk_logger.addHandler(console_handler) # Use our handler - sdk_logger.setLevel(logging.INFO) - sdk_logger.propagate = False + gunicorn_logger = logging.getLogger(name) + gunicorn_logger.handlers.clear() # ✅ fix here + gunicorn_logger.setLevel(logging.INFO) + gunicorn_logger.addHandler(h) + gunicorn_logger.propagate = False + +# if AGENTA_LOG_FILE_ENABLED: +# h = RotatingFileHandler(AGENTA_LOG_FILE_PATH, maxBytes=10 * 1024 * 1024, backupCount=5) +# h.setLevel(getattr(logging, AGENTA_LOG_FILE_LEVEL, logging.WARNING)) +# h.setFormatter(logging.Formatter("%(message)s")) +# logger = logging.getLogger("file") +# logger.addHandler(h) +# logger.propagate = False # 👈 PREVENT propagation to root (avoids Celery duplicate) +# loggers.append(create_struct_logger([plain_renderer()], "file")) + +# if AGENTA_LOG_OTLP_ENABLED: +# provider = LoggerProvider() +# exporter = OTLPLogExporter() +# 
provider.add_log_record_processor(BatchLogRecordProcessor(exporter)) +# set_logger_provider(provider) +# h = LoggingHandler( +# level=getattr(logging, AGENTA_LOG_OTLP_LEVEL, logging.INFO), logger_provider=provider +# ) +# h.setFormatter(logging.Formatter("%(message)s")) +# logger = logging.getLogger("otel") +# logger.addHandler(h) +# logger.propagate = False # 👈 PREVENT propagation to root (avo +# loggers.append(create_struct_logger([json_renderer()], "otel")) class MultiLogger: @@ -221,8 +279,11 @@ def bind(self, **kwargs): return MultiLogger(*(l.bind(**kwargs) for l in self._loggers)) +multi_logger = MultiLogger(*loggers) + + def get_logger(name: Optional[str] = None) -> MultiLogger: - return MultiLogger(*loggers).bind(logger=name) + return multi_logger.bind(logger=name) def get_module_logger(path: str) -> MultiLogger: diff --git a/api/poetry.lock b/api/poetry.lock index 737d1cb26b..7844e01e97 100644 --- a/api/poetry.lock +++ b/api/poetry.lock @@ -1347,15 +1347,15 @@ files = [ [[package]] name = "fsspec" -version = "2025.10.0" +version = "2025.9.0" description = "File-system specification" optional = false python-versions = ">=3.9" groups = ["main"] markers = "python_version == \"3.11\" or python_version >= \"3.12\"" files = [ - {file = "fsspec-2025.10.0-py3-none-any.whl", hash = "sha256:7c7712353ae7d875407f97715f0e1ffcc21e33d5b24556cb1e090ae9409ec61d"}, - {file = "fsspec-2025.10.0.tar.gz", hash = "sha256:b6789427626f068f9a83ca4e8a3cc050850b6c0f71f99ddb4f542b8266a26a59"}, + {file = "fsspec-2025.9.0-py3-none-any.whl", hash = "sha256:530dc2a2af60a414a832059574df4a6e10cce927f6f4a78209390fe38955cfb7"}, + {file = "fsspec-2025.9.0.tar.gz", hash = "sha256:19fd429483d25d28b65ec68f9f4adc16c17ea2c7c7bf54ec61360d478fb19c19"}, ] [package.extras] @@ -1401,15 +1401,15 @@ files = [ [[package]] name = "google-auth" -version = "2.42.1" +version = "2.42.0" description = "Google Authentication Library" optional = false python-versions = ">=3.7" groups = ["main"] markers = 
"python_version == \"3.11\" or python_version >= \"3.12\"" files = [ - {file = "google_auth-2.42.1-py2.py3-none-any.whl", hash = "sha256:eb73d71c91fc95dbd221a2eb87477c278a355e7367a35c0d84e6b0e5f9b4ad11"}, - {file = "google_auth-2.42.1.tar.gz", hash = "sha256:30178b7a21aa50bffbdc1ffcb34ff770a2f65c712170ecd5446c4bef4dc2b94e"}, + {file = "google_auth-2.42.0-py2.py3-none-any.whl", hash = "sha256:f8f944bcb9723339b0ef58a73840f3c61bc91b69bf7368464906120b55804473"}, + {file = "google_auth-2.42.0.tar.gz", hash = "sha256:9bbbeef3442586effb124d1ca032cfb8fb7acd8754ab79b55facd2b8f3ab2802"}, ] [package.dependencies] @@ -3517,22 +3517,6 @@ files = [ {file = "python_http_client-3.3.7.tar.gz", hash = "sha256:bf841ee45262747e00dec7ee9971dfb8c7d83083f5713596488d67739170cea0"}, ] -[[package]] -name = "python-jsonpath" -version = "2.0.1" -description = "JSONPath, JSON Pointer and JSON Patch for Python." -optional = false -python-versions = ">=3.8" -groups = ["main"] -markers = "python_version == \"3.11\" or python_version >= \"3.12\"" -files = [ - {file = "python_jsonpath-2.0.1-py3-none-any.whl", hash = "sha256:ebd518b7c883acc5b976518d76b6c96288405edec7d9ef838641869c1e1a5eb7"}, - {file = "python_jsonpath-2.0.1.tar.gz", hash = "sha256:32a84ebb2dc0ec1b42a6e165b0f9174aef8310bad29154ad9aee31ac37cca18f"}, -] - -[package.extras] -strict = ["iregexp-check (>=0.1.4)", "regex"] - [[package]] name = "python-multipart" version = "0.0.20" @@ -3638,102 +3622,105 @@ files = [ [[package]] name = "rapidfuzz" -version = "3.14.2" +version = "3.14.1" description = "rapid fuzzy string matching" optional = false python-versions = ">=3.10" groups = ["main"] markers = "python_version == \"3.11\" or python_version >= \"3.12\"" files = [ - {file = "rapidfuzz-3.14.2-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:37ddc4cc3eafe29ec8ba451fcec5244af441eeb53b4e7b4d1d886cd3ff3624f4"}, - {file = "rapidfuzz-3.14.2-cp310-cp310-macosx_11_0_arm64.whl", hash = 
"sha256:654be63b17f3da8414968dfdf15c46c8205960ec8508cbb9d837347bf036dc0b"}, - {file = "rapidfuzz-3.14.2-cp310-cp310-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:75866e9fa474ccfe6b77367fb7c10e6f9754fb910d9b110490a6fad25501a039"}, - {file = "rapidfuzz-3.14.2-cp310-cp310-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:fd915693a8d441e5f277bef23065275a2bb492724b5ccf64e38e60edd702b0fb"}, - {file = "rapidfuzz-3.14.2-cp310-cp310-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:e702e76a6166bff466a33888902404209fffd83740d24918ef74514542f66367"}, - {file = "rapidfuzz-3.14.2-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:78f84592f3a2f2773d6f411b755d683b1ce7f05adff4c12c0de923d5f2786e51"}, - {file = "rapidfuzz-3.14.2-cp310-cp310-manylinux_2_31_armv7l.whl", hash = "sha256:36d43c9f1b88322ad05b22fa80b6b4a95d2b193d392d3aa7bee652c144cfb1d9"}, - {file = "rapidfuzz-3.14.2-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:69d6f93916717314209f4e8701d203876baeadf8c9dcaee961b8afeba7435643"}, - {file = "rapidfuzz-3.14.2-cp310-cp310-musllinux_1_2_armv7l.whl", hash = "sha256:e262958d3ca723c1ce32030384a1626e3d43ba7465e01a3e2b633f4300956150"}, - {file = "rapidfuzz-3.14.2-cp310-cp310-musllinux_1_2_ppc64le.whl", hash = "sha256:26b5e6e0d39337431ab1b36faf604873cb1f0de9280e0703f61c6753c8fa1f7f"}, - {file = "rapidfuzz-3.14.2-cp310-cp310-musllinux_1_2_s390x.whl", hash = "sha256:2aad09712e1ffbc00ac25f12646c7065b84496af7cd0a70b1d5aff6318405732"}, - {file = "rapidfuzz-3.14.2-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:f10dbbafa3decee704b7a02ffe7914d7dfbbd3d1fce7f37ed2c3d6c3a7c9a8e6"}, - {file = "rapidfuzz-3.14.2-cp310-cp310-win32.whl", hash = "sha256:6c3dab8f9d4271e32c8746461a58412871ebb07654f77aa6121961e796482d30"}, - {file = "rapidfuzz-3.14.2-cp310-cp310-win_amd64.whl", hash = "sha256:5386ce287e5b71db4fd71747a23ae0ca5053012dc959049e160857c5fdadf6cd"}, - {file = 
"rapidfuzz-3.14.2-cp310-cp310-win_arm64.whl", hash = "sha256:c78d6f205b871f2d41173f82ded66bcef2f692e1b90c0f627cc8035b72898f35"}, - {file = "rapidfuzz-3.14.2-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:3969670d4b85e589564d6a75638ec2372a4375b7e68e747f3bd37b507cf843e4"}, - {file = "rapidfuzz-3.14.2-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:061884b23a8c5eea9443e52acf02cbd533aff93a5439b0e90b5586a0638b8720"}, - {file = "rapidfuzz-3.14.2-cp311-cp311-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:6fc2bc48a219c171deb8529bfcc90ca6663fbcaa42b54ef202858976078f858a"}, - {file = "rapidfuzz-3.14.2-cp311-cp311-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:cfa62729ac2d77a50a240b6331e9fffb5e070625e97e8f7e50fa882b3ea396ad"}, - {file = "rapidfuzz-3.14.2-cp311-cp311-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:2d001aaf47a500083b189140df16eaefd675bf06c818a71ae9f687b0d6f804f8"}, - {file = "rapidfuzz-3.14.2-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c95eeaa7f2a990757826aa34e7375b50d49172da5ca7536dc461b1d197e0de9b"}, - {file = "rapidfuzz-3.14.2-cp311-cp311-manylinux_2_31_armv7l.whl", hash = "sha256:30af5e015462f89408d7b3bbdd614c739adc386e3d47bd565b53ffb670266021"}, - {file = "rapidfuzz-3.14.2-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:35f12b07d58b932ef95b5f66b40c9efc60c5201bccd3c5ddde4a87df19d0aba8"}, - {file = "rapidfuzz-3.14.2-cp311-cp311-musllinux_1_2_armv7l.whl", hash = "sha256:0aa67110e016d2cdce3e5a3330d09fb1dba3cf83350f6eb46a6b9276cbafd094"}, - {file = "rapidfuzz-3.14.2-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:b13dc4743a5d222600d98fb4a0345e910829ef4f286e81b34349627355884c87"}, - {file = "rapidfuzz-3.14.2-cp311-cp311-musllinux_1_2_s390x.whl", hash = "sha256:b16c40709f22c8fc16ca49a5484a468fe0a95f08f29c68043f46f8771e2c37e2"}, - {file = "rapidfuzz-3.14.2-cp311-cp311-musllinux_1_2_x86_64.whl", hash = 
"sha256:ac2bd7c74523f952a66536f72b3f68260427e2a6954f1f03d758f01bbbf60564"}, - {file = "rapidfuzz-3.14.2-cp311-cp311-win32.whl", hash = "sha256:37d7045dc0ab4cab49d7cca66b651b44939e18e098a2f55466082e173b1aa452"}, - {file = "rapidfuzz-3.14.2-cp311-cp311-win_amd64.whl", hash = "sha256:9a55ff35536662028563f22e0eadab47c7e94c8798239fe25d3ceca5ab156fd8"}, - {file = "rapidfuzz-3.14.2-cp311-cp311-win_arm64.whl", hash = "sha256:b2f0e1310f7cb1c0c0033987d0a0e85b4fd51a1c4882f556f082687d519f045d"}, - {file = "rapidfuzz-3.14.2-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:0418f6ac1da7adf7e6e469876508f63168e80d3265a9e7ab9a2e999020577bfa"}, - {file = "rapidfuzz-3.14.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:f6028090b49015fc9ff0df3c06751078fe300a291e933a378a7c37b78c4d6a3e"}, - {file = "rapidfuzz-3.14.2-cp312-cp312-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:21aa299985d1bbdb3ccf8a8214e7daee72bb7e8c8fb25a520f015dc200a57816"}, - {file = "rapidfuzz-3.14.2-cp312-cp312-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:e247612909876f36e6132265deef34efcaaf490e1857022204b206ff76578076"}, - {file = "rapidfuzz-3.14.2-cp312-cp312-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:9cf077475cd4118a5b846a72749d54b520243be6baddba1dd1446f3b1dbab29c"}, - {file = "rapidfuzz-3.14.2-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a5e7e02fb51f9a78e32f4fb8b5546d543e1fb637409cb682a6b8cb12e0c3015c"}, - {file = "rapidfuzz-3.14.2-cp312-cp312-manylinux_2_31_armv7l.whl", hash = "sha256:b1febabf4a4a664a2b6025830d93d7703f1cd9dcbe656ed7159053091b4d9389"}, - {file = "rapidfuzz-3.14.2-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:766d133f11888c48497f26a1722afc697a5fbad05bbfec3a41a4bc04fd21af9d"}, - {file = "rapidfuzz-3.14.2-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:2a851a7c6660b6e47723378ca7692cd42700660a8783e4e7d07254a984d63ec8"}, - {file = "rapidfuzz-3.14.2-cp312-cp312-musllinux_1_2_ppc64le.whl", 
hash = "sha256:686594bd7f7132cb85900a4cc910e9acb9d39466412b8a275f3d4bc37faba23c"}, - {file = "rapidfuzz-3.14.2-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:e1d412122de3c5c492acfcde020f543b9b529e2eb115f875e2fd7470e44ab441"}, - {file = "rapidfuzz-3.14.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:2611b1f6464dddf900bffeee2aa29a9aa1039317cbb226e18d3a5f029d4cf303"}, - {file = "rapidfuzz-3.14.2-cp312-cp312-win32.whl", hash = "sha256:e6968b6db188fbb4c7a18aac25e075940a8204434a2a0d6bddb0a695d7f0c898"}, - {file = "rapidfuzz-3.14.2-cp312-cp312-win_amd64.whl", hash = "sha256:1a6d43683c04ffb4270bb1498951a39e9c200eb326f933fd5d608c19485049b8"}, - {file = "rapidfuzz-3.14.2-cp312-cp312-win_arm64.whl", hash = "sha256:4ecd3ab9aebb17becb462eac19151bd143abc614e3d2a0351a72171371ac3f4b"}, - {file = "rapidfuzz-3.14.2-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:f1f5a2566af7409d11f11b0b4e9f76a0ac64577737b821c64a2a6afc971c1c25"}, - {file = "rapidfuzz-3.14.2-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:810863f3a98d09392e5fb481aef9d82597df6ee06f7f11ceafe6077585c4e018"}, - {file = "rapidfuzz-3.14.2-cp313-cp313-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:2e8c0d16c0724dab7c7dc4099c1ec410679b2d11c1650b069d15d4ab4370f1cc"}, - {file = "rapidfuzz-3.14.2-cp313-cp313-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:004f04356d84660feffbf8c26975cb0db0e010b2225d6e21b3d84dd8df764652"}, - {file = "rapidfuzz-3.14.2-cp313-cp313-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:b3c2aea6b1db03a8abd62bb157161d7a65b896c9f85d5efc2f1bb444a107c47a"}, - {file = "rapidfuzz-3.14.2-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:8bef63704b7851ad1adf5d7ceb7f1b3136b78ee0b34240c14ab85ea775f6caa7"}, - {file = "rapidfuzz-3.14.2-cp313-cp313-manylinux_2_31_armv7l.whl", hash = "sha256:52e8e37566313ac60bfa80754c4c0367eec65b3ef52bb8cc409b88e878b03182"}, - {file = 
"rapidfuzz-3.14.2-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:b3fad0fb5ac44944ad8f81e729ec45f65a85efb7d7ea4cf67343799c0ea9874b"}, - {file = "rapidfuzz-3.14.2-cp313-cp313-musllinux_1_2_armv7l.whl", hash = "sha256:d027842a956b86aa9706b836c48186da405413d03957afaccda2fbe414bc3912"}, - {file = "rapidfuzz-3.14.2-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:27dcb45427b1966fb43c904d19c841c3e6da147931959cf05388ecef9c5a1e8d"}, - {file = "rapidfuzz-3.14.2-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:1aab0676884e91282817b5710933efc4ea9466d2ba5703b5a7541468695d807a"}, - {file = "rapidfuzz-3.14.2-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:ef36c21ecb7f4bad7e4e119fe746a787ad684eaf1c383c17a2aff5d75b20fa58"}, - {file = "rapidfuzz-3.14.2-cp313-cp313-win32.whl", hash = "sha256:ed3af4fa0dbd6d1964f171ac6fff82ed9e76c737eb34ae3daf926c4aefc2ce9b"}, - {file = "rapidfuzz-3.14.2-cp313-cp313-win_amd64.whl", hash = "sha256:3fc2e7c3ab006299366b1c8256e452f00eb1659d0e4790b140633627c7d947b7"}, - {file = "rapidfuzz-3.14.2-cp313-cp313-win_arm64.whl", hash = "sha256:def48d5010ddcd2a80b44f14bf0172c29bfc27906d13c0ea69a6e3c00e6f225c"}, - {file = "rapidfuzz-3.14.2-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:a39952b8e033758ee15b2de48a5b0689c83ea6bd93c8df3635f2fbf21e52fd25"}, - {file = "rapidfuzz-3.14.2-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:f786811555869b5961b3718b007179e87d73c47414afee5fb882ae1b9b174c0c"}, - {file = "rapidfuzz-3.14.2-cp313-cp313t-win32.whl", hash = "sha256:6c0a25490a99c4b73f1deca3efae004df5f2b254760d98cac8d93becf41260d4"}, - {file = "rapidfuzz-3.14.2-cp313-cp313t-win_amd64.whl", hash = "sha256:e5af2dab8ec5a180d9ff24fbb5b25e589848b93cccb755eceb0bf0e3cfed7e5c"}, - {file = "rapidfuzz-3.14.2-cp313-cp313t-win_arm64.whl", hash = "sha256:8cf2aefb0d246d540ea83b4648db690bd7e25d34a7c23c5f250dcba2e4989192"}, - {file = "rapidfuzz-3.14.2-cp314-cp314-macosx_10_15_x86_64.whl", hash = 
"sha256:ace3a6b108679888833cdceea9a6231e406db202b8336eaf68279fe71a1d2ac4"}, - {file = "rapidfuzz-3.14.2-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:32c7cc978447202ba592e197228767b230d85e52e5ef229e2b22e51c8e3d06ad"}, - {file = "rapidfuzz-3.14.2-cp314-cp314-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5a479a824cbf6a646bcec1c34fbbfb85393d03eb2811657e3a6536298d435f76"}, - {file = "rapidfuzz-3.14.2-cp314-cp314-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:3a3bc0c8b65dcd1e55a1cc42a7c7b34e93ad5d4bd1501dc998f4625042e1b110"}, - {file = "rapidfuzz-3.14.2-cp314-cp314-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:217b46bf096818df16c0e2c43202aa8352e67c4379b1d5f25e98c5d1c7f5414d"}, - {file = "rapidfuzz-3.14.2-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:07d3e8afeeb81044873644e505e56ba06d8bdcc291ef7e26ac0f54c58309267d"}, - {file = "rapidfuzz-3.14.2-cp314-cp314-manylinux_2_31_armv7l.whl", hash = "sha256:b7832c8707bfa4f9b081def64aa49954d4813cff7fc9ff4a0b184a4e8697147f"}, - {file = "rapidfuzz-3.14.2-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:35581ba6981e016333063c52719c0b0b1bef0f944e641ad0f4ea34e0b39161f3"}, - {file = "rapidfuzz-3.14.2-cp314-cp314-musllinux_1_2_armv7l.whl", hash = "sha256:fbd5152169dc3f6c894c24fc04813f50bf9b929d137f2b965ac926e03329ceba"}, - {file = "rapidfuzz-3.14.2-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:98a119c3f9b152e9b62ec43520392669bd8deae9df269f30569f1c87bf6055a4"}, - {file = "rapidfuzz-3.14.2-cp314-cp314-musllinux_1_2_s390x.whl", hash = "sha256:9e84164e7a68f9c3523c5d104dda6601202b39bae0aac1b73a4f119d387275c4"}, - {file = "rapidfuzz-3.14.2-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:64c67402b86a073666f92c2807811e3817a17fedfe505fe89a9f93eea264481c"}, - {file = "rapidfuzz-3.14.2-cp314-cp314-win32.whl", hash = "sha256:58d79f4df3e4332b31e671f9487f0c215856cf1f2d9ac3848ac10c27262fd723"}, - {file = 
"rapidfuzz-3.14.2-cp314-cp314-win_amd64.whl", hash = "sha256:dc6fe7a27ad9e233c155e89b7e1d9b6d13963e3261ea5b30f3e79c3556c49bc9"}, - {file = "rapidfuzz-3.14.2-cp314-cp314-win_arm64.whl", hash = "sha256:bb4e96d80de7e6364850a2e168e899b8e85ab80ce19827cc4fbe0aa3c57f8124"}, - {file = "rapidfuzz-3.14.2-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:c7d4d0927a6b1ef2529a8cc57adf2ce965f7aaef324a4d1ae826d0de43ab4f82"}, - {file = "rapidfuzz-3.14.2-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:c0fae06e7fb4be18e86eb51e77f0d441975a3ba9ef963f957d750a2a41536ba1"}, - {file = "rapidfuzz-3.14.2-cp314-cp314t-win32.whl", hash = "sha256:d1d3ef72665d460b7b3e61d3dff4341a195dcb3250b4471eef71db23fca2d91a"}, - {file = "rapidfuzz-3.14.2-cp314-cp314t-win_amd64.whl", hash = "sha256:3a0960c5c11a34e8129a3062f1b1cbb371fad364e2195ebe46a88a9d5eeec0f1"}, - {file = "rapidfuzz-3.14.2-cp314-cp314t-win_arm64.whl", hash = "sha256:ed29600e55d7df104d5778d499678c305e32e3ccfa873489a7c8304489c5f8f3"}, - {file = "rapidfuzz-3.14.2-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:172630396d8bdbb5ea1a58e82afc489c8e18076e1f2b2edea20cb30f8926325a"}, - {file = "rapidfuzz-3.14.2-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:6cff0d6749fac8dd7fdf26d0604d8a47c5ee786061972077d71ec7ac0fb7ced2"}, - {file = "rapidfuzz-3.14.2-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:f558bc2ee3a0bb5d7238ed10a0b76455f2d28c97e93564a1f7855cea4096ef1c"}, - {file = "rapidfuzz-3.14.2.tar.gz", hash = "sha256:69bf91e66aeb84a104aea35e1b3f6b3aa606faaee6db1cfc76950f2a6a828a12"}, + {file = "rapidfuzz-3.14.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:489440e4b5eea0d150a31076eb183bed0ec84f934df206c72ae4fc3424501758"}, + {file = "rapidfuzz-3.14.1-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:eff22cc938c3f74d194df03790a6c3325d213b28cf65cdefd6fdeae759b745d5"}, + {file = "rapidfuzz-3.14.1-cp310-cp310-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = 
"sha256:e0307f018b16feaa36074bcec2496f6f120af151a098910296e72e233232a62f"}, + {file = "rapidfuzz-3.14.1-cp310-cp310-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:bc133652da143aca1ab72de235446432888b2b7f44ee332d006f8207967ecb8a"}, + {file = "rapidfuzz-3.14.1-cp310-cp310-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:e9e71b3fe7e4a1590843389a90fe2a8684649fc74b9b7446e17ee504ddddb7de"}, + {file = "rapidfuzz-3.14.1-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:6c51519eb2f20b52eba6fc7d857ae94acc6c2a1f5d0f2d794b9d4977cdc29dd7"}, + {file = "rapidfuzz-3.14.1-cp310-cp310-manylinux_2_31_armv7l.whl", hash = "sha256:fe87d94602624f8f25fff9a0a7b47f33756c4d9fc32b6d3308bb142aa483b8a4"}, + {file = "rapidfuzz-3.14.1-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:2d665380503a575dda52eb712ea521f789e8f8fd629c7a8e6c0f8ff480febc78"}, + {file = "rapidfuzz-3.14.1-cp310-cp310-musllinux_1_2_armv7l.whl", hash = "sha256:c0f0dd022b8a7cbf3c891f6de96a80ab6a426f1069a085327816cea749e096c2"}, + {file = "rapidfuzz-3.14.1-cp310-cp310-musllinux_1_2_ppc64le.whl", hash = "sha256:bf1ba22d36858b265c95cd774ba7fe8991e80a99cd86fe4f388605b01aee81a3"}, + {file = "rapidfuzz-3.14.1-cp310-cp310-musllinux_1_2_s390x.whl", hash = "sha256:ca1c1494ac9f9386d37f0e50cbaf4d07d184903aed7691549df1b37e9616edc9"}, + {file = "rapidfuzz-3.14.1-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:9e4b12e921b0fa90d7c2248742a536f21eae5562174090b83edd0b4ab8b557d7"}, + {file = "rapidfuzz-3.14.1-cp310-cp310-win32.whl", hash = "sha256:5e1c1f2292baa4049535b07e9e81feb29e3650d2ba35ee491e64aca7ae4cb15e"}, + {file = "rapidfuzz-3.14.1-cp310-cp310-win_amd64.whl", hash = "sha256:59a8694beb9a13c4090ab3d1712cabbd896c6949706d1364e2a2e1713c413760"}, + {file = "rapidfuzz-3.14.1-cp310-cp310-win_arm64.whl", hash = "sha256:e94cee93faa792572c574a615abe12912124b4ffcf55876b72312914ab663345"}, + {file = "rapidfuzz-3.14.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = 
"sha256:4d976701060886a791c8a9260b1d4139d14c1f1e9a6ab6116b45a1acf3baff67"}, + {file = "rapidfuzz-3.14.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:5e6ba7e6eb2ab03870dcab441d707513db0b4264c12fba7b703e90e8b4296df2"}, + {file = "rapidfuzz-3.14.1-cp311-cp311-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:1e532bf46de5fd3a1efde73a16a4d231d011bce401c72abe3c6ecf9de681003f"}, + {file = "rapidfuzz-3.14.1-cp311-cp311-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:f9b6a6fb8ed9b951e5f3b82c1ce6b1665308ec1a0da87f799b16e24fc59e4662"}, + {file = "rapidfuzz-3.14.1-cp311-cp311-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:5b6ac3f9810949caef0e63380b11a3c32a92f26bacb9ced5e32c33560fcdf8d1"}, + {file = "rapidfuzz-3.14.1-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e52e4c34fd567f77513e886b66029c1ae02f094380d10eba18ba1c68a46d8b90"}, + {file = "rapidfuzz-3.14.1-cp311-cp311-manylinux_2_31_armv7l.whl", hash = "sha256:2ef72e41b1a110149f25b14637f1cedea6df192462120bea3433980fe9d8ac05"}, + {file = "rapidfuzz-3.14.1-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:fb654a35b373d712a6b0aa2a496b2b5cdd9d32410cfbaecc402d7424a90ba72a"}, + {file = "rapidfuzz-3.14.1-cp311-cp311-musllinux_1_2_armv7l.whl", hash = "sha256:2b2c12e5b9eb8fe9a51b92fe69e9ca362c0970e960268188a6d295e1dec91e6d"}, + {file = "rapidfuzz-3.14.1-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:4f069dec5c450bd987481e752f0a9979e8fdf8e21e5307f5058f5c4bb162fa56"}, + {file = "rapidfuzz-3.14.1-cp311-cp311-musllinux_1_2_s390x.whl", hash = "sha256:4d0d9163725b7ad37a8c46988cae9ebab255984db95ad01bf1987ceb9e3058dd"}, + {file = "rapidfuzz-3.14.1-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:db656884b20b213d846f6bc990c053d1f4a60e6d4357f7211775b02092784ca1"}, + {file = "rapidfuzz-3.14.1-cp311-cp311-win32.whl", hash = "sha256:4b42f7b9c58cbcfbfaddc5a6278b4ca3b6cd8983e7fd6af70ca791dff7105fb9"}, + {file = 
"rapidfuzz-3.14.1-cp311-cp311-win_amd64.whl", hash = "sha256:e5847f30d7d4edefe0cb37294d956d3495dd127c1c56e9128af3c2258a520bb4"}, + {file = "rapidfuzz-3.14.1-cp311-cp311-win_arm64.whl", hash = "sha256:5087d8ad453092d80c042a08919b1cb20c8ad6047d772dc9312acd834da00f75"}, + {file = "rapidfuzz-3.14.1-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:809515194f628004aac1b1b280c3734c5ea0ccbd45938c9c9656a23ae8b8f553"}, + {file = "rapidfuzz-3.14.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:0afcf2d6cb633d0d4260d8df6a40de2d9c93e9546e2c6b317ab03f89aa120ad7"}, + {file = "rapidfuzz-3.14.1-cp312-cp312-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5c1c3d07d53dcafee10599da8988d2b1f39df236aee501ecbd617bd883454fcd"}, + {file = "rapidfuzz-3.14.1-cp312-cp312-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:6e9ee3e1eb0a027717ee72fe34dc9ac5b3e58119f1bd8dd15bc19ed54ae3e62b"}, + {file = "rapidfuzz-3.14.1-cp312-cp312-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:70c845b64a033a20c44ed26bc890eeb851215148cc3e696499f5f65529afb6cb"}, + {file = "rapidfuzz-3.14.1-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:26db0e815213d04234298dea0d884d92b9cb8d4ba954cab7cf67a35853128a33"}, + {file = "rapidfuzz-3.14.1-cp312-cp312-manylinux_2_31_armv7l.whl", hash = "sha256:6ad3395a416f8b126ff11c788531f157c7debeb626f9d897c153ff8980da10fb"}, + {file = "rapidfuzz-3.14.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:61c5b9ab6f730e6478aa2def566223712d121c6f69a94c7cc002044799442afd"}, + {file = "rapidfuzz-3.14.1-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:13e0ea3d0c533969158727d1bb7a08c2cc9a816ab83f8f0dcfde7e38938ce3e6"}, + {file = "rapidfuzz-3.14.1-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:6325ca435b99f4001aac919ab8922ac464999b100173317defb83eae34e82139"}, + {file = "rapidfuzz-3.14.1-cp312-cp312-musllinux_1_2_s390x.whl", hash = 
"sha256:07a9fad3247e68798424bdc116c1094e88ecfabc17b29edf42a777520347648e"}, + {file = "rapidfuzz-3.14.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:f8ff5dbe78db0a10c1f916368e21d328935896240f71f721e073cf6c4c8cdedd"}, + {file = "rapidfuzz-3.14.1-cp312-cp312-win32.whl", hash = "sha256:9c83270e44a6ae7a39fc1d7e72a27486bccc1fa5f34e01572b1b90b019e6b566"}, + {file = "rapidfuzz-3.14.1-cp312-cp312-win_amd64.whl", hash = "sha256:e06664c7fdb51c708e082df08a6888fce4c5c416d7e3cc2fa66dd80eb76a149d"}, + {file = "rapidfuzz-3.14.1-cp312-cp312-win_arm64.whl", hash = "sha256:6c7c26025f7934a169a23dafea6807cfc3fb556f1dd49229faf2171e5d8101cc"}, + {file = "rapidfuzz-3.14.1-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:8d69f470d63ee824132ecd80b1974e1d15dd9df5193916901d7860cef081a260"}, + {file = "rapidfuzz-3.14.1-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:6f571d20152fc4833b7b5e781b36d5e4f31f3b5a596a3d53cf66a1bd4436b4f4"}, + {file = "rapidfuzz-3.14.1-cp313-cp313-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:61d77e09b2b6bc38228f53b9ea7972a00722a14a6048be9a3672fb5cb08bad3a"}, + {file = "rapidfuzz-3.14.1-cp313-cp313-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:8b41d95ef86a6295d353dc3bb6c80550665ba2c3bef3a9feab46074d12a9af8f"}, + {file = "rapidfuzz-3.14.1-cp313-cp313-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:0591df2e856ad583644b40a2b99fb522f93543c65e64b771241dda6d1cfdc96b"}, + {file = "rapidfuzz-3.14.1-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:f277801f55b2f3923ef2de51ab94689a0671a4524bf7b611de979f308a54cd6f"}, + {file = "rapidfuzz-3.14.1-cp313-cp313-manylinux_2_31_armv7l.whl", hash = "sha256:893fdfd4f66ebb67f33da89eb1bd1674b7b30442fdee84db87f6cb9074bf0ce9"}, + {file = "rapidfuzz-3.14.1-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:fe2651258c1f1afa9b66f44bf82f639d5f83034f9804877a1bbbae2120539ad1"}, + {file = "rapidfuzz-3.14.1-cp313-cp313-musllinux_1_2_armv7l.whl", 
hash = "sha256:ace21f7a78519d8e889b1240489cd021c5355c496cb151b479b741a4c27f0a25"}, + {file = "rapidfuzz-3.14.1-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:cb5acf24590bc5e57027283b015950d713f9e4d155fda5cfa71adef3b3a84502"}, + {file = "rapidfuzz-3.14.1-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:67ea46fa8cc78174bad09d66b9a4b98d3068e85de677e3c71ed931a1de28171f"}, + {file = "rapidfuzz-3.14.1-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:44e741d785de57d1a7bae03599c1cbc7335d0b060a35e60c44c382566e22782e"}, + {file = "rapidfuzz-3.14.1-cp313-cp313-win32.whl", hash = "sha256:b1fe6001baa9fa36bcb565e24e88830718f6c90896b91ceffcb48881e3adddbc"}, + {file = "rapidfuzz-3.14.1-cp313-cp313-win_amd64.whl", hash = "sha256:83b8cc6336709fa5db0579189bfd125df280a554af544b2dc1c7da9cdad7e44d"}, + {file = "rapidfuzz-3.14.1-cp313-cp313-win_arm64.whl", hash = "sha256:cf75769662eadf5f9bd24e865c19e5ca7718e879273dce4e7b3b5824c4da0eb4"}, + {file = "rapidfuzz-3.14.1-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:d937dbeda71c921ef6537c6d41a84f1b8112f107589c9977059de57a1d726dd6"}, + {file = "rapidfuzz-3.14.1-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:7a2d80cc1a4fcc7e259ed4f505e70b36433a63fa251f1bb69ff279fe376c5efd"}, + {file = "rapidfuzz-3.14.1-cp313-cp313t-win32.whl", hash = "sha256:40875e0c06f1a388f1cab3885744f847b557e0b1642dfc31ff02039f9f0823ef"}, + {file = "rapidfuzz-3.14.1-cp313-cp313t-win_amd64.whl", hash = "sha256:876dc0c15552f3d704d7fb8d61bdffc872ff63bedf683568d6faad32e51bbce8"}, + {file = "rapidfuzz-3.14.1-cp313-cp313t-win_arm64.whl", hash = "sha256:61458e83b0b3e2abc3391d0953c47d6325e506ba44d6a25c869c4401b3bc222c"}, + {file = "rapidfuzz-3.14.1-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:e84d9a844dc2e4d5c4cabd14c096374ead006583304333c14a6fbde51f612a44"}, + {file = "rapidfuzz-3.14.1-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:40301b93b99350edcd02dbb22e37ca5f2a75d0db822e9b3c522da451a93d6f27"}, + {file = 
"rapidfuzz-3.14.1-cp314-cp314-manylinux_2_26_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:fedd5097a44808dddf341466866e5c57a18a19a336565b4ff50aa8f09eb528f6"}, + {file = "rapidfuzz-3.14.1-cp314-cp314-manylinux_2_26_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:2e3e61c9e80d8c26709d8aa5c51fdd25139c81a4ab463895f8a567f8347b0548"}, + {file = "rapidfuzz-3.14.1-cp314-cp314-manylinux_2_26_s390x.manylinux_2_28_s390x.whl", hash = "sha256:da011a373722fac6e64687297a1d17dc8461b82cb12c437845d5a5b161bc24b9"}, + {file = "rapidfuzz-3.14.1-cp314-cp314-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:5967d571243cfb9ad3710e6e628ab68c421a237b76e24a67ac22ee0ff12784d6"}, + {file = "rapidfuzz-3.14.1-cp314-cp314-manylinux_2_31_armv7l.whl", hash = "sha256:474f416cbb9099676de54aa41944c154ba8d25033ee460f87bb23e54af6d01c9"}, + {file = "rapidfuzz-3.14.1-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:ae2d57464b59297f727c4e201ea99ec7b13935f1f056c753e8103da3f2fc2404"}, + {file = "rapidfuzz-3.14.1-cp314-cp314-musllinux_1_2_armv7l.whl", hash = "sha256:57047493a1f62f11354c7143c380b02f1b355c52733e6b03adb1cb0fe8fb8816"}, + {file = "rapidfuzz-3.14.1-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:4acc20776f225ee37d69517a237c090b9fa7e0836a0b8bc58868e9168ba6ef6f"}, + {file = "rapidfuzz-3.14.1-cp314-cp314-musllinux_1_2_s390x.whl", hash = "sha256:4373f914ff524ee0146919dea96a40a8200ab157e5a15e777a74a769f73d8a4a"}, + {file = "rapidfuzz-3.14.1-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:37017b84953927807847016620d61251fe236bd4bcb25e27b6133d955bb9cafb"}, + {file = "rapidfuzz-3.14.1-cp314-cp314-win32.whl", hash = "sha256:c8d1dd1146539e093b84d0805e8951475644af794ace81d957ca612e3eb31598"}, + {file = "rapidfuzz-3.14.1-cp314-cp314-win_amd64.whl", hash = "sha256:f51c7571295ea97387bac4f048d73cecce51222be78ed808263b45c79c40a440"}, + {file = "rapidfuzz-3.14.1-cp314-cp314-win_arm64.whl", hash = "sha256:01eab10ec90912d7d28b3f08f6c91adbaf93458a53f849ff70776ecd70dd7a7a"}, 
+ {file = "rapidfuzz-3.14.1-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:60879fcae2f7618403c4c746a9a3eec89327d73148fb6e89a933b78442ff0669"}, + {file = "rapidfuzz-3.14.1-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:f94d61e44db3fc95a74006a394257af90fa6e826c900a501d749979ff495d702"}, + {file = "rapidfuzz-3.14.1-cp314-cp314t-win32.whl", hash = "sha256:93b6294a3ffab32a9b5f9b5ca048fa0474998e7e8bb0f2d2b5e819c64cb71ec7"}, + {file = "rapidfuzz-3.14.1-cp314-cp314t-win_amd64.whl", hash = "sha256:6cb56b695421538fdbe2c0c85888b991d833b8637d2f2b41faa79cea7234c000"}, + {file = "rapidfuzz-3.14.1-cp314-cp314t-win_arm64.whl", hash = "sha256:7cd312c380d3ce9d35c3ec9726b75eee9da50e8a38e89e229a03db2262d3d96b"}, + {file = "rapidfuzz-3.14.1-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:673ce55a9be5b772dade911909e42382c0828b8a50ed7f9168763fa6b9f7054d"}, + {file = "rapidfuzz-3.14.1-pp310-pypy310_pp73-macosx_11_0_arm64.whl", hash = "sha256:45c62ada1980ebf4c64c4253993cc8daa018c63163f91db63bb3af69cb74c2e3"}, + {file = "rapidfuzz-3.14.1-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:4d51efb29c0df0d4f7f64f672a7624c2146527f0745e3572098d753676538800"}, + {file = "rapidfuzz-3.14.1-pp311-pypy311_pp73-macosx_10_15_x86_64.whl", hash = "sha256:4a21ccdf1bd7d57a1009030527ba8fae1c74bf832d0a08f6b67de8f5c506c96f"}, + {file = "rapidfuzz-3.14.1-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:589fb0af91d3aff318750539c832ea1100dbac2c842fde24e42261df443845f6"}, + {file = "rapidfuzz-3.14.1-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:a4f18092db4825f2517d135445015b40033ed809a41754918a03ef062abe88a0"}, + {file = "rapidfuzz-3.14.1.tar.gz", hash = "sha256:b02850e7f7152bd1edff27e9d584505b84968cacedee7a734ec4050c655a803c"}, ] [package.extras] @@ -4981,4 +4968,4 @@ type = ["pytest-mypy"] [metadata] lock-version = "2.1" python-versions = "^3.11" -content-hash = "41981e274e958a70f5034827967fc28998561ae42040776fa79113456f26c156" +content-hash = 
"363083a370e795ce42acda8ee8f07d55f59c8f36f8cd5e21c623a780d425cda0" diff --git a/api/pyproject.toml b/api/pyproject.toml index b5d1f69d63..b978989d2a 100644 --- a/api/pyproject.toml +++ b/api/pyproject.toml @@ -1,6 +1,6 @@ [project] name = "api" -version = "0.60.2" +version = "0.60.0" description = "Agenta API" authors = [ { name = "Mahmoud Mabrouk", email = "mahmoud@agenta.ai" }, @@ -46,7 +46,6 @@ watchdog = { extras = ["watchmedo"], version = "^3.0.0" } sqlalchemy-json = "^0.7.0" python-multipart = "^0.0.20" gunicorn = "^23.0.0" -python-jsonpath = "^2.0.0" # opentelemetry-api = "^1.36.0" # opentelemetry-sdk = "^1.36.0" diff --git a/docs/blog/entries/documentation-architecture-overhaul.mdx b/docs/blog/entries/documentation-architecture-overhaul.mdx deleted file mode 100644 index a19bf29d67..0000000000 --- a/docs/blog/entries/documentation-architecture-overhaul.mdx +++ /dev/null @@ -1,40 +0,0 @@ ---- -title: "Documentation Architecture Overhaul" -slug: documentation-architecture-overhaul -date: 2025-11-03 -tags: [v0.59.10] ---- - -import Image from "@theme/IdealImage"; - -We've completely rewritten and restructured our documentation with a new architecture. This is one of the largest updates we've made to the documentation, involving a near-complete rewrite of existing content and adding substantial new material. - -### Diataxis Framework Implementation - -We've reorganized all documentation using the [Diataxis framework](https://diataxis.fr/). - -### Expanded Observability Documentation - -One of the biggest gaps in our previous documentation was observability. 
We've added comprehensive documentation covering: - -- [Tracing with OpenTelemetry](/observability/trace-with-opentelemetry/getting-started) -- [Tracing LLM applications with JS/TS](/observability/quick-start-opentelemetry) -- [Using the Metrics API to fetch metrics](/observability/query-data/analytics-data) -- [Using the Query API to fetch traces](/observability/query-data/query-api) - -### JavaScript/TypeScript Support - -Documentation now includes JavaScript and TypeScript examples alongside Python wherever applicable. This makes it easier for JavaScript developers to integrate Agenta into their applications. - -### Ask AI Feature - -We've added a new "Ask AI" feature that lets you ask questions directly to the documentation. Get instant answers to your questions without searching through pages. - -Ask AI feature in documentation - - ---- diff --git a/docs/blog/main.mdx b/docs/blog/main.mdx index e55eed8a9c..208ffed0bf 100644 --- a/docs/blog/main.mdx +++ b/docs/blog/main.mdx @@ -10,24 +10,6 @@ import Image from "@theme/IdealImage";
-### [Documentation Overhaul](/changelog/documentation-architecture-overhaul) - -_3 November 2025_ - -**v0.59.10** - -Ask AI feature in documentation - -We've completely rewritten and restructured our documentation with a new architecture. This is one of the largest updates we've made, involving a near-complete rewrite of existing content. - -Key improvements include: -- **[Diataxis Framework](https://diataxis.fr/)**: Organized content into Tutorials, How-to Guides, Reference, and Explanation sections for better discoverability -- **[Expanded Observability Docs](/observability/overview)**: Added missing documentation for tracing, annotations, and observability features -- **[JavaScript/TypeScript Support](/observability/quick-start-opentelemetry)**: Added code examples and documentation for JavaScript developers alongside Python -- **Ask AI Feature**: Ask questions directly to the documentation for instant answers - ---- - ### [Vertex AI Provider Support](/changelog/vertex-ai-provider-support) _24 October 2025_ diff --git a/docs/docs/getting-started/01-introduction.mdx b/docs/docs/getting-started/01-introduction.mdx index d7e9e591c8..7c51b57d8b 100644 --- a/docs/docs/getting-started/01-introduction.mdx +++ b/docs/docs/getting-started/01-introduction.mdx @@ -23,30 +23,20 @@ Agenta covers the entire LLM development lifecycle: **prompt management**, **eva ### Prompt Engineering and Management -Teams often struggle with prompt collaboration. They keep prompts in code where subject matter experts cannot edit them. Or they use spreadsheets in an unreliable process. +Agenta enables product teams to experiment with prompts, push them to production, run evaluations, and annotate their results. -Agenta organizes prompts for your team. Subject matter experts can collaborate with developers without touching the codebase. Developers can version prompts and deploy them to production. - -The playground lets teams experiment with prompts. You can load traces and test sets. 
You can test prompts side by side. ### Evaluation -Most teams lack a systematic evaluation process. They make random prompt changes based on vibes. Some changes improve quality but break other cases because LLMs are stochastic. - -Agenta provides one place to evaluate systematically. Teams can run three types of evaluation: +Agenta enables product teams to experiment with prompts, push them to production, run evaluations, and annotate their results. -- Automatic evaluation with LLMs at scale before production -- Human annotation where subject matter experts review results and provide feedback to AI engineers -- Online evaluation for applications already in production -Both subject matter experts and engineers can run evaluations from the UI. ### Observability -Agenta helps you understand what happens in production. You can capture user feedback through an API (thumbs up or implicit signals). You can debug agents and applications with tracing to see what happens inside them. +Agenta enables product teams to experiment with prompts, push them to production, run evaluations, and annotate their results. -Track costs over time. Find edge cases where things fail. Add those cases to your test sets. Have subject matter experts annotate the results. ## Why Agenta? 
diff --git a/docs/docs/self-host/02-configuration.mdx b/docs/docs/self-host/02-configuration.mdx index 5879d32f47..adde1c0333 100644 --- a/docs/docs/self-host/02-configuration.mdx +++ b/docs/docs/self-host/02-configuration.mdx @@ -37,8 +37,6 @@ Configuration for Docker and database connections: | Variable | Description | Default | |----------|-------------|---------| -| `AGENTA_WEB_IMAGE_TAG` | Docker image tag for the web frontend | `latest` | -| `AGENTA_API_IMAGE_TAG` | Docker image tag for the API backend | `latest` | | `DOCKER_NETWORK_MODE` | Docker networking mode | `_(empty)_` (which falls back to `bridge`) | | `POSTGRES_PASSWORD` | PostgreSQL database password | `password` | | `POSTGRES_USERNAME` | PostgreSQL database username | `username` | diff --git a/docs/docs/self-host/99-faq.mdx b/docs/docs/self-host/99-faq.mdx deleted file mode 100644 index 42a8a52075..0000000000 --- a/docs/docs/self-host/99-faq.mdx +++ /dev/null @@ -1,16 +0,0 @@ ---- -title: Frequently Asked Questions -sidebar_label: FAQ -description: Self-hosting Agenta FAQ. Learn how to lock Agenta to a specific version, configure Docker images, and troubleshoot common deployment issues. ---- - -## How do I lock Agenta to a specific version? - -Use the `AGENTA_WEB_IMAGE_TAG` and `AGENTA_API_IMAGE_TAG` environment variables. - -```bash -AGENTA_WEB_IMAGE_TAG=v0.15.0 -AGENTA_API_IMAGE_TAG=v0.15.0 -``` - -These are set to `latest` by default. 
diff --git a/docs/docusaurus.config.ts b/docs/docusaurus.config.ts index 4d89dcdcfd..f33bea025d 100644 --- a/docs/docusaurus.config.ts +++ b/docs/docusaurus.config.ts @@ -84,8 +84,8 @@ const config: Config = { navbar: { logo: { alt: "agenta-ai", - src: "images/Agenta-logo-full-light.png", - srcDark: "images/Agenta-logo-full-dark-accent.png", + src: "images/light-complete-transparent-CROPPED.png", + srcDark: "images/dark-complete-transparent-CROPPED.png", }, hideOnScroll: false, items: [ diff --git a/docs/static/images/Agenta-logo-full-dark-accent.png b/docs/static/images/Agenta-logo-full-dark-accent.png deleted file mode 100644 index a270afc094..0000000000 Binary files a/docs/static/images/Agenta-logo-full-dark-accent.png and /dev/null differ diff --git a/docs/static/images/Agenta-logo-full-light.png b/docs/static/images/Agenta-logo-full-light.png deleted file mode 100644 index bddc2359bd..0000000000 Binary files a/docs/static/images/Agenta-logo-full-light.png and /dev/null differ diff --git a/docs/static/images/changelog/agenta_askai.png b/docs/static/images/changelog/agenta_askai.png deleted file mode 100644 index eee2ba8ff0..0000000000 Binary files a/docs/static/images/changelog/agenta_askai.png and /dev/null differ diff --git a/docs/static/images/dark-complete-transparent-CROPPED.png b/docs/static/images/dark-complete-transparent-CROPPED.png new file mode 100644 index 0000000000..bc73ad84e2 Binary files /dev/null and b/docs/static/images/dark-complete-transparent-CROPPED.png differ diff --git a/docs/static/images/dark-logo.svg b/docs/static/images/dark-logo.svg new file mode 100644 index 0000000000..6cb8ef3330 --- /dev/null +++ b/docs/static/images/dark-logo.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/docs/static/images/favicon.ico b/docs/static/images/favicon.ico index dad02fe072..4dc8619b1d 100644 Binary files a/docs/static/images/favicon.ico and b/docs/static/images/favicon.ico differ diff --git 
a/docs/static/images/light-complete-transparent-CROPPED.png b/docs/static/images/light-complete-transparent-CROPPED.png new file mode 100644 index 0000000000..de9bbd9aca Binary files /dev/null and b/docs/static/images/light-complete-transparent-CROPPED.png differ diff --git a/docs/static/images/light-logo.svg b/docs/static/images/light-logo.svg new file mode 100644 index 0000000000..9c795f8e88 --- /dev/null +++ b/docs/static/images/light-logo.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/docs/static/images/social-card.png b/docs/static/images/social-card.png index 49fe2b893e..d62f2f99b9 100644 Binary files a/docs/static/images/social-card.png and b/docs/static/images/social-card.png differ diff --git a/sdk/README.md b/sdk/README.md index d07ac59e52..df44027a55 100644 --- a/sdk/README.md +++ b/sdk/README.md @@ -2,12 +2,11 @@

- - + + Shows the logo of agenta -

The Open-source LLMOps Platform

Build reliable LLM applications faster with integrated prompt management, evaluation, and observability. @@ -182,7 +181,7 @@ We welcome contributions of all kinds — from filing issues and sharing ideas t ## Contributors ✨ -[![All Contributors](https://img.shields.io/badge/all_contributors-50-orange.svg?style=flat-square)](#contributors-) +[![All Contributors](https://img.shields.io/badge/all_contributors-49-orange.svg?style=flat-square)](#contributors-) Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)): diff --git a/sdk/agenta/sdk/agenta_init.py b/sdk/agenta/sdk/agenta_init.py index f6ffa7c73b..9740b2283b 100644 --- a/sdk/agenta/sdk/agenta_init.py +++ b/sdk/agenta/sdk/agenta_init.py @@ -68,7 +68,7 @@ def init( """ - log.info("Agenta - SDK ver: %s", version("agenta")) + log.info("Agenta - SDK version: %s", version("agenta")) config = {} if config_fname: @@ -94,7 +94,7 @@ def init( log.error(f"Failed to parse host URL '{_host}': {e}") raise - log.info("Agenta - API URL: %s/api", self.host) + log.info("Agenta - Host: %s", self.host) self.api_key = api_key or getenv("AGENTA_API_KEY") or config.get("api_key") diff --git a/sdk/agenta/sdk/contexts/tracing.py b/sdk/agenta/sdk/contexts/tracing.py index 8425350ceb..ab5718fb42 100644 --- a/sdk/agenta/sdk/contexts/tracing.py +++ b/sdk/agenta/sdk/contexts/tracing.py @@ -11,7 +11,7 @@ class TracingContext(BaseModel): # credentials: Optional[str] = None # - script: Optional[dict] = None + script: Optional[str] = None parameters: Optional[dict] = None # flags: Optional[dict] = None diff --git a/sdk/agenta/sdk/decorators/running.py b/sdk/agenta/sdk/decorators/running.py index 9c13423454..9eb77d43f8 100644 --- a/sdk/agenta/sdk/decorators/running.py +++ b/sdk/agenta/sdk/decorators/running.py @@ -133,7 +133,7 @@ def __init__( ] ] = None, # -------------------------------------------------------------------- # - script: Optional[dict] = None, + script: Optional[str] = None, parameters: 
Optional[dict] = None, # configuration: Optional[ diff --git a/sdk/agenta/sdk/middleware/vault.py b/sdk/agenta/sdk/middleware/vault.py index 52c02fa186..0b3056a586 100644 --- a/sdk/agenta/sdk/middleware/vault.py +++ b/sdk/agenta/sdk/middleware/vault.py @@ -19,11 +19,22 @@ import agenta as ag +AGENTA_RUNTIME_PREFIX = getenv("AGENTA_RUNTIME_PREFIX", "") + +_ALWAYS_ALLOW_LIST = [ + f"{AGENTA_RUNTIME_PREFIX}/health", + f"{AGENTA_RUNTIME_PREFIX}/openapi.json", +] + _PROVIDER_KINDS = [] for provider_kind in StandardProviderKind.__args__[0].__args__: # type: ignore _PROVIDER_KINDS.append(provider_kind) +_AUTH_ENABLED = ( + getenv("AGENTA_SERVICE_MIDDLEWARE_AUTH_ENABLED", "true").lower() in TRUTHY +) + _CACHE_ENABLED = ( getenv("AGENTA_SERVICE_MIDDLEWARE_CACHE_ENABLED", "true").lower() in TRUTHY ) @@ -31,12 +42,27 @@ _cache = TTLLRUCache() +class DenyException(Exception): + def __init__( + self, + status_code: int = 403, + content: str = "Forbidden", + ) -> None: + super().__init__() + + self.status_code = status_code + self.content = content + + class VaultMiddleware(BaseHTTPMiddleware): def __init__(self, app: FastAPI): super().__init__(app) self.host = ag.DEFAULT_AGENTA_SINGLETON_INSTANCE.host + self.scope_type = ag.DEFAULT_AGENTA_SINGLETON_INSTANCE.scope_type + self.scope_id = ag.DEFAULT_AGENTA_SINGLETON_INSTANCE.scope_id + async def dispatch( self, request: Request, @@ -74,8 +100,12 @@ async def _get_secrets(self, request: Request) -> Optional[Dict]: return secrets local_secrets: List[Dict[str, Any]] = [] + allow_secrets = True try: + if not request.url.path in _ALWAYS_ALLOW_LIST: + await self._allow_local_secrets(credentials) + for provider_kind in _PROVIDER_KINDS: provider = provider_kind key_name = f"{provider.upper()}_API_KEY" @@ -93,6 +123,9 @@ async def _get_secrets(self, request: Request) -> Optional[Dict]: ) local_secrets.append(secret.model_dump()) + except DenyException as e: # pylint: disable=bare-except + print(e.status_code, e.content) + allow_secrets = 
False except: # pylint: disable=bare-except display_exception("Vault: Local Secrets Exception") @@ -133,6 +166,155 @@ async def _get_secrets(self, request: Request) -> Optional[Dict]: secrets = standard_secrets + custom_secrets - _cache.put(_hash, {"secrets": secrets}) + if not allow_secrets: + _cache.put(_hash, {"secrets": secrets}) return secrets + + async def _allow_local_secrets(self, credentials): + try: + if not _AUTH_ENABLED: + return + + if not credentials: + raise DenyException( + status_code=401, + content="Invalid credentials. Please check your credentials or login again.", + ) + + # HEADERS + headers = {"Authorization": credentials} + # PARAMS + params = {} + ## SCOPE + if self.scope_type and self.scope_id: + params["scope_type"] = self.scope_type + params["scope_id"] = self.scope_id + ## ACTION + params["action"] = "view_secret" + ## RESOURCE + params["resource_type"] = "local_secrets" + + _hash = dumps( + { + "headers": headers, + "params": params, + }, + sort_keys=True, + ) + + access = None + + if _CACHE_ENABLED: + access = _cache.get(_hash) + + if isinstance(access, Exception): + raise access + + try: + async with httpx.AsyncClient() as client: + try: + response = await client.get( + f"{self.host}/api/permissions/verify", + headers=headers, + params=params, + timeout=30.0, + ) + except httpx.TimeoutException as exc: + # log.debug(f"Timeout error while verify secrets access: {exc}") + raise DenyException( + status_code=504, + content=f"Could not verify secrets access: connection to {self.host} timed out. Please check your network connection.", + ) from exc + except httpx.ConnectError as exc: + # log.debug(f"Connection error while verify secrets access: {exc}") + raise DenyException( + status_code=503, + content=f"Could not verify secrets access: connection to {self.host} failed. 
Please check if agenta is available.", + ) from exc + except httpx.NetworkError as exc: + # log.debug(f"Network error while verify secrets access: {exc}") + raise DenyException( + status_code=503, + content=f"Could not verify secrets access: connection to {self.host} failed. Please check your network connection.", + ) from exc + except httpx.HTTPError as exc: + # log.debug(f"HTTP error while verify secrets access: {exc}") + raise DenyException( + status_code=502, + content=f"Could not verify secrets access: connection to {self.host} failed. Please check if agenta is available.", + ) from exc + + if response.status_code == 401: + # log.debug("Agenta returned 401 - Invalid credentials") + raise DenyException( + status_code=401, + content="Invalid credentials. Please check your credentials or login again.", + ) + elif response.status_code == 403: + # log.debug("Agenta returned 403 - Permission denied") + raise DenyException( + status_code=403, + content="Out of credits. Please set your LLM provider API keys or contact support.", + ) + elif response.status_code != 200: + # log.debug( + # f"Agenta returned {response.status_code} - Unexpected status code" + # ) + raise DenyException( + status_code=500, + content=f"Could not verify secrets access: {self.host} returned unexpected status code {response.status_code}. Please try again later or contact support if the issue persists.", + ) + + try: + auth = response.json() + except ValueError as exc: + # log.debug(f"Agenta returned invalid JSON response: {exc}") + raise DenyException( + status_code=500, + content=f"Could not verify secrets access: {self.host} returned unexpected invalid JSON response. 
Please try again later or contact support if the issue persists.", + ) from exc + + if not isinstance(auth, dict): + # log.debug( + # f"Agenta returned invalid response format: {type(auth)}" + # ) + raise DenyException( + status_code=500, + content=f"Could not verify secrets access: {self.host} returned unexpected invalid response format. Please try again later or contact support if the issue persists.", + ) + + effect = auth.get("effect") + + access = effect == "allow" + + if effect != "allow": + # log.debug("Access denied by Agenta - effect: {effect}") + raise DenyException( + status_code=403, + content="Out of credits. Please set your LLM provider API keys or contact support.", + ) + + return + + except DenyException as deny: + _cache.put(_hash, deny) + + raise deny + except Exception as exc: # pylint: disable=bare-except + # log.debug( + # f"Unexpected error while verifying credentials (remote): {exc}" + # ) + raise DenyException( + status_code=500, + content=f"Could not verify credentials: unexpected error - {str(exc)}. Please try again later or contact support if the issue persists.", + ) from exc + + except DenyException as deny: + raise deny + except Exception as exc: + # log.debug(f"Unexpected error while verifying credentials (local): {exc}") + raise DenyException( + status_code=500, + content=f"Could not verify credentials: unexpected error - {str(exc)}. 
Please try again later or contact support if the issue persists.", + ) from exc diff --git a/sdk/agenta/sdk/tracing/exporters.py b/sdk/agenta/sdk/tracing/exporters.py index e0d3fc4511..5fd2006b7c 100644 --- a/sdk/agenta/sdk/tracing/exporters.py +++ b/sdk/agenta/sdk/tracing/exporters.py @@ -1,7 +1,6 @@ -from typing import Sequence, Dict, List, Optional, Any +from typing import Sequence, Dict, List, Optional from threading import Thread from os import environ -from uuid import UUID from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter from opentelemetry.sdk.trace.export import ( @@ -24,7 +23,7 @@ log = get_module_logger(__name__) -_ASYNC_EXPORT = environ.get("AGENTA_OTLP_ASYNC_EXPORT", "false").lower() in TRUTHY +_ASYNC_EXPORT = environ.get("AGENTA_OTLP_ASYNC_EXPORT", "true").lower() in TRUTHY class InlineTraceExporter(SpanExporter): @@ -51,8 +50,6 @@ def export( self._registry[trace_id].append(span) - return - def shutdown(self) -> None: self._shutdown = True @@ -92,17 +89,17 @@ def __init__( self.credentials = credentials def export(self, spans: Sequence[ReadableSpan]) -> SpanExportResult: - grouped_spans: Dict[Optional[str], List[ReadableSpan]] = dict() + grouped_spans: Dict[str, List[str]] = {} for span in spans: trace_id = span.get_span_context().trace_id credentials = None if self.credentials: - credentials = str(self.credentials.get(trace_id)) + credentials = self.credentials.get(trace_id) if credentials not in grouped_spans: - grouped_spans[credentials] = list() + grouped_spans[credentials] = [] grouped_spans[credentials].append(span) @@ -114,16 +111,6 @@ def export(self, spans: Sequence[ReadableSpan]) -> SpanExportResult: credentials=credentials, ) ): - for _span in _spans: - trace_id = _span.get_span_context().trace_id - span_id = _span.get_span_context().span_id - - # log.debug( - # "[SPAN] [EXPORT]", - # trace_id=UUID(int=trace_id).hex, - # span_id=UUID(int=span_id).hex[-16:], - # ) - 
serialized_spans.append(super().export(_spans)) if all(serialized_spans): @@ -140,31 +127,17 @@ def _export(self, serialized_data: bytes, timeout_sec: Optional[float] = None): def __export(): with suppress(): - resp = None if timeout_sec is not None: - resp = super(OTLPExporter, self)._export( - serialized_data, - timeout_sec, + return super(OTLPExporter, self)._export( + serialized_data, timeout_sec ) else: - resp = super(OTLPExporter, self)._export( - serialized_data, - ) - - # log.debug( - # "[SPAN] [_EXPORT]", - # data=serialized_data, - # resp=resp, - # ) + return super(OTLPExporter, self)._export(serialized_data) if _ASYNC_EXPORT is True: thread = Thread(target=__export, daemon=True) thread.start() else: - # log.debug( - # "[SPAN] [__XPORT]", - # data=serialized_data, - # ) return __export() except Exception as e: diff --git a/sdk/agenta/sdk/tracing/processors.py b/sdk/agenta/sdk/tracing/processors.py index 88c2f40a12..83f5bd5ab9 100644 --- a/sdk/agenta/sdk/tracing/processors.py +++ b/sdk/agenta/sdk/tracing/processors.py @@ -1,7 +1,6 @@ from typing import Optional, Dict, List from threading import Lock from json import dumps -from uuid import UUID from opentelemetry.baggage import get_all as get_baggage from opentelemetry.context import Context @@ -55,15 +54,6 @@ def on_start( span: Span, parent_context: Optional[Context] = None, ) -> None: - trace_id = span.context.trace_id - span_id = span.context.span_id - - # log.debug( - # "[SPAN] [START] ", - # trace_id=UUID(int=trace_id).hex, - # span_id=UUID(int=span_id).hex[-16:], - # ) - for key in self.references.keys(): span.set_attribute(f"ag.refs.{key}", self.references[key]) @@ -165,12 +155,6 @@ def on_end( trace_id = span.context.trace_id span_id = span.context.span_id - # log.debug( - # "[SPAN] [END] ", - # trace_id=UUID(int=trace_id).hex, - # span_id=UUID(int=span_id).hex[-16:], - # ) - self._spans.setdefault(trace_id, []).append(span) self._registry.setdefault(trace_id, {}) 
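The exporter change above buckets spans by the credentials attached to their trace before exporting, so each batch can be sent with the matching `Authorization` header. A self-contained sketch of that grouping step, with a simplified `Span` stand-in in place of OpenTelemetry's `ReadableSpan` (names are illustrative, not the SDK's API):

```python
from typing import Dict, List, Optional


class Span:
    """Minimal stand-in for an OTel ReadableSpan (illustrative only)."""

    def __init__(self, trace_id: int) -> None:
        self.trace_id = trace_id


def group_by_credentials(
    spans: List[Span],
    credentials: Optional[Dict[int, str]] = None,
) -> Dict[Optional[str], List[Span]]:
    """Group spans into one export batch per distinct credential.

    Spans whose trace has no registered credential fall into the None bucket
    and are exported without an Authorization header.
    """
    grouped: Dict[Optional[str], List[Span]] = {}
    for span in spans:
        creds = credentials.get(span.trace_id) if credentials else None
        grouped.setdefault(creds, []).append(span)
    return grouped
```

Note the diff also stops stringifying missing credentials: the old code's `str(self.credentials.get(trace_id))` turned absent entries into the literal string `"None"`, which the new `self.credentials.get(trace_id)` avoids.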
self._registry[trace_id].pop(span_id, None) diff --git a/sdk/agenta/sdk/types.py b/sdk/agenta/sdk/types.py index cbdf684dc3..356faeb28d 100644 --- a/sdk/agenta/sdk/types.py +++ b/sdk/agenta/sdk/types.py @@ -387,7 +387,7 @@ class ModelConfig(BaseModel): """Configuration for model parameters""" model: str = MCField( - default="gpt-4o-mini", + default="gpt-3.5-turbo", choices=supported_llm_models, ) @@ -462,154 +462,6 @@ def __init__(self, message: str, original_error: Optional[Exception] = None): super().__init__(message) -import json -import re -from typing import Any, Dict, Iterable, Tuple, Optional - -# --- Optional dependency: python-jsonpath (provides JSONPath + JSON Pointer) --- -try: - import jsonpath # ✅ use module API - from jsonpath import JSONPointer # pointer class is fine to use -except Exception: - jsonpath = None - JSONPointer = None - -# ========= Scheme detection ========= - - -def detect_scheme(expr: str) -> str: - """Return 'json-path', 'json-pointer', or 'dot-notation' based on the placeholder prefix.""" - if expr.startswith("$"): - return "json-path" - if expr.startswith("/"): - return "json-pointer" - return "dot-notation" - - -# ========= Resolvers ========= - - -def resolve_dot_notation(expr: str, data: dict) -> object: - if "[" in expr or "]" in expr: - raise KeyError(f"Bracket syntax is not supported in dot-notation: {expr!r}") - - # First, check if the expression exists as a literal key (e.g., "topic.story" as a single key) - # This allows users to use dots in their variable names without nested access - if expr in data: - return data[expr] - - # If not found as a literal key, try to parse as dot-notation path - cur = data - for token in (p for p in expr.split(".") if p): - if isinstance(cur, list) and token.isdigit(): - cur = cur[int(token)] - else: - if not isinstance(cur, dict): - raise KeyError( - f"Cannot access key {token!r} on non-dict while resolving {expr!r}" - ) - if token not in cur: - raise KeyError(f"Missing key {token!r} while 
resolving {expr!r}") - cur = cur[token] - return cur - - -def resolve_json_path(expr: str, data: dict) -> object: - if jsonpath is None: - raise ImportError("python-jsonpath is required for json-path ($...)") - - if not (expr == "$" or expr.startswith("$.") or expr.startswith("$[")): - raise ValueError( - f"Invalid json-path expression {expr!r}. " - "Must start with '$', '$.' or '$[' (no implicit normalization)." - ) - - # Use package-level APIf - results = jsonpath.findall(expr, data) # always returns a list - return results[0] if len(results) == 1 else results - - -def resolve_json_pointer(expr: str, data: Dict[str, Any]) -> Any: - """Resolve a JSON Pointer; returns a single value.""" - if JSONPointer is None: - raise ImportError("python-jsonpath is required for json-pointer (/...)") - return JSONPointer(expr).resolve(data) - - -def resolve_any(expr: str, data: Dict[str, Any]) -> Any: - """Dispatch to the right resolver based on detected scheme.""" - scheme = detect_scheme(expr) - if scheme == "json-path": - return resolve_json_path(expr, data) - if scheme == "json-pointer": - return resolve_json_pointer(expr, data) - return resolve_dot_notation(expr, data) - - -# ========= Placeholder & coercion helpers ========= - -_PLACEHOLDER_RE = re.compile(r"\{\{\s*(.*?)\s*\}\}") - - -def extract_placeholders(template: str) -> Iterable[str]: - """Yield the inner text of all {{ ... }} occurrences (trimmed).""" - for m in _PLACEHOLDER_RE.finditer(template): - yield m.group(1).strip() - - -def coerce_to_str(value: Any) -> str: - """Pretty stringify values for embedding into templates.""" - if isinstance(value, (dict, list)): - return json.dumps(value, ensure_ascii=False) - return str(value) - - -def build_replacements( - placeholders: Iterable[str], data: Dict[str, Any] -) -> Tuple[Dict[str, str], set]: - """ - Resolve all placeholders against data. - Returns (replacements, unresolved_placeholders). 
- """ - replacements: Dict[str, str] = {} - unresolved: set = set() - for expr in set(placeholders): - try: - val = resolve_any(expr, data) - # Escape backslashes to avoid regex replacement surprises - replacements[expr] = coerce_to_str(val).replace("\\", "\\\\") - except Exception: - unresolved.add(expr) - return replacements, unresolved - - -def apply_replacements(template: str, replacements: Dict[str, str]) -> str: - """Replace {{ expr }} using a callback to avoid regex-injection issues.""" - - def _repl(m: re.Match) -> str: - expr = m.group(1).strip() - return replacements.get(expr, m.group(0)) - - return _PLACEHOLDER_RE.sub(_repl, template) - - -def compute_truly_unreplaced(original: set, rendered: str) -> set: - """Only count placeholders that were in the original template and remain.""" - now = set(extract_placeholders(rendered)) - return original & now - - -def missing_lib_hints(unreplaced: set) -> Optional[str]: - """Suggest installing python-jsonpath if placeholders indicate json-path or json-pointer usage.""" - if any(expr.startswith("$") or expr.startswith("/") for expr in unreplaced) and ( - jsonpath is None or JSONPointer is None - ): - return ( - "Install python-jsonpath to enable json-path ($...) and json-pointer (/...)" - ) - return None - - class PromptTemplate(BaseModel): """A template for generating prompts with formatting capabilities""" @@ -656,7 +508,6 @@ def _format_with_template(self, content: str, kwargs: Dict[str, Any]) -> str: try: if self.template_format == "fstring": return content.format(**kwargs) - elif self.template_format == "jinja2": from jinja2 import Template, TemplateError @@ -667,33 +518,35 @@ def _format_with_template(self, content: str, kwargs: Dict[str, Any]) -> str: f"Jinja2 template error in content: '{content}'. 
Error: {str(e)}", original_error=e, ) - elif self.template_format == "curly": - original_placeholders = set(extract_placeholders(content)) - - replacements, _unresolved = build_replacements( - original_placeholders, kwargs - ) + import re + + # Extract variables that exist in the original template before replacement + # This allows us to distinguish template variables from {{}} in user input values + original_variables = set(re.findall(r"\{\{(.*?)\}\}", content)) + + result = content + for key, value in kwargs.items(): + # Escape backslashes in the replacement string to prevent regex interpretation + escaped_value = str(value).replace("\\", "\\\\") + result = re.sub( + r"\{\{" + re.escape(key) + r"\}\}", escaped_value, result + ) - result = apply_replacements(content, replacements) + # Only check if ORIGINAL template variables remain unreplaced + # Don't error on {{}} that came from user input values + unreplaced_matches = set(re.findall(r"\{\{(.*?)\}\}", result)) + truly_unreplaced = original_variables & unreplaced_matches - truly_unreplaced = compute_truly_unreplaced( - original_placeholders, result - ) if truly_unreplaced: - hint = missing_lib_hints(truly_unreplaced) - suffix = f" Hint: {hint}" if hint else "" raise TemplateFormatError( - f"Unreplaced variables in curly template: {sorted(truly_unreplaced)}.{suffix}" + f"Unreplaced variables in curly template: {sorted(truly_unreplaced)}" ) - return result - else: raise TemplateFormatError( f"Unknown template format: {self.template_format}" ) - except KeyError as e: key = str(e).strip("'") raise TemplateFormatError( @@ -701,8 +554,7 @@ def _format_with_template(self, content: str, kwargs: Dict[str, Any]) -> str: ) except Exception as e: raise TemplateFormatError( - f"Error formatting template '{content}': {str(e)}", - original_error=e, + f"Error formatting template '{content}': {str(e)}", original_error=e ) def _substitute_variables(self, obj: Any, kwargs: Dict[str, Any]) -> Any: diff --git 
a/sdk/agenta/sdk/utils/logging.py b/sdk/agenta/sdk/utils/logging.py index cc4789b93c..1091ceefd0 100644 --- a/sdk/agenta/sdk/utils/logging.py +++ b/sdk/agenta/sdk/utils/logging.py @@ -8,6 +8,15 @@ import structlog from structlog.typing import EventDict, WrappedLogger, Processor +# from datetime import datetime +# from logging.handlers import RotatingFileHandler + +# from opentelemetry.trace import get_current_span +# from opentelemetry._logs import set_logger_provider +# from opentelemetry.sdk._logs import LoggingHandler, LoggerProvider +# from opentelemetry.sdk._logs.export import BatchLogRecordProcessor +# from opentelemetry.exporter.otlp.proto.http._log_exporter import OTLPLogExporter + TRACE_LEVEL = 1 logging.TRACE = TRACE_LEVEL logging.addLevelName(TRACE_LEVEL, "TRACE") @@ -31,6 +40,15 @@ def bound_logger_trace(self, *args, **kwargs): AGENTA_LOG_CONSOLE_ENABLED = os.getenv("AGENTA_LOG_CONSOLE_ENABLED", "true") == "true" AGENTA_LOG_CONSOLE_LEVEL = os.getenv("AGENTA_LOG_CONSOLE_LEVEL", "TRACE").upper() +# AGENTA_LOG_OTLP_ENABLED = os.getenv("AGENTA_LOG_OTLP_ENABLED", "false") == "true" +# AGENTA_LOG_OTLP_LEVEL = os.getenv("AGENTA_LOG_OTLP_LEVEL", "INFO").upper() + +# AGENTA_LOG_FILE_ENABLED = os.getenv("AGENTA_LOG_FILE_ENABLED", "true") == "true" +# AGENTA_LOG_FILE_LEVEL = os.getenv("AGENTA_LOG_FILE_LEVEL", "WARNING").upper() +# AGENTA_LOG_FILE_BASE = os.getenv("AGENTA_LOG_FILE_PATH", "error") +# LOG_FILE_DATE = datetime.utcnow().strftime("%Y-%m-%d") +# AGENTA_LOG_FILE_PATH = f"{AGENTA_LOG_FILE_BASE}-{LOG_FILE_DATE}.log" + # COLORS LEVEL_COLORS = { "TRACE": "\033[97m", @@ -70,6 +88,15 @@ def process_positional_args(_, __, event_dict: EventDict) -> EventDict: return event_dict +# def add_trace_context(_, __, event_dict: EventDict) -> EventDict: +# span = get_current_span() +# if span and span.get_span_context().is_valid: +# ctx = span.get_span_context() +# event_dict["TraceId"] = format(ctx.trace_id, "032x") +# event_dict["SpanId"] = format(ctx.span_id, "016x") 
+# return event_dict + + def add_logger_info( logger: WrappedLogger, method_name: str, event_dict: EventDict ) -> EventDict: @@ -116,9 +143,36 @@ def render(_, __, event_dict: EventDict) -> str: return render +# def plain_renderer() -> Processor: +# hidden = { +# "SeverityText", +# "SeverityNumber", +# "MethodName", +# "logger_factory", +# "LoggerName", +# "level", +# } + +# def render(_, __, event_dict: EventDict) -> str: +# ts = event_dict.pop("Timestamp", "")[:23] + "Z" +# level = event_dict.get("level", "") +# msg = event_dict.pop("event", "") +# padded = f"[{level:<5}]" +# logger = f"[{event_dict.pop('logger', '')}]" +# extras = " ".join(f"{k}={v}" for k, v in event_dict.items() if k not in hidden) +# return f"{ts} {padded} {msg} {logger} {extras}" + +# return render + + +# def json_renderer() -> Processor: +# return structlog.processors.JSONRenderer() + + SHARED_PROCESSORS: list[Processor] = [ structlog.processors.TimeStamper(fmt="iso", utc=True, key="Timestamp"), process_positional_args, + # add_trace_context, add_logger_info, structlog.processors.format_exc_info, structlog.processors.dict_tracebacks, @@ -139,30 +193,36 @@ def create_struct_logger( ) -# Guard against double initialization -_LOGGING_CONFIGURED = False - # CONFIGURE HANDLERS AND STRUCTLOG LOGGERS handlers = [] loggers = [] -if AGENTA_LOG_CONSOLE_ENABLED and not _LOGGING_CONFIGURED: - _LOGGING_CONFIGURED = True - - # Check if console logger already has handlers (from OSS module) - console_logger = logging.getLogger("console") - - if not console_logger.handlers: - # Only add handler if it doesn't exist yet - h = logging.StreamHandler(sys.stdout) - h.setLevel(getattr(logging, AGENTA_LOG_CONSOLE_LEVEL, TRACE_LEVEL)) - h.setFormatter(logging.Formatter("%(message)s")) - console_logger.addHandler(h) - console_logger.setLevel(TRACE_LEVEL) - console_logger.propagate = False - +if AGENTA_LOG_CONSOLE_ENABLED: + h = logging.StreamHandler(sys.stdout) + h.setLevel(getattr(logging, AGENTA_LOG_CONSOLE_LEVEL, 
TRACE_LEVEL)) + h.setFormatter(logging.Formatter("%(message)s")) + logging.getLogger("console").addHandler(h) loggers.append(create_struct_logger([colored_console_renderer()], "console")) +# if AGENTA_LOG_FILE_ENABLED: +# h = RotatingFileHandler(AGENTA_LOG_FILE_PATH, maxBytes=10 * 1024 * 1024, backupCount=5) +# h.setLevel(getattr(logging, AGENTA_LOG_FILE_LEVEL, logging.WARNING)) +# h.setFormatter(logging.Formatter("%(message)s")) +# logging.getLogger("file").addHandler(h) +# loggers.append(create_struct_logger([plain_renderer()], "file")) + +# if AGENTA_LOG_OTLP_ENABLED: +# provider = LoggerProvider() +# exporter = OTLPLogExporter() +# provider.add_log_record_processor(BatchLogRecordProcessor(exporter)) +# set_logger_provider(provider) +# h = LoggingHandler( +# level=getattr(logging, AGENTA_LOG_OTLP_LEVEL, logging.INFO), logger_provider=provider +# ) +# h.setFormatter(logging.Formatter("%(message)s")) +# logging.getLogger("otel").addHandler(h) +# loggers.append(create_struct_logger([json_renderer()], "otel")) + class MultiLogger: def __init__(self, *loggers: structlog.stdlib.BoundLogger): diff --git a/sdk/agenta/sdk/workflows/handlers.py b/sdk/agenta/sdk/workflows/handlers.py index 2a72fe20eb..cbee22d305 100644 --- a/sdk/agenta/sdk/workflows/handlers.py +++ b/sdk/agenta/sdk/workflows/handlers.py @@ -76,153 +76,6 @@ def _compute_similarity(embedding_1: List[float], embedding_2: List[float]) -> f return dot / (norm1 * norm2) -import json -import re -from typing import Any, Dict, Iterable, Tuple, Optional - -try: - import jsonpath # ✅ use module API - from jsonpath import JSONPointer # pointer class is fine to use -except Exception: - jsonpath = None - JSONPointer = None - -# ========= Scheme detection ========= - - -def detect_scheme(expr: str) -> str: - """Return 'json-path', 'json-pointer', or 'dot-notation' based on the placeholder prefix.""" - if expr.startswith("$"): - return "json-path" - if expr.startswith("/"): - return "json-pointer" - return "dot-notation" 
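The resolver helpers being removed in this hunk dispatch on the placeholder's prefix: `$` selects JSONPath, `/` selects JSON Pointer, and anything else is treated as dot-notation. A runnable sketch of that dispatch, limited to the dot-notation branch (the JSONPath and JSON Pointer branches require `python-jsonpath`, and the error messages here are simplified relative to the originals):

```python
def detect_scheme(expr: str) -> str:
    """Classify a placeholder expression by its prefix."""
    if expr.startswith("$"):
        return "json-path"
    if expr.startswith("/"):
        return "json-pointer"
    return "dot-notation"


def resolve_dot_notation(expr: str, data: dict):
    """Resolve a dot-notation path against nested dicts and lists.

    A literal key match wins first, so a flat key like "topic.story"
    is not mistaken for nested access on data["topic"]["story"].
    """
    if expr in data:
        return data[expr]
    cur = data
    for token in (p for p in expr.split(".") if p):
        if isinstance(cur, list) and token.isdigit():
            cur = cur[int(token)]  # numeric token indexes into a list
        else:
            cur = cur[token]  # otherwise descend into a dict
    return cur
```

The literal-key check is the detail worth keeping in mind: it lets users name variables with dots without triggering nested lookup.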
- - -# ========= Resolvers ========= - - -def resolve_dot_notation(expr: str, data: dict) -> object: - if "[" in expr or "]" in expr: - raise KeyError(f"Bracket syntax is not supported in dot-notation: {expr!r}") - - # First, check if the expression exists as a literal key (e.g., "topic.story" as a single key) - # This allows users to use dots in their variable names without nested access - if expr in data: - return data[expr] - - # If not found as a literal key, try to parse as dot-notation path - cur = data - for token in (p for p in expr.split(".") if p): - if isinstance(cur, list) and token.isdigit(): - cur = cur[int(token)] - else: - if not isinstance(cur, dict): - raise KeyError( - f"Cannot access key {token!r} on non-dict while resolving {expr!r}" - ) - if token not in cur: - raise KeyError(f"Missing key {token!r} while resolving {expr!r}") - cur = cur[token] - return cur - - -def resolve_json_path(expr: str, data: dict) -> object: - if jsonpath is None: - raise ImportError("python-jsonpath is required for json-path ($...)") - - if not (expr == "$" or expr.startswith("$.") or expr.startswith("$[")): - raise ValueError( - f"Invalid json-path expression {expr!r}. " - "Must start with '$', '$.' or '$[' (no implicit normalization)." 
- ) - - # Use package-level APIf - results = jsonpath.findall(expr, data) # always returns a list - return results[0] if len(results) == 1 else results - - -def resolve_json_pointer(expr: str, data: Dict[str, Any]) -> Any: - """Resolve a JSON Pointer; returns a single value.""" - if JSONPointer is None: - raise ImportError("python-jsonpath is required for json-pointer (/...)") - return JSONPointer(expr).resolve(data) - - -def resolve_any(expr: str, data: Dict[str, Any]) -> Any: - """Dispatch to the right resolver based on detected scheme.""" - scheme = detect_scheme(expr) - if scheme == "json-path": - return resolve_json_path(expr, data) - if scheme == "json-pointer": - return resolve_json_pointer(expr, data) - return resolve_dot_notation(expr, data) - - -# ========= Placeholder & coercion helpers ========= - -_PLACEHOLDER_RE = re.compile(r"\{\{\s*(.*?)\s*\}\}") - - -def extract_placeholders(template: str) -> Iterable[str]: - """Yield the inner text of all {{ ... }} occurrences (trimmed).""" - for m in _PLACEHOLDER_RE.finditer(template): - yield m.group(1).strip() - - -def coerce_to_str(value: Any) -> str: - """Pretty stringify values for embedding into templates.""" - if isinstance(value, (dict, list)): - return json.dumps(value, ensure_ascii=False) - return str(value) - - -def build_replacements( - placeholders: Iterable[str], data: Dict[str, Any] -) -> Tuple[Dict[str, str], set]: - """ - Resolve all placeholders against data. - Returns (replacements, unresolved_placeholders). 
- """ - replacements: Dict[str, str] = {} - unresolved: set = set() - for expr in set(placeholders): - try: - val = resolve_any(expr, data) - # Escape backslashes to avoid regex replacement surprises - replacements[expr] = coerce_to_str(val).replace("\\", "\\\\") - except Exception: - unresolved.add(expr) - return replacements, unresolved - - -def apply_replacements(template: str, replacements: Dict[str, str]) -> str: - """Replace {{ expr }} using a callback to avoid regex-injection issues.""" - - def _repl(m: re.Match) -> str: - expr = m.group(1).strip() - return replacements.get(expr, m.group(0)) - - return _PLACEHOLDER_RE.sub(_repl, template) - - -def compute_truly_unreplaced(original: set, rendered: str) -> set: - """Only count placeholders that were in the original template and remain.""" - now = set(extract_placeholders(rendered)) - return original & now - - -def missing_lib_hints(unreplaced: set) -> Optional[str]: - """Suggest installing python-jsonpath if placeholders indicate json-path or json-pointer usage.""" - if any(expr.startswith("$") or expr.startswith("/") for expr in unreplaced) and ( - jsonpath is None or JSONPointer is None - ): - return ( - "Install python-jsonpath to enable json-path ($...) 
and json-pointer (/...)" - ) - return None - - def _format_with_template( content: str, format: str, @@ -237,24 +90,33 @@ def _format_with_template( try: return Template(content).render(**kwargs) - except TemplateError: + except TemplateError as e: return content elif format == "curly": - original_placeholders = set(extract_placeholders(content)) + import re - replacements, _unresolved = build_replacements(original_placeholders, kwargs) + # Extract variables that exist in the original template before replacement + # This allows us to distinguish template variables from {{}} in user input values + original_variables = set(re.findall(r"\{\{(.*?)\}\}", content)) - result = apply_replacements(content, replacements) + result = content + for key, value in kwargs.items(): + pattern = r"\{\{" + re.escape(key) + r"\}\}" + old_result = result + # Escape backslashes in the replacement string to prevent regex interpretation + escaped_value = str(value).replace("\\", "\\\\") + result = re.sub(pattern, escaped_value, result) - truly_unreplaced = compute_truly_unreplaced(original_placeholders, result) + # Only check if ORIGINAL template variables remain unreplaced + # Don't error on {{}} that came from user input values + unreplaced_matches = set(re.findall(r"\{\{(.*?)\}\}", result)) + truly_unreplaced = original_variables & unreplaced_matches if truly_unreplaced: - hint = missing_lib_hints(truly_unreplaced) - suffix = f" Hint: {hint}" if hint else "" + log.info(f"WORKFLOW Found unreplaced variables: {truly_unreplaced}") raise ValueError( - f"Template variables not found or unresolved: " - f"{', '.join(sorted(truly_unreplaced))}.{suffix}" + f"Template variables not found in inputs: {', '.join(sorted(truly_unreplaced))}" ) return result @@ -776,31 +638,6 @@ async def auto_ai_critique_v0( got=model, ) - response_type = parameters.get("response_type") or "text" - - if not response_type in ["text", "json_object", "json_schema"]: - raise InvalidConfigurationParameterV0Error( - 
path="response_type", - expected=["text", "json_object", "json_schema"], - got=response_type, - ) - - json_schema = parameters.get("json_schema") or None - - json_schema = json_schema if response_type == "json_schema" else None - - if response_type == "json_schema" and not isinstance(json_schema, dict): - raise InvalidConfigurationParameterV0Error( - path="json_schema", - expected="dict", - got=json_schema, - ) - - response_format: dict = dict(type=response_type) - - if response_type == "json_schema": - response_format["json_schema"] = json_schema - correct_answer = None if inputs: @@ -854,6 +691,13 @@ async def auto_ai_critique_v0( got=threshold, ) + if not 0.0 < threshold <= 1.0: + raise InvalidConfigurationParameterV0Error( + path="threshold", + expected="float[0.0, 1.0]", + got=threshold, + ) + _outputs = None # -------------------------------------------------------------------------- @@ -921,7 +765,6 @@ async def auto_ai_critique_v0( model=model, messages=formatted_prompt_template, temperature=0.01, - response_format=response_format, ) _outputs = response.choices[0].message.content.strip() # type: ignore @@ -945,20 +788,31 @@ async def auto_ai_critique_v0( pass if isinstance(_outputs, (int, float)): - return { - "score": _outputs, - "success": _outputs >= threshold, - } + return {"score": _outputs, "success": _outputs >= threshold} if isinstance(_outputs, bool): - return { - "success": _outputs, - } + return {"success": _outputs} if isinstance(_outputs, dict): - return _outputs + if "score" in _outputs and "success" in _outputs: + return { + "score": _outputs["score"], + "success": _outputs["success"], + } + + elif "score" in _outputs: + return { + "score": _outputs["score"], + "success": _outputs["score"] >= threshold, + } + + elif "success" in _outputs: + return {"success": _outputs} + + else: + return _outputs - raise InvalidOutputsV0Error(expected=["dict", "str", "int", "float"], got=_outputs) + raise InvalidOutputsV0Error(expected=["dict", "int", 
"float"], got=_outputs) @instrument(annotate=True) diff --git a/sdk/poetry.lock b/sdk/poetry.lock index 1541ee64cf..b383c0d85e 100644 --- a/sdk/poetry.lock +++ b/sdk/poetry.lock @@ -228,18 +228,18 @@ files = [ [[package]] name = "boto3" -version = "1.40.63" +version = "1.40.62" description = "The AWS SDK for Python" optional = false python-versions = ">=3.9" groups = ["dev"] files = [ - {file = "boto3-1.40.63-py3-none-any.whl", hash = "sha256:f15d4abf1a6283887c336f660cdfc2162a210d2d8f4d98dbcbcef983371c284d"}, - {file = "boto3-1.40.63.tar.gz", hash = "sha256:3bf4b034900c87a6a9b3b3b44c4aec26e96fc73bff2505f0766224b7295178ce"}, + {file = "boto3-1.40.62-py3-none-any.whl", hash = "sha256:f422d4ae3b278832ba807059aafa553164bce2c464cd65b24c9ea8fb8a6c4192"}, + {file = "boto3-1.40.62.tar.gz", hash = "sha256:3dbe7e1e7dc9127a4b1f2020a14f38ffe64fad84df00623e8ab6a5d49a82ea28"}, ] [package.dependencies] -botocore = ">=1.40.63,<1.41.0" +botocore = ">=1.40.62,<1.41.0" jmespath = ">=0.7.1,<2.0.0" s3transfer = ">=0.14.0,<0.15.0" @@ -248,14 +248,14 @@ crt = ["botocore[crt] (>=1.21.0,<2.0a0)"] [[package]] name = "botocore" -version = "1.40.63" +version = "1.40.62" description = "Low-level, data-driven core of boto 3." 
optional = false python-versions = ">=3.9" groups = ["dev"] files = [ - {file = "botocore-1.40.63-py3-none-any.whl", hash = "sha256:83657b3ee487268fccc9ba022cba572ba657b9ece8cddd1fa241e2c6a49c8c14"}, - {file = "botocore-1.40.63.tar.gz", hash = "sha256:0324552c3c800e258cbcb8c22b495a2e2e0260a7408d08016196e46fa0d1b587"}, + {file = "botocore-1.40.62-py3-none-any.whl", hash = "sha256:780f1d476d4b530ce3b12fd9f7112156d97d99ebdbbd9ef60635b0432af9d3a5"}, + {file = "botocore-1.40.62.tar.gz", hash = "sha256:1e8e57c131597dc234d67428bda1323e8f0a687ea13ea570253159ab9256fa28"}, ] [package.dependencies] @@ -744,14 +744,14 @@ files = [ [[package]] name = "fsspec" -version = "2025.10.0" +version = "2025.9.0" description = "File-system specification" optional = false python-versions = ">=3.9" groups = ["main"] files = [ - {file = "fsspec-2025.10.0-py3-none-any.whl", hash = "sha256:7c7712353ae7d875407f97715f0e1ffcc21e33d5b24556cb1e090ae9409ec61d"}, - {file = "fsspec-2025.10.0.tar.gz", hash = "sha256:b6789427626f068f9a83ca4e8a3cc050850b6c0f71f99ddb4f542b8266a26a59"}, + {file = "fsspec-2025.9.0-py3-none-any.whl", hash = "sha256:530dc2a2af60a414a832059574df4a6e10cce927f6f4a78209390fe38955cfb7"}, + {file = "fsspec-2025.9.0.tar.gz", hash = "sha256:19fd429483d25d28b65ec68f9f4adc16c17ea2c7c7bf54ec61360d478fb19c19"}, ] [package.extras] @@ -784,14 +784,14 @@ tqdm = ["tqdm"] [[package]] name = "google-auth" -version = "2.42.1" +version = "2.42.0" description = "Google Authentication Library" optional = false python-versions = ">=3.7" groups = ["main"] files = [ - {file = "google_auth-2.42.1-py2.py3-none-any.whl", hash = "sha256:eb73d71c91fc95dbd221a2eb87477c278a355e7367a35c0d84e6b0e5f9b4ad11"}, - {file = "google_auth-2.42.1.tar.gz", hash = "sha256:30178b7a21aa50bffbdc1ffcb34ff770a2f65c712170ecd5446c4bef4dc2b94e"}, + {file = "google_auth-2.42.0-py2.py3-none-any.whl", hash = "sha256:f8f944bcb9723339b0ef58a73840f3c61bc91b69bf7368464906120b55804473"}, + {file = "google_auth-2.42.0.tar.gz", hash = 
"sha256:9bbbeef3442586effb124d1ca032cfb8fb7acd8754ab79b55facd2b8f3ab2802"}, ] [package.dependencies] @@ -2217,21 +2217,6 @@ files = [ [package.extras] cli = ["click (>=5.0)"] -[[package]] -name = "python-jsonpath" -version = "2.0.1" -description = "JSONPath, JSON Pointer and JSON Patch for Python." -optional = false -python-versions = ">=3.8" -groups = ["main"] -files = [ - {file = "python_jsonpath-2.0.1-py3-none-any.whl", hash = "sha256:ebd518b7c883acc5b976518d76b6c96288405edec7d9ef838641869c1e1a5eb7"}, - {file = "python_jsonpath-2.0.1.tar.gz", hash = "sha256:32a84ebb2dc0ec1b42a6e165b0f9174aef8310bad29154ad9aee31ac37cca18f"}, -] - -[package.extras] -strict = ["iregexp-check (>=0.1.4)", "regex"] - [[package]] name = "pyyaml" version = "6.0.3" @@ -3188,4 +3173,4 @@ type = ["pytest-mypy"] [metadata] lock-version = "2.1" python-versions = "^3.11" -content-hash = "e6413824b6ec2fa2e89002d58d6c3432772dc3279619b8f54e4818abaa3b44a7" +content-hash = "14edf246a0775b4245d1b8d10d33092e474aadd3458b78b72d2a13d2bbdae975" diff --git a/sdk/pyproject.toml b/sdk/pyproject.toml index dd3bf54cf0..48c56139a8 100644 --- a/sdk/pyproject.toml +++ b/sdk/pyproject.toml @@ -1,6 +1,6 @@ [tool.poetry] name = "agenta" -version = "0.60.2" +version = "0.60.0" description = "The SDK for agenta is an open-source LLMOps platform." 
readme = "README.md" authors = [ @@ -34,7 +34,6 @@ pyyaml = "^6.0.2" toml = "^0.10.2" litellm = "==1.78.7" jinja2 = "^3.1.6" -python-jsonpath = "^2.0.0" opentelemetry-api = "^1.27.0" opentelemetry-sdk = "^1.27.0" opentelemetry-instrumentation = ">=0.56b0" diff --git a/web/ee/package.json b/web/ee/package.json index 01946d1ff3..3a7d0209d2 100644 --- a/web/ee/package.json +++ b/web/ee/package.json @@ -1,6 +1,6 @@ { "name": "@agenta/ee", - "version": "0.60.2", + "version": "0.62.0", "private": true, "engines": { "node": ">=18" diff --git a/web/ee/public/assets/Agenta-logo-full-dark-accent.png b/web/ee/public/assets/Agenta-logo-full-dark-accent.png deleted file mode 100644 index c14833dab1..0000000000 Binary files a/web/ee/public/assets/Agenta-logo-full-dark-accent.png and /dev/null differ diff --git a/web/ee/public/assets/Agenta-logo-full-light.png b/web/ee/public/assets/Agenta-logo-full-light.png deleted file mode 100644 index 4c9b31a813..0000000000 Binary files a/web/ee/public/assets/Agenta-logo-full-light.png and /dev/null differ diff --git a/web/ee/public/assets/dark-complete-transparent-CROPPED.png b/web/ee/public/assets/dark-complete-transparent-CROPPED.png new file mode 100644 index 0000000000..7d134ac59a Binary files /dev/null and b/web/ee/public/assets/dark-complete-transparent-CROPPED.png differ diff --git a/web/ee/public/assets/dark-complete-transparent_white_logo.png b/web/ee/public/assets/dark-complete-transparent_white_logo.png new file mode 100644 index 0000000000..8685bbf981 Binary files /dev/null and b/web/ee/public/assets/dark-complete-transparent_white_logo.png differ diff --git a/web/ee/public/assets/dark-logo.svg b/web/ee/public/assets/dark-logo.svg new file mode 100644 index 0000000000..6cb8ef3330 --- /dev/null +++ b/web/ee/public/assets/dark-logo.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/web/ee/public/assets/favicon.ico b/web/ee/public/assets/favicon.ico index dad02fe072..4dc8619b1d 100644 Binary files 
a/web/ee/public/assets/favicon.ico and b/web/ee/public/assets/favicon.ico differ diff --git a/web/ee/public/assets/light-complete-transparent-CROPPED.png b/web/ee/public/assets/light-complete-transparent-CROPPED.png new file mode 100644 index 0000000000..6be2e99e08 Binary files /dev/null and b/web/ee/public/assets/light-complete-transparent-CROPPED.png differ diff --git a/web/ee/public/assets/light-logo.svg b/web/ee/public/assets/light-logo.svg new file mode 100644 index 0000000000..9c795f8e88 --- /dev/null +++ b/web/ee/public/assets/light-logo.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/web/ee/src/components/EvalRunDetails/AutoEvalRun/components/EvalRunFocusDrawer/assets/FocusDrawerContent/index.tsx b/web/ee/src/components/EvalRunDetails/AutoEvalRun/components/EvalRunFocusDrawer/assets/FocusDrawerContent/index.tsx index 0df5ba5646..d0704f2feb 100644 --- a/web/ee/src/components/EvalRunDetails/AutoEvalRun/components/EvalRunFocusDrawer/assets/FocusDrawerContent/index.tsx +++ b/web/ee/src/components/EvalRunDetails/AutoEvalRun/components/EvalRunFocusDrawer/assets/FocusDrawerContent/index.tsx @@ -23,6 +23,7 @@ import {focusScenarioAtom} from "@/oss/components/EvalRunDetails/state/focusScen import {urlStateAtom} from "@/oss/components/EvalRunDetails/state/urlState" import MetricDetailsPopover from "@/oss/components/HumanEvaluations/assets/MetricDetailsPopover" import {formatMetricValue} from "@/oss/components/HumanEvaluations/assets/MetricDetailsPopover/assets/utils" +import {getMetricsFromEvaluator} from "@/oss/components/pages/observability/drawer/AnnotateDrawer/assets/transforms" import {getStatusLabel} from "@/oss/lib/constants/statusLabels" import { evaluationRunStateFamily, @@ -45,19 +46,6 @@ import FocusDrawerContentSkeleton from "../Skeletons/FocusDrawerContentSkeleton" import RunOutput, {fallbackPrimitive, resolveOnlineOutput} from "./assets/RunOutput" import RunTraceHeader from "./assets/RunTraceHeader" -import { - getFromAnnotationOutputs, - 
resolveEvaluatorMetricsMap, - SCENARIO_METRIC_ALIASES, - asEvaluatorArray, - extractEvaluatorSlug, - extractEvaluatorName, - findAnnotationStepKey, - collectSlugCandidates, - collectEvaluatorIdentifiers, - pickString, - buildDrawerMetricDefinition, -} from "./lib/helpers" const EMPTY_COMPARISON_RUN_IDS: string[] = [] @@ -72,7 +60,139 @@ const emptyMetricDataAtom = atom<{value: any; distInfo?: any}>({ distInfo: undefined, }) -export interface DrawerMetricValueCellProps { +const SCENARIO_METRIC_ALIASES: Record = { + "attributes.ag.metrics.costs.cumulative.total": ["totalCost", "costs.total", "cost"], + "attributes.ag.metrics.duration.cumulative": ["duration.total", "duration"], + "attributes.ag.metrics.tokens.cumulative.total": ["totalTokens", "tokens.total", "tokens"], + "attributes.ag.metrics.errors.cumulative": ["errors"], + totalCost: ["attributes.ag.metrics.costs.cumulative.total", "costs.total", "cost"], + "duration.total": ["attributes.ag.metrics.duration.cumulative", "duration"], + totalTokens: ["attributes.ag.metrics.tokens.cumulative.total", "tokens.total", "tokens"], + promptTokens: ["attributes.ag.metrics.tokens.cumulative.total", "tokens", "tokens.prompt"], + completionTokens: [ + "attributes.ag.metrics.tokens.cumulative.total", + "tokens", + "tokens.completion", + ], + errors: ["attributes.ag.metrics.errors.cumulative"], +} + +const asEvaluatorArray = (input: any): any[] => { + if (!input) return [] + if (Array.isArray(input)) return input + if (typeof input === "object") return Object.values(input) + return [] +} + +const pickString = (candidate: unknown): string | undefined => { + if (typeof candidate === "string") { + const trimmed = candidate.trim() + if (trimmed.length > 0) return trimmed + } + return undefined +} + +const collectEvaluatorIdentifiers = (entry: any): string[] => { + if (!entry || typeof entry !== "object") return [] + const ids = new Set() + ;[ + entry.slug, + entry.id, + entry.key, + entry.uid, + entry.evaluator_key, + 
entry?.data?.slug, + entry?.data?.id, + entry?.data?.key, + entry?.data?.evaluator_key, + entry?.meta?.slug, + entry?.meta?.id, + entry?.meta?.key, + entry?.flags?.slug, + entry?.flags?.id, + entry?.flags?.key, + entry?.flags?.evaluator_key, + entry?.references?.slug, + entry?.references?.id, + entry?.references?.key, + ].forEach((candidate) => { + const value = pickString(candidate) + if (value) ids.add(value) + }) + return Array.from(ids) +} + +const extractEvaluatorSlug = (entry: any): string | undefined => { + if (!entry || typeof entry !== "object") return undefined + const candidates = collectEvaluatorIdentifiers(entry) + if (candidates.length) return candidates[0] + return undefined +} + +const extractEvaluatorName = (entry: any): string | undefined => { + if (!entry || typeof entry !== "object") return undefined + const candidates = [ + entry?.name, + entry?.displayName, + entry?.display_name, + entry?.title, + entry?.label, + entry?.meta?.displayName, + entry?.meta?.display_name, + entry?.meta?.name, + entry?.flags?.display_name, + entry?.flags?.name, + entry?.data?.display_name, + entry?.data?.name, + ] + for (const candidate of candidates) { + const value = pickString(candidate) + if (value) return value + } + return undefined +} + +const asRecord = (value: any): Record | undefined => { + if (!value || typeof value !== "object" || Array.isArray(value)) return undefined + const entries = Object.entries(value) + if (!entries.length) return undefined + return value as Record +} + +const extractSchemaProperties = (entry: any): Record | undefined => { + if (!entry || typeof entry !== "object") return undefined + const candidates = [ + entry?.data?.schemas?.outputs?.properties, + entry?.data?.schemas?.output?.properties, + entry?.data?.service?.format?.properties?.outputs?.properties, + entry?.data?.service?.properties?.outputs?.properties, + entry?.data?.output_schema?.properties, + entry?.data?.outputs_schema?.properties, + entry?.output_schema?.properties, 
+ entry?.schema?.properties, + ] + for (const candidate of candidates) { + const record = asRecord(candidate) + if (record) return record + } + return undefined +} + +const resolveEvaluatorMetricsMap = (entry: any): Record | undefined => { + if (!entry || typeof entry !== "object") return undefined + const direct = asRecord(entry.metrics) + if (direct) return direct + + const schemaProps = extractSchemaProperties(entry) + if (schemaProps) return schemaProps + + const derived = asRecord(getMetricsFromEvaluator(entry as any)) + if (derived) return derived + + return undefined +} + +interface DrawerMetricValueCellProps { runId: string scenarioId?: string evaluatorSlug: string @@ -96,13 +216,172 @@ interface EvaluatorContext { errorStep?: EvaluatorFailure } -export interface DrawerEvaluatorMetric { +interface DrawerEvaluatorMetric { id: string displayName: string metricKey?: string fallbackKeys?: string[] } +const normalizeMetricPrimaryKey = (slug: string | undefined, rawKey: string): string => { + const normalizedSlug = slug && slug.trim().length > 0 ? slug.trim() : undefined + const trimmed = rawKey.trim() + if (!trimmed) return normalizedSlug ?? "" + if (normalizedSlug) { + const prefix = `${normalizedSlug}.` + if (trimmed.startsWith(prefix)) return trimmed + } + if (trimmed.includes(".")) return trimmed + return normalizedSlug ? `${normalizedSlug}.${trimmed}` : trimmed +} + +const collectMetricFallbackKeys = ( + slug: string | undefined, + rawKey: string, + primaryKey: string, + meta: any, +): string[] => { + const set = new Set() + const normalizedSlug = slug && slug.trim().length > 0 ? slug.trim() : undefined + const push = (value?: string) => { + if (!value) return + const trimmed = String(value).trim() + if (!trimmed) return + if (trimmed.includes(".") || !normalizedSlug) { + set.add(trimmed) + } else { + set.add(`${normalizedSlug}.${trimmed}`) + } + } + + push(rawKey) + + const aliases = Array.isArray(meta?.aliases) + ? meta?.aliases + : meta?.aliases + ? 
[meta.aliases] + : meta?.alias + ? [meta.alias] + : [] + aliases.forEach(push) + + const extraKeys = [ + meta?.metricKey, + meta?.metric_key, + meta?.key, + meta?.path, + meta?.fullKey, + meta?.full_key, + meta?.canonicalKey, + meta?.canonical_key, + meta?.statsKey, + meta?.stats_key, + meta?.metric, + ] + extraKeys.forEach(push) + + const fallbackKeys = Array.from(set).filter((value) => value !== rawKey && value !== primaryKey) + return fallbackKeys +} + +const buildDrawerMetricDefinition = ( + slug: string | undefined, + rawKey: string, + meta: any, +): DrawerEvaluatorMetric => { + const normalizedSlug = slug && slug.trim().length > 0 ? slug.trim() : undefined + const normalizedDisplay = + normalizedSlug && rawKey.startsWith(`${normalizedSlug}.`) + ? rawKey.slice(normalizedSlug.length + 1) + : rawKey + const primaryKey = normalizeMetricPrimaryKey(slug, rawKey) + const fallbackKeys = collectMetricFallbackKeys(slug, rawKey, primaryKey, meta) + const id = canonicalizeMetricKey(primaryKey) || primaryKey + + return { + id, + displayName: normalizedDisplay || primaryKey, + metricKey: primaryKey, + fallbackKeys: fallbackKeys.length ? 
fallbackKeys : undefined, + } +} + +const collectCandidateSteps = (data?: UseEvaluationRunScenarioStepsFetcherResult): any[] => { + if (!data) return [] + const buckets: any[] = [] + if (Array.isArray(data.annotationSteps)) buckets.push(...(data.annotationSteps as any[])) + if (Array.isArray(data.steps)) buckets.push(...(data.steps as any[])) + if (Array.isArray(data.invocationSteps)) buckets.push(...(data.invocationSteps as any[])) + return buckets +} + +const collectSlugCandidates = ( + data: UseEvaluationRunScenarioStepsFetcherResult | undefined, + evaluatorSlug: string, +): string[] => { + const set = new Set() + const push = (value?: string | null) => { + if (!value) return + const normalized = String(value).trim() + if (!normalized) return + set.add(normalized) + } + + push(evaluatorSlug) + + const steps = collectCandidateSteps(data) + steps.forEach((step) => { + if (!step) return + const ref: any = step?.references?.evaluator + push(step?.stepKey as any) + push(step?.stepkey as any) + push(step?.step_key as any) + push(ref?.slug) + push(ref?.key) + push(ref?.id) + }) + + return Array.from(set) +} + +const findAnnotationStepKey = ( + data: UseEvaluationRunScenarioStepsFetcherResult | undefined, + slugCandidates: string[], +): string | undefined => { + if (!data) return undefined + + const steps = collectCandidateSteps(data) + if (!steps.length) return undefined + + const loweredCandidates = slugCandidates + .map((slug) => String(slug).toLowerCase()) + .filter((slug) => slug.length > 0) + + const matched = steps.find((step) => { + if (!step) return false + const possible: string[] = [ + (step as any)?.stepKey, + (step as any)?.stepkey, + (step as any)?.step_key, + (step as any)?.references?.evaluator?.slug, + (step as any)?.references?.evaluator?.key, + (step as any)?.references?.evaluator?.id, + ] + + return possible + .filter(Boolean) + .map((value) => String(value).toLowerCase()) + .some((candidate) => loweredCandidates.includes(candidate)) + }) + + return 
( + (matched as any)?.stepKey ?? + (matched as any)?.stepkey ?? + (matched as any)?.step_key ?? + undefined + ) +} + const EvaluatorFailureDisplay = ({ status, error, @@ -270,8 +549,6 @@ const DrawerMetricValueCell = ({ const bareMetricData = useAtomValue(runScopedAtoms.bare) const canonicalBareMetricData = useAtomValue(runScopedAtoms.canonicalBare) - const runScopedStats = useAtomValue(runMetricsStatsCacheFamily(runId)) - const runScopedResult = useMemo(() => { const candidates = [ {key: normalizedPrimaryKey, data: primaryMetricData}, @@ -481,41 +758,8 @@ const DrawerMetricValueCell = ({ evaluatorSlug, ]) - // Prefer run-scoped/metrics-map value; if it is missing or schema-like, fallback to annotation outputs - const annotationFallback = useMemo(() => { - const v = resolution.rawValue - const isSchemaLike = - v && - typeof v === "object" && - !Array.isArray(v) && - Object.keys(v as any).length <= 2 && - "type" in (v as any) - - const unusable = - v === undefined || - v === null || - (typeof v === "string" && !v.trim()) || - (typeof v === "number" && Number.isNaN(v)) || - isSchemaLike - - if (!unusable) return undefined - - return getFromAnnotationOutputs({ - scenarioStepsResult, - slugCandidates, - evaluatorSlug, - expandedCandidates, - }) - }, [ - resolution.rawValue, - scenarioStepsResult, - slugCandidates, - evaluatorSlug, - expandedCandidates, - ]) - - const rawValue = annotationFallback?.value ?? resolution.rawValue - const matchedKey = annotationFallback?.matchedKey ?? 
resolution.matchedKey + const rawValue = resolution.rawValue + const matchedKey = resolution.matchedKey const distInfo = useMemo(() => { if (resolution.distInfo !== undefined) return resolution.distInfo @@ -640,7 +884,6 @@ const DrawerMetricValueCell = ({ const editorKey = `${runId}-${scenarioId}-${evaluatorSlug}-${metricName}` return ( {}} initialValue={display} @@ -663,10 +906,9 @@ const DrawerMetricValueCell = ({ const tagNode = ( {display} @@ -1852,7 +2094,7 @@ const FocusDrawerContent = () => { const isPrevOpen = !!(prevSlug && activeKeys.includes(prevSlug)) const metricMap = new Map() - const metricHelper = (meta: any, rawKey: string) => { + Object.entries(metrics || {}).forEach(([rawKey, meta]) => { const definition = buildDrawerMetricDefinition( evaluator.slug, String(rawKey), @@ -1875,16 +2117,6 @@ const FocusDrawerContent = () => { : undefined, }) } - } - - Object.entries(metrics || {}).forEach(([rawKey, meta]) => { - if (meta.properties) { - Object.entries(meta.properties).forEach(([propKey, propMeta]) => { - metricHelper(propMeta, `${rawKey}.${propKey}`) - }) - } else { - metricHelper(meta, rawKey) - } }) const metricDefs = Array.from(metricMap.values()) @@ -1919,7 +2151,7 @@ const FocusDrawerContent = () => { error: scenarioStepsError, }} sectionId={`section-${evaluator.slug}`} - metricRowClassName="flex flex-col items-start gap-1 mb-3 w-full" + metricRowClassName="flex flex-col items-start gap-1 mb-3" /> ), } diff --git a/web/ee/src/components/EvalRunDetails/AutoEvalRun/components/EvalRunFocusDrawer/assets/FocusDrawerContent/lib/helpers.ts b/web/ee/src/components/EvalRunDetails/AutoEvalRun/components/EvalRunFocusDrawer/assets/FocusDrawerContent/lib/helpers.ts deleted file mode 100644 index a84d3d1d55..0000000000 --- a/web/ee/src/components/EvalRunDetails/AutoEvalRun/components/EvalRunFocusDrawer/assets/FocusDrawerContent/lib/helpers.ts +++ /dev/null @@ -1,401 +0,0 @@ -import {UseEvaluationRunScenarioStepsFetcherResult} from 
"@/oss/lib/hooks/useEvaluationRunScenarioSteps/types" -import {DrawerEvaluatorMetric, DrawerMetricValueCellProps} from ".." -import {canonicalizeMetricKey} from "@/oss/lib/metricUtils" -import {getMetricsFromEvaluator} from "@/oss/components/pages/observability/drawer/AnnotateDrawer/assets/transforms" - -export const SCENARIO_METRIC_ALIASES: Record = { - "attributes.ag.metrics.costs.cumulative.total": ["totalCost", "costs.total", "cost"], - "attributes.ag.metrics.duration.cumulative": ["duration.total", "duration"], - "attributes.ag.metrics.tokens.cumulative.total": ["totalTokens", "tokens.total", "tokens"], - "attributes.ag.metrics.errors.cumulative": ["errors"], - totalCost: ["attributes.ag.metrics.costs.cumulative.total", "costs.total", "cost"], - "duration.total": ["attributes.ag.metrics.duration.cumulative", "duration"], - totalTokens: ["attributes.ag.metrics.tokens.cumulative.total", "tokens.total", "tokens"], - promptTokens: ["attributes.ag.metrics.tokens.cumulative.total", "tokens", "tokens.prompt"], - completionTokens: [ - "attributes.ag.metrics.tokens.cumulative.total", - "tokens", - "tokens.completion", - ], - errors: ["attributes.ag.metrics.errors.cumulative"], -} - -export const asEvaluatorArray = (input: any): any[] => { - if (!input) return [] - if (Array.isArray(input)) return input - if (typeof input === "object") return Object.values(input) - return [] -} - -export const pickString = (candidate: unknown): string | undefined => { - if (typeof candidate === "string") { - const trimmed = candidate.trim() - if (trimmed.length > 0) return trimmed - } - return undefined -} - -export const collectEvaluatorIdentifiers = (entry: any): string[] => { - if (!entry || typeof entry !== "object") return [] - const ids = new Set() - ;[ - entry.slug, - entry.id, - entry.key, - entry.uid, - entry.evaluator_key, - entry?.data?.slug, - entry?.data?.id, - entry?.data?.key, - entry?.data?.evaluator_key, - entry?.meta?.slug, - entry?.meta?.id, - entry?.meta?.key, - 
entry?.flags?.slug, - entry?.flags?.id, - entry?.flags?.key, - entry?.flags?.evaluator_key, - entry?.references?.slug, - entry?.references?.id, - entry?.references?.key, - ].forEach((candidate) => { - const value = pickString(candidate) - if (value) ids.add(value) - }) - return Array.from(ids) -} - -export const extractEvaluatorSlug = (entry: any): string | undefined => { - if (!entry || typeof entry !== "object") return undefined - const candidates = collectEvaluatorIdentifiers(entry) - if (candidates.length) return candidates[0] - return undefined -} - -export const extractEvaluatorName = (entry: any): string | undefined => { - if (!entry || typeof entry !== "object") return undefined - const candidates = [ - entry?.name, - entry?.displayName, - entry?.display_name, - entry?.title, - entry?.label, - entry?.meta?.displayName, - entry?.meta?.display_name, - entry?.meta?.name, - entry?.flags?.display_name, - entry?.flags?.name, - entry?.data?.display_name, - entry?.data?.name, - ] - for (const candidate of candidates) { - const value = pickString(candidate) - if (value) return value - } - return undefined -} - -export const asRecord = (value: any): Record | undefined => { - if (!value || typeof value !== "object" || Array.isArray(value)) return undefined - const entries = Object.entries(value) - if (!entries.length) return undefined - return value as Record -} - -export const extractSchemaProperties = (entry: any): Record | undefined => { - if (!entry || typeof entry !== "object") return undefined - const candidates = [ - entry?.data?.schemas?.outputs?.properties, - entry?.data?.schemas?.output?.properties, - entry?.data?.service?.format?.properties?.outputs?.properties, - entry?.data?.service?.properties?.outputs?.properties, - entry?.data?.output_schema?.properties, - entry?.data?.outputs_schema?.properties, - entry?.output_schema?.properties, - entry?.schema?.properties, - ] - for (const candidate of candidates) { - const record = asRecord(candidate) - if 
(record) return record - } - return undefined -} - -export const resolveEvaluatorMetricsMap = (entry: any): Record | undefined => { - if (!entry || typeof entry !== "object") return undefined - const direct = asRecord(entry.metrics) - if (direct) return direct - - const schemaProps = extractSchemaProperties(entry) - if (schemaProps) return schemaProps - - const derived = asRecord(getMetricsFromEvaluator(entry as any)) - if (derived) return derived - - return undefined -} - -export const normalizeMetricPrimaryKey = (slug: string | undefined, rawKey: string): string => { - const normalizedSlug = slug && slug.trim().length > 0 ? slug.trim() : undefined - const trimmed = rawKey.trim() - if (!trimmed) return normalizedSlug ?? "" - if (normalizedSlug) { - const prefix = `${normalizedSlug}.` - if (trimmed.startsWith(prefix)) return trimmed - } - if (trimmed.includes(".")) return trimmed - return normalizedSlug ? `${normalizedSlug}.${trimmed}` : trimmed -} - -export const collectMetricFallbackKeys = ( - slug: string | undefined, - rawKey: string, - primaryKey: string, - meta: any, -): string[] => { - const set = new Set() - const normalizedSlug = slug && slug.trim().length > 0 ? slug.trim() : undefined - const push = (value?: string) => { - if (!value) return - const trimmed = String(value).trim() - if (!trimmed) return - if (trimmed.includes(".") || !normalizedSlug) { - set.add(trimmed) - } else { - set.add(`${normalizedSlug}.${trimmed}`) - } - } - - push(rawKey) - - const aliases = Array.isArray(meta?.aliases) - ? meta?.aliases - : meta?.aliases - ? [meta.aliases] - : meta?.alias - ? 
[meta.alias] - : [] - aliases.forEach(push) - - const extraKeys = [ - meta?.metricKey, - meta?.metric_key, - meta?.key, - meta?.path, - meta?.fullKey, - meta?.full_key, - meta?.canonicalKey, - meta?.canonical_key, - meta?.statsKey, - meta?.stats_key, - meta?.metric, - ] - extraKeys.forEach(push) - - const fallbackKeys = Array.from(set).filter((value) => value !== rawKey && value !== primaryKey) - return fallbackKeys -} - -export const buildDrawerMetricDefinition = ( - slug: string | undefined, - rawKey: string, - meta: any, -): DrawerEvaluatorMetric => { - const normalizedSlug = slug && slug.trim().length > 0 ? slug.trim() : undefined - const normalizedDisplay = - normalizedSlug && rawKey.startsWith(`${normalizedSlug}.`) - ? rawKey.slice(normalizedSlug.length + 1) - : rawKey - const primaryKey = normalizeMetricPrimaryKey(slug, rawKey) - const fallbackKeys = collectMetricFallbackKeys(slug, rawKey, primaryKey, meta) - const id = canonicalizeMetricKey(primaryKey) || primaryKey - - return { - id, - displayName: normalizedDisplay || primaryKey, - metricKey: primaryKey, - fallbackKeys: fallbackKeys.length ? 
fallbackKeys : undefined, - } -} - -export const collectCandidateSteps = (data?: UseEvaluationRunScenarioStepsFetcherResult): any[] => { - if (!data) return [] - const buckets: any[] = [] - if (Array.isArray(data.annotationSteps)) buckets.push(...(data.annotationSteps as any[])) - if (Array.isArray(data.steps)) buckets.push(...(data.steps as any[])) - if (Array.isArray(data.invocationSteps)) buckets.push(...(data.invocationSteps as any[])) - return buckets -} - -export const collectSlugCandidates = ( - data: UseEvaluationRunScenarioStepsFetcherResult | undefined, - evaluatorSlug: string, -): string[] => { - const set = new Set() - const push = (value?: string | null) => { - if (!value) return - const normalized = String(value).trim() - if (!normalized) return - set.add(normalized) - } - - push(evaluatorSlug) - - const steps = collectCandidateSteps(data) - steps.forEach((step) => { - if (!step) return - const ref: any = step?.references?.evaluator - push(step?.stepKey as any) - push(step?.stepkey as any) - push(step?.step_key as any) - push(ref?.slug) - push(ref?.key) - push(ref?.id) - }) - - return Array.from(set) -} - -export const findAnnotationStepKey = ( - data: UseEvaluationRunScenarioStepsFetcherResult | undefined, - slugCandidates: string[], -): string | undefined => { - if (!data) return undefined - - const steps = collectCandidateSteps(data) - if (!steps.length) return undefined - - const loweredCandidates = slugCandidates - .map((slug) => String(slug).toLowerCase()) - .filter((slug) => slug.length > 0) - - const matched = steps.find((step) => { - if (!step) return false - const possible: string[] = [ - (step as any)?.stepKey, - (step as any)?.stepkey, - (step as any)?.step_key, - (step as any)?.references?.evaluator?.slug, - (step as any)?.references?.evaluator?.key, - (step as any)?.references?.evaluator?.id, - ] - - return possible - .filter(Boolean) - .map((value) => String(value).toLowerCase()) - .some((candidate) => 
loweredCandidates.includes(candidate)) - }) - - return ( - (matched as any)?.stepKey ?? - (matched as any)?.stepkey ?? - (matched as any)?.step_key ?? - undefined - ) -} - -/** Return the best primitive/array value from annotationSteps[].annotation.data.outputs */ -export const getFromAnnotationOutputs = ({ - scenarioStepsResult, - slugCandidates, - evaluatorSlug, - expandedCandidates, -}: { - scenarioStepsResult?: DrawerMetricValueCellProps["scenarioStepsResult"] - slugCandidates: string[] - evaluatorSlug: string - expandedCandidates: string[] -}): {value: any; matchedKey?: string} | undefined => { - const data = scenarioStepsResult?.data - if (!data || !Array.isArray(data.annotationSteps)) return undefined - - // choose only annotation steps that belong to any of our slug candidates - const pool = new Set(slugCandidates.map((s) => String(s).toLowerCase())) - const steps = (data.annotationSteps as any[]).filter((s) => { - const sk = s?.stepKey ?? s?.stepkey ?? s?.step_key - const ref = s?.references?.evaluator - const ids = [sk, ref?.slug, ref?.key, ref?.id] - .filter(Boolean) - .map((x) => String(x).toLowerCase()) - return ids.some((id) => pool.has(id)) - }) - - if (!steps.length) return undefined - - // outputs pockets we’re allowed to read as fallback - const outputsOf = (s: any) => - [s?.annotation?.data?.outputs, s?.data?.outputs, s?.outputs].filter( - (o) => o && typeof o === "object", - ) as Record[] - - const isPrimitive = (v: unknown) => - v === null || ["string", "number", "boolean"].includes(typeof v) - - const stripPfx = (k: string) => { - const PFX = [ - "attributes.ag.data.outputs.", - "ag.data.outputs.", - "outputs.", - `${evaluatorSlug}.`, - ] - for (const p of PFX) if (k.startsWith(p)) return k.slice(p.length) - return k - } - - const pathGet = (obj: any, path: string) => - path.split(".").reduce((acc, k) => (acc == null ? 
acc : acc[k]), obj) - - // 1) exact/bare path tries inside outputs - for (const s of steps) { - for (const outs of outputsOf(s)) { - for (const cand of expandedCandidates) { - const bare = stripPfx(cand) - for (const v of new Set([stripPfx(cand), bare, `extra.${bare}`])) { - const val = pathGet(outs, v) - if (val !== undefined && (isPrimitive(val) || Array.isArray(val))) { - return {value: val, matchedKey: v} - } - } - } - } - } - - // 2) fuzzy DFS through outputs (skip schema objects like { type: ... }) - const canonical = (s?: string) => - typeof s === "string" ? s.toLowerCase().replace(/[^a-z0-9]+/g, "") : "" - - const terminals = new Set( - expandedCandidates.map((k) => stripPfx(k).split(".").pop()!).map(canonical), - ) - - const looksLikeSchema = (o: any) => - o && - typeof o === "object" && - !Array.isArray(o) && - Object.keys(o).length <= 2 && - "type" in o && - (Object.keys(o).length === 1 || "description" in o) - - const dfs = (obj: any, path: string[] = []): {value: any; matchedKey: string} | undefined => { - if (!obj || typeof obj !== "object") return - for (const [k, v] of Object.entries(obj)) { - const p = [...path, k] - if (isPrimitive(v) || Array.isArray(v)) { - const hit = terminals.has(canonical(k)) || terminals.has(canonical(p[p.length - 1])) - if (hit) return {value: v, matchedKey: p.join(".")} - } else if (!looksLikeSchema(v)) { - const h = dfs(v, p) - if (h) return h - } - } - } - - for (const s of steps) { - for (const outs of outputsOf(s)) { - const hit = dfs(outs) - if (hit) return hit - } - } - - return undefined -} diff --git a/web/ee/src/components/EvalRunDetails/components/EvalRunOverviewViewer/index.tsx b/web/ee/src/components/EvalRunDetails/components/EvalRunOverviewViewer/index.tsx index fc434a83b1..68763db144 100644 --- a/web/ee/src/components/EvalRunDetails/components/EvalRunOverviewViewer/index.tsx +++ b/web/ee/src/components/EvalRunDetails/components/EvalRunOverviewViewer/index.tsx @@ -756,13 +756,8 @@ const EvalRunOverviewViewer 
= ({type = "auto"}: {type: "auto" | "online"}) => {
{hasMetrics ? ( <> - {combinedMetricEntries - .filter( - (entry) => - entry.metric?.type !== "string" && - entry.metric?.type !== "json", - ) - .map(({fullKey, metric, evaluatorSlug, metricKey}, idx) => { + {combinedMetricEntries.map( + ({fullKey, metric, evaluatorSlug, metricKey}, idx) => { if (!metric || !Object.keys(metric || {}).length) return null const isBooleanMetric = @@ -1171,7 +1166,8 @@ const EvalRunOverviewViewer = ({type = "auto"}: {type: "auto" | "online"}) => { placeholderDescription={placeholderCopy?.description} /> ) - })} + }, + )} {placeholderCards.length ? placeholderCards : null} ) : placeholderCards.length ? ( diff --git a/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/ScenarioTable.tsx b/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/ScenarioTable.tsx index 8de842f3a6..a7d10a41c3 100644 --- a/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/ScenarioTable.tsx +++ b/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/ScenarioTable.tsx @@ -22,7 +22,7 @@ import { type QueryWindowingPayload, } from "../../../../services/onlineEvaluations/api" import {EvalRunTestcaseTableSkeleton} from "../../AutoEvalRun/components/EvalRunTestcaseViewer/assets/EvalRunTestcaseViewerSkeleton" -import type {TableRow} from "./types" +import type {TableRow} from "../types" import useScrollToScenario from "./hooks/useScrollToScenario" import useTableDataSource from "./hooks/useTableDataSource" diff --git a/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/CellComponents.tsx b/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/CellComponents.tsx index b433a2c116..96452ec02e 100644 --- a/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/CellComponents.tsx +++ b/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/CellComponents.tsx @@ -442,8 +442,6 @@ export const 
InputSummaryCell = memo( const deepMerge = (target: Record, source?: Record) => { if (!source || typeof source !== "object") return target Object.entries(source).forEach(([key, rawValue]) => { - // Prevent prototype pollution by excluding dangerous keys - if (key === "__proto__" || key === "constructor" || key === "prototype") return const parsed = tryParseJson(rawValue) const value = parsed if (value && typeof value === "object" && !Array.isArray(value)) { diff --git a/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/MetricCell/MetricCell.tsx b/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/MetricCell/MetricCell.tsx index 28268c68eb..1a6b7d210f 100644 --- a/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/MetricCell/MetricCell.tsx +++ b/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/MetricCell/MetricCell.tsx @@ -1,4 +1,4 @@ -import {type ReactNode, memo, useCallback, useMemo} from "react" +import {type ReactNode, memo, useCallback, useEffect, useMemo} from "react" import {Tag, Tooltip} from "antd" import clsx from "clsx" @@ -27,7 +27,7 @@ import {EvaluationStatus} from "@/oss/lib/Types" import {STATUS_COLOR_TEXT} from "../../../EvalRunScenarioStatusTag/assets" import {CellWrapper} from "../CellComponents" // CellWrapper is default export? need to check. -import {resolveAnnotationMetricValue, resolveStepFailure} from "./helpers" +import {EvaluatorFailure, hasFailureStatus, resolveAnnotationMetricValue} from "./helpers" import {AnnotationValueCellProps, MetricCellProps, MetricValueCellProps} from "./types" /* @@ -159,22 +159,8 @@ const MetricCell = memo( formatted = String(value) } - // 1) Detect string by the actual value, not by metricType - const isPlainString = typeof value === "string" - - // 2) When string, render as a wrapped block (no popover) - if (isPlainString) { - return ( - -
- {value as string} -
-
- ) - } - - // 3) Only show popover for non-strings - if (distInfo && !isPlainString) { + // Wrap in popover when distInfo present + if (distInfo && metricType !== "string") { return ( data.name in x, - )?.[data.name]?.metricType - const kind: ColumnDef["kind"] = - type === "string" ? "annotation" : "metric" - return { ...data, name: data.name, key: `${metricKey}.${data.name}`, title: `${formattedName} ${isMean ? "(mean)" : ""}`.trim(), - kind, + kind: "metric", path: fullPath, fallbackPath: legacyPath, stepKey: "metric", - metricType: type, + metricType: metricsFromEvaluators[metricKey]?.find( + (x) => data.name in x, + )?.[data.name]?.metricType, } } return undefined @@ -310,17 +298,15 @@ export function buildScenarioTableData({ return undefined } const formattedName = formatColumnTitle(metricName) - const type = def?.[metricName]?.metricType - const kind: ColumnDef["kind"] = type === "string" ? "annotation" : "metric" return { name: metricName, key: `${metricKey}.${metricName}`, title: formattedName, - kind, - path: fullPath, - fallbackPath: fullPath, + kind: "metric" as const, + path: `${metricKey}.${metricName}`, + fallbackPath: `${metricKey}.${metricName}`, stepKey: "metric", - metricType: type, + metricType: def?.[metricName]?.metricType, } }) .filter(Boolean) as any[] diff --git a/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/utils.tsx b/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/utils.tsx index 93df472fcf..f5cc20bd06 100644 --- a/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/utils.tsx +++ b/web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/utils.tsx @@ -35,24 +35,7 @@ import {AnnotationValueCell, EvaluatorFailureCell, MetricValueCell} from "./Metr import TimestampCell from "./TimestampCell" import {BaseColumn, TableColumn} from "./types" -// ---------------- Helpers to detect/normalize annotation-like metric paths 
---------------- -const OUT_PREFIX = "attributes.ag.data.outputs." -const IN_PREFIX = "attributes.ag.data.inputs." - -/** A “metric” column that actually points inside the annotation payload. */ -const isAnnotationLikeMetricPath = (p?: string) => - typeof p === "string" && (p.includes(OUT_PREFIX) || p.includes(IN_PREFIX)) - -/** Strip the run-scoped prefix to the field path used by AnnotationValueCell helpers. */ -const toAnnotationFieldPath = (p: string) => - p.includes(OUT_PREFIX) - ? p.slice(OUT_PREFIX.length) - : p.includes(IN_PREFIX) - ? p.slice(IN_PREFIX.length) - : p -// ------------------------------------------------------------------------------------------ - -// Helper to compare metric/annotation primitives across scenarios (used for sorting metrics) +// Helper to compare metric/annotation primitives across scenarios function scenarioMetricPrimitive(recordKey: string, column: any, runId: string) { const st = evalAtomStore() let raw: any = column.values?.[recordKey] @@ -119,6 +102,9 @@ function scenarioMetricSorter(column: any, runId: string) { /** * Transforms a list of scenario metrics into a map of scenarioId -> metrics, merging * nested metrics under `outputs` into the same level. + * + * @param {{scenarioMetrics: any[]}} props - The props object containing the metrics. + * @returns {Record>} - A map of scenarioId -> metrics. */ export const getScenarioMetricsMap = ({scenarioMetrics}: {scenarioMetrics: any[]}) => { const map: Record> = {} @@ -128,9 +114,11 @@ export const getScenarioMetricsMap = ({scenarioMetrics}: {scenarioMetrics: any[] const sid = m.scenarioId if (!sid) return + // Clone the data object to avoid accidental mutations const data: Record = m && typeof m === "object" && m.data && typeof m.data === "object" ? 
{...m.data} : {} + // If metrics are nested under `outputs`, merge them into the same level if (data.outputs && typeof data.outputs === "object") { Object.assign(data, data.outputs) delete data.outputs @@ -157,7 +145,6 @@ const generateColumnTitle = (col: BaseColumn) => { if (col.kind === "annotation") return titleCase(col.name) return titleCase(col.title ?? col.name) } - const generateColumnWidth = (col: BaseColumn) => { if (col.kind === "meta") return 80 if (col.kind === "input") return COLUMN_WIDTHS.input @@ -166,7 +153,6 @@ const generateColumnWidth = (col: BaseColumn) => { if (col.kind === "invocation") return COLUMN_WIDTHS.response return 20 } - const orderRank = (def: EnhancedColumnType): number => { if (def.key === "#") return 0 if (def.key === "timestamp") return 1 @@ -177,6 +163,7 @@ const orderRank = (def: EnhancedColumnType): number => { if (def.key?.includes("evaluators")) return 6 if (def.key === "__metrics_group__") return 7 if (def.key === "errors") return 9 // ensure errors column stays at the end of metrics group + return 8 } @@ -262,13 +249,8 @@ export function buildAntdColumns( width: generateColumnWidth(c), __editLabel: editLabel, } - - // Sorting: - // - keep sorting for true numeric/boolean/string metrics - // - disable sorting for annotation-like metric paths (their values come from annotations, not metrics atoms) const sortable = (c.kind === "metric" || c.kind === "annotation") && - !isAnnotationLikeMetricPath(c.path) && isSortableMetricType(c.metricType) const sorter = sortable ? 
scenarioMetricSorter(c, runId) : undefined @@ -416,6 +398,7 @@ export function buildAntdColumns( width: 120, minWidth: 120, render: (_: any, record: TableRow) => { + // Use runId from record data instead of function parameter const effectiveRunId = (record as any).runId || runId return ( { + // Use runId from record data instead of function parameter const effectiveRunId = (record as any).runId || runId - + // if (record.isSkeleton) return switch (c.kind) { case "input": { const inputStepKey = resolveStepKeyForRun(c, effectiveRunId) @@ -725,25 +708,6 @@ export function buildAntdColumns( ) } case "metric": { - // If this “metric” is actually pointing inside annotations, render via AnnotationValueCell - if (isAnnotationLikeMetricPath(c.path)) { - const annotationStepKey = resolveStepKeyForRun(c, effectiveRunId) - const fieldPath = toAnnotationFieldPath(c.path) - return ( - - ) - } - const scenarioId = record.scenarioId || record.key const evaluatorSlug = (c as any).evaluatorSlug as string | undefined const groupIndex = (c as any).evaluatorColumnIndex ?? 0 diff --git a/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/assets/utils.ts b/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/assets/utils.ts index 2cd4ffbc56..f08ccb95ff 100644 --- a/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/assets/utils.ts +++ b/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/assets/utils.ts @@ -146,50 +146,25 @@ export const format3Sig = (num: number | string): string => { * Format a metric value using the mapping above. * Falls back to the raw value when the metric has no formatter or value is non-numeric. */ -export function formatMetricValue(metricKey: string, value: unknown): string { - if (value == null) { - return "" - } - - if (Array.isArray(value)) { - return value.map((v) => formatMetricValue(metricKey, v)).join(", ") - } - - if (typeof value === "boolean") { - return value ? 
"true" : "false" - } - - if (typeof value === "object") { - try { - return JSON.stringify(value, null, 2) - } catch (error) { - return String(value) - } - } - - if (typeof value !== "string" && typeof value !== "number") { - return String(value) - } - +export function formatMetricValue(metricKey: string, value: number | string): string { const fmt = METRIC_FORMATTERS[metricKey] || { decimals: 2, } - if (fmt?.format) { - return fmt.format(value) + if (Array.isArray(value)) { + return value.map((v) => { + return formatMetricValue(metricKey, v) + }) } + if (!fmt) return String(value) - if (typeof value !== "number") { - const numericValue = Number(value) - if (Number.isNaN(numericValue)) { - return String(value) - } - const adjusted = fmt.multiplier ? numericValue * fmt.multiplier : numericValue - const rounded = Number.isFinite(adjusted) ? format3Sig(adjusted) : format3Sig(value) - return `${fmt.prefix ?? ""}${rounded}${fmt.suffix ?? ""}` + if (fmt.format) { + return fmt.format(value) } - const adjusted = fmt.multiplier ? value * fmt.multiplier : value - const rounded = Number.isFinite(adjusted) ? format3Sig(adjusted) : format3Sig(value) + let num = typeof value === "number" ? value : Number(value) + num = fmt.multiplier ? num * fmt.multiplier : num + const rounded = + Number.isFinite(num) && fmt.decimals !== undefined ? format3Sig(num) : format3Sig(value) return `${fmt.prefix ?? ""}${rounded}${fmt.suffix ?? 
""}` } diff --git a/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/index.tsx b/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/index.tsx index 32870ec074..1dce19c1f0 100644 --- a/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/index.tsx +++ b/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/index.tsx @@ -287,9 +287,6 @@ export const MetricDetailsPopoverWrapper = memo( const summary = useMemo(() => { if (!stats) return "N/A" - if (resolvedMetricType === "string" || resolvedMetricType === "object") { - return "N/A" - } // Numeric metrics → mean if (typeof (stats as any).mean === "number") { return format3Sig(Number((stats as any).mean)) diff --git a/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/types.ts b/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/types.ts index ae12092aad..1d06b45a69 100644 --- a/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/types.ts +++ b/web/ee/src/components/HumanEvaluations/assets/MetricDetailsPopover/types.ts @@ -6,7 +6,7 @@ export interface MetricDetailsPopoverProps { primaryValue?: number | string extraDimensions: Record /** Value to highlight (bin/bar will be inferred from this value) */ - highlightValue?: number | string | boolean | Array + highlightValue?: number | string /** Hide primitives key‒value table; useful for lightweight popovers */ hidePrimitiveTable?: boolean /** Force using edge-axis (for debugging) */ diff --git a/web/ee/src/components/PostSignupForm/PostSignupForm.tsx b/web/ee/src/components/PostSignupForm/PostSignupForm.tsx index d481fc3350..ef9c38778c 100644 --- a/web/ee/src/components/PostSignupForm/PostSignupForm.tsx +++ b/web/ee/src/components/PostSignupForm/PostSignupForm.tsx @@ -345,7 +345,7 @@ const PostSignupForm = () => { <>
agenta-ai = ({settings, selectedTe )} > - {settings - .filter((field) => field.type !== "hidden") - .map((field) => { - const rules = [ - {required: field.required ?? true, message: "This field is required"}, - ] + {settings.map((field) => { + const rules = [ + {required: field.required ?? true, message: "This field is required"}, + ] - return ( - - {field.label} - {field.description && ( - - - - )} -
- } - initialValue={field.default} - rules={rules} - > - {(field.type === "string" || field.type === "regex") && - selectedTestcase.testcase ? ( - - option!.value - .toUpperCase() - .indexOf(inputValue.toUpperCase()) !== -1 - } - /> - ) : field.type === "string" || field.type === "regex" ? ( - - ) : field.type === "number" ? ( - - ) : field.type === "boolean" || field.type === "bool" ? ( - - ) : field.type === "text" ? ( - - ) : field.type === "code" ? ( - - ) : field.type === "multiple_choice" ? ( - + ) : field.type === "number" ? ( + + ) : field.type === "boolean" || field.type === "bool" ? ( + + ) : field.type === "text" ? ( + + ) : field.type === "code" ? ( + + ) : field.type === "multiple_choice" ? ( + setResponseFormat(value)} - options={[ - {label: "Boolean (True/False)", value: "boolean"}, - {label: "Continuous (Numeric Range)", value: "continuous"}, - {label: "Categorical (Predefined Options)", value: "categorical"}, - ]} - /> -
- - {/* Conditional fields based on response format */} - {responseFormat === "boolean" && ( - - )} - - {responseFormat === "continuous" && ( -
-
-
- Minimum - - - -
- setMinValue(value ?? 0)} - /> -
-
-
- Maximum - - - -
- setMaxValue(value ?? 10)} - /> -
-
- )} - - {responseFormat === "categorical" && ( -
-
- Categories - - - -
- {categories.map((category, index) => ( -
- - updateCategory(index, "name", e.target.value) - } - style={{width: 150}} - /> - - updateCategory(index, "description", e.target.value) - } - style={{flex: 1}} - /> -
- ))} - -
- )} - - {/* Include Reasoning */} -
- setIncludeReasoning(e.target.checked)} - > - Include reasoning - - - - -
- -
- {contextHolder} - - ) -} diff --git a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/JSONSchemaGenerator.ts b/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/JSONSchemaGenerator.ts deleted file mode 100644 index b6acddb008..0000000000 --- a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/JSONSchemaGenerator.ts +++ /dev/null @@ -1,152 +0,0 @@ -import deepEqual from "fast-deep-equal" -import {GeneratedJSONSchema, SchemaConfig} from "./types" - -export function isSchemaCompatibleWithBasicMode(schemaString: string): boolean { - const config = parseJSONSchema(schemaString) - - if (!config) { - return false - } - - try { - const parsed = JSON.parse(schemaString) - const normalizedOriginalSchema = parsed.schema || parsed - const regeneratedSchema = generateJSONSchema(config).schema - - return deepEqual(normalizedOriginalSchema, regeneratedSchema) - } catch { - return false - } -} - -export function generateJSONSchema(config: SchemaConfig): GeneratedJSONSchema { - const {responseFormat, includeReasoning, continuousConfig, categoricalOptions} = config - - const properties: Record = {} - const required: string[] = ["correctness"] - - // Base description is always "The grade results" - const baseDescription = "The grade results" - - // Add the main correctness field based on response format - switch (responseFormat) { - case "continuous": - properties.correctness = { - type: "number", - description: baseDescription, - minimum: continuousConfig?.minimum ?? 0, - maximum: continuousConfig?.maximum ?? 
10, - } - break - - case "boolean": - properties.correctness = { - type: "boolean", - description: baseDescription, - } - break - - case "categorical": - if (categoricalOptions && categoricalOptions.length > 0) { - const enumValues = categoricalOptions.map((opt) => opt.name) - const categoryDescriptions = categoricalOptions - .map((opt) => `"${opt.name}": ${opt.description}`) - .join("| ") - - properties.correctness = { - type: "string", - description: `${baseDescription}. Categories: ${categoryDescriptions}`, - enum: enumValues, - } - } else { - // Fallback if no categories defined - properties.correctness = { - type: "string", - description: baseDescription, - } - } - break - } - - // Add reasoning field if requested - if (includeReasoning) { - properties.comment = { - type: "string", - description: "Reasoning for the score", - } - required.push("comment") - } - - return { - name: "schema", - schema: { - title: "extract", - description: "Extract information from the user's response.", - type: "object", - properties, - required, - strict: true, - }, - } -} - -export function parseJSONSchema(schemaString: string): SchemaConfig | null { - try { - const parsed = JSON.parse(schemaString) - - // Handle both old format (direct schema) and new format (with name wrapper) - const schema = parsed.schema || parsed - - if (!schema.properties || !schema.properties.correctness) { - return null - } - - const correctness = schema.properties.correctness - const hasReasoning = !!schema.properties.comment - - let responseFormat: SchemaConfig["responseFormat"] = "boolean" - let continuousConfig: SchemaConfig["continuousConfig"] - let categoricalOptions: SchemaConfig["categoricalOptions"] - - if (correctness.type === "number") { - responseFormat = "continuous" - continuousConfig = { - minimum: correctness.minimum ?? 0, - maximum: correctness.maximum ?? 
10, - } - } else if (correctness.type === "boolean") { - responseFormat = "boolean" - } else if (correctness.type === "string" && correctness.enum) { - responseFormat = "categorical" - - // Parse category descriptions from the description field - const desc = correctness.description || "" - const categoriesMatch = desc.match(/Categories: (.+)/) - - if (categoriesMatch) { - const categoriesStr = categoriesMatch[1] - const categoryPairs = categoriesStr.split("| ") - - categoricalOptions = correctness.enum.map((name: string) => { - const pair = categoryPairs.find((p: string) => p.startsWith(`"${name}":`)) - const description = pair ? pair.split(": ")[1] || "" : "" - return {name, description} - }) - } else { - categoricalOptions = correctness.enum.map((name: string) => ({ - name, - description: "", - })) - } - } - - return { - responseFormat, - includeReasoning: hasReasoning, - continuousConfig, - categoricalOptions, - } - } catch { - return null - } -} diff --git a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/index.ts b/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/index.ts deleted file mode 100644 index 9447df2662..0000000000 --- a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/index.ts +++ /dev/null @@ -1,3 +0,0 @@ -export {JSONSchemaEditor} from "./JSONSchemaEditor" -export * from "./types" -export * from "./JSONSchemaGenerator" diff --git a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/types.ts b/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/types.ts deleted file mode 100644 index 7b758b77a3..0000000000 --- a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/types.ts +++ /dev/null @@ -1,38 +0,0 @@ -export type ResponseFormatType = "continuous" 
| "boolean" | "categorical" - -export interface ContinuousConfig { - minimum: number - maximum: number -} - -export interface CategoricalOption { - name: string - description: string -} - -export interface SchemaConfig { - responseFormat: ResponseFormatType - includeReasoning: boolean - continuousConfig?: ContinuousConfig - categoricalOptions?: CategoricalOption[] -} - -export interface JSONSchemaProperty { - type: string - description: string - minimum?: number - maximum?: number - enum?: string[] -} - -export interface GeneratedJSONSchema { - name: string - schema: { - title: string - description: string - type: "object" - properties: Record - required: string[] - strict: boolean - } -} diff --git a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/index.tsx b/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/index.tsx index a3ab293953..7c27c58fae 100644 --- a/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/index.tsx +++ b/web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/index.tsx @@ -6,14 +6,7 @@ import dynamic from "next/dynamic" import {createUseStyles} from "react-jss" import {useAppId} from "@/oss/hooks/useAppId" -import { - EvaluationSettingsTemplate, - Evaluator, - EvaluatorConfig, - JSSTheme, - testset, - Variant, -} from "@/oss/lib/Types" +import {Evaluator, EvaluatorConfig, JSSTheme, testset, Variant} from "@/oss/lib/Types" import { CreateEvaluationConfigData, createEvaluatorConfig, @@ -123,145 +116,55 @@ const ConfigureEvaluator = ({ trace: null, }) - const evaluatorVersionNumber = useMemo(() => { - const raw = - editEvalEditValues?.settings_values?.version ?? - selectedEvaluator?.settings_template?.version?.default ?? - 3 - - if (typeof raw === "number") return raw - // extract leading number (e.g., "4", "4.1", "v4") - const match = String(raw).match(/\d+(\.\d+)?/) - return match ? 
parseFloat(match[0]) : 3 - }, [editEvalEditValues?.settings_values?.version, selectedEvaluator]) - - const evalFields = useMemo(() => { - const templateEntries = Object.entries(selectedEvaluator?.settings_template || {}) - const allowStructuredOutputs = evaluatorVersionNumber >= 4 - - return templateEntries.reduce( - (acc, [key, field]) => { - const f = field as Partial | undefined - if (!f?.type) return acc - if (!allowStructuredOutputs && (key === "json_schema" || key === "response_type")) { - return acc - } - acc.push({ + const evalFields = useMemo( + () => + Object.keys(selectedEvaluator?.settings_template || {}) + .filter((key) => !!selectedEvaluator?.settings_template[key]?.type) + .map((key) => ({ key, - ...(f as EvaluationSettingsTemplate), - advanced: Boolean((f as any)?.advanced), - }) - return acc - }, - [] as Array, - ) - }, [selectedEvaluator, evaluatorVersionNumber]) + ...selectedEvaluator?.settings_template[key]!, + advanced: selectedEvaluator?.settings_template[key]?.advanced || false, + })), + [selectedEvaluator], + ) const advancedSettingsFields = evalFields.filter((field) => field.advanced) const basicSettingsFields = evalFields.filter((field) => !field.advanced) - const onSubmit = async (values: CreateEvaluationConfigData) => { + const onSubmit = (values: CreateEvaluationConfigData) => { try { setSubmitLoading(true) if (!selectedEvaluator.key) throw new Error("No selected key") const settingsValues = values.settings_values || {} - const jsonSchemaFieldPath: Array = ["settings_values", "json_schema"] - const hasJsonSchema = Object.prototype.hasOwnProperty.call( - settingsValues, - "json_schema", - ) - - if (hasJsonSchema) { - form.setFields([{name: jsonSchemaFieldPath, errors: []}]) - - if (typeof settingsValues.json_schema === "string") { - try { - const parsed = JSON.parse(settingsValues.json_schema) - if (!parsed || typeof parsed !== "object" || Array.isArray(parsed)) { - throw new Error() - } - settingsValues.json_schema = parsed - } catch { 
- form.setFields([ - { - name: jsonSchemaFieldPath, - errors: ["Enter a valid JSON object"], - }, - ]) - throw new Error("JSON schema must be a valid JSON object") - } - } else if ( - settingsValues.json_schema && - (typeof settingsValues.json_schema !== "object" || - Array.isArray(settingsValues.json_schema)) - ) { - form.setFields([ - { - name: jsonSchemaFieldPath, - errors: ["Enter a valid JSON object"], - }, - ]) - throw new Error("JSON schema must be a valid JSON object") - } - } - const data = { ...values, evaluator_key: selectedEvaluator.key, settings_values: settingsValues, } - if (editMode) { - await updateEvaluatorConfig(editEvalEditValues?.id!, data) - - setEditEvalEditValues((previous) => - previous - ? { - ...previous, - ...data, - settings_values: settingsValues, - } - : previous, - ) - } else { - const response = await createEvaluatorConfig(appId, data) - const createdConfig = response?.data - - if (createdConfig) { - setEditEvalEditValues(createdConfig) - setEditMode(true) - } - } - - onSuccess() + ;(editMode + ? 
updateEvaluatorConfig(editEvalEditValues?.id!, data) + : createEvaluatorConfig(appId, data) + ) + .then(onSuccess) + .catch(console.error) + .finally(() => setSubmitLoading(false)) } catch (error: any) { - if (error?.errorFields) return + setSubmitLoading(false) console.error(error) message.error(error.message) - } finally { - setSubmitLoading(false) } } useEffect(() => { - // Reset form before loading new values so there are no stale values form.resetFields() - - if (editMode && editEvalEditValues) { - // Load all values including nested settings_values - form.setFieldsValue({ - ...editEvalEditValues, - settings_values: editEvalEditValues.settings_values || {}, - }) - } else if (cloneConfig && editEvalEditValues) { - // When cloning, copy only settings_values and clear the name so user provides a new name - form.setFieldsValue({ - settings_values: editEvalEditValues.settings_values || {}, - name: "", - }) + if (editMode) { + form.setFieldsValue(editEvalEditValues) + } else if (cloneConfig) { + form.setFieldValue("settings_values", editEvalEditValues?.settings_values) } - }, [editMode, cloneConfig, editEvalEditValues, form]) + }, [editMode, cloneConfig]) return (
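The ConfigureEvaluator diff above removes an `evaluatorVersionNumber` memo that normalized mixed version representations (a raw number, or strings like "4", "4.1", "v4") down to a number with a fallback of 3. The core of that logic can be sketched standalone; the helper name here is mine, but the regex and fallback come from the removed code:

```typescript
// Minimal sketch of the removed version-normalization logic (name hypothetical).
// Accepts a number, or a string containing a leading number ("4", "4.1", "v4"),
// and falls back to a default when nothing numeric is found.
const parseEvaluatorVersion = (raw: unknown, fallback = 3): number => {
    if (typeof raw === "number") return raw
    // Extract the first integer-or-decimal run from the string form.
    const match = String(raw ?? "").match(/\d+(\.\d+)?/)
    return match ? parseFloat(match[0]) : fallback
}
```

The memo used this value to gate version-specific settings fields (`json_schema`, `response_type` only for version >= 4), which is why removing it also removes that filtering from `evalFields`.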
diff --git a/web/ee/src/components/pages/settings/Billing/index.tsx b/web/ee/src/components/pages/settings/Billing/index.tsx index 3a5ec92157..fec538eac0 100644 --- a/web/ee/src/components/pages/settings/Billing/index.tsx +++ b/web/ee/src/components/pages/settings/Billing/index.tsx @@ -104,7 +104,7 @@ const Billing = () => {
 {Object.entries(usage)
-    ?.filter(([key]) => (key !== "users" && key !== "applications"))
+    ?.filter(([key]) => key !== "users")
     ?.map(([key, info]) => {
         return (
 (typeof msg === "string" ? msg : JSON.stringify(msg)))
-                  .join("\n")
-            : (value?.toString() ?? "-")
-        case "multiple_choice":
-            return Array.isArray(value) ? value.join(", ") : (value?.toString() ?? "-")
-        case "hidden":
-            return "-"
         default:
-            return value?.toString() ?? "-"
+            return value?.toString()
     }
 }
diff --git a/web/ee/src/lib/hooks/useEvaluationRunData/refreshLiveRun.ts b/web/ee/src/lib/hooks/useEvaluationRunData/refreshLiveRun.ts
index 2d9a63f16b..dca3a7eab1 100644
--- a/web/ee/src/lib/hooks/useEvaluationRunData/refreshLiveRun.ts
+++ b/web/ee/src/lib/hooks/useEvaluationRunData/refreshLiveRun.ts
@@ -84,7 +84,7 @@ export const refreshLiveEvaluationRun = async (runId: string): Promise
 (o ? o[key] : undefined), obj)
 }

 export function computeInputsAndGroundTruth({
diff --git a/web/ee/src/state/observability/dashboard.ts b/web/ee/src/state/observability/dashboard.ts
index e8c0215a71..6d040b22d7 100644
--- a/web/ee/src/state/observability/dashboard.ts
+++ b/web/ee/src/state/observability/dashboard.ts
@@ -3,9 +3,9 @@ import {eagerAtom} from "jotai-eager"
 import {atomWithQuery} from "jotai-tanstack-query"

 import {GenerationDashboardData} from "@/oss/lib/types_ee"
+import {fetchGenerationsDashboardData} from "@/oss/services/tracing/api"
 import {routerAppIdAtom} from "@/oss/state/app/atoms/fetcher"
 import {projectIdAtom} from "@/oss/state/project"
-import {fetchGenerationsDashboardData} from "@/oss/services/tracing/api"

 const DEFAULT_RANGE = "30_days"
diff --git a/web/oss/package.json b/web/oss/package.json
index 6d4e247c86..c773e5dccd 100644
--- a/web/oss/package.json
+++ b/web/oss/package.json
@@ -1,6 +1,6 @@
 {
     "name": "@agenta/oss",
-    "version": "0.60.2",
+    "version": "0.60.0",
     "private": true,
     "engines": {
         "node": ">=18"
diff --git a/web/oss/public/assets/Agenta-logo-full-dark-accent.png b/web/oss/public/assets/Agenta-logo-full-dark-accent.png
deleted file mode 100644
index c14833dab1..0000000000
Binary files a/web/oss/public/assets/Agenta-logo-full-dark-accent.png and /dev/null differ
diff --git a/web/oss/public/assets/Agenta-logo-full-light.png b/web/oss/public/assets/Agenta-logo-full-light.png
deleted file mode 100644
index 4c9b31a813..0000000000
Binary files a/web/oss/public/assets/Agenta-logo-full-light.png and /dev/null differ
diff --git a/web/oss/public/assets/dark-complete-transparent-CROPPED.png b/web/oss/public/assets/dark-complete-transparent-CROPPED.png
new file mode 100644
index 0000000000..7d134ac59a
Binary files /dev/null and b/web/oss/public/assets/dark-complete-transparent-CROPPED.png differ
diff --git a/web/oss/public/assets/dark-complete-transparent_white_logo.png b/web/oss/public/assets/dark-complete-transparent_white_logo.png
new file mode 100644
index 0000000000..8685bbf981
Binary files /dev/null and b/web/oss/public/assets/dark-complete-transparent_white_logo.png differ
diff --git a/web/oss/public/assets/dark-logo.svg b/web/oss/public/assets/dark-logo.svg
new file mode 100644
index 0000000000..6cb8ef3330
--- /dev/null
+++ b/web/oss/public/assets/dark-logo.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/web/oss/public/assets/favicon.ico b/web/oss/public/assets/favicon.ico
index dad02fe072..4dc8619b1d 100644
Binary files a/web/oss/public/assets/favicon.ico and b/web/oss/public/assets/favicon.ico differ
diff --git a/web/oss/public/assets/light-complete-transparent-CROPPED.png b/web/oss/public/assets/light-complete-transparent-CROPPED.png
new file mode 100644
index 0000000000..6be2e99e08
Binary files /dev/null and b/web/oss/public/assets/light-complete-transparent-CROPPED.png differ
diff --git a/web/oss/public/assets/light-logo.svg b/web/oss/public/assets/light-logo.svg
new file mode 100644
index 0000000000..9c795f8e88
--- /dev/null
+++ b/web/oss/public/assets/light-logo.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/web/oss/src/components/Filters/EditColumns/assets/helper.ts b/web/oss/src/components/Filters/EditColumns/assets/helper.ts
index 511eabc8f6..3cbfc58d73 100644
--- a/web/oss/src/components/Filters/EditColumns/assets/helper.ts
+++ b/web/oss/src/components/Filters/EditColumns/assets/helper.ts
@@ -30,5 +30,5 @@ export const formatColumnTitle = (text: string) => {
     return text
         .replace(/_/g, " ")
         .replace(/([a-z])([A-Z])/g, "$1 $2")
-        .replace(/\b\w/g, (c) => c)
+        .replace(/\b\w/g, (c) => c.toUpperCase())
 }
diff --git a/web/oss/src/components/Logo/Logo.tsx b/web/oss/src/components/Logo/Logo.tsx
index 1c3b6447d5..ddb1c133f9 100644
--- a/web/oss/src/components/Logo/Logo.tsx
+++ b/web/oss/src/components/Logo/Logo.tsx
@@ -5,8 +5,8 @@ import Image from "next/image"
 import {useAppTheme} from "../Layout/ThemeContextProvider"

 const LOGOS = {
-    dark: "/assets/Agenta-logo-full-dark-accent.png",
-    light: "/assets/Agenta-logo-full-light.png",
+    dark: "/assets/dark-complete-transparent-CROPPED.png",
+    light: "/assets/light-complete-transparent-CROPPED.png",
 }

 const Logo: React.FC> & {isOnlyIconLogo?: boolean}> = (
diff --git a/web/oss/src/components/TestsetTable/TestsetTable.tsx b/web/oss/src/components/TestsetTable/TestsetTable.tsx
index 17fa756fa9..4275a1e49d 100644
--- a/web/oss/src/components/TestsetTable/TestsetTable.tsx
+++ b/web/oss/src/components/TestsetTable/TestsetTable.tsx
@@ -1,13 +1,5 @@
 // @ts-nocheck
-import {
-    type FC,
-    type ChangeEvent,
-    ReactNode,
-    useEffect,
-    useState,
-    useMemo,
-    useCallback,
-} from "react"
+import {type FC, type ChangeEvent, ReactNode, useEffect, useState, useMemo, useCallback} from "react"

 import {type IHeaderParams} from "@ag-grid-community/core"
 import {CheckCircleFilled} from "@ant-design/icons"
@@ -417,7 +409,6 @@ const TestsetTable: FC = ({mode}) => {
     onRowSelected={onRowSelectedOrDeselected}
     onRowDataUpdated={onRowSelectedOrDeselected}
     className="ph-no-capture"
-    suppressFieldDotNotation={true}
 />
diff --git a/web/oss/src/components/pages/app-management/modals/MaxAppModal.tsx b/web/oss/src/components/pages/app-management/modals/MaxAppModal.tsx
index 330ad640cc..8ee1f9dfc2 100644
--- a/web/oss/src/components/pages/app-management/modals/MaxAppModal.tsx
+++ b/web/oss/src/components/pages/app-management/modals/MaxAppModal.tsx
@@ -43,7 +43,7 @@ const MaxAppModal: React.FC = ({...props}) => {
 aenta-ai
 | null => {
-        if (input == null) return null
-        if (Array.isArray(input)) return {}
-        if (typeof input === "object") return input as Record
-        return {}
-    }
-
     const onFinish = useCallback(
         async (values: any) => {
             try {
@@ -202,9 +195,7 @@ const CreateEvaluator = ({
                 is_custom: false,
             },
             meta: evaluatorWithMeta.meta || {},
-            ...(evaluatorWithMeta.tags
-                ? {tags: normalizeTags(evaluatorWithMeta.tags)}
-                : {}),
+            ...(evaluatorWithMeta.tags ? {tags: evaluatorWithMeta.tags} : {}),
         },
     }
diff --git a/web/oss/src/components/pages/observability/drawer/TestsetDrawer/TestsetDrawer.tsx b/web/oss/src/components/pages/observability/drawer/TestsetDrawer/TestsetDrawer.tsx
index af04837b4a..9c796b31c3 100644
--- a/web/oss/src/components/pages/observability/drawer/TestsetDrawer/TestsetDrawer.tsx
+++ b/web/oss/src/components/pages/observability/drawer/TestsetDrawer/TestsetDrawer.tsx
@@ -37,7 +37,6 @@ import {useTestsetsData} from "@/oss/state/testset"

 import {useStyles} from "./assets/styles"
 import {Mapping, Preview, TestsetColumn, TestsetDrawerProps, TestsetTraceData} from "./assets/types"
-import {getValueAtPath} from "./assets/helpers"

 const TestsetDrawer = ({
     onClose,
@@ -358,7 +357,8 @@ const TestsetDrawer = ({
                 continue // Skip duplicate columns for now
             }

-            const value = getValueAtPath(item, mapping.data)
+            const keys = mapping.data.split(".")
+            const value = keys.reduce((acc: any, key) => acc?.[key], item)

             formattedItem[targetKey] =
                 value === undefined || value === null
diff --git a/web/oss/src/components/pages/observability/drawer/TestsetDrawer/assets/helpers.ts b/web/oss/src/components/pages/observability/drawer/TestsetDrawer/assets/helpers.ts
deleted file mode 100644
index 9ffb36365a..0000000000
--- a/web/oss/src/components/pages/observability/drawer/TestsetDrawer/assets/helpers.ts
+++ /dev/null
@@ -1,32 +0,0 @@
-const splitPath = (path: string) => path.split(/(?<!\\)\./).map((p) => p.replace(/\\\./g, "."))
-
-export const getValueAtPath = (obj: any, rawPath: string) => {
-    if (obj == null || !rawPath) return undefined
-
-    // quick direct hit (entire path is a literal key on the current object)
-    if (Object.prototype.hasOwnProperty.call(obj, rawPath)) return obj[rawPath]
-
-    const parts = splitPath(rawPath)
-    let cur: any = obj
-
-    for (let i = 0; i < parts.length; i++) {
-        if (cur == null) return undefined
-
-        const key = parts[i]
-
-        if (Object.prototype.hasOwnProperty.call(cur, key)) {
-            cur = cur[key]
-            continue
-        }
-
-        // fallback: treat the remaining segments as one literal key containing dots
-        const remainder = parts.slice(i).join(".")
-        if (Object.prototype.hasOwnProperty.call(cur, remainder)) {
-            return cur[remainder]
-        }
-
-        return undefined
-    }
-
-    return cur
-}
diff --git a/web/oss/src/lib/Types.ts b/web/oss/src/lib/Types.ts
index 1a2c86c43b..54f92ef345 100644
--- a/web/oss/src/lib/Types.ts
+++ b/web/oss/src/lib/Types.ts
@@ -831,16 +831,9 @@ export interface StyleProps {
     themeMode: "dark" | "light"
 }

-export interface SettingsPreset {
-    key: string;
-    name: string;
-    values: Record;
-}
-
 export interface Evaluator {
     name: string
     key: string
-    settings_presets?: SettingsPreset[]
     settings_template: Record
     icon_url?: string | StaticImageData
     color?: string
@@ -983,7 +976,6 @@ type ValueTypeOptions =
     | "hidden"
     | "messages"
     | "multiple_choice"
-    | "llm_response_schema"

 export interface EvaluationSettingsTemplate {
     type: ValueTypeOptions
diff --git a/web/oss/src/pages/auth/[[...path]].tsx b/web/oss/src/pages/auth/[[...path]].tsx
index 2d93bdfcf7..ef1ed9af39 100644
--- a/web/oss/src/pages/auth/[[...path]].tsx
+++ b/web/oss/src/pages/auth/[[...path]].tsx
@@ -123,7 +123,7 @@ const Auth = () => {
     )}
 >
 agenta-ai
 {
     const [csvVersion, setCsvVersion] = useState(0)
     const [isValidating, setIsValidating] = useState(false)

-    // Keep a ref in sync so the effect can read the latest fallback columns without re-running.
-    const columnsFallbackRef = useRef(columnsFallback)
-    useEffect(() => {
-        columnsFallbackRef.current = columnsFallback
-    }, [columnsFallback])
-
-    // Track ids that we already attempted (success or non-cancel failure) and those in flight.
-    const triedRef = useRef>(new Set())
-    const inFlightRef = useRef>(new Set())
-
     // Extract CSV columns from the TanStack Query cache for any testset
     const cachedColumnsByTestsetId = useMemo(() => {
         if (!enabled) return {}
@@ -43,9 +33,9 @@ export const useTestsetsData = ({enabled = true} = {}) => {
             const source =
                 firstRow &&
                 typeof firstRow === "object" &&
-                (firstRow as any).data &&
-                typeof (firstRow as any).data === "object"
-                    ? ((firstRow as any).data as Record)
+                firstRow.data &&
+                typeof firstRow.data === "object"
+                    ? (firstRow.data as Record)
                     : (firstRow as Record)
             result[ts._id] = Object.keys(source)
         } else {
@@ -53,10 +43,9 @@ export const useTestsetsData = ({enabled = true} = {}) => {
             }
         })
         return result
-    }, [queryClient, testsets, csvVersion, enabled])
+    }, [queryClient, testsets, csvVersion])

     // Merge cache with fallback (from preview single testcase query)
-    // Depend on `columnsFallback` so consumers re-render when we infer columns.
     const columnsByTestsetId = useMemo(() => {
         if (!enabled) return {}
         const merged: Record = {...cachedColumnsByTestsetId}
@@ -66,133 +55,89 @@ export const useTestsetsData = ({enabled = true} = {}) => {
             }
         })
         return merged
-    }, [cachedColumnsByTestsetId, enabled, columnsFallback])
+    }, [cachedColumnsByTestsetId, columnsFallback])

-    // Background fill: for testsets without cached columns, fetch a single testcase to infer columns.
+    // Background fill: for testsets without cached columns, fetch a single testcase to infer columns
+    const triedRef = useRef>(new Set())
     useEffect(() => {
         if (!enabled) return
         if (isPending || isLoading) return

         const controller = new AbortController()
-
-        const getPending = () => {
-            const fallback = columnsFallbackRef.current
-            const pending = (testsets ?? []).filter((ts: any) => {
+        const tried = triedRef.current
+        const run = async () => {
+            if (!Array.isArray(testsets) || testsets.length === 0) return
+            const pending = testsets.filter((ts: any) => {
                 const id = ts?._id
                 if (!id) return false
-                // If cache already has columns, skip
-                if (cachedColumnsByTestsetId[id]?.length) return false
-                // If fallback already has columns, skip
-                if (fallback[id]?.length) return false
-                // Avoid double-starting work
-                if (inFlightRef.current.has(id)) return false
-                // Skip ids we already tried (success or hard failure)
-                if (triedRef.current.has(id)) return false
+                if (columnsByTestsetId[id]) return false
+                if (columnsFallback[id]) return false
+                if (tried.has(id)) return false
                 return true
             })
-            return pending
-        }
-
-        const BATCH = 6
-        let stopped = false
-
-        const run = async () => {
-            // Process as many batches as needed in one effect run to avoid re-run storms.
-            setIsValidating(true)
-            try {
-                while (!stopped && !controller.signal.aborted) {
-                    const pending = getPending()
-                    if (pending.length === 0) break
-
-                    const toFetch = pending.slice(0, BATCH)
-
-                    await Promise.all(
-                        toFetch.map(async (ts: any) => {
-                            const id = ts._id
-                            if (!id) return
-
-                            // Mark as in-flight before firing the request
-                            inFlightRef.current.add(id)
-                            try {
-                                const url = `${getAgentaApiUrl()}/preview/testcases/query`
-                                const {data} = await axios.post(
-                                    url,
-                                    {
-                                        testset_id: id,
-                                        windowing: {limit: 1},
-                                    },
-                                    {signal: controller.signal},
-                                )
-
-                                const rows: any[] = Array.isArray(data?.testcases)
-                                    ? data.testcases
-                                    : Array.isArray(data)
-                                      ? data
-                                      : []
-                                const first = rows[0]
-                                const dataObj =
-                                    first?.data && typeof first.data === "object" ? first.data : {}
-                                const cols = Object.keys(dataObj as Record)
-
-                                if (cols.length) {
-                                    setColumnsFallback((prev) => {
-                                        const next = {...prev, [id]: cols}
-                                        // Keep ref in sync immediately for this loop
-                                        columnsFallbackRef.current = next
-                                        return next
-                                    })
-                                }
-                                // Mark as tried after a completed call (success or empty)
-                                triedRef.current.add(id)
-                            } catch (e: any) {
-                                // If aborted or axios-cancelled, allow retry in a future pass
-                                const isCancelled =
-                                    e?.name === "CanceledError" ||
-                                    e?.name === "AbortError" ||
-                                    e?.code === "ERR_CANCELED" ||
-                                    (typeof (axios as any).isCancel === "function" &&
-                                        (axios as any).isCancel(e))
-                                if (!isCancelled) {
-                                    // Hard failure: mark as tried to avoid hot loops
-                                    triedRef.current.add(id)
-                                }
-                            } finally {
-                                inFlightRef.current.delete(id)
-                            }
-                        }),
-                    )
-
-                    // Yield between batches so React can paint and we do not hog the tab
-                    await new Promise((r) => setTimeout(r, 0))
-                }
-            } finally {
-                setIsValidating(false)
-            }
+            if (pending.length === 0) return
+            // Limit concurrent fetches
+            const BATCH = 6
+            const toFetch = pending.slice(0, BATCH)
+            await Promise.all(
+                toFetch.map(async (ts: any) => {
+                    try {
+                        setIsValidating(true)
+
+                        const url = `${getAgentaApiUrl()}/preview/testcases/query`
+                        const {data} = await axios.post(
+                            url,
+                            {
+                                testset_id: ts._id,
+                                windowing: {limit: 1},
+                            },
+                            {signal: controller.signal},
+                        )
+                        // Response shape:
+                        // { count: number, testcases: [{ data: { ...columns }, ... }] }
+                        const rows: any[] = Array.isArray(data?.testcases)
+                            ? data.testcases
+                            : Array.isArray(data)
+                              ? data
+                              : []
+                        const first = rows[0]
+                        const dataObj =
+                            first?.data && typeof first.data === "object" ? first.data : {}
+                        const cols = Object.keys(dataObj as Record)
+                        if (cols.length) {
+                            setColumnsFallback((prev) => ({...prev, [ts._id]: cols}))
+                            // Also hydrate the primary cache so all consumers see columns immediately
+                            // queryClient.setQueryData(["testsetCsvData", ts._id], [dataObj])
+                        } else {
+                            tried.add(ts._id)
+                        }
+                    } catch (e) {
+                        // swallow; keep fallback empty for this id
+                        tried.add(ts._id)
+                        // console.warn("Failed to infer columns for testset", ts?._id, e)
+                    } finally {
+                        setIsValidating(false)
+                    }
+                }),
+            )
+        }

         run()

-        return () => {
-            stopped = true
-            controller.abort()
-        }
-        // Re-run only when inputs truly change (not on fallback writes)
-    }, [enabled, isPending, isLoading, testsets, cachedColumnsByTestsetId])
+        return () => controller.abort()
+    }, [testsets, columnsByTestsetId, columnsFallback, isPending, isLoading])

-    // Scoped csvVersion bumps: only bump for testset ids we care about
+    // When any testsetCsvData query updates, bump csvVersion
     useEffect(() => {
-        if (!enabled) return
-        const ids = new Set((testsets ?? []).map((t: any) => t?._id).filter(Boolean))
-
-        const unsubscribe = queryClient.getQueryCache().subscribe((event: any) => {
-            if (event?.type !== "updated") return
-            const q = event?.query
-            if (q?.queryKey?.[0] !== "testsetCsvData") return
-            const id = q?.queryKey?.[1]
-            if (!ids.has(id)) return
-            setCsvVersion((v) => v + 1)
+        const unsubscribe = queryClient.getQueryCache().subscribe((event) => {
+            // Only react to updates of our csv data queries
+            const q = (event as any)?.query
+            const key0 = q?.queryKey?.[0]
+            if (key0 === "testsetCsvData") {
+                setCsvVersion((v) => v + 1)
+            }
         })
         return unsubscribe
-    }, [enabled, queryClient, testsets])
+    }, [queryClient])

     return {
         testsets: testsets ?? [],
diff --git a/web/package.json b/web/package.json
index 9228ee9355..f06cb54ce3 100644
--- a/web/package.json
+++ b/web/package.json
@@ -1,6 +1,6 @@
 {
     "name": "agenta-web",
-    "version": "0.60.2",
+    "version": "0.62.0",
    "workspaces": [
        "ee",
        "oss",