Skip to content

Conversation

@jp-agenta
Copy link
Member

AGE-3375-/-bug-the-breadcrumb-is-sometimes-wrong

Copilot AI review requested due to automatic review settings November 12, 2025 01:16
@vercel
Copy link

vercel bot commented Nov 12, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
agenta-documentation Ready Ready Preview Comment Nov 12, 2025 1:17am

@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Nov 12, 2025
@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


GitHub CI seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@dosubot dosubot bot added the bug Something isn't working label Nov 12, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR addresses breadcrumb navigation issues (AGE-3375) by reorganizing evaluation-related code from the OSS layer to the EE (Enterprise Edition) layer, along with several other fixes and improvements.

Key changes:

  • Moved evaluation database models, services, and routers from OSS to EE edition
  • Updated import paths across multiple files to reflect the new code organization
  • Removed cron service configurations from Docker Compose files
  • Changed default async export setting in SDK tracing

Reviewed Changes

Copilot reviewed 43 out of 419 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
web/ee/src/components/pages/evaluations/autoEvaluation/assets/AutoEvaluationHeader.tsx Fixed import paths for statusMapper and useStyles
web/ee/src/components/pages/evaluations/autoEvaluation/EvaluatorsModal/ConfigureEvaluator/JSONSchema/JSONSchemaGenerator.ts Renamed "score" field to "correctness" throughout schema generation
web/ee/src/components/EvaluationTable/SingleModelEvaluationTable.tsx Reordered imports to group by source
web/ee/src/components/EvalRunDetails/state/urlState.ts Fixed relative import path for evalTypeAtom
web/ee/src/components/EvalRunDetails/components/VirtualizedScenarioTable/assets/CellComponents.tsx Updated import path for resolvePath utility
web/ee/src/components/DeploymentHistory/DeploymentHistory.tsx Changed to relative import for EE types
sdk/agenta/sdk/workflows/handlers.py Re-enabled error raising instead of returning failure dict
sdk/agenta/sdk/tracing/exporters.py Changed default async export from "true" to "false"
hosting/docker-compose/*.yml Removed cron service from Docker Compose configurations
docs/docs/evaluation/configure-evaluators/05-llm-as-a-judge.mdx Simplified documentation by removing output schema configuration details
docs/blog/main.mdx Removed changelog entry for LLM-as-a-Judge output schemas
api/pyproject.toml Downgraded version from 0.62.1 to 0.61.2
api/oss/src/services/db_manager.py Removed evaluation-related functions and unused imports
api/oss/src/services/converters.py Deleted entire file (moved to EE)
api/oss/src/services/app_manager.py Updated to use EE db_manager for evaluations
api/oss/src/resources/evaluators/evaluators.py Renamed "score" to "correctness" in evaluator schemas
api/oss/src/models/db_models.py Removed evaluation database models and added KEEP comments
api/oss/src/core/evaluations/service.py Added is_ee() checks for celery task dispatching
api/oss/src/apis/fastapi/evaluators/router.py Removed duplicate is_ee() check
api/oss/src/apis/fastapi/applications/router.py Removed duplicate is_ee() check
api/oss/docker/Dockerfile.*.yml Removed cron setup commands
api/entrypoint.py Removed evaluation router includes and task imports from OSS
api/ee/src/tasks/evaluations/*.py Updated imports to use EE services and removed is_ee() checks
api/ee/src/services/*.py Added new EE-specific service files with evaluation logic
api/ee/src/routers/*.py Updated to use EE services and db_manager_ee
api/ee/src/models/db_models.py Added evaluation database models from OSS
api/ee/src/main.py Added evaluation routers and load_tasks function
api/ee/src/apis/fastapi/billing/router.py Removed duplicate is_ee() checks
api/ee/docker/Dockerfile.*.yml Updated cron paths from OSS to EE

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

import {statusMapper} from "../../../evaluations/cellRenderers/cellRenderers"
import {buildEvaluationNavigationUrl} from "../../utils"
import {useStyles} from "./styles"
import {useStyles} from "../assets/styles"
Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The import path '../assets/styles' creates a circular reference since this file is already in the assets directory. The path should be './styles' instead.

Suggested change
import {useStyles} from "../assets/styles"
import {useStyles} from "./styles"

Copilot uses AI. Check for mistakes.
base_id: str,
**kwargs: dict,
) -> Optional[VariantBaseDB]:
) -> VariantBaseDB:
Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type changed from 'Optional[VariantBaseDB]' to 'VariantBaseDB', which removes the ability to return None. This is a breaking API change that could cause issues if callers are expecting None as a possible return value. Consider keeping Optional or ensure all callers are updated to handle the new behavior.

Copilot uses AI. Check for mistakes.
async def fetch_evaluator_config(
evaluator_config_id: str,
) -> Optional[EvaluatorConfigDB]:
async def fetch_evaluator_config(evaluator_config_id: str) -> EvaluatorConfigDB:
Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type changed from 'Optional[EvaluatorConfigDB]' to 'EvaluatorConfigDB'. This removes the ability to return None and could cause issues for existing callers that check for None. Consider keeping Optional in the signature or verify all calling code handles potential exceptions.

Copilot uses AI. Check for mistakes.

async def update_evaluation(
evaluation_id: str, project_id: str, updates: Dict[str, Any]
) -> EvaluationDB:
Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type changed from 'Optional[EvaluationDB]' to 'EvaluationDB', removing the ability to return None. Ensure callers expecting None are updated accordingly, or consider keeping the Optional type.

Copilot uses AI. Check for mistakes.
from sqlalchemy.ext.asyncio import AsyncSession
from supertokens_python.types import AccountInfo
from sqlalchemy.orm import joinedload, load_only, aliased
from sqlalchemy.orm import joinedload, load_only, selectinload
Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The import 'aliased' was removed and 'selectinload' was added, but 'selectinload' doesn't appear to be used anywhere in the visible code. Consider removing unused imports to keep the code clean.

Suggested change
from sqlalchemy.orm import joinedload, load_only, selectinload
from sqlalchemy.orm import joinedload, load_only

Copilot uses AI. Check for mistakes.
Comment on lines +1919 to +1920
res_evaluations = result_evaluations.scalars().all()
res_human_evaluations = result_human_evaluations.scalars().all()
Copy link

Copilot AI Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines differ from the original where 'list()' was used to convert the results. While both approaches work, the inconsistency could indicate an intentional change or an oversight. Verify this behavior change is intentional.

Copilot uses AI. Check for mistakes.
@bekossy bekossy changed the title AGE-3375-/-bug-the-breadcrumb-is-sometimes-wrong [Fix] The-breadcrumb-is-sometimes-wrong Nov 12, 2025
@jp-agenta jp-agenta changed the base branch from main to release/v0.62.3 November 14, 2025 13:32
@jp-agenta jp-agenta marked this pull request as draft November 14, 2025 13:33
@junaway junaway changed the title [Fix] The-breadcrumb-is-sometimes-wrong [fix] The-breadcrumb-is-sometimes-wrong Nov 14, 2025
@bekossy bekossy closed this Nov 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working size:XXL This PR changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants