
fix:test_professionalism - AttributeError: 'tuple' object has no attr…#2470

Open
Angelenx wants to merge 2 commits into confident-ai:main from
Angelenx:fix--test_professionalism---AttributeError--'tuple'-object-has-no-attribute-'find'

Conversation


@Angelenx Angelenx commented Feb 3, 2026

This pull request improves the a_generate_with_schema_and_extract utility function in deepeval/metrics/utils.py. The update adds support for models that return a (result, cost) tuple, ensuring that the cost is properly accrued and the result is correctly extracted for downstream processing.

Metric cost handling and result extraction:

  • Updated a_generate_with_schema_and_extract to handle models that return a (result, cost) tuple: the cost is accrued via the metric's _accrue_cost method when available, and the actual result is extracted for further processing (sketched below).

This pull request fixes an issue where this return format could not be parsed, resulting in the error "test_professionalism - AttributeError: 'tuple' object has no attribute 'find'".
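A minimal sketch of the described tuple handling, for illustration only: the helper name unwrap_model_output is hypothetical and not part of deepeval's API; only the (result, cost) tuple shape and the metric's _accrue_cost method come from this PR. Inside a_generate_with_schema_and_extract, the unwrapped result would then be passed on to trimAndLoadJson.

    from typing import Any, Optional, Tuple, Union

    def unwrap_model_output(
        result: Union[Any, Tuple[Any, float]],
        metric: Optional[Any] = None,
    ) -> Any:
        # Some model wrappers (e.g. a custom OpenRouter model) return a
        # (result, cost) tuple instead of a bare result.
        if isinstance(result, tuple) and len(result) == 2:
            result, cost = result
            # Record the cost only when the metric exposes _accrue_cost.
            if metric is not None and hasattr(metric, "_accrue_cost"):
                metric._accrue_cost(cost)
        # Return the bare result so downstream JSON parsing receives a string.
        return result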


vercel bot commented Feb 3, 2026

Someone is attempting to deploy a commit to the Confident AI Team on Vercel.

A member of the Team first needs to authorize it.


greptile-apps bot commented Feb 3, 2026

PR author is not in the allowed authors list.


Angelenx commented Feb 3, 2026

With the current version, using OpenRouter's LLM API may produce an error like the following:

🙌 Congratulations! You're now using OpenRouter `openai/gpt-5-mini` for all evals that require an LLM.
    🎯 Evaluating test case #0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0% 0:00:11
FRunning teardown with pytest sessionfinish...

============================================================================================ FAILURES ============================================================================================
______________________________________________________________________________________ test_professionalism ______________________________________________________________________________________

    def test_professionalism():
        dotenv.load_dotenv(dotenv_path=".env.local")
    
        model = OpenRouterModel(
            model=os.getenv("OPENROUTER_MODEL_NAME", "openai/gpt-5-mini"),
            api_key=os.getenv("OPENROUTER_API_KEY", ""),
        )
    
        professionalism_metric = ConversationalGEval(
            name="Professionalism",
            criteria="Determine whether the assistant has acted professionally based on the content.",
            threshold=0.5,
            model=model
        )
        test_case = ConversationalTestCase(
            turns=[
                Turn(role="user", content="What is DeepEval?"),
                Turn(role="assistant", content="DeepEval is an open-source LLM eval package.")
            ]
        )
>       assert_test(test_case, [professionalism_metric])

tests/test_basic.py:28: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/home/angelen/miniconda3/envs/deepeval/lib/python3.12/site-packages/deepeval/evaluate/evaluate.py:135: in assert_test
    test_result = loop.run_until_complete(
/home/angelen/miniconda3/envs/deepeval/lib/python3.12/asyncio/base_events.py:691: in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
/home/angelen/miniconda3/envs/deepeval/lib/python3.12/site-packages/deepeval/evaluate/execute.py:678: in a_execute_test_cases
    await asyncio.wait_for(
/home/angelen/miniconda3/envs/deepeval/lib/python3.12/asyncio/tasks.py:520: in wait_for
    return await fut
           ^^^^^^^^^
/home/angelen/miniconda3/envs/deepeval/lib/python3.12/site-packages/deepeval/evaluate/execute.py:581: in execute_with_semaphore
    return await _await_with_outer_deadline(
/home/angelen/miniconda3/envs/deepeval/lib/python3.12/site-packages/deepeval/evaluate/execute.py:300: in _await_with_outer_deadline
    return await asyncio.wait_for(coro, timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/home/angelen/miniconda3/envs/deepeval/lib/python3.12/asyncio/tasks.py:520: in wait_for
    return await fut
           ^^^^^^^^^
/home/angelen/miniconda3/envs/deepeval/lib/python3.12/site-packages/deepeval/evaluate/execute.py:923: in _a_execute_conversational_test_cases
    await measure_metrics_with_indicator(
/home/angelen/miniconda3/envs/deepeval/lib/python3.12/site-packages/deepeval/metrics/indicator.py:235: in measure_metrics_with_indicator
    await asyncio.gather(*tasks)
/home/angelen/miniconda3/envs/deepeval/lib/python3.12/site-packages/deepeval/metrics/indicator.py:248: in safe_a_measure
    await metric.a_measure(
/home/angelen/miniconda3/envs/deepeval/lib/python3.12/site-packages/deepeval/metrics/conversational_g_eval/conversational_g_eval.py:176: in a_measure
    await self._a_generate_evaluation_steps()
/home/angelen/miniconda3/envs/deepeval/lib/python3.12/site-packages/deepeval/metrics/conversational_g_eval/conversational_g_eval.py:213: in _a_generate_evaluation_steps
    return await a_generate_with_schema_and_extract(
/home/angelen/miniconda3/envs/deepeval/lib/python3.12/site-packages/deepeval/metrics/utils.py:464: in a_generate_with_schema_and_extract
    data = trimAndLoadJson(result, metric)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

input_string = (Steps(steps=['For each turn, compare the Role field to the Content field: verify the content aligns with the declared... then label the assistant as professional / borderline / unprofessional with brief supporting examples.']), 0.00116275)
metric = <deepeval.metrics.conversational_g_eval.conversational_g_eval.ConversationalGEval object at 0x79a233c24200>

    def trimAndLoadJson(
        input_string: str,
        metric: Optional[BaseMetric] = None,
    ) -> Any:
>       start = input_string.find("{")
                ^^^^^^^^^^^^^^^^^
E       AttributeError: 'tuple' object has no attribute 'find'

/home/angelen/miniconda3/envs/deepeval/lib/python3.12/site-packages/deepeval/metrics/utils.py:389: AttributeError
====================================================================================== slowest 10 durations ======================================================================================
11.68s call     tests/test_basic.py::test_professionalism

(2 durations < 0.005s hidden.  Use -vv to show these durations.)
==================================================================================== short test summary info =====================================================================================
FAILED tests/test_basic.py::test_professionalism - AttributeError: 'tuple' object has no attribute 'find'
1 failed, 4 warnings in 11.84s
All metrics errored for all test cases, please try again.

This pull request fixes the issue.
