test: add failing tests for cost extraction wrong index (#555) by Serhan-Asad · Pull Request #556 · promptdriven/pdd

Serhan-Asad · 2026-02-23T18:25:33Z

Summary

Adds failing tests that detect the bug reported in #555 — sync_orchestration.py reads result[1] as cost for all operations, but each operation returns a different tuple format.

Test Files

Unit tests: tests/test_e2e_issue_555_cost_extraction_wrong_index.py (16 tests — 13 fail, 3 pass)
E2E tests: tests/test_e2e_issue_555_sync_cost_extraction.py (13 tests — 10 fail, 3 pass)

What This PR Contains

Failing unit tests that reproduce the reported bug for each affected operation
Failing E2E tests that call the real sync_orchestration() function with mocked operations
Regression guards ensuring example/test/test_extend (which already work) aren't broken
Tests are verified to fail on current code and will pass once the bug is fixed

Root Cause

The sync loop extracts cost via result[1] and model via result[2] for all operations. This is only correct for example and test/test_extend (3-tuple and 4-tuple with cost at index 1). For other operations:

generate (4-tuple): cost at [2], model at [3] — result[1] is was_incremental (bool), which passes isinstance(False, (int, float)) since bool ⊂ int
crash/fix/verify (6-tuple): cost at [4], model at [5] — result[1] is a string, so cost defaults to 0.0

Additionally, operation_log.py:339-340 has the identical hardcoded index bug.
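
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `naive_extract_cost` is a hypothetical stand-in for the hardcoded `result[1]` read, and the tuple contents are made up. Only the positions of cost and model follow the layouts listed in this section.

```python
def naive_extract_cost(result):
    """Hypothetical stand-in for the hardcoded result[1] read."""
    if (
        isinstance(result, tuple)
        and len(result) > 1
        and isinstance(result[1], (int, float))  # bool passes this check
    ):
        return float(result[1])
    return 0.0


# Illustrative tuples; field values other than cost/model positions are assumptions.
example_result = ("example code", 0.05, "model-a")                 # cost at [1]
generate_result = ("generated code", True, 0.12, "model-b")        # cost at [2]
fix_result = (True, "fixed code", "program", 2, 0.30, "model-c")   # cost at [4]

# example: index 1 really is the cost, so the read is correct.
example_cost = naive_extract_cost(example_result)
# generate: was_incremental=True is a bool, and bool is a subclass of int,
# so it is silently converted to a cost of 1.0.
generate_cost = naive_extract_cost(generate_result)
# fix: index 1 is a string, so the cost falls back to 0.0.
fix_cost = naive_extract_cost(fix_result)
```

With these inputs, the real costs of 0.12 and 0.30 are never read: generate is over-reported and fix is reported as free.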

Next Steps

Implement the fix — create _extract_cost_from_result() and _extract_model_from_result() helpers
Verify all 29 tests pass
Run full test suite for regressions
Mark PR as ready for review

Fixes #555

Generated by PDD agentic bug workflow

…tdriven#555)

The sync loop read result[1] as cost for all operations, but each operation returns a different tuple format. generate's result[1] is a bool (was_incremental), which passed the isinstance(..., (int, float)) check and was counted as $1.00 when True; crash/fix/verify have cost at index 4, not 1, so their costs were always $0.00. Two helpers now dispatch on the operation name to read the correct index.

Fixes promptdriven#555

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…sts (promptdriven#555)

- operation_log.py: use _extract_cost_from_result/_extract_model_from_result helpers instead of hardcoded result[1]/result[2] (same bug as sync_orchestration.py)
- Update test files to import and test the actual fixed helpers (29/29 pass)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

This PR addresses issue #555 around incorrect cost/model extraction from operation result tuples during sync_orchestration() runs and within the log_operation decorator, and adds regression tests to cover the corrected behavior.

Changes:

Added shared helpers in pdd/sync_orchestration.py to extract cost/model from operation results using operation-specific tuple indices (including excluding bool from numeric cost).
Updated sync_orchestration() and the log_operation decorator to use the new helpers rather than hardcoded tuple indices.
Added unit-style and E2E-style tests covering generate/crash/fix/verify (plus regression guards for example/test/test_extend).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `pdd/sync_orchestration.py` | Introduces `_extract_cost_from_result` / `_extract_model_from_result` and wires them into cost accumulation + log updates. |
| `pdd/operation_log.py` | Updates the `log_operation` decorator to use the shared extraction helpers for cost/model logging. |
| `tests/test_e2e_issue_555_cost_extraction_wrong_index.py` | Adds focused tests validating the helper extraction logic across operation return formats and the bool⊂int edge case. |
| `tests/test_e2e_issue_555_sync_cost_extraction.py` | Adds end-to-end tests exercising real `sync_orchestration()` with mocked ops to ensure totals/logging/budget behavior are correct. |

Serhan-Asad mentioned this pull request Feb 23, 2026

sync_orchestration.py: cost extraction reads wrong tuple index for generate/crash/fix/verify operations #555

Closed

Serhan-Asad force-pushed the fix/issue-555 branch from 5931538 to e7200ab February 23, 2026 18:43

Serhan-Asad and others added 2 commits February 23, 2026 14:22

Serhan-Asad force-pushed the fix/issue-555 branch from e7200ab to b722e41 February 23, 2026 22:45

Serhan-Asad marked this pull request as ready for review February 23, 2026 22:54

gltanaka requested a review from Copilot February 24, 2026 01:59

Copilot started reviewing on behalf of gltanaka February 24, 2026 01:59

Copilot AI reviewed Feb 24, 2026

gltanaka merged commit 3e85268 into promptdriven:main Feb 24, 2026
6 checks passed

gltanaka merged 2 commits into promptdriven:main from Serhan-Asad:fix/issue-555

3 participants
