feat: implement human plan approval workflow and enhanced judge response handling #31

OtherVibes · 2025-09-25T05:13:34Z

Description

Major changes:

- Add request_plan_approval tool for human plan approval workflow
- Implement PLAN_PENDING_APPROVAL state for plan approval workflow
- Add judge response repair system for non-JSON LLM responses
- Add markdown-to-JSON coercion fallback for judge responses
- Enhance workflow guidance to include research requirements visibility
- Fix task size-based plan evaluation criteria (simplified for M, comprehensive for L/XL)
- Implement unified workflow: all tasks go through planning phase
- Add comprehensive test coverage for new functionality
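
For reviewers, the approval gate can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: only the `PLAN_PENDING_APPROVAL` state and the `request_plan_approval` tool name come from the description above. The other state names, the function signatures, and `record_human_decision` are hypothetical and exist only for illustration.

```python
from enum import Enum


class TaskState(str, Enum):
    # Only PLAN_PENDING_APPROVAL is named in this PR; the other members
    # are hypothetical placeholders.
    PLANNING = "planning"
    PLAN_PENDING_APPROVAL = "plan_pending_approval"
    PLAN_APPROVED = "plan_approved"


def request_plan_approval(state: TaskState) -> TaskState:
    """Move a drafted plan into the human-approval gate (assumed signature)."""
    if state is not TaskState.PLANNING:
        raise ValueError(f"cannot request approval from {state.value}")
    return TaskState.PLAN_PENDING_APPROVAL


def record_human_decision(state: TaskState, approved: bool) -> TaskState:
    """Apply the human's decision; a rejection sends the plan back to planning."""
    if state is not TaskState.PLAN_PENDING_APPROVAL:
        raise ValueError(f"no approval pending in {state.value}")
    return TaskState.PLAN_APPROVED if approved else TaskState.PLANNING
```

In this sketch the judge would only see plans that have reached the approved state, which matches the "human approval step before judge validation" ordering described in this PR.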

Key Features:

- Human approval step before judge validation
- Robust fallback handling for judge response parsing failures
- Task-size-appropriate validation criteria
- Enhanced workflow guidance with research requirement transparency
- Improved error handling and response repair mechanisms

This establishes the foundation for human-in-the-loop plan approval while maintaining robust automated validation and fallback handling.


- Enhanced test mocking in test_small_task_follows_unified_workflow to properly mock messaging providers
- Applied ruff formatting to all source files
- All tests now pass (183/183)
- All quality checks pass (ruff, mypy, bandit)
- Act (local GitHub Actions) setup is working correctly

OtherVibes mentioned this pull request Sep 25, 2025

Feat/user feedback #27

Closed

20 tasks

ci fixes

8ea626c

OtherVibes requested a review from doriwal September 25, 2025 07:30

OtherVibes merged commit 88f48ca into main Sep 26, 2025
14 of 15 checks passed

OtherVibes deleted the feat/human-plan-approval branch September 26, 2025 05:27


2 participants
