Skip to content

Conversation

@majdyz
Copy link
Contributor

@majdyz majdyz commented Nov 26, 2025

Summary

This PR implements a graph-level Safe Mode toggle system for Human-in-the-Loop (HITL) blocks. When Safe Mode is ON (default), HITL blocks require manual review before proceeding. When OFF, they execute automatically.

🔧 Backend Changes

  • Database: Added metadata JSON column to AgentGraph table with migration
  • API: Updated execute_graph endpoint to accept safe_mode parameter
  • Execution: Enhanced execution context to use graph metadata as default with API override capability
  • Auto-detection: Automatically populate has_human_in_the_loop for graphs containing HITL blocks
  • Block Detection: HITL block ID: 8b2a7b3c-6e9d-4a5f-8c1b-2e3f4a5b6c7d

🎨 Frontend Changes

  • Component: New FloatingSafeModeToggle with dual variants:
    • White variant: For library pages, integrates with action buttons
    • Black variant: For builders, floating positioned
  • Integration: Added toggles to both new/legacy builders and library pages
  • API Integration: Direct graph metadata updates via usePutV1UpdateGraphVersion
  • Query Management: React Query cache invalidation for consistent UI updates
  • Conditional Display: Toggle only appears when graph contains HITL blocks

🛠 Technical Implementation

  • Safe Mode ON (default): HITL blocks require manual review before proceeding
  • Safe Mode OFF: HITL blocks execute automatically without intervention
  • Priority: Backend API safe_mode parameter takes precedence over graph metadata
  • Detection: Auto-populates has_human_in_the_loop metadata field
  • Positioning: Proper z-index and responsive positioning for floating elements

🚧 Known Issues (Work in Progress)

High Priority

  • Toggle state persistence: Always shows "ON" regardless of actual state - query invalidation issue
  • LibraryAgent metadata: Missing metadata field causing TypeScript errors
  • Tooltip z-index: Still covered by some UI elements despite high z-index

Medium Priority

  • HITL detection: Logic needs improvement for reliable block detection
  • Error handling: Removing HITL blocks from graph causes save errors
  • TypeScript: Fix type mismatches between GraphModel and LibraryAgent

Low Priority

  • Frontend API: Add safe_mode parameter to execution calls once OpenAPI is regenerated
  • Performance: Consider debouncing rapid toggle clicks

🧪 Test Plan

  • Verify toggle appears only when graph has HITL blocks
  • Test toggle persistence across page refreshes
  • Confirm API calls update graph metadata correctly
  • Validate execution behavior respects safe mode setting
  • Check styling consistency across builder and library contexts

🔗 Related

  • Addresses requirements for graph-level HITL configuration
  • Builds on existing FloatingReviewsPanel infrastructure
  • Integrates with existing graph metadata system

🤖 Generated with Claude Code

majdyz and others added 30 commits November 14, 2025 12:39
This commit implements a comprehensive Human In The Loop (HITL) block that allows agents to pause execution and wait for human approval/modification of data before continuing.

Key features:
- Added WAITING_FOR_REVIEW status to AgentExecutionStatus enum
- Created PendingHumanReview database table for storing review requests
- Implemented HumanInTheLoopBlock that extracts input data and creates review entries
- Added API endpoints at /api/executions/review for fetching and reviewing pending data
- Updated execution manager to properly handle waiting status and resume after approval
- Created comprehensive frontend UI components:
  - PendingReviewCard for individual review handling
  - PendingReviewsList for multiple reviews
  - FloatingReviewsPanel for graph builder integration
- Integrated review UI into 3 locations: legacy library, new library, and graph builder
- Added proper type safety throughout with SafeJson handling
- Optimized database queries using count functions instead of full data fetching

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…security and workflow improvements

## Summary
- Complete implementation of Human In The Loop (HITL) block for pausing execution pending human review
- Fix all critical security vulnerabilities and architectural issues identified in PR reviews
- Refactor codebase to follow project patterns and improve maintainability

## Backend Changes
- Add `ReviewStatus` enum to Prisma schema with proper indexing for performance
- Implement comprehensive authorization checks with graph ownership verification
- Add atomic database transactions to prevent race conditions
- Fix critical `user_id=""` bug that prevented review resumption
- Add `wasEdited` field to track data modifications during review
- Implement proper input validation with size/depth limits to prevent DoS attacks
- Create service layer separation between business logic and database operations
- Fix Pydantic v2 validator compatibility issues
- Add proper error handling and remove silent failures
- Update execution status transitions to support WAITING_FOR_REVIEW state

## Frontend Changes
- Fix WAITING_FOR_REVIEW color consistency across UI components (purple theme)
- Add missing WAITING_FOR_REVIEW status handling in ActivityItem.tsx
- Generate updated OpenAPI client with proper type safety
- Remove unsafe `as any` type casting with proper type guards

## API Improvements
- Add structured `ReviewActionResponse` model for better type generation
- Implement comprehensive request validation with security checks
- Add proper OpenAPI schema generation for better developer experience
- Support both legacy and structured data formats for backward compatibility

## Security Enhancements
- Add authorization checks to verify graph ownership before review access
- Implement size limits (1MB) and nesting depth validation (10 levels)
- Add SQL injection protection and input sanitization
- Use atomic database operations to prevent concurrent modification issues

## Testing
- Add comprehensive unit tests covering security, validation, and business logic
- Test all edge cases including race conditions and data validation
- Verify API endpoints with proper error handling and status codes

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…ph_version parameter

The HumanInTheLoopBlock test was failing because the test framework wasn't providing the required `graph_version` parameter that the block's run method expects.

Changes:
- Add `graph_version: 1` to test framework's `extra_exec_kwargs` in backend/util/test.py
- Add test mocks for HumanInTheLoopBlock to avoid database/service dependencies during testing
- Add conditional logic to use mocks in test environment while preserving production functionality

The block now passes all tests while maintaining full production functionality for the human review workflow.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…anInTheLoopService

Based on CodeRabbit review feedback, this commit addresses three major issues:

**Security Improvements:**
- Add user ownership validation to get_existing_review() method for defense-in-depth
- Validate that users can only access reviews they own, preventing unauthorized access

**Reliability Improvements:**
- Add error handling for type conversion failures in process_approved_review()
- Reset review status back to WAITING if conversion fails, allowing user to fix data
- Raise descriptive HITLValidationError with helpful message for conversion failures

**Race Condition Fixes:**
- Prevent overwriting pending reviews when status is WAITING
- Avoid data loss when user is actively reviewing/editing data in UI
- Skip update logic if review already exists and is pending

All changes maintain backward compatibility and include proper error handling.
Tests continue to pass with enhanced security and reliability.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…in production

The previous implementation had a critical flaw where the test mocks would always
be used instead of the real workflow in production. This commit fixes this by:

**Changes:**
- Replace direct service calls with wrapper methods that can be properly mocked
- Add handle_review_workflow() and update_graph_execution_stats() wrapper methods
- Use proper test mocks that only activate during testing, not in production
- Ensure real HumanInTheLoopService workflow is used in production environments

**Testing:**
- Tests continue to pass with proper mocking
- Production code now uses real service layer and database operations
- Block behavior is consistent between test and production environments

This resolves the CodeRabbit critical issue where the real HITL workflow was never used.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
🔧 Security Fixes:
- Remove redundant user ownership check in _resume_graph_execution
- Enforce editable flag server-side before applying reviewed_data
- Add proper 403 access control to execution-scoped GET endpoint
- Strengthen SafeJson validation for reviewed_data with type safety

🎨 Frontend UX Improvements:
- Make PendingReviewCard respect editable flag from server
- Disable editing when review marked as read-only
- Fix unreachable WAITING_FOR_REVIEW case in ActivityItem
- Align WAITING_FOR_REVIEW colors (purple) across UI components

📚 Type Safety:
- Add SafeJsonData type alias for better payload validation
- Enhance reviewed_data validation with SafeJson compatibility checks

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Add WAITING_FOR_REVIEW nodes to execution queue for auto-resume optimization
- Fix duplicate /api/executions prefix by moving review routes to /api/review
- Improve execution efficiency by processing approved reviews automatically

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Add maxLength={2000} to review message textarea to match backend validation
- Prevents users from submitting review messages longer than server limit
- Addresses CodeRabbit comment 2526547379 about frontend validation consistency

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…tyItem

- Add WAITING_FOR_REVIEW to isActiveStatus check so it's treated as an active state
- Include "waiting for review" status text for better UX in navbar dropdown
- Addresses CodeRabbit comment about missing WAITING_FOR_REVIEW handling in ActivityItem.tsx

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
1. Remove exception throwing for rejected reviews - let downstream blocks handle rejection gracefully
2. Set node execution status to WAITING_FOR_REVIEW when pending review exists
3. Remove unused HITLValidationError import

Addresses comments:
- 2526935947: No need for raising error for rejected reviews
- 2526939180: Proper node status handling when waiting for review

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Change status from str to Literal['approved', 'rejected'] for better type safety
- Prevents invalid status values at compile time
- Improves code documentation and IDE support

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Move database operations from service layer to data layer for proper separation of concerns
- Create new data/human_review.py with database abstraction functions
- Simplify service layer to delegate to data layer functions
- Update block to use data layer directly instead of going through service
- Maintain backward compatibility through re-exports in service layer

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Add VALID_STATUS_TRANSITIONS validation to node execution status updates
  - Apply to both singular and batch node status update functions
  - Validate current status before allowing transitions to new status
  - Provide clear error messages for invalid transitions

- Fix manager.py architectural issues
  - Remove local Prisma import for HITL review checking
  - Use data layer abstraction instead of direct database queries
  - Add has_pending_review function to human_review data layer

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…atabase patterns

- Move HITL database operations from service layer to data layer for proper separation of concerns
- Rename handle_review_workflow to get_or_upsert_human_review for database-oriented naming
- Implement database manager pattern following established codebase conventions
- Add VALID_STATUS_TRANSITIONS validation to node status updates with non-fatal warnings
- Update node execution status to WAITING_FOR_REVIEW to prevent automatic completion
- Remove redundant service layer file after moving logic to data layer
- Simplify block implementation to use single method that handles complete workflow
- Fix manager.py to use data layer instead of direct Prisma queries

The HITL block now follows proper architectural patterns:
- Data layer handles all database operations
- Block uses database manager pattern like other blocks
- Node status updates prevent execution manager from overriding WAITING_FOR_REVIEW
- API routes will handle review processing and execution re-queuing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…attern

Simplified update_node_execution_status to follow the same pattern as the batch function:
- Use update_many with status filter instead of complex validation logic
- Remove update count checking and warning logic for cleaner code
- Maintain same functionality with more consistent architecture

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Cleaned up unnecessary method abstractions:
- Inlined create_or_update_review into get_or_upsert_human_review
- Inlined delete_pending_review to direct Prisma calls
- Inlined update_review_status to direct Prisma calls

Reduces indirection and improves code readability while maintaining the same functionality.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Human review improvements:
- Renamed get_pending_review_by_node to get_pending_review_by_node_exec_id for clarity
- Removed all delete operations - keep review records as audit trail
- Inlined upsert_pending_review method to eliminate unnecessary wrapper
- Updated has_pending_review to filter by WAITING status only

Execution improvements:
- Refactored update_node_execution_status to use batch function instead of duplicating logic
- Handles execution_data separately since batch function doesn't support it
- Eliminates code duplication while maintaining same functionality

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…ReviewData in API

- Updated PendingHumanReviewResponse model to use PendingReviewData instead of SafeJsonData
- Fixed route handlers to properly parse data as structured PendingReviewData objects
- Added fallback handling for legacy data format in both routes and data layer
- Frontend now uses properly typed review.data.{data,message,editable} fields
- Removed manual type checking functions in favor of generated structured types
- Regenerated API types to export clean PendingReviewData interface

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Renamed PendingHumanReviewResponse to PendingHumanReviewModel for clarity
- Added from_db class method to handle database model conversion
- Moved all data parsing logic from routes into the from_db method
- PendingReviewData is now an internal implementation detail
- Removed manual parsing code and _convert_review_status helper function
- Routes now simply call Model.from_db(review) for clean conversion
- Follows established codebase patterns for model conversion

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Applied isort, black, and ruff formatting to updated files
- No functional changes, only code style improvements

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…onfusing nested structure

- Add instructions and editable columns to PendingHumanReview table schema
- Update PendingHumanReviewModel to expose payload, instructions, editable at top level
- Modify data layer to use new flat database structure instead of nested JSON
- Fix review action route to use new editable column from database

This eliminates the confusing .data.data pattern and provides a cleaner, more maintainable architecture
where review metadata is stored in proper database columns rather than nested JSON fields.

Note: Frontend updates will follow once API types are regenerated properly.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Rename database column from 'data' to 'payload' for consistency with DTO
- Remove unnecessary expected_data_type parameter since SafeJsonData handles any JSON
- Inline wrapper functions that were just one-line calls
- Remove extract_approved_data helper that was just returning review.payload
- Clean up HITLValidationError and replace with standard ValueError
- Update all references to use consistent naming throughout the codebase

This makes the codebase more consistent and removes unnecessary abstraction layers.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Removed process_approved_review and process_rejected_review functions
and inlined their logic directly into get_or_upsert_human_review.
These were just one-line return statements that added no value.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Replace GraphReviewStatus enum with direct boolean checks
- Use single upsert operation for get-or-create pattern in reviews
- Eliminate redundant database queries in execution flow
- Optimize existence checks with count() instead of find_first()
- Remove unnecessary status checks in review processing
- Implement proper separation of concerns: block handles processing decisions
- Add mandatory processed and node_exec_id fields to ReviewResult
- Use Literal types for action parameter for better type safety
- Remove local imports and move to top-level
- Simplify review action logic with atomic database operations
- Remove node execution status updates from data layer (moved to manager)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
majdyz and others added 11 commits November 20, 2025 14:51
…e UI updates

## Backend Changes
- Fix HITL block to use proper async_update_node_execution_status wrapper for websocket events
- Update human review data layer with batch processing for better performance
- Add comprehensive test coverage for human review operations
- Streamline review processing workflow for execution resumption

## Frontend Changes
- Fix review status icon to use eyes instead of pause for better UX
- Enable real-time execution status updates in both new and legacy Flow components
- Pass execution status directly to FloatingReviewsPanel for immediate reactivity
- Fix tab switching and visibility issues in review interfaces
- Improve review workflow with proper status propagation

## Key Improvements
- Real-time websocket updates ensure UI reflects REVIEW status immediately
- Better separation between running and idle states (REVIEW is idle, not running)
- Enhanced error handling and validation in review processing
- Consistent execution status handling across different UI components

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…tion

- Restore store/routes.py delete endpoint to return boolean instead of StoreSubmission
- Restore store/db.py delete function signature and error handling
- These changes were accidentally included in HITL feature development
- Only HITL-related functionality should be in this branch

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Add HITL block ID (8b2a7b3c-6e9d-4a5f-8c1b-2e3f4a5b6c7d) to BETA_BLOCKS feature flag
- Block will be hidden from users by default unless beta-blocks flag is configured
- This allows controlled rollout of the Human-in-the-Loop functionality
- Beta users can access the block while it's being tested and refined

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…-Gravitas/AutoGPT into feat/human-in-the-loop-block
- Backend: Fix message handling and unify API structure
  - Fix HITL block to properly yield review messages for both approved and rejected reviews
  - Unify review API structure with single `reviews` array using `approved` boolean field
  - Remove separate approved_reviews/rejected_review_ids in favor of cleaner unified approach

- Frontend: Complete UI/UX overhaul for review interface
  - Replace plain JSON textarea with type-aware input components matching run dialog styling
  - Add "Approve All" and "Reject All" buttons with smart disabled states
  - Show rejection reason input only when excluding items (simplified UX)
  - Fix Reviews tab auto-population when execution status changes to REVIEW
  - Add proper local state management for real-time input updates
  - Use design system Input components for consistent rounded styling

Key improvements:
- No more JSON syntax errors for string inputs
- Professional appearance matching platform standards
- Intuitive workflow with conditional UI elements
- Type-safe unified API structure
- Real-time input updates with proper state management

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Remove reviewData prop from PendingReviewCard usage
- Fix TypeScript error after prop removal
- Component now extracts data directly from review payload

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…nents

Remove unnecessary comments and simplify code across HITL components:
- PendingReviewsList: Remove verbose comments, simplify logic flow
- FloatingReviewsPanel: Remove excessive commenting
- PendingReviewCard: Clean up type guard comments
- usePendingReviews: Remove redundant JSDoc comments

This improves code readability while maintaining all functionality.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
…ecture

Address reviewer suggestions for cleaner architecture:

**GraphStore Improvements:**
- Move graph execution status to centralized graphStore
- Derive isGraphRunning automatically from executionStatus
- Use consistent useQueryStates pattern instead of useSearchParams
- Remove prop drilling by letting components read from store directly

**Loading State Consistency:**
- Add missing isPending state to stopGraph operation in new builder
- New builder now has consistent loading states for both start/stop actions
- Matches behavior with new library page (both have proper loading feedback)

**Architecture Clean-up:**
- New builder: Uses graphStore for execution state ✓
- Legacy builder: Keeps existing local state approach ✓
- No mixed/duplicate state tracking between builders
- FloatingReviewsPanel reads execution status from store

**Benefits:**
- Zero prop drilling for execution status
- Consistent query state patterns across codebase
- Better UX with proper stop operation loading feedback
- Cleaner separation between new and legacy architectures
- Reusable execution state for future components

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@coderabbitai
Copy link

coderabbitai bot commented Nov 26, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/human-in-the-loop-block-toggle

Tip

📝 Customizable high-level summaries are now available in beta!

You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.

  • Provide your own instructions using the high_level_summary_instructions setting.
  • Format the summary however you like (bullet lists, tables, multi-section layouts, contributor stats, etc.).
  • Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example instruction:

"Divide the high-level summary into five sections:

  1. 📝 Description — Summarize the main change in 50–60 words, explaining what was done.
  2. 📓 References — List relevant issues, discussions, documentation, or related PRs.
  3. 📦 Dependencies & Requirements — Mention any new/updated dependencies, environment variable changes, or configuration updates.
  4. 📊 Contributor Summary — Include a Markdown table showing contributions:
    | Contributor | Lines Added | Lines Removed | Files Changed |
  5. ✔️ Additional Notes — Add any extra reviewer context.
    Keep each section concise (under 200 words) and use bullet or numbered lists for clarity."

Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@AutoGPT-Agent
Copy link

Thank you for this comprehensive PR implementing a graph-level Safe Mode toggle for HITL blocks. The code changes look solid, and your PR description does an excellent job of explaining the intent and implementation details.

However, I noticed that you haven't checked off any of the items in your test plan. Before this can be merged, please:

  1. Complete the testing of your implementation according to the test plan you've outlined
  2. Check off the test plan items in your PR description once they've been verified

This feature affects execution behavior for HITL blocks, so thorough testing is particularly important to ensure it works as expected across different contexts (builder and library pages).

Also, I noted a few of your known issues:

  • Toggle state persistence showing "ON" regardless of actual state
  • LibraryAgent metadata TypeScript errors
  • Z-index issues with tooltip

While these don't necessarily block merging, it would be good to know if you have plans to address them in this PR or follow-up PRs.

Let me know if you need any clarification or have questions about these requirements.

@deepsource-io
Copy link

deepsource-io bot commented Nov 26, 2025

Here's the code health analysis summary for commits 3d08c22..1a15c4a. View details on DeepSource ↗.

Analysis Summary

AnalyzerStatusSummaryLink
DeepSource JavaScript LogoJavaScript✅ Success
❗ 49 occurences introduced
🎯 3 occurences resolved
View Check ↗
DeepSource Python LogoPython✅ Success
❗ 18 occurences introduced
🎯 5 occurences resolved
View Check ↗

💡 If you’re a repository administrator, you can configure the quality gates from the settings.

@AutoGPT-Agent
Copy link

Thanks for the comprehensive PR implementing the Safe Mode toggle for HITL blocks! The implementation looks thorough and well-thought-out. I have a few comments before this is ready to merge:

Test Plan Completion

The test plan is clearly specified in your PR description, but none of the items are checked off. Could you please complete the testing as outlined and check off the test plan items? Specifically:

  • Verify toggle appears only when graph has HITL blocks
  • Test toggle persistence across page refreshes
  • Confirm API calls update graph metadata correctly
  • Validate execution behavior respects safe mode setting
  • Check styling consistency across builder and library contexts

Known Issues

You've done a great job documenting the known issues. Since these are marked as "Work in Progress", could you clarify which of these need to be resolved before merging versus which can be addressed in follow-up PRs?

Particularly concerning are these high priority issues:

  • Toggle state persistence always showing "ON"
  • TypeScript errors with LibraryAgent metadata
  • Z-index issues with tooltips

Technical Questions

  1. The HITL block ID is hardcoded (8b2a7b3c-6e9d-4a5f-8c1b-2e3f4a5b6c7d). Is there a more robust way to detect HITL blocks that won't break if block IDs change?
  2. Could you add some comments around the API priority logic where safe_mode parameter takes precedence over graph metadata?

Migration Readiness

The migration looks good, but let's ensure it's been tested on a representative database. Have you verified the migration runs successfully on a database with existing graphs?

Overall, this is a well-implemented feature with good cross-cutting concerns addressed. Looking forward to getting the test plan completed!

- Added PATCH /graphs/{id}/metadata endpoint for metadata-only updates without version bumping
- Removed safe_mode parameter from execute_graph API (backend now reads from metadata)
- Added has_human_in_the_loop field to LibraryAgent model for proper HITL detection
- Created FloatingSafeModeToggle component with support for all graph types
- Updated frontend execution calls to remove redundant safe_mode parameter
- Added proper cache invalidation for immediate UI updates
- Fixed TypeScript compatibility between legacy Graph and new API types

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@majdyz majdyz force-pushed the feat/human-in-the-loop-block-toggle branch from 121ceb6 to 4e686d7 Compare November 26, 2025 11:52
@AutoGPT-Agent
Copy link

Thank you for this detailed PR implementing the graph-level Safe Mode toggle for HITL blocks!

Feedback

Your implementation looks comprehensive, covering both backend and frontend changes needed for this feature. I appreciate the detailed description with sections for backend changes, frontend changes, technical implementation, and known issues.

A few observations:

  1. Missing Standard Checklist: Could you please add our standard checklist for code changes? Since this is a substantial feature implementation (labeled as size/xl), we need to ensure all verification steps are covered.

  2. Execution Context: The change from UserContext to ExecutionContext seems well implemented throughout the codebase, ensuring all references are updated consistently.

  3. Known Issues: It's good that you've documented the known issues, particularly around toggle state persistence, but these would need to be addressed before merging or at least have clear follow-up tasks created.

  4. Test Plan: Your test plan is helpful, but when you add the standard checklist, please ensure you've included specific verification steps for this feature.

  5. User ID Validation: The changes to backend/data/graph.py correctly validate user access in the update_graph_metadata function, which is good.

Once you've added the standard PR checklist with your test plan and addressed the known issues (or planned follow-ups), this should be ready for review again.

Split long line for better readability per Python formatting standards.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@AutoGPT-Agent
Copy link

Thank you for this PR implementing a graph-level Safe Mode toggle for HITL blocks. The feature looks well-designed and adds important safety functionality to the platform.

However, there's one key issue that needs to be addressed before this PR can be merged:

Required Changes

  • Missing PR Checklist: Our PR template requires a completed checklist for all PRs with material code changes. Please add the standard PR checklist from the template and ensure all items are checked off or explained.

Additional Feedback

  • The implementation generally looks solid. I especially like the detailed technical implementation and known issues sections in your description.
  • The backend changes for the Safe Mode feature appear to be implemented correctly, with proper database migrations and API updates.
  • The frontend components for the toggle are well-designed with appropriate variants for different contexts.
  • Good job on considering the UX implications of the toggle placement in different views.

Once you add the required checklist, this PR should be ready for review and potential approval.

@github-actions github-actions bot added the conflicts Automatically applied to PRs with merge conflicts label Nov 26, 2025
@github-actions
Copy link
Contributor

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

Base automatically changed from feat/human-in-the-loop-block to dev November 27, 2025 05:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

conflicts Automatically applied to PRs with merge conflicts platform/backend AutoGPT Platform - Back end platform/blocks platform/frontend AutoGPT Platform - Front end size/xl

Projects

Status: 🆕 Needs initial review
Status: No status

Development

Successfully merging this pull request may close these issues.

3 participants