[feat] Implement comprehensive batch tracking and OpenAPI-driven data models #1915

christian-byrne · 2025-06-08T08:30:26Z

Four main goals of this PR:

Improved Task Tracking

Fixed inconsistencies in the progress tracker dialog during installations. Previously, task state wasn't tracked server-side, forcing the frontend to manually track queue state.

Tasks now have UUIDs and client IDs for proper identification
Maintain task history and queue server-side with thread-safe operations
WebSocket notifications when tasks start/complete keep clients synchronized
Failed tasks are tracked without interrupting the queue
Tasks can be cancelled by ID
Frontend receives consistent stdout logs for each task (previously buggy due to manual tracking)

Remove Legacy Features

Cleaning up features being replaced by other projects (native model management, subgraphs)
Removed unsafe routes like git URL downloads and pip installs - as part of core HTTP API, only registry-validated (security scanned) packages should be downloadable
Legacy features still available in legacy sub-package, accessible via --enable_manager_legacy_ui CLI flag

Type-Safe API-Driven Development

Generated OpenAPI specification for Manager HTTP API
Both frontend and backend generate types/SDKs from the same spec, enforcing contracts
Frontend uses TypeScript checking, backend uses Pydantic validation
OpenAPI validated against actual server in CI/CD
Auto-generated documentation

Set Groundwork for Restore Points

Captures environment state before/after task batches (ComfyUI version, frontend version, custom nodes, python environment)
State schemas defined in OpenAPI with validation
HTTP routes for accessing states (later, migration will be possible as part of snapshot/restore feature)

Additional refactoring includes splitting very large files out into more modules and applying ruff/pylint.

Test plan

Test task queue operations with client_id tracking
Verify batch history files are created correctly
Test OpenAPI model generation with datamodel-codegen
Verify all REST endpoints work with new models
Test queue status filtering by client_id
Verify tasks are partitioned by failed and succeeded

ltdrdata · 2025-06-08T21:22:31Z

Was it intentional to include /dist dir?

christian-byrne · 2025-06-09T01:38:31Z

Was it intentional to include /dist dir?

No. Good catch.

Also, this should be a draft PR until changes are finalized and the test plan is complete.

- Add client_id field to QueueTaskItem and TaskHistoryItem models - Implement client-specific WebSocket message routing - Add client filtering to queue status and history endpoints - Follow ComfyUI patterns for session management - Create data_models package for better code organization

… models Enhances ComfyUI Manager with robust batch execution tracking and unified data model architecture: - Implemented automatic batch history serialization with before/after system state snapshots - Added comprehensive state management capturing installed nodes, models, and ComfyUI version info - Enhanced task queue with proper client ID handling and WebSocket notifications - Migrated all data models to OpenAPI-generated Pydantic models for consistency - Added documentation for new TaskQueue methods (done_count, total_count, finalize) - Fixed 64 linting errors with proper imports and code cleanup Technical improvements: - All models now auto-generated from openapi.yaml ensuring API/implementation consistency - Batch tracking captures complete system state at operation start and completion - Enhanced REST endpoints with comprehensive documentation - Removed manual model files in favor of single source of truth - Added helper methods for system state capture and batch lifecycle management

- Removed completed TODO comments about code quality checks and client_id handling - Updated comments to reflect implemented features - Fixed ruff linting errors: - Removed duplicate constant definitions - Added missing locale import - Fixed unused imports - Moved is_local_mode logic to security_utils module - Added model_dir_name_map import to model_utils All ruff checks now pass successfully.

- Updated all POST endpoints to use proper Pydantic model validation: - `/v2/manager/queue/task` - validates QueueTaskItem - `/v2/manager/queue/install_model` - validates ModelMetadata - `/v2/manager/queue/reinstall` - validates InstallPackParams - `/v2/customnode/import_fail_info` - validates cnr_id/url fields - Added proper error handling with ValidationError for detailed error messages - Updated TaskQueue.put() to handle both dict and Pydantic model inputs - Added missing imports: InstallPackParams, ModelMetadata, ValidationError Benefits: - Early validation catches invalid data at API boundaries - Better error messages for clients with specific validation failures - Type safety throughout the request processing pipeline - Consistent validation behavior across all endpoints All ruff checks pass and validation is now enabled by default.

- Replace deprecated openapi-spec-validator with @redocly/cli - Remove fragile custom regex-based route alignment script - Use industry-standard OpenAPI validation tooling - Switch from Python to Node.js for validation pipeline - New validation catches 41 errors and 141 warnings that old validator missed

- Eliminate TaskQueue.ExecutionStatus NamedTuple in favor of generated TaskExecutionStatus Pydantic model - Remove manual conversion logic between NamedTuple and Pydantic model - Use single source of truth for task execution status - Clean up unused imports (Literal, NamedTuple) - Maintain consistent data model usage throughout TaskQueue

- Add proper async worker management to TaskQueue class - Remove redundant task_worker_thread and task_worker_lock global variables - Replace manual threading with async task management - Update is_processing() logic to use TaskQueue state instead of thread status - Implement automatic worker cleanup when queue processing completes - Simplify queue start endpoint to use TaskQueue.start_worker()

- Convert all nullable: true to OpenAPI 3.1 format using type: [type, 'null'] - Fix invalid array schema definition in ManagerMappings using oneOf - Add default security: [] configuration to satisfy security-defined rule - All 41 validation errors resolved, spec now passes with 0 errors - 141 warnings remain (mostly missing operationId and example validation)

- Updated generated_models.py to reflect OpenAPI 3.1 nullable format changes - Models now use Optional[type] instead of nullable: true - All affected models regenerated with datamodel-codegen - Syntax and linting checks pass

- Add tj-actions/changed-files to detect modified files in PR - Only run OpenAPI validation if openapi.yaml was changed - Only run Python linting on changed Python files (excluding legacy/) - Remove incorrect "pip install ast" dependency - Remove non-standard AST parsing and import checks - Makes CI more efficient and prevents unrelated failures

- Fix async/sync mismatch in TaskQueue worker implementation - Use threading.Thread with asyncio.run() as originally designed - Remove incorrect async task approach that caused blocking issues - TaskQueue now properly manages its own thread lifecycle - Resolves WebSocket message delivery and task processing issues

- Convert batch tracking messages to debug level (batch start, history saved) - Convert task processing details to debug level - Convert cache update messages to debug level - Replace print() with logging.debug() for task processing - Keep user-relevant messages at info level (ComfyUI updates, installation success) - Resolves verbose output appearing without --verbose flag

- Add SecurityLevel enum with strong/normal/normal-/weak values - Add RiskLevel enum with block/high/middle values - These will be used for security policy management 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>

- Add SecurityLevel and RiskLevel enums to generated models - Enhance ComfyUISystemState with additional system information fields: - comfyui_root_path: ComfyUI installation directory - model_paths: Map of model types to configured paths - manager_version: ComfyUI Manager version - security_level: Current security configuration - network_mode: Network mode (online/offline/private) - cli_args: Selected CLI arguments - custom_nodes_count: Total number of custom nodes - failed_imports: List of failed imports - pip_packages: Installed pip packages 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>

- Removed do_update_all function that was never called and only returned an error - Removed "update-all" from OperationType enum as it's no longer used - Regenerated data models to reflect the enum change The update_all functionality now properly creates individual update tasks through the API endpoint rather than being a single monolithic task. Co-Authored-By: Claude <[email protected]>

- Added query parameter models to OpenAPI spec for GET endpoints - Regenerated data models to include new query param models - Replaced manual validation with Pydantic model validation - Removed obsolete validate_required_params helper function - Provides better error messages and type safety for API endpoints Co-Authored-By: Claude <[email protected]>

christian-byrne · 2025-06-21T22:23:42Z

Test: Client ID Filtering ✔️

Test that task queue operations work with multiple connected clients and each client only sees the task states that they queued.

manage-custom-nodes-per-client.mp4

christian-byrne · 2025-06-22T00:09:19Z

Test: Batch history record writing ✔️

Test that after the task queue is empty for 4 seconds consecutively after tasks were in process, a batch history record is written to disk. The batch history record includes a large number of properties representing the state before and after the operations, as well as the operations themselves.

Batch history records are kept for 16 days and are about 25kb per record (json file).

manager-batch-history-demo.mp4

christian-byrne · 2025-06-22T02:44:03Z

Test: Tasks partitioned based on failed and succeeded in the UI ✔️

Test that failed tasks are put into separate section in the UI

manager-demo-task-failed-split.mp4

christian-byrne · 2025-06-22T02:46:10Z

The test plan is now complete. The next step is to package all three feature branches into easily-testable portable release and begin QA testing. This will start tomorrow.

christian-byrne mentioned this pull request Jun 8, 2025

[feat] Implement comprehensive batch tracking and OpenAPI-driven data models #1914

Closed

christian-byrne marked this pull request as draft June 9, 2025 01:37

christian-byrne marked this pull request as ready for review June 14, 2025 02:27

christian-byrne added 17 commits June 13, 2025 19:33

remove dist dir

3afafdb

migrate to data models for all routes

0be0a2e

return installed pack list on status update

b55e140

refresh cache before reporting status

b73dc61

add installed packs to openapi

4d89c69

add claude memory

98af8dc

[chore] Add dist/ to gitignore to exclude build artifacts

2120a0a

[chore] Regenerate data models after OpenAPI fixes

a96e7b1

- Updated generated_models.py to reflect OpenAPI 3.1 nullable format changes - Models now use Optional[type] instead of nullable: true - All affected models regenerated with datamodel-codegen - Syntax and linting checks pass

christian-byrne force-pushed the feat/implement-batch-tracking-clean branch from 6cd9b70 to 7a73f5d Compare June 14, 2025 02:51

christian-byrne added 2 commits June 13, 2025 20:27

christian-byrne force-pushed the feat/implement-batch-tracking-clean branch from 834f1fd to c888ea6 Compare June 14, 2025 04:20

christian-byrne and others added 4 commits June 17, 2025 10:36

explain glob vs legacy in claude memory

1bdb026

fix duplicated schemas in openapi

65f0764

fix files description in api

32e0039

christian-byrne and others added 14 commits June 17, 2025 13:06

[feat] Implement comprehensive system state capture for batch records

03ecda3

[fix] Remove unused imports to fix linting errors

14298b0

[fix] Use tuples in TaskQueue heap for proper comparison support

ddc9cd0

[fix] Update client filtering to handle tuple structure in pending_tasks

e923434

[refactor] Move model utility functions to model_utils module

f1b3c6b

[cleanup] Remove outdated temp_queue_batch comment

15e3699

refactor

ce3b2ba

[fix] Update data models to Pydantic v2 syntax to fix TypeError

e8fc053

[fix] Remove unused imports to fix Ruff linting errors

d912fb0

remove debug comments

624ad4c

apply formatting

d1ca628

christian-byrne added 5 commits June 21, 2025 16:01

[fix] Ensure batch history is written when queue becomes empty

7713337

add frontend version to system state capture

75ecb31

add embedded python to system state

38fefde

add batch_id to history task items

35d98dc

cleanup records older than 16 days

a66f86d

christian-byrne merged commit cb0fa58 into draft-v4 Jun 22, 2025
6 checks passed

ltdrdata deleted the feat/implement-batch-tracking-clean branch August 19, 2025 09:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[feat] Implement comprehensive batch tracking and OpenAPI-driven data models #1915

[feat] Implement comprehensive batch tracking and OpenAPI-driven data models #1915

Uh oh!

christian-byrne commented Jun 8, 2025 •

edited

Loading

Uh oh!

ltdrdata commented Jun 8, 2025

Uh oh!

christian-byrne commented Jun 9, 2025

Uh oh!

christian-byrne commented Jun 21, 2025

Uh oh!

christian-byrne commented Jun 22, 2025

Uh oh!

christian-byrne commented Jun 22, 2025

Uh oh!

christian-byrne commented Jun 22, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

[feat] Implement comprehensive batch tracking and OpenAPI-driven data models #1915

[feat] Implement comprehensive batch tracking and OpenAPI-driven data models #1915

Uh oh!

Conversation

christian-byrne commented Jun 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Improved Task Tracking

Remove Legacy Features

Type-Safe API-Driven Development

Set Groundwork for Restore Points

Test plan

Uh oh!

ltdrdata commented Jun 8, 2025

Uh oh!

christian-byrne commented Jun 9, 2025

Uh oh!

christian-byrne commented Jun 21, 2025

Test: Client ID Filtering ✔️

Uh oh!

christian-byrne commented Jun 22, 2025

Test: Batch history record writing ✔️

Uh oh!

christian-byrne commented Jun 22, 2025

Test: Tasks partitioned based on failed and succeeded in the UI ✔️

Uh oh!

christian-byrne commented Jun 22, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

christian-byrne commented Jun 8, 2025 •

edited

Loading