ComfyUI-Copilot-w-Agent v3.0: Agent Mode, Multi-Provider, Voice I/O, … by vehoelite · Pull Request #130 · AIDC-AI/ComfyUI-Copilot

vehoelite · 2026-02-14T17:03:38Z

…LM Studio fixes

Major enhancements over upstream AIDC-AI/ComfyUI-Copilot v2.0:

Agent Mode: Autonomous multi-step workflow building with PLAN/EXECUTE/VALIDATE/REPORT loop, tool budget enforcement, loop prevention, visual step tracker
Multi-Provider: OpenAI, Groq, Anthropic, LM Studio with auto-detection, provider-aware timeouts, token budgets, and rate-limit retry
LM Studio: Fixed broken integration (wrong port, URL normalization, model listing, API key handling, header forwarding, cache invalidation)
Voice I/O: Streaming TTS with sentence extraction and gapless playback, VAD-based STT with auto-silence detection, per-provider backend (Groq Orpheus, OpenAI tts-1)
Fine-Tuning Pipeline: Complete QLoRA training for Qwen3 tool-calling, dataset generator (18 conversation types), validator, chunked CE loss for 8GB GPUs
Bug fixes: None-safe metadata, robust JSON parsing, MCP timeout tuning, nonlocal declaration fix, NullCtx for optional async managers
**ComfyUI CoPilot's very own local AI AGENT/Assistant based off Qwen/Qwen3-4B and trained QLoRa with PREMIUM data set's aimed to make ComfyUI CoPilot tool calls native. - This will be coming in the next coming days. It's training and under vigorous testing. Eventually being submitted to huggingface.

Enhanced by Claude Opus 4.6

…LM Studio fixes Major enhancements over upstream AIDC-AI/ComfyUI-Copilot v2.0: - Agent Mode: Autonomous multi-step workflow building with PLAN/EXECUTE/VALIDATE/REPORT loop, tool budget enforcement, loop prevention, visual step tracker - Multi-Provider: OpenAI, Groq, Anthropic, LM Studio with auto-detection, provider-aware timeouts, token budgets, and rate-limit retry - LM Studio: Fixed broken integration (wrong port, URL normalization, model listing, API key handling, header forwarding, cache invalidation) - Voice I/O: Streaming TTS with sentence extraction and gapless playback, VAD-based STT with auto-silence detection, per-provider backend (Groq Orpheus, OpenAI tts-1) - Fine-Tuning Pipeline: Complete QLoRA training for Qwen3 tool-calling, dataset generator (18 conversation types), validator, chunked CE loss for 8GB GPUs - Bug fixes: None-safe metadata, robust JSON parsing, MCP timeout tuning, nonlocal declaration fix, NullCtx for optional async managers Enhanced by Claude Opus 4.6

Co-authored-by: vehoelite <145181904+vehoelite@users.noreply.github.com>

CLAassistant · 2026-02-14T17:03:45Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ vehoelite
❌ Copilot
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

vehoelite · 2026-02-14T17:07:05Z

AGENT mode might inherit small bugs but I am working on them and will continue to do so.

Also I would like to Propose: Live Interaction Logging for Training Data

Add an opt-in telemetry mode that records successful agent interactions during normal ComfyUI-Copilot usage. When a user's request completes successfully (workflow built, nodes connected, parameters set correctly), the full conversation trace — user prompt, tool calls, tool responses, and final result — gets saved as a training example in OpenAI chat-completion format.

How it works:

User enables logging via a toggle in settings (off by default, privacy-first)
Only successful interactions are captured (user confirms result or workflow executes without error)
Failed/abandoned interactions are discarded or flagged as negative examples
Traces are saved locally as JSONL, same format as the synthetic training dataset
Users can review/delete logged interactions before contributing
Why this matters:

Synthetic training data (what the fine-tuning pipeline currently generates) covers template patterns but can't anticipate the full diversity of real user requests
Real interaction logs capture the actual distribution of how people use ComfyUI — uncommon node combinations, creative workflows, domain-specific terminology
Creates a data flywheel: better model → more successful interactions → more training data → even better model
Community-contributed logs (with consent) could build a shared dataset that benefits all users
Implementation scope:

A logging middleware in the agent pipeline that serializes conversation turns
A local JSONL writer with configurable output path
A UI panel to review, approve, or delete captured interactions
Export format compatible with the existing training pipeline

If this is possible, it's obvious to allow a user to opt out - privacy and security is a personal priority. Let me know what you think.

- create_agent looked for 'model_select' but agent endpoint stored as 'model' - Now checks both keys so user's model dropdown selection is respected - Set OpenAI provider default to gpt-4.1-mini instead of gemini-2.5-flash

- Emphasize ALWAYS search_nodes first, never guess class_types - Emphasize ALWAYS get_node_details before building JSON - Add COMMON MISTAKES section: single-node, guessed names, string-for-connections - Require complete pipelines: loader -> processing -> output - Add list_available_models step for real filenames - ~700 tokens, fits Groq 6K budget

Add upstream contribution documentation infrastructure

vehoelite · 2026-02-15T06:38:53Z

Really strange the thing was signed for nearly 6 hours, then suddenly it's not and it's not allowing any changes/acting like the task is done while github reports something different.

vehoelite and others added 6 commits February 14, 2026 10:58

Initial plan

12e0a98

Add comprehensive PR submission documentation

19420f2

Co-authored-by: vehoelite <145181904+vehoelite@users.noreply.github.com>

Add NEXT_STEPS and QUICK_REFERENCE guides, update README links

5820dbb

Co-authored-by: vehoelite <145181904+vehoelite@users.noreply.github.com>

Add comprehensive SUMMARY_FOR_USER guide

963d22e

Co-authored-by: vehoelite <145181904+vehoelite@users.noreply.github.com>

Add .github/PR_DOCUMENTATION_README.md as documentation index

d6f3ac5

Co-authored-by: vehoelite <145181904+vehoelite@users.noreply.github.com>

vehoelite added 3 commits February 14, 2026 12:40

fix: agent mode model selection ignored - config key mismatch

f1e1d26

- create_agent looked for 'model_select' but agent endpoint stored as 'model' - Now checks both keys so user's model dropdown selection is respected - Set OpenAI provider default to gpt-4.1-mini instead of gemini-2.5-flash

Merge pull request AIDC-AI#1 from vehoelite/copilot/fix-enhance-features

2916691

Add upstream contribution documentation infrastructure

vehoelite force-pushed the main branch from 1091935 to 2916691 Compare February 14, 2026 19:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

ComfyUI-Copilot-w-Agent v3.0: Agent Mode, Multi-Provider, Voice I/O, …#130

ComfyUI-Copilot-w-Agent v3.0: Agent Mode, Multi-Provider, Voice I/O, …#130
vehoelite wants to merge 9 commits intoAIDC-AI:mainfrom
vehoelite:main

vehoelite commented Feb 14, 2026

Uh oh!

CLAassistant commented Feb 14, 2026 •

edited

Loading

Uh oh!

vehoelite commented Feb 14, 2026 •

edited

Loading

Uh oh!

vehoelite commented Feb 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

vehoelite commented Feb 14, 2026

Uh oh!

CLAassistant commented Feb 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vehoelite commented Feb 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vehoelite commented Feb 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

CLAassistant commented Feb 14, 2026 •

edited

Loading

vehoelite commented Feb 14, 2026 •

edited

Loading