ComfyUI-Copilot-w-Agent v3.0: Agent Mode, Multi-Provider, Voice I/O, … #130
vehoelite wants to merge 9 commits into AIDC-AI:main from
Conversation
…LM Studio fixes

Major enhancements over upstream AIDC-AI/ComfyUI-Copilot v2.0:

- Agent Mode: Autonomous multi-step workflow building with a PLAN/EXECUTE/VALIDATE/REPORT loop, tool budget enforcement, loop prevention, and a visual step tracker
- Multi-Provider: OpenAI, Groq, Anthropic, and LM Studio with auto-detection, provider-aware timeouts, token budgets, and rate-limit retry
- LM Studio: Fixed broken integration (wrong port, URL normalization, model listing, API key handling, header forwarding, cache invalidation)
- Voice I/O: Streaming TTS with sentence extraction and gapless playback, VAD-based STT with auto-silence detection, and a per-provider backend (Groq Orpheus, OpenAI tts-1)
- Fine-Tuning Pipeline: Complete QLoRA training for Qwen3 tool-calling, a dataset generator (18 conversation types), a validator, and chunked CE loss for 8GB GPUs
- Bug fixes: None-safe metadata, robust JSON parsing, MCP timeout tuning, a nonlocal declaration fix, and NullCtx for optional async managers

Enhanced by Claude Opus 4.6
Co-authored-by: vehoelite <145181904+vehoelite@users.noreply.github.com>
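The Agent Mode loop described above (PLAN/EXECUTE/VALIDATE/REPORT with a tool budget and loop prevention) can be sketched roughly as follows. All names here are illustrative, not the project's actual API:

```python
# Minimal sketch of an agent loop with tool-budget enforcement and loop
# prevention (hypothetical names; not ComfyUI-Copilot's real implementation).
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolCall:
    name: str
    args: tuple  # hashable, so repeated identical calls can be detected

def run_agent(steps, tools, propose, validate, max_tool_calls=10):
    """PLAN (steps) -> EXECUTE (tools) -> VALIDATE -> REPORT (trace)."""
    trace, seen, budget = [], set(), max_tool_calls
    for step in steps:                        # PLAN: list of planned steps
        while budget > 0:                     # tool budget enforcement
            call = propose(step, trace)       # EXECUTE: pick next tool call
            if call is None or call in seen:  # loop prevention
                break
            seen.add(call)
            budget -= 1
            result = tools[call.name](*call.args)
            trace.append((call, result))
            if validate(step, result):        # VALIDATE: is this step done?
                break
    return trace                              # REPORT: full interaction trace

# Tiny usage example with stub tools.
tools = {"search_nodes": lambda q: [f"{q}Node"]}
steps = ["find loader"]
propose = lambda step, trace: ToolCall("search_nodes", ("Loader",))
validate = lambda step, result: bool(result)
trace = run_agent(steps, tools, propose, validate, max_tool_calls=3)
```

Because identical calls are deduplicated via `seen`, an agent that keeps proposing the same call exits the step instead of burning the whole budget.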
AGENT mode might inherit small bugs, but I am working on them and will continue to do so.

I would also like to propose: Live Interaction Logging for Training Data. Add an opt-in telemetry mode that records successful agent interactions during normal ComfyUI-Copilot usage. When a user's request completes successfully (workflow built, nodes connected, parameters set correctly), the full conversation trace (user prompt, tool calls, tool responses, and final result) gets saved as a training example in OpenAI chat-completion format.

How it works:
- The user enables logging via a toggle in settings (off by default, privacy-first)
- A logging middleware in the agent pipeline serializes conversation turns
- This complements synthetic training data (what the fine-tuning pipeline currently generates), which covers template patterns but can't anticipate the full diversity of real user requests

If this is possible, the user obviously must be able to opt out; privacy and security are a personal priority. Let me know what you think.
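The proposed logging middleware could look roughly like this: on a successful run, serialize the trace as one OpenAI chat-completion-format training record (JSONL). Function name, trace shape, and file path are hypothetical:

```python
# Sketch of the proposed opt-in logging middleware. Serializes one successful
# agent interaction as an OpenAI chat-completion-format training example.
# All names and the trace shape here are hypothetical.
import json, os, tempfile

def log_interaction(path, user_prompt, tool_calls, final_result, enabled=False):
    if not enabled:  # opt-in: off by default, privacy-first
        return
    messages = [{"role": "user", "content": user_prompt}]
    for call, result in tool_calls:
        messages.append({
            "role": "assistant",
            "tool_calls": [{"type": "function",
                            "function": {"name": call["name"],
                                         "arguments": json.dumps(call["args"])}}],
        })
        messages.append({"role": "tool", "content": json.dumps(result)})
    messages.append({"role": "assistant", "content": final_result})
    with open(path, "a", encoding="utf-8") as f:  # append one JSONL record
        f.write(json.dumps({"messages": messages}) + "\n")

# Tiny usage example: write one training record to a temp file and read it back.
_path = os.path.join(tempfile.mkdtemp(), "train.jsonl")
log_interaction(_path, "build me an SDXL workflow",
                [({"name": "search_nodes", "args": {"query": "Loader"}},
                  ["CheckpointLoaderSimple"])],
                "Workflow built.", enabled=True)
with open(_path, encoding="utf-8") as f:
    record = json.loads(f.readline())
```

Keeping the record in the `{"messages": [...]}` shape means the logged data can feed the existing fine-tuning pipeline with no conversion step.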
- create_agent looked for 'model_select' but the agent endpoint stored it as 'model'
- Now checks both keys so the user's model dropdown selection is respected
- Set the OpenAI provider default to gpt-4.1-mini instead of gemini-2.5-flash
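The dual-key lookup described in this commit can be sketched as follows (hypothetical config dict; the actual create_agent signature may differ):

```python
# Sketch of the dual-key model lookup: the agent endpoint stored the user's
# selection under 'model', while create_agent read 'model_select', so check
# both keys before falling back to the provider default. Hypothetical helper.
def resolve_model(config, default="gpt-4.1-mini"):
    return config.get("model_select") or config.get("model") or default
```

Usage: a config written by the agent endpoint (`{"model": "llama-3"}`) and one written by the UI (`{"model_select": "..."}`) both resolve to the user's choice; an empty config falls back to gpt-4.1-mini.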
- Emphasize ALWAYS search_nodes first, never guess class_types
- Emphasize ALWAYS get_node_details before building JSON
- Add a COMMON MISTAKES section: single-node workflows, guessed names, strings used for connections
- Require complete pipelines: loader -> processing -> output
- Add a list_available_models step for real filenames
- ~700 tokens, fits the Groq 6K budget
Add upstream contribution documentation infrastructure
Really strange: the PR showed as signed for nearly 6 hours, then suddenly it's not, and it's not allowing any changes, acting like the task is done while GitHub reports something different.