Commit 7ac9672 (merge of parents ae44801 + 97ece7d): "Best quality generation and more options"

12 files changed: +120 −71 lines

README.md (120 additions, 71 deletions)
Ten AI-powered lyrics generation nodes supporting various LLM providers:

### Ace-Step OpenAI Lyrics

Lyrics generation using OpenAI GPT models.

**Supported Models (December 2025):**
- `gpt-5.1` - Reasoning model (latest)
- `gpt-5.1-codex` - Coding-optimized
- `gpt-5` - High performance
- `gpt-5-pro` - Professional variant
- `gpt-4o` - Multimodal (recommended)
- `gpt-4o-mini` - Fast variant
- `gpt-4-turbo` - High performance
- `gpt-4` - Stable base
- `o3` - Reasoning model
- `o3-mini` - Compact reasoning
- `o1` - Advanced reasoning
- `o1-mini` - Compact advanced reasoning

**Category:** `JK AceStep Nodes/Lyrics`
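Inside ComfyUI the node handles the API call, but the underlying request is an ordinary chat completion. A minimal sketch with the OpenAI Python SDK (the prompt wording and the `[verse]`/`[chorus]` tag convention are illustrative assumptions, not the node's actual template):

```python
# Sketch of a lyrics request via the OpenAI Python SDK (assumed usage).
def build_lyrics_messages(theme: str, genre: str) -> list:
    """Chat prompt asking for section-tagged lyrics."""
    return [
        {"role": "system",
         "content": "You write song lyrics, tagging sections as [verse] and [chorus]."},
        {"role": "user", "content": f"Write {genre} lyrics about: {theme}"},
    ]

if __name__ == "__main__":
    # SDK imported lazily so the helper above has no dependencies.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=build_lyrics_messages("a night drive", "synthwave"),
        temperature=0.9,
    )
    print(resp.choices[0].message.content)
```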

### Ace-Step Claude Lyrics

Lyrics generation using Anthropic Claude models.

**Supported Models (December 2025):**
- `claude-opus-4.5` - Latest flagship (recommended)
- `claude-opus-4.1` - Previous flagship
- `claude-sonnet-4.5` - Latest balanced
- `claude-sonnet-4` - Previous balanced
- `claude-haiku-4.5` - Latest fast
- `claude-haiku-3.5` - Previous fast
- `claude-3-5-sonnet-20241022` - Snapshot variant
- `claude-3-5-haiku-20241022` - Snapshot variant
- `claude-3-opus-20240229` - Dated variant

**Category:** `JK AceStep Nodes/Lyrics`
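The Claude node uses the Anthropic Messages API under the hood. A minimal sketch (model ID and prompt wording are illustrative assumptions; check Anthropic's current model list):

```python
# Sketch of a lyrics request via the Anthropic SDK (assumed usage).
def build_lyrics_prompt(theme: str, structure: str = "verse-chorus-verse") -> str:
    """Single user prompt describing the desired song."""
    return (f"Write song lyrics about '{theme}' in a {structure} structure, "
            "tagging sections as [verse] and [chorus].")

if __name__ == "__main__":
    import anthropic  # lazy import; pip install anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY
    msg = client.messages.create(
        model="claude-sonnet-4-5",  # illustrative; verify against Anthropic's docs
        max_tokens=1024,            # required by the Messages API
        messages=[{"role": "user",
                   "content": build_lyrics_prompt("city rain at midnight")}],
    )
    print(msg.content[0].text)
```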

### Ace-Step Gemini Lyrics

Lyrics generation using Google Gemini API.

**Supported Models (December 2025):**
- `gemini-3-pro` - Latest pro model (recommended)
- `gemini-2.5-flash` - Fast with latest capabilities
- `gemini-2.5-flash-lite` - Ultra-fast variant
- `gemini-2.5-pro` - High quality
- `gemini-2.0-flash` - Previous generation
- `gemini-2.0-flash-lite` - Previous generation lite

**Category:** `JK AceStep Nodes/Lyrics`
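A minimal sketch of the same request against the Gemini API via the `google-generativeai` SDK (prompt wording is an illustrative assumption):

```python
# Sketch of a lyrics request via the google-generativeai SDK (assumed usage).
import os

def lyrics_instruction(theme: str, mood: str) -> str:
    """Gemini accepts a plain-text prompt, so one string is enough."""
    return (f"Write {mood} song lyrics about {theme}. "
            "Tag sections as [verse] and [chorus].")

if __name__ == "__main__":
    import google.generativeai as genai  # lazy import

    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel("gemini-2.5-flash")
    print(model.generate_content(
        lyrics_instruction("first snow", "melancholic")).text)
```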

### Ace-Step Groq Lyrics

High-speed lyrics generation using Groq API.

**Supported Models (December 2025):**
- `llama-3.3-70b-versatile` - Meta Llama 3.3 70B (best quality)
- `llama-3.1-8b-instant` - Meta Llama 3.1 8B (fast)
- `llama-guard-4-12b` - Meta Guard model
- `deepseek-v3` - DeepSeek V3
- `mistral-small-3` - Mistral Small v3
- `gpt-oss-120b` - OpenAI OSS 120B
- `gpt-oss-20b` - OpenAI OSS 20B
- Plus additional production models

**Category:** `JK AceStep Nodes/Lyrics`
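Groq's SDK mirrors the OpenAI chat-completions interface, so the call shape is familiar. A sketch of the speed/quality trade-off between the two main production models (helper name is illustrative):

```python
# Sketch: choosing between Groq's fast and best-quality models (assumed usage).
def pick_groq_model(prefer_speed: bool) -> str:
    """Trade quality for latency using the two main production models."""
    return "llama-3.1-8b-instant" if prefer_speed else "llama-3.3-70b-versatile"

if __name__ == "__main__":
    from groq import Groq  # lazy import; pip install groq

    client = Groq()  # reads GROQ_API_KEY
    resp = client.chat.completions.create(
        model=pick_groq_model(prefer_speed=False),
        messages=[{"role": "user",
                   "content": "Write upbeat pop lyrics about summer, "
                              "with [verse]/[chorus] tags."}],
    )
    print(resp.choices[0].message.content)
```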

### Ace-Step Perplexity Lyrics

Lyrics generation using Perplexity Sonar models.

**Supported Models (December 2025):**
- `sonar` - Standard model
- `sonar-pro` - Professional variant
- `sonar-reasoning` - Reasoning-focused
- `sonar-reasoning-pro` - Advanced reasoning
- `sonar-deep-research` - Deep research variant

**Category:** `JK AceStep Nodes/Lyrics`
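Perplexity exposes an OpenAI-compatible endpoint, so the OpenAI SDK can be pointed at it. A sketch (prompt wording is an illustrative assumption):

```python
# Sketch: driving Perplexity Sonar through the OpenAI SDK (assumed usage).
import os

PPLX_BASE_URL = "https://api.perplexity.ai"

def sonar_messages(theme: str) -> list:
    """Single-turn chat request for tagged lyrics."""
    return [{"role": "user",
             "content": f"Write song lyrics about {theme} "
                        "with [verse]/[chorus] tags."}]

if __name__ == "__main__":
    from openai import OpenAI  # lazy import

    client = OpenAI(base_url=PPLX_BASE_URL,
                    api_key=os.environ["PERPLEXITY_API_KEY"])
    resp = client.chat.completions.create(
        model="sonar", messages=sonar_messages("old lighthouses"))
    print(resp.choices[0].message.content)
```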

### Ace-Step Cohere Lyrics

Lyrics generation using Cohere Command models.

**Supported Models (December 2025):**
- `command-a-03-2025` - Latest Command A
- `command-r7b-12-2024` - December 2024 variant
- `command-r-plus-08-2024` - R+ August 2024
- `command-r-08-2024` - R August 2024
- `command-a-translate` - Translation specialist
- `command-a-reasoning` - Reasoning-focused
- `command-a-vision` - Vision capabilities
- `aya-expanse-32b` - Aya Expanse 32B
- `aya-expanse-8b` - Aya Expanse 8B
- `aya-vision` - Aya with vision
- `aya-translate` - Aya translation specialist

**Category:** `JK AceStep Nodes/Lyrics`
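A sketch via the Cohere SDK's v2 chat interface (assumed usage; since Aya models are multilingual, the target language is made explicit in the prompt):

```python
# Sketch of a lyrics request via the Cohere SDK (assumed usage).
def cohere_chat_messages(theme: str, language: str = "English") -> list:
    """Chat request that pins the output language for multilingual models."""
    return [{"role": "user",
             "content": f"Write {language} song lyrics about {theme}, "
                        "tagging sections as [verse] and [chorus]."}]

if __name__ == "__main__":
    import cohere  # lazy import; pip install cohere

    co = cohere.ClientV2()  # reads the Cohere API key from the environment
    resp = co.chat(model="command-a-03-2025",
                   messages=cohere_chat_messages("desert highways"))
    print(resp.message.content[0].text)
```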

### Ace-Step Replicate Lyrics

Lyrics generation using Replicate API models.

**Supported Models (December 2025):**
- `meta/llama-3.1-405b-instruct` - 405B instruction-tuned
- `meta/llama-3.1-70b-instruct` - 70B instruction-tuned
- `meta/llama-3.1-8b-instruct` - 8B instruction-tuned
- … *(remaining entries unchanged in this commit; elided between diff hunks)*
### Ace-Step HuggingFace Lyrics

Lyrics generation using HuggingFace Inference API.

**Supported Models (December 2025):**
- `meta-llama/Llama-3.1-405B-Instruct` - Large instruction-tuned
- `meta-llama/Llama-3.3-70B-Instruct-Turbo` - Llama 3.3 70B turbo
- `meta-llama/Llama-3.1-70B-Instruct` - 70B instruction-tuned
- `mistralai/Mistral-Large` - Large Mistral variant
- `microsoft/Phi-4` - Phi-4 model
- `deepseek-ai/deepseek-v3` - DeepSeek V3
- `Qwen/Qwen2.5-72B-Instruct` - Qwen 2.5 72B
- `google/gemma-2-27b` - Gemma 2 27B
- `tiiuae/falcon-180b` - Falcon 180B

**Category:** `JK AceStep Nodes/Lyrics`
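A sketch via `huggingface_hub`'s `InferenceClient` chat interface (assumed usage; prompt wording and token budget are illustrative):

```python
# Sketch of a lyrics request via the HuggingFace Inference API (assumed usage).
import os

def hf_request(theme: str) -> dict:
    """Bundle the model choice and chat messages for one request."""
    return {
        "model": "meta-llama/Llama-3.1-70B-Instruct",
        "messages": [{"role": "user",
                      "content": f"Write song lyrics about {theme} "
                                 "with [verse]/[chorus] tags."}],
        "max_tokens": 512,
    }

if __name__ == "__main__":
    from huggingface_hub import InferenceClient  # lazy import

    client = InferenceClient(token=os.environ["HF_TOKEN"])
    out = client.chat_completion(**hf_request("abandoned factories"))
    print(out.choices[0].message.content)
```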

### Ace-Step Together AI Lyrics

Lyrics generation using Together AI serverless models.

**Supported Models (December 2025):**
- `meta-llama/Llama-3.3-70B-Instruct-Turbo` - Llama 3.3 70B turbo
- `meta-llama/Llama-3.1-405B-Instruct-Turbo` - Llama 3.1 405B turbo
- `mistralai/Mistral-Small-24B-Instruct-2501` - Mistral Small 24B
- `Qwen/Qwen2.5-72B-Instruct` - Qwen 2.5 72B
- `deepseek-ai/DeepSeek-V3` - DeepSeek V3
- `moonshotai/Kimi-K2-Instruct` - Kimi K2
- `GLM-4-Plus` - GLM 4 Plus
- `Nous-Hermes-3-70B` - Nous Hermes 3 70B
- Plus 100+ additional models available

**Category:** `JK AceStep Nodes/Lyrics`
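The Together SDK also mirrors OpenAI's interface. A sketch that folds a musical hint (tempo) into the lyric request (helper and prompt wording are illustrative assumptions):

```python
# Sketch of a lyrics request via the Together SDK (assumed usage).
def together_messages(theme: str, tempo_bpm: int) -> list:
    """Chat request carrying a tempo hint alongside the theme."""
    return [{"role": "user",
             "content": f"Write lyrics about {theme} suited to ~{tempo_bpm} BPM, "
                        "with [verse]/[chorus] tags."}]

if __name__ == "__main__":
    from together import Together  # lazy import; pip install together

    client = Together()  # reads TOGETHER_API_KEY
    resp = client.chat.completions.create(
        model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
        messages=together_messages("neon city nights", 120),
    )
    print(resp.choices[0].message.content)
```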

### Ace-Step Fireworks Lyrics

Lyrics generation using Fireworks AI models (100+ available).

**Supported Models (December 2025):**
- `deepseek-ai/deepseek-v3` - DeepSeek V3 (latest)
- `deepseek-ai/deepseek-r1` - DeepSeek R1 reasoning
- `Qwen/Qwen3-235B-A22B-Instruct` - Qwen 3 235B
- `Qwen/Qwen2.5-72B-Instruct-Turbo` - Qwen 2.5 72B turbo
- `meta-llama/Llama-4-Maverick-17B` - Llama 4 Maverick
- `meta-llama/Llama-4-Scout-17B` - Llama 4 Scout
- `meta-llama/Llama-3.3-70B-Instruct` - Llama 3.3 70B
- `meta-llama/Llama-3.1-405B-Instruct` - Llama 3.1 405B
- `mistralai/Mistral-Large-3-675B-Instruct` - Mistral Large 675B
- `mistralai/Mistral-Small-24B-Instruct-2501` - Mistral Small 24B
- `zai-org/GLM-4.6` - GLM 4.6
- `moonshotai/Kimi-K2` - Kimi K2
- `google/Gemma-3-27b` - Gemma 3 27B
- Plus 90+ additional models available

**Category:** `JK AceStep Nodes/Lyrics`
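Fireworks serves an OpenAI-compatible endpoint; deployed model IDs use an account-scoped path form. A sketch (the exact model ID below is an assumption; check the Fireworks model catalog):

```python
# Sketch: driving Fireworks through the OpenAI SDK (assumed usage).
import os

FIREWORKS_BASE_URL = "https://api.fireworks.ai/inference/v1"

def fireworks_model_path(short_name: str) -> str:
    """Expand a short model name to the account-scoped path Fireworks expects."""
    return f"accounts/fireworks/models/{short_name}"

if __name__ == "__main__":
    from openai import OpenAI  # lazy import

    client = OpenAI(base_url=FIREWORKS_BASE_URL,
                    api_key=os.environ["FIREWORKS_API_KEY"])
    resp = client.chat.completions.create(
        model=fireworks_model_path("llama-v3p3-70b-instruct"),  # illustrative ID
        messages=[{"role": "user",
                   "content": "Write folk lyrics about rivers, "
                              "with [verse]/[chorus] tags."}],
    )
    print(resp.choices[0].message.content)
```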
Select your preferred variant from any sampler dropdown (default: `jkass_quality`).

- **CFG:** 4.0-4.5
- **Anti-Autotune:** 0.25-0.35 (vocals), 0.0-0.15 (instruments)

### ✅ Extra settings for reducing 'AI'-sounding female vocals
- **Sampler:** `jkass_quality` (best quality) or `jkass_fast` (speed)
- **Frequency Damping:** 0.15-0.5 for female vocals to reduce metallic sizzle (0 = disabled)
- **Temporal Smoothing:** 0.02-0.12 to reduce pitch quantization and temporal discontinuities
- **Beat Stability:** 0.05-0.2 to keep rhythmic hits stable and avoid per-frame jitter
- **Anti-Autotune:** 0.25-0.35 (vocals), 0.0-0.15 (instruments)
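The recommendations above can be captured as a parameter bundle. A hypothetical sketch (the key names mirror this README's labels, not necessarily the node's real input names):

```python
# Hypothetical settings bundle for natural-sounding female vocals,
# mirroring the recommended ranges above (key names are illustrative).
NATURAL_FEMALE_VOCALS = {
    "sampler": "jkass_quality",
    "frequency_damping": 0.3,    # 0.15-0.5; 0 disables
    "temporal_smoothing": 0.06,  # 0.02-0.12
    "beat_stability": 0.1,       # 0.05-0.2
    "anti_autotune": 0.3,        # 0.25-0.35 for vocals
}

def within_recommended(settings: dict) -> bool:
    """Check each value against the README's recommended ranges."""
    ranges = {
        "frequency_damping": (0.15, 0.5),
        "temporal_smoothing": (0.02, 0.12),
        "beat_stability": (0.05, 0.2),
        "anti_autotune": (0.25, 0.35),
    }
    return all(lo <= settings[k] <= hi for k, (lo, hi) in ranges.items())
```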
---

## 🎯 Quality Check Feature

Automatically tests multiple step counts to find optimal settings for your prompt.

- **Word cutting/stuttering:** Use `jkass_quality` sampler, disable advanced optimizations
- **Metallic voice:** Increase `anti_autotune_strength` to 0.3-0.4
- **AI-sounding female voice:** Try the following sequence:
  1. Use `jkass_quality` with 80-120 steps, CFG 4.0-4.5, and APG enabled
  2. Set Anti-Autotune (0.25-0.35), Frequency Damping (0.15-0.4), and Temporal Smoothing (0.02-0.06)
  3. Use the Prompt Gen with `voice_style` -> `natural_female` and add 'breathy, micro pitch variation' to the extra prompt
  4. Decode using a high-quality VAE/vocoder (HiFi-GAN, or a validated VAE) for improved timbre
  5. If still metallic: apply a de-esser and a mild EQ cut at 7-12 kHz; add subtle formant correction and a breath overlay

  **Optional:** Use the `Ace-Step Post Process` node to apply quick de-essing (reduce 6-10 kHz energy), spectral smoothing, and a subtle breath overlay to further humanize the vocal.
- **Poor quality:** Increase steps (80-120), use CFG 4.0-4.5, enable APG, try `jkass_quality` sampler
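The EQ-cut idea from the troubleshooting list can be prototyped outside ComfyUI. A crude static sketch that attenuates the 7-12 kHz sibilance band (a real de-esser acts dynamically, only when sibilance is present; the function and defaults are illustrative):

```python
# Sketch: a static FFT-based band cut over the 7-12 kHz sibilance range.
import numpy as np

def band_cut(audio, sr, lo_hz=7000.0, hi_hz=12000.0, gain=0.5):
    """Scale the lo_hz-hi_hz band of a mono signal by `gain` (0.5 = -6 dB)."""
    spectrum = np.fft.rfft(audio)
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / sr)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    spectrum[band] *= gain
    return np.fft.irfft(spectrum, n=len(audio))

# Example: a 9 kHz sine (inside the band) is attenuated; 1 kHz passes intact.
sr = 44100
t = np.arange(sr) / sr
sibilant = np.sin(2 * np.pi * 9000 * t)
processed = band_cut(sibilant, sr)
```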
---

JK-AceStep-Nodes/

- **OpenAI** - gpt-5.1, gpt-5, gpt-4o, o3/o1, and more
- **Anthropic Claude** - Claude Opus 4.5, Sonnet 4.5, Haiku 4.5, plus dated snapshots
- **Google Gemini** - gemini-3-pro, gemini-2.5-flash/pro, gemini-2.0-flash
- **Groq** - Llama 3.3 70B, Llama 3.1 8B, Llama Guard 4, GPT-OSS (120B/20B), and Llama 4 preview models
- **Perplexity** - Sonar, Sonar Pro, Sonar Reasoning (with 128k context)
- **Cohere** - Command A/R+ (with reasoning & vision), Aya (multilingual)
- **Replicate** - Llama 3.1 (405B/70B/8B), Mistral Small/Nemo, Mixtral

---

## 🗣️ How to Use the Vocoder (ADaMoSHiFiGAN)

To enable audio conversion with the vocoder (for improved final audio quality):

1. **Obtain the vocoder files:**
   - `diffusion_pytorch_model.safetensors` (vocoder model)
   - `config.json` (vocoder configuration)

2. **Place both files in the folder `JK-AceStep-Nodes/vocoder/`.** The final paths should be:

   ```
   JK-AceStep-Nodes/vocoder/diffusion_pytorch_model.safetensors
   JK-AceStep-Nodes/vocoder/config.json
   ```

3. **Done!**
   - The system automatically detects these files when a node has the vocoder enabled.
   - If the files are not present, audio is generated without the vocoder.

> **Tip:** Always use the correct file pair (model + config) to avoid artifacts or loading errors.
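The detection step above amounts to a simple file-pair check. A sketch of how it could work (illustrative; not the extension's actual code):

```python
# Sketch: detecting the vocoder model + config pair (illustrative).
from pathlib import Path

VOCODER_FILES = ("diffusion_pytorch_model.safetensors", "config.json")

def vocoder_available(node_root) -> bool:
    """True only when the complete model + config pair is present."""
    vocoder_dir = Path(node_root) / "vocoder"
    return all((vocoder_dir / name).is_file() for name in VOCODER_FILES)
```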
---
## 📄 License

MIT License
Binary files changed (not shown): 1.95 KB, 53.7 KB, 39.3 KB, 3.6 KB, 10.7 KB, 7.22 KB, 31.9 KB, 5.1 KB, 7.32 KB.
