Releases · ace-step/ACE-Step-1.5

02 Mar 09:53

ChuxiJ

v0.1.4

1ce57cc

v0.1.4 Latest

What's Changed

docs: add Awesome ACE-Step link to README by @ChuxiJ in #715
Fix auto-labelling crash when "auto" device is selected by @Copilot in #714
refactor(gradio): decompose LLM actions and harden UI test isolation by @1larity in #710
Automated clean up of issues by @schneidergithub in #723
refactor(gradio): decompose batch management with facade by @1larity in #716
feat(openrouter): add sample/audio2code endpoints and expand generation params by @ChuxiJ in #729
refactor(openrouter): inline audio2code for cover mode, remove standalone endpoints by @ChuxiJ in #730
Fixing the issue with _unwrap_decoder function import (see #719) by @arsenylosev in #720
refactor(api): decompose model service and reinitialize routes by @1larity in #718
Exclude macOS from flash-attn wheel dependencies in nano-vllm to stop ~480MB download on macOS ARM64 by @Copilot in #734
feat: Add NVIDIA Jetson Docker support with GPU acceleration by @toolboc in #735
refactor(api): decompose sample/query/release task routes by @1larity in #725
refactor(api): decompose runtime/job helpers (API FD) by @1larity in #726
fix(generation): add timeout, progress fallback, and VRAM pre-flight … by @Uni404x64 in #671
feat: add codes for openrouter_adapter by @DumoeDss in #738
refactor(api): decompose api_server into focused modules by @1larity in #741
fix(lrc): always return intermediate tensors so LRC generation works for all task types by @ChuxiJ in #750

New Contributors

@arsenylosev made their first contribution in #720
@toolboc made their first contribution in #735
@Uni404x64 made their first contribution in #671

Full Changelog: v0.1.3...v0.1.4

Contributors

toolboc, 1larity, and 5 other contributors

27 Feb 07:50

ChuxiJ

v0.1.3

f4e5459

v0.1.3

What's Changed

Refactor (gradio part1) events wiring contracts by @1larity in #652
Refactor (gradio part2) generation wiring decomposition by @1larity in #653
fix: AutoGen 'No next batch available' when next batch is ready by @ChuxiJ in #659
Refactor (gradio part3) mode wiring decomposition by @1larity in #654
refactor(gradio-events): extract generation run and batch navigation wiring modules (PR5) by @1larity in #655
Refactor (gradio part 6) mode wiring decomposition by @1larity in #663
Refactor (gradio part 7) generation interface decomposition by @1larity in #668
fix(gradio): restore language and optional parameter interactivity by @1larity in #672
Refactor: Dynamic i18n Language Management by @orlagno in #686
Fix Gradio audio volume persistence and playback reset by @1larity in #680
Improving and updating the Hebrew language and adding features by @start-life in #694
refactor: clean up i18n discovery and CLI help by @orlagno in #692
Revert "Fix Gradio audio volume persistence and playback reset" by @ChuxiJ in #696
feat: remove prompt parse to sample_query by @ChuxiJ in #700
Fix garbled audio after switching from Remix/Repaint to Custom mode by @Copilot in #701
feat: add model init API, model inventory, and MLX CFG/APG support by @ChuxiJ in #703
fix: respect ACESTEP_INIT_LLM=false in lazy load and prioritize expli… by @ChuxiJ in #707
feat: add Streamlit-based ACE Studio UI by @pabbasian in #693
fix(gradio): reapply audio volume persistence with sane default by @1larity in #697
fix: handle MLX incompatibility in python_embeded on macOS by @ChuxiJ in #711

New Contributors

@pabbasian made their first contribution in #693

Full Changelog: v0.1.2...v0.1.3

Contributors

pabbasian, orlagno, and 3 other contributors

20 Feb 11:43

ChuxiJ

v0.1.2

ce23166

v0.1.2

What's Changed

fix(training): skip torch.compile when PEFT LoRA adapters are active by @FeelTheFonk in #640
(File permissions) Set executable permissions 755 on start_*.sh scripts for consistency. by @Red007Master in #641
fix: resolve symlinks in safe_path to prevent false rejections by @ChuxiJ in #648
feat: LoKr adapter support, LoRA status fix, and training docs update by @ChuxiJ in #649
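
The symlink fix in #648 above can be illustrated with a minimal sketch. This is not the project's actual `safe_path` implementation: the signature, argument names, and error type here are assumptions chosen for illustration. The idea is to resolve symlinks on both the base directory and the candidate path before comparing them, so that a symlinked base directory does not cause a valid path to be rejected.

```python
import os
import tempfile
from pathlib import Path


def safe_path(base_dir: str, user_path: str) -> Path:
    """Hypothetical helper: return user_path resolved under base_dir.

    Raises ValueError if the resolved path falls outside base_dir.
    """
    # Resolve symlinks on BOTH sides. If only the candidate were resolved,
    # a base directory that is itself a symlink would never match its own
    # children, producing a false rejection.
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()

    # Path.is_relative_to requires Python 3.9 or newer.
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes base directory: {user_path}")
    return candidate
```

With this approach a request such as `safe_path("/data/link_to_outputs", "song.wav")` is accepted even when the base is a symlink, while `"../secret.txt"` is still rejected.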

New Contributors

@FeelTheFonk made their first contribution in #640
@Red007Master made their first contribution in #641

Full Changelog: v0.1.1...v0.1.2

Contributors

ChuxiJ, Red007Master, and FeelTheFonk

19 Feb 07:20

ChuxiJ

v0.1.1

52e3b22

v0.1.1

What's Changed

Refactor(handler part 14): split initialize_service into focused init-service mixins and expand unit coverage by @1larity in #617
Refactor(handler part 15): extract training preset switching into dedicated mixin by @1larity in #620
Refactor(handler part 16): extract service_generate into dedicated mixin + normalize legacy encoding messages by @1larity in #621
Refactor(handler part 17): extract MLX helper mixins by @1larity in #625
Implemented new Studio UI by @goedzo in #627
Refactor(handler): decompose generate_music orchestration (part 18) by @1larity in #626
Refactor(handler part 19): final facade cleanup post 17/18 by @1larity in #628
fix: pass project_root instead of checkpoint dir to initialize_service by @ChuxiJ in #637
Add cross-platform interactive manual launch scripts by @orlagno in #636
fix: clear stale UI state on mode switch to prevent remix noise bug by @ChuxiJ in #638

New Contributors

@orlagno made their first contribution in #636

Full Changelog: v0.1.0...v0.1.1

Contributors

orlagno, 1larity, and 2 other contributors

16 Feb 14:10

ChuxiJ

v0.1.0

1b62344

v0.1.0

What's Changed

Refactor(handler part 10): decompose generate_music orchestration into reques… by @1larity in #588
Fix start scripts by @DumoeDss in #592
Refactor(handler part 11): decompose generate_music orchestration and add upl… by @1larity in #590
Chore(workflow): enforce independent PR branches with pre-push guard by @1larity in #595
feat: add lora training tutorial by @DumoeDss in #593
Add lycoris lokr load and load lora api by @sdbds in #582
Refactor(handler part 13): vae encode by @1larity in #594
Refactor(handler part 12): vae decode by @1larity in #591
refactor: relocate gradio_ui → ui/gradio and remove old shim by @ChuxiJ in #599
refactor: move mlx_dit and mlx_vae into models/mlx by @ChuxiJ in #602
refactor: consolidate scoring modules into acestep/core/scoring/ by @ChuxiJ in #603
docs: link LoRA Training Tutorials and Side-Step advanced training docs by @ChuxiJ in #605
Docs/link lora training tutorials by @ChuxiJ in #607
fix: manifest path double-nesting causes training to find no samples by @ChuxiJ in #611

New Contributors

@sdbds made their first contribution in #582

Full Changelog: v0.1.0-rc.1...v0.1.0

Contributors

1larity, sdbds, and 2 other contributors

15 Feb 12:36

ChuxiJ

v0.1.0-rc.1

3dbf44e

v0.1.0-rc.1 Pre-release

What's Changed

fix: remix strength info by @ChuxiJ in #546
fix: add tiled VAE encoding and CPU offloading to training preprocessing by @ChuxiJ in #547
Add Opus and AAC audio output formats by @Copilot in #554
Add .env support to launcher scripts to preserve user customizations across updates by @Copilot in #552
Relax Python constraint to support ROCm 7.2 Windows (requires 3.12) by @Copilot in #553
feat: Side-Step -- corrected LoRA/LoKR fine-tuning with interactive wizard by @koda-dernet in #557
Include LoRA state in audio file UUID generation by @Copilot in #556
Refactor(handler part 8): extract lyric alignment timestamp/score mixins by @1larity in #565
Refactor(handler part 9): service generate by @1larity in #568
fix: escape Rich markup in user-provided paths to prevent MarkupError by @koda-dernet in #573
i18n: add missing training/export translations for he, ja, zh by @jayvenn21 in #574
Make batch size configurable and persistent across UI actions by @Copilot in #567
Add batch size cap validation in service_generate by @Copilot in #570
Enable multi-LoRA stacking with per-adapter scaling (addresses #338) by @jayvenn21 in #571
(fix):DGX Spark dependencies by @tonyjohnvan in #575
Support meta auto by @ChuxiJ in #580
fix: differentiate LoRA adapters in UUID and load LM codes on JSON im… by @ChuxiJ in #583
fix: prevent stale src_audio from leaking into text2music (Custom) mode by @ChuxiJ in #584
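
Entries #556 and #583 above mention including LoRA state in the audio file UUID. A minimal sketch of that idea follows, assuming the UUID is derived deterministically from the generation parameters. The function name, the dictionary layout, and the choice of `uuid.uuid5` are all assumptions for illustration, not the project's actual code.

```python
import json
import uuid


def audio_uuid(params: dict, lora_state: dict) -> str:
    """Hypothetical helper: derive a stable UUID from params and LoRA state."""
    # Serialize with sorted keys so that dict ordering does not change the ID.
    payload = json.dumps({"params": params, "lora": lora_state}, sort_keys=True)
    # uuid5 is a name-based (SHA-1) UUID: identical input gives identical output.
    return str(uuid.uuid5(uuid.NAMESPACE_URL, payload))


base_params = {"prompt": "lofi piano", "seed": 42}
id_a = audio_uuid(base_params, {"adapter": "style_a", "scale": 1.0})
id_b = audio_uuid(base_params, {"adapter": "style_b", "scale": 1.0})
id_a_again = audio_uuid(base_params, {"adapter": "style_a", "scale": 1.0})
```

Because the adapter name and scale are part of the hashed payload, two generations that differ only in their LoRA adapter receive different identifiers, while repeating the same configuration reproduces the same one.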

Full Changelog: v0.1.0-beta.3...v0.1.0-rc.1

Contributors

tonyjohnvan, 1larity, and 3 other contributors

14 Feb 09:23

ChuxiJ

v0.1.0-beta.3

6db4465

v0.1.0-beta.3 Pre-release

What's Changed

docs: add AGENTS.md guardrails for AI-assisted contributions by @1larity in #475
Fix: {TRACK_NAME} not replaced by actual track name in API. by @goedzo in #474
fix: seeds param by @DumoeDss in #496
refactor(handler part 4): extract diffusion mixin and add validation tests by @1larity in #490
Fix repaint mode not populating lyrics from generated audio metadata by @Copilot in #489
[Fix] Eliminate redundant LLM load/offload cycles in PT batch mode by @agorevski in #492
Fix LoRA memory bloat: replace deepcopy with CPU state_dict backup by @Copilot in #499
fix: phase-aware max_new_tokens to fix misleading progress bar stopping early by @agorevski in #493
Add CodeQL analysis workflow configuration by @schneidergithub in #502
Use CPU for LM by @Jay4242 in #497
Add minimal Copilot instructions for repository by @Copilot in #506
Feat update skills simplemv by @DumoeDss in #500
Fix YAML indentation in CodeQL workflow configuration by @Copilot in #503
refactor(handler part 5+6 ): Extend decomposition and harden source-audio analyze validation by @1larity in #508
fix(handler): address post-merge review feedback in io/prompt/task/padding mixins by @1larity in #510
Potential fix for code scanning alert no. 35: Uncontrolled data used in path expression by @schneidergithub in #507
fix: improve error handling in training and generation handlers by @ChuxiJ in #513
Doc only: Give coderabbit explicit guidance on module loc limits. by @1larity in #512
Demote vllm→PyTorch backend auto-selection from warning to info by @Copilot in #511
feat: add dgx spark support by @NicasioSirvent in #485
support cover (exp) by @ChuxiJ in #479
feat: add Side-Step training v2 (corrected LoRA fine-tuning) by @koda-dernet in #478
Revert "feat: add dgx spark support" by @ChuxiJ in #514
Skip torch.distributed initialization in single-GPU mode by @Copilot in #477
Fix CodeQL alert #59: Sanitize checkpoint paths to prevent deserialization attacks by @Copilot in #525
Streamline path traversal security fix: remove verbose logging and defensive checks by @Copilot in #526
fix for intel gpu support by @xushengyuan in #515
Refactor(handler part 7): continue FD extraction for batch conditioning and embedding pipeline by @1larity in #516
fix(training): guard pin_memory_device when None to prevent DataLoader crash by @jayvenn21 in #517
(Feat)(mlx): Add dit progress bar and Fix the final-step `break` by @tonyjohnvan in #519
Potential fix for code scanning alert no. 59: Deserialization of user-controlled data by @schneidergithub in #524
Support local LoKr safetensors in adapter loader by @riversedge in #530
(feat)(api)Enhance audio result data structure to include per-audio metadata by @tonyjohnvan in #531
Address review feedback: remove redundant path validations and tighten exception handling by @Copilot in #532
Potential fix for code scanning alert no. 59: Deserialization of user-controlled data by @schneidergithub in #529
Polishing tooltips in Gradio to de-clutter the GUI by @pedro16797 in #527
Refactor tooltip implementation in Gradio to improve GUI clarity and … by @ChuxiJ in #534
fix: vocal language ui by @ChuxiJ in #535
(feat)(mlx): Supports MLX Pre-compile model for DiT by @tonyjohnvan in #522
feat: enhances the Extract mode by @ChuxiJ in #536
fix: only hide optional outputs in Extract mode, not all modes by @ChuxiJ in #537
fix: Lego mode UI redesign and Extract/Lego mode switch cleanup by @DumoeDss in #538
feat: add inline help buttons with modal documentation to Gradio UI by @ChuxiJ in #542
Fix nanovllm CUBLAS error on Turing GPUs by detecting bfloat16 support by @Copilot in #541

New Contributors

@schneidergithub made their first contribution in #502
@Jay4242 made their first contribution in #497
@NicasioSirvent made their first contribution in #485
@koda-dernet made their first contribution in #478
@pedro16797 made their first contribution in #527

Full Changelog: v0.1.0-beta.2...v0.1.0-beta.3

Contributors

agorevski, tonyjohnvan, and 12 other contributors

12 Feb 08:44

ChuxiJ

v0.1.0-beta.2

46116a6

v0.1.0-beta.2 Pre-release

What's Changed

fix: guard vLLM and CUDA graph usage on 16GB GPUs to prevent OOM by @jayvenn21 in #173
fix: pin torchao version to >=0.14.1,<0.16.0 by @ChuxiJ in #440
feat(mlx): Native MLX backend for DiT diffusion on Apple Silicon (2-3x speedup) by @tonyjohnvan in #439
feat(training): auto-load .caption.txt and .lyrics.txt files for LoRA datasets by @jayvenn21 in #438
feat(training): add validation-aware graph + best-checkpoint tracking to LoRA training by @jayvenn21 in #437
Add torch compile logic to LoRA trainer by @fidel1234xdd in #422
Fixed the Edited Draft Injection by @chigkim in #389
fix(gradio): refresh audio results and reset batch navigation on new generation by @jayvenn21 in #178
fix(gradio): initialize UI controls correctly for SFT models in servi… by @jayvenn21 in #207
Enhance MPS LoRA training by enabling gradient checkpointing in DiT training path by @riversedge in #401
(fix)Add CORS support for direct open browser-based frontends (studio.html) by @tonyjohnvan in #404
fix: auto-detect Mac/MPS and apply optimal configuration by @ChuxiJ in #443
refactor: streamline default backend configuration for device compati… by @ChuxiJ in #444
enhance: implement caching for VAE audio encoding to improve efficiency by @ChuxiJ in #445
refactor: optimize audio encoding logic with caching enhancements by @ChuxiJ in #446
Normalize audio against clipping and 32bit wav support by @lutzkirschner64-dot in #406
fix: API 400 - absolute audio file paths are not allowed for temp folder by @goedzo in #447
refactor: improve audio encoding logic with enhanced caching by @ChuxiJ in #448
refact: Reorganized Advanced Settings UI & Added latent shift and rescale by @ChuxiJ in #452
(feat)(mlx): Native MLX VAE acceleration for Apple Silicon by @tonyjohnvan in #459
fix(repaint): pass lyrics conditioning to repaint generation pipeline by @jayvenn21 in #461
Fix #455: Restore missing training keys in en.json by @lutzkirschner64-dot in #463
Refactor(handler part 2): Extract init utility mixin and add focused init-service tests by @1larity in #456
Refactor(handler part 3): Feat/handler init service mixin v2 by @1larity in #464
fix: reset audio playback position on regenerate & add GPU tier i18n … by @ChuxiJ in #467
[Bugfix] Only use FlashAttention if GPU supports it by @agorevski in #469
Refine Gradio UI by @ChuxiJ in #468
Make torchcodec optional for ROCM and Intel XPU platforms by @Copilot in #465
Add AdamW 8 bit by @fidel1234xdd in #449
feature: support lokr by @ChuxiJ in #471
i18n-zh-fix by @ChuxiJ in #473

New Contributors

@fidel1234xdd made their first contribution in #422
@lutzkirschner64-dot made their first contribution in #406
@agorevski made their first contribution in #469

Full Changelog: v0.1.0-beta.1...v0.1.0-beta.2

Contributors

agorevski, tonyjohnvan, and 8 other contributors

11 Feb 03:27

ChuxiJ

v0.1.0-beta.1

d1090f5

v0.1.0-beta.1 Pre-release

What's Changed

Refact add inference by @ChuxiJ in #1
feat ✨ : add lyrics alignment scores by @keylxiao in #2
Fix lrc bugs by @ChuxiJ in #3
cover/repaint test ok by @ChuxiJ in #4
refact understand_music by @ChuxiJ in #5
test ok by @ChuxiJ in #6
Add rewrite lyrics by @ChuxiJ in #7
support input timesteps by @ChuxiJ in #8
max_model_len 8192 -> 4096 by @ChuxiJ in #9
reduce vRAM usage for vae decode by @ChuxiJ in #10
fix: fix device mismatch error in offload mode by @DumoeDss in #11
fix language issue simple mode by @ChuxiJ in #12
support lora training & inter by @ChuxiJ in #13
fix cover logic for api_server by @ChuxiJ in #14
add docs and serve mode by @ChuxiJ in #15
add docs and readme by @ChuxiJ in #16
add model zoo by @ChuxiJ in #17
fix audio play position reset by @ChuxiJ in #18
update hf demo by @ChuxiJ in #19
fix duration str bug by @ChuxiJ in #20
fix vllm bug by @ChuxiJ in #21
Update abstract with specific performance metrics and unified messaging by @Copilot in #22
Fix author names in BibTeX citation by @ChuxiJ in #23
Verify author name order in BibTeX citation by @Copilot in #24
fix mac nv install by @ChuxiJ in #25
feat: update skills by @DumoeDss in #26
improve lora training by @ChuxiJ in #28
fix lora reload by @ChuxiJ in #29
add model downloader by @ChuxiJ in #30
add model downloader by @ChuxiJ in #31
openrouter compatible by @seaniezhao in #32
fix duration by @ChuxiJ in #33
fix cfg kv block allocate by @ChuxiJ in #34
fix_tiled_vae_encode_bug by @ChuxiJ in #35
fix refer audio shape by @ChuxiJ in #36
fix 4 vram by @ChuxiJ in #38
tutorial by @ChuxiJ in #39
Fix TypeError: AceStepConditionGenerationModel does not support len() by @saltchicken in #53
Fix runtime crash with integer seeds and correct type hints by @iamgrootns in #48
Fix torch.compile crash: add len to dynamically loaded models by @Copilot in #47
Update Readme.md by @Saganaki22 in #62
fix missing torchao by @ChuxiJ in #65
Update release status for acestep-5Hz-lm-4B by @ChuxiJ in #71
fix: reset linux torch version for nano vllm by @DumoeDss in #72
fix max audio code id by @ChuxiJ in #78
Revert "fix max audio code id" by @ChuxiJ in #81
add support for intel gpu, tested on U9 285H, with and without 5hz llm by @xushengyuan in #88
Fix max audio code by @ChuxiJ in #103
Fix AttributeError when llm_handler is None by @fspecii in #109
Fix broken HuggingFace link in README by @oumaklaus in #118
Optimize env installation by @DumoeDss in #128
Fix a Windows error that was unexpected at this time. (Fixes #134) by @start-life in #137
fix: bat error by @DumoeDss in #148
feat: Add the enforced use of lm models by @DumoeDss in #152
Add tensorboard to training dependencies for LoRA by @tlennon-ie in #155
fix: Windows path compatibility and UI freeze issues (Issue #113) by @DumoeDss in #159
docs: add Korean translation for core documentation by @acidsound in #187
feat(i18n): add translation support to LoRA training UI by @jayvenn21 in #205
Adding Hebrew language by @start-life in #201
docs: clarify ROCm / AMD usage and uv behavior by @jayvenn21 in #179
fix: avoid epoch-boundary stalls during LoRA training on Windows by @jayvenn21 in #177
fix path traversal vulnerability in audio endpoint by @Albab-Hasan in #161
fix: guard against missing reference audio in text2music by @jayvenn21 in #171
docs: add responsible disclosure security policy by @jayvenn21 in #169
fix: disable CUDA graph capture for 5Hz LM during LoRA training by @jayvenn21 in #172
CLI by @chigkim in #94
fix: add tolerance to 16GB VRAM detection by @jayvenn21 in #175
fix: Wrap audio paths in FileData for Gradio 6.x compatibility by @DemetrionWare in #181
feat: expose denoise control for audio-conditioned generation in Gradio by @jayvenn21 in #176
Extended API to allow Analysis only without generating new music. by @goedzo in #197
Fix Gradio Audio component error by passing file paths directly by @Copilot in #219
Fix MPS tiled_decode conv1d output length limit on macOS 14.7 by @Copilot in #222
Fix 4B model support for 16GB GPUs by increasing VRAM detection tolerance by @Copilot in #221
fix: use per-socket timeout instead of global setdefaulttimeout by @Albab-Hasan in #226
Fix bugs and add debuglog by @ChuxiJ in #230
Add musician-friendly guide to documentation by @sigalarm in #238
Position Audio time labels lower to avoid scrollbar overlap by @rkfg in #234
Feat/progress UI: Prevent Gradio slider math error when no audio files are found by @1larity in #194
Include torchcodec on MacOS to read audio files. by @chigkim in #243
docs: clarify AMD / ROCm usage and uv behavior by @jayvenn21 in #244
fix(song): ensure lyrics / vocal conditioning is preserved during song generation by @jayvenn21 in #241
fix: resolve intermittent CUDA assertion error in concurrent serving … by @ChuxiJ in #257
fix: remove nano vllm install in toml by @ChuxiJ in #260
feat(ui): add experimental Studio UI for REST API by @jayvenn21 in #258
feat(mps): Comprehensive Apple Silicon MPS backend support by @tonyjohnvan in #256
fix(mps): Address memory management and performance issues for Apple … by @ChuxiJ in #261
feat: Add AMD ROCm support for Windows (RX 7000/6000 series) by @clowerweb in #210
[WIP] Fix uv sync regression with CUDA version mismatches by @Copilot in #268
fix: add anti-clipping normalization to prevent audio clipping/overload by @ChuxiJ in #276
fix: validate training-incompatible settings before starting LoRA training by @ChuxiJ in #277
Fix check_update.bat: handle line ending changes and remove pager blocking by @Copilot in #273
Add manual for rocm Linux (Cachy-OS) and additional requirements file by @Neresco in https://github.com/ac...