Releases: ace-step/ACE-Step-1.5

v0.1.4

02 Mar 09:53

What's Changed

  • docs: add Awesome ACE-Step link to README by @ChuxiJ in #715
  • Fix auto-labelling crash when "auto" device is selected by @Copilot in #714
  • refactor(gradio): decompose LLM actions and harden UI test isolation by @1larity in #710
  • Automated clean up of issues by @schneidergithub in #723
  • refactor(gradio): decompose batch management with facade by @1larity in #716
  • feat(openrouter): add sample/audio2code endpoints and expand generation params by @ChuxiJ in #729
  • refactor(openrouter): inline audio2code for cover mode, remove standalone endpoints by @ChuxiJ in #730
  • Fixing the issue with _unwrap_decoder function import (see #719) by @arsenylosev in #720
  • refactor(api): decompose model service and reinitialize routes by @1larity in #718
  • Exclude macOS from flash-attn wheel dependencies in nano-vllm to stop ~480MB download on macOS ARM64 by @Copilot in #734
  • feat: Add NVIDIA Jetson Docker support with GPU acceleration by @toolboc in #735
  • refactor(api): decompose sample/query/release task routes by @1larity in #725
  • refactor(api): decompose runtime/job helpers (API FD) by @1larity in #726
  • fix(generation): add timeout, progress fallback, and VRAM pre-flight … by @Uni404x64 in #671
  • feat: add codes for openrouter_adapter by @DumoeDss in #738
  • refactor(api): decompose api_server into focused modules by @1larity in #741
  • fix(lrc): always return intermediate tensors so LRC generation works for all task types by @ChuxiJ in #750

Full Changelog: v0.1.3...v0.1.4

v0.1.3

27 Feb 07:50

What's Changed

  • Refactor (gradio part1) events wiring contracts by @1larity in #652
  • Refactor (gradio part2) generation wiring decomposition by @1larity in #653
  • fix: AutoGen 'No next batch available' when next batch is ready by @ChuxiJ in #659
  • Refactor (gradio part3) mode wiring decomposition by @1larity in #654
  • refactor(gradio-events): extract generation run and batch navigation wiring modules (PR5) by @1larity in #655
  • Refactor (gradio part 6) mode wiring decomposition by @1larity in #663
  • Refactor (gradio part 7) generation interface decomposition by @1larity in #668
  • fix(gradio): restore language and optional parameter interactivity by @1larity in #672
  • Refactor: Dynamic i18n Language Management by @orlagno in #686
  • Fix Gradio audio volume persistence and playback reset by @1larity in #680
  • Improving and updating the Hebrew language and adding features by @start-life in #694
  • refactor: clean up i18n discovery and CLI help by @orlagno in #692
  • Revert "Fix Gradio audio volume persistence and playback reset" by @ChuxiJ in #696
  • feat: remove prompt parse to sample_query by @ChuxiJ in #700
  • Fix garbled audio after switching from Remix/Repaint to Custom mode by @Copilot in #701
  • feat: add model init API, model inventory, and MLX CFG/APG support by @ChuxiJ in #703
  • fix: respect ACESTEP_INIT_LLM=false in lazy load and prioritize expli… by @ChuxiJ in #707
  • feat: add Streamlit-based ACE Studio UI by @pabbasian in #693
  • fix(gradio): reapply audio volume persistence with sane default by @1larity in #697
  • fix: handle MLX incompatibility in python_embeded on macOS by @ChuxiJ in #711

Full Changelog: v0.1.2...v0.1.3

v0.1.2

20 Feb 11:43

What's Changed

  • fix(training): skip torch.compile when PEFT LoRA adapters are active by @FeelTheFonk in #640
  • (File permissions) Set executable permissions 755 on start_*.sh scripts for consistency. by @Red007Master in #641
  • fix: resolve symlinks in safe_path to prevent false rejections by @ChuxiJ in #648
  • feat: LoKr adapter support, LoRA status fix, and training docs update by @ChuxiJ in #649

Full Changelog: v0.1.1...v0.1.2

v0.1.1

19 Feb 07:20

What's Changed

  • Refactor(handler part 14): split initialize_service into focused init-service mixins and expand unit coverage by @1larity in #617
  • Refactor(handler part 15): extract training preset switching into dedicated mixin by @1larity in #620
  • Refactor(handler part 16): extract service_generate into dedicated mixin + normalize legacy encoding messages by @1larity in #621
  • Refactor(handler part 17): extract MLX helper mixins by @1larity in #625
  • Implemented new Studio UI by @goedzo in #627
  • Refactor(handler): decompose generate_music orchestration (part 18) by @1larity in #626
  • Refactor(handler part 19): final facade cleanup post 17/18 by @1larity in #628
  • fix: pass project_root instead of checkpoint dir to initialize_service by @ChuxiJ in #637
  • Add cross-platform interactive manual launch scripts by @orlagno in #636
  • fix: clear stale UI state on mode switch to prevent remix noise bug by @ChuxiJ in #638

Full Changelog: v0.1.0...v0.1.1

v0.1.0

16 Feb 14:10

What's Changed

  • Refactor(handler part 10): decompose generate_music orchestration into reques… by @1larity in #588
  • Fix start scripts by @DumoeDss in #592
  • Refactor(handler part 11): decompose generate_music orchestration and add upl… by @1larity in #590
  • Chore(workflow): enforce independent PR branches with pre-push guard by @1larity in #595
  • feat: add lora training tutorial by @DumoeDss in #593
  • Add lycoris lokr load and load lora api by @sdbds in #582
  • Refactor(handler part 13): vae encode by @1larity in #594
  • Refactor(handler part 12): vae decode by @1larity in #591
  • refactor: relocate gradio_ui → ui/gradio and remove old shim by @ChuxiJ in #599
  • refactor: move mlx_dit and mlx_vae into models/mlx by @ChuxiJ in #602
  • refactor: consolidate scoring modules into acestep/core/scoring/ by @ChuxiJ in #603
  • docs: link LoRA Training Tutorials and Side-Step advanced training docs by @ChuxiJ in #605
  • Docs/link lora training tutorials by @ChuxiJ in #607
  • fix: manifest path double-nesting causes training to find no samples by @ChuxiJ in #611

Full Changelog: v0.1.0-rc.1...v0.1.0

v0.1.0-rc.1

15 Feb 12:36
3dbf44e

Pre-release

What's Changed

  • fix: remix strength info by @ChuxiJ in #546
  • fix: add tiled VAE encoding and CPU offloading to training preprocessing by @ChuxiJ in #547
  • Add Opus and AAC audio output formats by @Copilot in #554
  • Add .env support to launcher scripts to preserve user customizations across updates by @Copilot in #552
  • Relax Python constraint to support ROCm 7.2 Windows (requires 3.12) by @Copilot in #553
  • feat: Side-Step -- corrected LoRA/LoKR fine-tuning with interactive wizard by @koda-dernet in #557
  • Include LoRA state in audio file UUID generation by @Copilot in #556
  • Refactor(handler part 8): extract lyric alignment timestamp/score mixins by @1larity in #565
  • Refactor(handler part 9): service generate by @1larity in #568
  • fix: escape Rich markup in user-provided paths to prevent MarkupError by @koda-dernet in #573
  • i18n: add missing training/export translations for he, ja, zh by @jayvenn21 in #574
  • Make batch size configurable and persistent across UI actions by @Copilot in #567
  • Add batch size cap validation in service_generate by @Copilot in #570
  • Enable multi-LoRA stacking with per-adapter scaling (addresses #338) by @jayvenn21 in #571
  • (fix):DGX Spark dependencies by @tonyjohnvan in #575
  • Support meta auto by @ChuxiJ in #580
  • fix: differentiate LoRA adapters in UUID and load LM codes on JSON im… by @ChuxiJ in #583
  • fix: prevent stale src_audio from leaking into text2music (Custom) mode by @ChuxiJ in #584

Full Changelog: v0.1.0-beta.3...v0.1.0-rc.1

v0.1.0-beta.3

14 Feb 09:23
6db4465

Pre-release

What's Changed

  • docs: add AGENTS.md guardrails for AI-assisted contributions by @1larity in #475
  • Fix: {TRACK_NAME} not replaced by actual track name in API. by @goedzo in #474
  • fix: seeds param by @DumoeDss in #496
  • refactor(handler part 4): extract diffusion mixin and add validation tests by @1larity in #490
  • Fix repaint mode not populating lyrics from generated audio metadata by @Copilot in #489
  • [Fix] Eliminate redundant LLM load/offload cycles in PT batch mode by @agorevski in #492
  • Fix LoRA memory bloat: replace deepcopy with CPU state_dict backup by @Copilot in #499
  • fix: phase-aware max_new_tokens to fix misleading progress bar stopping early by @agorevski in #493
  • Add CodeQL analysis workflow configuration by @schneidergithub in #502
  • Use CPU for LM by @Jay4242 in #497
  • Add minimal Copilot instructions for repository by @Copilot in #506
  • Feat update skills simplemv by @DumoeDss in #500
  • Fix YAML indentation in CodeQL workflow configuration by @Copilot in #503
  • refactor(handler part 5+6 ): Extend decomposition and harden source-audio analyze validation by @1larity in #508
  • fix(handler): address post-merge review feedback in io/prompt/task/padding mixins by @1larity in #510
  • Potential fix for code scanning alert no. 35: Uncontrolled data used in path expression by @schneidergithub in #507
  • fix: improve error handling in training and generation handlers by @ChuxiJ in #513
  • Doc only: Give coderabbit explicit guidance on module loc limits. by @1larity in #512
  • Demote vllm→PyTorch backend auto-selection from warning to info by @Copilot in #511
  • feat: add dgx spark support by @NicasioSirvent in #485
  • support cover (exp) by @ChuxiJ in #479
  • feat: add Side-Step training v2 (corrected LoRA fine-tuning) by @koda-dernet in #478
  • Revert "feat: add dgx spark support" by @ChuxiJ in #514
  • Skip torch.distributed initialization in single-GPU mode by @Copilot in #477
  • Fix CodeQL alert #59: Sanitize checkpoint paths to prevent deserialization attacks by @Copilot in #525
  • Streamline path traversal security fix: remove verbose logging and defensive checks by @Copilot in #526
  • fix for intel gpu support by @xushengyuan in #515
  • Refactor(handler part 7): continue FD extraction for batch conditioning and embedding pipeline by @1larity in #516
  • fix(training): guard pin_memory_device when None to prevent DataLoader crash by @jayvenn21 in #517
  • (Feat)(mlx): Add dit progress bar and Fix the final-step `break` by @tonyjohnvan in #519
  • Potential fix for code scanning alert no. 59: Deserialization of user-controlled data by @schneidergithub in #524
  • Support local LoKr safetensors in adapter loader by @riversedge in #530
  • (feat)(api)Enhance audio result data structure to include per-audio metadata by @tonyjohnvan in #531
  • Address review feedback: remove redundant path validations and tighten exception handling by @Copilot in #532
  • Potential fix for code scanning alert no. 59: Deserialization of user-controlled data by @schneidergithub in #529
  • Polishing tooltips in Gradio to de-clutter the GUI by @pedro16797 in #527
  • Refactor tooltip implementation in Gradio to improve GUI clarity and … by @ChuxiJ in #534
  • fix: vocal language ui by @ChuxiJ in #535
  • (feat)(mlx): Supports MLX Pre-compile model for DiT by @tonyjohnvan in #522
  • feat: enhances the Extract mode by @ChuxiJ in #536
  • fix: only hide optional outputs in Extract mode, not all modes by @ChuxiJ in #537
  • fix: Lego mode UI redesign and Extract/Lego mode switch cleanup by @DumoeDss in #538
  • feat: add inline help buttons with modal documentation to Gradio UI by @ChuxiJ in #542
  • Fix nanovllm CUBLAS error on Turing GPUs by detecting bfloat16 support by @Copilot in #541

Full Changelog: v0.1.0-beta.2...v0.1.0-beta.3

v0.1.0-beta.2

12 Feb 08:44
46116a6

Pre-release

What's Changed

  • fix: guard vLLM and CUDA graph usage on 16GB GPUs to prevent OOM by @jayvenn21 in #173
  • fix: pin torchao version to >=0.14.1,<0.16.0 by @ChuxiJ in #440
  • feat(mlx): Native MLX backend for DiT diffusion on Apple Silicon (2-3x speedup) by @tonyjohnvan in #439
  • feat(training): auto-load .caption.txt and .lyrics.txt files for LoRA datasets by @jayvenn21 in #438
  • feat(training): add validation-aware graph + best-checkpoint tracking to LoRA training by @jayvenn21 in #437
  • Add torch compile logic to LoRA trainer by @fidel1234xdd in #422
  • Fixed the Edited Draft Injection by @chigkim in #389
  • fix(gradio): refresh audio results and reset batch navigation on new generation by @jayvenn21 in #178
  • fix(gradio): initialize UI controls correctly for SFT models in servi… by @jayvenn21 in #207
  • Enhance MPS LoRA training by enabling gradient checkpointing in DiT training path by @riversedge in #401
  • (fix)Add CORS support for direct open browser-based frontends (studio.html) by @tonyjohnvan in #404
  • fix: auto-detect Mac/MPS and apply optimal configuration by @ChuxiJ in #443
  • refactor: streamline default backend configuration for device compati… by @ChuxiJ in #444
  • enhance: implement caching for VAE audio encoding to improve efficiency by @ChuxiJ in #445
  • refactor: optimize audio encoding logic with caching enhancements by @ChuxiJ in #446
  • Normalize audio against clipping and 32bit wav support by @lutzkirschner64-dot in #406
  • fix: API 400 - absolute audio file paths are not allowed for temp folder by @goedzo in #447
  • refactor: improve audio encoding logic with enhanced caching by @ChuxiJ in #448
  • refact: Reorganized Advanced Settings UI & Added latent shift and rescale by @ChuxiJ in #452
  • (feat)(mlx): Native MLX VAE acceleration for Apple Silicon by @tonyjohnvan in #459
  • fix(repaint): pass lyrics conditioning to repaint generation pipeline by @jayvenn21 in #461
  • Fix #455: Restore missing training keys in en.json by @lutzkirschner64-dot in #463
  • Refactor(handler part 2): Extract init utility mixin and add focused init-service tests by @1larity in #456
  • Refactor(handler part 3): Feat/handler init service mixin v2 by @1larity in #464
  • fix: reset audio playback position on regenerate & add GPU tier i18n … by @ChuxiJ in #467
  • [Bugfix] Only use FlashAttention if GPU supports it by @agorevski in #469
  • Refine Gradio UI by @ChuxiJ in #468
  • Make torchcodec optional for ROCM and Intel XPU platforms by @Copilot in #465
  • Add AdamW 8 bit by @fidel1234xdd in #449
  • feature: support lokr by @ChuxiJ in #471
  • i18n-zh-fix by @ChuxiJ in #473

Full Changelog: v0.1.0-beta.1...v0.1.0-beta.2

v0.1.0-beta.1

11 Feb 03:27
d1090f5

Pre-release

What's Changed

  • Refact add inference by @ChuxiJ in #1
  • feat ✨ : add lyrics alignment scores by @keylxiao in #2
  • Fix lrc bugs by @ChuxiJ in #3
  • cover/repaint test ok by @ChuxiJ in #4
  • refact understand_music by @ChuxiJ in #5
  • test ok by @ChuxiJ in #6
  • Add rewrite lyrics by @ChuxiJ in #7
  • support input timesteps by @ChuxiJ in #8
  • max_model_len 8192 -> 4096 by @ChuxiJ in #9
  • reduce vRAM usage for vae decode by @ChuxiJ in #10
  • fix: resolve device mismatch error in offload mode by @DumoeDss in #11
  • fix language issue simple mode by @ChuxiJ in #12
  • support lora training & inter by @ChuxiJ in #13
  • fix cover logic for api_server by @ChuxiJ in #14
  • add docs and serve mode by @ChuxiJ in #15
  • add docs and readme by @ChuxiJ in #16
  • add model zoo by @ChuxiJ in #17
  • fix audio play position reset by @ChuxiJ in #18
  • update hf demo by @ChuxiJ in #19
  • fix duration str bug by @ChuxiJ in #20
  • fix vllm bug by @ChuxiJ in #21
  • Update abstract with specific performance metrics and unified messaging by @Copilot in #22
  • Fix author names in BibTeX citation by @ChuxiJ in #23
  • Verify author name order in BibTeX citation by @Copilot in #24
  • fix mac nv install by @ChuxiJ in #25
  • feat: update skills by @DumoeDss in #26
  • improve lora training by @ChuxiJ in #28
  • fix lora reload by @ChuxiJ in #29
  • add model downloader by @ChuxiJ in #30
  • add model downloader by @ChuxiJ in #31
  • openrouter compatible by @seaniezhao in #32
  • fix duration by @ChuxiJ in #33
  • fix cfg kv block allocate by @ChuxiJ in #34
  • fix_tiled_vae_encode_bug by @ChuxiJ in #35
  • fix refer audio shape by @ChuxiJ in #36
  • fix 4 vram by @ChuxiJ in #38
  • tutorial by @ChuxiJ in #39
  • Fix TypeError: AceStepConditionGenerationModel does not support len() by @saltchicken in #53
  • Fix runtime crash with integer seeds and correct type hints by @iamgrootns in #48
  • Fix torch.compile crash: add len to dynamically loaded models by @Copilot in #47
  • Update Readme.md by @Saganaki22 in #62
  • fix missing torchao by @ChuxiJ in #65
  • Update release status for acestep-5Hz-lm-4B by @ChuxiJ in #71
  • fix: reset linux torch version for nano vllm by @DumoeDss in #72
  • fix max audio code id by @ChuxiJ in #78
  • Revert "fix max audio code id" by @ChuxiJ in #81
  • add support for intel gpu, tested on U9 285H, with and without 5hz llm by @xushengyuan in #88
  • Fix max audio code by @ChuxiJ in #103
  • Fix AttributeError when llm_handler is None by @fspecii in #109
  • Fix broken HuggingFace link in README by @oumaklaus in #118
  • Optimize env installation by @DumoeDss in #128
  • Fix a Windows error that was unexpected at this time. (Fixes #134) by @start-life in #137
  • fix: bat error by @DumoeDss in #148
  • feat: Add the enforced use of lm models by @DumoeDss in #152
  • Add tensorboard to training dependencies for LoRA by @tlennon-ie in #155
  • fix: Windows path compatibility and UI freeze issues (Issue #113) by @DumoeDss in #159
  • docs: add Korean translation for core documentation by @acidsound in #187
  • feat(i18n): add translation support to LoRA training UI by @jayvenn21 in #205
  • Adding Hebrew language by @start-life in #201
  • docs: clarify ROCm / AMD usage and uv behavior by @jayvenn21 in #179
  • fix: avoid epoch-boundary stalls during LoRA training on Windows by @jayvenn21 in #177
  • fix path traversal vulnerability in audio endpoint by @Albab-Hasan in #161
  • fix: guard against missing reference audio in text2music by @jayvenn21 in #171
  • docs: add responsible disclosure security policy by @jayvenn21 in #169
  • fix: disable CUDA graph capture for 5Hz LM during LoRA training by @jayvenn21 in #172
  • CLI by @chigkim in #94
  • fix: add tolerance to 16GB VRAM detection by @jayvenn21 in #175
  • fix: Wrap audio paths in FileData for Gradio 6.x compatibility by @DemetrionWare in #181
  • feat: expose denoise control for audio-conditioned generation in Gradio by @jayvenn21 in #176
  • Extended API to allow Analysis only without generating new music. by @goedzo in #197
  • Fix Gradio Audio component error by passing file paths directly by @Copilot in #219
  • Fix MPS tiled_decode conv1d output length limit on macOS 14.7 by @Copilot in #222
  • Fix 4B model support for 16GB GPUs by increasing VRAM detection tolerance by @Copilot in #221
  • fix: use per-socket timeout instead of global setdefaulttimeout by @Albab-Hasan in #226
  • Fix bugs and add debuglog by @ChuxiJ in #230
  • Add musician-friendly guide to documentation by @sigalarm in #238
  • Position Audio time labels lower to avoid scrollbar overlap by @rkfg in #234
  • Feat/progress UI: Prevent Gradio slider math error when no audio files are found by @1larity in #194
  • Include torchcodec on MacOS to read audio files. by @chigkim in #243
  • docs: clarify AMD / ROCm usage and uv behavior by @jayvenn21 in #244
  • fix(song): ensure lyrics / vocal conditioning is preserved during song generation by @jayvenn21 in #241
  • fix: resolve intermittent CUDA assertion error in concurrent serving … by @ChuxiJ in #257
  • fix: remove nano vllm install in toml by @ChuxiJ in #260
  • feat(ui): add experimental Studio UI for REST API by @jayvenn21 in #258
  • feat(mps): Comprehensive Apple Silicon MPS backend support by @tonyjohnvan in #256
  • fix(mps): Address memory management and performance issues for Apple … by @ChuxiJ in #261
  • feat: Add AMD ROCm support for Windows (RX 7000/6000 series) by @clowerweb in #210
  • [WIP] Fix uv sync regression with CUDA version mismatches by @Copilot in #268
  • fix: add anti-clipping normalization to prevent audio clipping/overload by @ChuxiJ in #276
  • fix: validate training-incompatible settings before starting LoRA training by @ChuxiJ in #277
  • Fix check_update.bat: handle line ending changes and remove pager blocking by @Copilot in #273
  • Add manual for rocm Linux (Cachy-OS) and additional requirements file by @Neresco in https://github.com/ac...