Releases: ace-step/ACE-Step-1.5
v0.1.4
What's Changed
- docs: add Awesome ACE-Step link to README by @ChuxiJ in #715
- Fix auto-labelling crash when "auto" device is selected by @Copilot in #714
- refactor(gradio): decompose LLM actions and harden UI test isolation by @1larity in #710
- Automated clean up of issues by @schneidergithub in #723
- refactor(gradio): decompose batch management with facade by @1larity in #716
- feat(openrouter): add sample/audio2code endpoints and expand generation params by @ChuxiJ in #729
- refactor(openrouter): inline audio2code for cover mode, remove standalone endpoints by @ChuxiJ in #730
- Fixing the issue with `_unwrap_decoder` function import (see #719) by @arsenylosev in #720
- refactor(api): decompose model service and reinitialize routes by @1larity in #718
- Exclude macOS from flash-attn wheel dependencies in nano-vllm to stop ~480MB download on macOS ARM64 by @Copilot in #734
- feat: Add NVIDIA Jetson Docker support with GPU acceleration by @toolboc in #735
- refactor(api): decompose sample/query/release task routes by @1larity in #725
- refactor(api): decompose runtime/job helpers (API FD) by @1larity in #726
- fix(generation): add timeout, progress fallback, and VRAM pre-flight … by @Uni404x64 in #671
- feat: add codes for openrouter_adapter by @DumoeDss in #738
- refactor(api): decompose api_server into focused modules by @1larity in #741
- fix(lrc): always return intermediate tensors so LRC generation works for all task types by @ChuxiJ in #750
New Contributors
- @arsenylosev made their first contribution in #720
- @toolboc made their first contribution in #735
- @Uni404x64 made their first contribution in #671
Full Changelog: v0.1.3...v0.1.4
v0.1.3
What's Changed
- Refactor (gradio part1) events wiring contracts by @1larity in #652
- Refactor (gradio part2) generation wiring decomposition by @1larity in #653
- fix: AutoGen 'No next batch available' when next batch is ready by @ChuxiJ in #659
- Refactor (gradio part3) mode wiring decomposition by @1larity in #654
- refactor(gradio-events): extract generation run and batch navigation wiring modules (PR5) by @1larity in #655
- Refactor (gradio part 6) mode wiring decomposition by @1larity in #663
- Refactor (gradio part 7) generation interface decomposition by @1larity in #668
- fix(gradio): restore language and optional parameter interactivity by @1larity in #672
- Refactor: Dynamic i18n Language Management by @orlagno in #686
- Fix Gradio audio volume persistence and playback reset by @1larity in #680
- Improving and updating the Hebrew language and adding features by @start-life in #694
- refactor: clean up i18n discovery and CLI help by @orlagno in #692
- Revert "Fix Gradio audio volume persistence and playback reset" by @ChuxiJ in #696
- feat: remove prompt parse to sample_query by @ChuxiJ in #700
- Fix garbled audio after switching from Remix/Repaint to Custom mode by @Copilot in #701
- feat: add model init API, model inventory, and MLX CFG/APG support by @ChuxiJ in #703
- fix: respect ACESTEP_INIT_LLM=false in lazy load and prioritize expli… by @ChuxiJ in #707
- feat: add Streamlit-based ACE Studio UI by @pabbasian in #693
- fix(gradio): reapply audio volume persistence with sane default by @1larity in #697
- fix: handle MLX incompatibility in python_embeded on macOS by @ChuxiJ in #711
New Contributors
- @pabbasian made their first contribution in #693
Full Changelog: v0.1.2...v0.1.3
v0.1.2
What's Changed
- fix(training): skip torch.compile when PEFT LoRA adapters are active by @FeelTheFonk in #640
- (File permissions) Set executable permissions `755` on `start_*.sh` scripts for consistency. by @Red007Master in #641
- fix: resolve symlinks in safe_path to prevent false rejections by @ChuxiJ in #648
- feat: LoKr adapter support, LoRA status fix, and training docs update by @ChuxiJ in #649
New Contributors
- @FeelTheFonk made their first contribution in #640
- @Red007Master made their first contribution in #641
Full Changelog: v0.1.1...v0.1.2
v0.1.1
What's Changed
- Refactor(handler part 14): split initialize_service into focused init-service mixins and expand unit coverage by @1larity in #617
- Refactor(handler part 15): extract training preset switching into dedicated mixin by @1larity in #620
- Refactor(handler part 16): extract service_generate into dedicated mixin + normalize legacy encoding messages by @1larity in #621
- Refactor(handler part 17): extract MLX helper mixins by @1larity in #625
- Implemented new Studio UI by @goedzo in #627
- Refactor(handler): decompose generate_music orchestration (part 18) by @1larity in #626
- Refactor(handler part 19): final facade cleanup post 17/18 by @1larity in #628
- fix: pass project_root instead of checkpoint dir to initialize_service by @ChuxiJ in #637
- Add cross-platform interactive manual launch scripts by @orlagno in #636
- fix: clear stale UI state on mode switch to prevent remix noise bug by @ChuxiJ in #638
Full Changelog: v0.1.0...v0.1.1
v0.1.0
What's Changed
- Refactor(handler part 10): decompose generate_music orchestration into reques… by @1larity in #588
- Fix start scripts by @DumoeDss in #592
- Refactor(handler part 11): decompose generate_music orchestration and add upl… by @1larity in #590
- Chore(workflow): enforce independent PR branches with pre-push guard by @1larity in #595
- feat: add lora training tutorial by @DumoeDss in #593
- Add lycoris lokr load and load lora api by @sdbds in #582
- Refactor(handler part 13): vae encode by @1larity in #594
- Refactor(handler part 12): vae decode by @1larity in #591
- refactor: relocate gradio_ui → ui/gradio and remove old shim by @ChuxiJ in #599
- refactor: move mlx_dit and mlx_vae into models/mlx by @ChuxiJ in #602
- refactor: consolidate scoring modules into acestep/core/scoring/ by @ChuxiJ in #603
- docs: link LoRA Training Tutorials and Side-Step advanced training docs by @ChuxiJ in #605
- Docs/link lora training tutorials by @ChuxiJ in #607
- fix: manifest path double-nesting causes training to find no samples by @ChuxiJ in #611
Full Changelog: v0.1.0-rc.1...v0.1.0
v0.1.0-rc.1
What's Changed
- fix: remix strength info by @ChuxiJ in #546
- fix: add tiled VAE encoding and CPU offloading to training preprocessing by @ChuxiJ in #547
- Add Opus and AAC audio output formats by @Copilot in #554
- Add .env support to launcher scripts to preserve user customizations across updates by @Copilot in #552
- Relax Python constraint to support ROCm 7.2 Windows (requires 3.12) by @Copilot in #553
- feat: Side-Step -- corrected LoRA/LoKR fine-tuning with interactive wizard by @koda-dernet in #557
- Include LoRA state in audio file UUID generation by @Copilot in #556
- Refactor(handler part 8): extract lyric alignment timestamp/score mixins by @1larity in #565
- Refactor(handler part 9): service generate by @1larity in #568
- fix: escape Rich markup in user-provided paths to prevent MarkupError by @koda-dernet in #573
- i18n: add missing training/export translations for he, ja, zh by @jayvenn21 in #574
- Make batch size configurable and persistent across UI actions by @Copilot in #567
- Add batch size cap validation in service_generate by @Copilot in #570
- Enable multi-LoRA stacking with per-adapter scaling (addresses #338) by @jayvenn21 in #571
- (fix):DGX Spark dependencies by @tonyjohnvan in #575
- Support meta auto by @ChuxiJ in #580
- fix: differentiate LoRA adapters in UUID and load LM codes on JSON im… by @ChuxiJ in #583
- fix: prevent stale src_audio from leaking into text2music (Custom) mode by @ChuxiJ in #584
Full Changelog: v0.1.0-beta.3...v0.1.0-rc.1
v0.1.0-beta.3
What's Changed
- docs: add AGENTS.md guardrails for AI-assisted contributions by @1larity in #475
- Fix: {TRACK_NAME} not replaced by actual track name in API. by @goedzo in #474
- fix: seeds param by @DumoeDss in #496
- refactor(handler part 4): extract diffusion mixin and add validation tests by @1larity in #490
- Fix repaint mode not populating lyrics from generated audio metadata by @Copilot in #489
- [Fix] Eliminate redundant LLM load/offload cycles in PT batch mode by @agorevski in #492
- Fix LoRA memory bloat: replace deepcopy with CPU state_dict backup by @Copilot in #499
- fix: phase-aware max_new_tokens to fix misleading progress bar stopping early by @agorevski in #493
- Add CodeQL analysis workflow configuration by @schneidergithub in #502
- Use CPU for LM by @Jay4242 in #497
- Add minimal Copilot instructions for repository by @Copilot in #506
- Feat update skills simplemv by @DumoeDss in #500
- Fix YAML indentation in CodeQL workflow configuration by @Copilot in #503
- refactor(handler part 5+6 ): Extend decomposition and harden source-audio analyze validation by @1larity in #508
- fix(handler): address post-merge review feedback in io/prompt/task/padding mixins by @1larity in #510
- Potential fix for code scanning alert no. 35: Uncontrolled data used in path expression by @schneidergithub in #507
- fix: improve error handling in training and generation handlers by @ChuxiJ in #513
- Doc only: Give coderabbit explicit guidance on module loc limits. by @1larity in #512
- Demote vllm→PyTorch backend auto-selection from warning to info by @Copilot in #511
- feat: add dgx spark support by @NicasioSirvent in #485
- support cover (exp) by @ChuxiJ in #479
- feat: add Side-Step training v2 (corrected LoRA fine-tuning) by @koda-dernet in #478
- Revert "feat: add dgx spark support" by @ChuxiJ in #514
- Skip torch.distributed initialization in single-GPU mode by @Copilot in #477
- Fix CodeQL alert #59: Sanitize checkpoint paths to prevent deserialization attacks by @Copilot in #525
- Streamline path traversal security fix: remove verbose logging and defensive checks by @Copilot in #526
- fix for intel gpu support by @xushengyuan in #515
- Refactor(handler part 7): continue FD extraction for batch conditioning and embedding pipeline by @1larity in #516
- fix(training): guard pin_memory_device when None to prevent DataLoader crash by @jayvenn21 in #517
- (Feat)(mlx): Add dit progress bar and Fix the final-step `break` by @tonyjohnvan in #519
- Potential fix for code scanning alert no. 59: Deserialization of user-controlled data by @schneidergithub in #524
- Support local LoKr safetensors in adapter loader by @riversedge in #530
- (feat)(api)Enhance audio result data structure to include per-audio metadata by @tonyjohnvan in #531
- Address review feedback: remove redundant path validations and tighten exception handling by @Copilot in #532
- Potential fix for code scanning alert no. 59: Deserialization of user-controlled data by @schneidergithub in #529
- Polishing tooltips in Gradio to de-clutter the GUI by @pedro16797 in #527
- Refactor tooltip implementation in Gradio to improve GUI clarity and … by @ChuxiJ in #534
- fix: vocal language ui by @ChuxiJ in #535
- (feat)(mlx): Supports MLX Pre-compile model for DiT by @tonyjohnvan in #522
- feat: enhances the Extract mode by @ChuxiJ in #536
- fix: only hide optional outputs in Extract mode, not all modes by @ChuxiJ in #537
- fix: Lego mode UI redesign and Extract/Lego mode switch cleanup by @DumoeDss in #538
- feat: add inline help buttons with modal documentation to Gradio UI by @ChuxiJ in #542
- Fix nanovllm CUBLAS error on Turing GPUs by detecting bfloat16 support by @Copilot in #541
New Contributors
- @schneidergithub made their first contribution in #502
- @Jay4242 made their first contribution in #497
- @NicasioSirvent made their first contribution in #485
- @koda-dernet made their first contribution in #478
- @pedro16797 made their first contribution in #527
Full Changelog: v0.1.0-beta.2...v0.1.0-beta.3
v0.1.0-beta.2
What's Changed
- fix: guard vLLM and CUDA graph usage on 16GB GPUs to prevent OOM by @jayvenn21 in #173
- fix: pin torchao version to >=0.14.1,<0.16.0 by @ChuxiJ in #440
- feat(mlx): Native MLX backend for DiT diffusion on Apple Silicon (2-3x speedup) by @tonyjohnvan in #439
- feat(training): auto-load .caption.txt and .lyrics.txt files for LoRA datasets by @jayvenn21 in #438
- feat(training): add validation-aware graph + best-checkpoint tracking to LoRA training by @jayvenn21 in #437
- Add torch compile logic to LoRA trainer by @fidel1234xdd in #422
- Fixed the Edited Draft Injection by @chigkim in #389
- fix(gradio): refresh audio results and reset batch navigation on new generation by @jayvenn21 in #178
- fix(gradio): initialize UI controls correctly for SFT models in servi… by @jayvenn21 in #207
- Enhance MPS LoRA training by enabling gradient checkpointing in DiT training path by @riversedge in #401
- (fix)Add CORS support for direct open browser-based frontends (studio.html) by @tonyjohnvan in #404
- fix: auto-detect Mac/MPS and apply optimal configuration by @ChuxiJ in #443
- refactor: streamline default backend configuration for device compati… by @ChuxiJ in #444
- enhance: implement caching for VAE audio encoding to improve efficiency by @ChuxiJ in #445
- refactor: optimize audio encoding logic with caching enhancements by @ChuxiJ in #446
- Normalize audio against clipping and 32bit wav support by @lutzkirschner64-dot in #406
- fix: API 400 - absolute audio file paths are not allowed for temp folder by @goedzo in #447
- refactor: improve audio encoding logic with enhanced caching by @ChuxiJ in #448
- refact: Reorganized Advanced Settings UI & Added latent shift and rescale by @ChuxiJ in #452
- (feat)(mlx): Native MLX VAE acceleration for Apple Silicon by @tonyjohnvan in #459
- fix(repaint): pass lyrics conditioning to repaint generation pipeline by @jayvenn21 in #461
- Fix #455: Restore missing training keys in en.json by @lutzkirschner64-dot in #463
- Refactor(handler part 2): Extract init utility mixin and add focused init-service tests by @1larity in #456
- Refactor(handler part 3): Feat/handler init service mixin v2 by @1larity in #464
- fix: reset audio playback position on regenerate & add GPU tier i18n … by @ChuxiJ in #467
- [Bugfix] Only use FlashAttention if GPU supports it by @agorevski in #469
- Refine Gradio UI by @ChuxiJ in #468
- Make torchcodec optional for ROCM and Intel XPU platforms by @Copilot in #465
- Add AdamW 8 bit by @fidel1234xdd in #449
- feature: support lokr by @ChuxiJ in #471
- i18n-zh-fix by @ChuxiJ in #473
New Contributors
- @fidel1234xdd made their first contribution in #422
- @lutzkirschner64-dot made their first contribution in #406
- @agorevski made their first contribution in #469
Full Changelog: v0.1.0-beta.1...v0.1.0-beta.2
v0.1.0-beta.1
What's Changed
- Refact add inference by @ChuxiJ in #1
- feat ✨ : add lyrics alignment scores by @keylxiao in #2
- Fix lrc bugs by @ChuxiJ in #3
- cover/repaint test ok by @ChuxiJ in #4
- refact understand_music by @ChuxiJ in #5
- test ok by @ChuxiJ in #6
- Add rewrite lyrics by @ChuxiJ in #7
- support input timesteps by @ChuxiJ in #8
- max_model_len 8192 -> 4096 by @ChuxiJ in #9
- reduce vRAM usage for vae decode by @ChuxiJ in #10
- fix: resolve device mismatch error in offload mode by @DumoeDss in #11
- fix language issue simple mode by @ChuxiJ in #12
- support lora training & inter by @ChuxiJ in #13
- fix cover logic for api_server by @ChuxiJ in #14
- add docs and serve mode by @ChuxiJ in #15
- add docs and readme by @ChuxiJ in #16
- add model zoo by @ChuxiJ in #17
- fix audio play position reset by @ChuxiJ in #18
- update hf demo by @ChuxiJ in #19
- fix duration str bug by @ChuxiJ in #20
- fix vllm bug by @ChuxiJ in #21
- Update abstract with specific performance metrics and unified messaging by @Copilot in #22
- Fix author names in BibTeX citation by @ChuxiJ in #23
- Verify author name order in BibTeX citation by @Copilot in #24
- fix mac nv install by @ChuxiJ in #25
- feat: update skills by @DumoeDss in #26
- improve lora training by @ChuxiJ in #28
- fix lora reload by @ChuxiJ in #29
- add model downloader by @ChuxiJ in #30
- add model downloader by @ChuxiJ in #31
- openrouter compatible by @seaniezhao in #32
- fix duration by @ChuxiJ in #33
- fix cfg kv block allocate by @ChuxiJ in #34
- fix_tiled_vae_encode_bug by @ChuxiJ in #35
- fix refer audio shape by @ChuxiJ in #36
- fix 4 vram by @ChuxiJ in #38
- tutorial by @ChuxiJ in #39
- Fix `TypeError: AceStepConditionGenerationModel does not support len()` by @saltchicken in #53
- Fix runtime crash with integer seeds and correct type hints by @iamgrootns in #48
- Fix torch.compile crash: add `__len__` to dynamically loaded models by @Copilot in #47
- Update Readme.md by @Saganaki22 in #62
- fix missing torchao by @ChuxiJ in #65
- Update release status for `acestep-5Hz-lm-4B` by @ChuxiJ in #71
- fix: reset linux torch version for nano vllm by @DumoeDss in #72
- fix max audio code id by @ChuxiJ in #78
- Revert "fix max audio code id" by @ChuxiJ in #81
- add support for intel gpu, tested on U9 285H, with and without 5hz llm by @xushengyuan in #88
- Fix max audio code by @ChuxiJ in #103
- Fix AttributeError when llm_handler is None by @fspecii in #109
- Fix broken HuggingFace link in README by @oumaklaus in #118
- Optimize env installation by @DumoeDss in #128
- Fix a Windows error that was unexpected at this time. (Fixes #134) by @start-life in #137
- fix: bat error by @DumoeDss in #148
- feat: Add the enforced use of lm models by @DumoeDss in #152
- Add tensorboard to training dependencies for LoRA by @tlennon-ie in #155
- fix: Windows path compatibility and UI freeze issues (Issue #113) by @DumoeDss in #159
- docs: add Korean translation for core documentation by @acidsound in #187
- feat(i18n): add translation support to LoRA training UI by @jayvenn21 in #205
- Adding Hebrew language by @start-life in #201
- docs: clarify ROCm / AMD usage and uv behavior by @jayvenn21 in #179
- fix: avoid epoch-boundary stalls during LoRA training on Windows by @jayvenn21 in #177
- fix path traversal vulnerability in audio endpoint by @Albab-Hasan in #161
- fix: guard against missing reference audio in text2music by @jayvenn21 in #171
- docs: add responsible disclosure security policy by @jayvenn21 in #169
- fix: disable CUDA graph capture for 5Hz LM during LoRA training by @jayvenn21 in #172
- CLI by @chigkim in #94
- fix: add tolerance to 16GB VRAM detection by @jayvenn21 in #175
- fix: Wrap audio paths in FileData for Gradio 6.x compatibility by @DemetrionWare in #181
- feat: expose denoise control for audio-conditioned generation in Gradio by @jayvenn21 in #176
- Extended API to allow Analysis only without generating new music. by @goedzo in #197
- Fix Gradio Audio component error by passing file paths directly by @Copilot in #219
- Fix MPS tiled_decode conv1d output length limit on macOS 14.7 by @Copilot in #222
- Fix 4B model support for 16GB GPUs by increasing VRAM detection tolerance by @Copilot in #221
- fix: use per-socket timeout instead of global setdefaulttimeout by @Albab-Hasan in #226
- Fix bugs and add debuglog by @ChuxiJ in #230
- Add musician-friendly guide to documentation by @sigalarm in #238
- Position Audio time labels lower to avoid scrollbar overlap by @rkfg in #234
- Feat/progress UI: Prevent Gradio slider math error when no audio files are found by @1larity in #194
- Include torchcodec on MacOS to read audio files. by @chigkim in #243
- docs: clarify AMD / ROCm usage and uv behavior by @jayvenn21 in #244
- fix(song): ensure lyrics / vocal conditioning is preserved during song generation by @jayvenn21 in #241
- fix: resolve intermittent CUDA assertion error in concurrent serving … by @ChuxiJ in #257
- fix: remove nano vllm install in toml by @ChuxiJ in #260
- feat(ui): add experimental Studio UI for REST API by @jayvenn21 in #258
- feat(mps): Comprehensive Apple Silicon MPS backend support by @tonyjohnvan in #256
- fix(mps): Address memory management and performance issues for Apple … by @ChuxiJ in #261
- feat: Add AMD ROCm support for Windows (RX 7000/6000 series) by @clowerweb in #210
- [WIP] Fix uv sync regression with CUDA version mismatches by @Copilot in #268
- fix: add anti-clipping normalization to prevent audio clipping/overload by @ChuxiJ in #276
- fix: validate training-incompatible settings before starting LoRA training by @ChuxiJ in #277
- Fix check_update.bat: handle line ending changes and remove pager blocking by @Copilot in #273
- Add manual for rocm Linux (Cachy-OS) and additional requirements file by @Neresco in https://github.com/ac...