Replies: 1 comment 1 reply
-
'Generating the prompt from an image' is not 'img2img'. 'Img2img' encodes the image into latents and feeds those latents to the model. To generate a prompt from audio, the authors use Qwen-Omni; see https://github.com/ace-step/ACE-Step/blob/main/TRAIN_INSTRUCTION.md . You can simply upload your audio to the online demo of Qwen-Omni and ask it to describe the audio.
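As a rough sketch of that captioning step: the request below follows the multimodal chat-message layout used by Qwen-style models, but the model name and the exact field names are illustrative assumptions here, not something defined by ACE-Step — check the Qwen-Omni documentation for the real schema.

```python
def build_caption_request(audio_path: str) -> dict:
    """Assemble a chat payload asking an omni model to describe an audio clip.

    The "model" value and content-part keys are placeholders (assumptions);
    adapt them to the actual Qwen-Omni API you are calling.
    """
    return {
        "model": "qwen-omni",  # placeholder model name
        "messages": [
            {
                "role": "user",
                "content": [
                    # The audio clip the model should listen to.
                    {"type": "audio", "audio": audio_path},
                    # The instruction that turns the description into a
                    # usable text-to-music prompt.
                    {
                        "type": "text",
                        "text": "Describe this audio: genre, mood, tempo, "
                                "instrumentation, and vocal style.",
                    },
                ],
            }
        ],
    }


request = build_caption_request("my_song.wav")
```

The returned description can then be edited by hand and used as the text prompt for regeneration, which is essentially the workflow the question below asks for.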
-
Hi, I remember Stable Diffusion had a way to generate a prompt from an existing image, img2img; you can search YouTube and find various examples, like this one: https://www.youtube.com/watch?v=PUwLT9JwCs8
Basically, the user uploads the image, the model analyzes it and reverse-engineers the prompt that it thinks would generate the closest result; the user can then tweak the prompt and regenerate a similar image.
It would be cool for ACE-Step to have this kind of feature too. That way we could upload existing music without having any idea how to describe it (I mean the prompt), then modify either the lyrics or the prompt and generate a similar track.