Hi!
I have worked with unconditional generation using this fine repo. It is a lot of fun! I will do latent diffusion next. I am already looking forward to it.
Text-conditional generation promises to be a lot of fun too, and I have a few questions about it.
- In the conditional section of the README, a comment reads "Text conditioning, one element per batch". Does this mean "one text per waveform", and thus "a batch of texts for a batch of waveforms", rather than "one text for a batch of waveforms"?
- I believe latent diffusion and text conditioning are orthogonal. Is it safe to assume that DiffuserAE would work with text conditioning just by adding the right kwargs?
- What would be necessary to replace the T5 embeddings with something else?
- What would be the consequences of increasing the number of tokens for T5?
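
To make the first question concrete, here is a minimal Python sketch of the reading I have in mind. The helper `pair_texts_with_waveforms` is hypothetical and not part of the repo; it only assumes that the batch size is known and that the texts are passed as a list of strings.

```python
from typing import List, Tuple


def pair_texts_with_waveforms(batch_size: int, texts: List[str]) -> List[Tuple[int, str]]:
    """Hypothetical helper (not from the repo): pair each waveform index with its own text."""
    # Under the "one text per waveform" reading, the list of texts must be
    # exactly as long as the batch of waveforms.
    if len(texts) != batch_size:
        raise ValueError(
            f"expected {batch_size} texts, one per waveform, got {len(texts)}"
        )
    return list(enumerate(texts))


# A batch of 2 waveforms (for example a tensor of shape [2, channels, samples])
# would then come with 2 texts.
pairs = pair_texts_with_waveforms(2, ["a piano melody", "a drum loop"])
print(pairs)
```

Under the other reading, a single text would be shared by every waveform in the batch, which is the case I would like to rule out.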
This is so cool!
Best,
Tristan