Commit c0020d5

feat: add pretrained dmae1d
1 parent 4433c96 commit c0020d5

4 files changed, with 36 additions and 1 deletion.

README.md

Lines changed: 19 additions & 0 deletions
@@ -241,7 +241,26 @@ composer = SpanBySpanComposer(
 y_long = composer(y, keep_start=True) # [1, 1, 98304]
 ```
 
+## Pretrained Models
 
+### Diffusion (Magnitude) AutoEncoder ([`dmae1d-ATC64-v1`](https://huggingface.co/archinetai/dmae1d-ATC64-v1/tree/main))
+```py
+from audio_diffusion_pytorch import AudioModel
+
+autoencoder = AudioModel.from_pretrained("dmae1d-ATC64-v1")
+
+x = torch.randn(1, 2, 2**18)
+z = autoencoder.encode(x) # [1, 32, 256]
+y = autoencoder.decode(z, num_steps=20) # [1, 2, 262144]
+```
+
+| Info | |
+| ------------- | ------------- |
+| Input type | Audio (stereo @ 48kHz) |
+| Number of parameters | 234.2M |
+| Compression Factor | 64x |
+| Downsampling Factor | 1024x |
+| Bottleneck Type | Tanh |
 
 
 ## Experiments
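
The factors in the README table can be checked against the tensor shapes in the snippet above. The following is a minimal sketch using plain Python arithmetic, with no model download. It assumes that "Downsampling Factor" refers to the reduction along the time axis and that "Compression Factor" refers to the ratio of total values per item, channels included. That reading is an interpretation, not something the commit states, although it does reproduce both numbers.

```python
# Shapes copied from the README snippet added in this commit.
input_shape = (1, 2, 2**18)   # [batch, channels, samples]
latent_shape = (1, 32, 256)   # [batch, latent channels, latent frames]


def numel(shape):
    """Return the number of scalar values in a tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


# Assumed meaning: reduction of the time axis only.
downsampling_factor = input_shape[-1] // latent_shape[-1]

# Assumed meaning: reduction in total values, channels included.
compression_factor = numel(input_shape) // numel(latent_shape)

# Clip length at the 48 kHz sample rate listed in the table (about 5.46 s).
duration_seconds = input_shape[-1] / 48_000

print(downsampling_factor)  # 1024
print(compression_factor)   # 64
```

Note that the README snippet itself calls `torch.randn` without importing `torch`, so anyone running it verbatim would need to add `import torch` first.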

audio_diffusion_pytorch/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@
     AudioDiffusionUpphaser,
     AudioDiffusionUpsampler,
     AudioDiffusionVocoder,
+    AudioModel,
     DiffusionAutoencoder1d,
     DiffusionMAE1d,
     DiffusionUpphaser1d,

audio_diffusion_pytorch/model.py

Lines changed: 15 additions & 0 deletions
@@ -500,3 +500,18 @@ def __init__(self, in_channels: int, **kwargs):
 
     def sample(self, *args, **kwargs):
         return super().sample(*args, **{**get_default_sampling_kwargs(), **kwargs})
+
+
+""" Pretrained Models Helper """
+
+REVISION = {"dmae1d-ATC64-v1": "07885065867977af43b460bb9c1422bdc90c29a0"}
+
+
+class AudioModel:
+    @staticmethod
+    def from_pretrained(name: str) -> nn.Module:
+        from transformers import AutoModel
+
+        return AutoModel.from_pretrained(
+            f"archinetai/{name}", trust_remote_code=True, revision=REVISION[name]
+        )
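
As written, `from_pretrained` looks the name up with `REVISION[name]`, so an unrecognized name would surface as a bare `KeyError`. The sketch below shows one way the name-to-revision resolution could be separated out and given a clearer error. `resolve_pretrained` is a hypothetical helper for illustration only. It is not part of `audio_diffusion_pytorch`, and it does not call `transformers` or touch the network.

```python
from typing import Tuple

# Mirrors the REVISION table added to model.py in this commit.
REVISION = {"dmae1d-ATC64-v1": "07885065867977af43b460bb9c1422bdc90c29a0"}


def resolve_pretrained(name: str) -> Tuple[str, str]:
    """Hypothetical helper: map a short model name to the (repo_id, revision)
    pair that the commit's from_pretrained passes to AutoModel."""
    if name not in REVISION:
        known = ", ".join(sorted(REVISION))
        raise ValueError(f"Unknown pretrained model {name!r}; known models: {known}")
    return f"archinetai/{name}", REVISION[name]


repo_id, revision = resolve_pretrained("dmae1d-ATC64-v1")
print(repo_id)   # archinetai/dmae1d-ATC64-v1
print(revision)  # 07885065867977af43b460bb9c1422bdc90c29a0
```

Pinning the revision matters here because the call uses `trust_remote_code=True`: the commit hash fixes which version of the repository's code gets executed, rather than whatever is on the default branch at load time.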

setup.py

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 setup(
     name="audio-diffusion-pytorch",
     packages=find_packages(exclude=[]),
-    version="0.0.91",
+    version="0.0.92",
     license="MIT",
     description="Audio Diffusion - PyTorch",
     long_description_content_type="text/markdown",
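
Because `AudioModel` arrives together with the bump to `0.0.92`, downstream code may want to check the installed version before importing it. The following is a rough sketch, assuming the published `0.0.92` release corresponds to this commit and that version strings are purely numeric (`X.Y.Z`). `parse_version` and `supports_audio_model` are illustrative names, not library functions.

```python
from importlib.metadata import PackageNotFoundError, version

# Taken from the setup.py change in this commit.
FIRST_VERSION_WITH_AUDIO_MODEL = (0, 0, 92)


def parse_version(text):
    """Turn a purely numeric 'X.Y.Z' string into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def supports_audio_model(version_text):
    """Return True if the given version is at least 0.0.92."""
    return parse_version(version_text) >= FIRST_VERSION_WITH_AUDIO_MODEL


try:
    installed = version("audio-diffusion-pytorch")
    print(installed, supports_audio_model(installed))
except PackageNotFoundError:
    print("audio-diffusion-pytorch is not installed")
```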
