Variable length WAV files #2

@madhavajay

Description

This is really cool.

However, I have tried it on my own dataset and I'm getting the following error:

> print(data)

[AudioClip (duration=4.040125s, sample_rate=16.0KHz), AudioClip (duration=4.030125s, sample_rate=16.0KHz), AudioClip (duration=4.030125s, sample_rate=16.0KHz), AudioClip (duration=4.040125s, sample_rate=16.0KHz), AudioClip (duration=4.030125s, sample_rate=16.0KHz)]...

> learn.fit_one_cycle(3)
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 1024])

When I look at the Magenta data, it looks to be all 4-second waves:

> print(data)

[AudioClip (duration=4.0s, sample_rate=16.0KHz), AudioClip (duration=4.0s, sample_rate=16.0KHz), AudioClip (duration=4.0s, sample_rate=16.0KHz), AudioClip (duration=4.0s, sample_rate=16.0KHz), AudioClip (duration=4.0s, sample_rate=16.0KHz)]...

I see this in the code; is it related?

# TODO: generalize this away from hard coding dim values
def pad_collate2d(batch):

If you can give me a general idea of what to look for, I can see if I can fix it.
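For reference, one way to generalize that collate function away from hard-coded dims is to pad every clip in the batch to the longest time dimension before stacking. This is just a sketch, not the repo's actual implementation: it assumes each batch item is a `(tensor[channels, time], label)` pair, which may not match the library's real item type.

```python
import torch
import torch.nn.functional as F

def pad_collate2d(batch):
    """Collate variable-length 2D audio tensors into one batch.

    Assumes each item is (tensor of shape [channels, time], int label) --
    a hypothetical shape; adjust to the library's actual item type.
    """
    xs, ys = zip(*batch)
    max_len = max(x.shape[-1] for x in xs)
    padded = []
    for x in xs:
        pad_amount = max_len - x.shape[-1]
        # zero-pad only the last (time) dimension, on the right
        padded.append(F.pad(x, (0, pad_amount)))
    # stack into [batch, channels, max_len]
    return torch.stack(padded), torch.tensor(ys)
```

Note the `ValueError` itself (`input size torch.Size([1, 1024])`) usually comes from a BatchNorm layer seeing a batch of size 1, so it may also be worth checking whether uneven clip lengths are causing stray single-item batches.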
