Fixes wrong output dimensions in ConvTranspose1d by ecobost · Pull Request #78 · descriptinc/descript-audio-codec

ecobost · 2024-06-28T12:43:24Z

Solves #42, #58 and #68: all related to incorrect computation of the output shape in the ConvTranspose1d of the DecoderBlock (as also pointed out in PR #44).

When a conv operation uses stride > 1, several input lengths map to the same output length, so the output dimensions of the matching ConvTranspose1d are underdetermined and it needs extra info (the output_padding) to compute the expected output (see note in docs).
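To make the ambiguity concrete, here is a small sketch that applies the Conv1d output-length formula from the PyTorch docs in plain Python. The helper name `conv1d_out_len` and the example values (stride 4, kernel size 8, padding 2) are only illustrative assumptions, not taken from the repo.

```python
def conv1d_out_len(n, kernel_size, stride, padding):
    """Output length of a Conv1d (dilation=1) for an input of length n."""
    return (n + 2 * padding - kernel_size) // stride + 1


# Example values, chosen only for illustration (an assumption).
stride, kernel_size, padding = 4, 8, 2

# Group input lengths by the output length they produce.
collapsed = {}
for n in range(16, 24):
    out_len = conv1d_out_len(n, kernel_size, stride, padding)
    collapsed.setdefault(out_len, []).append(n)

for out_len, inputs in collapsed.items():
    print(f"output length {out_len} <- input lengths {inputs}")
```

Every group holds `stride` different input lengths, which is why the transposed conv cannot know which one to return without output_padding.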

Given the construction constraints of the conv/deconv operations (namely, kernel_size=2*stride, padding=ceil(stride/2)), I figured out that the right output_padding (so we always recover the same input dimensions) is:

If stride is even:
	output_padding = 0 if input_timesteps is divisible by stride, else 1
If stride is odd:
	output_padding = 0 if input_timesteps + 1 is divisible by stride, else 1

with input_timesteps = timestep dimension of the input to the original conv1d.
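The rule can be sanity-checked against the shape formulas in the PyTorch docs. The sketch below is plain Python and assumes kernel_size = 2 * stride and padding = ceil(stride / 2), with dilation 1; all function names are hypothetical helpers, not part of the codebase.

```python
import math


def conv1d_out_len(n, stride):
    """Conv1d output length, assuming kernel_size=2*stride, padding=ceil(stride/2)."""
    kernel_size = 2 * stride
    padding = math.ceil(stride / 2)
    return (n + 2 * padding - kernel_size) // stride + 1


def conv_transpose1d_out_len(m, stride, output_padding=0):
    """ConvTranspose1d output length under the same assumed settings."""
    kernel_size = 2 * stride
    padding = math.ceil(stride / 2)
    return (m - 1) * stride - 2 * padding + kernel_size + output_padding


def required_output_padding(n, stride):
    """Padding that makes the deconv return the original length n."""
    m = conv1d_out_len(n, stride)
    return n - conv_transpose1d_out_len(m, stride, output_padding=0)


for stride, n in [(8, 16), (8, 17), (5, 14), (5, 15)]:
    print(stride, n, required_output_padding(n, stride))
```

Under these assumptions the value comes out as the remainder of input_timesteps (even stride) or input_timesteps + 1 (odd stride) divided by stride. It is 0 or 1 for the lengths in the loop, but it can be larger for other lengths, e.g. `required_output_padding(16, 5)` gives 2.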

This PR sets output_padding=0 for even strides and output_padding=1 for odd strides. This works in the vast majority of cases (including all pretrained models), except when:
1. stride is even and input_timesteps is not divisible by stride.
2. stride is odd and input_timesteps + 1 is divisible by stride.
Both are unlikely (and case 1 would fail anyway even without this PR). At the very least, I believe the current setting is a more sensible default.
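A quick way to see the default and its two exceptions is to run one conv/deconv round trip on lengths only. This is a minimal sketch in plain Python, assuming kernel_size = 2 * stride, padding = ceil(stride / 2) and dilation 1; `round_trip_len` is a made-up helper that mirrors the PyTorch shape formulas rather than calling the actual modules.

```python
import math


def round_trip_len(n, stride):
    """Length after Conv1d followed by ConvTranspose1d with the PR's default."""
    kernel_size = 2 * stride                 # assumed construction constraint
    padding = math.ceil(stride / 2)          # assumed construction constraint
    output_padding = stride % 2              # PR default: 0 if even, 1 if odd
    m = (n + 2 * padding - kernel_size) // stride + 1
    return (m - 1) * stride - 2 * padding + kernel_size + output_padding


cases = [
    (8, 16),  # even stride, length divisible by stride: recovered
    (8, 17),  # case 1: even stride, length not divisible by stride
    (5, 15),  # odd stride, length + 1 not divisible by stride: recovered
    (5, 14),  # case 2: odd stride, length + 1 divisible by stride
]
for stride, n in cases:
    out = round_trip_len(n, stride)
    status = "ok" if out == n else "mismatch"
    print(f"stride={stride} in={n} out={out} {status}")
```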

ArchiMickey · 2025-05-22T08:16:40Z

@ecobost Hi, thanks for your work. I came to this PR because I am fixing the same problem in the stable-audio-tools repo. May I ask whether you have observed any artifacts caused by using output_padding=0?

ecobost · 2025-05-22T08:27:55Z

Hi @ArchiMickey, I haven't seen any artifacts but I wasn't necessarily looking out for them. I doubt there will be any, though.

ecobost added 3 commits June 28, 2024 14:20

Fixes wrong output dimensions in ConvTranspose1d

3369b94

Changes output_padding to deal better with odd strides.

bump version from 1.0.0 to 1.0.1

9b12d8b

plenty of changes since 1.0.0

minor: code refactoring

99ecabf

ecobost wants to merge 3 commits into descriptinc:main from ecobost:main