First of all, thank you for the awesome project. I am a bit confused about why the Transformer decoder is configured with these values:
d_model: 128
dim_feedforward: 256
nhead: 4
dropout: 0.3
num_decoder_layers: 3
max_output_len: 150
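For context, here is a minimal sketch of how I understand these values would map onto a stock PyTorch decoder. It assumes the model is built on `torch.nn.TransformerDecoder`, which I have not confirmed against the project code. The memory length of 20 and the batch size of 1 are made-up placeholders, not values from the project.

```python
import torch
import torch.nn as nn

# Values copied from the config above.
d_model = 128
dim_feedforward = 256
nhead = 4
dropout = 0.3
num_decoder_layers = 3
max_output_len = 150

# Multi-head attention splits d_model evenly across heads,
# so d_model must be divisible by nhead (here 128 / 4 = 32 per head).
assert d_model % nhead == 0
head_dim = d_model // nhead

layer = nn.TransformerDecoderLayer(
    d_model=d_model,
    nhead=nhead,
    dim_feedforward=dim_feedforward,
    dropout=dropout,
)
decoder = nn.TransformerDecoder(layer, num_layers=num_decoder_layers)
decoder.eval()  # disable dropout for a deterministic shape check

# Default layout is (sequence_length, batch_size, d_model).
tgt = torch.zeros(max_output_len, 1, d_model)  # already-embedded target tokens
memory = torch.zeros(20, 1, d_model)           # placeholder encoder output

# Causal mask: position i may only attend to positions <= i.
tgt_mask = torch.triu(
    torch.full((max_output_len, max_output_len), float("-inf")), diagonal=1
)

with torch.no_grad():
    out = decoder(tgt, memory, tgt_mask=tgt_mask)

n_params = sum(p.numel() for p in decoder.parameters())
print(out.shape, head_dim, n_params)
```

If this mapping is right, then `max_output_len` only bounds the decoded sequence length, while `d_model`, `dim_feedforward`, and `num_decoder_layers` control the model size.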
What is the reasoning behind these values, and how should I set them correctly?
Thank you for reading. I hope to hear from you soon.
@kingyiusuen