Missing tie_word_embeddings in config.json causes incorrect weight tying in transformers 4.54+

## Issue

The `config.json` in the Step-Audio-EditX model is missing the `tie_word_embeddings` configuration key. This causes transformers 4.54+ to incorrectly tie the `lm_head` and `embed_tokens` weights together, even though they have different values in the checkpoint.

## Root Cause

- `config.json` does not contain `"tie_word_embeddings"`
- transformers 4.54+ defaults to `tie_word_embeddings=True` when this key is missing
- The model checkpoint has **separate weights** for `lm_head` and `embed_tokens` (different norms)
- Tying overwrites the correct `lm_head` weights with the `embed_tokens` weights
- With the wrong `lm_head` weights, the model generates text tokens instead of audio tokens
- Result: silent or gibberish audio, and generation that ignores `max_new_tokens`
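
One way to confirm the tying described above is to check whether the two modules share storage after loading. The following is a minimal sketch, not part of the project: the helper name `embeddings_are_tied` is hypothetical, and it assumes the loaded model exposes the standard `get_input_embeddings()` and `get_output_embeddings()` accessors, which a custom remote-code model may not implement.

```python
def embeddings_are_tied(model) -> bool:
    """Return True if the input embedding and the output head share storage.

    `model` is assumed to be a loaded transformers model (hypothetical usage).
    Tied weights point at the same underlying tensor memory, so comparing
    data pointers is enough to detect tying.
    """
    inp = model.get_input_embeddings()
    out = model.get_output_embeddings()
    if inp is None or out is None:
        # Some custom models do not expose both accessors.
        raise ValueError("model does not expose both embedding modules")
    return inp.weight.data_ptr() == out.weight.data_ptr()
```

If this returns `True` for Step-Audio-EditX, the `lm_head` values from the checkpoint are no longer in use, which would be consistent with the symptoms listed above.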

## Solution

Add the following line to `config.json`:
```json
"tie_word_embeddings": false
```

This tells transformers to keep the two weight matrices separate, which matches the checkpoint structure.
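
For users who already have a local copy of the model, the same change can be applied with a small script. This is a sketch under assumptions: the helper name `set_untied_embeddings` is hypothetical, and the path passed to it must point at the `config.json` of a local model download.

```python
import json
from pathlib import Path


def set_untied_embeddings(config_path) -> bool:
    """Ensure "tie_word_embeddings": false is present in a config.json.

    Returns True if the file was modified, False if it was already correct.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    if config.get("tie_word_embeddings") is False:
        return False  # key already present with the right value
    config["tie_word_embeddings"] = False
    # Rewrite the file; indentation may differ from the original layout.
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return True
```

The function is idempotent, so running it again on a patched file leaves the file untouched.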

## Impact

This affects all users of Step-Audio-EditX with transformers 4.54+. Until the config is fixed, users have to implement workarounds that restore the `lm_head` weights after model loading.

## References

- transformers PR addressing similar issues: https://github.com/huggingface/transformers/pull/42612
- Workaround implemented in TTS-Audio-Suite: https://github.com/diodiogod/TTS-Audio-Suite/issues/202
