Sync Swift implementation of Qwen with mlx-lm #137

Draft
ronaldmannak wants to merge 6 commits into ml-explore:main from PicoMLX:qwen

@ronaldmannak
Contributor

Proposed changes

I noticed that Qwen 3.5 can sometimes get stuck repeating one or more paragraphs indefinitely. The Qwen 3.5 READMEs for the 0.8B and 2B models mention this, but I've seen it happen with larger Qwen 3.5 models as well. This PR does not fix that issue, but while investigating it I found a few discrepancies between the Swift implementation and the Python implementation. This change updates the Swift version to match mlx-lm. I'm opening this as a draft pull request and will continue to investigate the repetition issue (if the cause is the Swift implementation).

Checklist

  • I have read the CONTRIBUTING document
  • I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the necessary documentation (if needed)

ronaldmannak and others added 6 commits March 6, 2026 21:34
Upcast gate and normalized x to float32 before silu+multiply to match
upstream Python fix and prevent numerical degradation during generation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
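
For anyone wondering why the upcast in the commit above matters, here is a minimal, stdlib-only sketch of the numerical effect. It is not the mlx-lm or Swift code: `gated_half`, `gated_upcast`, and `total_abs_error` are hypothetical names, float16 and float32 are emulated with `struct` round-trips, and the "old path" is assumed to round the silu result to float16 before the multiply (real kernels may fuse or round differently).

```python
import math
import struct


def to_f16(v: float) -> float:
    """Round to the nearest IEEE 754 half-precision value (struct format 'e')."""
    return struct.unpack("<e", struct.pack("<e", v))[0]


def to_f32(v: float) -> float:
    """Round to the nearest IEEE 754 single-precision value (struct format 'f')."""
    return struct.unpack("<f", struct.pack("<f", v))[0]


def silu(v: float) -> float:
    """SiLU activation, v * sigmoid(v), evaluated in double precision."""
    return v / (1.0 + math.exp(-v))


def gated_half(gate: float, x: float) -> float:
    """Toy model of the assumed old path: the silu result is stored in
    float16 before the multiply, so the value is rounded twice."""
    s = to_f16(silu(gate))
    return to_f16(s * x)


def gated_upcast(gate: float, x: float) -> float:
    """Toy model of the upcast path: silu and multiply use float32
    intermediates, and the result is cast back to float16 once at the end."""
    s = to_f32(silu(to_f32(gate)))
    return to_f16(to_f32(s * to_f32(x)))


def total_abs_error(fn) -> float:
    """Sum |fn(gate, x) - double-precision reference| over a fixed grid
    of float16 inputs (the grid itself is an arbitrary choice)."""
    gates = [to_f16(i / 8) for i in range(-32, 33)]  # gate values in [-4, 4]
    xs = [to_f16(j / 7) for j in range(-21, 22)]     # x values in [-3, 3]
    return sum(abs(fn(g, x) - silu(g) * x) for g in gates for x in xs)


if __name__ == "__main__":
    print("float16 intermediates:", total_abs_error(gated_half))
    print("float32 intermediates:", total_abs_error(gated_upcast))
```

The per-call difference is at most a few float16 ulps, but during generation the gated output feeds later layers and later tokens, which is presumably why the upstream fix keeps the intermediates in float32.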
Labels

swift-format: Swift format failure in CI

2 participants