Enabling Flash Attention on 1.83.1 with a Mistral Small 3 model causes the model to reply with Unicode garbage once the prompt exceeds roughly 4k tokens, regardless of the configured context size. Prompts only slightly over 4k tokens occasionally still produce passable output, so try something around 8k tokens for an unambiguous reproduction. Disabling Flash Attention, or rolling back to 1.82.4, fixes the issue.
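For reference, a minimal repro sketch in Python. It assumes the backend is KoboldCpp (the version numbers match its releases) serving its default KoboldAI-compatible API at http://localhost:5001; the filler text, the ~4-chars-per-token estimate, and the garbage-ratio heuristic are my own illustrative choices, not part of the original report.

```python
import requests

API_URL = "http://localhost:5001/api/v1/generate"  # assumed default KoboldCpp endpoint

# Build a prompt of roughly 8k tokens. The repeat count is a crude estimate
# (~10 tokens per sentence); the exact total does not matter, only that the
# prompt is clearly past the ~4k-token threshold where garbage output starts.
filler = "The quick brown fox jumps over the lazy dog. " * 800
prompt = filler + "\n\nIn one sentence, what animal jumps over the dog?"

payload = {
    "prompt": prompt,
    "max_length": 100,
    "temperature": 0.7,
}

resp = requests.post(API_URL, json=payload, timeout=300)
resp.raise_for_status()
text = resp.json()["results"][0]["text"]
print(repr(text))

# Rough check: with Flash Attention enabled on 1.83.1, the reply is dominated
# by replacement characters and other non-Latin garbage; with it disabled
# (or on 1.82.4), the reply reads as normal English.
garbage = sum(1 for c in text if c == "\ufffd" or ord(c) > 0x2FFF)
print(f"garbage ratio: {garbage / max(len(text), 1):.0%}")
```

Run it twice against the same model, once launched with Flash Attention enabled and once without, to see the difference directly.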
Additional Information:
Windows 10, AMD Ryzen 7 7800X3D, RTX 4090, latest NVIDIA drivers