Description & Motivation
Recent models (e.g., Qwen3 and Llama) use RMSNorm layers instead of LayerNorm
because RMSNorm reaches comparable quality while being slightly cheaper to
compute (it skips the mean-centering step).
To support RMSNorm, the conversion function would need a new branch that is
almost identical to the existing LayerNorm one.
cc @lantiga @Borda