[AVR] Change half to use softPromoteHalfType #152783

Merged
merged 1 commit into from
Aug 10, 2025

Conversation

tgross35
Contributor

@tgross35 tgross35 commented Aug 8, 2025

The default half legalization has some issues with quieting NaNs and carrying excess precision. As has been done for various other targets, update AVR to use softPromoteHalfType which avoids these issues.

The most obvious corrected test below is test_load_store, which no longer contains calls to extend and trunc (passing through these libcalls means that an f16 value does not round-trip).
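
As a rough illustration of why the extend/trunc pair matters for a plain load and store, the sketch below models the round trip on raw binary16 bit patterns. The helper names are hypothetical, and the model assumes that the truncating conversion sets the quiet bit on NaNs, as IEEE 754 format conversions do for signaling NaNs. It is not the actual libcall code.

```python
F16_EXP_MASK = 0x7C00   # binary16 exponent field
F16_FRAC_MASK = 0x03FF  # binary16 fraction field
F16_QUIET_BIT = 0x0200  # top fraction bit marks a quiet NaN


def is_nan_bits(h):
    """True if the 16-bit pattern encodes a NaN (all-ones exponent, nonzero fraction)."""
    return (h & F16_EXP_MASK) == F16_EXP_MASK and (h & F16_FRAC_MASK) != 0


def round_trip_via_f32(h):
    """Hypothetical model of load -> extend to f32 -> trunc to f16 -> store.

    Assumption: the conversion quiets NaNs. Every non-NaN f16 value is
    exactly representable in f32, so it comes back unchanged.
    """
    if is_nan_bits(h):
        return h | F16_QUIET_BIT
    return h


def round_trip_as_i16(h):
    """Model of moving the value as an opaque 16-bit integer: bits are untouched."""
    return h & 0xFFFF


signaling_nan = 0x7D00  # NaN with the quiet bit clear
one = 0x3C00            # 1.0 in binary16

print(hex(round_trip_via_f32(signaling_nan)))  # quiet bit now set
print(hex(round_trip_as_i16(signaling_nan)))   # bits preserved
```

Under that assumption only NaN payloads are affected, which is enough to make a bit-exact load/store round trip fail.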

Fixes the AVR part of #97975
Fixes the AVR part of #97981

@tgross35
Contributor Author

tgross35 commented Aug 8, 2025

This is based on top of #152708, which should land first; the second commit here is the relevant part.

@benshi001 could you review this as well? Cc @Patryk27

; CHECK-NEXT: mov r30, r22
; CHECK-NEXT: mov r31, r23
; CHECK-NEXT: std Z+1, r25
; CHECK-NEXT: st Z, r24
; CHECK-NEXT: ret
store half %x, ptr %p
ret void
Contributor Author

I was unclear about where the existing CC came from but I think it assumed that the f16 was stored in the lower half of an f32, which would be passed in r22..=r25. Now it is just using the i16 ABI, which makes more sense.

https://llvm.godbolt.org/z/f6Gfn9TPb
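
To make the register difference concrete, here is a minimal sketch of the argument allocation, assuming the avr-gcc convention as I understand it: arguments are assigned downward from r25, with each size rounded up to an even number of bytes. The function name is hypothetical and stack spilling is deliberately not modeled.

```python
def assign_arg_registers(arg_sizes):
    """Return a (low_reg, high_reg) pair for each argument size in bytes.

    Assumed rule (avr-gcc style): start just above r25 and move down,
    rounding every argument up to an even size. Spilling to the stack is
    not modeled; the function raises instead.
    """
    next_reg = 26
    assigned = []
    for size in arg_sizes:
        size += size & 1  # round odd sizes up to even
        next_reg -= size
        if next_reg < 8:
            raise ValueError("argument would spill to the stack (not modeled)")
        assigned.append((next_reg, next_reg + size - 1))
    return assigned


# half passed with the i16 ABI, followed by a 2-byte pointer
print(assign_arg_registers([2, 2]))  # [(24, 25), (22, 23)]

# half carried in an f32-sized slot, followed by a 2-byte pointer
print(assign_arg_registers([4, 2]))  # [(22, 25), (20, 21)]
```

The first case matches the updated test output, where the pointer arrives in r22:r23 and the half value is stored from r24:r25.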

; CHECK-NEXT: ldd r25, Z+1
; CHECK-NEXT: mov r30, r22
; CHECK-NEXT: mov r31, r23
; CHECK-NEXT: std Z+1, r25
Contributor

huh, kinda funny we first store the upper-byte and then the lower-byte, didn't realize it before (doesn't seem to be a bug though 🤔)

Contributor

@Patryk27 Patryk27 left a comment

cc @benshi001 as well

@tgross35 tgross35 force-pushed the avr-soft-promote-half branch 2 times, most recently from 31904ba to fabf0e6 Compare August 9, 2025 08:27
@benshi001
Member

AVR is a quite simple 8-bit CPU; it lacks a hardware float unit and a vector unit. In real production environments, clang-AVR relies on libgcc-7.3, which also lacks float16 infrastructure. In fact, avr-gcc 7.3 even treats double as float, as indicated in https://gcc.gnu.org/wiki/avr-gcc.

So I think the best way is to also map the half type to float32, to fit avr-gcc-7.3 (and its libgcc), which is the mainstream version in real production environments.

However, we cannot prevent hand-written double / half in LLVM IR, so I am not sure how to deal with that; please let me give it careful consideration.

@tgross35
Contributor Author

tgross35 commented Aug 9, 2025

I just replied at #97975 but I'll give some more details here:

> However, we cannot prevent hand-written double / half in LLVM IR, so I am not sure how to deal with that; please let me give it careful consideration.

For context, I'm working on Rust's f16 type, which has the semantics of IEEE binary16, so we do want LLVM's half type with those same semantics (promoting to f32 is not an option). We also don't really need to be concerned about the missing support in libgcc since we bring our own runtime libs to fill in whatever it is missing. I think _Float16 support is just a case where LLVM is a bit ahead of GCC.

Currently for AVR, you can use half but the generated code is pretty broken: #97981 makes it difficult to implement the intrinsics (just taking an f16 parameter becomes a recursive libcall). This unbreaks that and at least lets us build the intrinsics to experiment more.

> So I think the best way is to also map the half type to float32, to fit avr-gcc-7.3 (and its libgcc), which is the mainstream version in real production environments.

You are talking about the Clang frontend here, right? Something like C23 _Float16x? LLVM's half is fixed-size, as are Rust and Zig's f16 and C23's _Float16 (if that ever winds up supported).

@benshi001 benshi001 self-requested a review August 10, 2025 02:22
Member

@benshi001 benshi001 left a comment

LGTM

The default `half` legalization has some issues with quieting NaNs and
carrying excess precision. As has been done for various other targets,
update AVR to use `softPromoteHalfType` which avoids these issues.

The most obvious corrected test below is `test_load_store`, which no
longer contains calls to extend and trunc (which would cause
roundtripping to fail).
@tgross35 tgross35 force-pushed the avr-soft-promote-half branch from fabf0e6 to 4e3ca49 Compare August 10, 2025 02:32
@benshi001
Member

benshi001 commented Aug 10, 2025

> I just replied at #97975 but I'll give some more details here:
>
> > However, we cannot prevent hand-written double / half in LLVM IR, so I am not sure how to deal with that; please let me give it careful consideration.
>
> For context, I'm working on Rust's f16 type, which has the semantics of IEEE binary16, so we do want LLVM's half type with those same semantics (promoting to f32 is not an option). We also don't really need to be concerned about the missing support in libgcc since we bring our own runtime libs to fill in whatever it is missing. I think _Float16 support is just a case where LLVM is a bit ahead of GCC.
>
> Currently for AVR, you can use half but the generated code is pretty broken: #97981 makes it difficult to implement the intrinsics (just taking an f16 parameter becomes a recursive libcall). This unbreaks that and at least lets us build the intrinsics to experiment more.
>
> > So I think the best way is to also map the half type to float32, to fit avr-gcc-7.3 (and its libgcc), which is the mainstream version in real production environments.
>
> You are talking about the Clang frontend here, right? Something like C23 _Float16x? LLVM's half is fixed-size, as are Rust and Zig's f16 and C23's _Float16 (if that ever winds up supported).

I saw pairs of

; CHECK-NEXT:    rcall __truncsfhf2
; CHECK-NEXT:    rcall __extendhfsf2

generated after this PR. Though this is less efficient, it is better than broken.

@tgross35
Contributor Author

tgross35 commented Aug 10, 2025

Thank you for reviewing!

> I saw pairs of
>
> ; CHECK-NEXT:    rcall __truncsfhf2
> ; CHECK-NEXT:    rcall __extendhfsf2
>
> generated after this PR. Though this is less efficient, it is better than broken.

Indeed: this changes from "unusably broken+slow" to "working but possibly slower". Really this just makes it possible to do future work.

half does need to round back to f16 after every op to get the right IEEE semantics if it does the work as f32 (the bfloat type doesn't have to). This could be better: promoting to an f32 for all operations doesn't really make sense on targets that don't have hardware f32. Eventually I'm planning to add the option of just using __addhf3 and similar instead, but for now I'm just trying to get all targets to work.
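
A small worked example of why the rounding after every op matters. This is only a sketch: it uses Python's double as the wider type in place of f32 (the effect is the same for these values), and `round_to_f16` and `add_f16` are hypothetical helpers built on the `struct` module's binary16 format.

```python
import struct


def round_to_f16(x):
    """Round a Python float to the nearest binary16 value (ties to even)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def add_f16(a, b):
    """Add in the wider type, then round the result back to binary16."""
    return round_to_f16(a + b)


a, b, c = 2048.0, 1.0, 1.0

# Round after every operation: 2049 is not representable in binary16
# (the spacing at this magnitude is 2), so each +1 is rounded away.
rounded_each_op = add_f16(add_f16(a, b), c)

# Carry excess precision and round only once at the end.
rounded_once = round_to_f16((a + b) + c)

print(rounded_each_op)  # 2048.0
print(rounded_once)     # 2050.0
```

The two results differ, so keeping intermediate values in the wider type changes what the program observes compared with true binary16 arithmetic.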

@benshi001 benshi001 merged commit 733fddb into llvm:main Aug 10, 2025
9 checks passed
@tgross35 tgross35 deleted the avr-soft-promote-half branch August 10, 2025 03:47