Commit 6fe846d

krishna2803 authored and asl committed
add: "Bfloat16 in LLVM libc" blog post
Signed-off-by: Krishna Pandey <[email protected]>
1 parent 3d07f43 commit 6fe846d

1 file changed

+58
-0
lines changed

@@ -0,0 +1,58 @@
1+
---
2+
author: "Krishna Pandey"
3+
date: "2025-09-10"
4+
tags: ["GSoC", "libc", "math", "c++23"]
5+
title: "GSoC 2025: Bfloat16 in LLVM libc"
6+
---
7+
8+
9+
## Introduction
10+
BFloat16 is a 16-bit floating-point format introduced by Google and standardized in C++23 as `std::bfloat16_t`. It uses 1 sign bit, 8 exponent bits (the same as `float`), and 7 mantissa bits. This layout gives BFloat16 a much wider dynamic range than IEEE `binary16` (a largest finite value of about 3.39×10^38, compared to 65,504), at the cost of lower precision. BFloat16 has become popular in AI and machine learning workloads, where it offers significant performance advantages.
11+
12+
The goal of this project was to implement the BFloat16 type in LLVM libc along with the basic math functions like `fabsbf16`, `fmaxbf16`, etc.
13+
We also wanted all functions to be generic, platform-independent, and correctly rounded in all rounding modes.
14+
15+
## What was done
16+
17+
- The BFloat16 type was added to LLVM libc (`libc/src/__support/FPUtil/bfloat16.h`) in [#144463](https://github.com/llvm/llvm-project/pull/144463).
18+
- All 70 expected basic math functions for `bfloat16` were implemented using a generic approach that works on every target LLVM libc supports (ARM, RISC-V, GPUs, x86, Darwin); see the table below.
19+
- Two additional basic math functions were implemented: `iscanonicalbf16` and `issignalingbf16`.
20+
- Two higher math functions were implemented: `sqrtbf16` in [#156654](https://github.com/llvm/llvm-project/pull/156654) and `log_bf16` in [#157811](https://github.com/llvm/llvm-project/pull/157811) (open).
21+
- Comparison operations for the `FPBits` class were added in [#144983](https://github.com/llvm/llvm-project/pull/144983).
22+
23+
| Basic Math Function | PR |
24+
|----------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------|
25+
| `fabsbf16` | [#148398](https://github.com/llvm/llvm-project/pull/148398) |
26+
| `ceilbf16`, `floorbf16`, `roundbf16`, `roundevenbf16`, `truncbf16` | [#152352](https://github.com/llvm/llvm-project/pull/152352) |
27+
| `bf16add`, `bf16addf`, `bf16addl`, `bf16addf128`, `bf16sub`, `bf16subf`, `bf16subl`, `bf16subf128` | [#152774](https://github.com/llvm/llvm-project/pull/152774) |
28+
| `fmaxbf16`, `fminbf16` | [#152782](https://github.com/llvm/llvm-project/pull/152782) |
29+
| `bf16mul`, `bf16mulf`, `bf16mull`, `bf16mulf128` | [#152847](https://github.com/llvm/llvm-project/pull/152847) |
30+
| `fmaximumbf16`, `fmaximum_magbf16`, `fmaximum_mag_numbf16`, `fmaximum_numbf16`, `fminimumbf16`, `fminimum_magbf16`, `fminimum_mag_numbf16`, `fminimum_numbf16` | [#152881](https://github.com/llvm/llvm-project/pull/152881) |
31+
| `bf16div`, `bf16divf`, `bf16divl`, `bf16divf128` | [#153191](https://github.com/llvm/llvm-project/pull/153191) |
32+
| `bf16fma`, `bf16fmaf`, `bf16fmal`, `bf16fmaf128` | [#153231](https://github.com/llvm/llvm-project/pull/153231) |
33+
| `llrintbf16`, `llroundbf16`, `lrintbf16`, `lroundbf16`, `nearbyintbf16`, `rintbf16` | [#153882](https://github.com/llvm/llvm-project/pull/153882) |
34+
| `fromfpbf16`, `fromfpxbf16`, `ufromfpbf16`, `ufromfpxbf16` | [#153992](https://github.com/llvm/llvm-project/pull/153992) |
35+
| `nextafterbf16`, `nextdownbf16`, `nexttowardbf16`, `nextupbf16` | [#153993](https://github.com/llvm/llvm-project/pull/153993) |
36+
| `getpayloadbf16`, `setpayloadbf16`, `setpayloadsigbf16` | [#153994](https://github.com/llvm/llvm-project/pull/153994) |
37+
| `nanbf16` | [#153995](https://github.com/llvm/llvm-project/pull/153995) |
38+
| `frexpbf16`, `ilogbbf16`, `ldexpbf16`, `llogbbf16`, `logbbf16` | [#154427](https://github.com/llvm/llvm-project/pull/154427) |
39+
| `modfbf16`, `remainderbf16`, `remquobf16` | [#154652](https://github.com/llvm/llvm-project/pull/154652) |
40+
| `canonicalizebf16`, `iscanonicalbf16`, `issignalingbf16`, `copysignbf16`, `fdimbf16` | [#155567](https://github.com/llvm/llvm-project/pull/155567) |
41+
| `totalorderbf16`, `totalordermagbf16` | [#155568](https://github.com/llvm/llvm-project/pull/155568) |
42+
| `scalbnbf16`, `scalblnbf16` | [#155569](https://github.com/llvm/llvm-project/pull/155569) |
43+
| `fmodbf16` | [#155575](https://github.com/llvm/llvm-project/pull/155575) |
44+
45+
46+
## What was not done
47+
48+
- The implementation relied on a generic approach, so the compiler's built-in `__bf16` type was not used.
49+
- Hardware optimizations provided by Intel's [AVX-512_BF16](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#avx512techs=AVX512_BF16) were not used. These instructions only support round-to-nearest-even mode, always flush output denormals to zero, and treat input denormals as zero, which conflicts with our goal of correct rounding in all rounding modes. See the [VCVTNE2PS2BF16 instruction description](https://www.felixcloutier.com/x86/vcvtne2ps2bf16#description).
50+
- Not all higher math functions were implemented due to time constraints.
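The denormal behavior mentioned in the list above can be modeled in plain C++. The sketch below is only an illustration of what flushing to zero does to a BFloat16 bit pattern. It does not use the actual instructions, and `is_subnormal_bf16` and `flush_to_zero_bf16` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// A BFloat16 subnormal has a zero exponent field and a nonzero mantissa.
constexpr bool is_subnormal_bf16(std::uint16_t bits) {
  return (bits & 0x7F80u) == 0 && (bits & 0x007Fu) != 0;
}

// Model of flush-to-zero: a subnormal becomes a zero of the same sign,
// every other pattern is returned unchanged.
constexpr std::uint16_t flush_to_zero_bf16(std::uint16_t bits) {
  return is_subnormal_bf16(bits) ? static_cast<std::uint16_t>(bits & 0x8000u)
                                 : bits;
}

// Usage sketch: the smallest positive subnormal (2^-133) is lost,
// while a normal value such as 1.0 (0x3F80) passes through.
inline void flush_demo() {
  assert(flush_to_zero_bf16(0x0001) == 0x0000);
  assert(flush_to_zero_bf16(0x3F80) == 0x3F80);
}
```

A generic implementation that keeps subnormals would return `0x0001` unchanged in the first case, so results from the two approaches can differ near zero.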
51+
52+
## Future Work
53+
- Implement the remaining higher math functions.
54+
- Compare performance against other libc implementations once their `bfloat16` support is available.
55+
- Update the test suite when the `mpfr_get_bfloat16` function becomes available.
56+
57+
## Acknowledgements
58+
I would like to thank my mentors, Tue Ly and Nicolas Celik, for their invaluable guidance and support throughout this project. The project wouldn't have been possible without them. I am also grateful to the LLVM Foundation and the GSoC admins for giving me this opportunity.
