
Commit c25243b ("Some rewording in README")
Parent: 5830697

1 file changed: +9 −4 lines


README.md (9 additions, 4 deletions)
@@ -9,12 +9,15 @@
 ![GitHub Repo stars](https://img.shields.io/github/stars/KernelTuner/kernel_float?style=social)
 
 
-_Kernel Float_ is a header-only library for CUDA that simplifies working with vector and reduced precision types in GPU code.
+_Kernel Float_ is a header-only library for CUDA that simplifies working with vector types and reduced precision floating-point arithmetic in GPU code.
 
-CUDA offers several reduced precision floating-point types (`__half`, `__nv_bfloat16`, `__nv_fp8_e4m3`, `__nv_fp8_e5m2`)
+
+## Summary
+
+CUDA natively offers several reduced precision floating-point types (`__half`, `__nv_bfloat16`, `__nv_fp8_e4m3`, `__nv_fp8_e5m2`)
 and vector types (e.g., `__half2`, `__nv_fp8x4_e4m3`, `float3`).
 However, working with these types is cumbersome:
-mathematical operations require intrinsics (e.g., `__hadd2(x, y)` adds two `__half2`),
+mathematical operations require intrinsics (e.g., `__hadd2` performs addition for `__half2`),
 type conversion is awkward (e.g., `__nv_cvt_halfraw2_to_fp8x2` converts float16 to float8),
 and some functionality is missing (e.g., one cannot convert a `__half` to `__nv_bfloat16`).

@@ -24,6 +27,8 @@ Internally, the data is stored using the most optimal type available, for exampl
 Operator overloading (like `+`, `*`, `&&`) has been implemented such that the most optimal intrinsic for the available types is selected automatically.
 Many mathematical functions (like `log`, `exp`, `sin`) and common operations (such as `sum`, `range`, `for_each`) are also available.
 
+By using this library, developers can avoid the complexity of working with reduced precision floating-point types in CUDA and focus on their applications.
+
 
 ## Features
 
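As a rough sketch of what the operator overloading and `sum` operation described in this hunk could look like in user code: the header name, `kf` namespace alias, and exact spellings of `vec`, `cast`, and `sum` below are assumptions inferred from the README's wording, and should be checked against the library's actual API before use.

```cuda
// Hedged sketch, not verified against the Kernel Float sources.
// Assumed names: kernel_float.h, kernel_float::vec, kf::cast, kf::sum.
#include "kernel_float.h"
namespace kf = kernel_float;

__global__ void squared_norm(const kf::vec<__half, 4>* input,
                             float* output, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // operator* is overloaded, so no per-type intrinsic is needed;
        // the library is meant to pick the best instruction itself.
        kf::vec<__half, 4> v = input[i] * input[i];
        // Widen to float and reduce the 4-element vector to a scalar.
        output[i] = kf::sum(kf::cast<float>(v));
    }
}
```

Compare this with the intrinsic-based style earlier in the diff: the arithmetic reads like ordinary scalar code while still operating on packed reduced-precision vectors.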

@@ -33,7 +38,7 @@ In a nutshell, _Kernel Float_ offers the following features:
 * Operator overloading to simplify programming.
 * Support for half (16 bit) and quarter (8 bit) floating-point precision.
 * Easy integration as a single header file.
-* Compatible with C++17.
+* Written for C++17.
 * Compatible with NVCC (NVIDIA Compiler) and NVRTC (NVIDIA Runtime Compilation).
 
 
