Description
When using ti.algorithms.PrefixSumExecutor, the internal kernel scan_add_inclusive triggers a warning from the AST transformer:
Warning: Casting range_for boundary values from f32 to i32, which may cause numerical issues.
This appears to be caused by the use of the standard division operator / inside a range() call in the library code, which returns a float in Python 3.
Steps to Reproduce
Run the following minimal script on a GPU architecture (CUDA/Vulkan):
import taichi as ti
ti.init(arch=ti.gpu)
# Minimal size to trigger the kernel
N = 1024
arr = ti.field(ti.i32, shape=N)
arr.fill(1)
# Initialize executor
prefix_sum = ti.algorithms.PrefixSumExecutor(N)
# The warning appears during the first run when JIT compilation happens
prefix_sum.run(arr)
print("Finished")
Observed Behavior
The code runs, but prints the following warning during compilation:
[Taichi] version 1.7.4, ...
...
site-packages\taichi\lang\ast\ast_transformer.py:63: Warning: Casting range_for boundary values from f32 to i32, which may cause numerical issues
warnings.warn(
Source of the Bug
The issue seems to be in taichi/_kernels.py inside the scan_add_inclusive function.
Specifically, this loop:
# Inter-warp scan, use the first thread in the first warp
if warp_id == 0 and lane_id == 0:
    for k in range(1, BLOCK_SZ / WARP_SZ):  # <--- CAUSE: returns float (e.g. 2.0)
        pad_shared[k] += pad_shared[k - 1]
Since BLOCK_SZ (64) and WARP_SZ (32) are integers, BLOCK_SZ / WARP_SZ returns 2.0 (float), which triggers the AST warning when passed to range.
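The type difference can be checked in plain Python, outside any Taichi kernel. This is a minimal sketch that reuses the BLOCK_SZ and WARP_SZ values quoted above; the other variable names are illustrative only. Note that plain Python rejects a float bound in range() outright, whereas inside a Taichi kernel the bound is cast with the warning shown earlier.

```python
# Values quoted in this report for the constants used by scan_add_inclusive.
BLOCK_SZ = 64
WARP_SZ = 32

true_div = BLOCK_SZ / WARP_SZ    # true division: always a float in Python 3
floor_div = BLOCK_SZ // WARP_SZ  # floor division: stays an int for int operands

print(type(true_div).__name__, true_div)
print(type(floor_div).__name__, floor_div)

# Plain Python refuses a float as a range() bound.
try:
    range(1, true_div)
except TypeError as exc:
    print("TypeError:", exc)

# With floor division the loop bound is an int and the loop visits k = 1 only.
iterations = list(range(1, floor_div))
print(iterations)
```

Because 64 is an exact multiple of 32, both operators give the same numeric value here, so switching to // should not change how many iterations the loop performs.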
Environment
- Taichi Version: 1.7.4
- Python Version: 3.12.10
- OS: Windows
- Arch: CUDA
Suggested Fix
Replace the division operator / with the integer division operator //:
# Change this:
for k in range(1, BLOCK_SZ / WARP_SZ):
# To this:
for k in range(1, BLOCK_SZ // WARP_SZ):
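Until the library is patched, the warning could be hidden on the user side with the standard warnings module. This is only a sketch of a workaround, not a fix: it assumes the message text matches the log quoted in this report, and the helper name suppress_range_cast_warning is hypothetical. The second half of the block simulates the warning with warnings.warn so that it runs without Taichi installed.

```python
import warnings

# Assumption: the warning message starts with this text, as copied from the
# log in this report. Other Taichi versions may word it differently.
PATTERN = r"Casting range_for boundary values"


def suppress_range_cast_warning():
    """Hypothetical helper: ignore warnings whose message starts with PATTERN."""
    warnings.filterwarnings("ignore", message=PATTERN)


# Self-contained check using simulated warnings instead of a Taichi kernel.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    suppress_range_cast_warning()
    warnings.warn(
        "Casting range_for boundary values from f32 to i32, "
        "which may cause numerical issues"
    )
    warnings.warn("some other warning")

# Only the unrelated warning should have been recorded.
remaining = [str(w.message) for w in caught]
print(remaining)
```

In a real script the helper would have to be called before the first prefix_sum.run(arr), since the warning is raised during JIT compilation. It only hides the symptom; the float loop bound in the library code stays as it is.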