Commit 6c2ed1b

nontemporal vector loads
1 parent 3074009 commit 6c2ed1b

1 file changed: +1 -1 lines changed
tensorforge/backend/instructions/memory/load.py

Lines changed: 1 addition & 1 deletion
@@ -362,7 +362,7 @@ def gen_code_inner(self, writer: Writer) -> None:
     sidx = i // lead_count
     ridx = i % lead_count
     index = sidx * lead_size + ridx * self._num_threads
-    writer(f'const auto v{i} = *(tensorforge::VectorT<{prec}, {g}>*)&{self._src.name}[{index} + f{g}idx];')
+    writer(f'const auto v{i} = __builtin_nontemporal_load((tensorforge::VectorT<{prec}, {g}>*)&{self._src.name}[{index} + f{g}idx]);')

     args2 = ', '.join(f'v{i}[{k}]' for k in range(g))
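
To make the effect of the change concrete, the following is a minimal sketch of what the generator emits before and after this commit. It is not the actual `load.py` code: the helper names `load_index` and `emit_vector_load` and the sample values (`float`, vector width 4, source name `A`, `lead_count = 2`, `lead_size = 64`, 32 threads) are hypothetical and chosen only for illustration. The only external fact assumed is that `__builtin_nontemporal_load` is the Clang builtin that takes a pointer and returns the loaded value.

```python
def load_index(i, lead_count, lead_size, num_threads):
    """Offset arithmetic from the diff context (sample inputs are hypothetical)."""
    sidx = i // lead_count
    ridx = i % lead_count
    return sidx * lead_size + ridx * num_threads


def emit_vector_load(i, prec, g, src_name, index, nontemporal=True):
    """Build the generated C++ line for one vector load.

    nontemporal=False reproduces the removed line (plain dereference);
    nontemporal=True reproduces the added line (builtin wrapped around the pointer).
    """
    ptr = f'(tensorforge::VectorT<{prec}, {g}>*)&{src_name}[{index} + f{g}idx]'
    if nontemporal:
        return f'const auto v{i} = __builtin_nontemporal_load({ptr});'
    return f'const auto v{i} = *{ptr};'


# Hypothetical sample values, not taken from the repository.
index = load_index(i=3, lead_count=2, lead_size=64, num_threads=32)
old_line = emit_vector_load(3, 'float', 4, 'A', index, nontemporal=False)
new_line = emit_vector_load(3, 'float', 4, 'A', index)

print(old_line)
print(new_line)
```

The pointer expression is identical in both variants. The commit only swaps the plain dereference for the builtin, so the address computation and the later `v{i}[{k}]` element accesses are unaffected.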
