
Commit d5624bb

fenghusthu authored and ctmarinas committed
asm-generic: introduce io_stop_wc() and add implementation for ARM64
For memory accesses with write-combining attributes (e.g. those returned
by ioremap_wc()), the CPU may wait for prior accesses to be merged with
subsequent ones. But in some situations, such a wait is bad for
performance.

We introduce io_stop_wc() to prevent the merging of write-combining
memory accesses before this macro with those after it.

We add an implementation for ARM64 using the DGH instruction and provide
a NOP implementation for other architectures.

Signed-off-by: Xiongfeng Wang <[email protected]>
Suggested-by: Will Deacon <[email protected]>
Suggested-by: Catalin Marinas <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
1 parent c2c529b commit d5624bb


3 files changed: 28 additions & 0 deletions


Documentation/memory-barriers.txt

Lines changed: 8 additions & 0 deletions
@@ -1950,6 +1950,14 @@ There are some more advanced barrier functions:
      For load from persistent memory, existing read memory barriers are sufficient
      to ensure read ordering.
 
+ (*) io_stop_wc();
+
+     For memory accesses with write-combining attributes (e.g. those returned
+     by ioremap_wc()), the CPU may wait for prior accesses to be merged with
+     subsequent ones. io_stop_wc() can be used to prevent the merging of
+     write-combining memory accesses before this macro with those after it when
+     such wait has performance implications.
+
 ===============================
 IMPLICIT KERNEL MEMORY BARRIERS
 ===============================
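
The documentation text above describes when io_stop_wc() is useful; the following is a minimal userspace sketch of the calling pattern, not kernel code. The helper name post_descriptor() and the plain array standing in for an ioremap_wc() mapping are assumptions made for illustration, and the macro is defined locally as the no-op fallback so the snippet builds outside the kernel. Going by the description above, io_stop_wc() only stops merging across the call site; the sketch does not rely on it for ordering.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the kernel macro (assumption: no-op fallback form). */
#ifndef io_stop_wc
#define io_stop_wc() do { } while (0)
#endif

/*
 * Hypothetical helper: copy a descriptor into a write-combining window,
 * then ask the CPU not to hold these stores back waiting for later ones
 * to merge with them.
 */
static inline void post_descriptor(volatile uint64_t *wc_window,
				   const uint64_t *desc, size_t nwords)
{
	size_t i;

	for (i = 0; i < nwords; i++)
		wc_window[i] = desc[i];

	/* Stores above should not be merged with stores issued after this. */
	io_stop_wc();
}
```

In a real driver the window would come from ioremap_wc() and the stores would go through the kernel's I/O accessors; those details are left out here on purpose.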

arch/arm64/include/asm/barrier.h

Lines changed: 9 additions & 0 deletions
@@ -26,6 +26,14 @@
 #define __tsb_csync() asm volatile("hint #18" : : : "memory")
 #define csdb() asm volatile("hint #20" : : : "memory")
 
+/*
+ * Data Gathering Hint:
+ * This instruction prevents merging memory accesses with Normal-NC or
+ * Device-GRE attributes before the hint instruction with any memory accesses
+ * appearing after the hint instruction.
+ */
+#define dgh() asm volatile("hint #6" : : : "memory")
+
 #ifdef CONFIG_ARM64_PSEUDO_NMI
 #define pmr_sync() \
 do { \
@@ -46,6 +54,7 @@
 #define dma_rmb() dmb(oshld)
 #define dma_wmb() dmb(oshst)
 
+#define io_stop_wc() dgh()
 
 #define tsb_csync() \
 do { \
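
To try the arm64 definition outside the kernel, the hunk above can be approximated as follows. This is a sketch under stated assumptions: it assumes GCC or Clang inline assembly, the compiler-barrier fallback for non-AArch64 targets is a stand-in chosen here (the kernel's generic fallback is an empty statement instead), and store_then_stop_wc() is a hypothetical helper. As far as I understand, "hint #6" sits in the hint space and executes as a NOP on cores that do not implement DGH, which is consistent with the commit using it unconditionally.

```c
#include <stdint.h>

/*
 * Assumption: GCC or Clang. On AArch64, "hint #6" is the DGH encoding
 * used by the commit; elsewhere a compiler barrier is used as a stand-in.
 */
#if defined(__aarch64__)
#define dgh() __asm__ volatile("hint #6" : : : "memory")
#else
#define dgh() __asm__ volatile("" : : : "memory")
#endif

#define io_stop_wc() dgh()

/*
 * Hypothetical helper: store one word, then hint that it should not be
 * merged with later accesses. Returns the value read back from the slot.
 */
static inline uint32_t store_then_stop_wc(volatile uint32_t *slot,
					  uint32_t value)
{
	*slot = value;
	io_stop_wc();
	return *slot;
}
```

On ordinary cacheable memory, as in a userspace test, the hint has no visible effect; it only matters for Normal-NC or Device-GRE mappings such as those the commit message mentions.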

include/asm-generic/barrier.h

Lines changed: 11 additions & 0 deletions
@@ -251,5 +251,16 @@ do { \
 #define pmem_wmb() wmb()
 #endif
 
+/*
+ * ioremap_wc() maps I/O memory as memory with write-combining attributes. For
+ * this kind of memory accesses, the CPU may wait for prior accesses to be
+ * merged with subsequent ones. In some situation, such wait is bad for the
+ * performance. io_stop_wc() can be used to prevent the merging of
+ * write-combining memory accesses before this macro with those after it.
+ */
+#ifndef io_stop_wc
+#define io_stop_wc() do { } while (0)
+#endif
+
 #endif /* !__ASSEMBLY__ */
 #endif /* __ASM_GENERIC_BARRIER_H */
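
The generic fallback relies on two common preprocessor idioms: the #ifndef guard lets an architecture header supply its own io_stop_wc() first, and the function-like do { } while (0) body lets the call behave as a single statement. A minimal sketch of why that matters, with the hypothetical function stop_wc_if() introduced only for illustration:

```c
#include <stdbool.h>

/*
 * An architecture header would define io_stop_wc() before this point.
 * Assumption for this sketch: none has, so the no-op fallback is used.
 */
#ifndef io_stop_wc
#define io_stop_wc() do { } while (0)
#endif

/*
 * Hypothetical caller: the macro is used in an unbraced if/else, which
 * only parses correctly because the expansion is exactly one statement
 * terminated by the caller's semicolon.
 */
static int stop_wc_if(bool latency_sensitive)
{
	int result = 0;

	if (latency_sensitive)
		io_stop_wc();
	else
		result = -1;

	return result;
}
```

An object-like definition without the parentheses would expand io_stop_wc(); to do { } while (0)(); and fail to compile, which is why the function-like form is needed.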
