Skip to content

Commit 6455b61

Browse files
Zheng Yejianrostedt
authored andcommitted
ring-buffer: Fix race while reader and writer are on the same page
When user reads file 'trace_pipe', kernel keeps printing following logs that warn at "cpu_buffer->reader_page->read > rb_page_size(reader)" in rb_get_reader_page(). It just looks like there's an infinite loop in tracing_read_pipe(). This problem occurs several times on arm64 platform when testing v5.10 and below. Call trace: rb_get_reader_page+0x248/0x1300 rb_buffer_peek+0x34/0x160 ring_buffer_peek+0xbc/0x224 peek_next_entry+0x98/0xbc __find_next_entry+0xc4/0x1c0 trace_find_next_entry_inc+0x30/0x94 tracing_read_pipe+0x198/0x304 vfs_read+0xb4/0x1e0 ksys_read+0x74/0x100 __arm64_sys_read+0x24/0x30 el0_svc_common.constprop.0+0x7c/0x1bc do_el0_svc+0x2c/0x94 el0_svc+0x20/0x30 el0_sync_handler+0xb0/0xb4 el0_sync+0x160/0x180 Then I dump the vmcore and look into the problematic per_cpu ring_buffer, I found that tail_page/commit_page/reader_page are on the same page while reader_page->read is obviously abnormal: tail_page == commit_page == reader_page == { .write = 0x100d20, .read = 0x8f9f4805, // Far greater than 0xd20, obviously abnormal!!! .entries = 0x10004c, .real_end = 0x0, .page = { .time_stamp = 0x857257416af0, .commit = 0xd20, // This page hasn't been full filled. // .data[0...0xd20] seems normal. } } The root cause is most likely the race that reader and writer are on the same page while reader saw an event that not fully committed by writer. To fix this, add memory barriers to make sure the reader can see the content of what is committed. Since commit a0fcaae ("ring-buffer: Fix race between reset page and reading page") has added the read barrier in rb_get_reader_page(), here we just need to add the write barrier. Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Cc: [email protected] Fixes: 77ae365 ("ring-buffer: make lockless") Suggested-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Zheng Yejian <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
1 parent 4ccf11c commit 6455b61

File tree

1 file changed

+12
-1
lines changed

1 file changed

+12
-1
lines changed

kernel/trace/ring_buffer.c

Lines changed: 12 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3098,6 +3098,10 @@ rb_set_commit_to_write(struct ring_buffer_per_cpu *cpu_buffer)
30983098
if (RB_WARN_ON(cpu_buffer,
30993099
rb_is_reader_page(cpu_buffer->tail_page)))
31003100
return;
3101+
/*
3102+
* No need for a memory barrier here, as the update
3103+
* of the tail_page did it for this page.
3104+
*/
31013105
local_set(&cpu_buffer->commit_page->page->commit,
31023106
rb_page_write(cpu_buffer->commit_page));
31033107
rb_inc_page(&cpu_buffer->commit_page);
@@ -3107,6 +3111,8 @@ rb_set_commit_to_write(struct ring_buffer_per_cpu *cpu_buffer)
31073111
while (rb_commit_index(cpu_buffer) !=
31083112
rb_page_write(cpu_buffer->commit_page)) {
31093113

3114+
/* Make sure the readers see the content of what is committed. */
3115+
smp_wmb();
31103116
local_set(&cpu_buffer->commit_page->page->commit,
31113117
rb_page_write(cpu_buffer->commit_page));
31123118
RB_WARN_ON(cpu_buffer,
@@ -4684,7 +4690,12 @@ rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer)
46844690

46854691
/*
46864692
* Make sure we see any padding after the write update
4687-
* (see rb_reset_tail())
4693+
* (see rb_reset_tail()).
4694+
*
4695+
* In addition, a writer may be writing on the reader page
4696+
* if the page has not been fully filled, so the read barrier
4697+
* is also needed to make sure we see the content of what is
4698+
* committed by the writer (see rb_set_commit_to_write()).
46884699
*/
46894700
smp_rmb();
46904701

0 commit comments

Comments
 (0)