Commit 89635ea

dhowells authored and brauner committed
netfs: Fix race between cache write completion and ALL_QUEUED being set
When netfslib is issuing subrequests, the subrequests start processing
immediately and may complete before we reach the end of the issuing
function.  At the end of the issuing function we set NETFS_RREQ_ALL_QUEUED
to indicate to the collector that we aren't going to issue any more
subreqs and that it can do the final notifications and cleanup.

Now, this isn't a problem if the request is synchronous
(NETFS_RREQ_OFFLOAD_COLLECTION is unset) as the result collection will be
done in-thread and we're guaranteed an opportunity to run the collector.

However, if the request is asynchronous, collection is primarily triggered
by the termination of subrequests queuing it on a workqueue.  Now, a race
can occur here if the app thread sets ALL_QUEUED after the last subrequest
terminates.

This can happen most easily with the copy2cache code (as used by Ceph)
where, in the collection routine of a read request, an asynchronous write
request is spawned to copy data to the cache.  Folios are added to the
write request as they're unlocked, but there may be a delay before
ALL_QUEUED is set as the write subrequests may complete before we get
there.

If all the write subreqs have finished by the ALL_QUEUED point, no further
events happen and the collection never happens, leaving the request
hanging.

Fix this by queuing the collector after setting ALL_QUEUED.  This is a bit
heavy-handed and it may be sufficient to do it only if there are no extant
subreqs.

Also add a tracepoint to cross-reference both requests in a copy-to-cache
operation and add a trace to the netfs_rreq tracepoint to indicate the
setting of ALL_QUEUED.

Fixes: e2d46f2 ("netfs: Change the read result collector to only use one work item")
Reported-by: Max Kellermann <[email protected]>
Link: https://lore.kernel.org/r/CAKPOu+8z_ijTLHdiCYGU_Uk7yYD=shxyGLwfe-L7AV3DhebS3w@mail.gmail.com/
Signed-off-by: David Howells <[email protected]>
Link: https://lore.kernel.org/[email protected]
Reviewed-by: Paulo Alcantara (Red Hat) <[email protected]>
cc: Paulo Alcantara <[email protected]>
cc: Viacheslav Dubeyko <[email protected]>
cc: Alex Markuze <[email protected]>
cc: Ilya Dryomov <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
Signed-off-by: Christian Brauner <[email protected]>
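To make the lost-wakeup window concrete, below is a minimal userspace
sketch of the race and the fix.  It is a hypothetical analogue, not
netfslib code: pthreads and C11 atomics stand in for the kernel's
workqueue and wait machinery, and all names (pending, all_queued,
wake_collector, etc.) are invented for illustration.

/*
 * Hypothetical userspace analogue of the race (not netfslib code):
 * a subrequest terminating kicks the collector, but the collector can
 * only finish once all_queued is set AND nothing is pending.  If the
 * last subrequest finishes before all_queued is set, nobody kicks the
 * collector again, unless the issuer kicks it itself after setting
 * the flag, which is what the fix does.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_int pending = 1;      /* subrequests still in flight */
static atomic_bool all_queued;      /* analogue of NETFS_RREQ_ALL_QUEUED */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t kick = PTHREAD_COND_INITIALIZER;
static bool collect_pending;        /* "collection work is queued" */

static void wake_collector(void)    /* analogue of queuing the collector */
{
	pthread_mutex_lock(&lock);
	collect_pending = true;
	pthread_cond_signal(&kick);
	pthread_mutex_unlock(&lock);
}

static void *subrequest(void *arg)
{
	atomic_fetch_sub(&pending, 1);  /* the subrequest terminates... */
	wake_collector();               /* ...and queues the collector */
	return NULL;
}

static void *collector(void *arg)
{
	pthread_mutex_lock(&lock);
	for (;;) {
		while (!collect_pending)
			pthread_cond_wait(&kick, &lock);
		collect_pending = false;
		/* Final notification/cleanup only happens once everything
		 * is both queued and complete. */
		if (atomic_load(&all_queued) && atomic_load(&pending) == 0)
			break;
		/* Otherwise sleep until the next kick.  Pre-fix, no
		 * further kick ever arrived in the racy ordering below. */
	}
	pthread_mutex_unlock(&lock);
	puts("request collected");
	return NULL;
}

int main(void)
{
	pthread_t c, s;

	pthread_create(&c, NULL, collector, NULL);
	pthread_create(&s, NULL, subrequest, NULL);
	pthread_join(s, NULL);            /* force the racy ordering: the */
	atomic_store(&all_queued, true);  /* last subreq ends, THEN the   */
	wake_collector();                 /* flag is set; the fix is this */
	pthread_join(c, NULL);            /* explicit post-flag wakeup.   */
	return 0;
}

Dropping the final wake_collector() call in main() reproduces the hang:
the collector's last wakeup arrives before all_queued is set, so it goes
back to sleep with no further kick ever coming.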
1 parent: 4c238e3

2 files changed, 34 insertions(+), 0 deletions(-)

fs/netfs/read_pgpriv2.c (4 additions, 0 deletions)

@@ -111,6 +111,7 @@ static struct netfs_io_request *netfs_pgpriv2_begin_copy_to_cache(
 		goto cancel_put;
 
 	__set_bit(NETFS_RREQ_OFFLOAD_COLLECTION, &creq->flags);
+	trace_netfs_copy2cache(rreq, creq);
 	trace_netfs_write(creq, netfs_write_trace_copy_to_cache);
 	netfs_stat(&netfs_n_wh_copy_to_cache);
 	rreq->copy_to_cache = creq;
@@ -155,6 +156,9 @@ void netfs_pgpriv2_end_copy_to_cache(struct netfs_io_request *rreq)
 	netfs_issue_write(creq, &creq->io_streams[1]);
 	smp_wmb(); /* Write lists before ALL_QUEUED. */
 	set_bit(NETFS_RREQ_ALL_QUEUED, &creq->flags);
+	trace_netfs_rreq(rreq, netfs_rreq_trace_end_copy_to_cache);
+	if (list_empty_careful(&creq->io_streams[1].subrequests))
+		netfs_wake_collector(creq);
 
 	netfs_put_request(creq, netfs_rreq_trace_put_return);
 	creq->copy_to_cache = NULL;
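A note on the fix as implemented: rather than waking the collector
unconditionally (the "heavy-handed" option the commit message mentions),
netfs_wake_collector() is only called when the cache-write stream's
subrequest list is already empty, which is exactly the case in which no
subrequest termination remains to trigger collection.  list_empty_careful()
permits that emptiness check to tolerate concurrent list updates without
holding a lock.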

include/trace/events/netfs.h (30 additions, 0 deletions)

@@ -55,6 +55,7 @@
 	EM(netfs_rreq_trace_copy,		"COPY   ")	\
 	EM(netfs_rreq_trace_dirty,		"DIRTY  ")	\
 	EM(netfs_rreq_trace_done,		"DONE   ")	\
+	EM(netfs_rreq_trace_end_copy_to_cache,	"END-C2C")	\
 	EM(netfs_rreq_trace_free,		"FREE   ")	\
 	EM(netfs_rreq_trace_ki_complete,	"KI-CMPL")	\
 	EM(netfs_rreq_trace_recollect,		"RECLLCT")	\
@@ -559,6 +560,35 @@ TRACE_EVENT(netfs_write,
 		  __entry->start, __entry->start + __entry->len - 1)
 	);
 
+TRACE_EVENT(netfs_copy2cache,
+	    TP_PROTO(const struct netfs_io_request *rreq,
+		     const struct netfs_io_request *creq),
+
+	    TP_ARGS(rreq, creq),
+
+	    TP_STRUCT__entry(
+		    __field(unsigned int, rreq)
+		    __field(unsigned int, creq)
+		    __field(unsigned int, cookie)
+		    __field(unsigned int, ino)
+		    ),
+
+	    TP_fast_assign(
+		    struct netfs_inode *__ctx = netfs_inode(rreq->inode);
+		    struct fscache_cookie *__cookie = netfs_i_cookie(__ctx);
+		    __entry->rreq = rreq->debug_id;
+		    __entry->creq = creq->debug_id;
+		    __entry->cookie = __cookie ? __cookie->debug_id : 0;
+		    __entry->ino = rreq->inode->i_ino;
+		    ),
+
+	    TP_printk("R=%08x CR=%08x c=%08x i=%x ",
+		      __entry->rreq,
+		      __entry->creq,
+		      __entry->cookie,
+		      __entry->ino)
+	    );
+
 TRACE_EVENT(netfs_collect,
 	    TP_PROTO(const struct netfs_io_request *wreq),
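For readers correlating traces: the new netfs_copy2cache event records the
debug IDs of the read request and the cache-write request it spawned, plus
the fscache cookie and inode number, so the two requests can be
cross-referenced in a trace.  Going by the TP_printk() format above, a
rendered event (with made-up IDs) would look roughly like:

    netfs_copy2cache: R=0000004a CR=0000004b c=00000003 i=2f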
