Skip to content

Commit ea4d453

Browse files
ps-ushankarkeithbusch
authored andcommitted
nvme: double KA polling frequency to avoid KATO with TBKAS on
With TBKAS on, the completion of one command can defer sending a keep alive for up to twice the delay between successive runs of nvme_keep_alive_work. The current delay of KATO / 2 thus makes it possible for one command to defer sending a keep alive for up to KATO, which can result in the controller detecting a KATO. The following trace demonstrates the issue, taking KATO = 8 for simplicity: 1. t = 0: run nvme_keep_alive_work, no keep-alive sent 2. t = ε: I/O completion seen, set comp_seen = true 3. t = 4: run nvme_keep_alive_work, see comp_seen == true, skip sending keep-alive, set comp_seen = false 4. t = 8: run nvme_keep_alive_work, see comp_seen == false, send a keep-alive command. Here, there is a delay of 8 - ε between receiving a command completion and sending the next command. With ε small, the controller is likely to detect a keep alive timeout. Fix this by running nvme_keep_alive_work with a delay of KATO / 4 whenever TBKAS is on. Going through the above trace now gives us a worst-case delay of 4 - ε, which is in line with the recommendation of sending a command every KATO / 2 in the NVMe specification. Reported-by: Costa Sapuntzakis <[email protected]> Reported-by: Randy Jennings <[email protected]> Signed-off-by: Uday Shankar <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Keith Busch <[email protected]>
1 parent 31a5978 commit ea4d453

File tree

1 file changed

+17
-1
lines changed

1 file changed

+17
-1
lines changed

drivers/nvme/host/core.c

Lines changed: 17 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1163,9 +1163,25 @@ EXPORT_SYMBOL_NS_GPL(nvme_passthru_end, NVME_TARGET_PASSTHRU);
11631163
* The host should send Keep Alive commands at half of the Keep Alive Timeout
11641164
* accounting for transport roundtrip times [..].
11651165
*/
1166+
static unsigned long nvme_keep_alive_work_period(struct nvme_ctrl *ctrl)
1167+
{
1168+
unsigned long delay = ctrl->kato * HZ / 2;
1169+
1170+
/*
1171+
* When using Traffic Based Keep Alive, we need to run
1172+
* nvme_keep_alive_work at twice the normal frequency, as one
1173+
* command completion can postpone sending a keep alive command
1174+
* by up to twice the delay between runs.
1175+
*/
1176+
if (ctrl->ctratt & NVME_CTRL_ATTR_TBKAS)
1177+
delay /= 2;
1178+
return delay;
1179+
}
1180+
11661181
static void nvme_queue_keep_alive_work(struct nvme_ctrl *ctrl)
11671182
{
1168-
queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ / 2);
1183+
queue_delayed_work(nvme_wq, &ctrl->ka_work,
1184+
nvme_keep_alive_work_period(ctrl));
11691185
}
11701186

11711187
static enum rq_end_io_ret nvme_keep_alive_end_io(struct request *rq,

0 commit comments

Comments
 (0)