Skip to content

Commit 1551ed5

Browse files
committed
Merge tag 'nvme-6.2-2022-12-29' of git://git.infradead.org/nvme into block-6.2
Pull NVMe fixes from Christoph: "nvme fixes for Linux 6.2 - fix various problems in handling the Command Supported and Effects log (Christoph Hellwig) - don't allow unprivileged passthrough of commands that don't transfer data but modify logical block content (Christoph Hellwig) - add a features and quirks policy document (Christoph Hellwig) - fix some really nasty code that was correct but made smatch complain (Sagi Grimberg)" * tag 'nvme-6.2-2022-12-29' of git://git.infradead.org/nvme: nvme-auth: fix smatch warning complaints nvme: consult the CSE log page for unprivileged passthrough nvme: also return I/O command effects from nvme_command_effects nvmet: don't defer passthrough commands with trivial effects to the workqueue nvmet: set the LBCC bit for commands that modify data nvmet: use NVME_CMD_EFFECTS_CSUPP instead of open coding it nvme: fix the NVME_CMD_EFFECTS_CSE_MASK definition docs, nvme: add a feature and quirk policy document
2 parents 88d356c + 76807fc commit 1551ed5

File tree

9 files changed

+159
-34
lines changed

9 files changed

+159
-34
lines changed

Documentation/maintainer/maintainer-entry-profile.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -104,3 +104,4 @@ to do something different in the near future.
104104
../riscv/patch-acceptance
105105
../driver-api/media/maintainer-entry-profile
106106
../driver-api/vfio-pci-device-specific-driver-acceptance
107+
../nvme/feature-and-quirk-policy
Lines changed: 77 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,77 @@
1+
.. SPDX-License-Identifier: GPL-2.0
2+
3+
=======================================
4+
Linux NVMe feature and and quirk policy
5+
=======================================
6+
7+
This file explains the policy used to decide what is supported by the
8+
Linux NVMe driver and what is not.
9+
10+
11+
Introduction
12+
============
13+
14+
NVM Express is an open collection of standards and information.
15+
16+
The Linux NVMe host driver in drivers/nvme/host/ supports devices
17+
implementing the NVM Express (NVMe) family of specifications, which
18+
currently consists of a number of documents:
19+
20+
- the NVMe Base specification
21+
- various Command Set specifications (e.g. NVM Command Set)
22+
- various Transport specifications (e.g. PCIe, Fibre Channel, RDMA, TCP)
23+
- the NVMe Management Interface specification
24+
25+
See https://nvmexpress.org/developers/ for the NVMe specifications.
26+
27+
28+
Supported features
29+
==================
30+
31+
NVMe is a large suite of specifications, and contains features that are only
32+
useful or suitable for specific use-cases. It is important to note that Linux
33+
does not aim to implement every feature in the specification. Every additional
34+
feature implemented introduces more code, more maintenance and potentially more
35+
bugs. Hence there is an inherent tradeoff between functionality and
36+
maintainability of the NVMe host driver.
37+
38+
Any feature implemented in the Linux NVMe host driver must support the
39+
following requirements:
40+
41+
1. The feature is specified in a release version of an official NVMe
42+
specification, or in a ratified Technical Proposal (TP) that is
43+
available on NVMe website. Or if it is not directly related to the
44+
on-wire protocol, does not contradict any of the NVMe specifications.
45+
2. Does not conflict with the Linux architecture, nor the design of the
46+
NVMe host driver.
47+
3. Has a clear, indisputable value-proposition and a wide consensus across
48+
the community.
49+
50+
Vendor specific extensions are generally not supported in the NVMe host
51+
driver.
52+
53+
It is strongly recommended to work with the Linux NVMe and block layer
54+
maintainers and get feedback on specification changes that are intended
55+
to be used by the Linux NVMe host driver in order to avoid conflict at a
56+
later stage.
57+
58+
59+
Quirks
60+
======
61+
62+
Sometimes implementations of open standards fail to correctly implement parts
63+
of the standards. Linux uses identifier-based quirks to work around such
64+
implementation bugs. The intent of quirks is to deal with widely available
65+
hardware, usually consumer, which Linux users can't use without these quirks.
66+
Typically these implementations are not or only superficially tested with Linux
67+
by the hardware manufacturer.
68+
69+
The Linux NVMe maintainers decide ad hoc whether to quirk implementations
70+
based on the impact of the problem to Linux users and how it impacts
71+
maintainability of the driver. In general quirks are a last resort, if no
72+
firmware updates or other workarounds are available from the vendor.
73+
74+
Quirks will not be added to the Linux kernel for hardware that isn't available
75+
on the mass market. Hardware that fails qualification for enterprise Linux
76+
distributions, ChromeOS, Android or other consumers of the Linux kernel
77+
should be fixed before it is shipped instead of relying on Linux quirks.

MAINTAINERS

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14827,6 +14827,7 @@ L: [email protected]
1482714827
S: Supported
1482814828
W: http://git.infradead.org/nvme.git
1482914829
T: git://git.infradead.org/nvme.git
14830+
F: Documentation/nvme/
1483014831
F: drivers/nvme/host/
1483114832
F: drivers/nvme/common/
1483214833
F: include/linux/nvme*

drivers/nvme/host/auth.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -953,7 +953,7 @@ int nvme_auth_init_ctrl(struct nvme_ctrl *ctrl)
953953
goto err_free_dhchap_secret;
954954

955955
if (!ctrl->opts->dhchap_secret && !ctrl->opts->dhchap_ctrl_secret)
956-
return ret;
956+
return 0;
957957

958958
ctrl->dhchap_ctxs = kvcalloc(ctrl_max_dhchaps(ctrl),
959959
sizeof(*chap), GFP_KERNEL);

drivers/nvme/host/core.c

Lines changed: 26 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1074,23 +1074,43 @@ static u32 nvme_known_admin_effects(u8 opcode)
10741074
return 0;
10751075
}
10761076

1077+
static u32 nvme_known_nvm_effects(u8 opcode)
1078+
{
1079+
switch (opcode) {
1080+
case nvme_cmd_write:
1081+
case nvme_cmd_write_zeroes:
1082+
case nvme_cmd_write_uncor:
1083+
return NVME_CMD_EFFECTS_LBCC;
1084+
default:
1085+
return 0;
1086+
}
1087+
}
1088+
10771089
u32 nvme_command_effects(struct nvme_ctrl *ctrl, struct nvme_ns *ns, u8 opcode)
10781090
{
10791091
u32 effects = 0;
10801092

10811093
if (ns) {
10821094
if (ns->head->effects)
10831095
effects = le32_to_cpu(ns->head->effects->iocs[opcode]);
1096+
if (ns->head->ids.csi == NVME_CAP_CSS_NVM)
1097+
effects |= nvme_known_nvm_effects(opcode);
10841098
if (effects & ~(NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC))
10851099
dev_warn_once(ctrl->device,
1086-
"IO command:%02x has unhandled effects:%08x\n",
1100+
"IO command:%02x has unusual effects:%08x\n",
10871101
opcode, effects);
1088-
return 0;
1089-
}
10901102

1091-
if (ctrl->effects)
1092-
effects = le32_to_cpu(ctrl->effects->acs[opcode]);
1093-
effects |= nvme_known_admin_effects(opcode);
1103+
/*
1104+
* NVME_CMD_EFFECTS_CSE_MASK causes a freeze all I/O queues,
1105+
* which would deadlock when done on an I/O command. Note that
1106+
* We already warn about an unusual effect above.
1107+
*/
1108+
effects &= ~NVME_CMD_EFFECTS_CSE_MASK;
1109+
} else {
1110+
if (ctrl->effects)
1111+
effects = le32_to_cpu(ctrl->effects->acs[opcode]);
1112+
effects |= nvme_known_admin_effects(opcode);
1113+
}
10941114

10951115
return effects;
10961116
}

drivers/nvme/host/ioctl.c

Lines changed: 24 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,8 @@
1111
static bool nvme_cmd_allowed(struct nvme_ns *ns, struct nvme_command *c,
1212
fmode_t mode)
1313
{
14+
u32 effects;
15+
1416
if (capable(CAP_SYS_ADMIN))
1517
return true;
1618

@@ -43,11 +45,29 @@ static bool nvme_cmd_allowed(struct nvme_ns *ns, struct nvme_command *c,
4345
}
4446

4547
/*
46-
* Only allow I/O commands that transfer data to the controller if the
47-
* special file is open for writing, but always allow I/O commands that
48-
* transfer data from the controller.
48+
* Check if the controller provides a Commands Supported and Effects log
49+
* and marks this command as supported. If not reject unprivileged
50+
* passthrough.
51+
*/
52+
effects = nvme_command_effects(ns->ctrl, ns, c->common.opcode);
53+
if (!(effects & NVME_CMD_EFFECTS_CSUPP))
54+
return false;
55+
56+
/*
57+
* Don't allow passthrough for command that have intrusive (or unknown)
58+
* effects.
59+
*/
60+
if (effects & ~(NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC |
61+
NVME_CMD_EFFECTS_UUID_SEL |
62+
NVME_CMD_EFFECTS_SCOPE_MASK))
63+
return false;
64+
65+
/*
66+
* Only allow I/O commands that transfer data to the controller or that
67+
* change the logical block contents if the file descriptor is open for
68+
* writing.
4969
*/
50-
if (nvme_is_write(c))
70+
if (nvme_is_write(c) || (effects & NVME_CMD_EFFECTS_LBCC))
5171
return mode & FMODE_WRITE;
5272
return true;
5373
}

drivers/nvme/target/admin-cmd.c

Lines changed: 21 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -164,26 +164,31 @@ static void nvmet_execute_get_log_page_smart(struct nvmet_req *req)
164164

165165
static void nvmet_get_cmd_effects_nvm(struct nvme_effects_log *log)
166166
{
167-
log->acs[nvme_admin_get_log_page] = cpu_to_le32(1 << 0);
168-
log->acs[nvme_admin_identify] = cpu_to_le32(1 << 0);
169-
log->acs[nvme_admin_abort_cmd] = cpu_to_le32(1 << 0);
170-
log->acs[nvme_admin_set_features] = cpu_to_le32(1 << 0);
171-
log->acs[nvme_admin_get_features] = cpu_to_le32(1 << 0);
172-
log->acs[nvme_admin_async_event] = cpu_to_le32(1 << 0);
173-
log->acs[nvme_admin_keep_alive] = cpu_to_le32(1 << 0);
174-
175-
log->iocs[nvme_cmd_read] = cpu_to_le32(1 << 0);
176-
log->iocs[nvme_cmd_write] = cpu_to_le32(1 << 0);
177-
log->iocs[nvme_cmd_flush] = cpu_to_le32(1 << 0);
178-
log->iocs[nvme_cmd_dsm] = cpu_to_le32(1 << 0);
179-
log->iocs[nvme_cmd_write_zeroes] = cpu_to_le32(1 << 0);
167+
log->acs[nvme_admin_get_log_page] =
168+
log->acs[nvme_admin_identify] =
169+
log->acs[nvme_admin_abort_cmd] =
170+
log->acs[nvme_admin_set_features] =
171+
log->acs[nvme_admin_get_features] =
172+
log->acs[nvme_admin_async_event] =
173+
log->acs[nvme_admin_keep_alive] =
174+
cpu_to_le32(NVME_CMD_EFFECTS_CSUPP);
175+
176+
log->iocs[nvme_cmd_read] =
177+
log->iocs[nvme_cmd_flush] =
178+
log->iocs[nvme_cmd_dsm] =
179+
cpu_to_le32(NVME_CMD_EFFECTS_CSUPP);
180+
log->iocs[nvme_cmd_write] =
181+
log->iocs[nvme_cmd_write_zeroes] =
182+
cpu_to_le32(NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC);
180183
}
181184

182185
static void nvmet_get_cmd_effects_zns(struct nvme_effects_log *log)
183186
{
184-
log->iocs[nvme_cmd_zone_append] = cpu_to_le32(1 << 0);
185-
log->iocs[nvme_cmd_zone_mgmt_send] = cpu_to_le32(1 << 0);
186-
log->iocs[nvme_cmd_zone_mgmt_recv] = cpu_to_le32(1 << 0);
187+
log->iocs[nvme_cmd_zone_append] =
188+
log->iocs[nvme_cmd_zone_mgmt_send] =
189+
cpu_to_le32(NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC);
190+
log->iocs[nvme_cmd_zone_mgmt_recv] =
191+
cpu_to_le32(NVME_CMD_EFFECTS_CSUPP);
187192
}
188193

189194
static void nvmet_execute_get_log_cmd_effects_ns(struct nvmet_req *req)

drivers/nvme/target/passthru.c

Lines changed: 5 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -334,14 +334,13 @@ static void nvmet_passthru_execute_cmd(struct nvmet_req *req)
334334
}
335335

336336
/*
337-
* If there are effects for the command we are about to execute, or
338-
* an end_req function we need to use nvme_execute_passthru_rq()
339-
* synchronously in a work item seeing the end_req function and
340-
* nvme_passthru_end() can't be called in the request done callback
341-
* which is typically in interrupt context.
337+
* If a command needs post-execution fixups, or there are any
338+
* non-trivial effects, make sure to execute the command synchronously
339+
* in a workqueue so that nvme_passthru_end gets called.
342340
*/
343341
effects = nvme_command_effects(ctrl, ns, req->cmd->common.opcode);
344-
if (req->p.use_workqueue || effects) {
342+
if (req->p.use_workqueue ||
343+
(effects & ~(NVME_CMD_EFFECTS_CSUPP | NVME_CMD_EFFECTS_LBCC))) {
345344
INIT_WORK(&req->p.work, nvmet_passthru_execute_cmd_work);
346345
req->p.rq = rq;
347346
queue_work(nvmet_wq, &req->p.work);

include/linux/nvme.h

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@
77
#ifndef _LINUX_NVME_H
88
#define _LINUX_NVME_H
99

10+
#include <linux/bits.h>
1011
#include <linux/types.h>
1112
#include <linux/uuid.h>
1213

@@ -639,8 +640,9 @@ enum {
639640
NVME_CMD_EFFECTS_NCC = 1 << 2,
640641
NVME_CMD_EFFECTS_NIC = 1 << 3,
641642
NVME_CMD_EFFECTS_CCC = 1 << 4,
642-
NVME_CMD_EFFECTS_CSE_MASK = 3 << 16,
643+
NVME_CMD_EFFECTS_CSE_MASK = GENMASK(18, 16),
643644
NVME_CMD_EFFECTS_UUID_SEL = 1 << 19,
645+
NVME_CMD_EFFECTS_SCOPE_MASK = GENMASK(31, 20),
644646
};
645647

646648
struct nvme_effects_log {

0 commit comments

Comments
 (0)