Commit 750a02a

Merge tag 'for-5.8/block-2020-06-01' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
 "Core block changes that have been queued up for this release:

   - Remove dead blk-throttle and blk-wbt code (Guoqing)
   - Include pid in blktrace note traces (Jan)
   - Don't spew I/O errors on wouldblock termination (me)
   - Zone append addition (Johannes, Keith, Damien)
   - IO accounting improvements (Konstantin, Christoph)
   - blk-mq hardware map update improvements (Ming)
   - Scheduler dispatch improvement (Salman)
   - Inline block encryption support (Satya)
   - Request map fixes and improvements (Weiping)
   - blk-iocost tweaks (Tejun)
   - Fix for timeout failing with error injection (Keith)
   - Queue re-run fixes (Douglas)
   - CPU hotplug improvements (Christoph)
   - Queue entry/exit improvements (Christoph)
   - Move DMA drain handling to the few drivers that use it (Christoph)
   - Partition handling cleanups (Christoph)"

* tag 'for-5.8/block-2020-06-01' of git://git.kernel.dk/linux-block: (127 commits)
  block: mark bio_wouldblock_error() bio with BIO_QUIET
  blk-wbt: rename __wbt_update_limits to wbt_update_limits
  blk-wbt: remove wbt_update_limits
  blk-throttle: remove tg_drain_bios
  blk-throttle: remove blk_throtl_drain
  null_blk: force complete for timeout request
  blk-mq: drain I/O when all CPUs in a hctx are offline
  blk-mq: add blk_mq_all_tag_iter
  blk-mq: open code __blk_mq_alloc_request in blk_mq_alloc_request_hctx
  blk-mq: use BLK_MQ_NO_TAG in more places
  blk-mq: rename BLK_MQ_TAG_FAIL to BLK_MQ_NO_TAG
  blk-mq: move more request initialization to blk_mq_rq_ctx_init
  blk-mq: simplify the blk_mq_get_request calling convention
  blk-mq: remove the bio argument to ->prepare_request
  nvme: force complete cancelled requests
  blk-mq: blk-mq: provide forced completion method
  block: fix a warning when blkdev.h is included for !CONFIG_BLOCK builds
  block: blk-crypto-fallback: remove redundant initialization of variable err
  block: reduce part_stat_lock() scope
  block: use __this_cpu_add() instead of access by smp_processor_id()
  ...
2 parents 1966391 + abb3046 commit 750a02a

122 files changed

+4510
-1489
lines changed

Documentation/block/index.rst

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ Block
1414
cmdline-partition
1515
data-integrity
1616
deadline-iosched
17+
inline-encryption
1718
ioprio
1819
kyber-iosched
1920
null_blk
Documentation/block/inline-encryption.rst

Lines changed: 263 additions & 0 deletions
@@ -0,0 +1,263 @@
1+
.. SPDX-License-Identifier: GPL-2.0
2+
3+
=================
4+
Inline Encryption
5+
=================
6+
7+
Background
8+
==========
9+
10+
Inline encryption hardware sits logically between memory and the disk, and can
11+
en/decrypt data as it goes in/out of the disk. Inline encryption hardware has a
12+
fixed number of "keyslots" - slots into which encryption contexts (i.e. the
13+
encryption key, encryption algorithm, data unit size) can be programmed by the
14+
kernel at any time. Each request sent to the disk can be tagged with the index
15+
of a keyslot (and also a data unit number to act as an encryption tweak), and
16+
the inline encryption hardware will en/decrypt the data in the request with the
17+
encryption context programmed into that keyslot. This is very different from
18+
full disk encryption solutions like self-encrypting drives/TCG OPAL/ATA
19+
Security standards, since with inline encryption, any block on disk could be
20+
encrypted with any encryption context the kernel chooses.
21+
22+
23+
Objective
24+
=========
25+
26+
We want to support inline encryption (IE) in the kernel.
27+
To allow for testing, we also want a crypto API fallback when actual
28+
IE hardware is absent. We also want IE to work with layered devices
29+
like dm and loopback (i.e. we want to be able to use the IE hardware
30+
of the underlying devices if present, or else fall back to crypto API
31+
en/decryption).
32+
33+
34+
Constraints and notes
35+
=====================
36+
37+
- IE hardware has a limited number of "keyslots" that can be programmed
38+
with an encryption context (key, algorithm, data unit size, etc.) at any time.
39+
One can specify a keyslot in a data request made to the device, and the
40+
device will en/decrypt the data using the encryption context programmed into
41+
that specified keyslot. When possible, we want to make multiple requests with
42+
the same encryption context share the same keyslot.
43+
44+
- We need a way for upper layers like filesystems to specify an encryption
45+
context to use for en/decrypting a struct bio, and a device driver (like UFS)
46+
needs to be able to use that encryption context when it processes the bio.
47+
48+
- We need a way for device drivers to expose their inline encryption
49+
capabilities in a unified way to the upper layers.
50+
51+
52+
Design
53+
======
54+
55+
We add a :c:type:`struct bio_crypt_ctx` to :c:type:`struct bio` that can
56+
represent an encryption context, because we need to be able to pass this
57+
encryption context from the upper layers (like the fs layer) to the
58+
device driver to act upon.
59+
60+
While IE hardware works on the notion of keyslots, the FS layer has no
61+
knowledge of keyslots - it simply wants to specify an encryption context to
62+
use while en/decrypting a bio.
63+
64+
We introduce a keyslot manager (KSM) that handles the translation from
65+
encryption contexts specified by the FS to keyslots on the IE hardware.
66+
This KSM also serves as the way IE hardware can expose its capabilities to
67+
upper layers. The generic mode of operation is: each device driver that wants
68+
to support IE will construct a KSM and set it up in its struct request_queue.
69+
Upper layers that want to use IE on this device can then use this KSM in
70+
the device's struct request_queue to translate an encryption context into
71+
a keyslot. The presence of the KSM in the request queue shall be used to mean
72+
that the device supports IE.
73+
74+
The KSM uses refcounts to track which keyslots are idle (either they have no
75+
encryption context programmed, or there are no in-flight struct bios
76+
referencing that keyslot). When a new encryption context needs a keyslot, it
77+
tries to find a keyslot that has already been programmed with the same
78+
encryption context, and if there is no such keyslot, it evicts the least
79+
recently used idle keyslot and programs the new encryption context into that
80+
one. If no idle keyslots are available, then the caller will sleep until there
81+
is at least one.
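
The keyslot selection policy described above can be sketched in userspace C. This is only an illustrative model, not the code in ``keyslot-manager.c``: the names (``struct ksm``, ``ksm_get_slot``, ``ksm_put_slot``), the two-slot hardware and the 4-byte keys are all assumptions made for the example, and where the real KSM would sleep until a slot becomes idle, the sketch simply returns -1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_SLOTS 2   /* hypothetical hardware with two keyslots */
#define KEY_SIZE  4   /* toy key size, for illustration only */

struct keyslot {
	int      programmed;   /* does the slot hold a key? */
	int      refs;         /* in-flight users of this slot */
	uint64_t last_used;    /* logical clock used for LRU eviction */
	uint8_t  key[KEY_SIZE];
};

struct ksm {
	struct keyslot slots[NUM_SLOTS];
	uint64_t clock;
};

/*
 * Return the index of a slot programmed with `key`, or -1 if every slot
 * is busy (the real KSM would sleep here instead of failing).
 */
int ksm_get_slot(struct ksm *ksm, const uint8_t key[KEY_SIZE])
{
	int i, victim = -1;

	/* 1. Reuse a slot that already holds this key. */
	for (i = 0; i < NUM_SLOTS; i++) {
		struct keyslot *s = &ksm->slots[i];

		if (s->programmed && memcmp(s->key, key, KEY_SIZE) == 0) {
			s->refs++;
			s->last_used = ++ksm->clock;
			return i;
		}
	}

	/* 2. Otherwise pick the least recently used idle slot. */
	for (i = 0; i < NUM_SLOTS; i++) {
		struct keyslot *s = &ksm->slots[i];

		if (s->refs != 0)
			continue;
		if (victim < 0 || s->last_used < ksm->slots[victim].last_used)
			victim = i;
	}
	if (victim < 0)
		return -1;

	/* 3. Evict whatever was there and program the new key. */
	memcpy(ksm->slots[victim].key, key, KEY_SIZE);
	ksm->slots[victim].programmed = 1;
	ksm->slots[victim].refs = 1;
	ksm->slots[victim].last_used = ++ksm->clock;
	return victim;
}

/* Drop one reference; the slot becomes idle when refs reaches zero. */
void ksm_put_slot(struct ksm *ksm, int slot)
{
	assert(ksm->slots[slot].refs > 0);
	ksm->slots[slot].refs--;
	ksm->slots[slot].last_used = ++ksm->clock;
}
```

A slot that is still referenced is never evicted, which is why a third key cannot be programmed while both slots are in use.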
82+
83+
84+
blk-mq changes, other block layer changes and blk-crypto-fallback
85+
=================================================================
86+
87+
We add a pointer to a ``bi_crypt_context`` and ``keyslot`` to
88+
:c:type:`struct request`. These will be referred to as the ``crypto fields``
89+
for the request. This ``keyslot`` is the keyslot into which the
90+
``bi_crypt_context`` has been programmed in the KSM of the ``request_queue``
91+
that this request is being sent to.
92+
93+
We introduce ``block/blk-crypto-fallback.c``, which allows upper layers to remain
94+
blissfully unaware of whether or not real inline encryption hardware is present
95+
underneath. When a bio is submitted with a target ``request_queue`` that doesn't
96+
support the encryption context specified with the bio, the block layer will
97+
en/decrypt the bio with the blk-crypto-fallback.
98+
99+
If the bio is a ``WRITE`` bio, a bounce bio is allocated, and the data in the bio
100+
is encrypted and stored in the bounce bio - blk-mq will then proceed to process the
101+
bounce bio as if it were not encrypted at all (except when blk-integrity is
102+
concerned). ``blk-crypto-fallback`` sets the bounce bio's ``bi_end_io`` to an
103+
internal function that cleans up the bounce bio and ends the original bio.
104+
105+
If the bio is a ``READ`` bio, the bio's ``bi_end_io`` (and also ``bi_private``)
106+
is saved and overwritten by ``blk-crypto-fallback`` to
107+
``bio_crypto_fallback_decrypt_bio``. The bio's ``bi_crypt_context`` is also
108+
overwritten with ``NULL``, so that to the rest of the stack, the bio looks
109+
as if it was a regular bio that never had an encryption context specified.
110+
``bio_crypto_fallback_decrypt_bio`` will decrypt the bio, restore the original
111+
``bi_end_io`` (and also ``bi_private``) and end the bio again.
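
The READ path above (save the completion callback, substitute a decrypting one, then restore and re-end the bio) can be modelled in a few lines of userspace C. Everything here is a stand-in chosen for illustration: ``struct toy_bio``, ``struct fallback_ctx``, ``fallback_prep_read`` and the single-byte XOR "cipher" are assumptions, not the structures or algorithm used by ``blk-crypto-fallback.c``.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct bio; only the fields the sketch needs. */
struct toy_bio {
	void (*end_io)(struct toy_bio *bio);
	void *private_data;
	void *crypt_ctx;        /* non-NULL means "has an encryption context" */
	uint8_t *data;
	size_t len;
};

/* Hypothetical per-bio state kept by the fallback while the read is in flight. */
struct fallback_ctx {
	void (*saved_end_io)(struct toy_bio *bio);
	void *saved_private;
	uint8_t xor_key;        /* toy "cipher": XOR every byte with this value */
};

/* Completion handler installed by the fallback. */
static void fallback_decrypt_end_io(struct toy_bio *bio)
{
	struct fallback_ctx *ctx = bio->private_data;
	size_t i;

	for (i = 0; i < bio->len; i++)  /* "decrypt" in place */
		bio->data[i] ^= ctx->xor_key;

	/* Restore the submitter's completion state and end the bio again. */
	bio->end_io = ctx->saved_end_io;
	bio->private_data = ctx->saved_private;
	if (bio->end_io)
		bio->end_io(bio);
}

/* Called at submission time for a READ bio that needs the fallback. */
void fallback_prep_read(struct toy_bio *bio, struct fallback_ctx *ctx,
			uint8_t xor_key)
{
	ctx->saved_end_io = bio->end_io;
	ctx->saved_private = bio->private_data;
	ctx->xor_key = xor_key;

	bio->end_io = fallback_decrypt_end_io;
	bio->private_data = ctx;
	bio->crypt_ctx = NULL;  /* lower layers now see a plain bio */
}

/* Example submitter callback: counts completions via private_data. */
void count_completion(struct toy_bio *bio)
{
	(*(int *)bio->private_data)++;
}
```

The submitter's callback runs exactly once, and only after the data has been decrypted, which is the property the document relies on when it says the rest of the stack sees a regular bio.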
112+
113+
Regardless of whether real inline encryption hardware is used or the
114+
blk-crypto-fallback is used, the ciphertext written to disk (and hence the
115+
on-disk format of data) will be the same (assuming the hardware's implementation
116+
of the algorithm being used adheres to spec and functions correctly).
117+
118+
If a ``request_queue``'s inline encryption hardware claimed to support the
119+
encryption context specified with a bio, then it will not be handled by the
120+
``blk-crypto-fallback``. We will eventually reach a point in blk-mq when a
121+
:c:type:`struct request` needs to be allocated for that bio. At that point,
122+
blk-mq tries to program the encryption context into the ``request_queue``'s
123+
keyslot_manager, and obtain a keyslot, which it stores in its newly added
124+
``keyslot`` field. This keyslot is released when the request is completed.
125+
126+
When the first bio is added to a request, ``blk_crypto_rq_bio_prep`` is called,
127+
which sets the request's ``crypt_ctx`` to a copy of the bio's
128+
``bi_crypt_context``. ``bio_crypt_do_front_merge`` is called whenever a subsequent
129+
bio is merged to the front of the request, which updates the ``crypt_ctx`` of
130+
the request so that it matches the newly merged bio's ``bi_crypt_context``. In particular, the request keeps a copy of the ``bi_crypt_context`` of the first
131+
bio in its bio-list (blk-mq needs to be careful to maintain this invariant
132+
during bio and request merges).
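
Because a request carries only the encryption context of its first bio, merging is only safe when the data unit numbers of adjacent bios line up. The sketch below shows one way such a contiguity check could look. It is a model under stated assumptions: the DUN is taken to be four 64-bit words with the least significant word first, and ``dun_increment`` and ``dun_is_contiguous`` are illustrative names rather than the helpers in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DUN_WORDS 4  /* assumption: DUN held as 4 words, least significant first */

/*
 * dun += inc, propagating the carry across words.
 * Returns true if the addition wrapped past the most significant word.
 */
bool dun_increment(uint64_t dun[DUN_WORDS], unsigned int inc)
{
	uint64_t carry = inc;
	int i;

	for (i = 0; carry && i < DUN_WORDS; i++) {
		dun[i] += carry;
		/* Unsigned overflow happened iff the sum is below the addend. */
		carry = (dun[i] < carry) ? 1 : 0;
	}
	return carry != 0;
}

/*
 * True if data starting at `next` directly follows `units` data units
 * that start at `dun`. A DUN that wraps through zero is not treated
 * as contiguous.
 */
bool dun_is_contiguous(const uint64_t dun[DUN_WORDS], unsigned int units,
		       const uint64_t next[DUN_WORDS])
{
	uint64_t expect[DUN_WORDS];
	int i;

	for (i = 0; i < DUN_WORDS; i++)
		expect[i] = dun[i];
	if (dun_increment(expect, units))
		return false;

	for (i = 0; i < DUN_WORDS; i++)
		if (expect[i] != next[i])
			return false;
	return true;
}
```

For a front merge the roles are reversed: the incoming bio's DUN plus its length in data units must equal the request's current DUN, after which the request adopts the incoming bio's context.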
133+
134+
To make it possible for inline encryption to work with request queue based
135+
layered devices, when a request is cloned, its ``crypto fields`` are cloned as
136+
well. When the cloned request is submitted, blk-mq programs the
137+
``bi_crypt_context`` of the request into the clone's request_queue's keyslot
138+
manager, and stores the returned keyslot in the clone's ``keyslot``.
139+
140+
141+
API presented to users of the block layer
142+
=========================================
143+
144+
``struct blk_crypto_key`` represents a crypto key (the raw key, size of the
145+
key, the crypto algorithm to use, the data unit size to use, and the number of
146+
bytes required to represent data unit numbers that will be specified with the
147+
``bi_crypt_context``).
148+
149+
``blk_crypto_init_key`` allows upper layers to initialize such a
150+
``blk_crypto_key``.
151+
152+
``bio_crypt_set_ctx`` should be called on any bio that a user of
153+
the block layer wants en/decrypted via inline encryption (or the
154+
blk-crypto-fallback, if hardware support isn't available for the desired
155+
crypto configuration). This function takes the ``blk_crypto_key`` and the
156+
data unit number (DUN) to use when en/decrypting the bio.
157+
158+
``blk_crypto_config_supported`` allows upper layers to query whether or not
159+
an encryption context passed to a request queue can be handled by blk-crypto
160+
(either by real inline encryption hardware, or by the blk-crypto-fallback).
161+
This is useful e.g. when blk-crypto-fallback is disabled, and the upper layer
162+
wants to use an algorithm that may not be supported by hardware - this function
163+
lets the upper layer know ahead of time that the algorithm isn't supported,
164+
and the upper layer can fall back to something else if appropriate.
165+
166+
``blk_crypto_start_using_key`` - Upper layers must call this function on
167+
a ``blk_crypto_key`` and a ``request_queue`` before using the key with any bio
168+
headed for that ``request_queue``. This function ensures that either the
169+
hardware supports the key's crypto settings, or the crypto API fallback has
170+
transforms for the needed mode allocated and ready to go. Note that this
171+
function may allocate an ``skcipher``, and must not be called from the data
172+
path, since allocating ``skciphers`` from the data path can deadlock.
173+
174+
``blk_crypto_evict_key`` *must* be called by upper layers before a
175+
``blk_crypto_key`` is freed. Further, it *must* be called only once
176+
there are no more in-flight requests that use that ``blk_crypto_key``.
177+
``blk_crypto_evict_key`` will ensure that a key is removed from any keyslots in
178+
inline encryption hardware that the key might have been programmed into (or the blk-crypto-fallback).
179+
180+
API presented to device drivers
181+
===============================
182+
183+
A :c:type:`struct blk_keyslot_manager` should be set up by device drivers in
184+
the ``request_queue`` of the device. The device driver needs to call
185+
``blk_ksm_init`` on the ``blk_keyslot_manager``, specifying the number of
186+
keyslots supported by the hardware.
187+
188+
The device driver also needs to tell the KSM how to actually manipulate the
189+
IE hardware in the device to do things like programming the crypto key into
190+
a particular keyslot of the IE hardware. All this is achieved through the
191+
:c:type:`struct blk_ksm_ll_ops` field in the KSM that the device driver
192+
must fill in after initializing the ``blk_keyslot_manager``.
193+
194+
The KSM also handles runtime power management for the device when applicable
195+
(e.g. when it wants to program a crypto key into the IE hardware, the device
196+
must be runtime powered on) - so the device driver must also set the ``dev``
197+
field in the KSM to point to the ``struct device`` for the KSM to use for runtime
198+
power management.
199+
200+
``blk_ksm_reprogram_all_keys`` can be called by device drivers if the device
201+
needs each and every one of its keyslots to be reprogrammed with the key it
202+
"should have" at the point in time when the function is called. This is useful
203+
e.g. if a device loses all its keys on runtime power down/up.
204+
205+
``blk_ksm_destroy`` should be called to free up all resources allocated for a keyslot
206+
manager by ``blk_ksm_init``, once the ``blk_keyslot_manager`` is no longer
207+
needed.
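
The division of labour between the KSM and the driver callbacks can be illustrated with a small userspace model. The callback table below is only loosely modelled on ``struct blk_ksm_ll_ops``; its signatures, the ``toy_*`` names, the four-slot hardware and the 8-byte keys are assumptions made for this sketch and do not match the kernel definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_SLOTS    4
#define TOY_KEY_SIZE 8

/* Driver-supplied callbacks (hypothetical signatures). */
struct toy_ll_ops {
	int (*keyslot_program)(void *hw, const uint8_t *key, unsigned int slot);
	int (*keyslot_evict)(void *hw, unsigned int slot);
};

/* Software view of the keyslots: what each slot "should have". */
struct toy_ksm {
	struct toy_ll_ops ops;
	void *hw;                               /* driver-private handle */
	int has_key[TOY_SLOTS];
	uint8_t keys[TOY_SLOTS][TOY_KEY_SIZE];
};

/* Toy "hardware": just an array of key registers. */
struct toy_hw {
	uint8_t regs[TOY_SLOTS][TOY_KEY_SIZE];
};

int toy_hw_program(void *hw, const uint8_t *key, unsigned int slot)
{
	memcpy(((struct toy_hw *)hw)->regs[slot], key, TOY_KEY_SIZE);
	return 0;
}

int toy_hw_evict(void *hw, unsigned int slot)
{
	memset(((struct toy_hw *)hw)->regs[slot], 0, TOY_KEY_SIZE);
	return 0;
}

/* Driver setup: clear all state, then fill in the ops and the hw handle. */
void toy_ksm_init(struct toy_ksm *ksm, struct toy_hw *hw)
{
	memset(ksm, 0, sizeof(*ksm));
	memset(hw, 0, sizeof(*hw));
	ksm->ops.keyslot_program = toy_hw_program;
	ksm->ops.keyslot_evict = toy_hw_evict;
	ksm->hw = hw;
}

/* Program a key into the hardware and remember it on success. */
int toy_ksm_program(struct toy_ksm *ksm, const uint8_t *key, unsigned int slot)
{
	int err = ksm->ops.keyslot_program(ksm->hw, key, slot);

	if (err)
		return err;
	memcpy(ksm->keys[slot], key, TOY_KEY_SIZE);
	ksm->has_key[slot] = 1;
	return 0;
}

/* Re-issue keyslot_program for every slot that should hold a key. */
void toy_ksm_reprogram_all_keys(struct toy_ksm *ksm)
{
	unsigned int slot;

	for (slot = 0; slot < TOY_SLOTS; slot++)
		if (ksm->has_key[slot])
			ksm->ops.keyslot_program(ksm->hw, ksm->keys[slot], slot);
}
```

Keeping a software copy of each programmed key is what makes the reprogram step possible after the hardware loses its state; slots that never held a key are left untouched.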
208+
209+
210+
Layered Devices
211+
===============
212+
213+
Request queue based layered devices like dm-rq that wish to support IE need to
214+
create their own keyslot manager for their request queue, and expose whatever
215+
functionality they choose. When a layered device wants to pass a clone of a
216+
request to another ``request_queue``, blk-crypto will initialize and prepare the
217+
clone as necessary - see ``blk_crypto_insert_cloned_request`` in
218+
``blk-crypto.c``.
219+
220+
221+
Future Optimizations for layered devices
222+
========================================
223+
224+
Creating a keyslot manager for a layered device uses up memory for each
225+
keyslot, and in general, a layered device merely passes the request on to a
226+
"child" device, so the keyslots in the layered device itself are completely
227+
unused, and don't need any refcounting or keyslot programming. We can instead
228+
define a new type of KSM, the "passthrough KSM", that layered devices can use
229+
to advertise an unlimited number of keyslots, and support for any encryption
230+
algorithms they choose, while not actually using any memory for each keyslot.
231+
Another use case for the "passthrough KSM" is for IE devices that do not have a
232+
limited number of keyslots.
233+
234+
235+
Interaction between inline encryption and blk integrity
236+
=======================================================
237+
238+
At the time of this patch, there is no real hardware that supports both these
239+
features. However, these features do interact with each other, and it's not
240+
completely trivial to make them both work together properly. In particular,
241+
when a WRITE bio wants to use inline encryption on a device that supports both
242+
features, the bio will have an encryption context specified, after which
243+
its integrity information is calculated (using the plaintext data, since
244+
the encryption will happen while data is being written), and the data and
245+
integrity info are sent to the device. Obviously, the integrity info must be
246+
verified before the data is encrypted. After the data is encrypted, the device
247+
must not store the integrity info that it received with the plaintext data
248+
since that might reveal information about the plaintext data. As such, it must
249+
re-generate the integrity info from the ciphertext data and store that on disk
250+
instead. Another issue with storing the integrity info of the plaintext data is
251+
that it changes the on-disk format depending on whether hardware inline
252+
encryption support is present or the kernel crypto API fallback is used (since
253+
if the fallback is used, the device will receive the integrity info of the
254+
ciphertext, not that of the plaintext).
255+
256+
Because there isn't any real hardware yet, it seems prudent to assume that
257+
hardware implementations might not implement both features together correctly,
258+
and disallow the combination for now. Whenever a device supports integrity, the
259+
kernel will pretend that the device does not support hardware inline encryption
260+
(by essentially setting the keyslot manager in the request_queue of the device
261+
to NULL). When the crypto API fallback is enabled, this means that all bios with
262+
an encryption context will use the fallback, and IO will complete as usual.
263+
When the fallback is disabled, a bio with an encryption context will be failed.
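
The resulting policy for a bio that carries an encryption context can be summarised as a small decision function. This is a sketch of the rules stated above, not code from the patch; ``pick_crypt_path`` and the ``PATH_*`` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum crypt_path {
	PATH_HW,        /* inline encryption hardware handles the bio */
	PATH_FALLBACK,  /* kernel crypto API fallback handles the bio */
	PATH_FAIL       /* the bio is failed */
};

/*
 * hw_supports_ctx:  the device's keyslot manager claims support for the
 *                   bio's encryption context
 * has_integrity:    the device supports blk-integrity
 * fallback_enabled: the crypto API fallback is built in
 */
enum crypt_path pick_crypt_path(bool hw_supports_ctx, bool has_integrity,
				bool fallback_enabled)
{
	/* Integrity-capable devices are treated as having no keyslot manager. */
	if (has_integrity)
		hw_supports_ctx = false;

	if (hw_supports_ctx)
		return PATH_HW;

	return fallback_enabled ? PATH_FALLBACK : PATH_FAIL;
}
```

Note that hardware support never matters on an integrity-capable device: the outcome there depends only on whether the fallback is enabled.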

block/Kconfig

Lines changed: 18 additions & 0 deletions
@@ -146,6 +146,7 @@ config BLK_CGROUP_IOLATENCY
146146
config BLK_CGROUP_IOCOST
147147
bool "Enable support for cost model based cgroup IO controller"
148148
depends on BLK_CGROUP=y
149+
select BLK_RQ_IO_DATA_LEN
149150
select BLK_RQ_ALLOC_TIME
150151
---help---
151152
Enabling this option enables the .weight interface for cost
@@ -185,6 +186,23 @@ config BLK_SED_OPAL
185186
Enabling this option enables users to setup/unlock/lock
186187
Locking ranges for SED devices using the Opal protocol.
187188

189+
config BLK_INLINE_ENCRYPTION
190+
bool "Enable inline encryption support in block layer"
191+
help
192+
Build the blk-crypto subsystem. Enabling this lets the
193+
block layer handle encryption, so users can take
194+
advantage of inline encryption hardware if present.
195+
196+
config BLK_INLINE_ENCRYPTION_FALLBACK
197+
bool "Enable crypto API fallback for blk-crypto"
198+
depends on BLK_INLINE_ENCRYPTION
199+
select CRYPTO
200+
select CRYPTO_SKCIPHER
201+
help
202+
Enabling this lets the block layer handle inline encryption
203+
by falling back to the kernel crypto API when inline
204+
encryption hardware is not present.
205+
188206
menu "Partition Types"
189207

190208
source "block/partitions/Kconfig"

block/Makefile

Lines changed: 2 additions & 0 deletions
@@ -36,3 +36,5 @@ obj-$(CONFIG_BLK_DEBUG_FS) += blk-mq-debugfs.o
3636
obj-$(CONFIG_BLK_DEBUG_FS_ZONED)+= blk-mq-debugfs-zoned.o
3737
obj-$(CONFIG_BLK_SED_OPAL) += sed-opal.o
3838
obj-$(CONFIG_BLK_PM) += blk-pm.o
39+
obj-$(CONFIG_BLK_INLINE_ENCRYPTION) += keyslot-manager.o blk-crypto.o
40+
obj-$(CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK) += blk-crypto-fallback.o

block/bfq-iosched.c

Lines changed: 1 addition & 1 deletion
@@ -6073,7 +6073,7 @@ static struct bfq_queue *bfq_get_bfqq_handle_split(struct bfq_data *bfqd,
60736073
* comments on bfq_init_rq for the reason behind this delayed
60746074
* preparation.
60756075
*/
6076-
static void bfq_prepare_request(struct request *rq, struct bio *bio)
6076+
static void bfq_prepare_request(struct request *rq)
60776077
{
60786078
/*
60796079
* Regardless of whether we have an icq attached, we have to

block/bio-integrity.c

Lines changed: 3 additions & 0 deletions
@@ -42,6 +42,9 @@ struct bio_integrity_payload *bio_integrity_alloc(struct bio *bio,
4242
struct bio_set *bs = bio->bi_pool;
4343
unsigned inline_vecs;
4444

45+
if (WARN_ON_ONCE(bio_has_crypt_ctx(bio)))
46+
return ERR_PTR(-EOPNOTSUPP);
47+
4548
if (!bs || !mempool_initialized(&bs->bio_integrity_pool)) {
4649
bip = kmalloc(struct_size(bip, bip_inline_vecs, nr_vecs), gfp_mask);
4750
inline_vecs = nr_vecs;
