Commit e7a4142

crypto: api - Fix generic algorithm self-test races
On Fri, Aug 30, 2024 at 10:51:54AM -0700, Eric Biggers wrote:
>
> Given below in defconfig form, use 'make olddefconfig' to apply. The failures
> are nondeterministic and sometimes there are different ones, for example:
>
> [ 0.358017] alg: skcipher: failed to allocate transform for cbc(twofish-generic): -2
> [ 0.358365] alg: self-tests for cbc(twofish) using cbc(twofish-generic) failed (rc=-2)
> [ 0.358535] alg: skcipher: failed to allocate transform for cbc(camellia-generic): -2
> [ 0.358918] alg: self-tests for cbc(camellia) using cbc(camellia-generic) failed (rc=-2)
> [ 0.371533] alg: skcipher: failed to allocate transform for xts(ecb(aes-generic)): -2
> [ 0.371922] alg: self-tests for xts(aes) using xts(ecb(aes-generic)) failed (rc=-2)
>
> Modules are not enabled, maybe that matters (I haven't checked yet).

Yes I think that was the key. This triggers a massive self-test run
which executes in parallel and reveals a few race conditions in the
system. I think it boils down to the following scenario:

Base algorithm X-generic, X-optimised
Template Y
Optimised algorithm Y-X-optimised

Everything gets registered, and then the self-tests are started.
When Y-X-optimised gets tested, it requests the creation of the
generic Y(X-generic). Which then itself undergoes testing.

The race is that after Y(X-generic) gets registered, but just
before it gets tested, X-optimised finally finishes self-testing
which then causes all spawns of X-generic to be destroyed. So
by the time the self-test for Y(X-generic) comes along, it can
no longer find the algorithm. This error then bubbles up all
the way up to the self-test of Y-X-optimised which then fails.

Note that there is some complexity that I've omitted here because
when the generic self-test fails to find Y(X-generic) it actually
triggers the construction of it again which then fails for various
other reasons (these are not important because the construction
should *not* be triggered at this point).

So in a way the error is expected, and we should probably remove
the pr_err for the case where ENOENT is returned for the algorithm
that we're currently testing.

The solution is two-fold. First when an algorithm undergoes
self-testing it should not trigger its construction. Secondly
if an instance larval fails to materialise due to it being destroyed
by a more optimised algorithm coming along, it should obviously
retry the construction.

Remove the check in __crypto_alg_lookup that stops a larval from
matching new requests based on differences in the mask. It is better
to block new requests even if it is wrong and then simply retry the
lookup. If this ends up being the wrong larval it will sort itself
out during the retry.

Reduce the CRYPTO_ALG_TYPE_MASK bits in type during larval creation
as otherwise LSKCIPHER algorithms may not match SKCIPHER larvals.

Also block the instance creation during self-testing in the function
crypto_larval_lookup by checking for CRYPTO_ALG_TESTED in the mask
field.

Finally change the return value when crypto_alg_lookup fails in
crypto_larval_wait to EAGAIN to redo the lookup.

Fixes: 37da5d0 ("crypto: api - Do not wait for tests during registration")
Reported-by: Eric Biggers <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
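The two-fold solution described in the message can be pictured with a small toy model. This is only a sketch under stated assumptions: every `demo_*` and `DEMO_*` name below is hypothetical and does not exist in the kernel, and the registry is collapsed into a three-state enum. It shows the two behaviours the message asks for: a lookup that finds its larval destroyed gets -EAGAIN and retries, and a lookup made on behalf of a self-test (flagged in the mask) never triggers construction.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a "tested" bit in the lookup mask. */
#define DEMO_ALG_TESTED 0x1u

/* Toy registry states; not kernel data structures. */
enum demo_state { DEMO_ABSENT, DEMO_LARVAL_DOOMED, DEMO_READY };

struct demo_registry {
	enum demo_state state;
	int constructions;	/* how many times construction was triggered */
};

/* One lookup pass. Returns 0 on success or a negative errno. */
int demo_lookup_once(struct demo_registry *r, unsigned int mask)
{
	switch (r->state) {
	case DEMO_READY:
		return 0;
	case DEMO_LARVAL_DOOMED:
		/* The larval vanished while we waited: ask for a retry. */
		r->state = DEMO_ABSENT;
		return -EAGAIN;
	case DEMO_ABSENT:
	default:
		/* A self-test lookup must not trigger construction. */
		if (mask & DEMO_ALG_TESTED)
			return -ENOENT;
		r->constructions++;
		r->state = DEMO_READY;
		return 0;
	}
}

/* Redo the lookup for as long as a pass reports -EAGAIN. */
int demo_lookup(struct demo_registry *r, unsigned int mask)
{
	int err;

	do {
		err = demo_lookup_once(r, mask);
	} while (err == -EAGAIN);

	return err;
}

/* Small self-check exercising both paths of the toy model. */
void demo_selfcheck(void)
{
	struct demo_registry doomed = { DEMO_LARVAL_DOOMED, 0 };
	struct demo_registry absent = { DEMO_ABSENT, 0 };

	assert(demo_lookup(&doomed, 0) == 0);
	assert(demo_lookup(&absent, DEMO_ALG_TESTED) == -ENOENT);
}
```

In this model a normal lookup that hits a doomed larval succeeds after one retry and one construction, while the self-test lookup fails with -ENOENT and leaves the construction count at zero.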
1 parent b04f06f commit e7a4142

1 file changed, +8 -7 lines changed

crypto/api.c

Lines changed: 8 additions & 7 deletions
@@ -70,11 +70,6 @@ static struct crypto_alg *__crypto_alg_lookup(const char *name, u32 type,
 		if ((q->cra_flags ^ type) & mask)
 			continue;
 
-		if (crypto_is_larval(q) &&
-		    !crypto_is_test_larval((struct crypto_larval *)q) &&
-		    ((struct crypto_larval *)q)->mask != mask)
-			continue;
-
 		exact = !strcmp(q->cra_driver_name, name);
 		fuzzy = !strcmp(q->cra_name, name);
 		if (!exact && !(fuzzy && q->cra_priority > best))
@@ -113,6 +108,8 @@ struct crypto_larval *crypto_larval_alloc(const char *name, u32 type, u32 mask)
 	if (!larval)
 		return ERR_PTR(-ENOMEM);
 
+	type &= ~CRYPTO_ALG_TYPE_MASK | (mask ?: CRYPTO_ALG_TYPE_MASK);
+
 	larval->mask = mask;
 	larval->alg.cra_flags = CRYPTO_ALG_LARVAL | type;
 	larval->alg.cra_priority = -1;
@@ -229,7 +226,7 @@ static struct crypto_alg *crypto_larval_wait(struct crypto_alg *alg)
 		type = alg->cra_flags & ~(CRYPTO_ALG_LARVAL | CRYPTO_ALG_DEAD);
 		mask = larval->mask;
 		alg = crypto_alg_lookup(alg->cra_name, type, mask) ?:
-		      ERR_PTR(-ENOENT);
+		      ERR_PTR(-EAGAIN);
 	} else if (IS_ERR(alg))
 		;
 	else if (crypto_is_test_larval(larval) &&
@@ -308,8 +305,12 @@ static struct crypto_alg *crypto_larval_lookup(const char *name, u32 type,
 
 	if (!IS_ERR_OR_NULL(alg) && crypto_is_larval(alg))
 		alg = crypto_larval_wait(alg);
-	else if (!alg)
+	else if (alg)
+		;
+	else if (!(mask & CRYPTO_ALG_TESTED))
 		alg = crypto_larval_add(name, type, mask);
+	else
+		alg = ERR_PTR(-ENOENT);
 
 	return alg;
 }
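
The bit arithmetic added to crypto_larval_alloc can be checked in isolation. The sketch below is a userspace illustration, not kernel code: the `DEMO_*` constants are assumed values chosen for the example (consult include/linux/crypto.h for the real definitions), `demo_reduce_type` spells the GNU `?:` shorthand as a portable ternary, and `demo_matches` mirrors the `(q->cra_flags ^ type) & mask` test visible in __crypto_alg_lookup. The mismatch shown in the self-check is one hypothetical way an exact-type request could miss a larval whose type bits were left unreduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; assumptions made for this example. */
#define DEMO_TYPE_MASK      0x0000000fu
#define DEMO_TYPE_LSKCIPHER 0x00000004u
#define DEMO_TYPE_SKCIPHER  0x00000005u
#define DEMO_SKCIPHER_MASK  0x0000000eu	/* ignores the lowest type bit */

/* Portable spelling of: type &= ~TYPE_MASK | (mask ?: TYPE_MASK) */
uint32_t demo_reduce_type(uint32_t type, uint32_t mask)
{
	return type & (~DEMO_TYPE_MASK | (mask ? mask : DEMO_TYPE_MASK));
}

/* A candidate matches when no masked bit differs from the request. */
bool demo_matches(uint32_t flags, uint32_t type, uint32_t mask)
{
	return ((flags ^ type) & mask) == 0;
}

/* Walk through the reduction and its effect on matching. */
void demo_mask_selfcheck(void)
{
	uint32_t raw = DEMO_TYPE_SKCIPHER;
	uint32_t reduced = demo_reduce_type(raw, DEMO_SKCIPHER_MASK);

	/* Bits the mask does not care about are cleared from the type. */
	assert(reduced == DEMO_TYPE_LSKCIPHER);

	/* An empty mask leaves the type bits untouched. */
	assert(demo_reduce_type(raw, 0) == raw);

	/* Unreduced larval flags miss an exact LSKCIPHER request ... */
	assert(!demo_matches(raw, DEMO_TYPE_LSKCIPHER, DEMO_TYPE_MASK));

	/* ... while reduced flags match it, and still match SKCIPHER. */
	assert(demo_matches(reduced, DEMO_TYPE_LSKCIPHER, DEMO_TYPE_MASK));
	assert(demo_matches(reduced, DEMO_TYPE_SKCIPHER, DEMO_SKCIPHER_MASK));
}
```

With these assumed values, a type of 0x5 requested under mask 0xe is stored as 0x4, so the stored larval type only carries the bits that the mask actually compares.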
