Commit c8c1095

Merge tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux
Pull zstd update from Nick Terrell:

"Update to zstd-1.4.10.

Add myself as the maintainer of zstd and update the zstd version in the kernel, which is now 4 years out of date, to a much more recent zstd release. This includes bug fixes, much more extensive fuzzing, and performance improvements. It also generates the kernel zstd automatically from upstream zstd, so it is easier to keep the zstd version up to date, and we don't fall so far out of date again.

This includes 5 commits that update the zstd library version:

- Adds a new kernel-style wrapper around zstd. This wrapper API is functionally equivalent to the subset of the current zstd API that is currently used. The wrapper API changes to be kernel style so that the symbols don't collide with zstd's symbols. The update to zstd-1.4.10 maintains the same API and preserves the semantics, so that none of the callers need to be updated. All callers are updated in the commit, because there are zero functional changes.
- Adds an indirection for `lib/decompress_unzstd.c` so it doesn't depend on the layout of `lib/zstd/` to include every source file. This allows the next patch to be automatically generated.
- Imports the zstd-1.4.10 source code. This commit is automatically generated from upstream zstd (https://github.com/facebook/zstd).
- Adds me ([email protected]) as the maintainer of `lib/zstd`.
- Fixes a newly added build warning for clang.

The discussion around this patchset has been pretty long, so I've included a FAQ-style summary of the history of the patchset, and why we are taking this approach.

Why do we need to update?
-------------------------

The zstd version in the kernel is based off of zstd-1.3.1, which was released August 20, 2017. Since then zstd has seen many bug fixes and performance improvements. And, importantly, upstream zstd is continuously fuzzed by OSS-Fuzz, and bug fixes aren't backported to older versions. So the only way to sanely get these fixes is to keep up to date with upstream zstd.
There are no known security issues that affect the kernel, but we need to be able to update in case there are. And while there are no known security issues, there are relevant bug fixes. For example the problem with large kernel decompression has been fixed upstream for over 2 years [1].

Additionally the performance improvements for kernel use cases are significant. Measured for x86_64 on my Intel i9-9900k @ 3.6 GHz:

- BtrFS zstd compression at levels 1 and 3 is 5% faster
- BtrFS zstd decompression+read is 15% faster
- SquashFS zstd decompression+read is 15% faster
- F2FS zstd compression+write at level 3 is 8% faster
- F2FS zstd decompression+read is 20% faster
- ZRAM decompression+read is 30% faster
- Kernel zstd decompression is 35% faster
- Initramfs zstd decompression+build is 5% faster

On top of this, there are significant performance improvements coming down the line in the next zstd release, and the new automated update patch generation will allow us to pull them easily.

How is the update patch generated?
----------------------------------

The first two patches are preparation for updating the zstd version. Then the 3rd patch in the series imports upstream zstd into the kernel. This patch is automatically generated from upstream. A script makes the necessary changes and imports it into the kernel. The changes are:

- Replace all libc dependencies with kernel replacements and rewrite includes.
- Remove unnecessary portability macros like: #if defined(_MSC_VER).
- Use the kernel xxhash instead of bundling it.

This automation gets tested every commit by upstream's continuous integration. When we cut a new zstd release, we will submit a patch to the kernel to update the zstd version in the kernel. The automated process makes it easy to keep the kernel version of zstd up to date.

The current zstd in the kernel shares the guts of the code, but has a lot of API and minor changes to work in the kernel.
This is because at the time upstream zstd was not ready to be used in the kernel environment as-is. But, since then upstream zstd has evolved to support being used in the kernel as-is.

Why are we updating in one big patch?
-------------------------------------

The 3rd patch in the series is very large. This is because it is restructuring the code, so it both deletes the existing zstd, and re-adds the new structure. Future updates will be directly proportional to the changes in upstream zstd since the last import. They will admittedly be large, as zstd is an actively developed project, and has hundreds of commits between every release. However, there is no other great alternative.

One option ruled out is to replay every upstream zstd commit. This is not feasible for several reasons:

- There are over 3500 upstream commits since the zstd version in the kernel.
- The automation to automatically generate the kernel update was only added recently, so older commits cannot easily be imported.
- Not every upstream zstd commit builds.
- Only zstd releases are "supported", and individual commits may have bugs that were fixed before a release.

Another option to reduce the patch size would be to first reorganize to the new file structure, and then apply the patch. However, the current kernel zstd is formatted with clang-format to be more "kernel-like". But, the new method imports zstd as-is, without additional formatting, to allow for closer correlation with upstream, and easier debugging. So the patch wouldn't be any smaller.

It also doesn't make sense to import upstream zstd commit by commit going forward. Upstream zstd doesn't support production use cases running off the development branch. We have a lot of post-commit fuzzing that catches many bugs, so individual commits may be buggy, but fixed before a release. So going forward, I intend to import every (important) zstd release into the Kernel.
So, while it isn't ideal, updating in one big patch is the only path I see forward.

Who is responsible for this code?
---------------------------------

I am. This patchset adds me as the maintainer for zstd. Previously, there was no tree for zstd patches. Because of that, there were several patches that either got ignored, or took a long time to merge, since it wasn't clear which tree should pick them up. I'm officially stepping up as maintainer, and setting up my tree as the path through which zstd patches get merged. I'll make sure that patches to the kernel zstd get ported upstream, so they aren't erased when the next version update happens.

How is this code tested?
------------------------

I tested every caller of zstd on x86_64 (BtrFS, ZRAM, SquashFS, F2FS, Kernel, InitRAMFS). I also tested Kernel & InitRAMFS on i386 and aarch64. I checked both performance and correctness.

Also, thanks to many people in the community who have tested these patches locally.

Lastly, this code will bake in linux-next before being merged into v5.16.

Why update to zstd-1.4.10 when zstd-1.5.0 has been released?
------------------------------------------------------------

This patchset has been outstanding since 2020, and zstd-1.4.10 was the latest release when it was created. Since the update patch is automatically generated from upstream, I could generate it from zstd-1.5.0. However, there were some large stack usage regressions in zstd-1.5.0, which are only fixed in the latest development branch. And the latest development branch contains some new code that needs to bake in the fuzzer before I would feel comfortable releasing to the kernel.

Once this patchset has been merged, and we've released zstd-1.5.1, we can update the kernel to zstd-1.5.1, and exercise the update process.

You may notice that zstd-1.4.10 doesn't exist upstream. This release is an artificial release based off of zstd-1.4.9, with some fixes for the kernel backported from the development branch.
I will tag the zstd-1.4.10 release after this patchset is merged, so the Linux Kernel is running a known version of zstd that can be debugged upstream.

Why was a wrapper API added?
----------------------------

The first versions of this patchset migrated the kernel to the upstream zstd API. It first added a shim API that supported the new upstream API with the old code, then updated callers to use the new shim API, then transitioned to the new code and deleted the shim API.

However, Christoph Hellwig suggested that we transition to a kernel style API, and hide zstd's upstream API behind that. This is because zstd's upstream API supports many other use cases, and does not follow the kernel style guide, while the kernel API is focused on the kernel's use cases, and follows the kernel style guide.

Where is the previous discussion?
---------------------------------

Links for the discussions of the previous versions of the patch set are below. The largest changes in the design of the patchset are driven by the discussions in v11, v5, and v1.
Sorry for the mix of links, I couldn't find most of the threads on lkml.org"

Link: https://lkml.org/lkml/2020/9/29/27 [1]
Link: https://www.spinics.net/lists/linux-crypto/msg58189.html [v12]
Link: https://lore.kernel.org/linux-btrfs/[email protected]/ [v11]
Link: https://lore.kernel.org/lkml/[email protected]/ [v10]
Link: https://lore.kernel.org/linux-btrfs/[email protected]/ [v9]
Link: https://lore.kernel.org/linux-f2fs-devel/[email protected]/ [v8]
Link: https://lkml.org/lkml/2020/12/3/1195 [v7]
Link: https://lkml.org/lkml/2020/12/2/1245 [v6]
Link: https://lore.kernel.org/linux-btrfs/[email protected]/ [v5]
Link: https://www.spinics.net/lists/linux-btrfs/msg105783.html [v4]
Link: https://lkml.org/lkml/2020/9/23/1074 [v3]
Link: https://www.spinics.net/lists/linux-btrfs/msg105505.html [v2]
Link: https://lore.kernel.org/linux-btrfs/[email protected]/ [v1]
Signed-off-by: Nick Terrell <[email protected]>
Tested-by: Paul Jones <[email protected]>
Tested-by: Oleksandr Natalenko <[email protected]>
Tested-by: Sedat Dilek <[email protected]> # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard <[email protected]>

* tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux:
  lib: zstd: Add cast to silence clang's -Wbitwise-instead-of-logical
  MAINTAINERS: Add maintainer entry for zstd
  lib: zstd: Upgrade to latest upstream zstd version 1.4.10
  lib: zstd: Add decompress_sources.h for decompress_unzstd
  lib: zstd: Add kernel-specific API

2 parents ccfff0a + 0a8ea23 commit c8c1095

76 files changed, 27373 insertions(+), 12941 deletions(-)

MAINTAINERS

Lines changed: 12 additions & 0 deletions
@@ -21078,6 +21078,18 @@ F: Documentation/vm/zsmalloc.rst
 F:	include/linux/zsmalloc.h
 F:	mm/zsmalloc.c
 
+ZSTD
+M:	Nick Terrell <[email protected]>
+S:	Maintained
+B:	https://github.com/facebook/zstd/issues
+T:	git git://github.com/terrelln/linux.git
+F:	include/linux/zstd*
+F:	lib/zstd/
+F:	lib/decompress_unzstd.c
+F:	crypto/zstd.c
+N:	zstd
+K:	zstd
+
 ZSWAP COMPRESSED SWAP CACHING
 M:	Seth Jennings <[email protected]>
 M:	Dan Streetman <[email protected]>

crypto/zstd.c

Lines changed: 14 additions & 14 deletions
@@ -18,30 +18,30 @@
 #define ZSTD_DEF_LEVEL 3
 
 struct zstd_ctx {
-	ZSTD_CCtx *cctx;
-	ZSTD_DCtx *dctx;
+	zstd_cctx *cctx;
+	zstd_dctx *dctx;
 	void *cwksp;
 	void *dwksp;
 };
 
-static ZSTD_parameters zstd_params(void)
+static zstd_parameters zstd_params(void)
 {
-	return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0);
+	return zstd_get_params(ZSTD_DEF_LEVEL, 0);
 }
 
 static int zstd_comp_init(struct zstd_ctx *ctx)
 {
 	int ret = 0;
-	const ZSTD_parameters params = zstd_params();
-	const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
+	const zstd_parameters params = zstd_params();
+	const size_t wksp_size = zstd_cctx_workspace_bound(&params.cParams);
 
 	ctx->cwksp = vzalloc(wksp_size);
 	if (!ctx->cwksp) {
 		ret = -ENOMEM;
 		goto out;
 	}
 
-	ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size);
+	ctx->cctx = zstd_init_cctx(ctx->cwksp, wksp_size);
 	if (!ctx->cctx) {
 		ret = -EINVAL;
 		goto out_free;
@@ -56,15 +56,15 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 static int zstd_decomp_init(struct zstd_ctx *ctx)
 {
 	int ret = 0;
-	const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+	const size_t wksp_size = zstd_dctx_workspace_bound();
 
 	ctx->dwksp = vzalloc(wksp_size);
 	if (!ctx->dwksp) {
 		ret = -ENOMEM;
 		goto out;
 	}
 
-	ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size);
+	ctx->dctx = zstd_init_dctx(ctx->dwksp, wksp_size);
 	if (!ctx->dctx) {
 		ret = -EINVAL;
 		goto out_free;
@@ -152,10 +152,10 @@ static int __zstd_compress(const u8 *src, unsigned int slen,
 {
 	size_t out_len;
 	struct zstd_ctx *zctx = ctx;
-	const ZSTD_parameters params = zstd_params();
+	const zstd_parameters params = zstd_params();
 
-	out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params);
-	if (ZSTD_isError(out_len))
+	out_len = zstd_compress_cctx(zctx->cctx, dst, *dlen, src, slen, &params);
+	if (zstd_is_error(out_len))
 		return -EINVAL;
 	*dlen = out_len;
 	return 0;
@@ -182,8 +182,8 @@ static int __zstd_decompress(const u8 *src, unsigned int slen,
 	size_t out_len;
 	struct zstd_ctx *zctx = ctx;
 
-	out_len = ZSTD_decompressDCtx(zctx->dctx, dst, *dlen, src, slen);
-	if (ZSTD_isError(out_len))
+	out_len = zstd_decompress_dctx(zctx->dctx, dst, *dlen, src, slen);
+	if (zstd_is_error(out_len))
 		return -EINVAL;
 	*dlen = out_len;
 	return 0;
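
The renames in crypto/zstd.c all follow one pattern: a camelCase upstream-style symbol gets a snake_case kernel-style counterpart with the same semantics, and the caller turns the `size_t` result into an errno. The following is a minimal user-space sketch of that pattern, not the kernel's wrapper. The `UPSTREAM_*` functions, the `UPSTREAM_MAX_ERROR` constant, and every `demo_*` name are hypothetical stand-ins invented for illustration, and the "compression" is a plain copy so that the example compiles on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-ins for an upstream-style library. These are NOT the
 * real zstd functions; they exist only so the sketch compiles on its own.
 * Assumed convention: failures are returned as small negative values cast
 * to size_t, so they land at the very top of the size_t range.
 */
#define UPSTREAM_MAX_ERROR 120

static unsigned int UPSTREAM_isError(size_t code)
{
	return code > (size_t)-UPSTREAM_MAX_ERROR;
}

/* "Compression" here is a plain copy; only the calling convention matters. */
static size_t UPSTREAM_compress(void *dst, size_t dst_cap,
				const void *src, size_t src_len)
{
	if (dst_cap < src_len)
		return (size_t)-1;	/* encoded error: destination too small */
	memcpy(dst, src, src_len);
	return src_len;
}

/* Kernel-style wrappers: snake_case names, identical semantics. */
static unsigned int demo_is_error(size_t code)
{
	return UPSTREAM_isError(code);
}

static size_t demo_compress(void *dst, size_t dst_cap,
			    const void *src, size_t src_len)
{
	return UPSTREAM_compress(dst, dst_cap, src, src_len);
}

/* Caller in the style of __zstd_compress(): size_t result in, errno out. */
static int demo_crypto_compress(const unsigned char *src, unsigned int slen,
				unsigned char *dst, unsigned int *dlen)
{
	size_t out_len;

	assert(src && dst && dlen);
	out_len = demo_compress(dst, *dlen, src, slen);
	if (demo_is_error(out_len))
		return -EINVAL;
	*dlen = (unsigned int)out_len;
	return 0;
}
```

Because the wrappers only forward their arguments, a caller that checks the result through the wrapper behaves the same as one that called the upstream-style function directly, which is the property that lets the diff above be a pure rename.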

fs/btrfs/zstd.c

Lines changed: 34 additions & 34 deletions
@@ -28,10 +28,10 @@
 /* 307s to avoid pathologically clashing with transaction commit */
 #define ZSTD_BTRFS_RECLAIM_JIFFIES (307 * HZ)
 
-static ZSTD_parameters zstd_get_btrfs_parameters(unsigned int level,
+static zstd_parameters zstd_get_btrfs_parameters(unsigned int level,
 						 size_t src_len)
 {
-	ZSTD_parameters params = ZSTD_getParams(level, src_len, 0);
+	zstd_parameters params = zstd_get_params(level, src_len);
 
 	if (params.cParams.windowLog > ZSTD_BTRFS_MAX_WINDOWLOG)
 		params.cParams.windowLog = ZSTD_BTRFS_MAX_WINDOWLOG;
@@ -48,8 +48,8 @@ struct workspace {
 	unsigned long last_used; /* jiffies */
 	struct list_head list;
 	struct list_head lru_list;
-	ZSTD_inBuffer in_buf;
-	ZSTD_outBuffer out_buf;
+	zstd_in_buffer in_buf;
+	zstd_out_buffer out_buf;
 };
 
 /*
@@ -155,12 +155,12 @@ static void zstd_calc_ws_mem_sizes(void)
 	unsigned int level;
 
 	for (level = 1; level <= ZSTD_BTRFS_MAX_LEVEL; level++) {
-		ZSTD_parameters params =
+		zstd_parameters params =
 			zstd_get_btrfs_parameters(level, ZSTD_BTRFS_MAX_INPUT);
 		size_t level_size =
 			max_t(size_t,
-			      ZSTD_CStreamWorkspaceBound(params.cParams),
-			      ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT));
+			      zstd_cstream_workspace_bound(&params.cParams),
+			      zstd_dstream_workspace_bound(ZSTD_BTRFS_MAX_INPUT));
 
 		max_size = max_t(size_t, max_size, level_size);
 		zstd_ws_mem_sizes[level - 1] = max_size;
@@ -371,7 +371,7 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
 		unsigned long *total_in, unsigned long *total_out)
 {
 	struct workspace *workspace = list_entry(ws, struct workspace, list);
-	ZSTD_CStream *stream;
+	zstd_cstream *stream;
 	int ret = 0;
 	int nr_pages = 0;
 	struct page *in_page = NULL; /* The current page to read */
@@ -381,18 +381,18 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
 	unsigned long len = *total_out;
 	const unsigned long nr_dest_pages = *out_pages;
 	unsigned long max_out = nr_dest_pages * PAGE_SIZE;
-	ZSTD_parameters params = zstd_get_btrfs_parameters(workspace->req_level,
+	zstd_parameters params = zstd_get_btrfs_parameters(workspace->req_level,
 							   len);
 
 	*out_pages = 0;
 	*total_out = 0;
 	*total_in = 0;
 
 	/* Initialize the stream */
-	stream = ZSTD_initCStream(params, len, workspace->mem,
+	stream = zstd_init_cstream(&params, len, workspace->mem,
 			workspace->size);
 	if (!stream) {
-		pr_warn("BTRFS: ZSTD_initCStream failed\n");
+		pr_warn("BTRFS: zstd_init_cstream failed\n");
 		ret = -EIO;
 		goto out;
 	}
@@ -418,11 +418,11 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
 	while (1) {
 		size_t ret2;
 
-		ret2 = ZSTD_compressStream(stream, &workspace->out_buf,
+		ret2 = zstd_compress_stream(stream, &workspace->out_buf,
 				&workspace->in_buf);
-		if (ZSTD_isError(ret2)) {
-			pr_debug("BTRFS: ZSTD_compressStream returned %d\n",
-					ZSTD_getErrorCode(ret2));
+		if (zstd_is_error(ret2)) {
+			pr_debug("BTRFS: zstd_compress_stream returned %d\n",
+					zstd_get_error_code(ret2));
 			ret = -EIO;
 			goto out;
 		}
@@ -487,10 +487,10 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
 	while (1) {
 		size_t ret2;
 
-		ret2 = ZSTD_endStream(stream, &workspace->out_buf);
-		if (ZSTD_isError(ret2)) {
-			pr_debug("BTRFS: ZSTD_endStream returned %d\n",
-					ZSTD_getErrorCode(ret2));
+		ret2 = zstd_end_stream(stream, &workspace->out_buf);
+		if (zstd_is_error(ret2)) {
+			pr_debug("BTRFS: zstd_end_stream returned %d\n",
+					zstd_get_error_code(ret2));
 			ret = -EIO;
 			goto out;
 		}
@@ -548,17 +548,17 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 	struct workspace *workspace = list_entry(ws, struct workspace, list);
 	struct page **pages_in = cb->compressed_pages;
 	size_t srclen = cb->compressed_len;
-	ZSTD_DStream *stream;
+	zstd_dstream *stream;
 	int ret = 0;
 	unsigned long page_in_index = 0;
 	unsigned long total_pages_in = DIV_ROUND_UP(srclen, PAGE_SIZE);
 	unsigned long buf_start;
 	unsigned long total_out = 0;
 
-	stream = ZSTD_initDStream(
+	stream = zstd_init_dstream(
 			ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
 	if (!stream) {
-		pr_debug("BTRFS: ZSTD_initDStream failed\n");
+		pr_debug("BTRFS: zstd_init_dstream failed\n");
 		ret = -EIO;
 		goto done;
 	}
@@ -574,11 +574,11 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 	while (1) {
 		size_t ret2;
 
-		ret2 = ZSTD_decompressStream(stream, &workspace->out_buf,
+		ret2 = zstd_decompress_stream(stream, &workspace->out_buf,
 				&workspace->in_buf);
-		if (ZSTD_isError(ret2)) {
-			pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
-					ZSTD_getErrorCode(ret2));
+		if (zstd_is_error(ret2)) {
+			pr_debug("BTRFS: zstd_decompress_stream returned %d\n",
+					zstd_get_error_code(ret2));
 			ret = -EIO;
 			goto done;
 		}
@@ -624,16 +624,16 @@ int zstd_decompress(struct list_head *ws, unsigned char *data_in,
 		size_t destlen)
 {
 	struct workspace *workspace = list_entry(ws, struct workspace, list);
-	ZSTD_DStream *stream;
+	zstd_dstream *stream;
 	int ret = 0;
 	size_t ret2;
 	unsigned long total_out = 0;
 	unsigned long pg_offset = 0;
 
-	stream = ZSTD_initDStream(
+	stream = zstd_init_dstream(
 			ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
 	if (!stream) {
-		pr_warn("BTRFS: ZSTD_initDStream failed\n");
+		pr_warn("BTRFS: zstd_init_dstream failed\n");
 		ret = -EIO;
 		goto finish;
 	}
@@ -657,15 +657,15 @@ int zstd_decompress(struct list_head *ws, unsigned char *data_in,
 
 		/* Check if the frame is over and we still need more input */
 		if (ret2 == 0) {
-			pr_debug("BTRFS: ZSTD_decompressStream ended early\n");
+			pr_debug("BTRFS: zstd_decompress_stream ended early\n");
 			ret = -EIO;
 			goto finish;
 		}
-		ret2 = ZSTD_decompressStream(stream, &workspace->out_buf,
+		ret2 = zstd_decompress_stream(stream, &workspace->out_buf,
 				&workspace->in_buf);
-		if (ZSTD_isError(ret2)) {
-			pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
-					ZSTD_getErrorCode(ret2));
+		if (zstd_is_error(ret2)) {
+			pr_debug("BTRFS: zstd_decompress_stream returned %d\n",
+					zstd_get_error_code(ret2));
 			ret = -EIO;
 			goto finish;
 		}
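
The btrfs hunks above all drive a stream through a pair of in/out buffer descriptors, calling a step function until the input is consumed. The loop shape can be sketched in user space as follows. This is only an illustration under stated assumptions: the `demo_*` structs mirror the base pointer, size, and cursor layout of zstd's in/out buffers, but `demo_stream_step` is a hypothetical stand-in that merely copies bytes, and `demo_stream_all` is an invented driver, not btrfs code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Buffer descriptors: base pointer, total size, and a cursor the step advances. */
struct demo_in_buffer {
	const void *src;
	size_t size;
	size_t pos;
};

struct demo_out_buffer {
	void *dst;
	size_t size;
	size_t pos;
};

/*
 * Hypothetical stand-in for one streaming step (NOT a real codec): copies as
 * many bytes as fit, advances both cursors, and returns how many input bytes
 * are still pending.
 */
static size_t demo_stream_step(struct demo_out_buffer *out,
			       struct demo_in_buffer *in)
{
	size_t in_left = in->size - in->pos;
	size_t out_left = out->size - out->pos;
	size_t n = in_left < out_left ? in_left : out_left;

	memcpy((char *)out->dst + out->pos, (const char *)in->src + in->pos, n);
	in->pos += n;
	out->pos += n;
	return in->size - in->pos;
}

/*
 * Drive the step through fixed-size output chunks, in the spirit of a
 * page-at-a-time loop. Returns the number of bytes written to dst.
 */
static size_t demo_stream_all(const void *src, size_t src_len,
			      void *dst, size_t dst_cap, size_t chunk)
{
	struct demo_in_buffer in = { src, src_len, 0 };
	size_t total = 0;

	assert(chunk > 0);
	while (in.pos < in.size) {
		struct demo_out_buffer out;
		size_t room = dst_cap - total;

		if (room == 0)
			break;	/* destination full; stop with partial output */
		out.dst = (char *)dst + total;
		out.size = room < chunk ? room : chunk;
		out.pos = 0;
		demo_stream_step(&out, &in);
		total += out.pos;
	}
	return total;
}
```

The point of the cursor fields is that the caller never computes how much was consumed or produced; it reads `pos` after each step and decides whether to supply a new output chunk or stop.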
