Commit 67a135b

Merge tag 'erofs-for-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs updates from Gao Xiang:
 "There are some new features available for this cycle. Firstly, EROFS
  LZMA algorithm support, specifically called MicroLZMA, is available as
  an option for embedded devices, LiveCDs and/or as the secondary
  auxiliary compression algorithm besides the primary algorithm in one
  file.

  In order to better support the LZMA fixed-sized output compression,
  especially for 4KiB pcluster size (which has lowest memory pressure
  thus useful for memory-sensitive scenarios), Lasse introduced a new
  LZMA header/container format called MicroLZMA to minimize the original
  LZMA1 header (for example, we don't need to waste 4-byte dictionary
  size and another 8-byte uncompressed size, which can be calculated by
  fs directly, for each pcluster) and enable EROFS fixed-sized output
  compression.

  Note that MicroLZMA can also be later used by other things in addition
  to EROFS too where wasting minimal amount of space for headers is
  important and it can be only compiled by enabling XZ_DEC_MICROLZMA.

  MicroLZMA has been supported by the latest upstream XZ embedded [1] &
  XZ utils [2], apply the latest related XZ embedded upstream patches by
  the XZ author Lasse here.

  Secondly, multiple device is also supported in this cycle, which is
  designed for multi-layer container images. By working together with
  inter-layer data deduplication and compression, we can achieve the
  next high-performance container image solution. Our team will announce
  the new Nydus container image service [3] implementation with new RAFS
  v6 (EROFS-compatible) format in Open Source Summit 2021 China [4]
  soon.

  Besides, the secondary compression head support and readmore
  decompression strategy are also included in this cycle. There are also
  some minor bugfixes and cleanups, as always.

  Summary:

   - support multiple devices for multi-layer container images;

   - support the secondary compression head;

   - support readmore decompression strategy;

   - support new LZMA algorithm (specifically called MicroLZMA);

   - some bugfixes & cleanups"

* tag 'erofs-for-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: don't trigger WARN() when decompression fails
  erofs: get rid of ->lru usage
  erofs: lzma compression support
  erofs: rename some generic methods in decompressor
  lib/xz, lib/decompress_unxz.c: Fix spelling in comments
  lib/xz: Add MicroLZMA decoder
  lib/xz: Move s->lzma.len = 0 initialization to lzma_reset()
  lib/xz: Validate the value before assigning it to an enum variable
  lib/xz: Avoid overlapping memcpy() with invalid input with in-place decompression
  erofs: introduce readmore decompression strategy
  erofs: introduce the secondary compression head
  erofs: get compression algorithms directly on mapping
  erofs: add multiple device support
  erofs: decouple basic mount options from fs_context
  erofs: remove the fast path of per-CPU buffer decompression
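
To put the header saving mentioned in the pull message into numbers, here is a small sketch. It uses only the two field sizes quoted above (a 4-byte dictionary size and an 8-byte uncompressed size per pcluster); the function names are made up for illustration and do not come from the kernel or from XZ.

```c
#include <stdio.h>

/* Header fields the pull message says are not stored per pcluster. */
#define LZMA1_DICT_SIZE_BYTES	4u
#define LZMA1_UNCOMP_SIZE_BYTES	8u

/* Bytes saved per pcluster by dropping both fields. */
static unsigned int header_bytes_saved(void)
{
	return LZMA1_DICT_SIZE_BYTES + LZMA1_UNCOMP_SIZE_BYTES;
}

/* Saving relative to the pcluster size, in parts per million (rounded down). */
static unsigned int saved_ppm(unsigned int pcluster_size)
{
	return (unsigned int)((unsigned long long)header_bytes_saved() *
			      1000000ull / pcluster_size);
}

/* Print one line of the comparison for a given pcluster size. */
static void print_saving(unsigned int pcluster_size)
{
	printf("pcluster %u bytes: %u header bytes saved (%u ppm)\n",
	       pcluster_size, header_bytes_saved(), saved_ppm(pcluster_size));
}
```

Calling `print_saving(4096)` covers the 4KiB case the message singles out. The relative saving shrinks as the pcluster grows, which is why the small pcluster size benefits most.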
2 parents cd3e8ea + a0961f3 commit 67a135b

25 files changed: 1281 additions and 320 deletions

Documentation/filesystems/erofs.rst

Lines changed: 8 additions & 4 deletions
@@ -19,9 +19,10 @@ It is designed as a better filesystem solution for the following scenarios:
    immutable and bit-for-bit identical to the official golden image for
    their releases due to security and other considerations and

- - hope to save some extra storage space with guaranteed end-to-end performance
-   by using reduced metadata and transparent file compression, especially
-   for those embedded devices with limited memory (ex, smartphone);
+ - hope to minimize extra storage space with guaranteed end-to-end performance
+   by using compact layout, transparent file compression and direct access,
+   especially for those embedded devices with limited memory and high-density
+   hosts with numerous containers;

 Here is the main features of EROFS:

@@ -51,7 +52,9 @@ Here is the main features of EROFS:
  - Support POSIX.1e ACLs by using xattrs;

  - Support transparent data compression as an option:
-   LZ4 algorithm with the fixed-sized output compression for high performance.
+   LZ4 algorithm with the fixed-sized output compression for high performance;
+
+ - Multiple device support for multi-layer container images.

 The following git tree provides the file system user-space tools under
 development (ex, formatting tool mkfs.erofs):
@@ -87,6 +90,7 @@ cache_strategy=%s Select a strategy for cached decompression from now on:
 dax={always,never}     Use direct access (no page cache).  See
                        Documentation/filesystems/dax.rst.
 dax                    A legacy option which is an alias for ``dax=always``.
+device=%s              Specify a path to an extra device to be used together.
 ===================    =========================================================

 On-disk details
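
As a rough illustration of the new `device=%s` mount option, the helper below builds a mount data string with one `device=<path>` entry per extra device. It is only a sketch: the helper name is hypothetical, and it assumes the option may be repeated once per extra device, which this documentation hunk does not state.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "device=<p0>,device=<p1>,..." into buf.
 * Returns 0 on success, -1 if buf is too small.
 * Hypothetical helper, not part of the kernel or erofs-utils.
 */
static int build_erofs_device_opts(char *buf, size_t len,
				   const char *const *paths, size_t n)
{
	size_t used = 0;
	size_t i;

	if (len == 0)
		return -1;
	buf[0] = '\0';

	for (i = 0; i < n; i++) {
		/* separate entries with ',' as mount option strings do */
		int ret = snprintf(buf + used, len - used, "%sdevice=%s",
				   i ? "," : "", paths[i]);

		if (ret < 0 || (size_t)ret >= len - used)
			return -1;	/* output would have been truncated */
		used += (size_t)ret;
	}
	return 0;
}
```

The resulting string could then be handed to `mount -o` or used as the `data` argument of `mount(2)`; the device paths passed in are entirely up to the caller.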

fs/erofs/Kconfig

Lines changed: 31 additions & 9 deletions
@@ -6,16 +6,22 @@ config EROFS_FS
 	select FS_IOMAP
 	select LIBCRC32C
 	help
-	  EROFS (Enhanced Read-Only File System) is a lightweight
-	  read-only file system with modern designs (eg. page-sized
-	  blocks, inline xattrs/data, etc.) for scenarios which need
-	  high-performance read-only requirements, e.g. Android OS
-	  for mobile phones and LIVECDs.
+	  EROFS (Enhanced Read-Only File System) is a lightweight read-only
+	  file system with modern designs (e.g. no buffer heads, inline
+	  xattrs/data, chunk-based deduplication, multiple devices, etc.) for
+	  scenarios which need high-performance read-only solutions, e.g.
+	  smartphones with Android OS, LiveCDs and high-density hosts with
+	  numerous containers;

-	  It also provides fixed-sized output compression support,
-	  which improves storage density, keeps relatively higher
-	  compression ratios, which is more useful to achieve high
-	  performance for embedded devices with limited memory.
+	  It also provides fixed-sized output compression support in order to
+	  improve storage density as well as keep relatively higher compression
+	  ratios and implements in-place decompression to reuse the file page
+	  for compressed data temporarily with proper strategies, which is
+	  quite useful to ensure guaranteed end-to-end runtime decompression
+	  performance under extremely memory pressure without extra cost.
+
+	  See the documentation at <file:Documentation/filesystems/erofs.rst>
+	  for more details.

 	  If unsure, say N.

@@ -76,3 +82,19 @@ config EROFS_FS_ZIP
 	  Enable fixed-sized output compression for EROFS.

 	  If you don't want to enable compression feature, say N.
+
+config EROFS_FS_ZIP_LZMA
+	bool "EROFS LZMA compressed data support"
+	depends on EROFS_FS_ZIP
+	select XZ_DEC
+	select XZ_DEC_MICROLZMA
+	help
+	  Saying Y here includes support for reading EROFS file systems
+	  containing LZMA compressed data, specifically called microLZMA. it
+	  gives better compression ratios than the LZ4 algorithm, at the
+	  expense of more CPU overhead.
+
+	  LZMA support is an experimental feature for now and so most file
+	  systems will be readable without selecting this option.
+
+	  If unsure, say N.

fs/erofs/Makefile

Lines changed: 1 addition & 0 deletions
@@ -4,3 +4,4 @@ obj-$(CONFIG_EROFS_FS) += erofs.o
 erofs-objs := super.o inode.o data.o namei.o dir.o utils.o pcpubuf.o
 erofs-$(CONFIG_EROFS_FS_XATTR) += xattr.o
 erofs-$(CONFIG_EROFS_FS_ZIP) += decompressor.o zmap.o zdata.o
+erofs-$(CONFIG_EROFS_FS_ZIP_LZMA) += decompressor_lzma.o

fs/erofs/compress.h

Lines changed: 19 additions & 9 deletions
@@ -8,11 +8,6 @@

 #include "internal.h"

-enum {
-	Z_EROFS_COMPRESSION_SHIFTED = Z_EROFS_COMPRESSION_MAX,
-	Z_EROFS_COMPRESSION_RUNTIME_MAX
-};
-
 struct z_erofs_decompress_req {
 	struct super_block *sb;
 	struct page **in, **out;
@@ -25,6 +20,12 @@ struct z_erofs_decompress_req {
 	bool inplace_io, partial_decoding;
 };

+struct z_erofs_decompressor {
+	int (*decompress)(struct z_erofs_decompress_req *rq,
+			  struct page **pagepool);
+	char *name;
+};
+
 /* some special page->private (unsigned long, see below) */
 #define Z_EROFS_SHORTLIVED_PAGE		(-1UL << 2)
 #define Z_EROFS_PREALLOCATED_PAGE	(-2UL << 2)
@@ -63,7 +64,7 @@ static inline bool z_erofs_is_shortlived_page(struct page *page)
 	return true;
 }

-static inline bool z_erofs_put_shortlivedpage(struct list_head *pagepool,
+static inline bool z_erofs_put_shortlivedpage(struct page **pagepool,
 					      struct page *page)
 {
 	if (!z_erofs_is_shortlived_page(page))
@@ -74,13 +75,22 @@ static inline bool z_erofs_put_shortlivedpage(struct list_head *pagepool,
 		put_page(page);
 	} else {
 		/* follow the pcluster rule above. */
-		set_page_private(page, 0);
-		list_add(&page->lru, pagepool);
+		erofs_pagepool_add(pagepool, page);
 	}
 	return true;
 }

+#define MNGD_MAPPING(sbi)	((sbi)->managed_cache->i_mapping)
+static inline bool erofs_page_is_managed(const struct erofs_sb_info *sbi,
+					 struct page *page)
+{
+	return page->mapping == MNGD_MAPPING(sbi);
+}
+
 int z_erofs_decompress(struct z_erofs_decompress_req *rq,
-		       struct list_head *pagepool);
+		       struct page **pagepool);

+/* prototypes for specific algorithms */
+int z_erofs_lzma_decompress(struct z_erofs_decompress_req *rq,
+			    struct page **pagepool);
 #endif
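
The hunks above switch the page pool from a `struct list_head` chained through `page->lru` to a plain `struct page **`. The body of `erofs_pagepool_add()` is not part of this excerpt, so the following userspace model is an assumption about how such a pool can work: each pooled page stores the previous pool head in its private field, forming a singly linked LIFO. All `model_*` names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct page; only the private word matters here. */
struct model_page {
	uintptr_t private;	/* the kernel uses unsigned long for page->private */
};

/* Push a page: remember the old head in ->private, then become the head. */
static void model_pagepool_add(struct model_page **pagepool,
			       struct model_page *page)
{
	page->private = (uintptr_t)*pagepool;
	*pagepool = page;
}

/* Pop the most recently added page, or return NULL if the pool is empty. */
static struct model_page *model_pagepool_get(struct model_page **pagepool)
{
	struct model_page *page = *pagepool;

	if (page) {
		*pagepool = (struct model_page *)page->private;
		page->private = 0;	/* do not leak the link to the caller */
	}
	return page;
}
```

A pool modelled this way needs only one pointer on the caller's stack and leaves the `lru` field untouched, which matches the "get rid of ->lru usage" item in the shortlog.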

fs/erofs/data.c

Lines changed: 60 additions & 13 deletions
@@ -89,6 +89,7 @@ static int erofs_map_blocks(struct inode *inode,
 	erofs_off_t pos;
 	int err = 0;

+	map->m_deviceid = 0;
 	if (map->m_la >= inode->i_size) {
 		/* leave out-of-bound access unmapped */
 		map->m_flags = 0;
@@ -135,14 +136,8 @@ static int erofs_map_blocks(struct inode *inode,
 		map->m_flags = 0;
 		break;
 	default:
-		/* only one device is supported for now */
-		if (idx->device_id) {
-			erofs_err(sb, "invalid device id %u @ %llu for nid %llu",
-				  le16_to_cpu(idx->device_id),
-				  chunknr, vi->nid);
-			err = -EFSCORRUPTED;
-			goto out_unlock;
-		}
+		map->m_deviceid = le16_to_cpu(idx->device_id) &
+			EROFS_SB(sb)->device_id_mask;
 		map->m_pa = blknr_to_addr(le32_to_cpu(idx->blkaddr));
 		map->m_flags = EROFS_MAP_MAPPED;
 		break;
@@ -155,11 +150,55 @@ static int erofs_map_blocks(struct inode *inode,
 	return err;
 }

+int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map)
+{
+	struct erofs_dev_context *devs = EROFS_SB(sb)->devs;
+	struct erofs_device_info *dif;
+	int id;
+
+	/* primary device by default */
+	map->m_bdev = sb->s_bdev;
+	map->m_daxdev = EROFS_SB(sb)->dax_dev;
+
+	if (map->m_deviceid) {
+		down_read(&devs->rwsem);
+		dif = idr_find(&devs->tree, map->m_deviceid - 1);
+		if (!dif) {
+			up_read(&devs->rwsem);
+			return -ENODEV;
+		}
+		map->m_bdev = dif->bdev;
+		map->m_daxdev = dif->dax_dev;
+		up_read(&devs->rwsem);
+	} else if (devs->extra_devices) {
+		down_read(&devs->rwsem);
+		idr_for_each_entry(&devs->tree, dif, id) {
+			erofs_off_t startoff, length;
+
+			if (!dif->mapped_blkaddr)
+				continue;
+			startoff = blknr_to_addr(dif->mapped_blkaddr);
+			length = blknr_to_addr(dif->blocks);
+
+			if (map->m_pa >= startoff &&
+			    map->m_pa < startoff + length) {
+				map->m_pa -= startoff;
+				map->m_bdev = dif->bdev;
+				map->m_daxdev = dif->dax_dev;
+				break;
+			}
+		}
+		up_read(&devs->rwsem);
+	}
+	return 0;
+}
+
 static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 		unsigned int flags, struct iomap *iomap, struct iomap *srcmap)
 {
 	int ret;
 	struct erofs_map_blocks map;
+	struct erofs_map_dev mdev;

 	map.m_la = offset;
 	map.m_llen = length;
@@ -168,8 +207,16 @@ static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 	if (ret < 0)
 		return ret;

-	iomap->bdev = inode->i_sb->s_bdev;
-	iomap->dax_dev = EROFS_I_SB(inode)->dax_dev;
+	mdev = (struct erofs_map_dev) {
+		.m_deviceid = map.m_deviceid,
+		.m_pa = map.m_pa,
+	};
+	ret = erofs_map_dev(inode->i_sb, &mdev);
+	if (ret)
+		return ret;
+
+	iomap->bdev = mdev.m_bdev;
+	iomap->dax_dev = mdev.m_daxdev;
 	iomap->offset = map.m_la;
 	iomap->length = map.m_llen;
 	iomap->flags = 0;
@@ -188,15 +235,15 @@ static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,

 		iomap->type = IOMAP_INLINE;
 		ipage = erofs_get_meta_page(inode->i_sb,
-					    erofs_blknr(map.m_pa));
+					    erofs_blknr(mdev.m_pa));
 		if (IS_ERR(ipage))
 			return PTR_ERR(ipage);
 		iomap->inline_data = page_address(ipage) +
-					erofs_blkoff(map.m_pa);
+					erofs_blkoff(mdev.m_pa);
 		iomap->private = ipage;
 	} else {
 		iomap->type = IOMAP_MAPPED;
-		iomap->addr = map.m_pa;
+		iomap->addr = mdev.m_pa;
 	}
 	return 0;
 }
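
The device lookup added by `erofs_map_dev()` can be modelled in userspace to make its two paths easier to follow. The sketch below is not kernel code: it replaces the IDR with a plain array, drops the locking, assumes 4KiB blocks for the block-to-byte conversion, and uses made-up `model_*` names. An explicit device id selects entry `id - 1`. With no id, the physical address is matched against each device's mapped range and rebased to that device.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_BLKBITS	12	/* assumption: 4KiB blocks */

struct model_dev {
	uint32_t mapped_blkaddr;	/* 0: device has no mapped range */
	uint32_t blocks;		/* device size in blocks */
};

struct model_map {
	unsigned int deviceid;	/* in: 0 means "not specified" */
	uint64_t pa;		/* in/out: physical byte address */
	int dev_index;		/* out: -1 = primary device, else devs[] index */
};

static int model_map_dev(const struct model_dev *devs, size_t ndevs,
			 struct model_map *map)
{
	size_t i;

	map->dev_index = -1;	/* primary device by default */

	if (map->deviceid) {
		/* explicit id: ids are 1-based, array slots are 0-based */
		if (map->deviceid - 1 >= ndevs)
			return -ENODEV;
		map->dev_index = (int)(map->deviceid - 1);
		return 0;
	}

	/* no id: find the device whose mapped range contains pa */
	for (i = 0; i < ndevs; i++) {
		uint64_t startoff, length;

		if (!devs[i].mapped_blkaddr)
			continue;
		startoff = (uint64_t)devs[i].mapped_blkaddr << MODEL_BLKBITS;
		length = (uint64_t)devs[i].blocks << MODEL_BLKBITS;

		if (map->pa >= startoff && map->pa < startoff + length) {
			map->pa -= startoff;	/* rebase to the device */
			map->dev_index = (int)i;
			break;
		}
	}
	return 0;
}
```

Note that, as in the diff, an address that falls into no mapped range is left unchanged and stays on the primary device, and `pa` is not rebased when a device id is given explicitly.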
