Commit c6b80eb

Merge tag 'ovl-update-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs update from Miklos Szeredi:

 - Fix failure to copy-up files from certain NFSv4 mounts
 - Sort out inconsistencies between st_ino and i_ino (used in /proc/locks)
 - Allow consistent (POSIX-y) inode numbering in more cases
 - Allow virtiofs to be used as upper layer
 - Miscellaneous cleanups and fixes

* tag 'ovl-update-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: document xino expected behavior
  ovl: enable xino automatically in more cases
  ovl: avoid possible inode number collisions with xino=on
  ovl: use a private non-persistent ino pool
  ovl: fix WARN_ON nlink drop to zero
  ovl: fix a typo in comment
  ovl: replace zero-length array with flexible-array member
  ovl: ovl_obtain_alias(): don't call d_instantiate_anon() for old
  ovl: strict upper fs requirements for remote upper fs
  ovl: check if upper fs supports RENAME_WHITEOUT
  ovl: allow remote upper
  ovl: decide if revalidate needed on a per-dentry basis
  ovl: separate detection of remote upper layer from stacked overlay
  ovl: restructure dentry revalidation
  ovl: ignore failure to copy up unknown xattrs
  ovl: document permission model
  ovl: simplify i_ino initialization
  ovl: factor out helper ovl_get_root()
  ovl: fix out of date comment and unreachable code
  ovl: fix value of i_ino for lower hardlink corner case
2 parents 9744b92 + 2eda9ea commit c6b80eb

11 files changed, 460 additions and 163 deletions

Documentation/filesystems/overlayfs.rst

Lines changed: 80 additions & 2 deletions
@@ -40,13 +40,46 @@ On 64bit systems, even if all overlay layers are not on the same
 underlying filesystem, the same compliant behavior could be achieved
 with the "xino" feature. The "xino" feature composes a unique object
 identifier from the real object st_ino and an underlying fsid index.
+
 If all underlying filesystems support NFS file handles and export file
 handles with 32bit inode number encoding (e.g. ext4), overlay filesystem
 will use the high inode number bits for fsid. Even when the underlying
 filesystem uses 64bit inode numbers, users can still enable the "xino"
 feature with the "-o xino=on" overlay mount option. That is useful for the
 case of underlying filesystems like xfs and tmpfs, which use 64bit inode
-numbers, but are very unlikely to use the high inode number bit.
+numbers, but are very unlikely to use the high inode number bits. In case
+the underlying inode number does overflow into the high xino bits, overlay
+filesystem will fall back to the non xino behavior for that inode.
+
+The following table summarizes what can be expected in different overlay
+configurations.
+
+Inode properties
+````````````````
+
++--------------+------------+------------+-----------------+----------------+
+|Configuration | Persistent | Uniform    | st_ino == d_ino | d_ino == i_ino |
+|              | st_ino     | st_dev     |                 | [*]            |
++==============+=====+======+=====+======+========+========+========+=======+
+|              | dir | !dir | dir | !dir |  dir   +  !dir  |  dir   | !dir  |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+| All layers   |  Y  |  Y   |  Y  |  Y   |  Y     |   Y    |  Y     |  Y    |
+| on same fs   |     |      |     |      |        |        |        |       |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+| Layers not   |  N  |  Y   |  Y  |  N   |  N     |   Y    |  N     |  Y    |
+| on same fs,  |     |      |     |      |        |        |        |       |
+| xino=off     |     |      |     |      |        |        |        |       |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+| xino=on/auto |  Y  |  Y   |  Y  |  Y   |  Y     |   Y    |  Y     |  Y    |
+|              |     |      |     |      |        |        |        |       |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+| xino=on/auto,|  N  |  Y   |  Y  |  N   |  N     |   Y    |  N     |  Y    |
+| ino overflow |     |      |     |      |        |        |        |       |
++--------------+-----+------+-----+------+--------+--------+--------+-------+
+
+[*] nfsd v3 readdirplus verifies d_ino == i_ino. i_ino is exposed via several
+/proc files, such as /proc/locks and /proc/self/fdinfo/<fd> of an inotify
+file descriptor.
 
 
 Upper and Lower
@@ -248,6 +281,50 @@ overlay filesystem (though an operation on the name of the file such as
 rename or unlink will of course be noticed and handled).
 
 
+Permission model
+----------------
+
+Permission checking in the overlay filesystem follows these principles:
+
+ 1) permission check SHOULD return the same result before and after copy up
+
+ 2) task creating the overlay mount MUST NOT gain additional privileges
+
+ 3) non-mounting task MAY gain additional privileges through the overlay,
+    compared to direct access on underlying lower or upper filesystems
+
+This is achieved by performing two permission checks on each access
+
+ a) check if current task is allowed access based on local DAC (owner,
+    group, mode and posix acl), as well as MAC checks
+
+ b) check if mounting task would be allowed real operation on lower or
+    upper layer based on underlying filesystem permissions, again including
+    MAC checks
+
+Check (a) ensures consistency (1) since owner, group, mode and posix acls
+are copied up. On the other hand it can result in server enforced
+permissions (used by NFS, for example) being ignored (3).
+
+Check (b) ensures that no task gains permissions to underlying layers that
+the mounting task does not have (2). This also means that it is possible
+to create setups where the consistency rule (1) does not hold; normally,
+however, the mounting task will have sufficient privileges to perform all
+operations.
+
+Another way to demonstrate this model is drawing parallels between
+
+  mount -t overlay overlay -olowerdir=/lower,upperdir=/upper,... /merged
+
+and
+
+  cp -a /lower /upper
+  mount --bind /upper /merged
+
+The resulting access permissions should be the same. The difference is in
+the time of copy (on-demand vs. up-front).
+
+
 Multiple lower layers
 ---------------------
 
@@ -383,7 +460,8 @@ guarantee that the values of st_ino and st_dev returned by stat(2) and the
 value of d_ino returned by readdir(3) will act like on a normal filesystem.
 E.g. the value of st_dev may be different for two objects in the same
 overlay filesystem and the value of st_ino for directory objects may not be
-persistent and could change even while the overlay filesystem is mounted.
+persistent and could change even while the overlay filesystem is mounted, as
+summarized in the `Inode properties`_ table above.
 
 
 Changes to underlying filesystems
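
Note on the xino text in this file: the documentation describes xino as composing an object identifier from the real st_ino and an fsid index placed in the high inode number bits, with a per-inode fallback when the real number already occupies those bits. The following is a minimal user-space sketch of that idea only. The name `xino_compose`, its parameters, and the exact bit layout are assumptions made for illustration and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel implementation): place an fsid index
 * in the top `xino_bits` bits of a 64-bit inode number.
 *
 * Returns true and stores the composed number when the real inode number
 * leaves the high bits free. Otherwise returns false and stores the real
 * number unchanged, modelling the "non xino" fallback for that inode.
 */
static bool xino_compose(uint64_t real_ino, unsigned int fsid,
			 unsigned int xino_bits, uint64_t *out)
{
	/* The sketch only handles a proper split of the 64 bits. */
	assert(xino_bits > 0 && xino_bits < 64);
	/* The fsid index must itself fit in the reserved bits. */
	assert(((uint64_t)fsid >> xino_bits) == 0);

	unsigned int shift = 64 - xino_bits;

	if (real_ino >> shift) {
		/* Real inode number overflows into the reserved bits. */
		*out = real_ino;
		return false;
	}

	*out = ((uint64_t)fsid << shift) | real_ino;
	return true;
}
```

With 2 reserved bits, for instance, real inode 42 on fsid 1 is mapped to (1 << 62) | 42, while a real inode number that has bit 63 set is passed through unchanged and reported as an overflow.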

fs/overlayfs/copy_up.c

Lines changed: 14 additions & 2 deletions
@@ -36,6 +36,13 @@ static int ovl_ccup_get(char *buf, const struct kernel_param *param)
 module_param_call(check_copy_up, ovl_ccup_set, ovl_ccup_get, NULL, 0644);
 MODULE_PARM_DESC(check_copy_up, "Obsolete; does nothing");
 
+static bool ovl_must_copy_xattr(const char *name)
+{
+	return !strcmp(name, XATTR_POSIX_ACL_ACCESS) ||
+	       !strcmp(name, XATTR_POSIX_ACL_DEFAULT) ||
+	       !strncmp(name, XATTR_SECURITY_PREFIX, XATTR_SECURITY_PREFIX_LEN);
+}
+
 int ovl_copy_xattr(struct dentry *old, struct dentry *new)
 {
 	ssize_t list_size, size, value_size = 0;
@@ -107,8 +114,13 @@ int ovl_copy_xattr(struct dentry *old, struct dentry *new)
 			continue; /* Discard */
 		}
 		error = vfs_setxattr(new, name, value, size, 0);
-		if (error)
-			break;
+		if (error) {
+			if (error != -EOPNOTSUPP || ovl_must_copy_xattr(name))
+				break;
+
+			/* Ignore failure to copy unknown xattrs */
+			error = 0;
+		}
 	}
 	kfree(value);
 out:
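
Note on the copy-up change above: a failed xattr copy now aborts only when the error is something other than -EOPNOTSUPP, or when the attribute is a POSIX ACL or carries the security prefix. The user-space sketch below restates that decision so it can be exercised in isolation. The function `xattr_copy_verdict` and the `SKETCH_*` macros are hypothetical names; the string values are assumed to match the kernel's XATTR_POSIX_ACL_ACCESS, XATTR_POSIX_ACL_DEFAULT and XATTR_SECURITY_PREFIX definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Assumed values of the kernel constants used by the patch. */
#define SKETCH_ACL_ACCESS   "system.posix_acl_access"
#define SKETCH_ACL_DEFAULT  "system.posix_acl_default"
#define SKETCH_SECURITY_PFX "security."

/* Attributes whose copy failure must never be ignored. */
static bool must_copy_xattr(const char *name)
{
	assert(name != NULL);
	return !strcmp(name, SKETCH_ACL_ACCESS) ||
	       !strcmp(name, SKETCH_ACL_DEFAULT) ||
	       !strncmp(name, SKETCH_SECURITY_PFX,
			sizeof(SKETCH_SECURITY_PFX) - 1);
}

/*
 * Hypothetical helper: given the result of setting one xattr on the new
 * file, return the error that should abort the copy, or 0 to carry on.
 */
static int xattr_copy_verdict(int error, const char *name)
{
	if (error && (error != -EOPNOTSUPP || must_copy_xattr(name)))
		return error;

	/* Success, or an unsupported attribute that is safe to skip. */
	return 0;
}
```

Under these assumptions, -EOPNOTSUPP on "user.comment" is swallowed, while the same error on "security.selinux", or any other error such as -ENOSPC, still stops the copy.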

fs/overlayfs/dir.c

Lines changed: 28 additions & 3 deletions
@@ -42,7 +42,7 @@ int ovl_cleanup(struct inode *wdir, struct dentry *wdentry)
 	return err;
 }
 
-static struct dentry *ovl_lookup_temp(struct dentry *workdir)
+struct dentry *ovl_lookup_temp(struct dentry *workdir)
 {
 	struct dentry *temp;
 	char name[20];
@@ -243,6 +243,9 @@ static int ovl_instantiate(struct dentry *dentry, struct inode *inode,
 
 	ovl_dir_modified(dentry->d_parent, false);
 	ovl_dentry_set_upper_alias(dentry);
+	ovl_dentry_update_reval(dentry, newdentry,
+			DCACHE_OP_REVALIDATE | DCACHE_OP_WEAK_REVALIDATE);
+
 	if (!hardlink) {
 		/*
 		 * ovl_obtain_alias() can be called after ovl_create_real()
@@ -819,6 +822,28 @@ static bool ovl_pure_upper(struct dentry *dentry)
 	       !ovl_test_flag(OVL_WHITEOUTS, d_inode(dentry));
 }
 
+static void ovl_drop_nlink(struct dentry *dentry)
+{
+	struct inode *inode = d_inode(dentry);
+	struct dentry *alias;
+
+	/* Try to find another, hashed alias */
+	spin_lock(&inode->i_lock);
+	hlist_for_each_entry(alias, &inode->i_dentry, d_u.d_alias) {
+		if (alias != dentry && !d_unhashed(alias))
+			break;
+	}
+	spin_unlock(&inode->i_lock);
+
+	/*
+	 * Changes to underlying layers may cause i_nlink to lose sync with
+	 * reality. In this case prevent the link count from going to zero
+	 * prematurely.
+	 */
+	if (inode->i_nlink > !!alias)
+		drop_nlink(inode);
+}
+
 static int ovl_do_remove(struct dentry *dentry, bool is_dir)
 {
 	int err;
@@ -856,7 +881,7 @@ static int ovl_do_remove(struct dentry *dentry, bool is_dir)
 		if (is_dir)
 			clear_nlink(dentry->d_inode);
 		else
-			drop_nlink(dentry->d_inode);
+			ovl_drop_nlink(dentry);
 	}
 	ovl_nlink_end(dentry);
 
@@ -1201,7 +1226,7 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 		if (new_is_dir)
 			clear_nlink(d_inode(new));
 		else
-			drop_nlink(d_inode(new));
+			ovl_drop_nlink(new);
 	}
 
 	ovl_dir_modified(old->d_parent, ovl_type_origin(old) ||
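
Note on `ovl_drop_nlink()` above: the condition `inode->i_nlink > !!alias` decrements the link count only while it stays above a floor of 1 when another hashed alias was found, or 0 when none was. The arithmetic can be sketched in user space as below; `guarded_drop_nlink` and its boolean parameter are hypothetical stand-ins for the inode and the alias search, not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the guard: return the link count after an unlink,
 * refusing to drop it to zero while another hashed alias still exists and
 * refusing to decrement a count that is already zero.
 */
static unsigned int guarded_drop_nlink(unsigned int nlink,
				       bool other_hashed_alias)
{
	/* Same value as the kernel expression `!!alias`. */
	unsigned int keep = other_hashed_alias ? 1u : 0u;

	if (nlink > keep)
		nlink--;
	return nlink;
}

/* Worked cases; call from a test or from main() to check the sketch. */
static void guarded_drop_nlink_examples(void)
{
	assert(guarded_drop_nlink(2, true) == 1);  /* normal hardlink removal */
	assert(guarded_drop_nlink(1, true) == 1);  /* out of sync: keep at 1 */
	assert(guarded_drop_nlink(1, false) == 0); /* last link goes away */
	assert(guarded_drop_nlink(0, false) == 0); /* never decrement below 0 */
}
```

The second and fourth cases are the ones a plain `drop_nlink()` would handle differently, since it decrements unconditionally.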

fs/overlayfs/export.c

Lines changed: 23 additions & 17 deletions
@@ -308,29 +308,35 @@ static struct dentry *ovl_obtain_alias(struct super_block *sb,
 		ovl_set_flag(OVL_UPPERDATA, inode);
 
 	dentry = d_find_any_alias(inode);
-	if (!dentry) {
-		dentry = d_alloc_anon(inode->i_sb);
-		if (!dentry)
-			goto nomem;
-		oe = ovl_alloc_entry(lower ? 1 : 0);
-		if (!oe)
-			goto nomem;
-
-		if (lower) {
-			oe->lowerstack->dentry = dget(lower);
-			oe->lowerstack->layer = lowerpath->layer;
-		}
-		dentry->d_fsdata = oe;
-		if (upper_alias)
-			ovl_dentry_set_upper_alias(dentry);
+	if (dentry)
+		goto out_iput;
+
+	dentry = d_alloc_anon(inode->i_sb);
+	if (unlikely(!dentry))
+		goto nomem;
+	oe = ovl_alloc_entry(lower ? 1 : 0);
+	if (!oe)
+		goto nomem;
+
+	if (lower) {
+		oe->lowerstack->dentry = dget(lower);
+		oe->lowerstack->layer = lowerpath->layer;
 	}
+	dentry->d_fsdata = oe;
+	if (upper_alias)
+		ovl_dentry_set_upper_alias(dentry);
+
+	ovl_dentry_update_reval(dentry, upper,
+			DCACHE_OP_REVALIDATE | DCACHE_OP_WEAK_REVALIDATE);
 
 	return d_instantiate_anon(dentry, inode);
 
 nomem:
-	iput(inode);
 	dput(dentry);
-	return ERR_PTR(-ENOMEM);
+	dentry = ERR_PTR(-ENOMEM);
+out_iput:
+	iput(inode);
+	return dentry;
 }
 
 /* Get the upper or lower dentry in stach whose on layer @idx */
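
Note on the `ovl_obtain_alias()` restructuring above: the new control flow returns early when an alias already exists, and lets the `nomem` error path fall through into a shared `out_iput` label so the inode reference is released in one place. A rough user-space analogue of that label layout is sketched below. All types and names here (`struct res`, `struct handle`, `obtain_handle`, `simulate_oom`) are hypothetical, and a NULL return stands in for `ERR_PTR(-ENOMEM)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct res { int refs; };                 /* stands in for the inode */
struct handle { struct res *owner; void *priv; }; /* stands in for the dentry */

/*
 * The caller passes in one reference on `r`. On success with a freshly
 * built handle, the handle keeps that reference. When a cached handle is
 * reused, or when allocation fails, the reference is dropped at out_put.
 */
static struct handle *obtain_handle(struct res *r, struct handle *cached,
				    bool simulate_oom)
{
	struct handle *h = cached;

	assert(r != NULL && r->refs > 0);

	if (h)
		goto out_put;             /* reuse: only drop our reference */

	h = malloc(sizeof(*h));
	if (!h)
		goto nomem;
	h->priv = simulate_oom ? NULL : malloc(16);
	if (!h->priv)
		goto nomem;

	h->owner = r;                     /* new handle consumes the reference */
	return h;

nomem:
	free(h);                          /* free(NULL) is a no-op */
	h = NULL;                         /* error result */
out_put:
	r->refs--;
	return h;
}
```

The design point is that every exit except the successful construction passes through `out_put`, so adding another early exit later does not require duplicating the release call.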
