+ .. SPDX-License-Identifier: GPL-2.0
+ ==============
+ FUSE
+ ==============
+
Definitions
- ~~~~~~~~~~~
+ ===========

Userspace filesystem:
-
A filesystem in which data and metadata are provided by an ordinary
userspace process. The filesystem can be accessed normally through
the kernel interface.

Filesystem daemon:
-
The process(es) providing the data and metadata of the filesystem.

Non-privileged mount (or user mount):
-
A userspace filesystem mounted by a non-privileged (non-root) user.
The filesystem daemon is running with the privileges of the mounting
user. NOTE: this is not the same as mounts allowed with the "user"
option in /etc/fstab, which is not discussed here.

Filesystem connection:
-
A connection between the filesystem daemon and the kernel. The
connection exists until either the daemon dies, or the filesystem is
umounted. Note that detaching (or lazy umounting) the filesystem
- does _not_ break the connection, in this case it will exist until
+ does *not* break the connection, in this case it will exist until
the last reference to the filesystem is released.

Mount owner:
-
The user who does the mounting.

User:
-
The user who is performing filesystem operations.

What is FUSE?
- ~~~~~~~~~~~~~
+ =============

FUSE is a userspace filesystem framework. It consists of a kernel
module (fuse.ko), a userspace library (libfuse.*) and a mount utility
@@ -46,79 +45,67 @@ non-privileged mounts. This opens up new possibilities for the use of
filesystems. A good example is sshfs: a secure network filesystem
using the sftp protocol.

- The userspace library and utilities are available from the FUSE
- homepage:
-
- http://fuse.sourceforge.net/
+ The userspace library and utilities are available from the
+ `FUSE homepage: <http://fuse.sourceforge.net/>`_

Filesystem type
- ~~~~~~~~~~~~~~~
+ ===============

The filesystem type given to mount(2) can be one of the following:

- 'fuse'
-
- This is the usual way to mount a FUSE filesystem. The first
- argument of the mount system call may contain an arbitrary string,
- which is not interpreted by the kernel.
+ fuse
+ This is the usual way to mount a FUSE filesystem. The first
+ argument of the mount system call may contain an arbitrary string,
+ which is not interpreted by the kernel.

- 'fuseblk'
-
- The filesystem is block device based. The first argument of the
- mount system call is interpreted as the name of the device.
+ fuseblk
+ The filesystem is block device based. The first argument of the
+ mount system call is interpreted as the name of the device.
Mount options
- ~~~~~~~~~~~~~
-
- 'fd=N'
+ =============

+ fd=N
The file descriptor to use for communication between the userspace
filesystem and the kernel. The file descriptor must have been
obtained by opening the FUSE device ('/dev/fuse').

- 'rootmode=M'
-
+ rootmode=M
The file mode of the filesystem's root in octal representation.

- 'user_id=N'
-
+ user_id=N
The numeric user id of the mount owner.

- 'group_id=N'
-
+ group_id=N
The numeric group id of the mount owner.

- 'default_permissions'
-
+ default_permissions
By default FUSE doesn't check file access permissions, the
filesystem is free to implement its access policy or leave it to
the underlying file access mechanism (e.g. in case of network
filesystems). This option enables permission checking, restricting
access based on file mode. It is usually useful together with the
'allow_other' mount option.

- 'allow_other'
-
+ allow_other
This option overrides the security measure restricting file access
to the user mounting the filesystem. This option is by default only
allowed to root, but this restriction can be removed with a
(userspace) configuration option.

- 'max_read=N'
-
+ max_read=N
With this option the maximum size of read operations can be set.
The default is infinite. Note that the size of read requests is
limited anyway to 32 pages (which is 128kbyte on i386).

- 'blksize=N'
-
+ blksize=N
Set the block size for the filesystem. The default is 512. This
option is only valid for 'fuseblk' type mounts.
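
As an illustration of the options listed above, the following sketch assembles the comma-separated option string that a mount helper might hand to mount(2). The function name `fuse_mount_options` and the example values are hypothetical and not part of the patch; the only thing taken from the text is that `rootmode` is written in octal and that `fd`, `user_id` and `group_id` are numeric.

```python
import stat


def fuse_mount_options(fd, rootmode, user_id, group_id, extra=()):
    """Build a comma-separated FUSE mount option string (illustrative helper).

    rootmode is formatted in octal, as the option description requires.
    """
    opts = [
        f"fd={fd}",
        f"rootmode={rootmode:o}",   # octal, no leading "0o"
        f"user_id={user_id}",
        f"group_id={group_id}",
    ]
    opts.extend(extra)              # e.g. "default_permissions", "allow_other"
    return ",".join(opts)


if __name__ == "__main__":
    # Hypothetical values: fd 3 from opening /dev/fuse, a directory root,
    # and a mount owner with uid/gid 1000.
    print(fuse_mount_options(3, stat.S_IFDIR, 1000, 1000,
                             extra=("default_permissions",)))
```

Keeping the string construction separate from the privileged mount call makes it easy to check the formatting without root access.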

Control filesystem
- ~~~~~~~~~~~~~~~~~~
+ ==================

- There's a control filesystem for FUSE, which can be mounted by:
+ There's a control filesystem for FUSE, which can be mounted by::

mount -t fusectl none /sys/fs/fuse/connections

@@ -130,53 +117,51 @@ named by a unique number.

For each connection the following files exist within this directory:

- 'waiting'
-
- The number of requests which are waiting to be transferred to
- userspace or being processed by the filesystem daemon. If there is
- no filesystem activity and 'waiting' is non-zero, then the
- filesystem is hung or deadlocked.
-
- 'abort'
+ waiting
+ The number of requests which are waiting to be transferred to
+ userspace or being processed by the filesystem daemon. If there is
+ no filesystem activity and 'waiting' is non-zero, then the
+ filesystem is hung or deadlocked.

- Writing anything into this file will abort the filesystem
- connection. This means that all waiting requests will be aborted an
- error returned for all aborted and new requests.
+ abort
+ Writing anything into this file will abort the filesystem
+ connection. This means that all waiting requests will be aborted and an
+ error returned for all aborted and new requests.

Only the owner of the mount may read or write these files.
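
A minimal sketch of how a monitoring script might use these two files, assuming the control filesystem is mounted at /sys/fs/fuse/connections as shown earlier. The helper names are hypothetical, and the `root` parameter exists only so the functions can be exercised against an ordinary directory.

```python
from pathlib import Path

# Mount point used in the mount command quoted in this document.
FUSECTL_ROOT = Path("/sys/fs/fuse/connections")


def list_connections(root=FUSECTL_ROOT):
    """Return the names of the per-connection directories."""
    return sorted(p.name for p in Path(root).iterdir() if p.is_dir())


def waiting_requests(conn_id, root=FUSECTL_ROOT):
    """Read the 'waiting' counter for one connection."""
    return int((Path(root) / str(conn_id) / "waiting").read_text().strip())


def abort_connection(conn_id, root=FUSECTL_ROOT):
    """Write to 'abort'; the text says the written content does not matter."""
    (Path(root) / str(conn_id) / "abort").write_text("1\n")
```

On a real system these calls only succeed for the mount owner, as stated above.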

Interrupting filesystem operations
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ ##################################

If a process issuing a FUSE filesystem request is interrupted, the
following will happen:

- 1) If the request is not yet sent to userspace AND the signal is
+ - If the request is not yet sent to userspace AND the signal is
fatal (SIGKILL or unhandled fatal signal), then the request is
dequeued and returns immediately.

- 2) If the request is not yet sent to userspace AND the signal is not
- fatal, then an 'interrupted' flag is set for the request. When
+ - If the request is not yet sent to userspace AND the signal is not
+ fatal, then an interrupted flag is set for the request. When
the request has been successfully transferred to userspace and
this flag is set, an INTERRUPT request is queued.

- 3) If the request is already sent to userspace, then an INTERRUPT
+ - If the request is already sent to userspace, then an INTERRUPT
request is queued.

INTERRUPT requests take precedence over other requests, so the
userspace filesystem will receive queued INTERRUPTs before any others.
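
The three cases above can be summarised as a small decision function. This is only a model of the rules as described in the text, not kernel code; the names `Action` and `on_signal` are made up for the illustration.

```python
from enum import Enum


class Action(Enum):
    DEQUEUE_AND_RETURN = "dequeue the request and return immediately"
    SET_INTERRUPTED_FLAG = "set the interrupted flag; queue INTERRUPT once sent"
    QUEUE_INTERRUPT = "queue an INTERRUPT request"


def on_signal(sent_to_userspace: bool, signal_is_fatal: bool) -> Action:
    """Map the state of a request to the action described in the list above."""
    if sent_to_userspace:
        # Third case: the daemon already has the request.
        return Action.QUEUE_INTERRUPT
    if signal_is_fatal:
        # First case: not yet sent and the signal is fatal.
        return Action.DEQUEUE_AND_RETURN
    # Second case: not yet sent and the signal is not fatal.
    return Action.SET_INTERRUPTED_FLAG
```

Note that whether the signal is fatal only matters while the request has not yet reached userspace.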
168
153
169
154
The userspace filesystem may ignore the INTERRUPT requests entirely,
170
- or may honor them by sending a reply to the _original_ request, with
155
+ or may honor them by sending a reply to the * original * request, with
171
156
the error set to EINTR.
172
157
173
158
It is also possible that there's a race between processing the
174
159
original request and its INTERRUPT request. There are two possibilities:
175
160
176
- 1) The INTERRUPT request is processed before the original request is
161
+ 1. The INTERRUPT request is processed before the original request is
177
162
processed
178
163
179
- 2) The INTERRUPT request is processed after the original request has
164
+ 2. The INTERRUPT request is processed after the original request has
180
165
been answered
181
166
182
167
If the filesystem cannot find the original request, it should wait for
@@ -186,7 +171,7 @@ should reply to the INTERRUPT request with an EAGAIN error. In case
reply will be ignored.
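
A rough sketch of how a filesystem daemon could track outstanding requests and answer an INTERRUPT, under the assumption that every request carries a unique id. The class and method names are hypothetical. The waiting policy mentioned in the text is cut off in this excerpt, so it is not modelled here: the sketch answers EAGAIN as soon as the original request cannot be found.

```python
import errno


class PendingRequests:
    """Outstanding requests in a hypothetical daemon, keyed by unique id."""

    def __init__(self):
        self._pending = {}

    def add(self, unique, description):
        self._pending[unique] = description

    def complete(self, unique):
        """Forget a request once it has been answered normally."""
        self._pending.pop(unique, None)

    def handle_interrupt(self, interrupt_unique, target_unique):
        """Return (id to reply to, errno value) for an INTERRUPT request.

        If the original request is still pending, honor the interrupt by
        answering the original request with EINTR. Otherwise answer the
        INTERRUPT request itself with EAGAIN.
        """
        if target_unique in self._pending:
            del self._pending[target_unique]
            return target_unique, errno.EINTR
        return interrupt_unique, errno.EAGAIN
```

A daemon that chooses to ignore INTERRUPT requests entirely, which the text allows, would not need this bookkeeping.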
187
172
188
173
Aborting a filesystem connection
189
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
174
+ ================================
190
175
191
176
It is possible to get into certain situations where the filesystem is
192
177
not responding. Reasons for this may be:
@@ -216,7 +201,7 @@ the filesystem. There are several ways to do this:
powerful method, always works.

How do non-privileged mounts work?
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ ==================================

Since the mount() system call is a privileged operation, a helper
program (fusermount) is needed, which is installed setuid root.
@@ -235,15 +220,13 @@ system. Obvious requirements arising from this are:
other users' or the super user's processes

How are requirements fulfilled?
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ ===============================

A) The mount owner could gain elevated privileges by either:

- 1) creating a filesystem containing a device file, then opening
- this device
+ 1. creating a filesystem containing a device file, then opening this device

- 2) creating a filesystem containing a suid or sgid application,
- then executing this application
+ 2. creating a filesystem containing a suid or sgid application, then executing this application

The solution is not to allow opening device files and ignore
setuid and setgid bits when executing programs. To ensure this
@@ -275,16 +258,16 @@ How are requirements fulfilled?
of other users' processes.

i) It can slow down or indefinitely delay the execution of a
- filesystem operation creating a DoS against the user or the
- whole system. For example a suid application locking a
- system file, and then accessing a file on the mount owner's
- filesystem could be stopped, and thus causing the system
- file to be locked forever.
+ filesystem operation creating a DoS against the user or the
+ whole system. For example a suid application locking a
+ system file, and then accessing a file on the mount owner's
+ filesystem could be stopped, and thus causing the system
+ file to be locked forever.

ii) It can present files or directories of unlimited length, or
- directory structures of unlimited depth, possibly causing a
- system process to eat up diskspace, memory or other
- resources, again causing DoS.
+ directory structures of unlimited depth, possibly causing a
+ system process to eat up diskspace, memory or other
+ resources, again causing *DoS*.

The solution to this as well as B) is not to allow processes
to access the filesystem, which could otherwise not be
@@ -294,28 +277,27 @@ How are requirements fulfilled?
ptrace can be used to check if a process is allowed to access
the filesystem or not.

- Note that the ptrace check is not strictly necessary to
+ Note that the *ptrace* check is not strictly necessary to
prevent B/2/i, it is enough to check if mount owner has enough
privilege to send signal to the process accessing the
- filesystem, since SIGSTOP can be used to get a similar effect.
+ filesystem, since *SIGSTOP* can be used to get a similar effect.

I think these limitations are unacceptable?
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ ===========================================

If a sysadmin trusts the users enough, or can ensure through other
measures, that system processes will never enter non-privileged
- mounts, it can relax the last limitation with a "user_allow_other"
+ mounts, it can relax the last limitation with a 'user_allow_other'
config option. If this config option is set, the mounting user can
- add the "allow_other" mount option which disables the check for other
+ add the 'allow_other' mount option which disables the check for other
users' processes.

Kernel - userspace interface
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ ============================

The following diagram shows how a filesystem operation (in this
- example unlink) is performed in FUSE.
+ example unlink) is performed in FUSE. ::

- NOTE: everything in this description is greatly simplified

| "rm /mnt/fuse/file" | FUSE filesystem daemon
| |
@@ -357,12 +339,13 @@ NOTE: everything in this description is greatly simplified
| <fuse_unlink() |
| <sys_unlink() |

+ .. note:: Everything in the description above is greatly simplified
+
There are a couple of ways in which to deadlock a FUSE filesystem.
Since we are talking about unprivileged userspace programs,
something must be done about these.

- Scenario 1 - Simple deadlock
- -----------------------------
+ **Scenario 1 - Simple deadlock**::

| "rm /mnt/fuse/file" | FUSE filesystem daemon
| |
@@ -379,12 +362,12 @@ Scenario 1 - Simple deadlock

The solution for this is to allow the filesystem to be aborted.

- Scenario 2 - Tricky deadlock
- ----------------------------
+ **Scenario 2 - Tricky deadlock**
+

This one needs a carefully crafted filesystem. It's a variation on
the above, only the call back to the filesystem is not explicit,
- but is caused by a pagefault.
+ but is caused by a pagefault. ::

| Kamikaze filesystem thread 1 | Kamikaze filesystem thread 2
| |
@@ -410,7 +393,7 @@ but is caused by a pagefault.
| | [lock page]
| | * DEADLOCK *

- Solution is basically the same as above.
+ The solution is basically the same as above.

An additional problem is that while the write buffer is being copied
to the request, the request must not be interrupted/aborted. This is