Conversation

@arshad-run

No description provided.

Jaegeuk Kim and others added 30 commits February 4, 2017 23:52
If i_size is already valid during roll_forward recovery, we should not update
it according to the block alignment.

Signed-off-by: Jaegeuk Kim <[email protected]>
If i_size is not aligned to the f2fs's block size, we should not skip inode
update during fsync.

Signed-off-by: Jaegeuk Kim <[email protected]>
Drop duplicate header timer.h from segment.c.

Signed-off-by: Geliang Tang <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
While calculating the maximum number of inodes we can still create in the
remaining space, we should account for the space already occupied by data and
node blocks, since data and node blocks are allocated together in the main
area. So fix the wrong calculation in ->statfs.
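A hedged sketch of the idea, with placeholder names rather than the actual
f2fs_statfs() code: the free inode count reported to statfs should also be
bounded by the free block count, because new inodes consume node and data
blocks from the same main area.

  /* illustrative only */
  u64 avail_inodes = avail_node_count - valid_inode_count;  /* node-table headroom */
  buf->f_ffree = min(avail_inodes, free_block_count);       /* also capped by free space */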

Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
The struct file_operations instance serving the f2fs/status debugfs file
lacks an initialization of its ->owner.

This means that the f2fs module can still be removed while that file is open.
Any further operation on the opened file, release included, will then access
unmapped memory.

Indeed, Mike Marshall reported the following:

  BUG: unable to handle kernel paging request at ffffffffa0307430
  IP: [<ffffffff8132a224>] full_proxy_release+0x24/0x90
  <...>
  Call Trace:
   [] __fput+0xdf/0x1d0
   [] ____fput+0xe/0x10
   [] task_work_run+0x8e/0xc0
   [] do_exit+0x2ae/0xae0
   [] ? __audit_syscall_entry+0xae/0x100
   [] ? syscall_trace_enter+0x1ca/0x310
   [] do_group_exit+0x44/0xc0
   [] SyS_exit_group+0x14/0x20
   [] do_syscall_64+0x61/0x150
   [] entry_SYSCALL64_slow_path+0x25/0x25
  <...>
  ---[ end trace f22ae883fa3ea6b8 ]---
  Fixing recursive fault but reboot is needed!

Fix this by initializing the f2fs/status file_operations' ->owner with
THIS_MODULE.

This will allow debugfs to grab a reference to the f2fs module upon any
open on that file, thus preventing it from getting removed.
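The fix itself is a one-field initialization; a sketch of the resulting
pattern (the handler names are illustrative rather than copied from
fs/f2fs/debug.c):

  static const struct file_operations stat_fops = {
          .owner   = THIS_MODULE, /* lets debugfs pin f2fs.ko while the file is open */
          .open    = stat_open,
          .read    = seq_read,
          .llseek  = seq_lseek,
          .release = single_release,
  };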

Fixes: 902829a ("f2fs: move proc files to debugfs")
Reported-by: Mike Marshall <[email protected]>
Reported-by: Martin Brandenburg <[email protected]>
Cc: [email protected]
Signed-off-by: Nicolai Stange <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
The addition of multiple-device support broke CONFIG_BLK_DEV_ZONED
on 32-bit machines because of a 64-bit division:

fs/f2fs/f2fs.o: In function `__issue_discard_async':
extent_cache.c:(.text.__issue_discard_async+0xd4): undefined reference to `__aeabi_uldivmod'

Fortunately, bdev_zone_size() is guaranteed to return a power-of-two
number, so we can replace the % operator with a cheaper bit mask.
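The underlying identity: for a power-of-two divisor d, x % d == x & (d - 1).
A self-contained user-space illustration (not the kernel hunk itself):

  #include <assert.h>
  #include <stdint.h>

  /* x % d without a 64-bit division; valid only when d is a power of two */
  static uint64_t mod_pow2(uint64_t x, uint64_t d)
  {
          assert(d != 0 && (d & (d - 1)) == 0);   /* power-of-two check */
          return x & (d - 1);
  }

The in-kernel change applies the same identity to the sector offset within a
zone, which avoids pulling in the __aeabi_uldivmod helper on 32-bit ARM.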

Fixes: 792b84b74b54 ("f2fs: support multiple devices")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
If a file needs to keep its i_size by fallocate, we need to turn off auto
recovery during roll-forward recovery.

This resolves the following scenario.

1. xfs_io -f /mnt/f2fs/file -c "pwrite 0 4096" -c "fsync"
2. xfs_io -f /mnt/f2fs/file -c "falloc -k 4096 4096" -c "fsync"
3. md5sum /mnt/f2fs/file;
4. godown /mnt/f2fs/
5. umount /mnt/f2fs/
6. mount -t f2fs /dev/sdx /mnt/f2fs
7. md5sum /mnt/f2fs/file

Reported-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
We should use AOP_WRITEPAGE_ACTIVATE when we bypass writing pages.
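A hedged sketch of the convention, modeled on how shmem_writepage() bypasses a
page under reclaim rather than on the exact f2fs hunk:

  /* inside a ->writepage() that decides not to write this page now */
  redirty_page_for_writepage(wbc, page);
  if (wbc->for_reclaim)
          return AOP_WRITEPAGE_ACTIVATE;  /* page stays locked; reclaim re-activates it */
  unlock_page(page);
  return 0;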

Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Miao Xie <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
This reverts commit 1beba1b3a953107c3ff5448ab4e4297db4619c76.

The percpu_counter doesn't provide atomicity on a single core and consumes
more DRAM. That incurs fs_mark test failures due to ENOMEM.

Cc: [email protected] # 4.7+
Signed-off-by: Jaegeuk Kim <[email protected]>
The sync_fs in f2fs_balance_fs_bg must avoid interrupting current user requests.

Signed-off-by: Jaegeuk Kim <[email protected]>
Previous versions of mkfs.f2fs inappropriately allowed small partitions, so
f2fs should detect those as well.

Refer to the corresponding commit in f2fs-tools:

mkfs.f2fs: detect small partition by overprovision ratio and # of segments

Reported-by: Eric Biggers <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
This fixes missing freeing of meta pages in the error case.

Signed-off-by: Jaegeuk Kim <[email protected]>
f2fs_sync_file()             remount_ro
 - f2fs_readonly
                               - destroy_flush_cmd_control
 - f2fs_issue_flush
   - no fcc pointer!

So, this patch doesn't free fcc in this case, but just stops its kernel
thread, which sends flush commands.
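A hedged sketch of the remount-read-only path after this change (the
f2fs_issue_flush task pointer follows flush_cmd_control's layout, but treat
the snippet as illustrative):

  /* stop the flush kthread, but keep fcc allocated so that a racing
   * f2fs_issue_flush() still dereferences valid memory */
  if (fcc && fcc->f2fs_issue_flush) {
          kthread_stop(fcc->f2fs_issue_flush);
          fcc->f2fs_issue_flush = NULL;
  }
  /* note: no kfree(fcc) here */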

Signed-off-by: Jaegeuk Kim <[email protected]>
Change thaw_super() to check frozen != SB_FREEZE_COMPLETE rather than
frozen == SB_UNFROZEN, otherwise it can race with freeze_super() which
drops sb->s_umount after SB_FREEZE_WRITE to preserve the lock ordering.

In this case thaw_super() will wrongly call s_op->unfreeze_fs() before
it was actually frozen, and call sb_freeze_unlock() which leads to the
unbalanced percpu_up_write(). Unfortunately lockdep can't detect this,
so this triggers misc BUG_ON()'s in kernel/rcu/sync.c.
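A hedged, simplified sketch of the corrected check in thaw_super() (the rest
of the function is elided):

  int thaw_super(struct super_block *sb)
  {
          down_write(&sb->s_umount);
          if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) {  /* was: == SB_UNFROZEN */
                  up_write(&sb->s_umount);
                  return -EINVAL;
          }
          /* ... call ->unfreeze_fs(), sb_freeze_unlock(), release s_umount ... */
          return 0;
  }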

Reported-and-tested-by: Nikolay Borisov <[email protected]>
Signed-off-by: Oleg Nesterov <[email protected]>
Cc: [email protected]
Signed-off-by: Al Viro <[email protected]>
This patch fixes a missing size change in f2fs_setattr.

Signed-off-by: Yunlei He <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
This takes code from fs/mpage.c and optimizes it for ext4. The primary reason
is to allow us to add encryption to ext4's read path more easily and
efficiently.

Change-Id: I1fd07e78fbbff50fd4028bbffbee73dbaec546a1
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Change-Id: I3cc7669ce5c2902bacf9ec365b1ba7049be781f0
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Change-Id: I784c5f57f031981e5d28796921f5e587d4f72422
Signed-off-by: Michael Halcrow <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Required for future encryption xattr changes.

Change-Id: Ieaff30ae755d76f562c6c4b110bc0c1c59ea4dfd
Signed-off-by: Michael Halcrow <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Change-Id: I6325cdbfb9666cca194b462878c157bd0449302e
Signed-off-by: Michael Halcrow <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Change-Id: Iaabf029d28db2bd27e492e4e1bf7adc034b066e8
Signed-off-by: Michael Halcrow <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: Ildar Muslukhov <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
On encrypt, we will re-assign the buffer_heads to point to a bounce
page rather than the control_page (which is the original page to write
that contains the plaintext). The block I/O occurs against the bounce
page.  On write completion, we re-assign the buffer_heads to the
original plaintext page.

On decrypt, we will attach a read completion callback to the bio
struct. This read completion will decrypt the read contents in-place
prior to setting the page up-to-date.

The current encryption mode, AES-256-XTS, lacks cryptographic
integrity. AES-256-GCM is in-plan, but we will need to devise a
mechanism for handling the integrity data.

Change-Id: Icf3814a88aed38f24bf615663f9921f5c390fb32
Signed-off-by: Michael Halcrow <[email protected]>
Signed-off-by: Ildar Muslukhov <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Change-Id: I4914284877331994b0d1f701bcbbcf820116e8ee
Signed-off-by: Michael Halcrow <[email protected]>
Signed-off-by: Ildar Muslukhov <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Enforce the following inheritance policy:

1) An unencrypted directory may contain encrypted or unencrypted files
or directories.

2) All files or directories in a directory must be protected using the
same key as their containing directory.

As a result, assuming the following setup:

mke2fs -t ext4 -Fq -O encrypt /dev/vdc
mount -t ext4 /dev/vdc /vdc
mkdir /vdc/a /vdc/b /vdc/c
echo foo | e4crypt add_key /vdc/a
echo bar | e4crypt add_key /vdc/b
for i in a b c ; do cp /etc/motd /vdc/$i/motd-$i ; done

Then we will see the following results:

cd /vdc
mv a b			# will fail; /vdc/a and /vdc/b have different keys
mv b/motd-b a		# will fail, see above
ln a/motd-a b		# will fail, see above
mv c a	    		# will fail; all inodes in an encrypted directory
   	  		#	must be encrypted
ln c/motd-c b		# will fail, see above
mv a/motd-a c		# will succeed
mv c/motd-a a		# will succeed

Change-Id: I0eb702a4e5c426dfd38863ada7bdec3741e1ee8b
Signed-off-by: Michael Halcrow <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Change-Id: Ic06753bdc015fc12b4f5620dfc15955765b1f117
Signed-off-by: Michael Halcrow <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Pulls block_write_begin() into fs/ext4/inode.c because it might need
to do a low-level read of the existing data, in which case we need to
decrypt it.

Change-Id: Ib2067d50cb80e9017ccf7016b2e72683ebd4c74a
Signed-off-by: Michael Halcrow <[email protected]>
Signed-off-by: Ildar Muslukhov <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Change-Id: I69043c9b36be0f8db1e80dfba54382d5328d9d4b
Signed-off-by: Michael Halcrow <[email protected]>
Signed-off-by: Ildar Muslukhov <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
Change-Id: I52b2ad72b4599a720f6f7db27acb6a39fa2265c9
Signed-off-by: Uday Savagaonkar <[email protected]>
Signed-off-by: Ildar Muslukhov <[email protected]>
Signed-off-by: Michael Halcrow <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
…ames

For encrypted directories, we need to pass in a separate parameter for
the decrypted filename, since the directory entry contains the
encrypted filename.

Change-Id: Ie28cbb198c41daa743a3a18ab25ff2e4d016c275
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>
arshad-run and others added 24 commits February 26, 2017 22:35
This reverts commit b689bf6.

Change-Id: Ie6dce52a2dbb428cc899e70011d36e384bb01f95
This comes from the wrapfs patch
3dfec0ffe5e2 Wrapfs: implement vm_ops->page_mkwrite

Some file systems (e.g., ext4) require it.  Reported by Ted Ts'o.
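A hedged, minimal sketch of such a handler (the name is illustrative; the
two-argument ->page_mkwrite signature assumes a pre-4.11 kernel, which matches
this tree's era):

  /* minimal stub; a fuller implementation might forward to the lower
   * file's own ->page_mkwrite */
  static int stackfs_page_mkwrite(struct vm_area_struct *vma,
                                  struct vm_fault *vmf)
  {
          return 0;
  }

It would be wired into the filesystem's vm_operations_struct alongside the
existing .fault handler.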

Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Daniel Rosenberg <[email protected]>
Bug: 34133558
Change-Id: I1a389b2422c654a6d3046bb8ec3e20511aebfa8e
This comes from the wrapfs patch
2e346c83b26e Wrapfs: support direct-IO (DIO) operations

Signed-off-by: Li Mengyang <[email protected]>
Signed-off-by: Erez Zadok <[email protected]>
Signed-off-by: Daniel Rosenberg <[email protected]>
Bug: 34133558
Change-Id: I3fd779c510ab70d56b1d918f99c20421b524cdc4
cpuidle was disabled while entering suspend as part of commit
8651f97 in order to work around some
ACPI bugs. However, there's no reason to do this on modern
platforms. Leaving cpuidle enabled can result in improved power
consumption if dpm_resume_noirq runs for a significant time.

Change-Id: Ie182785b176f448698c0264eba554d1e315e8a06
The generic versions of memcpy and memmove perform very poorly; this patch
improves them.

Change-Id: I77b25e63ba19fee4e7b68178c5db1d9c94fec5af
Signed-off-by: Miao Xie <miaox*******>
The kernel's memcpy and memmove are very inefficient, but the glibc versions
are quite fast, in some cases 10 times faster than the kernel's. So this
introduces some of glibc's memory copy macros and functions to improve the
performance of the kernel versions.

The strategy of the memory functions is:
1. Copy bytes until the destination pointer is aligned.
2. Copy words in unrolled loops.  If the source and destination are not
   aligned in the same way, use word memory operations, but shift and merge
   two read words before writing.
3. Copy the few remaining bytes.
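A self-contained C sketch of that strategy for the simple case where source
and destination end up with the same word alignment; the shift-and-merge path
for mismatched alignment and the loop unrolling are omitted, so this
illustrates the approach rather than reproducing the patch's code:

  #include <stddef.h>
  #include <stdint.h>

  void *my_memcpy(void *dst, const void *src, size_t n)
  {
          unsigned char *d = dst;
          const unsigned char *s = src;
          const size_t ws = sizeof(unsigned long);

          /* 1. copy bytes until the destination pointer is word-aligned */
          while (n && ((uintptr_t)d & (ws - 1))) {
                  *d++ = *s++;
                  n--;
          }
          /* 2. copy whole words (only when the source is now aligned too) */
          if (((uintptr_t)s & (ws - 1)) == 0) {
                  while (n >= ws) {
                          *(unsigned long *)d = *(const unsigned long *)s;
                          d += ws;
                          s += ws;
                          n -= ws;
                  }
          }
          /* 3. copy the few remaining bytes */
          while (n--)
                  *d++ = *s++;
          return dst;
  }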

Change-Id: I02b7cda69f97512695fb7fae81b7a4e452e73eaa
Signed-off-by: Miao Xie <miaox*******>
If we are building for an LE platform, and we haven't overridden the
MMIO ops, then we can optimize the mem*io operations using the
standard string functions.

Change-Id: I8692682281b0d0f44cee4e54b3f57488418d818e
Acked-by: Nicolas Pitre <[email protected]>
Signed-off-by: Russell King <[email protected]>
The asm-generic rwsem implementation directly accesses sem->cnt when
performing a __down_read_trylock operation. Whilst this is probably safe
on all architectures, we should stick to the atomic_long_* API and use
atomic_long_read instead.

Change-Id: I8991e8fa4e7f8c8d248bb51982737edbe0bdc73d
Signed-off-by: Will Deacon <[email protected]>
Change-Id: Iaa3a18521c19c298e66c0c4cfcdafd7dd62d7079
Signed-off-by: AngeloGioacchino Del Regno <[email protected]>
…es offline

To avoid polluting the kernel log when cpuquiet is active, lower the IRQ
migration printk() to pr_devel_ratelimited().
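The change is essentially a log-level downgrade of the existing message,
e.g. (message text illustrative):

  pr_devel_ratelimited("IRQ%u no longer affine to CPU%u\n",
                       irq, smp_processor_id());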

BUG=chrome-os-partner:40516
TEST=Less console spew on Smaug with cpuquiet enabled.

Change-Id: Ifcd02adfa0e12bbc71b115aee9a966bc676109b4
Signed-off-by: Joseph Lo <[email protected]>
Reviewed-on: https://chromium-review.googlesource.com/286302
Reviewed-by: Andrew Bresticker <[email protected]>
Reviewed-by: Benson Leung <[email protected]>
Tested-by: Benson Leung <[email protected]>
Commit-Queue: Benson Leung <[email protected]>
Results for 10,000,000 calls:
Old:
sqrt(12345689) = 3513
real	0m0.768s
user	0m0.760s
sys	0m0.004s

New:
sqrt(12345689) = 3513
real	0m0.222s
user	0m0.224s
sys	0m0.000s

Signed-off-by: AngeloGioacchino Del Regno <[email protected]>
(Personal comment)
The only really important users of this commit are memcontrol.c and
page_alloc.c, where the inactive anon ratio is computed using sqrt.
Aside from this, we get some efficiency improvement in NFS.

Change-Id: Ia035962e27c116d16c5e54eec882f4c1a7e15238
Verify that an unsigned int value will not become negative before casting it
to a signed int.
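A self-contained illustration of the guard (function and variable names are
illustrative, not the patched driver code):

  #include <limits.h>

  /* Refuse values that would turn negative when interpreted as int. */
  static int checked_to_int(unsigned int v, int *out)
  {
          if (v > INT_MAX)
                  return -1;      /* would overflow the signed type */
          *out = (int)v;
          return 0;
  }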

Bug: 31651010
Change-Id: I548a200f678762042617f11100b6966a405a3920
Signed-off-by: Francisco Franco <[email protected]>
Signed-off-by: NewEraCracker <[email protected]>
Although the arm64 vDSO is cleanly separated by code/data with the
code being read-only in userspace mappings, the code page is still
writable from the kernel.  There have been exploits (such as
http://itszn.com/blog/?p=21) that take advantage of this on x86 to go
from a bad kernel write to full root.

Prevent this specific exploit on arm64 by putting the vDSO code page
in read-only memory as well.

Before the change:
[    3.138366] vdso: 2 pages (1 code @ ffffffc000a71000, 1 data @ ffffffc000a70000)
---[ Kernel Mapping ]---
0xffffffc000000000-0xffffffc000082000         520K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc000082000-0xffffffc000200000        1528K     ro x  SHD AF            UXN MEM/NORMAL
0xffffffc000200000-0xffffffc000800000           6M     ro x  SHD AF        BLK UXN MEM/NORMAL
0xffffffc000800000-0xffffffc0009b6000        1752K     ro x  SHD AF            UXN MEM/NORMAL
0xffffffc0009b6000-0xffffffc000c00000        2344K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc000c00000-0xffffffc008000000         116M     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc00c000000-0xffffffc07f000000        1840M     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc800000000-0xffffffc840000000           1G     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc840000000-0xffffffc87ae00000         942M     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc87ae00000-0xffffffc87ae70000         448K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc87af80000-0xffffffc87af8a000          40K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc87af8b000-0xffffffc87b000000         468K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc87b000000-0xffffffc87fe00000          78M     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc87fe00000-0xffffffc87ff50000        1344K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc87ff90000-0xffffffc87ffa0000          64K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc87fff0000-0xffffffc880000000          64K     RW NX SHD AF            UXN MEM/NORMAL

After:
[    3.138368] vdso: 2 pages (1 code @ ffffffc0006de000, 1 data @ ffffffc000a74000)
---[ Kernel Mapping ]---
0xffffffc000000000-0xffffffc000082000         520K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc000082000-0xffffffc000200000        1528K     ro x  SHD AF            UXN MEM/NORMAL
0xffffffc000200000-0xffffffc000800000           6M     ro x  SHD AF        BLK UXN MEM/NORMAL
0xffffffc000800000-0xffffffc0009b8000        1760K     ro x  SHD AF            UXN MEM/NORMAL
0xffffffc0009b8000-0xffffffc000c00000        2336K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc000c00000-0xffffffc008000000         116M     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc00c000000-0xffffffc07f000000        1840M     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc800000000-0xffffffc840000000           1G     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc840000000-0xffffffc87ae00000         942M     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc87ae00000-0xffffffc87ae70000         448K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc87af80000-0xffffffc87af8a000          40K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc87af8b000-0xffffffc87b000000         468K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc87b000000-0xffffffc87fe00000          78M     RW NX SHD AF        BLK UXN MEM/NORMAL
0xffffffc87fe00000-0xffffffc87ff50000        1344K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc87ff90000-0xffffffc87ffa0000          64K     RW NX SHD AF            UXN MEM/NORMAL
0xffffffc87fff0000-0xffffffc880000000          64K     RW NX SHD AF            UXN MEM/NORMAL

Inspired by https://lkml.org/lkml/2016/1/19/494 based on work by the
PaX Team, Brad Spengler, and Kees Cook.

Signed-off-by: David Brown <[email protected]>
Acked-by: Will Deacon <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
[[email protected]: removed superfluous __PAGE_ALIGNED_DATA]
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Francisco Franco <[email protected]>
Signed-off-by: NewEraCracker <[email protected]>

Change-Id: I0cc57e08b37ca07c0bd55dcb4b68f407bd4f604c
If we end up sleeping due to running out of requests, we should
update the hardware and software queues in the map ctx structure.
Otherwise we could end up having rq->mq_ctx point to the pre-sleep
context, and risk corrupting ctx->rq_list since we'll be
grabbing the wrong lock when inserting the request.

Change-Id: I2920edd88dbd62c2bebd497aa6e1d39fe7adb437
Reported-by: Dave Jones <[email protected]>
Reported-by: Chris Mason <[email protected]>
Tested-by: Chris Mason <[email protected]>
Fixes: 63581af3f31e ("blk-mq: remove non-blocking pass in blk_mq_map_request")
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Francisco Franco <[email protected]>
Signed-off-by: NewEraCracker <[email protected]>
We want to avoid lots of different copy_page implementations, settling
for something that is "good enough" everywhere and hopefully easy to
understand and maintain whilst we're at it.

This patch reworks our copy_page implementation based on discussions
with Cavium on the list and benchmarking on Cortex-A processors so that:

  - The loop is unrolled to copy 128 bytes per iteration

  - The reads are offset so that we read from the next 128-byte block
    in the same iteration that we store the previous block

  - Explicit prefetch instructions are removed for now, since they hurt
    performance on CPUs with hardware prefetching

  - The loop exit condition is calculated at the start of the loop

Change-Id: I0d9f3bbe4efa2751f41432a3b4b299fbb0e494be
Signed-off-by: Will Deacon <[email protected]>
Tested-by: Andrew Pinski <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
(cherry picked from commit 8b3201d15a79346952540f6bc173a451f4a9637f)
@dmd79 force-pushed the n7x-caf branch 4 times, most recently from 58c84f5 to 38106b2 on March 14, 2017 at 19:00
@dmd79 force-pushed the n7x-caf branch 2 times, most recently from 2533e8a to ea2fb6a on June 27, 2017 at 12:17