summaryrefslogtreecommitdiff
path: root/fs/f2fs/super.c
AgeCommit message (Collapse)AuthorFilesLines
2018-08-22Merge tag 'f2fs-for-4.19' of ↵Linus Torvalds1-40/+122
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we've tuned f2fs to improve general performance by serializing block allocation and enhancing discard flows like fstrim which avoids user IO contention. And we've added fsync_mode=nobarrier which gives an option to user where it skips issuing cache_flush commands to underlying flash storage. And there are many bug fixes related to fuzzed images, revoked atomic writes, quota ops, and minor direct IO. Enhancements: - add fsync_mode=nobarrier which bypasses cache_flush command - enhance the discarding flow which avoids user IOs and issues in LBA order - readahead some encrypted blocks during GC - enable in-memory inode checksum to verify the blocks if F2FS_CHECK_FS is set - enhance nat_bits behavior - set -o discard by default - set REQ_RAHEAD to bio in ->readpages Bug fixes: - fix a corner case to corrupt atomic_writes revoking flow - revisit i_gc_rwsem to fix race conditions - fix some dio behaviors captured by xfstests - correct handling errors given by quota-related failures - add many sanity check flows to avoid fuzz test failures - add more error number propagation to their callers - fix several corner cases to continue fault injection w/ shutdown loop" * tag 'f2fs-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (89 commits) f2fs: readahead encrypted block during GC f2fs: avoid fi->i_gc_rwsem[WRITE] lock in f2fs_gc f2fs: fix performance issue observed with multi-thread sequential read f2fs: fix to skip verifying block address for non-regular inode f2fs: rework fault injection handling to avoid a warning f2fs: support fault_type mount option f2fs: fix to return success when trimming meta area f2fs: fix use-after-free of dicard command entry f2fs: support discard submission error injection f2fs: split discard command in prior to block layer f2fs: wake up gc thread immediately when gc_urgent is set f2fs: fix incorrect range->len in f2fs_trim_fs() f2fs: refresh recent accessed nat entry in lru list f2fs: fix avoid race between truncate and background GC f2fs: avoid race between zero_range and background GC f2fs: fix to do sanity check with block address in main area v2 f2fs: fix to do sanity check with inline flags f2fs: fix to reset i_gc_failures correctly f2fs: fix invalid memory access f2fs: fix to avoid broken of dnode block list ...
2018-08-21f2fs: fix performance issue observed with multi-thread sequential readJaegeuk Kim1-0/+1
This reverts the commit - "b93f771 - f2fs: remove writepages lock" to fix the drop in sequential read throughput. Test: ./tiotest -t 32 -d /data/tio_tmp -f 32 -b 524288 -k 1 -k 3 -L device: UFS Before - read throughput: 185 MB/s total read requests: 85177 (of these ~80000 are 4KB size requests). total write requests: 2546 (of these ~2208 requests are written in 512KB). After - read throughput: 758 MB/s total read requests: 2417 (of these ~2042 are 512KB reads). total write requests: 2701 (of these ~2034 requests are written in 512KB). Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-13f2fs: support fault_type mount optionChao Yu1-7/+27
Previously, once fault injection is on, by default, all kind of faults will be injected to f2fs, if we want to trigger single or specified combined type during the test, we need to configure sysfs entry, it will be a little inconvenient to integrate sysfs configuring into testsuit, such as xfstest. So this patch introduces a new mount option 'fault_type' to assist old option 'fault_injection', with these two mount options, we can specify any fault rate/type at mount-time. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-13f2fs: support discard submission error injectionChao Yu1-0/+1
This patch adds to support discard submission error injection for testing error handling of __submit_discard_cmd(). Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-13f2fs: fix to do sanity check with inline flagsChao Yu1-1/+2
https://bugzilla.kernel.org/show_bug.cgi?id=200221 - Overview BUG() in clear_inode() when mounting and un-mounting a corrupted f2fs image - Reproduce - Kernel message [ 538.601448] F2FS-fs (loop0): Invalid segment/section count (31, 24 x 1376257) [ 538.601458] F2FS-fs (loop0): Can't find valid F2FS filesystem in 2th superblock [ 538.724091] F2FS-fs (loop0): Try to recover 2th superblock, ret: 0 [ 538.724102] F2FS-fs (loop0): Mounted with checkpoint version = 2 [ 540.970834] ------------[ cut here ]------------ [ 540.970838] kernel BUG at fs/inode.c:512! [ 540.971750] invalid opcode: 0000 [#1] SMP KASAN PTI [ 540.972755] CPU: 1 PID: 1305 Comm: umount Not tainted 4.18.0-rc1+ #4 [ 540.974034] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 540.982913] RIP: 0010:clear_inode+0xc0/0xd0 [ 540.983774] Code: 8d a3 30 01 00 00 4c 89 e7 e8 1c ec f8 ff 48 8b 83 30 01 00 00 49 39 c4 75 1a 48 c7 83 a0 00 00 00 60 00 00 00 5b 41 5c 5d c3 <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 1f 40 00 66 66 66 66 90 55 [ 540.987570] RSP: 0018:ffff8801e34a7b70 EFLAGS: 00010002 [ 540.988636] RAX: 0000000000000000 RBX: ffff8801e9b744e8 RCX: ffffffffb840eb3a [ 540.990063] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: ffff8801e9b746b8 [ 540.991499] RBP: ffff8801e34a7b80 R08: ffffed003d36e8ce R09: ffffed003d36e8ce [ 540.992923] R10: 0000000000000001 R11: ffffed003d36e8cd R12: ffff8801e9b74668 [ 540.994360] R13: ffff8801e9b74760 R14: ffff8801e9b74528 R15: ffff8801e9b74530 [ 540.995786] FS: 00007f4662bdf840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000 [ 540.997403] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 540.998571] CR2: 000000000175c568 CR3: 00000001dcfe6000 CR4: 00000000000006e0 [ 541.000015] Call Trace: [ 541.000554] f2fs_evict_inode+0x253/0x630 [ 541.001381] evict+0x16f/0x290 [ 541.002015] iput+0x280/0x300 [ 541.002654] dentry_unlink_inode+0x165/0x1e0 [ 541.003528] __dentry_kill+0x16a/0x260 [ 541.004300] dentry_kill+0x70/0x250 [ 541.005018] dput+0x154/0x1d0 [ 541.005635] do_one_tree+0x34/0x40 [ 541.006354] shrink_dcache_for_umount+0x3f/0xa0 [ 541.007285] generic_shutdown_super+0x43/0x1c0 [ 541.008192] kill_block_super+0x52/0x80 [ 541.008978] kill_f2fs_super+0x62/0x70 [ 541.009750] deactivate_locked_super+0x6f/0xa0 [ 541.010664] deactivate_super+0x5e/0x80 [ 541.011450] cleanup_mnt+0x61/0xa0 [ 541.012151] __cleanup_mnt+0x12/0x20 [ 541.012893] task_work_run+0xc8/0xf0 [ 541.013635] exit_to_usermode_loop+0x125/0x130 [ 541.014555] do_syscall_64+0x138/0x170 [ 541.015340] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 541.016375] RIP: 0033:0x7f46624bf487 [ 541.017104] Code: 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 c9 2b 00 f7 d8 64 89 01 48 [ 541.020923] RSP: 002b:00007fff5e12e9a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [ 541.022452] RAX: 0000000000000000 RBX: 0000000001753030 RCX: 00007f46624bf487 [ 541.023885] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000000000175a1e0 [ 541.025318] RBP: 000000000175a1e0 R08: 0000000000000000 R09: 0000000000000014 [ 541.026755] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007f46629c883c [ 541.028186] R13: 0000000000000000 R14: 0000000001753210 R15: 00007fff5e12ec30 [ 541.029626] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy [ 541.039445] ---[ end trace 4ce02f25ff7d3df5 ]--- [ 541.040392] RIP: 0010:clear_inode+0xc0/0xd0 [ 541.041240] Code: 8d a3 30 01 00 00 4c 89 e7 e8 1c ec f8 ff 48 8b 83 30 01 00 00 49 39 c4 75 1a 48 c7 83 a0 00 00 00 60 00 00 00 5b 41 5c 5d c3 <0f> 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 0b 0f 1f 40 00 66 66 66 66 90 55 [ 541.045042] RSP: 0018:ffff8801e34a7b70 EFLAGS: 00010002 [ 541.046099] RAX: 0000000000000000 RBX: ffff8801e9b744e8 RCX: ffffffffb840eb3a [ 541.047537] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: ffff8801e9b746b8 [ 541.048965] RBP: ffff8801e34a7b80 R08: ffffed003d36e8ce R09: ffffed003d36e8ce [ 541.050402] R10: 0000000000000001 R11: ffffed003d36e8cd R12: ffff8801e9b74668 [ 541.051832] R13: ffff8801e9b74760 R14: ffff8801e9b74528 R15: ffff8801e9b74530 [ 541.053263] FS: 00007f4662bdf840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000 [ 541.054891] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 541.056039] CR2: 000000000175c568 CR3: 00000001dcfe6000 CR4: 00000000000006e0 [ 541.058506] ================================================================== [ 541.059991] BUG: KASAN: stack-out-of-bounds in update_stack_state+0x38c/0x3e0 [ 541.061513] Read of size 8 at addr ffff8801e34a7970 by task umount/1305 [ 541.063302] CPU: 1 PID: 1305 Comm: umount Tainted: G D 4.18.0-rc1+ #4 [ 541.064838] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 541.066778] Call Trace: [ 541.067294] dump_stack+0x7b/0xb5 [ 541.067986] print_address_description+0x70/0x290 [ 541.068941] kasan_report+0x291/0x390 [ 541.069692] ? update_stack_state+0x38c/0x3e0 [ 541.070598] __asan_load8+0x54/0x90 [ 541.071315] update_stack_state+0x38c/0x3e0 [ 541.072172] ? __read_once_size_nocheck.constprop.7+0x20/0x20 [ 541.073340] ? vprintk_func+0x27/0x60 [ 541.074096] ? printk+0xa3/0xd3 [ 541.074762] ? __save_stack_trace+0x5e/0x100 [ 541.075634] unwind_next_frame.part.5+0x18e/0x490 [ 541.076594] ? unwind_dump+0x290/0x290 [ 541.077368] ? __show_regs+0x2c4/0x330 [ 541.078142] __unwind_start+0x106/0x190 [ 541.085422] __save_stack_trace+0x5e/0x100 [ 541.086268] ? __save_stack_trace+0x5e/0x100 [ 541.087161] ? unlink_anon_vmas+0xba/0x2c0 [ 541.087997] save_stack_trace+0x1f/0x30 [ 541.088782] save_stack+0x46/0xd0 [ 541.089475] ? __alloc_pages_slowpath+0x1420/0x1420 [ 541.090477] ? flush_tlb_mm_range+0x15e/0x220 [ 541.091364] ? __dec_node_state+0x24/0xb0 [ 541.092180] ? lock_page_memcg+0x85/0xf0 [ 541.092979] ? unlock_page_memcg+0x16/0x80 [ 541.093812] ? page_remove_rmap+0x198/0x520 [ 541.094674] ? mark_page_accessed+0x133/0x200 [ 541.095559] ? _cond_resched+0x1a/0x50 [ 541.096326] ? unmap_page_range+0xcd4/0xe50 [ 541.097179] ? rb_next+0x58/0x80 [ 541.097845] ? rb_next+0x58/0x80 [ 541.098518] __kasan_slab_free+0x13c/0x1a0 [ 541.099352] ? unlink_anon_vmas+0xba/0x2c0 [ 541.100184] kasan_slab_free+0xe/0x10 [ 541.100934] kmem_cache_free+0x89/0x1e0 [ 541.101724] unlink_anon_vmas+0xba/0x2c0 [ 541.102534] free_pgtables+0x101/0x1b0 [ 541.103299] exit_mmap+0x146/0x2a0 [ 541.103996] ? __ia32_sys_munmap+0x50/0x50 [ 541.104829] ? kasan_check_read+0x11/0x20 [ 541.105649] ? mm_update_next_owner+0x322/0x380 [ 541.106578] mmput+0x8b/0x1d0 [ 541.107191] do_exit+0x43a/0x1390 [ 541.107876] ? mm_update_next_owner+0x380/0x380 [ 541.108791] ? deactivate_super+0x5e/0x80 [ 541.109610] ? cleanup_mnt+0x61/0xa0 [ 541.110351] ? __cleanup_mnt+0x12/0x20 [ 541.111115] ? task_work_run+0xc8/0xf0 [ 541.111879] ? exit_to_usermode_loop+0x125/0x130 [ 541.112817] rewind_stack_do_exit+0x17/0x20 [ 541.113666] RIP: 0033:0x7f46624bf487 [ 541.114404] Code: Bad RIP value. [ 541.115094] RSP: 002b:00007fff5e12e9a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [ 541.116605] RAX: 0000000000000000 RBX: 0000000001753030 RCX: 00007f46624bf487 [ 541.118034] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000000000175a1e0 [ 541.119472] RBP: 000000000175a1e0 R08: 0000000000000000 R09: 0000000000000014 [ 541.120890] R10: 00000000000006b2 R11: 0000000000000246 R12: 00007f46629c883c [ 541.122321] R13: 0000000000000000 R14: 0000000001753210 R15: 00007fff5e12ec30 [ 541.124061] The buggy address belongs to the page: [ 541.125042] page:ffffea00078d29c0 count:0 mapcount:0 mapping:0000000000000000 index:0x0 [ 541.126651] flags: 0x2ffff0000000000() [ 541.127418] raw: 02ffff0000000000 dead000000000100 dead000000000200 0000000000000000 [ 541.128963] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 [ 541.130516] page dumped because: kasan: bad access detected [ 541.131954] Memory state around the buggy address: [ 541.132924] ffff8801e34a7800: 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3 f3 f3 00 00 00 [ 541.134378] ffff8801e34a7880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 541.135814] >ffff8801e34a7900: 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 [ 541.137253] ^ [ 541.138637] ffff8801e34a7980: f1 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 541.140075] ffff8801e34a7a00: 00 00 00 00 00 00 00 00 f3 00 00 00 00 00 00 00 [ 541.141509] ================================================================== - Location https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/inode.c#L512 BUG_ON(inode->i_data.nrpages); The root cause is root directory inode is corrupted, it has both inline_data and inline_dentry flag, and its nlink is zero, so in ->evict(), after dropping all page cache, it grabs page #0 for inline data truncation, result in panic in later clear_inode() where we will check inode->i_data.nrpages value. This patch adds inline flags check in sanity_check_inode, in addition, do sanity check with root inode's nlink. Reported-by Wen Xu <wen.xu@gatech.edu> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-11f2fs: fix to avoid broken of dnode block listChao Yu1-0/+6
f2fs recovery flow is relying on dnode block link list, it means fsynced file recovery depends on previous dnode's persistence in the list, so during fsync() we should wait on all regular inode's dnode writebacked before issuing flush. By this way, we can avoid dnode block list being broken by out-of-order IO submission due to IO scheduler or driver. Sheng Yong helps to do the test with this patch: Target:/data (f2fs, -) 64MB / 32768KB / 4KB / 8 1 / PERSIST / Index Base: SEQ-RD(MB/s) SEQ-WR(MB/s) RND-RD(IOPS) RND-WR(IOPS) Insert(TPS) Update(TPS) Delete(TPS) 1 867.82 204.15 41440.03 41370.54 680.8 1025.94 1031.08 2 871.87 205.87 41370.3 40275.2 791.14 1065.84 1101.7 3 866.52 205.69 41795.67 40596.16 694.69 1037.16 1031.48 Avg 868.7366667 205.2366667 41535.33333 40747.3 722.21 1042.98 1054.753333 After: SEQ-RD(MB/s) SEQ-WR(MB/s) RND-RD(IOPS) RND-WR(IOPS) Insert(TPS) Update(TPS) Delete(TPS) 1 798.81 202.5 41143 40613.87 602.71 838.08 913.83 2 805.79 206.47 40297.2 41291.46 604.44 840.75 924.27 3 814.83 206.17 41209.57 40453.62 602.85 834.66 927.91 Avg 806.4766667 205.0466667 40883.25667 40786.31667 603.3333333 837.83 922.0033333 Patched/Original: 0.928332713 0.999074239 0.984300676 1.000957528 0.835398753 0.803303994 0.874141189 It looks like atomic write will suffer performance regression. I suspect that the criminal is that we forcing to wait all dnode being in storage cache before we issue PREFLUSH+FUA. BTW, will commit ("f2fs: don't need to wait for node writes for atomic write") cause the problem: we will lose data of last transaction after SPO, even if atomic write return no error: - atomic_open(); - write() P1, P2, P3; - atomic_commit(); - writeback data: P1, P2, P3; - writeback node: N1, N2, N3; <--- If N1, N2 is not writebacked, N3 with fsync_mark is writebacked, In SPOR, we won't find N3 since node chain is broken, turns out that losing last transaction. - preflush + fua; - power-cut If we don't wait dnode writeback for atomic_write: SEQ-RD(MB/s) SEQ-WR(MB/s) RND-RD(IOPS) RND-WR(IOPS) Insert(TPS) Update(TPS) Delete(TPS) 1 779.91 206.03 41621.5 40333.16 716.9 1038.21 1034.85 2 848.51 204.35 40082.44 39486.17 791.83 1119.96 1083.77 3 772.12 206.27 41335.25 41599.65 723.29 1055.07 971.92 Avg 800.18 205.55 41013.06333 40472.99333 744.0066667 1071.08 1030.18 Patched/Original: 0.92108464 1.001526693 0.987425886 0.993268102 1.030180511 1.026942031 0.976702294 SQLite's performance recovers. Jaegeuk: "Practically, I don't see db corruption becase of this. We can excuse to lose the last transaction." Finally, we decide to keep original implementation of atomic write interface sematics that we don't wait all dnode writeback before preflush+fua submission. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-11f2fs: fix to do sanity check with cp_pack_start_sumChao Yu1-0/+12
After fuzzing, cp_pack_start_sum could be corrupted, so current log's summary info should be wrong due to loading incorrect summary block. Then, if segment's type in current log is exceeded NR_CURSEG_TYPE, it can lead accessing invalid dirty_i->dirty_segmap bitmap finally. Add sanity check for cp_pack_start_sum to fix this issue. https://bugzilla.kernel.org/show_bug.cgi?id=200419 - Reproduce - Kernel message (f2fs-dev w/ KASAN) [ 3117.578432] F2FS-fs (loop0): Invalid log blocks per segment (8) [ 3117.578445] F2FS-fs (loop0): Can't find valid F2FS filesystem in 2th superblock [ 3117.581364] F2FS-fs (loop0): invalid crc_offset: 30716 [ 3117.583564] WARNING: CPU: 1 PID: 1225 at fs/f2fs/checkpoint.c:90 __get_meta_page+0x448/0x4b0 [ 3117.583570] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer joydev input_leds serio_raw snd soundcore mac_hid i2c_piix4 ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi btrfs zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear 8139too qxl ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel psmouse aes_x86_64 8139cp crypto_simd cryptd mii glue_helper pata_acpi floppy [ 3117.584014] CPU: 1 PID: 1225 Comm: mount Not tainted 4.17.0+ #1 [ 3117.584017] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 3117.584022] RIP: 0010:__get_meta_page+0x448/0x4b0 [ 3117.584023] Code: 00 49 8d bc 24 84 00 00 00 e8 74 54 da ff 41 83 8c 24 84 00 00 00 08 4c 89 f6 4c 89 ef e8 c0 d9 95 00 48 89 ef e8 18 e3 00 00 <0f> 0b f0 80 4d 48 04 e9 0f fe ff ff 0f 0b 48 89 c7 48 89 04 24 e8 [ 3117.584072] RSP: 0018:ffff88018eb678c0 EFLAGS: 00010286 [ 3117.584082] RAX: ffff88018f0a6a78 RBX: ffffea0007a46600 RCX: ffffffff9314d1b2 [ 3117.584085] RDX: ffffffff00000001 RSI: 0000000000000000 RDI: ffff88018f0a6a98 [ 3117.584087] RBP: ffff88018ebe9980 R08: 0000000000000002 R09: 0000000000000001 [ 3117.584090] R10: 0000000000000001 R11: ffffed00326e4450 R12: ffff880193722200 [ 3117.584092] R13: ffff88018ebe9afc R14: 0000000000000206 R15: ffff88018eb67900 [ 3117.584096] FS: 00007f5694636840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000 [ 3117.584098] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3117.584101] CR2: 00000000016f21b8 CR3: 0000000191c22000 CR4: 00000000000006e0 [ 3117.584112] Call Trace: [ 3117.584121] ? f2fs_set_meta_page_dirty+0x150/0x150 [ 3117.584127] ? f2fs_build_segment_manager+0xbf9/0x3190 [ 3117.584133] ? f2fs_npages_for_summary_flush+0x75/0x120 [ 3117.584145] f2fs_build_segment_manager+0xda8/0x3190 [ 3117.584151] ? f2fs_get_valid_checkpoint+0x298/0xa00 [ 3117.584156] ? f2fs_flush_sit_entries+0x10e0/0x10e0 [ 3117.584184] ? map_id_range_down+0x17c/0x1b0 [ 3117.584188] ? __put_user_ns+0x30/0x30 [ 3117.584206] ? find_next_bit+0x53/0x90 [ 3117.584237] ? cpumask_next+0x16/0x20 [ 3117.584249] f2fs_fill_super+0x1948/0x2b40 [ 3117.584258] ? f2fs_commit_super+0x1a0/0x1a0 [ 3117.584279] ? sget_userns+0x65e/0x690 [ 3117.584296] ? set_blocksize+0x88/0x130 [ 3117.584302] ? f2fs_commit_super+0x1a0/0x1a0 [ 3117.584305] mount_bdev+0x1c0/0x200 [ 3117.584310] mount_fs+0x5c/0x190 [ 3117.584320] vfs_kern_mount+0x64/0x190 [ 3117.584330] do_mount+0x2e4/0x1450 [ 3117.584343] ? lockref_put_return+0x130/0x130 [ 3117.584347] ? copy_mount_string+0x20/0x20 [ 3117.584357] ? kasan_unpoison_shadow+0x31/0x40 [ 3117.584362] ? kasan_kmalloc+0xa6/0xd0 [ 3117.584373] ? memcg_kmem_put_cache+0x16/0x90 [ 3117.584377] ? __kmalloc_track_caller+0x196/0x210 [ 3117.584383] ? _copy_from_user+0x61/0x90 [ 3117.584396] ? memdup_user+0x3e/0x60 [ 3117.584401] ksys_mount+0x7e/0xd0 [ 3117.584405] __x64_sys_mount+0x62/0x70 [ 3117.584427] do_syscall_64+0x73/0x160 [ 3117.584440] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 3117.584455] RIP: 0033:0x7f5693f14b9a [ 3117.584456] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48 [ 3117.584505] RSP: 002b:00007fff27346488 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 [ 3117.584510] RAX: ffffffffffffffda RBX: 00000000016e2030 RCX: 00007f5693f14b9a [ 3117.584512] RDX: 00000000016e2210 RSI: 00000000016e3f30 RDI: 00000000016ee040 [ 3117.584514] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013 [ 3117.584516] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 00000000016ee040 [ 3117.584519] R13: 00000000016e2210 R14: 0000000000000000 R15: 0000000000000003 [ 3117.584523] ---[ end trace a8e0d899985faf31 ]--- [ 3117.685663] F2FS-fs (loop0): f2fs_check_nid_range: out-of-range nid=2, run fsck to fix. [ 3117.685673] F2FS-fs (loop0): recover_data: ino = 2 (i_size: recover) recovered = 1, err = 0 [ 3117.685707] ================================================================== [ 3117.685955] BUG: KASAN: slab-out-of-bounds in __remove_dirty_segment+0xdd/0x1e0 [ 3117.686175] Read of size 8 at addr ffff88018f0a63d0 by task mount/1225 [ 3117.686477] CPU: 0 PID: 1225 Comm: mount Tainted: G W 4.17.0+ #1 [ 3117.686481] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 3117.686483] Call Trace: [ 3117.686494] dump_stack+0x71/0xab [ 3117.686512] print_address_description+0x6b/0x290 [ 3117.686517] kasan_report+0x28e/0x390 [ 3117.686522] ? __remove_dirty_segment+0xdd/0x1e0 [ 3117.686527] __remove_dirty_segment+0xdd/0x1e0 [ 3117.686532] locate_dirty_segment+0x189/0x190 [ 3117.686538] f2fs_allocate_new_segments+0xa9/0xe0 [ 3117.686543] recover_data+0x703/0x2c20 [ 3117.686547] ? f2fs_recover_fsync_data+0x48f/0xd50 [ 3117.686553] ? ksys_mount+0x7e/0xd0 [ 3117.686564] ? policy_nodemask+0x1a/0x90 [ 3117.686567] ? policy_node+0x56/0x70 [ 3117.686571] ? add_fsync_inode+0xf0/0xf0 [ 3117.686592] ? blk_finish_plug+0x44/0x60 [ 3117.686597] ? f2fs_ra_meta_pages+0x38b/0x5e0 [ 3117.686602] ? find_inode_fast+0xac/0xc0 [ 3117.686606] ? f2fs_is_valid_blkaddr+0x320/0x320 [ 3117.686618] ? __radix_tree_lookup+0x150/0x150 [ 3117.686633] ? dqget+0x670/0x670 [ 3117.686648] ? pagecache_get_page+0x29/0x410 [ 3117.686656] ? kmem_cache_alloc+0x176/0x1e0 [ 3117.686660] ? f2fs_is_valid_blkaddr+0x11d/0x320 [ 3117.686664] f2fs_recover_fsync_data+0xc23/0xd50 [ 3117.686670] ? f2fs_space_for_roll_forward+0x60/0x60 [ 3117.686674] ? rb_insert_color+0x323/0x3d0 [ 3117.686678] ? f2fs_recover_orphan_inodes+0xa5/0x700 [ 3117.686683] ? proc_register+0x153/0x1d0 [ 3117.686686] ? f2fs_remove_orphan_inode+0x10/0x10 [ 3117.686695] ? f2fs_attr_store+0x50/0x50 [ 3117.686700] ? proc_create_single_data+0x52/0x60 [ 3117.686707] f2fs_fill_super+0x1d06/0x2b40 [ 3117.686728] ? f2fs_commit_super+0x1a0/0x1a0 [ 3117.686735] ? sget_userns+0x65e/0x690 [ 3117.686740] ? set_blocksize+0x88/0x130 [ 3117.686745] ? f2fs_commit_super+0x1a0/0x1a0 [ 3117.686748] mount_bdev+0x1c0/0x200 [ 3117.686753] mount_fs+0x5c/0x190 [ 3117.686758] vfs_kern_mount+0x64/0x190 [ 3117.686762] do_mount+0x2e4/0x1450 [ 3117.686769] ? lockref_put_return+0x130/0x130 [ 3117.686773] ? copy_mount_string+0x20/0x20 [ 3117.686777] ? kasan_unpoison_shadow+0x31/0x40 [ 3117.686780] ? kasan_kmalloc+0xa6/0xd0 [ 3117.686786] ? memcg_kmem_put_cache+0x16/0x90 [ 3117.686790] ? __kmalloc_track_caller+0x196/0x210 [ 3117.686795] ? _copy_from_user+0x61/0x90 [ 3117.686801] ? memdup_user+0x3e/0x60 [ 3117.686804] ksys_mount+0x7e/0xd0 [ 3117.686809] __x64_sys_mount+0x62/0x70 [ 3117.686816] do_syscall_64+0x73/0x160 [ 3117.686824] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 3117.686829] RIP: 0033:0x7f5693f14b9a [ 3117.686830] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48 [ 3117.686887] RSP: 002b:00007fff27346488 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 [ 3117.686892] RAX: ffffffffffffffda RBX: 00000000016e2030 RCX: 00007f5693f14b9a [ 3117.686894] RDX: 00000000016e2210 RSI: 00000000016e3f30 RDI: 00000000016ee040 [ 3117.686896] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013 [ 3117.686899] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 00000000016ee040 [ 3117.686901] R13: 00000000016e2210 R14: 0000000000000000 R15: 0000000000000003 [ 3117.687005] Allocated by task 1225: [ 3117.687152] kasan_kmalloc+0xa6/0xd0 [ 3117.687157] kmem_cache_alloc_trace+0xfd/0x200 [ 3117.687161] f2fs_build_segment_manager+0x2d09/0x3190 [ 3117.687165] f2fs_fill_super+0x1948/0x2b40 [ 3117.687168] mount_bdev+0x1c0/0x200 [ 3117.687171] mount_fs+0x5c/0x190 [ 3117.687174] vfs_kern_mount+0x64/0x190 [ 3117.687177] do_mount+0x2e4/0x1450 [ 3117.687180] ksys_mount+0x7e/0xd0 [ 3117.687182] __x64_sys_mount+0x62/0x70 [ 3117.687186] do_syscall_64+0x73/0x160 [ 3117.687190] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 3117.687285] Freed by task 19: [ 3117.687412] __kasan_slab_free+0x137/0x190 [ 3117.687416] kfree+0x8b/0x1b0 [ 3117.687460] ttm_bo_man_put_node+0x61/0x80 [ttm] [ 3117.687476] ttm_bo_cleanup_refs+0x15f/0x250 [ttm] [ 3117.687492] ttm_bo_delayed_delete+0x2f0/0x300 [ttm] [ 3117.687507] ttm_bo_delayed_workqueue+0x17/0x50 [ttm] [ 3117.687528] process_one_work+0x2f9/0x740 [ 3117.687531] worker_thread+0x78/0x6b0 [ 3117.687541] kthread+0x177/0x1c0 [ 3117.687545] ret_from_fork+0x35/0x40 [ 3117.687638] The buggy address belongs to the object at ffff88018f0a6300 which belongs to the cache kmalloc-192 of size 192 [ 3117.688014] The buggy address is located 16 bytes to the right of 192-byte region [ffff88018f0a6300, ffff88018f0a63c0) [ 3117.688382] The buggy address belongs to the page: [ 3117.688554] page:ffffea00063c2980 count:1 mapcount:0 mapping:ffff8801f3403180 index:0x0 [ 3117.688788] flags: 0x17fff8000000100(slab) [ 3117.688944] raw: 017fff8000000100 ffffea00063c2840 0000000e0000000e ffff8801f3403180 [ 3117.689166] raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000 [ 3117.689386] page dumped because: kasan: bad access detected [ 3117.689653] Memory state around the buggy address: [ 3117.689816] ffff88018f0a6280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 3117.690027] ffff88018f0a6300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 3117.690239] >ffff88018f0a6380: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 3117.690448] ^ [ 3117.690644] ffff88018f0a6400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 3117.690868] ffff88018f0a6480: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 3117.691077] ================================================================== [ 3117.691290] Disabling lock debugging due to kernel taint [ 3117.693893] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 [ 3117.694120] PGD 80000001f01bc067 P4D 80000001f01bc067 PUD 1d9638067 PMD 0 [ 3117.694338] Oops: 0002 [#1] SMP KASAN PTI [ 3117.694490] CPU: 1 PID: 1225 Comm: mount Tainted: G B W 4.17.0+ #1 [ 3117.694703] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 3117.695073] RIP: 0010:__remove_dirty_segment+0xe2/0x1e0 [ 3117.695246] Code: c4 48 89 c7 e8 cf bb d7 ff 45 0f b6 24 24 41 83 e4 3f 44 88 64 24 07 41 83 e4 3f 4a 8d 7c e3 08 e8 b3 bc d7 ff 4a 8b 4c e3 08 <f0> 4c 0f b3 29 0f 82 94 00 00 00 48 8d bd 20 04 00 00 e8 97 bb d7 [ 3117.695793] RSP: 0018:ffff88018eb67638 EFLAGS: 00010292 [ 3117.695969] RAX: 0000000000000000 RBX: ffff88018f0a6300 RCX: 0000000000000000 [ 3117.696182] RDX: 0000000000000000 RSI: 0000000000000297 RDI: 0000000000000297 [ 3117.696391] RBP: ffff88018ebe9980 R08: ffffed003e743ebb R09: ffffed003e743ebb [ 3117.696604] R10: 0000000000000001 R11: ffffed003e743eba R12: 0000000000000019 [ 3117.696813] R13: 0000000000000014 R14: 0000000000000320 R15: ffff88018ebe99e0 [ 3117.697032] FS: 00007f5694636840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000 [ 3117.697280] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3117.702357] CR2: 00007fe89bb1a000 CR3: 0000000191c22000 CR4: 00000000000006e0 [ 3117.707235] Call Trace: [ 3117.712077] locate_dirty_segment+0x189/0x190 [ 3117.716891] f2fs_allocate_new_segments+0xa9/0xe0 [ 3117.721617] recover_data+0x703/0x2c20 [ 3117.726316] ? f2fs_recover_fsync_data+0x48f/0xd50 [ 3117.730957] ? ksys_mount+0x7e/0xd0 [ 3117.735573] ? policy_nodemask+0x1a/0x90 [ 3117.740198] ? policy_node+0x56/0x70 [ 3117.744829] ? add_fsync_inode+0xf0/0xf0 [ 3117.749487] ? blk_finish_plug+0x44/0x60 [ 3117.754152] ? f2fs_ra_meta_pages+0x38b/0x5e0 [ 3117.758831] ? find_inode_fast+0xac/0xc0 [ 3117.763448] ? f2fs_is_valid_blkaddr+0x320/0x320 [ 3117.768046] ? __radix_tree_lookup+0x150/0x150 [ 3117.772603] ? dqget+0x670/0x670 [ 3117.777159] ? pagecache_get_page+0x29/0x410 [ 3117.781648] ? kmem_cache_alloc+0x176/0x1e0 [ 3117.786067] ? f2fs_is_valid_blkaddr+0x11d/0x320 [ 3117.790476] f2fs_recover_fsync_data+0xc23/0xd50 [ 3117.794790] ? f2fs_space_for_roll_forward+0x60/0x60 [ 3117.799086] ? rb_insert_color+0x323/0x3d0 [ 3117.803304] ? f2fs_recover_orphan_inodes+0xa5/0x700 [ 3117.807563] ? proc_register+0x153/0x1d0 [ 3117.811766] ? f2fs_remove_orphan_inode+0x10/0x10 [ 3117.815947] ? f2fs_attr_store+0x50/0x50 [ 3117.820087] ? proc_create_single_data+0x52/0x60 [ 3117.824262] f2fs_fill_super+0x1d06/0x2b40 [ 3117.828367] ? f2fs_commit_super+0x1a0/0x1a0 [ 3117.832432] ? sget_userns+0x65e/0x690 [ 3117.836500] ? set_blocksize+0x88/0x130 [ 3117.840501] ? f2fs_commit_super+0x1a0/0x1a0 [ 3117.844420] mount_bdev+0x1c0/0x200 [ 3117.848275] mount_fs+0x5c/0x190 [ 3117.852053] vfs_kern_mount+0x64/0x190 [ 3117.855810] do_mount+0x2e4/0x1450 [ 3117.859441] ? lockref_put_return+0x130/0x130 [ 3117.862996] ? copy_mount_string+0x20/0x20 [ 3117.866417] ? kasan_unpoison_shadow+0x31/0x40 [ 3117.869719] ? kasan_kmalloc+0xa6/0xd0 [ 3117.872948] ? memcg_kmem_put_cache+0x16/0x90 [ 3117.876121] ? __kmalloc_track_caller+0x196/0x210 [ 3117.879333] ? _copy_from_user+0x61/0x90 [ 3117.882467] ? memdup_user+0x3e/0x60 [ 3117.885604] ksys_mount+0x7e/0xd0 [ 3117.888700] __x64_sys_mount+0x62/0x70 [ 3117.891742] do_syscall_64+0x73/0x160 [ 3117.894692] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 3117.897669] RIP: 0033:0x7f5693f14b9a [ 3117.900563] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48 [ 3117.906922] RSP: 002b:00007fff27346488 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 [ 3117.910159] RAX: ffffffffffffffda RBX: 00000000016e2030 RCX: 00007f5693f14b9a [ 3117.913469] RDX: 00000000016e2210 RSI: 00000000016e3f30 RDI: 00000000016ee040 [ 3117.916764] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013 [ 3117.920071] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 00000000016ee040 [ 3117.923393] R13: 00000000016e2210 R14: 0000000000000000 R15: 0000000000000003 [ 3117.926680] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer joydev input_leds serio_raw snd soundcore mac_hid i2c_piix4 ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi btrfs zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear 8139too qxl ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel psmouse aes_x86_64 8139cp crypto_simd cryptd mii glue_helper pata_acpi floppy [ 3117.949979] CR2: 0000000000000000 [ 3117.954283] ---[ end trace a8e0d899985faf32 ]--- [ 3117.958575] RIP: 0010:__remove_dirty_segment+0xe2/0x1e0 [ 3117.962810] Code: c4 48 89 c7 e8 cf bb d7 ff 45 0f b6 24 24 41 83 e4 3f 44 88 64 24 07 41 83 e4 3f 4a 8d 7c e3 08 e8 b3 bc d7 ff 4a 8b 4c e3 08 <f0> 4c 0f b3 29 0f 82 94 00 00 00 48 8d bd 20 04 00 00 e8 97 bb d7 [ 3117.971789] RSP: 0018:ffff88018eb67638 EFLAGS: 00010292 [ 3117.976333] RAX: 0000000000000000 RBX: ffff88018f0a6300 RCX: 0000000000000000 [ 3117.980926] RDX: 0000000000000000 RSI: 0000000000000297 RDI: 0000000000000297 [ 3117.985497] RBP: ffff88018ebe9980 R08: ffffed003e743ebb R09: ffffed003e743ebb [ 3117.990098] R10: 0000000000000001 R11: ffffed003e743eba R12: 0000000000000019 [ 3117.994761] R13: 0000000000000014 R14: 0000000000000320 R15: ffff88018ebe99e0 [ 3117.999392] FS: 00007f5694636840(0000) GS:ffff8801f3b00000(0000) knlGS:0000000000000000 [ 3118.004096] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3118.008816] CR2: 00007fe89bb1a000 CR3: 0000000191c22000 CR4: 00000000000006e0 - Location https://elixir.bootlin.com/linux/v4.18-rc3/source/fs/f2fs/segment.c#L775 if (test_and_clear_bit(segno, dirty_i->dirty_segmap[t])) dirty_i->nr_dirty[t]--; Here dirty_i->dirty_segmap[t] can be NULL which leads to crash in test_and_clear_bit() Reported-by Wen Xu <wen.xu@gatech.edu> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-01f2fs: fix to restrict mount condition when without CONFIG_QUOTAChao Yu1-0/+6
Like quota_ino feature, we need to reject mounting RDWR with image which enables project_quota feature when there is no CONFIG_QUOTA be set in kernel. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-01f2fs: quota: do not mount as RDWR without QUOTA if quota feature enabledSheng Yong1-6/+7
If quota feature is enabled, quota is on by default. However, if CONFIG_QUOTA is not built in kernel, dquot entries will not get updated, which leads to quota inconsistency. Signed-off-by: Sheng Yong <shengyong1@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-01f2fs: quota: fix incorrect commentsSheng Yong1-4/+1
Signed-off-by: Sheng Yong <shengyong1@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-01f2fs: quota: decrease the lock granularity of statfs_projectSheng Yong1-2/+2
According to fs/quota/dquot.c, `dq_data_lock' protects mem_dqinfo structures and modifications of dquot pointers in the inode, and `dquot->dq_dqb_lock' protects data from dq_dqb. We should use dquot->dq_dqb_lock in statfs_project instead of dq_dat_lock. Signed-off-by: Sheng Yong <shengyong1@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-08-01f2fs: enable real-time discard by defaultChao Yu1-4/+4
f2fs is focused on flash based storage, so let's enable real-time discard by default, if user don't want to enable it, 'nodiscard' mount option should be used on mount. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-07-29f2fs: show the fsync_mode=nobarrier mount optionSahitya Tummala1-0/+2
This patch shows the fsync_mode=nobarrier mount option in f2fs_show_options(). Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-07-29f2fs: avoid the global name 'fault_name'Gao Xiang1-1/+1
Non-prefix global name 'fault_name' will pollute global namespace, fix it. Refer to: https://lists.01.org/pipermail/kbuild-all/2018-June/049660.html To: Jaegeuk Kim <jaegeuk@kernel.org> To: Chao Yu <yuchao0@huawei.com> Cc: linux-f2fs-devel@lists.sourceforge.net Cc: linux-kernel@vger.kernel.org Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Gao Xiang <gaoxiang25@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-07-29f2fs: fix to do sanity check with user_block_countChao Yu1-1/+12
This patch fixs to do sanity check with user_block_count. - Overview Divide zero in utilization when mount() a corrupted f2fs image - Reproduce (4.18 upstream kernel) - Kernel message [ 564.099503] F2FS-fs (loop0): invalid crc value [ 564.101991] divide error: 0000 [#1] SMP KASAN PTI [ 564.103103] CPU: 1 PID: 1298 Comm: f2fs_discard-7: Not tainted 4.18.0-rc1+ #4 [ 564.104584] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 564.106624] RIP: 0010:issue_discard_thread+0x248/0x5c0 [ 564.107692] Code: ff ff 48 8b bd e8 fe ff ff 41 8b 9d 4c 04 00 00 e8 cd b8 ad ff 41 8b 85 50 04 00 00 31 d2 48 8d 04 80 48 8d 04 80 48 c1 e0 02 <48> f7 f3 83 f8 50 7e 16 41 c7 86 7c ff ff ff 01 00 00 00 41 c7 86 [ 564.111686] RSP: 0018:ffff8801f3117dc0 EFLAGS: 00010206 [ 564.112775] RAX: 0000000000000384 RBX: 0000000000000000 RCX: ffffffffb88c1e03 [ 564.114250] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e3aa4850 [ 564.115706] RBP: ffff8801f3117f00 R08: 1ffffffff751a1d0 R09: fffffbfff751a1d0 [ 564.117177] R10: 0000000000000001 R11: fffffbfff751a1d0 R12: 00000000fffffffc [ 564.118634] R13: ffff8801e3aa4400 R14: ffff8801f3117ed8 R15: ffff8801e2050000 [ 564.120094] FS: 0000000000000000(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000 [ 564.121748] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 564.122923] CR2: 000000000202b078 CR3: 00000001f11ac000 CR4: 00000000000006e0 [ 564.124383] Call Trace: [ 564.124924] ? __issue_discard_cmd+0x480/0x480 [ 564.125882] ? __sched_text_start+0x8/0x8 [ 564.126756] ? __kthread_parkme+0xcb/0x100 [ 564.127620] ? kthread_blkcg+0x70/0x70 [ 564.128412] kthread+0x180/0x1d0 [ 564.129105] ? __issue_discard_cmd+0x480/0x480 [ 564.130029] ? kthread_associate_blkcg+0x150/0x150 [ 564.131033] ret_from_fork+0x35/0x40 [ 564.131794] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy [ 564.141798] ---[ end trace 4ce02f25ff7d3df5 ]--- [ 564.142773] RIP: 0010:issue_discard_thread+0x248/0x5c0 [ 564.143885] Code: ff ff 48 8b bd e8 fe ff ff 41 8b 9d 4c 04 00 00 e8 cd b8 ad ff 41 8b 85 50 04 00 00 31 d2 48 8d 04 80 48 8d 04 80 48 c1 e0 02 <48> f7 f3 83 f8 50 7e 16 41 c7 86 7c ff ff ff 01 00 00 00 41 c7 86 [ 564.147776] RSP: 0018:ffff8801f3117dc0 EFLAGS: 00010206 [ 564.148856] RAX: 0000000000000384 RBX: 0000000000000000 RCX: ffffffffb88c1e03 [ 564.150424] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e3aa4850 [ 564.151906] RBP: ffff8801f3117f00 R08: 1ffffffff751a1d0 R09: fffffbfff751a1d0 [ 564.153463] R10: 0000000000000001 R11: fffffbfff751a1d0 R12: 00000000fffffffc [ 564.154915] R13: ffff8801e3aa4400 R14: ffff8801f3117ed8 R15: ffff8801e2050000 [ 564.156405] FS: 0000000000000000(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000 [ 564.158070] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 564.159279] CR2: 000000000202b078 CR3: 00000001f11ac000 CR4: 00000000000006e0 [ 564.161043] ================================================================== [ 564.162587] BUG: KASAN: stack-out-of-bounds in from_kuid_munged+0x1d/0x50 [ 564.163994] Read of size 4 at addr ffff8801f3117c84 by task f2fs_discard-7:/1298 [ 564.165852] CPU: 1 PID: 1298 Comm: f2fs_discard-7: Tainted: G D 4.18.0-rc1+ #4 [ 564.167593] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 564.169522] Call Trace: [ 564.170057] dump_stack+0x7b/0xb5 [ 564.170778] print_address_description+0x70/0x290 [ 564.171765] kasan_report+0x291/0x390 [ 564.172540] ? from_kuid_munged+0x1d/0x50 [ 564.173408] __asan_load4+0x78/0x80 [ 564.174148] from_kuid_munged+0x1d/0x50 [ 564.174962] do_notify_parent+0x1f5/0x4f0 [ 564.175808] ? send_sigqueue+0x390/0x390 [ 564.176639] ? css_set_move_task+0x152/0x340 [ 564.184197] do_exit+0x1290/0x1390 [ 564.184950] ? __issue_discard_cmd+0x480/0x480 [ 564.185884] ? mm_update_next_owner+0x380/0x380 [ 564.186829] ? __sched_text_start+0x8/0x8 [ 564.187672] ? __kthread_parkme+0xcb/0x100 [ 564.188528] ? kthread_blkcg+0x70/0x70 [ 564.189333] ? kthread+0x180/0x1d0 [ 564.190052] ? __issue_discard_cmd+0x480/0x480 [ 564.190983] rewind_stack_do_exit+0x17/0x20 [ 564.192190] The buggy address belongs to the page: [ 564.193213] page:ffffea0007cc45c0 count:0 mapcount:0 mapping:0000000000000000 index:0x0 [ 564.194856] flags: 0x2ffff0000000000() [ 564.195644] raw: 02ffff0000000000 0000000000000000 dead000000000200 0000000000000000 [ 564.197247] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 [ 564.198826] page dumped because: kasan: bad access detected [ 564.200299] Memory state around the buggy address: [ 564.201306] ffff8801f3117b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 564.202779] ffff8801f3117c00: 00 00 00 00 00 00 00 00 00 00 00 f3 f3 f3 f3 f3 [ 564.204252] >ffff8801f3117c80: f3 f3 f3 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 [ 564.205742] ^ [ 564.206424] ffff8801f3117d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 564.207908] ffff8801f3117d80: f3 f3 f3 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00 [ 564.209389] ================================================================== [ 564.231795] F2FS-fs (loop0): Mounted with checkpoint version = 2 - Location https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/f2fs/segment.h#L586 return div_u64((u64)valid_user_blocks(sbi) * 100, sbi->user_block_count); Missing checks on sbi->user_block_count. Reported-by: Wen Xu <wen.xu@gatech.edu> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-07-27f2fs: fix to do sanity check with {sit,nat}_ver_bitmap_bytesizeChao Yu1-2/+19
This patch adds to do sanity check with {sit,nat}_ver_bitmap_bytesize during mount, in order to avoid accessing across cache boundary with this abnormal bitmap size. - Overview buffer overrun in build_sit_info() when mounting a crafted f2fs image - Reproduce - Kernel message [ 548.580867] F2FS-fs (loop0): Invalid log blocks per segment (8201) [ 548.580877] F2FS-fs (loop0): Can't find valid F2FS filesystem in 1th superblock [ 548.584979] ================================================================== [ 548.586568] BUG: KASAN: use-after-free in kmemdup+0x36/0x50 [ 548.587715] Read of size 64 at addr ffff8801e9c265ff by task mount/1295 [ 548.589428] CPU: 1 PID: 1295 Comm: mount Not tainted 4.18.0-rc1+ #4 [ 548.589432] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 548.589438] Call Trace: [ 548.589474] dump_stack+0x7b/0xb5 [ 548.589487] print_address_description+0x70/0x290 [ 548.589492] kasan_report+0x291/0x390 [ 548.589496] ? kmemdup+0x36/0x50 [ 548.589509] check_memory_region+0x139/0x190 [ 548.589514] memcpy+0x23/0x50 [ 548.589518] kmemdup+0x36/0x50 [ 548.589545] f2fs_build_segment_manager+0x8fa/0x3410 [ 548.589551] ? __asan_loadN+0xf/0x20 [ 548.589560] ? f2fs_sanity_check_ckpt+0x1be/0x240 [ 548.589566] ? f2fs_flush_sit_entries+0x10c0/0x10c0 [ 548.589587] ? __put_user_ns+0x40/0x40 [ 548.589604] ? find_next_bit+0x57/0x90 [ 548.589610] f2fs_fill_super+0x194b/0x2b40 [ 548.589617] ? f2fs_commit_super+0x1b0/0x1b0 [ 548.589637] ? set_blocksize+0x90/0x140 [ 548.589651] mount_bdev+0x1c5/0x210 [ 548.589655] ? f2fs_commit_super+0x1b0/0x1b0 [ 548.589667] f2fs_mount+0x15/0x20 [ 548.589672] mount_fs+0x60/0x1a0 [ 548.589683] ? alloc_vfsmnt+0x309/0x360 [ 548.589688] vfs_kern_mount+0x6b/0x1a0 [ 548.589699] do_mount+0x34a/0x18c0 [ 548.589710] ? lockref_put_or_lock+0xcf/0x160 [ 548.589716] ? copy_mount_string+0x20/0x20 [ 548.589728] ? memcg_kmem_put_cache+0x1b/0xa0 [ 548.589734] ? kasan_check_write+0x14/0x20 [ 548.589740] ? _copy_from_user+0x6a/0x90 [ 548.589744] ? memdup_user+0x42/0x60 [ 548.589750] ksys_mount+0x83/0xd0 [ 548.589755] __x64_sys_mount+0x67/0x80 [ 548.589781] do_syscall_64+0x78/0x170 [ 548.589797] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 548.589820] RIP: 0033:0x7f76fc331b9a [ 548.589821] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48 [ 548.589880] RSP: 002b:00007ffd4f0a0e48 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 [ 548.589890] RAX: ffffffffffffffda RBX: 000000000146c030 RCX: 00007f76fc331b9a [ 548.589892] RDX: 000000000146c210 RSI: 000000000146df30 RDI: 0000000001474ec0 [ 548.589895] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013 [ 548.589897] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000000001474ec0 [ 548.589900] R13: 000000000146c210 R14: 0000000000000000 R15: 0000000000000003 [ 548.590242] The buggy address belongs to the page: [ 548.591243] page:ffffea0007a70980 count:0 mapcount:0 mapping:0000000000000000 index:0x0 [ 548.592886] flags: 0x2ffff0000000000() [ 548.593665] raw: 02ffff0000000000 dead000000000100 dead000000000200 0000000000000000 [ 548.595258] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 [ 548.603713] page dumped because: kasan: bad access detected [ 548.605203] Memory state around the buggy address: [ 548.606198] ffff8801e9c26480: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 548.607676] ffff8801e9c26500: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 548.609157] >ffff8801e9c26580: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 548.610629] ^ [ 548.612088] ffff8801e9c26600: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 548.613674] ffff8801e9c26680: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 548.615141] ================================================================== [ 548.616613] Disabling lock debugging due to kernel taint [ 548.622871] WARNING: CPU: 1 PID: 1295 at mm/page_alloc.c:4065 __alloc_pages_slowpath+0xe4a/0x1420 [ 548.622878] Modules linked in: snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd mac_hid i2c_piix4 soundcore ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 multipath linear 8139too crct10dif_pclmul crc32_pclmul qxl drm_kms_helper syscopyarea aesni_intel sysfillrect sysimgblt fb_sys_fops ttm drm aes_x86_64 crypto_simd cryptd 8139cp glue_helper mii pata_acpi floppy [ 548.623217] CPU: 1 PID: 1295 Comm: mount Tainted: G B 4.18.0-rc1+ #4 [ 548.623219] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 548.623226] RIP: 0010:__alloc_pages_slowpath+0xe4a/0x1420 [ 548.623227] Code: ff ff 01 89 85 c8 fe ff ff e9 91 fc ff ff 41 89 c5 e9 5c fc ff ff 0f 0b 89 f8 25 ff ff f7 ff 89 85 8c fe ff ff e9 d5 f2 ff ff <0f> 0b e9 65 f2 ff ff 65 8b 05 38 81 d2 47 f6 c4 01 74 1c 65 48 8b [ 548.623281] RSP: 0018:ffff8801f28c7678 EFLAGS: 00010246 [ 548.623284] RAX: 0000000000000000 RBX: 00000000006040c0 RCX: ffffffffb82f73b7 [ 548.623287] RDX: 1ffff1003e518eeb RSI: 000000000000000c RDI: 0000000000000000 [ 548.623290] RBP: ffff8801f28c7880 R08: 0000000000000000 R09: ffffed0047fff2c5 [ 548.623292] R10: 0000000000000001 R11: ffffed0047fff2c4 R12: ffff8801e88de040 [ 548.623295] R13: 00000000006040c0 R14: 000000000000000c R15: ffff8801f28c7938 [ 548.623299] FS: 00007f76fca51840(0000) GS:ffff8801f6f00000(0000) knlGS:0000000000000000 [ 548.623302] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 548.623304] CR2: 00007f19b9171760 CR3: 00000001ed952000 CR4: 00000000000006e0 [ 548.623317] Call Trace: [ 548.623325] ? kasan_check_read+0x11/0x20 [ 548.623330] ? __zone_watermark_ok+0x92/0x240 [ 548.623336] ? get_page_from_freelist+0x1c3/0x1d90 [ 548.623347] ? _raw_spin_lock_irqsave+0x2a/0x60 [ 548.623353] ? warn_alloc+0x250/0x250 [ 548.623358] ? save_stack+0x46/0xd0 [ 548.623361] ? kasan_kmalloc+0xad/0xe0 [ 548.623366] ? __isolate_free_page+0x2a0/0x2a0 [ 548.623370] ? mount_fs+0x60/0x1a0 [ 548.623374] ? vfs_kern_mount+0x6b/0x1a0 [ 548.623378] ? do_mount+0x34a/0x18c0 [ 548.623383] ? ksys_mount+0x83/0xd0 [ 548.623387] ? __x64_sys_mount+0x67/0x80 [ 548.623391] ? do_syscall_64+0x78/0x170 [ 548.623396] ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 548.623401] __alloc_pages_nodemask+0x3c5/0x400 [ 548.623407] ? __alloc_pages_slowpath+0x1420/0x1420 [ 548.623412] ? __mutex_lock_slowpath+0x20/0x20 [ 548.623417] ? kvmalloc_node+0x31/0x80 [ 548.623424] alloc_pages_current+0x75/0x110 [ 548.623436] kmalloc_order+0x24/0x60 [ 548.623442] kmalloc_order_trace+0x24/0xb0 [ 548.623448] __kmalloc_track_caller+0x207/0x220 [ 548.623455] ? f2fs_build_node_manager+0x399/0xbb0 [ 548.623460] kmemdup+0x20/0x50 [ 548.623465] f2fs_build_node_manager+0x399/0xbb0 [ 548.623470] f2fs_fill_super+0x195e/0x2b40 [ 548.623477] ? f2fs_commit_super+0x1b0/0x1b0 [ 548.623481] ? set_blocksize+0x90/0x140 [ 548.623486] mount_bdev+0x1c5/0x210 [ 548.623489] ? f2fs_commit_super+0x1b0/0x1b0 [ 548.623495] f2fs_mount+0x15/0x20 [ 548.623498] mount_fs+0x60/0x1a0 [ 548.623503] ? alloc_vfsmnt+0x309/0x360 [ 548.623508] vfs_kern_mount+0x6b/0x1a0 [ 548.623513] do_mount+0x34a/0x18c0 [ 548.623518] ? lockref_put_or_lock+0xcf/0x160 [ 548.623523] ? copy_mount_string+0x20/0x20 [ 548.623528] ? memcg_kmem_put_cache+0x1b/0xa0 [ 548.623533] ? kasan_check_write+0x14/0x20 [ 548.623537] ? _copy_from_user+0x6a/0x90 [ 548.623542] ? memdup_user+0x42/0x60 [ 548.623547] ksys_mount+0x83/0xd0 [ 548.623552] __x64_sys_mount+0x67/0x80 [ 548.623557] do_syscall_64+0x78/0x170 [ 548.623562] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 548.623566] RIP: 0033:0x7f76fc331b9a [ 548.623567] Code: 48 8b 0d 01 c3 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ce c2 2b 00 f7 d8 64 89 01 48 [ 548.623632] RSP: 002b:00007ffd4f0a0e48 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 [ 548.623636] RAX: ffffffffffffffda RBX: 000000000146c030 RCX: 00007f76fc331b9a [ 548.623639] RDX: 000000000146c210 RSI: 000000000146df30 RDI: 0000000001474ec0 [ 548.623641] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013 [ 548.623643] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000000001474ec0 [ 548.623646] R13: 000000000146c210 R14: 0000000000000000 R15: 0000000000000003 [ 548.623650] ---[ end trace 4ce02f25ff7d3df5 ]--- [ 548.623656] F2FS-fs (loop0): Failed to initialize F2FS node manager [ 548.627936] F2FS-fs (loop0): Invalid log blocks per segment (8201) [ 548.627940] F2FS-fs (loop0): Can't find valid F2FS filesystem in 1th superblock [ 548.635835] F2FS-fs (loop0): Failed to initialize F2FS node manager - Location https://elixir.bootlin.com/linux/v4.18-rc1/source/fs/f2fs/segment.c#L3578 sit_i->sit_bitmap = kmemdup(src_bitmap, bitmap_size, GFP_KERNEL); Buffer overrun happens when doing memcpy. I suspect there is missing (inconsistent) checks on bitmap_size. Reported by Wen Xu (wen.xu@gatech.edu) from SSLab, Gatech. Reported-by: Wen Xu <wen.xu@gatech.edu> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-07-27f2fs: fix to do sanity check with secs_per_zoneChao Yu1-2/+2
As Wen Xu reported in below link: https://bugzilla.kernel.org/show_bug.cgi?id=200183 - Overview Divide zero in reset_curseg() when mounting a crafted f2fs image - Reproduce - Kernel message [ 588.281510] divide error: 0000 [#1] SMP KASAN PTI [ 588.282701] CPU: 0 PID: 1293 Comm: mount Not tainted 4.18.0-rc1+ #4 [ 588.284000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 588.286178] RIP: 0010:reset_curseg+0x94/0x1a0 [ 588.298166] RSP: 0018:ffff8801e88d7940 EFLAGS: 00010246 [ 588.299360] RAX: 0000000000000014 RBX: ffff8801e1d46d00 RCX: ffffffffb88bf60b [ 588.300809] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e1d46d64 [ 588.305272] R13: 0000000000000000 R14: 0000000000000014 R15: 0000000000000000 [ 588.306822] FS: 00007fad85008840(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000 [ 588.308456] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 588.309623] CR2: 0000000001705078 CR3: 00000001f30f8000 CR4: 00000000000006f0 [ 588.311085] Call Trace: [ 588.311637] f2fs_build_segment_manager+0x103f/0x3410 [ 588.316136] ? f2fs_commit_super+0x1b0/0x1b0 [ 588.317031] ? set_blocksize+0x90/0x140 [ 588.319473] f2fs_mount+0x15/0x20 [ 588.320166] mount_fs+0x60/0x1a0 [ 588.320847] ? alloc_vfsmnt+0x309/0x360 [ 588.321647] vfs_kern_mount+0x6b/0x1a0 [ 588.322432] do_mount+0x34a/0x18c0 [ 588.323175] ? strndup_user+0x46/0x70 [ 588.323937] ? copy_mount_string+0x20/0x20 [ 588.324793] ? memcg_kmem_put_cache+0x1b/0xa0 [ 588.325702] ? kasan_check_write+0x14/0x20 [ 588.326562] ? _copy_from_user+0x6a/0x90 [ 588.327375] ? memdup_user+0x42/0x60 [ 588.328118] ksys_mount+0x83/0xd0 [ 588.328808] __x64_sys_mount+0x67/0x80 [ 588.329607] do_syscall_64+0x78/0x170 [ 588.330400] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 588.331461] RIP: 0033:0x7fad848e8b9a [ 588.336022] RSP: 002b:00007ffd7c5b6be8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 [ 588.337547] RAX: ffffffffffffffda RBX: 00000000016f8030 RCX: 00007fad848e8b9a [ 588.338999] RDX: 00000000016f8210 RSI: 00000000016f9f30 RDI: 0000000001700ec0 [ 588.340442] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000013 [ 588.341887] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000000001700ec0 [ 588.343341] R13: 00000000016f8210 R14: 0000000000000000 R15: 0000000000000003 [ 588.354891] ---[ end trace 4ce02f25ff7d3df5 ]--- [ 588.355862] RIP: 0010:reset_curseg+0x94/0x1a0 [ 588.360742] RSP: 0018:ffff8801e88d7940 EFLAGS: 00010246 [ 588.361812] RAX: 0000000000000014 RBX: ffff8801e1d46d00 RCX: ffffffffb88bf60b [ 588.363485] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffff8801e1d46d64 [ 588.365213] RBP: ffff8801e88d7968 R08: ffffed003c32266f R09: ffffed003c32266f [ 588.366661] R10: 0000000000000001 R11: ffffed003c32266e R12: ffff8801f0337700 [ 588.368110] R13: 0000000000000000 R14: 0000000000000014 R15: 0000000000000000 [ 588.370057] FS: 00007fad85008840(0000) GS:ffff8801f6e00000(0000) knlGS:0000000000000000 [ 588.372099] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 588.373291] CR2: 0000000001705078 CR3: 00000001f30f8000 CR4: 00000000000006f0 - Location https://elixir.bootlin.com/linux/latest/source/fs/f2fs/segment.c#L2147 curseg->zone = GET_ZONE_FROM_SEG(sbi, curseg->segno); If secs_per_zone is corrupted due to fuzzing test, it will cause divide zero operation when using GET_ZONE_FROM_SEG macro, so we should do more sanity check with secs_per_zone during mount to avoid this issue. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-07-27f2fs: fix error path of fill_superChao Yu1-1/+1
In fill_super, if root inode's attribute is incorrect, we need to call f2fs_destroy_stats to release stats memory. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-07-27f2fs: relocate readdir_ra configure initializationChao Yu1-1/+2
readdir_ra is sysfs configuration instead of mount option, so it should not be initialized in default_options(), otherwise after remount, it can be reset to be enabled which may not as user wish, so let's move it to f2fs_tuning_parameters(). Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-07-27f2fs: move s_res{u,g}id initialization to default_options()Chao Yu1-3/+2
Let default_options() initialize s_res{u,g}id with default value like other options. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-07-27f2fs: indicate shutdown f2fs to allow unmount successfullyJaegeuk Kim1-4/+1
Once we shutdown f2fs, we have to flush stale pages in order to unmount the system. In order to make stable, we need to stop fault injection as well. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-07-18block: Define and use STAT_READ and STAT_WRITEMichael Callahan1-1/+2
Add defines for STAT_READ and STAT_WRITE for indexing the partition stat entries. This clarifies some fs/ code which has hardcoded 1 for STAT_WRITE and will make it easier to extend the stats with additional fields. tj: Refreshed on top of v4.17. Signed-off-by: Michael Callahan <michaelcallahan@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-07-15f2fs: do checkpoint in kill_sbJaegeuk Kim1-3/+13
When unmounting f2fs in force mode, we can get it stuck by io_schedule() by some pending IOs in meta_inode. io_schedule+0xd/0x30 wait_on_page_bit_common+0xc6/0x130 __filemap_fdatawait_range+0xbd/0x100 filemap_fdatawait_keep_errors+0x15/0x40 sync_inodes_sb+0x1cf/0x240 sync_filesystem+0x52/0x90 generic_shutdown_super+0x1d/0x110 kill_f2fs_super+0x28/0x80 [f2fs] deactivate_locked_super+0x35/0x60 cleanup_mnt+0x36/0x70 task_work_run+0x79/0xa0 exit_to_usermode_loop+0x62/0x70 do_syscall_64+0xdb/0xf0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 0xffffffffffffffff Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-06-13treewide: Use array_size() in f2fs_kzalloc()Kees Cook1-4/+8
The f2fs_kzalloc() function has no 2-factor argument form, so multiplication factors need to be wrapped in array_size(). This patch replaces cases of: f2fs_kzalloc(handle, a * b, gfp) with: f2fs_kzalloc(handle, array_size(a, b), gfp) as well as handling cases of: f2fs_kzalloc(handle, a * b * c, gfp) with: f2fs_kzalloc(handle, array3_size(a, b, c), gfp) This does, however, attempt to ignore constant size factors like: f2fs_kzalloc(handle, 4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ expression HANDLE; type TYPE; expression THING, E; @@ ( f2fs_kzalloc(HANDLE, - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | f2fs_kzalloc(HANDLE, - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression HANDLE; expression COUNT; typedef u8; typedef __u8; @@ ( f2fs_kzalloc(HANDLE, - sizeof(u8) * (COUNT) + COUNT , ...) | f2fs_kzalloc(HANDLE, - sizeof(__u8) * (COUNT) + COUNT , ...) | f2fs_kzalloc(HANDLE, - sizeof(char) * (COUNT) + COUNT , ...) | f2fs_kzalloc(HANDLE, - sizeof(unsigned char) * (COUNT) + COUNT , ...) | f2fs_kzalloc(HANDLE, - sizeof(u8) * COUNT + COUNT , ...) | f2fs_kzalloc(HANDLE, - sizeof(__u8) * COUNT + COUNT , ...) | f2fs_kzalloc(HANDLE, - sizeof(char) * COUNT + COUNT , ...) | f2fs_kzalloc(HANDLE, - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ expression HANDLE; type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( f2fs_kzalloc(HANDLE, - sizeof(TYPE) * (COUNT_ID) + array_size(COUNT_ID, sizeof(TYPE)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(TYPE) * COUNT_ID + array_size(COUNT_ID, sizeof(TYPE)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(TYPE) * (COUNT_CONST) + array_size(COUNT_CONST, sizeof(TYPE)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(TYPE) * COUNT_CONST + array_size(COUNT_CONST, sizeof(TYPE)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(THING) * (COUNT_ID) + array_size(COUNT_ID, sizeof(THING)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(THING) * COUNT_ID + array_size(COUNT_ID, sizeof(THING)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(THING) * (COUNT_CONST) + array_size(COUNT_CONST, sizeof(THING)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(THING) * COUNT_CONST + array_size(COUNT_CONST, sizeof(THING)) , ...) ) // 2-factor product, only identifiers. @@ expression HANDLE; identifier SIZE, COUNT; @@ f2fs_kzalloc(HANDLE, - SIZE * COUNT + array_size(COUNT, SIZE) , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression HANDLE; expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( f2fs_kzalloc(HANDLE, - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression HANDLE; expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( f2fs_kzalloc(HANDLE, - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | f2fs_kzalloc(HANDLE, - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ expression HANDLE; identifier STRIDE, SIZE, COUNT; @@ ( f2fs_kzalloc(HANDLE, - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | f2fs_kzalloc(HANDLE, - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | f2fs_kzalloc(HANDLE, - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | f2fs_kzalloc(HANDLE, - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | f2fs_kzalloc(HANDLE, - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | f2fs_kzalloc(HANDLE, - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | f2fs_kzalloc(HANDLE, - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | f2fs_kzalloc(HANDLE, - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products // when they're not all constants... @@ expression HANDLE; expression E1, E2, E3; constant C1, C2, C3; @@ ( f2fs_kzalloc(HANDLE, C1 * C2 * C3, ...) | f2fs_kzalloc(HANDLE, - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants. @@ expression HANDLE; expression E1, E2; constant C1, C2; @@ ( f2fs_kzalloc(HANDLE, C1 * C2, ...) | f2fs_kzalloc(HANDLE, - E1 * E2 + array_size(E1, E2) , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-13treewide: Use array_size() in f2fs_kmalloc()Kees Cook1-3/+5
The f2fs_kmalloc() function has no 2-factor argument form, so multiplication factors need to be wrapped in array_size(). This patch replaces cases of: f2fs_kmalloc(handle, a * b, gfp) with: f2fs_kmalloc(handle, array_size(a, b), gfp) as well as handling cases of: f2fs_kmalloc(handle, a * b * c, gfp) with: f2fs_kmalloc(handle, array3_size(a, b, c), gfp) This does, however, attempt to ignore constant size factors like: f2fs_kmalloc(handle, 4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ expression HANDLE; type TYPE; expression THING, E; @@ ( f2fs_kmalloc(HANDLE, - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) | f2fs_kmalloc(HANDLE, - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression HANDLE; expression COUNT; typedef u8; typedef __u8; @@ ( f2fs_kmalloc(HANDLE, - sizeof(u8) * (COUNT) + COUNT , ...) | f2fs_kmalloc(HANDLE, - sizeof(__u8) * (COUNT) + COUNT , ...) | f2fs_kmalloc(HANDLE, - sizeof(char) * (COUNT) + COUNT , ...) | f2fs_kmalloc(HANDLE, - sizeof(unsigned char) * (COUNT) + COUNT , ...) | f2fs_kmalloc(HANDLE, - sizeof(u8) * COUNT + COUNT , ...) | f2fs_kmalloc(HANDLE, - sizeof(__u8) * COUNT + COUNT , ...) | f2fs_kmalloc(HANDLE, - sizeof(char) * COUNT + COUNT , ...) | f2fs_kmalloc(HANDLE, - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ expression HANDLE; type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( f2fs_kmalloc(HANDLE, - sizeof(TYPE) * (COUNT_ID) + array_size(COUNT_ID, sizeof(TYPE)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(TYPE) * COUNT_ID + array_size(COUNT_ID, sizeof(TYPE)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(TYPE) * (COUNT_CONST) + array_size(COUNT_CONST, sizeof(TYPE)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(TYPE) * COUNT_CONST + array_size(COUNT_CONST, sizeof(TYPE)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(THING) * (COUNT_ID) + array_size(COUNT_ID, sizeof(THING)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(THING) * COUNT_ID + array_size(COUNT_ID, sizeof(THING)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(THING) * (COUNT_CONST) + array_size(COUNT_CONST, sizeof(THING)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(THING) * COUNT_CONST + array_size(COUNT_CONST, sizeof(THING)) , ...) ) // 2-factor product, only identifiers. @@ expression HANDLE; identifier SIZE, COUNT; @@ f2fs_kmalloc(HANDLE, - SIZE * COUNT + array_size(COUNT, SIZE) , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression HANDLE; expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( f2fs_kmalloc(HANDLE, - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression HANDLE; expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( f2fs_kmalloc(HANDLE, - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) | f2fs_kmalloc(HANDLE, - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ expression HANDLE; identifier STRIDE, SIZE, COUNT; @@ ( f2fs_kmalloc(HANDLE, - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | f2fs_kmalloc(HANDLE, - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | f2fs_kmalloc(HANDLE, - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | f2fs_kmalloc(HANDLE, - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) | f2fs_kmalloc(HANDLE, - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | f2fs_kmalloc(HANDLE, - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | f2fs_kmalloc(HANDLE, - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) | f2fs_kmalloc(HANDLE, - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products // when they're not all constants... @@ expression HANDLE; expression E1, E2, E3; constant C1, C2, C3; @@ ( f2fs_kmalloc(HANDLE, C1 * C2 * C3, ...) | f2fs_kmalloc(HANDLE, - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants. @@ expression HANDLE; expression E1, E2; constant C1, C2; @@ ( f2fs_kmalloc(HANDLE, C1 * C2, ...) | f2fs_kmalloc(HANDLE, - E1 * E2 + array_size(E1, E2) , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-11Merge tag 'f2fs-for-4.18' of ↵Linus Torvalds1-60/+138
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we've mainly focused on discard, aka unmap, control along with fstrim for Android-specific usage model. In addition, we've fixed writepage flow which returned EAGAIN previously resulting in EIO of fsync(2) due to mapping's error state. In order to avoid old MM bug [1], we decided not to use __GFP_ZERO for the mapping for node and meta page caches. As always, we've cleaned up many places for future fsverity and symbol conflicts. Enhancements: - do discard/fstrim in lower priority considering fs utilization - split large discard commands into smaller ones for better responsiveness - add more sanity checks to address syzbot reports - add a mount option, fsync_mode=nobarrier, which can reduce # of cache flushes - clean up symbol namespace with modified function names - be strict on block allocation and IO control in corner cases Bug fixes: - don't use __GFP_ZERO for mappings - fix error reports in writepage to avoid fsync() failure - avoid selinux denial on CAP_RESOURCE on resgid/resuid - fix some subtle race conditions in GC/atomic writes/shutdown - fix overflow bugs in sanity_check_raw_super - fix missing bits on get_flags Clean-ups: - prepare the generic flow for future fsverity integration - fix some broken coding standard" [1] https://lkml.org/lkml/2018/4/8/661 * tag 'f2fs-for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (79 commits) f2fs: fix to clear FI_VOLATILE_FILE correctly f2fs: let sync node IO interrupt async one f2fs: don't change wbc->sync_mode f2fs: fix to update mtime correctly fs: f2fs: insert space around that ':' and ', ' fs: f2fs: add missing blank lines after declarations fs: f2fs: changed variable type of offset "unsigned" to "loff_t" f2fs: clean up symbol namespace f2fs: make set_de_type() static f2fs: make __f2fs_write_data_pages() static f2fs: fix to avoid accessing cross the boundary f2fs: fix to let caller retry allocating block address disable loading f2fs module on PAGE_SIZE > 4KB f2fs: fix error path of move_data_page f2fs: don't drop dentry pages after fs shutdown f2fs: fix to avoid race during access gc_thread pointer f2fs: clean up with clear_radix_tree_dirty_tag f2fs: fix to don't trigger writeback during recovery f2fs: clear discard_wake earlier f2fs: let discard thread wait a little longer if dev is busy ...
2018-06-05f2fs: let sync node IO interrupt async oneChao Yu1-1/+2
Although mixed sync/async IOs can have continuous LBA, as they have different IO priority, block IO scheduler will add them into different queues and commit them separately, result in splited IOs which causes wrose performance. This patch gives high priority to synchronous IO of nodes, means that once synchronous flow starts, it can interrupt asynchronous writeback flow of system flusher, so more big IOs can be expected. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-05-31f2fs: clean up symbol namespaceChao Yu1-47/+47
As Ted reported: "Hi, I was looking at f2fs's sources recently, and I noticed that there is a very large number of non-static symbols which don't have a f2fs prefix. There's well over a hundred (see attached below). As one example, in fs/f2fs/dir.c there is: unsigned char get_de_type(struct f2fs_dir_entry *de) This function is clearly only useful for f2fs, but it has a generic name. This means that if any other file system tries to have the same symbol name, there will be a symbol conflict and the kernel would not successfully build. It also means that when someone is looking f2fs sources, it's not at all obvious whether a function such as read_data_page(), invalidate_blocks(), is a generic kernel function found in the fs, mm, or block layers, or a f2fs specific function. You might want to fix this at some point. Hopefully Kent's bcachefs isn't similarly using genericly named functions, since that might cause conflicts with f2fs's functions --- but just as this would be a problem that we would rightly insist that Kent fix, this is something that we should have rightly insisted that f2fs should have fixed before it was integrated into the mainline kernel. acquire_orphan_inode add_ino_entry add_orphan_inode allocate_data_block allocate_new_segments alloc_nid alloc_nid_done alloc_nid_failed available_free_memory ...." This patch adds "f2fs_" prefix for all non-static symbols in order to: a) avoid conflict with other kernel generic symbols; b) to indicate the function is f2fs specific one instead of generic one; Reported-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-05-31disable loading f2fs module on PAGE_SIZE > 4KBAnatoly Pugachev1-0/+6
The following patch disables loading of f2fs module on architectures which have PAGE_SIZE > 4096 , since it is impossible to mount f2fs on such architectures , log messages are: mount: /mnt: wrong fs type, bad option, bad superblock on /dev/vdiskb1, missing codepage or helper program, or other error. /dev/vdiskb1: F2FS filesystem, UUID=1d8b9ca4-2389-4910-af3b-10998969f09c, volume name "" May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Invalid page_cache_size (8192), supports only 4KB May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Can't find valid F2FS filesystem in 1th superblock May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Invalid page_cache_size (8192), supports only 4KB May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Can't find valid F2FS filesystem in 2th superblock May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Invalid page_cache_size (8192), supports only 4KB which was introduced by git commit 5c9b469295fb6b10d98923eab5e79c4edb80ed20 tested on git kernel 4.17.0-rc6-00309-gec30dcf7f425 with patch applied: modprobe: ERROR: could not insert 'f2fs': Invalid argument May 28 01:40:28 v215 kernel: F2FS not supported on PAGE_SIZE(8192) != 4096 Signed-off-by: Anatoly Pugachev <matorola@gmail.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-05-31f2fs: keep migration IO order in LFS modeChao Yu1-0/+1
For non-migration IO, we will keep order of data/node blocks' submitting as allocation sequence by sorting IOs in per log io_list list, but for migration IO, it could be out-of-order. In LFS mode, we should keep all IOs including migration IO be ordered, so that this patch fixes to add an additional lock to keep submitting order. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-05-31f2fs: fix to initialize i_current_depth according to inode typeChao Yu1-1/+0
i_current_depth is used only for directory inode, but its space is shared with i_gc_failures field used for regular inode, in order to avoid affecting i_gc_failures' value, this patch fixes to initialize the union's fields according to inode type. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-05-31f2fs: enhance sanity_check_raw_super() to avoid potential overflowsJaegeuk Kim1-7/+68
In order to avoid the below overflow issue, we should have checked the boundaries in superblock before reaching out to allocation. As Linus suggested, the right place should be sanity_check_raw_super(). Dr Silvio Cesare of InfoSect reported: There are integer overflows with using the cp_payload superblock field in the f2fs filesystem potentially leading to memory corruption. include/linux/f2fs_fs.h struct f2fs_super_block { ... __le32 cp_payload; fs/f2fs/f2fs.h typedef u32 block_t; /* * should not change u32, since it is the on-disk block * address format, __le32. */ ... static inline block_t __cp_payload(struct f2fs_sb_info *sbi) { return le32_to_cpu(F2FS_RAW_SUPER(sbi)->cp_payload); } fs/f2fs/checkpoint.c block_t start_blk, orphan_blocks, i, j; ... start_blk = __start_cp_addr(sbi) + 1 + __cp_payload(sbi); orphan_blocks = __start_sum_addr(sbi) - 1 - __cp_payload(sbi); +++ integer overflows ... unsigned int cp_blks = 1 + __cp_payload(sbi); ... sbi->ckpt = kzalloc(cp_blks * blk_size, GFP_KERNEL); +++ integer overflow leading to incorrect heap allocation. int cp_payload_blks = __cp_payload(sbi); ... ckpt->cp_pack_start_sum = cpu_to_le32(1 + cp_payload_blks + orphan_blocks); +++ sign bug and integer overflow ... for (i = 1; i < 1 + cp_payload_blks; i++) +++ integer overflow ... sbi->max_orphans = (sbi->blocks_per_seg - F2FS_CP_PACKS - NR_CURSEG_TYPE - __cp_payload(sbi)) * F2FS_ORPHANS_PER_BLOCK; +++ integer overflow Reported-by: Greg KH <greg@kroah.com> Reported-by: Silvio Cesare <silvio.cesare@gmail.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-05-31f2fs: rename dio_rwsem to i_gc_rwsemChao Yu1-2/+2
RW semphore dio_rwsem in struct f2fs_inode_info is introduced to avoid race between dio and data gc, but now, it is more wildly used to avoid foreground operation vs data gc. So rename it to i_gc_rwsem to improve its readability. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-05-31f2fs: introduce private inode status mappingChao Yu1-2/+2
Previously, we use generic FS_*_FL defined by vfs to indicate inode status for each bit of i_flags, so f2fs's flag status definition is tied to vfs' one, it will be hard for f2fs to reuse bits f2fs never used to indicate new status.. In order to solve this issue, we introduce private inode status mapping, Note, for these bits have already been persisted into disk, we should never change their definition, for other ones, we can remap them for later new coming status. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-05-29f2fs: add fsync_mode=nobarrier for non-atomic filesJaegeuk Kim1-0/+4
For non-atomic files, this patch adds an option to give nobarrier which doesn't issue flush commands to the device. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-05-20fscrypt: make fscrypt_operations.max_namelen an integerEric Biggers1-7/+1
Now ->max_namelen() is only called to limit the filename length when adding NUL padding, and only for real filenames -- not symlink targets. It also didn't give the correct length for symlink targets anyway since it forgot to subtract 'sizeof(struct fscrypt_symlink_data)'. Thus, change ->max_namelen from a function to a simple 'unsigned int' that gives the filesystem's maximum filename length. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2018-05-03f2fs: refactor read path to allow multiple postprocessing stepsEric Biggers1-0/+6
Currently f2fs's ->readpage() and ->readpages() assume that either the data undergoes no postprocessing, or decryption only. But with fs-verity, there will be an additional authenticity verification step, and it may be needed either by itself, or combined with decryption. To support this, store a 'struct bio_post_read_ctx' in ->bi_private which contains a work struct, a bitmask of postprocessing steps that are enabled, and an indicator of the current step. The bio completion routine, if there was no I/O error, enqueues the first postprocessing step. When that completes, it continues to the next step. Pages that fail any postprocessing step have PageError set. Once all steps have completed, pages without PageError set are set Uptodate, and all pages are unlocked. Also replace f2fs_encrypted_file() with a new function f2fs_post_read_required() in places like direct I/O and garbage collection that really should be testing whether the file needs special I/O processing, not whether it is encrypted specifically. This may also be useful for other future f2fs features such as compression. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-28f2fs: no need to initialize zero value for GFP_F2FS_ZEROYunlong Song1-5/+0
Since f2fs_inode_info is allocated with flag GFP_F2FS_ZERO, so we do not need to initialize zero value for its member any more. Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-19f2fs: Set GF_NOFS in read_cache_page_gfp while doing f2fs_quota_readRitesh Harjani1-1/+1
Quota code itself is serializing the operations by taking mutex_lock. It seems a below deadlock can happen if GF_NOFS is not used in f2fs_quota_read __switch_to+0x88 __schedule+0x5b0 schedule+0x78 schedule_preempt_disabled+0x20 __mutex_lock_slowpath+0xdc //mutex owner is itself mutex_lock+0x2c dquot_commit+0x30 //mutex_lock(&dqopt->dqio_mutex); dqput+0xe0 __dquot_drop+0x80 dquot_drop+0x48 f2fs_evict_inode+0x218 evict+0xa8 dispose_list+0x3c prune_icache_sb+0x58 super_cache_scan+0xf4 do_shrink_slab+0x208 shrink_slab.part.40+0xac shrink_zone+0x1b0 do_try_to_free_pages+0x25c try_to_free_pages+0x164 __alloc_pages_nodemask+0x534 do_read_cache_page+0x6c read_cache_page+0x14 f2fs_quota_read+0xa4 read_blk+0x54 find_tree_dqentry+0xe4 find_tree_dqentry+0xb8 find_tree_dqentry+0xb8 find_tree_dqentry+0xb8 qtree_read_dquot+0x68 v2_read_dquot+0x24 dquot_acquire+0x5c // mutex_lock(&dqopt->dqio_mutex); dqget+0x238 __dquot_initialize+0xd4 dquot_initialize+0x10 dquot_file_open+0x34 f2fs_file_open+0x6c do_dentry_open+0x1e4 vfs_open+0x6c path_openat+0xa20 do_filp_open+0x4c do_sys_open+0x178 Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-17f2fs: introduce a new mount option test_dummy_encryptionSheng Yong1-0/+28
This patch introduces a new mount option `test_dummy_encryption' to allow fscrypt to create a fake fscrypt context. This is used by xfstests. Signed-off-by: Sheng Yong <shengyong1@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-17f2fs: introduce F2FS_FEATURE_LOST_FOUND featureSheng Yong1-0/+12
This patch introduces a new feature, F2FS_FEATURE_LOST_FOUND, which is set by mkfs. mkfs creates a directory named lost+found, which saves unreachable files. If fsck finds a file which has no parent, or its parent is removed by fsck, the file will be placed under lost+found directory by fsck. lost+found directory could not be encrypted. As a result, the root directory cannot be encrypted too. So if LOST_FOUND feature is enabled, let's avoid to encrypt root directory. Signed-off-by: Sheng Yong <shengyong1@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-17f2fs: wrap all options with f2fs_sb_info.mount_optChao Yu1-118/+108
This patch merges miscellaneous mount options into struct f2fs_mount_info, After this patch, once we add new mount option, we don't need to worry about recovery of it in remount_fs(), since we will recover the f2fs_sb_info.mount_opt including all options. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-17f2fs: introduce mount option for fsync modeJunling Zheng1-0/+26
Commit "0a007b97aad6"(f2fs: recover directory operations by fsync) fixed xfstest generic/342 case, but it also increased the written data and caused the performance degradation. In most cases, there's no need to do so heavy fsync actually. So we introduce new mount option "fsync_mode={posix,strict}" to control the policy of fsync. "fsync_mode=posix" is set by default, and means that f2fs uses a light fsync, which follows POSIX semantics. And "fsync_mode=strict" means that it's a heavy fsync, which behaves in line with xfs, ext4 and btrfs, where generic/342 will pass, but the performance will regress. Signed-off-by: Junling Zheng <zhengjunling@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-17f2fs: fix to restore old mount option in ->remount_fsChao Yu1-0/+10
This patch fixes to restore old mount option once we encounter failure in ->remount_fs. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-17f2fs: wrap sb_rdonly with f2fs_readonlyChao Yu1-3/+3
Use f2fs_readonly to wrap sb_rdonly for cleanup, and spread it in all places. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-13f2fs: set readdir_ra by defaultJaegeuk Kim1-0/+1
It gives general readdir improvement. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-13f2fs: add auto tuning for small devicesJaegeuk Kim1-0/+14
If f2fs is running on top of very small devices, it's worth to avoid abusing free LBAs. In order to achieve that, this patch introduces some parameter tuning. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-13f2fs: add mount option for segment allocation policyJaegeuk Kim1-0/+26
This patch adds an mount option, "alloc_mode=%s" having two options, "default" and "reuse". In "alloc_mode=reuse" case, f2fs starts to allocate segments from 0'th segment all the time to reassign segments. It'd be useful for small-sized eMMC parts. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-13f2fs: expose extension_list sysfs entryChao Yu1-1/+1
This patch adds a sysfs entry 'extension_list' to support query/add/del item in extension list. Query: cat /sys/fs/f2fs/<device>/extension_list Add: echo 'extension' > /sys/fs/f2fs/<device>/extension_list Del: echo '!extension' > /sys/fs/f2fs/<device>/extension_list Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-03-13f2fs: introduce sb_lock to make encrypt pwsalt update exclusiveChao Yu1-0/+2
f2fs_super_block.encrypt_pw_salt can be udpated and persisted concurrently, result in getting different pwsalt in separated threads, so let's introduce sb_lock to exclude concurrent accessers. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>