summaryrefslogtreecommitdiff
path: root/fs/f2fs/compress.c
AgeCommit message (Collapse)AuthorFilesLines
2020-12-03f2fs: compress: support chksumChao Yu1-0/+23
This patch supports to store chksum value with compressed data, and verify the integrality of compressed data while reading the data. The feature can be enabled through specifying mount option 'compress_chksum'. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-09-29f2fs: fix slab leak of rpages pointerJaegeuk Kim1-1/+1
This fixes the below mem leak. [ 130.157600] ============================================================================= [ 130.159662] BUG f2fs_page_array_entry-252:16 (Tainted: G W O ): Objects remaining in f2fs_page_array_entry-252:16 on __kmem_cache_shutdown() [ 130.162742] ----------------------------------------------------------------------------- [ 130.162742] [ 130.164979] Disabling lock debugging due to kernel taint [ 130.166188] INFO: Slab 0x000000009f5a52d2 objects=22 used=4 fp=0x00000000ba72c3e9 flags=0xfffffc0010200 [ 130.168269] CPU: 7 PID: 3560 Comm: umount Tainted: G B W O 5.9.0-rc4+ #35 [ 130.170019] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 [ 130.171941] Call Trace: [ 130.172528] dump_stack+0x74/0x9a [ 130.173298] slab_err+0xb7/0xdc [ 130.174044] ? kernel_poison_pages+0xc0/0xc0 [ 130.175065] ? on_each_cpu_cond_mask+0x48/0x90 [ 130.176096] __kmem_cache_shutdown.cold+0x34/0x141 [ 130.177190] kmem_cache_destroy+0x59/0x100 [ 130.178223] f2fs_destroy_page_array_cache+0x15/0x20 [f2fs] [ 130.179527] f2fs_put_super+0x1bc/0x380 [f2fs] [ 130.180538] generic_shutdown_super+0x72/0x110 [ 130.181547] kill_block_super+0x27/0x50 [ 130.182438] kill_f2fs_super+0x76/0xe0 [f2fs] [ 130.183448] deactivate_locked_super+0x3b/0x80 [ 130.184456] deactivate_super+0x3e/0x50 [ 130.185363] cleanup_mnt+0x109/0x160 [ 130.186179] __cleanup_mnt+0x12/0x20 [ 130.187003] task_work_run+0x70/0xb0 [ 130.187841] exit_to_user_mode_prepare+0x18f/0x1b0 [ 130.188917] syscall_exit_to_user_mode+0x31/0x170 [ 130.189989] do_syscall_64+0x45/0x90 [ 130.190828] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 130.191986] RIP: 0033:0x7faf868ea2eb [ 130.192815] Code: 7b 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 7b 0c 00 f7 d8 64 89 01 [ 130.196872] RSP: 002b:00007fffb7edb478 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 [ 130.198494] RAX: 0000000000000000 RBX: 00007faf86a18204 RCX: 00007faf868ea2eb [ 130.201021] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055971df71c50 [ 130.203415] RBP: 000055971df71a40 R08: 0000000000000000 R09: 00007fffb7eda1f0 [ 130.205772] R10: 00007faf86a04339 R11: 0000000000000246 R12: 000055971df71c50 [ 130.208150] R13: 0000000000000000 R14: 000055971df71b38 R15: 0000000000000000 [ 130.210515] INFO: Object 0x00000000a980843a @offset=744 [ 130.212476] INFO: Allocated in page_array_alloc+0x3d/0xe0 [f2fs] age=1572 cpu=0 pid=3297 [ 130.215030] __slab_alloc+0x20/0x40 [ 130.216566] kmem_cache_alloc+0x2a0/0x2e0 [ 130.218217] page_array_alloc+0x3d/0xe0 [f2fs] [ 130.219940] f2fs_init_compress_ctx+0x1f/0x40 [f2fs] [ 130.221736] f2fs_write_cache_pages+0x3db/0x860 [f2fs] [ 130.223591] f2fs_write_data_pages+0x2c9/0x300 [f2fs] [ 130.225414] do_writepages+0x43/0xd0 [ 130.226907] __filemap_fdatawrite_range+0xd5/0x110 [ 130.228632] filemap_write_and_wait_range+0x48/0xb0 [ 130.230336] __generic_file_write_iter+0x18a/0x1d0 [ 130.232035] f2fs_file_write_iter+0x226/0x550 [f2fs] [ 130.233737] new_sync_write+0x113/0x1a0 [ 130.235204] vfs_write+0x1a6/0x200 [ 130.236579] ksys_write+0x67/0xe0 [ 130.237898] __x64_sys_write+0x1a/0x20 [ 130.239309] do_syscall_64+0x38/0x90 Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-09-29f2fs: compress: introduce cic/dic slab cacheChao Yu1-7/+60
Add two slab caches: "f2fs_cic_entry" and "f2fs_dic_entry" for memory allocation of compress_io_ctx and decompress_io_ctx structure. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-09-29f2fs: compress: introduce page array slab cacheChao Yu1-30/+86
Add a per-sbi slab cache "f2fs_page_array_entry-%u:%u" for memory allocation of page pointer array in compress context. Signed-off-by: Chao Yu <yuchao0@huawei.com> [Jaegeuk Kim: Fix wrong memory allocation] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-09-11f2fs: change virtual mapping way for compression pagesDaeho Jeong1-10/+26
By profiling f2fs compression works, I've found vmap() callings have unexpected hikes in the execution time in our test environment and those are bottlenecks of f2fs decompression path. Changing these with vm_map_ram(), we can enhance f2fs decompression speed pretty much. [Verification] Android Pixel 3(ARM64, 6GB RAM, 128GB UFS) Turned on only 0-3 little cores(at 1.785GHz) dd if=/dev/zero of=dummy bs=1m count=1000 echo 3 > /proc/sys/vm/drop_caches dd if=dummy of=/dev/zero bs=512k - w/o compression - 1048576000 bytes (0.9 G) copied, 2.082554 s, 480 M/s 1048576000 bytes (0.9 G) copied, 2.081634 s, 480 M/s 1048576000 bytes (0.9 G) copied, 2.090861 s, 478 M/s - before patch - 1048576000 bytes (0.9 G) copied, 7.407527 s, 135 M/s 1048576000 bytes (0.9 G) copied, 7.283734 s, 137 M/s 1048576000 bytes (0.9 G) copied, 7.291508 s, 137 M/s - after patch - 1048576000 bytes (0.9 G) copied, 1.998959 s, 500 M/s 1048576000 bytes (0.9 G) copied, 1.987554 s, 503 M/s 1048576000 bytes (0.9 G) copied, 1.986380 s, 503 M/s Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-09-11f2fs: allocate proper size memory for zstd decompressChao Yu1-3/+4
As 5kft <5kft@5kft.org> reported: kworker/u9:3: page allocation failure: order:9, mode:0x40c40(GFP_NOFS|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0 CPU: 3 PID: 8168 Comm: kworker/u9:3 Tainted: G C 5.8.3-sunxi #trunk Hardware name: Allwinner sun8i Family Workqueue: f2fs_post_read_wq f2fs_post_read_work [<c010d6d5>] (unwind_backtrace) from [<c0109a55>] (show_stack+0x11/0x14) [<c0109a55>] (show_stack) from [<c056d489>] (dump_stack+0x75/0x84) [<c056d489>] (dump_stack) from [<c0243b53>] (warn_alloc+0xa3/0x104) [<c0243b53>] (warn_alloc) from [<c024473b>] (__alloc_pages_nodemask+0xb87/0xc40) [<c024473b>] (__alloc_pages_nodemask) from [<c02267c5>] (kmalloc_order+0x19/0x38) [<c02267c5>] (kmalloc_order) from [<c02267fd>] (kmalloc_order_trace+0x19/0x90) [<c02267fd>] (kmalloc_order_trace) from [<c047c665>] (zstd_init_decompress_ctx+0x21/0x88) [<c047c665>] (zstd_init_decompress_ctx) from [<c047e9cf>] (f2fs_decompress_pages+0x97/0x228) [<c047e9cf>] (f2fs_decompress_pages) from [<c045d0ab>] (__read_end_io+0xfb/0x130) [<c045d0ab>] (__read_end_io) from [<c045d141>] (f2fs_post_read_work+0x61/0x84) [<c045d141>] (f2fs_post_read_work) from [<c0130b2f>] (process_one_work+0x15f/0x3b0) [<c0130b2f>] (process_one_work) from [<c0130e7b>] (worker_thread+0xfb/0x3e0) [<c0130e7b>] (worker_thread) from [<c0135c3b>] (kthread+0xeb/0x10c) [<c0135c3b>] (kthread) from [<c0100159>] zstd may allocate large size memory for {,de}compression, it may cause file copy failure on low-end device which has very few memory. For decompression, let's just allocate proper size memory based on current file's cluster size instead of max cluster size. Reported-by: 5kft <5kft@5kft.org> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-09-11f2fs: compress: use more readable atomic_t type for {cic,dic}.refChao Yu1-5/+5
refcount_t type variable should never be less than one, so it's a little bit hard to understand when we use it to indicate pending compressed page count, let's change to use atomic_t for better readability. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-09-11f2fs: compress: remove unneeded codeChao Yu1-4/+0
- f2fs_write_multi_pages - f2fs_compress_pages - init_compress_ctx - compress_pages - destroy_compress_ctx --- 1 - f2fs_write_compressed_pages - destroy_compress_ctx --- 2 destroy_compress_ctx() in f2fs_write_multi_pages() is redundant, remove it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-08-11Merge tag 'f2fs-for-5.9-rc1' of ↵Linus Torvalds1-25/+64
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we've added two small interfaces: (a) GC_URGENT_LOW mode for performance and (b) F2FS_IOC_SEC_TRIM_FILE ioctl for security. The new GC mode allows Android to run some lower priority GCs in background, while new ioctl discards user information without race condition when the account is removed. In addition, some patches were merged to address latency-related issues. We've fixed some compression-related bug fixes as well as edge race conditions. Enhancements: - add GC_URGENT_LOW mode in gc_urgent - introduce F2FS_IOC_SEC_TRIM_FILE ioctl - bypass racy readahead to improve read latencies - shrink node_write lock coverage to avoid long latency Bug fixes: - fix missing compression flag control, i_size, and mount option - fix deadlock between quota writes and checkpoint - remove inode eviction path in synchronous path to avoid deadlock - fix to wait GCed compressed page writeback - fix a kernel panic in f2fs_is_compressed_page - check page dirty status before writeback - wait page writeback before update in node page write flow - fix a race condition between f2fs_write_end_io and f2fs_del_fsync_node_entry We've added some minor sanity checks and refactored trivial code blocks for better readability and debugging information" * tag 'f2fs-for-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (52 commits) f2fs: prepare a waiter before entering io_schedule f2fs: update_sit_entry: Make the judgment condition of f2fs_bug_on more intuitive f2fs: replace test_and_set/clear_bit() with set/clear_bit() f2fs: make file immutable even if releasing zero compression block f2fs: compress: disable compression mount option if compression is off f2fs: compress: add sanity check during compressed cluster read f2fs: use macro instead of f2fs verity version f2fs: fix deadlock between quota writes and checkpoint f2fs: correct comment of f2fs_exist_written_data f2fs: compress: delay temp page allocation f2fs: compress: fix to update isize when overwriting compressed file f2fs: space related cleanup f2fs: fix use-after-free issue f2fs: Change the type of f2fs_flush_inline_data() to void f2fs: add F2FS_IOC_SEC_TRIM_FILE ioctl f2fs: should avoid inode eviction in synchronous path f2fs: segment.h: delete a duplicated word f2fs: compress: fix to avoid memory leak on cc->cpages f2fs: use generic names for generic ioctls f2fs: don't keep meta inode pages used for compressed block migration ...
2020-07-26f2fs: compress: delay temp page allocationChao Yu1-16/+21
Currently, we allocate temp pages which is used to pad hole in cluster during read IO submission, it may take long time before releasing them in f2fs_decompress_pages(), since they are only used as temp output buffer in decompression context, so let's just do the allocation in that context to reduce time of memory pool resource occupation. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-07-21f2fs: compress: fix to avoid memory leak on cc->cpagesChao Yu1-0/+2
Memory allocated for storing compressed pages' poitner should be released after f2fs_write_compressed_pages(), otherwise it will cause memory leak issue. Signed-off-by: Chao Yu <yuchao0@huawei.com> Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-07-08f2fs: add inline encryption supportSatya Tangirala1-1/+1
Wire up f2fs to support inline encryption via the helper functions which fs/crypto/ now provides. This includes: - Adding a mount option 'inlinecrypt' which enables inline encryption on encrypted files where it can be used. - Setting the bio_crypt_ctx on bios that will be submitted to an inline-encrypted file. - Not adding logically discontiguous data to bios that will be submitted to an inline-encrypted file. - Not doing filesystem-layer crypto on inline-encrypted files. This patch includes a fix for a race during IPU by Sahitya Tummala <stummala@codeaurora.org> Signed-off-by: Satya Tangirala <satyat@google.com> Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Link: https://lore.kernel.org/r/20200702015607.1215430-4-satyat@google.com Co-developed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-07-08f2fs: fix to wait GCed compressed page writebackChao Yu1-0/+7
like we did for encrypted page. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-07-08f2fs: fix an oops in f2fs_is_compressed_pageYu Changchun1-0/+7
This patch is to fix a crash: #3 [ffffb6580689f898] oops_end at ffffffffa2835bc2 #4 [ffffb6580689f8b8] no_context at ffffffffa28766e7 #5 [ffffb6580689f920] async_page_fault at ffffffffa320135e [exception RIP: f2fs_is_compressed_page+34] RIP: ffffffffa2ba83a2 RSP: ffffb6580689f9d8 RFLAGS: 00010213 RAX: 0000000000000001 RBX: fffffc0f50b34bc0 RCX: 0000000000002122 RDX: 0000000000002123 RSI: 0000000000000c00 RDI: fffffc0f50b34bc0 RBP: ffff97e815a40178 R8: 0000000000000000 R9: ffff97e83ffc9000 R10: 0000000000032300 R11: 0000000000032380 R12: ffffb6580689fa38 R13: fffffc0f50b34bc0 R14: ffff97e825cbd000 R15: 0000000000000c00 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #6 [ffffb6580689f9d8] __is_cp_guaranteed at ffffffffa2b7ea98 #7 [ffffb6580689f9f0] f2fs_submit_page_write at ffffffffa2b81a69 #8 [ffffb6580689fa30] f2fs_do_write_meta_page at ffffffffa2b99777 #9 [ffffb6580689fae0] __f2fs_write_meta_page at ffffffffa2b75f1a #10 [ffffb6580689fb18] f2fs_sync_meta_pages at ffffffffa2b77466 #11 [ffffb6580689fc98] do_checkpoint at ffffffffa2b78e46 #12 [ffffb6580689fd88] f2fs_write_checkpoint at ffffffffa2b79c29 #13 [ffffb6580689fdd0] f2fs_sync_fs at ffffffffa2b69d95 #14 [ffffb6580689fe20] sync_filesystem at ffffffffa2ad2574 #15 [ffffb6580689fe30] generic_shutdown_super at ffffffffa2a9b582 #16 [ffffb6580689fe48] kill_block_super at ffffffffa2a9b6d1 #17 [ffffb6580689fe60] kill_f2fs_super at ffffffffa2b6abe1 #18 [ffffb6580689fea0] deactivate_locked_super at ffffffffa2a9afb6 #19 [ffffb6580689feb8] cleanup_mnt at ffffffffa2abcad4 #20 [ffffb6580689fee0] task_work_run at ffffffffa28bca28 #21 [ffffb6580689ff00] exit_to_usermode_loop at ffffffffa28050b7 #22 [ffffb6580689ff38] do_syscall_64 at ffffffffa280560e #23 [ffffb6580689ff50] entry_SYSCALL_64_after_hwframe at ffffffffa320008c This occurred when umount f2fs if enable F2FS_FS_COMPRESSION with F2FS_IO_TRACE. Fixes it by adding IS_IO_TRACED_PAGE to check validity of pid for page_private. Signed-off-by: Yu Changchun <yuchangchun1@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-07-08f2fs: fix to check page dirty status before writebackChao Yu1-0/+6
In f2fs_write_raw_pages(), we need to check page dirty status before writeback, because there could be a racer (e.g. reclaimer) helps writebacking the dirty page. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-07-08f2fs: remove the unused compr parameterWang Xiaojun1-3/+3
The parameter compr is unused in the f2fs_cluster_blocks function so we no longer need to pass it as a parameter. Signed-off-by: Wang Xiaojun <wangxiaojun11@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-07-08f2fs: shrink node_write lock coverageChao Yu1-3/+15
- to avoid race between checkpoint and quota file writeback, it just needs to hold read lock of node_write in writeback path. - node_write lock has covered all LFS data write paths, it's not necessary, we only need to hold node_write lock at write path of quota file. This refactors commit ca7f76e68074 ("f2fs: fix wrong discard space"). Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-07-08f2fs: add prefix for exported symbolsChao Yu1-2/+2
to avoid polluting global symbol namespace. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-06-18f2fs: avoid checkpatch errorJaegeuk Kim1-1/+1
ERROR:INITIALISED_STATIC: do not initialise statics to NULL Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-06-09f2fs: remove unused parameter of f2fs_put_rpages_mapping()Chao Yu1-4/+3
Just cleanup, no logic change. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-28f2fs: compress: don't compress any datas after cp stopChao Yu1-0/+2
While compressed data writeback, we need to drop dirty pages like we did for non-compressed pages if cp stops, however it's not needed to compress any data in such case, so let's detect cp stop condition in cluster_may_compress() to avoid redundant compressing and let following f2fs_write_raw_pages() drops dirty pages correctly. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-12f2fs: compress: fix zstd data corruptionChao Yu1-0/+7
During zstd compression, ZSTD_endStream() may return non-zero value because distination buffer is full, but there is still compressed data remained in intermediate buffer, it means that zstd algorithm can not save at last one block space, let's just writeback raw data instead of compressed one, this can fix data corruption when decompressing incomplete stored compression data. Fixes: 50cfa66f0de0 ("f2fs: compress: support zstd compress algorithm") Signed-off-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-12f2fs: compress: let lz4 compressor handle output buffer budget properlyChao Yu1-6/+9
Commonly, in order to handle lz4 worst compress case, caller should allocate buffer with size of LZ4_compressBound(inputsize) for target compressed data storing, however in this case, if caller didn't allocate enough space, lz4 compressor still can handle output buffer budget properly, and end up compressing when left space in output buffer is not enough. So we don't have to allocate buffer with size for worst case, then we can avoid 2 * 4KB size intermediate buffer allocation when log_cluster_size is 2, and avoid unnecessary compressing work of compressor if we can not save at least 4KB space. Suggested-by: Daeho Jeong <daehojeong@google.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-12f2fs: compress: support lzo-rle compress algorithmChao Yu1-0/+30
LZO-RLE extension (run length encoding) was introduced to improve performance of LZO algorithm in scenario of data contains many zeros, zram has changed to use this extended algorithm by default, this patch adds to support this algorithm extension, to enable this extension, it needs to enable F2FS_FS_LZO and F2FS_FS_LZORLE config, and specifies "compress_algorithm=lzo-rle" mountoption. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-12f2fs: introduce mempool for {,de}compress intermediate page allocationChao Yu1-22/+42
If compression feature is on, in scenario of no enough free memory, page refault ratio is higher than before, the root cause is: - {,de}compression flow needs to allocate intermediate pages to store compressed data in cluster, so during their allocation, vm may reclaim mmaped pages. - if above reclaimed pages belong to compressed cluster, during its refault, it may cause more intermediate pages allocation, result in reclaiming more mmaped pages. So this patch introduces a mempool for intermediate page allocation, in order to avoid high refault ratio, by default, number of preallocated page in pool is 512, user can change the number by assigning 'num_compress_pages' parameter during module initialization. Ma Feng found warnings in the original patch and fixed like below. Fix the following sparse warning: fs/f2fs/compress.c:501:5: warning: symbol 'num_compress_pages' was not declared. Should it be static? fs/f2fs/compress.c:530:6: warning: symbol 'f2fs_compress_free_page' was not declared. Should it be static? Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Ma Feng <mafeng.ma@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-05-08f2fs: support partial truncation on compressed inodeChao Yu1-0/+49
Supports to truncate compressed/normal cluster partially on compressed inode. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-04-24f2fs: fix quota_sync failure due to f2fs_lock_opJaegeuk Kim1-3/+5
f2fs_quota_sync() uses f2fs_lock_op() before flushing dirty pages, but f2fs_write_data_page() returns EAGAIN. Likewise dentry blocks, we can just bypass getting the lock, since quota blocks are also maintained by checkpoint. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-04-03f2fs: fix to verify tpage before releasing in f2fs_free_dic()Chao Yu1-0/+2
In below error path, tpages[i] could be NULL, fix to check it before releasing it. - f2fs_read_multi_pages - f2fs_alloc_dic - f2fs_free_dic Fixes: 61fbae2b2b12 ("f2fs: fix to avoid NULL pointer dereference") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-04-03f2fs: clean up dic->tpages assignmentChao Yu1-7/+3
Just cleanup, no logic change. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-04-03f2fs: compress: support zstd compress algorithmChao Yu1-0/+165
Add zstd compress algorithm support, use "compress_algorithm=zstd" mountoption to enable it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-31f2fs: compress: add .{init,destroy}_decompress_ctx callbackChao Yu1-6/+21
Add below two callback interfaces in struct f2fs_compress_ops: int (*init_decompress_ctx)(struct decompress_io_ctx *dic); void (*destroy_decompress_ctx)(struct decompress_io_ctx *dic); Which will be used by zstd compress algorithm later. In addition, this patch adds callback function pointer check, so that specified algorithm can avoid defining unneeded functions. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-31f2fs: compress: fix to call missing destroy_compress_ctx()Chao Yu1-0/+2
Otherwise, it will cause memory leak. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-31f2fs: clean up {cic,dic}.ref handlingChao Yu1-10/+6
{cic,dic}.ref should be initialized to number of compressed pages, let's initialize it directly rather than doing w/ f2fs_set_compressed_page(). Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-31f2fs: fix NULL pointer dereference in f2fs_verity_work()Chao Yu1-0/+2
If both compression and fsverity feature is on, generic/572 will report below NULL pointer dereference bug. BUG: kernel NULL pointer dereference, address: 0000000000000018 RIP: 0010:f2fs_verity_work+0x60/0x90 [f2fs] #PF: supervisor read access in kernel mode Workqueue: fsverity_read_queue f2fs_verity_work [f2fs] RIP: 0010:f2fs_verity_work+0x60/0x90 [f2fs] Call Trace: process_one_work+0x16c/0x3f0 worker_thread+0x4c/0x440 ? rescuer_thread+0x350/0x350 kthread+0xf8/0x130 ? kthread_unpark+0x70/0x70 ret_from_fork+0x35/0x40 There are two issue in f2fs_verity_work(): - it needs to traverse and verify all pages in bio. - if pages in bio belong to non-compressed cluster, accessing decompress IO context stored in page private will cause NULL pointer dereference. Fix them. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-31f2fs: fix to clear PG_error if fsverity failedChao Yu1-8/+10
In f2fs_decompress_end_io(), we should clear PG_error flag before page unlock, otherwise reread will fail due to the flag as described in commit fb7d70db305a ("f2fs: clear PageError on the read path"). Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-31f2fs: fix potential deadlock on compressed quota fileChao Yu1-5/+10
generic/232 reports below deadlock: fsstress D 0 96980 96969 0x00084000 Call Trace: schedule+0x4a/0xb0 io_schedule+0x12/0x40 __lock_page+0x127/0x1d0 pagecache_get_page+0x1d8/0x250 prepare_compress_overwrite+0xe0/0x490 [f2fs] f2fs_prepare_compress_overwrite+0x5d/0x80 [f2fs] f2fs_write_begin+0x833/0xb90 [f2fs] f2fs_quota_write+0x145/0x1e0 [f2fs] write_blk+0x36/0x80 [quota_tree] do_insert_tree+0x2ac/0x4a0 [quota_tree] do_insert_tree+0x26e/0x4a0 [quota_tree] qtree_write_dquot+0x70/0x190 [quota_tree] v2_write_dquot+0x43/0x90 [quota_v2] dquot_acquire+0x77/0x100 f2fs_dquot_acquire+0x2f/0x60 [f2fs] dqget+0x310/0x450 dquot_transfer+0xb2/0x120 f2fs_setattr+0x11a/0x4a0 [f2fs] notify_change+0x349/0x480 chown_common+0x168/0x1c0 do_fchownat+0xbc/0xf0 __x64_sys_lchown+0x21/0x30 do_syscall_64+0x5f/0x220 entry_SYSCALL_64_after_hwframe+0x44/0xa9 task PC stack pid father kworker/u256:0 D 0 103444 2 0x80084000 Workqueue: writeback wb_workfn (flush-251:1) Call Trace: schedule+0x4a/0xb0 schedule_timeout+0x15e/0x2f0 io_schedule_timeout+0x19/0x40 congestion_wait+0x7e/0x120 f2fs_write_multi_pages+0x12a/0x840 [f2fs] f2fs_write_cache_pages+0x48f/0x790 [f2fs] f2fs_write_data_pages+0x2db/0x330 [f2fs] do_writepages+0x1a/0x60 __writeback_single_inode+0x3d/0x340 writeback_sb_inodes+0x225/0x4a0 wb_writeback+0xf7/0x320 wb_workfn+0xba/0x470 process_one_work+0x16c/0x3f0 worker_thread+0x4c/0x440 kthread+0xf8/0x130 ret_from_fork+0x35/0x40 fsstress D 0 5277 5266 0x00084000 Call Trace: schedule+0x4a/0xb0 rwsem_down_write_slowpath+0x29d/0x540 block_operations+0x105/0x360 [f2fs] f2fs_write_checkpoint+0x101/0x1010 [f2fs] f2fs_sync_fs+0xa8/0x130 [f2fs] f2fs_do_sync_file+0x1ad/0x890 [f2fs] do_fsync+0x38/0x60 __x64_sys_fdatasync+0x13/0x20 do_syscall_64+0x5f/0x220 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The root cause is there is potential deadlock between quota data update and writeback. Kworker Thread B Thread C - f2fs_write_cache_pages - lock whole cluster --- A - f2fs_write_multi_pages - f2fs_write_raw_pages - f2fs_write_single_data_page - f2fs_do_write_data_page - f2fs_setattr - f2fs_lock_op --- B - f2fs_write_checkpoint - block_operations - f2fs_lock_all --- B - dquot_transfer - f2fs_quota_write - f2fs_prepare_compress_overwrite - pagecache_get_page --- A - f2fs_trylock_op failed --- B - congestion_wait - goto rewrite To fix this issue, during quota file writeback, just redirty all pages left in cluster rather holding pages' lock in cluster and looping retrying lock cp_rwsem. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-23f2fs: fix to account compressed blocks in f2fs_compressed_blocks()Chao Yu1-6/+22
por_fsstress reports inconsistent status in orphan inode, the root cause of this is in f2fs_write_raw_pages() we decrease i_compr_blocks incorrectly due to wrong calculation in f2fs_compressed_blocks(). So this patch exposes below two functions based on __f2fs_cluster_blocks: - f2fs_compressed_blocks: get count of compressed blocks in compressed cluster - f2fs_cluster_blocks: get count of valid blocks (including reserved blocks) in compressed cluster. Then use f2fs_compress_blocks() to get correct compressed blocks count in f2fs_write_raw_pages(). sanity_check_inode: inode (ino=ad80) hash inconsistent i_compr_blocks:2, i_blocks:1, run fsck to fix Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-19f2fs: fix to avoid triggering IO in write pathChao Yu1-1/+1
If we are in write IO path, we need to avoid using GFP_KERNEL. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-19f2fs: introduce DEFAULT_IO_TIMEOUTChao Yu1-1/+2
As Geert Uytterhoeven reported: for parameter HZ/50 in congestion_wait(BLK_RW_ASYNC, HZ/50); On some platforms, HZ can be less than 50, then unexpected 0 timeout jiffies will be set in congestion_wait(). This patch introduces a macro DEFAULT_IO_TIMEOUT to wrap a determinate value with msecs_to_jiffies(20) to instead HZ/50 to avoid such issue. Quoted from Geert Uytterhoeven: "A timeout of HZ means 1 second. HZ/50 means 20 ms, but has the risk of being zero, if HZ < 50. If you want to use a timeout of 20 ms, you best use msecs_to_jiffies(20), as that takes care of the special cases, and never returns 0." Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-19f2fs: clean up codes with {f2fs_,}data_blkaddr()Chao Yu1-4/+3
- rename datablock_addr() to data_blkaddr(). - wrap data_blkaddr() with f2fs_data_blkaddr() to clean up parameters. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-10f2fs: fix to avoid use-after-free in f2fs_write_multi_pages()Chao Yu1-1/+1
In compress cluster, if physical block number is less than logic page number, race condition will cause use-after-free issue as described below: - f2fs_write_compressed_pages - fio.page = cic->rpages[0]; - f2fs_outplace_write_data - f2fs_compress_write_end_io - kfree(cic->rpages); - kfree(cic); - fio.page = cic->rpages[1]; f2fs_write_multi_pages+0xfd0/0x1a98 f2fs_write_data_pages+0x74c/0xb5c do_writepages+0x64/0x108 __writeback_single_inode+0xdc/0x4b8 writeback_sb_inodes+0x4d0/0xa68 __writeback_inodes_wb+0x88/0x178 wb_writeback+0x1f0/0x424 wb_workfn+0x2f4/0x574 process_one_work+0x210/0x48c worker_thread+0x2e8/0x44c kthread+0x110/0x120 ret_from_fork+0x10/0x18 Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-03-10f2fs: cover last_disk_size update with spinlockChao Yu1-2/+2
This change solves below hangtask issue: INFO: task kworker/u16:1:58 blocked for more than 122 seconds. Not tainted 5.6.0-rc2-00590-g9983bdae4974e #11 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. kworker/u16:1 D 0 58 2 0x00000000 Workqueue: writeback wb_workfn (flush-179:0) Backtrace: (__schedule) from [<c0913234>] (schedule+0x78/0xf4) (schedule) from [<c017ec74>] (rwsem_down_write_slowpath+0x24c/0x4c0) (rwsem_down_write_slowpath) from [<c0915f2c>] (down_write+0x6c/0x70) (down_write) from [<c0435b80>] (f2fs_write_single_data_page+0x608/0x7ac) (f2fs_write_single_data_page) from [<c0435fd8>] (f2fs_write_cache_pages+0x2b4/0x7c4) (f2fs_write_cache_pages) from [<c043682c>] (f2fs_write_data_pages+0x344/0x35c) (f2fs_write_data_pages) from [<c0267ee8>] (do_writepages+0x3c/0xd4) (do_writepages) from [<c0310cbc>] (__writeback_single_inode+0x44/0x454) (__writeback_single_inode) from [<c03112d0>] (writeback_sb_inodes+0x204/0x4b0) (writeback_sb_inodes) from [<c03115cc>] (__writeback_inodes_wb+0x50/0xe4) (__writeback_inodes_wb) from [<c03118f4>] (wb_writeback+0x294/0x338) (wb_writeback) from [<c0312dac>] (wb_workfn+0x35c/0x54c) (wb_workfn) from [<c014f2b8>] (process_one_work+0x214/0x544) (process_one_work) from [<c014f634>] (worker_thread+0x4c/0x574) (worker_thread) from [<c01564fc>] (kthread+0x144/0x170) (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c) Reported-and-tested-by: Ondřej Jirman <megi@xff.cz> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-02-27f2fs: fix to avoid potential deadlockChao Yu1-3/+3
Using f2fs_trylock_op() in f2fs_write_compressed_pages() to avoid potential deadlock like we did in f2fs_write_single_data_page(). Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-02-27f2fs: recycle unused compress_data.chksum feildChao Yu1-1/+0
In Struct compress_data, chksum field was never used, remove it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-02-27f2fs: fix to avoid NULL pointer dereferenceChao Yu1-1/+2
Unable to handle kernel NULL pointer dereference at virtual address 00000000 PC is at f2fs_free_dic+0x60/0x2c8 LR is at f2fs_decompress_pages+0x3c4/0x3e8 f2fs_free_dic+0x60/0x2c8 f2fs_decompress_pages+0x3c4/0x3e8 __read_end_io+0x78/0x19c f2fs_post_read_work+0x6c/0x94 process_one_work+0x210/0x48c worker_thread+0x2e8/0x44c kthread+0x110/0x120 ret_from_fork+0x10/0x18 In f2fs_free_dic(), we can not use f2fs_put_page(,1) to release dic->tpages[i], as the page's mapping is NULL. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-02-27f2fs: fix leaking uninitialized memory in compressed clustersEric Biggers1-2/+6
When the compressed data of a cluster doesn't end on a page boundary, the remainder of the last page must be zeroed in order to avoid leaking uninitialized memory to disk. Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-01-18f2fs: support data compressionChao Yu1-0/+1176
This patch tries to support compression in f2fs. - New term named cluster is defined as basic unit of compression, file can be divided into multiple clusters logically. One cluster includes 4 << n (n >= 0) logical pages, compression size is also cluster size, each of cluster can be compressed or not. - In cluster metadata layout, one special flag is used to indicate cluster is compressed one or normal one, for compressed cluster, following metadata maps cluster to [1, 4 << n - 1] physical blocks, in where f2fs stores data including compress header and compressed data. - In order to eliminate write amplification during overwrite, F2FS only support compression on write-once file, data can be compressed only when all logical blocks in file are valid and cluster compress ratio is lower than specified threshold. - To enable compression on regular inode, there are three ways: * chattr +c file * chattr +c dir; touch dir/file * mount w/ -o compress_extension=ext; touch file.ext Compress metadata layout: [Dnode Structure] +-----------------------------------------------+ | cluster 1 | cluster 2 | ......... | cluster N | +-----------------------------------------------+ . . . . . . . . . Compressed Cluster . . Normal Cluster . +----------+---------+---------+---------+ +---------+---------+---------+---------+ |compr flag| block 1 | block 2 | block 3 | | block 1 | block 2 | block 3 | block 4 | +----------+---------+---------+---------+ +---------+---------+---------+---------+ . . . . . . +-------------+-------------+----------+----------------------------+ | data length | data chksum | reserved | compressed data | +-------------+-------------+----------+----------------------------+ Changelog: 20190326: - fix error handling of read_end_io(). - remove unneeded comments in f2fs_encrypt_one_page(). 20190327: - fix wrong use of f2fs_cluster_is_full() in f2fs_mpage_readpages(). - don't jump into loop directly to avoid uninitialized variables. - add TODO tag in error path of f2fs_write_cache_pages(). 20190328: - fix wrong merge condition in f2fs_read_multi_pages(). - check compressed file in f2fs_post_read_required(). 20190401 - allow overwrite on non-compressed cluster. - check cluster meta before writing compressed data. 20190402 - don't preallocate blocks for compressed file. - add lz4 compress algorithm - process multiple post read works in one workqueue Now f2fs supports processing post read work in multiple workqueue, it shows low performance due to schedule overhead of multiple workqueue executing orderly. 20190921 - compress: support buffered overwrite C: compress cluster flag V: valid block address N: NEW_ADDR One cluster contain 4 blocks before overwrite after overwrite - VVVV -> CVNN - CVNN -> VVVV - CVNN -> CVNN - CVNN -> CVVV - CVVV -> CVNN - CVVV -> CVVV 20191029 - add kconfig F2FS_FS_COMPRESSION to isolate compression related codes, add kconfig F2FS_FS_{LZO,LZ4} to cover backend algorithm. note that: will remove lzo backend if Jaegeuk agreed that too. - update codes according to Eric's comments. 20191101 - apply fixes from Jaegeuk 20191113 - apply fixes from Jaegeuk - split workqueue for fsverity 20191216 - apply fixes from Jaegeuk 20200117 - fix to avoid NULL pointer dereference [Jaegeuk Kim] - add tracepoint for f2fs_{,de}compress_pages() - fix many bugs and add some compression stats - fix overwrite/mmap bugs - address 32bit build error, reported by Geert. - bug fixes when handling errors and i_compressed_blocks Reported-by: <noreply@ellerman.id.au> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>