path: root/fs
Age  Commit message  Author  Files  Lines
2024-04-09nilfs2: fix out-of-range warningArnd Bergmann1-1/+1
clang-14 points out that v_size is always smaller than a 64KB page size if that is configured by the CPU architecture: fs/nilfs2/ioctl.c:63:19: error: result of comparison of constant 65536 with expression of type '__u16' (aka 'unsigned short') is always false [-Werror,-Wtautological-constant-out-of-range-compare] if (argv->v_size > PAGE_SIZE) ~~~~~~~~~~~~ ^ ~~~~~~~~~ This is ok, so just shut up that warning with a cast. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20240328143051.1069575-7-arnd@kernel.org Fixes: 3358b4aaa84f ("nilfs2: fix problems of memory allocation in ioctl") Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reviewed-by: Justin Stitt <justinstitt@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
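The warning described above can be reproduced in miniature. The following is only a sketch: `DEMO_PAGE_SIZE` and `size_exceeds_page()` are hypothetical names, a 64KB page size is assumed, and this is not the nilfs2 code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for illustration: a 64KB page size. */
#define DEMO_PAGE_SIZE 65536UL

/* A 16-bit value tops out at 65535, so it can never exceed a 64KB page. */
static_assert(UINT16_MAX < DEMO_PAGE_SIZE, "u16 cannot exceed a 64KB page");

/*
 * Hypothetical helper, not the nilfs2 code. Comparing the uint16_t operand
 * directly against 65536 is what clang reports as always false. Widening
 * the operand with a cast keeps the check, which is still meaningful for
 * smaller page sizes, and is the approach the commit message describes.
 */
static int size_exceeds_page(uint16_t v_size)
{
	return (size_t)v_size > DEMO_PAGE_SIZE;
}
```

With a 4KB page size the same check would be able to fail, which is why the comparison is kept rather than deleted.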
2024-04-07fs: claw back a few FMODE_* bitsChristian Brauner5-10/+13
There's a bunch of flags that are purely based on what the file operations support while also never being conditionally set or unset. IOW, they're not subject to change for individual files. Imho, such flags don't need to live in f_mode; they might as well live in the fops struct itself. And the fops struct already has that lonely mmap_supported_flags member. We might as well turn that into a generic fop_flags member and move a few flags from FMODE_* space into FOP_* space. That gets us four FMODE_* bits back and the ability for new static flags that are about file ops to not have to live in FMODE_* space but in their own FOP_* space. It's not the most beautiful thing ever but it gets the job done. Yes, there'll be an additional pointer chase but hopefully that won't matter for these flags. I suspect there's a few more we can move into there and that we can also redirect a bunch of new flag suggestions that follow this pattern into the fop_flags field instead of f_mode. Link: https://lore.kernel.org/r/20240328-gewendet-spargel-aa60a030ef74@brauner Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
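The idea of keeping static, per-operations flags next to the operations table can be sketched as follows. All `demo_*` and `DEMO_*` names are hypothetical stand-ins chosen for illustration; the real definitions live in the kernel headers and may look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag type and flag bits, for illustration only. */
typedef unsigned int demo_fop_flags_t;
#define DEMO_FOP_BUFFER_RASYNC (1u << 0)
#define DEMO_FOP_MMAP_SYNC     (1u << 1)

/* Static capabilities are stored once per operations table ... */
struct demo_file_operations {
	demo_fop_flags_t fop_flags;
};

/* ... while f_mode keeps only state that can differ per open file. */
struct demo_file {
	const struct demo_file_operations *f_op;
	unsigned int f_mode;
};

static const struct demo_file_operations demo_fops = {
	.fop_flags = DEMO_FOP_BUFFER_RASYNC,
};

static bool demo_file_supports(const struct demo_file *file,
			       demo_fop_flags_t flag)
{
	assert(file != NULL && file->f_op != NULL);
	/* One extra pointer chase through f_op compared with testing f_mode. */
	return (file->f_op->fop_flags & flag) != 0;
}
```

Every file opened with `demo_fops` reports the same capabilities, so no per-file bit is spent on them.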
2024-04-05fs: Annotate struct file_handle with __counted_by() and use struct_size()Gustavo A. R. Silva1-3/+3
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). While there, use struct_size() helper, instead of the open-coded version. [brauner@kernel.org: contains a fix by Edward for an OOB access] Reported-by: syzbot+4139435cb1b34cf759c2@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Link: https://lore.kernel.org/r/tencent_A7845DD769577306D813742365E976E3A205@qq.com Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/ZgImCXTdGDTeBvSS@neat Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
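A userspace sketch of the two ideas in this commit, annotating a flexible array with its element count and sizing the allocation with an overflow-aware helper, might look like this. The `demo_*` names are hypothetical; `demo_handle_size()` only imitates the saturating behaviour of the kernel's struct_size() and is not that macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Use the attribute where the compiler has it; otherwise expand to nothing. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define DEMO_COUNTED_BY(m) __attribute__((counted_by(m)))
# endif
#endif
#ifndef DEMO_COUNTED_BY
# define DEMO_COUNTED_BY(m)
#endif

/* Hypothetical structure modelled loosely on a handle with trailing bytes. */
struct demo_handle {
	unsigned int handle_bytes;
	int handle_type;
	unsigned char f_handle[] DEMO_COUNTED_BY(handle_bytes);
};

/* Header plus nbytes of payload, saturating at SIZE_MAX on overflow. */
static size_t demo_handle_size(size_t nbytes)
{
	if (nbytes > SIZE_MAX - sizeof(struct demo_handle))
		return SIZE_MAX;
	return sizeof(struct demo_handle) + nbytes;
}

static struct demo_handle *demo_handle_alloc(unsigned int nbytes)
{
	size_t size = demo_handle_size(nbytes);
	struct demo_handle *h;

	if (size == SIZE_MAX)
		return NULL;
	assert(size >= sizeof(*h));
	h = calloc(1, size);
	if (h)
		h->handle_bytes = nbytes; /* set the count before using f_handle[] */
	return h;
}
```

Setting the count member before touching the array matters once bounds checking is enabled, because the checker trusts that member.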
2024-04-05fs: aio: convert to ring_folios and internal_foliosKefeng Wang1-31/+31
Since aio uses folios in most functions, convert ring/internal_pages to ring/internal_folios and use folios directly instead of pages throughout aio. This removes hidden calls to compound_head(), e.g. in flush_dcache_page(). Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Link: https://lore.kernel.org/r/20240321131640.948634-4-wangkefeng.wang@huawei.com Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-04-05fs: aio: use a folio in aio_free_ring()Kefeng Wang1-6/+7
Use a folio throughout aio_free_ring() to remove calls to compound_head(). Also move the pr_debug() after the folio check to avoid an unnecessary print. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Link: https://lore.kernel.org/r/20240321131640.948634-3-wangkefeng.wang@huawei.com Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-04-05fs: aio: use a folio in aio_setup_ring()Kefeng Wang1-9/+11
Use a folio throughout aio_setup_ring() to remove calls to compound_head(). Also use folio_end_read() to simultaneously mark the folio uptodate and unlock it. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Link: https://lore.kernel.org/r/20240321131640.948634-2-wangkefeng.wang@huawei.com Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-04-05ecryptfs: Fix buffer size for tag 66 packetBrian Kubisiak1-1/+3
The 'TAG 66 Packet Format' description is missing the cipher code and checksum fields that are packed into the message packet. As a result, the buffer allocated for the packet is 3 bytes too small and write_tag_66_packet() will write up to 3 bytes past the end of the buffer.

Fix this by increasing the size of the allocation so the whole packet will always fit in the buffer.

This fixes the below kasan slab-out-of-bounds bug:

  BUG: KASAN: slab-out-of-bounds in ecryptfs_generate_key_packet_set+0x7d6/0xde0
  Write of size 1 at addr ffff88800afbb2a5 by task touch/181

  CPU: 0 PID: 181 Comm: touch Not tainted 6.6.13-gnu #1 4c9534092be820851bb687b82d1f92a426598dc6
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2/GNU Guix 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x4c/0x70
   print_report+0xc5/0x610
   ? ecryptfs_generate_key_packet_set+0x7d6/0xde0
   ? kasan_complete_mode_report_info+0x44/0x210
   ? ecryptfs_generate_key_packet_set+0x7d6/0xde0
   kasan_report+0xc2/0x110
   ? ecryptfs_generate_key_packet_set+0x7d6/0xde0
   __asan_store1+0x62/0x80
   ecryptfs_generate_key_packet_set+0x7d6/0xde0
   ? __pfx_ecryptfs_generate_key_packet_set+0x10/0x10
   ? __alloc_pages+0x2e2/0x540
   ? __pfx_ovl_open+0x10/0x10 [overlay 30837f11141636a8e1793533a02e6e2e885dad1d]
   ? dentry_open+0x8f/0xd0
   ecryptfs_write_metadata+0x30a/0x550
   ? __pfx_ecryptfs_write_metadata+0x10/0x10
   ? ecryptfs_get_lower_file+0x6b/0x190
   ecryptfs_initialize_file+0x77/0x150
   ecryptfs_create+0x1c2/0x2f0
   path_openat+0x17cf/0x1ba0
   ? __pfx_path_openat+0x10/0x10
   do_filp_open+0x15e/0x290
   ? __pfx_do_filp_open+0x10/0x10
   ? __kasan_check_write+0x18/0x30
   ? _raw_spin_lock+0x86/0xf0
   ? __pfx__raw_spin_lock+0x10/0x10
   ? __kasan_check_write+0x18/0x30
   ? alloc_fd+0xf4/0x330
   do_sys_openat2+0x122/0x160
   ? __pfx_do_sys_openat2+0x10/0x10
   __x64_sys_openat+0xef/0x170
   ? __pfx___x64_sys_openat+0x10/0x10
   do_syscall_64+0x60/0xd0
   entry_SYSCALL_64_after_hwframe+0x6e/0xd8
  RIP: 0033:0x7f00a703fd67
  Code: 25 00 00 41 00 3d 00 00 41 00 74 37 64 8b 04 25 18 00 00 00 85 c0 75 5b 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 85 00 00 00 48 83 c4 68 5d 41 5c c3 0f 1f
  RSP: 002b:00007ffc088e30b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
  RAX: ffffffffffffffda RBX: 00007ffc088e3368 RCX: 00007f00a703fd67
  RDX: 0000000000000941 RSI: 00007ffc088e48d7 RDI: 00000000ffffff9c
  RBP: 00007ffc088e48d7 R08: 0000000000000001 R09: 0000000000000000
  R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941
  R13: 0000000000000000 R14: 00007ffc088e48d7 R15: 00007f00a7180040
   </TASK>

  Allocated by task 181:
   kasan_save_stack+0x2f/0x60
   kasan_set_track+0x29/0x40
   kasan_save_alloc_info+0x25/0x40
   __kasan_kmalloc+0xc5/0xd0
   __kmalloc+0x66/0x160
   ecryptfs_generate_key_packet_set+0x6d2/0xde0
   ecryptfs_write_metadata+0x30a/0x550
   ecryptfs_initialize_file+0x77/0x150
   ecryptfs_create+0x1c2/0x2f0
   path_openat+0x17cf/0x1ba0
   do_filp_open+0x15e/0x290
   do_sys_openat2+0x122/0x160
   __x64_sys_openat+0xef/0x170
   do_syscall_64+0x60/0xd0
   entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Fixes: dddfa461fc89 ("[PATCH] eCryptfs: Public key; packet management")
Signed-off-by: Brian Kubisiak <brian@kubisiak.com>
Link: https://lore.kernel.org/r/5j2q56p6qkhezva6b2yuqfrsurmvrrqtxxzrnp3wqu7xrz22i7@hoecdztoplbl
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-04-05fs/writeback: remove unnecessary return in writeback_inodes_sbKemeng Shi1-1/+1
writeback_inodes_sb() has no return value, so remove the unnecessary return from it. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20240228091958.288260-7-shikemeng@huaweicloud.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-04-05fs/writeback: correct comment of __wakeup_flusher_threads_bdiKemeng Shi1-2/+1
Commit e8e8a0c6c9bfc ("writeback: move nr_pages == 0 logic to one location") removed the nr_pages parameter of __wakeup_flusher_threads_bdi(), which now tries to write back all dirty pages. Correct the stale comment. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20240228091958.288260-6-shikemeng@huaweicloud.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-04-05fs/writeback: only calculate dirtied_before when b_io is emptyKemeng Shi1-12/+13
dirtied_before is only used when b_io is empty, so only calculate it in that case. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20240228091958.288260-5-shikemeng@huaweicloud.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-04-05fs/writeback: remove unused parameter wb of finish_writeback_workKemeng Shi1-4/+3
Remove unused parameter wb of finish_writeback_work. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20240228091958.288260-4-shikemeng@huaweicloud.com Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-04-05fs/writeback: bail out if there is no more inodes for IO and queued onceKemeng Shi1-2/+5
If the io list holds no more inodes for IO after the last wb_writeback, we may bail out early even though there are inodes in the dirty list that should be written back. Only bail out once we have queued at least once, to avoid missing dirtied inodes. This is from code reading... Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20240228091958.288260-3-shikemeng@huaweicloud.com Reviewed-by: Jan Kara <jack@suse.cz> [brauner@kernel.org: fold in memory corruption fix from Jan in [1]] Link: https://lore.kernel.org/r/20240405132346.bid7gibby3lxxhez@quack3 [1] Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-03-26fs/writeback: avoid to writeback non-expired inode in kupdate writebackKemeng Shi1-3/+10
In kupdate writeback, only expired inodes (those that have been dirty for longer than dirty_expire_interval) are supposed to be written back. However, kupdate writeback also writes back non-expired inodes left in b_io or b_more_io by the last wb_writeback. As a result, writeback keeps being triggered unexpectedly while we keep dirtying pages, even when dirty memory is under the threshold and the inode is not expired.

To be more specific: assume the dirty background threshold is > 1G and dirty_expire_centisecs is > 60s. When running fio -size=1G -invalidate=0 -ioengine=libaio --time_based -runtime=60... (keep dirtying), writeback keeps being triggered as follows:

  wb_workfn
    wb_do_writeback
      wb_check_background_flush
        /*
         * Wb dirty background threshold starts at 0 if device was idle and
         * grows up when bandwidth of wb is updated. So a background
         * writeback is triggered.
         */
        wb_over_bg_thresh
        /*
         * Dirtied inode will be written back and added to b_more_io list
         * after slice used up (because we keep dirtying the inode).
         */
        wb_writeback

Writeback is then triggered every dirty_writeback_centisecs as follows:

  wb_workfn
    wb_do_writeback
      wb_check_old_data_flush
        /*
         * Write back inode left in b_io and b_more_io from last wb_writeback
         * even the inode is non-expired and it will be added to b_more_io
         * again as slice will be used up (because we keep dirtying the
         * inode)
         */
        wb_writeback

Fix this by moving non-expired inodes to the dirty list instead of the more io list for kupdate writeback in requeue_inode.

Test as follows:

  /* make it easier to observe the issue */
  echo 300000 > /proc/sys/vm/dirty_expire_centisecs
  echo 100 > /proc/sys/vm/dirty_writeback_centisecs
  /* create an idle device */
  mkfs.ext4 -F /dev/vdb
  mount /dev/vdb /bdi1/
  /* run buffer write with fio */
  fio -name test -filename=/bdi1/file -size=800M -ioengine=libaio -bs=4K \
      -iodepth=1 -rw=write -direct=0 --time_based -runtime=60 -invalidate=0

Fio result before fix (run three tests):
  1360MB/s
  1329MB/s
  1455MB/s

Fio result after fix (run three tests):
  1737MB/s
  1729MB/s
  1789MB/s

Writeback for non-expired inodes is gone as expected. Observe this with the writeback_start and writeback_written trace events as follows:

  echo 1 > /sys/kernel/debug/tracing/events/writeback/writeback_start/enab
  echo 1 > /sys/kernel/debug/tracing/events/writeback/writeback_written/enable
  cat /sys/kernel/tracing/trace_pipe

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
Link: https://lore.kernel.org/r/20240228091958.288260-2-shikemeng@huaweicloud.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
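The requeue decision this commit describes can be modelled in a few lines. This is a simplified sketch, not the kernel's requeue_inode(): the `demo_*` names are hypothetical, times are plain tick counts without wraparound handling, and only the case where the writeback slice was used up while the inode is still dirty is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Which list a still-dirty inode goes back to after its slice is used up. */
enum demo_list {
	DEMO_B_DIRTY,   /* wait until the inode expires (or other writeback) */
	DEMO_B_MORE_IO, /* retry soon in the current writeback round */
};

/* An inode counts as expired once it has been dirty for the full interval. */
static bool demo_inode_expired(unsigned long dirtied_when, unsigned long now,
			       unsigned long expire_interval)
{
	assert(now >= dirtied_when); /* the model ignores tick wraparound */
	return now - dirtied_when >= expire_interval;
}

static enum demo_list demo_requeue(bool for_kupdate,
				   unsigned long dirtied_when,
				   unsigned long now,
				   unsigned long expire_interval)
{
	/* kupdate writeback should not keep chasing non-expired inodes. */
	if (for_kupdate &&
	    !demo_inode_expired(dirtied_when, now, expire_interval))
		return DEMO_B_DIRTY;
	return DEMO_B_MORE_IO;
}
```

In this model an inode that keeps being redirtied no longer cycles through b_more_io on every kupdate pass, which matches the behaviour the fio numbers above are meant to demonstrate.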
2024-03-26fs: Add kernel-doc comments to proc_create_net_data_write()Yang Li1-0/+1
This commit adds kernel-doc style comments with complete parameter descriptions for the function proc_create_net_data_write. Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20240315073805.77463-1-yang.lee@linux.alibaba.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-03-26fs_parser: move fsparam_string_empty() helper into headerLuis Henriques (SUSE)2-8/+0
Since both ext4 and overlayfs define the same macro to specify string parameters that may allow empty values, define it in a header file so that this helper can be shared. Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> Link: https://lore.kernel.org/r/20240312104757.27333-1-luis.henriques@linux.dev Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-03-26statx: stx_subvolKent Overstreet3-0/+7
Add a new statx field for (sub)volume identifiers, as implemented by btrfs and bcachefs. This includes bcachefs support; we'll definitely want btrfs support as well. Link: https://lore.kernel.org/linux-fsdevel/2uvhm6gweyl7iyyp2xpfryvcu2g3padagaeqcbiavjyiis6prl@yjm725bizncq/ Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Link: https://lore.kernel.org/r/20240308022914.196982-1-kent.overstreet@linux.dev Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-03-22Merge tag 'ceph-for-6.9-rc1' of https://github.com/ceph/ceph-clientLinus Torvalds2-13/+26
Pull ceph updates from Ilya Dryomov: "A patch to minimize blockage when processing very large batches of dirty caps and two fixes to better handle EOF in the face of multiple clients performing reads and size-extending writes at the same time" * tag 'ceph-for-6.9-rc1' of https://github.com/ceph/ceph-client: ceph: set correct cap mask for getattr request for read ceph: stop copying to iter at EOF on sync reads ceph: remove SLAB_MEM_SPREAD flag usage ceph: break the check delayed cap loop every 5s
2024-03-22Merge tag 'xfs-6.9-merge-9' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds3-9/+22
Pull xfs fixes from Chandan Babu: - Fix invalid pointer dereference by initializing xmbuf before tracepoint function is invoked - Use memalloc_nofs_save() when inserting into quota radix tree * tag 'xfs-6.9-merge-9' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: quota radix tree allocations need to be NOFS on insert xfs: fix dev_t usage in xmbuf tracepoints
2024-03-22Merge tag '6.9-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds11-39/+54
Pull smb client fixes from Steve French: - Various get_inode_info_fixes - Fix for querying xattrs of cached dirs - Four minor cleanup fixes (including adding some header corrections and a missing flag) - Performance improvement for deferred close - Two query interface fixes * tag '6.9-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: smb311: additional compression flag defined in updated protocol spec smb311: correct incorrect offset field in compression header cifs: Move some extern decls from .c files to .h cifs: remove redundant variable assignment cifs: fixes for get_inode_info cifs: open_cached_dir(): add FILE_READ_EA to desired access cifs: reduce warning log level for server not advertising interfaces cifs: make sure server interfaces are requested only for SMB3+ cifs: defer close file handles having RH lease
2024-03-22Merge tag 'ubifs-for-linus-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifsLinus Torvalds11-274/+428
Pull UBI and UBIFS updates from Richard Weinberger: "UBI: - Add Zhihao Cheng as reviewer - Attach via device tree - Add NVMEM layer - Various fastmap related fixes UBIFS: - Add Zhihao Cheng as reviewer - Convert to folios - Various fixes (memory leaks in error paths, function prototypes)" * tag 'ubifs-for-linus-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: (34 commits) mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems mtd: ubi: provide NVMEM layer over UBI volumes mtd: ubi: populate ubi volume fwnode mtd: ubi: introduce pre-removal notification for UBI volumes mtd: ubi: attach from device tree mtd: ubi: block: use notifier to create ubiblock from parameter dt-bindings: mtd: ubi-volume: allow UBI volumes to provide NVMEM dt-bindings: mtd: add basic bindings for UBI ubifs: Queue up space reservation tasks if retrying many times ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path ubifs: dbg_check_idx_size: Fix kmemleak if loading znode failed ubi: Correct the number of PEBs after a volume resize failure ubi: fix slab-out-of-bounds in ubi_eba_get_ldesc+0xfb/0x130 ubi: correct the calculation of fastmap size ubifs: Remove unreachable code in dbg_check_ltab_lnum ubifs: fix function pointer cast warnings ubifs: fix sort function prototype ubi: Check for too small LEB size in VTBL code MAINTAINERS: Add Zhihao Cheng as UBI/UBIFS reviewer ubifs: Convert populate_page() to take a folio ...
2024-03-21Merge tag 'driver-core-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-coreLinus Torvalds5-33/+88
Pull driver core updates from Greg KH: "Here is the "big" set of driver core and kernfs changes for 6.9-rc1. Nothing all that crazy here, just some good updates that include: - automatic attribute group hiding from Dan Williams (he fixed up my horrible attempt at doing this.) - kobject lock contention fixes from Eric Dumazet - driver core cleanups from Andy - kernfs rcu work from Tejun - fw_devlink changes to resolve some reported issues - other minor changes, all details in the shortlog All of these have been in linux-next for a long time with no reported issues" * tag 'driver-core-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (28 commits) device: core: Log warning for devices pending deferred probe on timeout driver: core: Use dev_* instead of pr_* so device metadata is added driver: core: Log probe failure as error and with device metadata of: property: fw_devlink: Add support for "post-init-providers" property driver core: Add FWLINK_FLAG_IGNORE to completely ignore a fwnode link driver core: Adds flags param to fwnode_link_add() debugfs: fix wait/cancellation handling during remove device property: Don't use "proxy" headers device property: Move enum dev_dma_attr to fwnode.h driver core: Move fw_devlink stuff to where it belongs driver core: Drop unneeded 'extern' keyword in fwnode.h firmware_loader: Suppress warning on FW_OPT_NO_WARN flag sysfs:Addresses documentation in sysfs_merge_group and sysfs_unmerge_group. firmware_loader: introduce __free() cleanup hanler platform-msi: Remove usage of the deprecated ida_simple_xx() API sysfs: Introduce DEFINE_SIMPLE_SYSFS_GROUP_VISIBLE() sysfs: Document new "group visible" helpers sysfs: Fix crash on empty group attributes array sysfs: Introduce a mechanism to hide static attribute_groups sysfs: Introduce a mechanism to hide static attribute_groups ...
2024-03-21Merge tag 'for-6.9-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linuxLinus Torvalds1-11/+47
Pull btrfs fix from David Sterba: "Fix a problem found in 6.7 after adding the temp-fsid feature which changed device tracking in memory and broke grub-probe. This is used on initrd-less systems. There were several iterations of the fix and it took longer than expected" * tag 'for-6.9-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: do not skip re-registration for the mounted device
2024-03-21Merge tag 'exfat-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfatLinus Torvalds4-376/+293
Pull exfat updates from Namjae Jeon: - Improve dirsync performance by syncing on a dentry-set rather than on a per-directory entry * tag 'exfat-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: remove duplicate update parent dir exfat: do not sync parent dir if just update timestamp exfat: remove unused functions exfat: convert exfat_find_empty_entry() to use dentry cache exfat: convert exfat_init_ext_entry() to use dentry cache exfat: move free cluster out of exfat_init_ext_entry() exfat: convert exfat_remove_entries() to use dentry cache exfat: convert exfat_add_entry() to use dentry cache exfat: add exfat_get_empty_dentry_set() helper exfat: add __exfat_get_dentry_set() helper
2024-03-21Merge tag 'v6.9-rc-smb3-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds15-147/+716
Pull smb server updates from Steve French: - add support for durable file handles (an important data integrity feature) - fixes for potential out of bounds issues - fix possible null dereference in close - getattr fixes - trivial typo fix and minor cleanup * tag 'v6.9-rc-smb3-server-fixes' of git://git.samba.org/ksmbd: ksmbd: remove module version ksmbd: fix potencial out-of-bounds when buffer offset is invalid ksmbd: fix slab-out-of-bounds in smb_strndup_from_utf16() ksmbd: Fix spelling mistake "connction" -> "connection" ksmbd: fix possible null-deref in smb_lazy_parent_lease_break_close ksmbd: add support for durable handles v1/v2 ksmbd: mark SMB2_SESSION_EXPIRED to session when destroying previous session ksmbd: retrieve number of blocks using vfs_getattr in set_file_allocation_info ksmbd: replace generic_fillattr with vfs_getattr
2024-03-20smb311: additional compression flag defined in updated protocol specSteve French1-4/+6
Add a new compression flag that was recently documented. In addition, fix some typos and clarify the sid_attr_data struct definition. Reviewed-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-20smb311: correct incorrect offset field in compression headerSteve French1-1/+1
The offset field in the compression header is 32 bits not 16. Reviewed-by: Bharath SM <bharathsm@microsoft.com> Reported-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-20cifs: Move some extern decls from .c files to .hDavid Howells4-10/+2
Move the following: extern mempool_t *cifs_sm_req_poolp; extern mempool_t *cifs_req_poolp; extern mempool_t *cifs_mid_poolp; extern bool disable_legacy_dialects; from various .c files to cifsglob.h. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-20Merge tag 'bcachefs-2024-03-19' of https://evilpiepirate.org/git/bcachefsLinus Torvalds26-111/+157
Pull bcachefs fixes from Kent Overstreet: "Assorted bugfixes. Most are fixes for simple assertion pops; the most significant fix is for a deadlock in recovery when we have to rewrite large numbers of btree nodes to fix errors. This was incorrectly running out of the same workqueue as the core interior btree update path - we now give it its own single threaded workqueue. This was visible to users as "bch2_btree_update_start(): error: BCH_ERR_journal_reclaim_would_deadlock" - and then recovery hanging" * tag 'bcachefs-2024-03-19' of https://evilpiepirate.org/git/bcachefs: bcachefs: Fix lost wakeup on journal shutdown bcachefs; Fix deadlock in bch2_btree_update_start() bcachefs: ratelimit errors from async_btree_node_rewrite bcachefs: Run check_topology() first bcachefs: Improve bch2_fatal_error() bcachefs: Fix lost transaction restart error bcachefs: Don't corrupt journal keys gap buffer when dropping alloc info bcachefs: fix for building in userspace bcachefs: bch2_snapshot_is_ancestor() now safe to call in early recovery bcachefs: Fix nested transaction restart handling in bch2_bucket_gens_init() bcachefs: Improve sysfs internal/btree_updates bcachefs: Split out btree_node_rewrite_worker bcachefs: Fix locking in bch2_alloc_write_key() bcachefs: Avoid extent entry type assertions in .invalid() bcachefs: Fix spurious -BCH_ERR_transaction_restart_nested bcachefs: Fix check_key_has_snapshot() call bcachefs: Change "accounting overran journal reservation" to a warning
2024-03-19ceph: set correct cap mask for getattr request for readXiubo Li1-3/+5
In case of hitting the file EOF, ceph_read_iter() needs to retrieve the file size from MDS, and Fr caps aren't necessary. [ idryomov: fold into existing retry_op == READ_INLINE branch ] Reported-by: Frank Hsiao <frankhsiao@qnap.com> Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Tested-by: Frank Hsiao <frankhsiao@qnap.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-03-19ceph: stop copying to iter at EOF on sync readsXiubo Li1-10/+13
If EOF is encountered, ceph_sync_read() return value is adjusted down according to i_size, but the "to" iter is advanced by the actual number of bytes read. Then, when retrying, the remainder of the range may be skipped incorrectly. Ensure that the "to" iter is advanced only until EOF. [ idryomov: changelog ] Fixes: c3d8e0b5de48 ("ceph: return the real size read when it hits EOF") Reported-by: Frank Hsiao <frankhsiao@qnap.com> Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Tested-by: Frank Hsiao <frankhsiao@qnap.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
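The fix described here amounts to advancing the destination only over the bytes that lie before end-of-file. A minimal sketch follows, with a hypothetical `struct demo_iter` standing in for the real iterator and `demo_advance_until_eof()` as an invented helper; it is not the ceph code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the position of the "to" iterator. */
struct demo_iter {
	size_t pos;
};

/*
 * A read at file offset `off` returned `got` bytes, but the file ends at
 * `i_size`. Advance the iterator only over the bytes before EOF and return
 * that count, so that a retry does not skip part of the remaining range.
 */
static size_t demo_advance_until_eof(struct demo_iter *to,
				     unsigned long long off, size_t got,
				     unsigned long long i_size)
{
	size_t n = 0;

	assert(to != NULL);
	if (off < i_size)
		n = (got > i_size - off) ? (size_t)(i_size - off) : got;
	to->pos += n;
	return n;
}
```

The returned count and the iterator position stay in step, which is the property the commit message says was violated before the fix.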
2024-03-19exfat: remove duplicate update parent dirYuezhang Mo1-1/+2
When renaming within the same directory, the parent directory only needs to be updated once. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: do not sync parent dir if just update timestampYuezhang Mo1-11/+8
When sync or dir_sync is enabled, there is no need to sync the parent directory's inode if only for updating its timestamp. 1. If an unexpected power failure occurs, the timestamp of the parent directory is not updated to the storage, which has no impact on the user. 2. The number of writes will be greatly reduced, which can not only improve performance, but also prolong device life. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: remove unused functionsYuezhang Mo3-64/+4
exfat_count_ext_entries() is no longer called, remove it. exfat_update_dir_chksum() is no longer called, remove it and rename exfat_update_dir_chksum_with_entry_set() to it. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: convert exfat_find_empty_entry() to use dentry cacheYuezhang Mo1-84/+42
Before this conversion, each dentry traversed needs to be read from the storage device or page cache. There are at least 16 dentries in a sector. This will result in frequent page cache searches. After this conversion, if all directory entries in a sector are used, the sector only needs to be read once. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
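The "at least 16 dentries in a sector" figure follows from the 32-byte size of an exFAT directory entry and the smallest sector size of 512 bytes. A short sketch of that arithmetic, using hypothetical `demo_*` names:

```c
#include <assert.h>

/* exFAT directory entries are 32 bytes each. */
#define DEMO_EXFAT_DENTRY_SIZE 32u

/* With the smallest sector size (512 bytes) a sector holds 16 entries. */
static_assert(512u / DEMO_EXFAT_DENTRY_SIZE == 16u,
	      "a 512-byte sector holds 16 directory entries");

/* Number of directory entries that fit in one sector of the given size. */
static unsigned int demo_dentries_per_sector(unsigned int sector_size)
{
	return sector_size / DEMO_EXFAT_DENTRY_SIZE;
}
```

Larger sectors hold proportionally more entries, so reading a sector once and walking its entries in memory saves at least 15 repeated lookups per sector.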
2024-03-19exfat: convert exfat_init_ext_entry() to use dentry cacheYuezhang Mo3-77/+33
Before this conversion, when exfat_init_ext_entry() initialized the dentries in a dentry set, the number of syncs equaled the number of dentries if 'dirsync' or 'sync' was enabled. That affects not only performance but also device life. After this conversion, the dentry set only needs to be synchronized once if 'dirsync' or 'sync' is enabled. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: move free cluster out of exfat_init_ext_entry()Yuezhang Mo2-5/+3
exfat_init_ext_entry() is an init function, it's a bit strange to free cluster in it. And the argument 'inode' will be removed from exfat_init_ext_entry(). So this commit changes to free the cluster in exfat_remove_entries(). Code refinement, no functional changes. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: convert exfat_remove_entries() to use dentry cacheYuezhang Mo3-115/+90
Before this conversion, when exfat_remove_entries() marked the dentries in a dentry set as deleted, the number of syncs equaled the number of dentries if 'dirsync' or 'sync' was enabled. That affects not only performance but also device life. After this conversion, the dentry set only needs to be synchronized once if 'dirsync' or 'sync' is enabled. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: convert exfat_add_entry() to use dentry cacheYuezhang Mo3-33/+22
After this conversion, if "dirsync" or "sync" is enabled, the number of synchronized dentries in exfat_add_entry() will change from 2 to 1. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: add exfat_get_empty_dentry_set() helperYuezhang Mo2-0/+82
This helper is used to look up an empty dentry set. If there are not enough empty dentries at the input location, this helper will return the number of dentries that need to be skipped for the next lookup. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19exfat: add __exfat_get_dentry_set() helperYuezhang Mo2-22/+43
Since exfat_get_dentry_set() invokes the validate functions of exfat_validate_entry(), it only supports getting a directory entry set of an existing file, doesn't support getting an empty entry set. To remove the limitation, add this helper. Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com> Reviewed-by: Andy Wu <Andy.Wu@sony.com> Reviewed-by: Aoyama Wataru <wataru.aoyama@sony.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
2024-03-19bcachefs: Fix lost wakeup on journal shutdownKent Overstreet1-6/+6
We need to check for journal shutdown first in __journal_res_get() - after the journal is shut down, j->watermark won't be changing anymore. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-19bcachefs: Fix deadlock in bch2_btree_update_start()Kent Overstreet1-4/+9
BCH_TRANS_COMMIT_journal_reclaim with watermark != BCH_WATERMARK_reclaim means nonblocking, and we need the journal_res_get() in btree_update_start() to respect that. In a future refactoring we'll be deleting BCH_TRANS_COMMIT_journal_reclaim and replacing it with an explicit BCH_TRANS_COMMIT_nonblocking. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-19ksmbd: remove module versionNamjae Jeon2-3/+0
ksmbd module version marking is not needed. Since there is a Linux kernel version, there is no point in increasing it anymore. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-19ksmbd: fix potential out-of-bounds when buffer offset is invalidNamjae Jeon2-29/+42
I found a potential out-of-bounds access when the buffer offset fields of a few requests are invalid. This patch sets the minimum value of the buffer offset field to the ->Buffer offset in order to validate the buffer length. Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
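The kind of check described above can be sketched in userspace C. This is not ksmbd code: struct demo_req, its field names, and demo_buffer_ok() are hypothetical, and real SMB2 requests have different layouts. The sketch only shows the idea of rejecting an offset that points before the payload area and a length that runs past the end of the request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request layout: fixed header fields followed by a payload. */
struct demo_req {
    uint16_t buf_offset;  /* payload offset from the start of the request */
    uint16_t buf_length;  /* payload length in bytes */
    uint8_t  Buffer[];    /* payload area begins here */
};

/* Return true only if [buf_offset, buf_offset + buf_length) is in bounds. */
static bool demo_buffer_ok(const struct demo_req *req, size_t req_size)
{
    assert(req != NULL);

    size_t off = req->buf_offset;
    size_t len = req->buf_length;

    /* The payload may not start inside the fixed header. */
    if (off < offsetof(struct demo_req, Buffer))
        return false;
    /* The payload must end within the received request. */
    if (off > req_size || len > req_size - off)
        return false;
    return true;
}
```

Comparing len against req_size - off, rather than testing off + len, keeps the check free of integer overflow.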
2024-03-19Merge tag 'dlm-6.9' of ↵Linus Torvalds3-39/+81
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm

Pull dlm updates from David Teigland:

 - Fix mistaken variable assignment that caused a refcounting problem
 - Revert a recent change that began using atomic counters where they were not needed (for lkb wait_count)
 - Add comments around forced state reset for waiting lock operations during recovery

* tag 'dlm-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
  dlm: add comments about forced waiters reset
  dlm: revert atomic_t lkb_wait_count
  dlm: fix user space lkb refcounting
2024-03-19Merge tag 'trace-v6.9-2' of ↵Linus Torvalds3-19/+63
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

"Main user visible change:

 - User events can now have "multi formats"

   The current user events have a single format. If another event is created with a different format, it will fail to be created. That is, once an event name is used, it cannot be used again with a different format. This can cause issues if a library is using an event and updates its format. An application using the older format will prevent an application using the new library from registering its event.

   A task could also DoS another application if it knows the event names, and it creates events with different formats.

   The multi-format event is in a different name space from the single format. Both the event name and its format are the unique identifier. This will allow two different applications to use the same user event name but with different payloads.

 - Added support to have ftrace_dump_on_oops dump out instances and not just the main top level tracing buffer.

Other changes:

 - Add eventfs_root_inode

   Only the root inode has a dentry that is static (never goes away) and stores it upon creation. There's no reason that the thousands of other eventfs inodes should have a pointer that never gets set in its descriptor. Create an eventfs_root_inode descriptor that has an eventfs_inode descriptor and a dentry pointer, and only the root inode will use this.

 - Added WARN_ON()s in eventfs

   There are some conditionals remaining in eventfs that should never be hit, but instead of removing them, add WARN_ON() around them to make sure that they are never hit.

 - Have saved_cmdlines allocation also include the map_cmdline_to_pid array

   The saved_cmdlines structure allocates a large amount of data to hold its mappings. Within it, it has three arrays. Two are already a part of it: map_pid_to_cmdline[] and saved_cmdlines[]. More memory can be saved by also including the map_cmdline_to_pid[] array.

 - Restructure __string() and __assign_str() macros used in TRACE_EVENT()

   Dynamic strings in TRACE_EVENT() are declared with:

      __string(name, source)

   And assigned with:

      __assign_str(name, source)

   In the tracepoint callback of the event, the __string() is used to get the size needed to allocate on the ring buffer and __assign_str() is used to copy the string into the ring buffer. There's a helper structure that is created in the TRACE_EVENT() macro logic that will hold the string length and its position in the ring buffer which is created by __string().

   There are several trace events that have a function to create the string to save. This function is executed twice. Once for __string() and again for __assign_str(). There's no reason for this. The helper structure could also save the string it used in __string() and simply copy that into __assign_str() (it also already has its length).

   By using the structure to store the source string for the assignment, it means that the second argument to __assign_str() is no longer needed. It will be removed in the next merge window, but for now add a warning if the source string given to __string() is different than the source string given to __assign_str(), as the source to __assign_str() isn't even used and will be going away.

 - Added checks to make sure that the source of __string() is also the source of __assign_str() so that it can be safely removed in the next merge window. Included fixes that the above check found.

 - Other minor clean ups and fixes"

* tag 'trace-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (34 commits)
  tracing: Add __string_src() helper to help compilers not to get confused
  tracing: Use strcmp() in __assign_str() WARN_ON() check
  tracepoints: Use WARN() and not WARN_ON() for warnings
  tracing: Use div64_u64() instead of do_div()
  tracing: Support to dump instance traces by ftrace_dump_on_oops
  tracing: Remove second parameter to __assign_rel_str()
  tracing: Add warning if string in __assign_str() does not match __string()
  tracing: Add __string_len() example
  tracing: Remove __assign_str_len()
  ftrace: Fix most kernel-doc warnings
  tracing: Decrement the snapshot if the snapshot trigger fails to register
  tracing: Fix snapshot counter going between two tracers that use it
  tracing: Use EVENT_NULL_STR macro instead of open coding "(null)"
  tracing: Use ? : shortcut in trace macros
  tracing: Do not calculate strlen() twice for __string() fields
  tracing: Rework __assign_str() and __string() to not duplicate getting the string
  cxl/trace: Properly initialize cxl_poison region name
  net: hns3: tracing: fix hclgevf trace event strings
  drm/i915: Add missing ; to __assign_str() macros in tracepoint code
  NFSD: Fix nfsd_clid_class use of __string_len() macro
  ...
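The __string()/__assign_str() restructuring in the tracing pull above comes down to evaluating the source string once, remembering it together with its length, and reusing both at copy time. The userspace sketch below models only that idea. It is not the TRACE_EVENT() macro machinery: struct str_field, field_measure() and field_assign() are hypothetical names, and the "(null)" fallback is an assumption borrowed from the EVENT_NULL_STR entry in the shortlog.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-field helper structure. */
struct str_field {
    const char *src;  /* source string captured at sizing time */
    size_t len;       /* bytes to reserve, including the trailing NUL */
};

/* Sizing step: look at the source once, remember pointer and length. */
static size_t field_measure(struct str_field *f, const char *src)
{
    assert(f != NULL);

    f->src = src ? src : "(null)";
    f->len = strlen(f->src) + 1;
    return f->len;
}

/* Copy step: reuse the saved pointer, so no second source argument. */
static void field_assign(const struct str_field *f, char *dst)
{
    assert(f != NULL && dst != NULL);

    memcpy(dst, f->src, f->len);
}
```

Because field_assign() takes no source argument, a function that builds the string only has to run once, during field_measure().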
2024-03-19ceph: remove SLAB_MEM_SPREAD flag usageChengming Zhou1-9/+9
The SLAB_MEM_SPREAD flag used to be implemented in SLAB, which was removed as of v6.8-rc1, so it has been a dead flag since commit 16a1d968358a ("mm/slab: remove mm/slab.c and slab_def.h"). The series [1] went on to mark it obsolete to avoid confusing users. Here we can just remove all its users, which causes no functional change. [1] https://lore.kernel.org/all/20240223-slab-cleanup-flags-v2-1-02f1753e8303@suse.cz/ Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-03-19ceph: break the check delayed cap loop every 5sXiubo Li1-0/+8
In some cases this may take a long time and will block renewing the caps to MDS. [ idryomov: massage comment ] Link: https://tracker.ceph.com/issues/50223#note-21 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
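The general pattern of breaking out of a long-running loop after a fixed time budget, so that other periodic work is not starved, can be sketched as follows. This is not the ceph code: process_with_budget(), the clock_fn callback, and the demo_clock() stand-in (which advances one second per call to keep the example deterministic) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Clock callback returning the current time in whole seconds. */
typedef long (*clock_fn)(void);

/* Deterministic stand-in clock for demonstration: advances 1s per call. */
static long demo_now;
static long demo_clock(void)
{
    return demo_now++;
}

/*
 * Handle up to `pending` items, but stop once `budget` seconds have
 * elapsed so the caller can do other periodic work and come back later.
 * Returns the number of items handled in this pass.
 */
static size_t process_with_budget(size_t pending, long budget, clock_fn now)
{
    assert(now != NULL);

    long start = now();
    size_t done = 0;

    while (done < pending) {
        done++;                       /* stand-in for handling one item */
        if (now() - start >= budget)  /* time budget used up: bail out */
            break;
    }
    return done;
}
```

With a 5 second budget and the one-second-per-call demo clock, a pass over 100 pending items stops after 5 of them, leaving the rest for a later pass.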
2024-03-18Merge tag 'for-linus-6.9-ofs1' of ↵Linus Torvalds3-13/+3
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux

Pull orangefs updates from Mike Marshall:

"One fix, one cleanup...

 Fix: Julia Lawall pointed out a null pointer dereference.

 Cleanup: Vlastimil Babka sent me a patch to remove some SLAB related code"

* tag 'for-linus-6.9-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
  Julia Lawall reported this null pointer dereference, this should fix it.
  fs/orangefs: remove ORANGEFS_CACHE_CREATE_FLAGS
2024-03-18Merge tag 'f2fs-for-6.9-rc1' of ↵Linus Torvalds19-824/+999
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs update from Jaegeuk Kim:

"In this round, there are a number of updates in mainly two areas: zoned block device support and per-file compression.

 For example, we've found several issues in supporting zoned block devices, especially those with large sections, regarding GC and file pinning used for Android devices. On the compression side, we've fixed many corner-case race conditions that had broken the design assumption.

 Enhancements:
  - Support file pinning for zoned block devices with large sections
  - Enhance the data recovery after sudden power cut on zoned block devices
  - Add more error injection cases to easily detect kernel panics
  - Add a proc entry to show the entire disk layout
  - Improve various error paths that panicked via BUG_ON in block allocation and GC
  - Support SEEK_DATA and SEEK_HOLE for compressed files

 Bug fixes:
  - avoid use-after-free issue in f2fs_filemap_fault
  - fix some race conditions that broke the atomic write design assumption
  - fix to forcibly truncate meta inode pages
  - resolve various per-file compression issues wrt the space management and compression policies
  - fix some swap-related bugs

 In addition, we removed deprecated code such as io_bits and heap_allocation, and also fixed minor error handling routines with neat debugging messages"

* tag 'f2fs-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (60 commits)
  f2fs: fix to avoid use-after-free issue in f2fs_filemap_fault
  f2fs: truncate page cache before clearing flags when aborting atomic write
  f2fs: mark inode dirty for FI_ATOMIC_COMMITTED flag
  f2fs: prevent atomic write on pinned file
  f2fs: fix to handle error paths of {new,change}_curseg()
  f2fs: unify the error handling of f2fs_is_valid_blkaddr
  f2fs: zone: fix to remove pow2 check condition for zoned block device
  f2fs: fix to truncate meta inode pages forcely
  f2fs: compress: fix reserve_cblocks counting error when out of space
  f2fs: compress: relocate some judgments in f2fs_reserve_compress_blocks
  f2fs: add a proc entry show disk layout
  f2fs: introduce SEGS_TO_BLKS/BLKS_TO_SEGS for cleanup
  f2fs: fix to check return value of f2fs_gc_range
  f2fs: fix to check return value __allocate_new_segment
  f2fs: fix to do sanity check in update_sit_entry
  f2fs: fix to reset fields for unloaded curseg
  f2fs: clean up new_curseg()
  f2fs: relocate f2fs_precache_extents() in f2fs_swap_activate()
  f2fs: fix blkofs_end correctly in f2fs_migrate_blocks()
  f2fs: ro: don't start discard thread for readonly image
  ...