summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2024-02-01Merge tag 'exfat-for-6.8-rc3' of ↵Linus Torvalds1-4/+3
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat fix from Namjae Jeon: - Fix BUG in iov_iter_revert reported from syzbot * tag 'exfat-for-6.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: fix zero the unwritten part for dio read
2024-01-31Merge tag 'erofs-for-6.8-rc3-fixes' of ↵Linus Torvalds8-63/+87
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fixes from Gao Xiang: - fix an infinite loop issue of sub-page compressed data support found with lengthy stress tests on a 64k-page arm64 VM - optimize the temporary buffer allocation for low-memory scenarios, which can reduce 20.21% on average under a heavy multi-app launch benchmark workload - get rid of unnecessary GFP_NOFS * tag 'erofs-for-6.8-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: relaxed temporary buffers allocation on readahead erofs: fix infinite loop due to a race of filling compressed_bvecs erofs: get rid of unneeded GFP_NOFS
2024-01-30Merge tag 'jfs-6.8-rc3' of github.com:kleikamp/linux-shaggyLinus Torvalds1-7/+1
Pull jfs fix from David Kleikamp: "Revert a bad sanity check" * tag 'jfs-6.8-rc3' of github.com:kleikamp/linux-shaggy: Revert "jfs: fix shift-out-of-bounds in dbJoin"
2024-01-30Merge tag 'trace-v6.8-rc1-2' of ↵Linus Torvalds2-39/+0
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: "Two small fixes for tracefs and eventfs: - Fix register_snapshot_trigger() on allocation error If the snapshot fails to allocate, the register_snapshot_trigger() can still return success. If the call to tracing_alloc_snapshot_instance() returned anything but 0, it returned 0, but it should have been returning the error code from that allocation function. - Remove leftover code from tracefs doing a dentry walk on remount. The update_gid() function was called by the tracefs code on remount to update the gid of eventfs, but that is no longer the case, but that code wasn't deleted. Nothing calls it. Remove it" * tag 'trace-v6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracefs: remove stale 'update_gid' code tracing/trigger: Fix to return error if failed to alloc snapshot
2024-01-30Merge tag 'mm-hotfixes-stable-2024-01-28-23-21' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "22 hotfixes. 11 are cc:stable and the remainder address post-6.7 issues or aren't considered appropriate for backporting" * tag 'mm-hotfixes-stable-2024-01-28-23-21' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits) mm: thp_get_unmapped_area must honour topdown preference mm: huge_memory: don't force huge page alignment on 32 bit userfaultfd: fix mmap_changing checking in mfill_atomic_hugetlb selftests/mm: ksm_tests should only MADV_HUGEPAGE valid memory scs: add CONFIG_MMU dependency for vfree_atomic() mm/memory: fix folio_set_dirty() vs. folio_mark_dirty() in zap_pte_range() mm/huge_memory: fix folio_set_dirty() vs. folio_mark_dirty() selftests/mm: Update va_high_addr_switch.sh to check CPU for la57 flag selftests: mm: fix map_hugetlb failure on 64K page size systems MAINTAINERS: supplement of zswap maintainers update stackdepot: make fast paths lock-less again stackdepot: add stats counters exported via debugfs mm, kmsan: fix infinite recursion due to RCU critical section mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again selftests/mm: switch to bash from sh MAINTAINERS: add man-pages git trees mm: memcontrol: don't throttle dying tasks on memory.high mm: mmap: map MAP_STACK to VM_NOHUGEPAGE uprobes: use pagesize-aligned virtual address when replacing pages selftests/mm: mremap_test: fix build warning ...
2024-01-29Revert "jfs: fix shift-out-of-bounds in dbJoin"Dave Kleikamp1-7/+1
This reverts commit cca974daeb6c43ea971f8ceff5a7080d7d49ee30. The added sanity check is incorrect. BUDMIN is not the wrong value and is too small. Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2024-01-28tracefs: remove stale 'update_gid' codeLinus Torvalds2-39/+0
The 'eventfs_update_gid()' function is no longer called, so remove it (and the helper function it uses). Link: https://lore.kernel.org/all/CAHk-=wj+DsZZ=2iTUkJ-Nojs9fjYMvPs1NuoM3yK7aTDtJfPYQ@mail.gmail.com/ Fixes: 8186fff7ab64 ("tracefs/eventfs: Use root and instance inodes as default ownership") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-01-27Merge tag 'xfs-6.8-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds1-10/+17
Pull xfs fix from Chandan Babu: - Fix read only mounts when using fsopen mount API * tag 'xfs-6.8-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: read only mounts with fsopen mount API are busted
2024-01-27Merge tag 'bcachefs-2024-01-26' of https://evilpiepirate.org/git/bcachefsLinus Torvalds10-32/+42
Pull bcachefs fixes from Kent Overstreet: - fix for REQ_OP_FLUSH usage; this fixes filesystems going read only with -EOPNOTSUPP from the block layer. (this really should have gone in with the block layer patch causing the -EOPNOTSUPP, or should have gone in before). - fix an allocation in non-sleepable context - fix one source of srcu lock latency, on devices with terrible discard latency - fix a reattach_inode() issue in fsck * tag 'bcachefs-2024-01-26' of https://evilpiepirate.org/git/bcachefs: bcachefs: __lookup_dirent() works in snapshot, not subvol bcachefs: discard path uses unlock_long() bcachefs: fix incorrect usage of REQ_OP_FLUSH bcachefs: Add gfp flags param to bch2_prt_task_backtrace()
2024-01-27Merge tag '6.8-rc2-smb3-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds3-3/+6
Pull smb server fixes from Steve French: - Fix netlink OOB - Minor kernel doc fix * tag '6.8-rc2-smb3-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix global oob in ksmbd_nl_policy smb: Fix some kernel-doc comments
2024-01-27Merge tag '6.8-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds13-74/+467
Pull smb client fixes from Steve French: "Nine cifs/smb client fixes - Four network error fixes (three relating to replays of requests that need to be retried, and one fixing some places where we were returning the wrong rc up the stack on network errors) - Two multichannel fixes including locking fix and case where subset of channels need reconnect - netfs integration fixup: share remote i_size with netfslib - Two small cleanups (one for addressing a clang warning)" * tag '6.8-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix stray unlock in cifs_chan_skip_or_disable cifs: set replay flag for retries of write command cifs: commands that are retried should have replay flag set cifs: helper function to check replayable error codes cifs: translate network errors on send to -ECONNABORTED cifs: cifs_pick_channel should try selecting active channels cifs: Share server EOF pos with netfslib smb: Work around Clang __bdos() type confusion smb: client: delete "true", "false" defines
2024-01-27erofs: relaxed temporary buffers allocation on readaheadChunhai Guo5-20/+42
Even with inplace decompression, sometimes very few temporary buffers may be still needed for a single decompression shot (e.g. 16 pages for 64k sliding window or 4 pages for 16k sliding window). In low-memory scenarios, it would be better to try to allocate with GFP_NOWAIT on readahead first. That can help reduce the time spent on page allocation under durative memory pressure. Here are detailed performance numbers under multi-app launch benchmark workload [1] on ARM64 Android devices (8-core CPU and 8GB of memory) running a 5.15 LTS kernel with EROFS of 4k pclusters: +----------------------------------------------+ | LZ4 | vanilla | patched | diff | |----------------+---------+---------+---------| | Average (ms) | 3364 | 2684 | -20.21% | [64k sliding window] |----------------+---------+---------+---------| | Average (ms) | 2079 | 1610 | -22.56% | [16k sliding window] +----------------------------------------------+ The total size of system images for 4k pclusters is almost unchanged: (64k sliding window) 9,117,044 KB (16k sliding window) 9,113,096 KB Therefore, in addition to switch the sliding window from 64k to 16k, after applying this patch, it can eventually save 52.14% (3364 -> 1610) on average with no memory reservation. That is particularly useful for embedded devices with limited resources. [1] https://lore.kernel.org/r/20240109074143.4138783-1-guochunhai@vivo.com Suggested-by: Gao Xiang <xiang@kernel.org> Signed-off-by: Chunhai Guo <guochunhai@vivo.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Link: https://lore.kernel.org/r/20240126140142.201718-1-hsiangkao@linux.alibaba.com
2024-01-26erofs: fix infinite loop due to a race of filling compressed_bvecsGao Xiang1-36/+38
I encountered a race issue after lengthy (~594647 secs) stress tests on a 64k-page arm64 VM with several 4k-block EROFS images. The timing is like below: z_erofs_try_inplace_io z_erofs_fill_bio_vec cmpxchg(&compressed_bvecs[].page, NULL, ..) [access bufvec] compressed_bvecs[] = *bvec; Previously, z_erofs_submit_queue() just accessed bufvec->page only, so other fields in bufvec didn't matter. After the subpage block support is landed, .offset and .end can be used too, but filling bufvec isn't an atomic operation which can cause inconsistency. Let's use a spinlock to keep the atomicity of each bufvec. More specifically, just reuse the existing spinlock `pcl->obj.lockref.lock` since it's rarely used (also it takes a short time if even used) as long as the pcluster has a reference. Fixes: 192351616a9d ("erofs: support I/O submission for sub-page compressed blocks") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Yue Hu <huyue2@coolpad.com> Reviewed-by: Sandeep Dhavale <dhavale@google.com> Link: https://lore.kernel.org/r/20240125120039.3228103-1-hsiangkao@linux.alibaba.com
2024-01-26fs/hugetlbfs/inode.c: mm/memory-failure.c: fix hugetlbfs hwpoison handlingSidhartha Kumar1-1/+1
has_extra_refcount() makes the assumption that the page cache adds a ref count of 1 and subtracts this in the extra_pins case. Commit a08c7193e4f1 (mm/filemap: remove hugetlb special casing in filemap.c) modifies __filemap_add_folio() by calling folio_ref_add(folio, nr); for all cases (including hugtetlb) where nr is the number of pages in the folio. We should adjust the number of references coming from the page cache by subtracing the number of pages rather than 1. In hugetlbfs_read_iter(), folio_test_has_hwpoisoned() is testing the wrong flag as, in the hugetlb case, memory-failure code calls folio_test_set_hwpoison() to indicate poison. folio_test_hwpoison() is the correct function to test for that flag. After these fixes, the hugetlb hwpoison read selftest passes all cases. Link: https://lkml.kernel.org/r/20240112180840.367006-1-sidhartha.kumar@oracle.com Fixes: a08c7193e4f1 ("mm/filemap: remove hugetlb special casing in filemap.c") Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Closes: https://lore.kernel.org/linux-mm/20230713001833.3778937-1-jiaqiyan@google.com/T/#m8e1469119e5b831bbd05d495f96b842e4a1c5519 Reported-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Acked-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: Muchun Song <muchun.song@linux.dev> Cc: James Houghton <jthoughton@google.com> Cc: Jiaqi Yan <jiaqiyan@google.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: <stable@vger.kernel.org> [6.7+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-26bcachefs: __lookup_dirent() works in snapshot, not subvolKent Overstreet2-18/+27
Add a new helper, bch2_hash_lookup_in_snapshot(), for when we're not operating in a subvolume and already have a snapshot ID, and then use it in lookup_lostfound() -> __lookup_dirent(). This is a bugfix - lookup_lostfound() doesn't take a subvolume ID, we were passing a nonsense subvolume ID before, and don't have one to pass since we may be operating in an interior snapshot node that doesn't have a subvolume ID. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-25Merge tag 'ovl-fixes-6.8-rc2' of ↵Linus Torvalds6-48/+97
git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs Pull overlayfs fix from Amir Goldstein: "Change the on-disk format for the new "xwhiteouts" feature introduced in v6.7 The change reduces unneeded overhead of an extra getxattr per readdir. The only user of the "xwhiteout" feature is the external composefs tool, which has been updated to support the new on-disk format. This change is also designated for 6.7.y" * tag 'ovl-fixes-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs: ovl: mark xwhiteouts directory with overlay.opaque='x'
2024-01-25Merge tag 'vfs-6.8-rc2.netfs' of ↵Linus Torvalds10-40/+51
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull netfs fixes from Christian Brauner: "This contains various fixes for the netfs work merged earlier this cycle: afs: - Fix locking imbalance in afs_proc_addr_prefs_show() - Remove afs_dynroot_d_revalidate() which is redundant - Fix error handling during lookup - Hide sillyrenames from userspace. This fixes a race between silly-rename files being created/removed and userspace iterating over directory entries - Don't use unnecessary folio_*() functions cifs: - Don't use unnecessary folio_*() functions cachefiles: - erofs: Fix Null dereference when cachefiles are not doing ondemand-mode - Update mailing list netfs library: - Add Jeff Layton as reviewer - Update mailing list - Fix a error checking in netfs_perform_write() - fscache: Check error before dereferencing - Don't use unnecessary folio_*() functions" * tag 'vfs-6.8-rc2.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: afs: Fix missing/incorrect unlocking of RCU read lock afs: Remove afs_dynroot_d_revalidate() as it is redundant afs: Fix error handling with lookup via FS.InlineBulkStatus afs: Hide silly-rename files from userspace cachefiles, erofs: Fix NULL deref in when cachefiles is not doing ondemand-mode netfs: Fix a NULL vs IS_ERR() check in netfs_perform_write() netfs, fscache: Prevent Oops in fscache_put_cache() cifs: Don't use certain unnecessary folio_*() functions afs: Don't use certain unnecessary folio_*() functions netfs: Don't use certain unnecessary folio_*() functions netfs: Add Jeff Layton as reviewer netfs, cachefiles: Change mailing list
2024-01-25Merge tag 'nfsd-6.8-1' of ↵Linus Torvalds1-11/+15
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix in-kernel RPC UDP transport - Fix NFSv4.0 RELEASE_LOCKOWNER * tag 'nfsd-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: fix RELEASE_LOCKOWNER SUNRPC: use request size to initialize bio_vec in svc_udp_sendto()
2024-01-25ksmbd: fix global oob in ksmbd_nl_policyLin Ma2-3/+4
Similar to a reported issue (check the commit b33fb5b801c6 ("net: qualcomm: rmnet: fix global oob in rmnet_policy"), my local fuzzer finds another global out-of-bounds read for policy ksmbd_nl_policy. See bug trace below: ================================================================== BUG: KASAN: global-out-of-bounds in validate_nla lib/nlattr.c:386 [inline] BUG: KASAN: global-out-of-bounds in __nla_validate_parse+0x24af/0x2750 lib/nlattr.c:600 Read of size 1 at addr ffffffff8f24b100 by task syz-executor.1/62810 CPU: 0 PID: 62810 Comm: syz-executor.1 Tainted: G N 6.1.0 #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x8b/0xb3 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:284 [inline] print_report+0x172/0x475 mm/kasan/report.c:395 kasan_report+0xbb/0x1c0 mm/kasan/report.c:495 validate_nla lib/nlattr.c:386 [inline] __nla_validate_parse+0x24af/0x2750 lib/nlattr.c:600 __nla_parse+0x3e/0x50 lib/nlattr.c:697 __nlmsg_parse include/net/netlink.h:748 [inline] genl_family_rcv_msg_attrs_parse.constprop.0+0x1b0/0x290 net/netlink/genetlink.c:565 genl_family_rcv_msg_doit+0xda/0x330 net/netlink/genetlink.c:734 genl_family_rcv_msg net/netlink/genetlink.c:833 [inline] genl_rcv_msg+0x441/0x780 net/netlink/genetlink.c:850 netlink_rcv_skb+0x14f/0x410 net/netlink/af_netlink.c:2540 genl_rcv+0x24/0x40 net/netlink/genetlink.c:861 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x54e/0x800 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x930/0xe50 net/netlink/af_netlink.c:1921 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg+0x154/0x190 net/socket.c:734 ____sys_sendmsg+0x6df/0x840 net/socket.c:2482 ___sys_sendmsg+0x110/0x1b0 net/socket.c:2536 __sys_sendmsg+0xf3/0x1c0 net/socket.c:2565 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fdd66a8f359 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fdd65e00168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007fdd66bbcf80 RCX: 00007fdd66a8f359 RDX: 0000000000000000 RSI: 0000000020000500 RDI: 0000000000000003 RBP: 00007fdd66ada493 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffc84b81aff R14: 00007fdd65e00300 R15: 0000000000022000 </TASK> The buggy address belongs to the variable: ksmbd_nl_policy+0x100/0xa80 The buggy address belongs to the physical page: page:0000000034f47940 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1ccc4b flags: 0x200000000001000(reserved|node=0|zone=2) raw: 0200000000001000 ffffea00073312c8 ffffea00073312c8 0000000000000000 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffffffff8f24b000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffffffff8f24b080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffffffff8f24b100: f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9 00 00 07 f9 ^ ffffffff8f24b180: f9 f9 f9 f9 00 05 f9 f9 f9 f9 f9 f9 00 00 00 05 ffffffff8f24b200: f9 f9 f9 f9 00 00 03 f9 f9 f9 f9 f9 00 00 04 f9 ================================================================== To fix it, add a placeholder named __KSMBD_EVENT_MAX and let KSMBD_EVENT_MAX to be its original value - 1 according to what other netlink families do. Also change two sites that refer the KSMBD_EVENT_MAX to correct value. Cc: stable@vger.kernel.org Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers") Signed-off-by: Lin Ma <linma@zju.edu.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-25erofs: get rid of unneeded GFP_NOFSJingbo Xu4-7/+7
Clean up some leftovers since there is no way for EROFS to be called again from a reclaim context. Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20240124031945.130782-1-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2024-01-25bcachefs: discard path uses unlock_long()Kent Overstreet1-1/+1
Some (bad) devices can have really terrible discard latency; we don't want them blocking memory reclaim and causing warnings. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-25Merge tag 'execve-v6.8-rc2' of ↵Linus Torvalds1-9/+30
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull execve fixes from Kees Cook: - Fix error handling in begin_new_exec() (Bernd Edlinger) - MAINTAINERS: specifically mention ELF (Alexey Dobriyan) - Various cleanups related to earlier open() (Askar Safin, Kees Cook) * tag 'execve-v6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: exec: Distinguish in_execve from in_exec exec: Fix error handling in begin_new_exec() exec: Add do_close_execat() helper exec: remove useless comment ELF, MAINTAINERS: specifically mention ELF
2024-01-25uselib: remove use of __FMODE_EXECLinus Torvalds1-1/+1
Jann Horn points out that uselib() really shouldn't trigger the new FMODE_EXEC logic introduced by commit 4759ff71f23e ("exec: __FMODE_EXEC instead of in_execve for LSMs"). In fact, it shouldn't even have ever triggered the old pre-existing logic for __FMODE_EXEC (like the NFS code that makes executables not need read permissions). Unlike a real execve(), that can work even with files that are purely executable by the user (not readable), uselib() has that MAY_READ requirement becasue it's really just a convenience wrapper around mmap() for legacy shared libraries. The whole FMODE_EXEC bit was originally introduced by commit b500531e6f5f ("[PATCH] Introduce FMODE_EXEC file flag"), primarily to give ETXTBUSY error returns for distributed filesystems. It has since grown a few other warts (like that NFS thing), but there really isn't any reason to use it for uselib(), and now that we are trying to use it to replace the horrid 'tsk->in_execve' flag, it's actively wrong. Of course, as Jann Horn also points out, nobody should be enabling CONFIG_USELIB in the first place in this day and age, but that's a different discussion entirely. Reported-by: Jann Horn <jannh@google.com> Fixes: 4759ff71f23e ("exec: __FMODE_EXEC instead of in_execve for LSMs") Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-01-24exec: Distinguish in_execve from in_execKees Cook1-0/+1
Just to help distinguish the fs->in_exec flag from the current->in_execve flag, add comments in check_unsafe_exec() and copy_fs() for more context. Also note that in_execve is only used by TOMOYO now. Cc: Kentaro Takeda <takedakn@nttdata.co.jp> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: linux-fsdevel@vger.kernel.org Cc: linux-mm@kvack.org Signed-off-by: Kees Cook <keescook@chromium.org>
2024-01-24nfsd: fix RELEASE_LOCKOWNERNeilBrown1-11/+15
The test on so_count in nfsd4_release_lockowner() is nonsense and harmful. Revert to using check_for_locks(), changing that to not sleep. First: harmful. As is documented in the kdoc comment for nfsd4_release_lockowner(), the test on so_count can transiently return a false positive resulting in a return of NFS4ERR_LOCKS_HELD when in fact no locks are held. This is clearly a protocol violation and with the Linux NFS client it can cause incorrect behaviour. If RELEASE_LOCKOWNER is sent while some other thread is still processing a LOCK request which failed because, at the time that request was received, the given owner held a conflicting lock, then the nfsd thread processing that LOCK request can hold a reference (conflock) to the lock owner that causes nfsd4_release_lockowner() to return an incorrect error. The Linux NFS client ignores that NFS4ERR_LOCKS_HELD error because it never sends NFS4_RELEASE_LOCKOWNER without first releasing any locks, so it knows that the error is impossible. It assumes the lock owner was in fact released so it feels free to use the same lock owner identifier in some later locking request. When it does reuse a lock owner identifier for which a previous RELEASE failed, it will naturally use a lock_seqid of zero. However the server, which didn't release the lock owner, will expect a larger lock_seqid and so will respond with NFS4ERR_BAD_SEQID. So clearly it is harmful to allow a false positive, which testing so_count allows. The test is nonsense because ... well... it doesn't mean anything. so_count is the sum of three different counts. 1/ the set of states listed on so_stateids 2/ the set of active vfs locks owned by any of those states 3/ various transient counts such as for conflicting locks. When it is tested against '2' it is clear that one of these is the transient reference obtained by find_lockowner_str_locked(). It is not clear what the other one is expected to be. In practice, the count is often 2 because there is precisely one state on so_stateids. If there were more, this would fail. In my testing I see two circumstances when RELEASE_LOCKOWNER is called. In one case, CLOSE is called before RELEASE_LOCKOWNER. That results in all the lock states being removed, and so the lockowner being discarded (it is removed when there are no more references which usually happens when the lock state is discarded). When nfsd4_release_lockowner() finds that the lock owner doesn't exist, it returns success. The other case shows an so_count of '2' and precisely one state listed in so_stateid. It appears that the Linux client uses a separate lock owner for each file resulting in one lock state per lock owner, so this test on '2' is safe. For another client it might not be safe. So this patch changes check_for_locks() to use the (newish) find_any_file_locked() so that it doesn't take a reference on the nfs4_file and so never calls nfsd_file_put(), and so never sleeps. With this check is it safe to restore the use of check_for_locks() rather than testing so_count against the mysterious '2'. Fixes: ce3c4ad7f4ce ("NFSD: Fix possible sleep during nfsd4_release_lockowner()") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Cc: stable@vger.kernel.org # v6.2+ Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-24cifs: fix stray unlock in cifs_chan_skip_or_disableShyam Prasad N1-1/+0
A recent change moved the code that decides to skip a channel or disable multichannel entirely, into a helper function. During this, a mutex_unlock of the session_mutex should have been removed. Doing that here. Fixes: f591062bdbf4 ("cifs: handle servers that still advertise multichannel after disabling") Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-24cifs: set replay flag for retries of write commandShyam Prasad N3-1/+5
Similar to the rest of the commands, this is a change to add replay flags on retry. This one does not add a back-off, considering that we may want to flush a write ASAP to the server. Considering that this will be a flush of cached pages, the retrans value is also not honoured. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-24cifs: commands that are retried should have replay flag setShyam Prasad N6-45/+404
MS-SMB2 states that the header flag SMB2_FLAGS_REPLAY_OPERATION needs to be set when a command needs to be retried, so that the server is aware that this is a replay for an operation that appeared before. This can be very important, for example, for state changing operations and opens which get retried following a reconnect; since the client maybe unaware of the status of the previous open. This is particularly important for multichannel scenario, since disconnection of one connection does not mean that the session is lost. The requests can be replayed on another channel. This change also makes use of exponential back-off before replays and also limits the number of retries to "retrans" mount option value. Also, this change does not modify the read/write codepath. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-24cifs: helper function to check replayable error codesShyam Prasad N2-0/+8
The code to check for replay is not just -EAGAIN. In some cases, the send request or receive response may result in network errors, which we're now mapping to -ECONNABORTED. This change introduces a helper function which checks if the error returned in one of the above two errors. And all checks for replays will now use this helper. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-24cifs: translate network errors on send to -ECONNABORTEDShyam Prasad N1-2/+9
When the network stack returns various errors, we today bubble up the error to the user (in case of soft mounts). This change translates all network errors except -EINTR and -EAGAIN to -ECONNABORTED. A similar approach is taken when we receive network errors when reading from the socket. The change also forces the cifsd thread to reconnect during it's next activity. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-24cifs: cifs_pick_channel should try selecting active channelsShyam Prasad N1-0/+3
cifs_pick_channel today just selects a channel based on the policy of least loaded channel. However, it does not take into account if the channel needs reconnect. As a result, we can have failures in send that can be completely avoided. This change doesn't make a channel a candidate for this selection if it needs reconnect. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-24cifs: Share server EOF pos with netfslibDavid Howells6-17/+37
Use cifsi->netfs_ctx.remote_i_size instead of cifsi->server_eof so that netfslib can refer to it to. Signed-off-by: David Howells <dhowells@redhat.com> cc: Shyam Prasad N <nspmangalore@gmail.com> cc: Rohith Surabattula <rohiths.msft@gmail.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cifs@vger.kernel.org cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-24smb: Work around Clang __bdos() type confusionKees Cook1-1/+1
Recent versions of Clang gets confused about the possible size of the "user" allocation, and CONFIG_FORTIFY_SOURCE ends up emitting a warning[1]: repro.c:126:4: warning: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning] 126 | __write_overflow_field(p_size_field, size); | ^ for this memset(): int len; __le16 *user; ... len = ses->user_name ? strlen(ses->user_name) : 0; user = kmalloc(2 + (len * 2), GFP_KERNEL); ... if (len) { ... } else { memset(user, '\0', 2); } While Clang works on this bug[2], switch to using a direct assignment, which avoids memset() entirely which both simplifies the code and silences the false positive warning. (Making "len" size_t also silences the warning, but the direct assignment seems better.) Reported-by: Nathan Chancellor <nathan@kernel.org> Closes: https://github.com/ClangBuiltLinux/linux/issues/1966 [1] Link: https://github.com/llvm/llvm-project/issues/77813 [2] Cc: Steve French <sfrench@samba.org> Cc: Paulo Alcantara <pc@manguebit.com> Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com> Cc: Shyam Prasad N <sprasad@microsoft.com> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: llvm@lists.linux.dev Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-24Merge tag 'trace-v6.8-rc1' of ↵Linus Torvalds2-6/+15
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing and eventfs fixes from Steven Rostedt: - Fix histogram tracing_map insertion. The tracing_map_insert copies the value into the elt variable and then assigns the elt to the entry value. But it is possible that the entry value becomes visible on other CPUs before the elt is fully initialized. This is fixed by adding a wmb() between the initialization of the elt variable and assigning it. - Have eventfs directory have unique inode numbers. Having them be all the same proved to be a failure as the 'find' application will think that the directories are causing loops, as it checks for directory loops via their inodes. Have the evenfs dir entries get their inodes assigned when they are referenced and then save them in the eventfs_inode structure. * tag 'trace-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: eventfs: Save directory inodes in the eventfs_inode structure tracing: Ensure visibility when inserting an element into tracing_map
2024-01-23smb: client: delete "true", "false" definesAlexey Dobriyan1-7/+0
Kernel has its own official true/false definitions. The defines aren't even used in this file. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-23eventfs: Save directory inodes in the eventfs_inode structureSteven Rostedt (Google)2-6/+15
The eventfs inodes and directories are allocated when referenced. But this leaves the issue of keeping consistent inode numbers and the number is only saved in the inode structure itself. When the inode is no longer referenced, it can be freed. When the file that the inode was representing is referenced again, the inode is once again created, but the inode number needs to be the same as it was before. Just making the inode numbers the same for all files is fine, but that does not work with directories. The find command will check for loops via the inode number and having the same inode number for directories triggers: # find /sys/kernel/tracing find: File system loop detected; '/sys/kernel/debug/tracing/events/initcall/initcall_finish' is part of the same file system loop as '/sys/kernel/debug/tracing/events/initcall'. [..] Linus pointed out that the eventfs_inode structure ends with a single 32bit int, and on 64 bit machines, there's likely a 4 byte hole due to alignment. We can use this hole to store the inode number for the eventfs_inode. All directories in eventfs are represented by an eventfs_inode and that data structure can hold its inode number. That last int was also purposely placed at the end of the structure to prevent holes from within. Now that there's a 4 byte number to hold the inode, both the inode number and the last integer can be moved up in the structure for better cache locality, where the llist and rcu fields can be moved to the end as they are only used when the eventfs_inode is being deleted. Link: https://lore.kernel.org/all/CAMuHMdXKiorg-jiuKoZpfZyDJ3Ynrfb8=X+c7x0Eewxn-YRdCA@mail.gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240122152748.46897388@gandalf.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Fixes: 53c41052ba31 ("eventfs: Have the inodes all for files and directories all be the same") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Kees Cook <keescook@chromium.org>
2024-01-23ovl: mark xwhiteouts directory with overlay.opaque='x'Amir Goldstein6-48/+97
An opaque directory cannot have xwhiteouts, so instead of marking an xwhiteouts directory with a new xattr, overload overlay.opaque xattr for marking both opaque dir ('y') and xwhiteouts dir ('x'). This is more efficient as the overlay.opaque xattr is checked during lookup of directory anyway. This also prevents unnecessary checking the xattr when reading a directory without xwhiteouts, i.e. most of the time. Note that the xwhiteouts marker is not checked on the upper layer and on the last layer in lowerstack, where xwhiteouts are not expected. Fixes: bc8df7a3dc03 ("ovl: Add an alternative type of whiteout") Cc: <stable@vger.kernel.org> # v6.7 Reviewed-by: Alexander Larsson <alexl@redhat.com> Tested-by: Alexander Larsson <alexl@redhat.com> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2024-01-23Revert "btrfs: zstd: fix and simplify the inline extent decompression"Linus Torvalds2-23/+54
This reverts commit 1e7f6def8b2370ecefb54b3c8f390ff894b0c51b. It causes my machine to not even boot, and Klara Modin reports that the cause is that small zstd-compressed files return garbage when read. Reported-by: Klara Modin <klarasmodin@gmail.com> Link: https://lore.kernel.org/linux-btrfs/CABq1_vj4GpUeZpVG49OHCo-3sdbe2-2ROcu_xDvUG-6-5zPRXg@mail.gmail.com/ Reported-and-bisected-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: David Sterba <dsterba@suse.com> Cc: Qu Wenruo <wqu@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-01-23afs: Fix missing/incorrect unlocking of RCU read lockDavid Howells1-2/+3
In afs_proc_addr_prefs_show(), we need to unlock the RCU read lock in both places before returning (and not lock it again). Fixes: f94f70d39cc2 ("afs: Provide a way to configure address priorities") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202401172243.cd53d5f6-oliver.sang@intel.com Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org
2024-01-23afs: Remove afs_dynroot_d_revalidate() as it is redundantDavid Howells1-9/+0
Remove afs_dynroot_d_revalidate() as it is redundant as all it does is return 1 and the caller assumes that if the op is not given. Suggested-by: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org
2024-01-23afs: Fix error handling with lookup via FS.InlineBulkStatusDavid Howells1-3/+9
When afs does a lookup, it tries to use FS.InlineBulkStatus to preemptively look up a bunch of files in the parent directory and cache this locally, on the basis that we might want to look at them too (for example if someone does an ls on a directory, they may want want to then stat every file listed). FS.InlineBulkStatus can be considered a compound op with the normal abort code applying to the compound as a whole. Each status fetch within the compound is then given its own individual abort code - but assuming no error that prevents the bulk fetch from returning the compound result will be 0, even if all the constituent status fetches failed. At the conclusion of afs_do_lookup(), we should use the abort code from the appropriate status to determine the error to return, if any - but instead it is assumed that we were successful if the op as a whole succeeded and we return an incompletely initialised inode, resulting in ENOENT, no matter the actual reason. In the particular instance reported, a vnode with no permission granted to be accessed is being given a UAEACCES abort code which should be reported as EACCES, but is instead being reported as ENOENT. Fix this by abandoning the inode (which will be cleaned up with the op) if file[1] has an abort code indicated and turn that abort code into an error instead. Whilst we're at it, add a tracepoint so that the abort codes of the individual subrequests of FS.InlineBulkStatus can be logged. At the moment only the container abort code can be 0. Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept") Reported-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2024-01-23afs: Hide silly-rename files from userspaceDavid Howells1-0/+8
There appears to be a race between silly-rename files being created/removed and various userspace tools iterating over the contents of a directory, leading to such errors as: find: './kernel/.tmp_cpio_dir/include/dt-bindings/reset/.__afs2080': No such file or directory tar: ./include/linux/greybus/.__afs3C95: File removed before we read it when building a kernel. Fix afs_readdir() so that it doesn't return .__afsXXXX silly-rename files to userspace. This doesn't stop them being looked up directly by name as we need to be able to look them up from within the kernel as part of the silly-rename algorithm. Fixes: 79ddbfa500b3 ("afs: Implement sillyrename for unlink and rename") Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2024-01-23cachefiles, erofs: Fix NULL deref in when cachefiles is not doing ondemand-modeDavid Howells1-0/+3
cachefiles_ondemand_init_object() as called from cachefiles_open_file() and cachefiles_create_tmpfile() does not check if object->ondemand is set before dereferencing it, leading to an oops something like: RIP: 0010:cachefiles_ondemand_init_object+0x9/0x41 ... Call Trace: <TASK> cachefiles_open_file+0xc9/0x187 cachefiles_lookup_cookie+0x122/0x2be fscache_cookie_state_machine+0xbe/0x32b fscache_cookie_worker+0x1f/0x2d process_one_work+0x136/0x208 process_scheduled_works+0x3a/0x41 worker_thread+0x1a2/0x1f6 kthread+0xca/0xd2 ret_from_fork+0x21/0x33 Fix this by making cachefiles_ondemand_init_object() return immediately if cachefiles->ondemand is NULL. Fixes: 3c5ecfe16e76 ("cachefiles: extract ondemand info field from cachefiles_object") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Gao Xiang <xiang@kernel.org> cc: Chao Yu <chao@kernel.org> cc: Yue Hu <huyue2@coolpad.com> cc: Jeffle Xu <jefflexu@linux.alibaba.com> cc: linux-erofs@lists.ozlabs.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org
2024-01-23netfs: Fix a NULL vs IS_ERR() check in netfs_perform_write()Dan Carpenter1-2/+3
The netfs_grab_folio_for_write() function doesn't return NULL, it returns error pointers. Update the check accordingly. Fixes: c38f4e96e605 ("netfs: Provide func to copy data to pagecache for buffered write") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/29fb1310-8e2d-47ba-b68d-40354eb7b896@moroto.mountain/
2024-01-23netfs, fscache: Prevent Oops in fscache_put_cache()Dan Carpenter1-1/+2
This function dereferences "cache" and then checks if it's IS_ERR_OR_NULL(). Check first, then dereference. Fixes: 9549332df4ed ("fscache: Implement cache registration") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/e84bc740-3502-4f16-982a-a40d5676615c@moroto.mountain/ # v2
2024-01-23cifs: Don't use certain unnecessary folio_*() functionsDavid Howells1-5/+5
Filesystems should use folio->index and folio->mapping, instead of folio_index(folio), folio_mapping() and folio_file_mapping() since they know that it's in the pagecache. Change this automagically with: perl -p -i -e 's/folio_mapping[(]([^)]*)[)]/\1->mapping/g' fs/smb/client/*.c perl -p -i -e 's/folio_file_mapping[(]([^)]*)[)]/\1->mapping/g' fs/smb/client/*.c perl -p -i -e 's/folio_index[(]([^)]*)[)]/\1->index/g' fs/smb/client/*.c Reported-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.com> cc: Ronnie Sahlberg <lsahlber@redhat.com> cc: Shyam Prasad N <sprasad@microsoft.com> cc: Tom Talpey <tom@talpey.com> cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org
2024-01-23afs: Don't use certain unnecessary folio_*() functionsDavid Howells1-5/+5
Filesystems should use folio->index and folio->mapping, instead of folio_index(folio), folio_mapping() and folio_file_mapping() since they know that it's in the pagecache. Change this automagically with: perl -p -i -e 's/folio_mapping[(]([^)]*)[)]/\1->mapping/g' fs/afs/*.c perl -p -i -e 's/folio_file_mapping[(]([^)]*)[)]/\1->mapping/g' fs/afs/*.c perl -p -i -e 's/folio_index[(]([^)]*)[)]/\1->index/g' fs/afs/*.c Reported-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org
2024-01-23netfs: Don't use certain unnecessary folio_*() functionsDavid Howells4-13/+13
Filesystems should use folio->index and folio->mapping, instead of folio_index(folio), folio_mapping() and folio_file_mapping() since they know that it's in the pagecache. Change this automagically with: perl -p -i -e 's/folio_mapping[(]([^)]*)[)]/\1->mapping/g' fs/netfs/*.c perl -p -i -e 's/folio_file_mapping[(]([^)]*)[)]/\1->mapping/g' fs/netfs/*.c perl -p -i -e 's/folio_index[(]([^)]*)[)]/\1->index/g' fs/netfs/*.c Reported-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-afs@lists.infradead.org cc: linux-cachefs@redhat.com cc: linux-cifs@vger.kernel.org cc: linux-erofs@lists.ozlabs.org cc: linux-fsdevel@vger.kernel.org
2024-01-23Merge tag 'for-6.8-rc1-tag' of ↵Linus Torvalds16-186/+176
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - zoned mode fixes: - fix slowdown when writing large file sequentially by looking up block groups with enough space faster - locking fixes when activating a zone - new mount API fixes: - preserve mount options for a ro/rw mount of the same subvolume - scrub fixes: - fix use-after-free in case the chunk length is not aligned to 64K, this does not happen normally but has been reported on images converted from ext4 - similar alignment check was missing with raid-stripe-tree - subvolume deletion fixes: - prevent calling ioctl on already deleted subvolume - properly track flag tracking a deleted subvolume - in subpage mode, fix decompression of an inline extent (zlib, lzo, zstd) - fix crash when starting writeback on a folio, after integration with recent MM changes this needs to be started conditionally - reject unknown flags in defrag ioctl - error handling, API fixes, minor warning fixes * tag 'for-6.8-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: scrub: limit RST scrub to chunk boundary btrfs: scrub: avoid use-after-free when chunk length is not 64K aligned btrfs: don't unconditionally call folio_start_writeback in subpage btrfs: use the original mount's mount options for the legacy reconfigure btrfs: don't warn if discard range is not aligned to sector btrfs: tree-checker: fix inline ref size in error messages btrfs: zstd: fix and simplify the inline extent decompression btrfs: lzo: fix and simplify the inline extent decompression btrfs: zlib: fix and simplify the inline extent decompression btrfs: defrag: reject unknown flags of btrfs_ioctl_defrag_range_args btrfs: avoid copying BTRFS_ROOT_SUBVOL_DEAD flag to snapshot of subvolume being deleted btrfs: don't abort filesystem when attempting to snapshot deleted subvolume btrfs: zoned: fix lock ordering in btrfs_zone_activate() btrfs: fix unbalanced unlock of mapping_tree_lock btrfs: ref-verify: free ref cache before clearing mount opt btrfs: fix kvcalloc() arguments order in btrfs_ioctl_send() btrfs: zoned: optimize hint byte for zoned allocator btrfs: zoned: factor out prepare_allocation_zoned()
2024-01-22exec: Fix error handling in begin_new_exec()Bernd Edlinger1-0/+3
If get_unused_fd_flags() fails, the error handling is incomplete because bprm->cred is already set to NULL, and therefore free_bprm will not unlock the cred_guard_mutex. Note there are two error conditions which end up here, one before and one after bprm->cred is cleared. Fixes: b8a61c9e7b4a ("exec: Generic execfd support") Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Link: https://lore.kernel.org/r/AS8P193MB128517ADB5EFF29E04389EDAE4752@AS8P193MB1285.EURP193.PROD.OUTLOOK.COM Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>