summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2024-05-14Merge tag 'sched-core-2024-05-13' of ↵Linus Torvalds6-16/+22
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Add cpufreq pressure feedback for the scheduler - Rework misfit load-balancing wrt affinity restrictions - Clean up and simplify the code around ::overutilized and ::overload access. - Simplify sched_balance_newidle() - Bump SCHEDSTAT_VERSION to 16 due to a cleanup of CPU_MAX_IDLE_TYPES handling that changed the output. - Rework & clean up <asm/vtime.h> interactions wrt arch_vtime_task_switch() - Reorganize, clean up and unify most of the higher level scheduler balancing function names around the sched_balance_*() prefix - Simplify the balancing flag code (sched_balance_running) - Miscellaneous cleanups & fixes * tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits) sched/pelt: Remove shift of thermal clock sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure() thermal/cpufreq: Remove arch_update_thermal_pressure() sched/cpufreq: Take cpufreq feedback into account cpufreq: Add a cpufreq pressure feedback for the scheduler sched/fair: Fix update of rd->sg_overutilized sched/vtime: Do not include <asm/vtime.h> header s390/irq,nmi: Include <asm/vtime.h> header directly s390/vtime: Remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftover sched/vtime: Get rid of generic vtime_task_switch() implementation sched/vtime: Remove confusing arch_vtime_task_switch() declaration sched/balancing: Simplify the sg_status bitmask and use separate ->overloaded and ->overutilized flags sched/fair: Rename set_rd_overutilized_status() to set_rd_overutilized() sched/fair: Rename SG_OVERLOAD to SG_OVERLOADED sched/fair: Rename {set|get}_rd_overload() to {set|get}_rd_overloaded() sched/fair: Rename root_domain::overload to ::overloaded sched/fair: Use helper functions to access root_domain::overload sched/fair: Check root_domain::overload value before update sched/fair: Combine EAS check with root_domain::overutilized access sched/fair: Simplify the continue_balancing logic in sched_balance_newidle() ...
2024-05-14Merge tag 'perf-core-2024-05-13' of ↵Linus Torvalds1-20/+17
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events updates from Ingo Molnar: - Combine perf and BPF for fast evalution of HW breakpoint conditions - Add LBR capture support outside of hardware events - Trigger IO signals for watermark_wakeup - Add RAPL support for Intel Arrow Lake and Lunar Lake - Optimize frequency-throttling - Miscellaneous cleanups & fixes * tag 'perf-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) perf/bpf: Mark perf_event_set_bpf_handler() and perf_event_free_bpf_handler() as inline too selftests/perf_events: Test FASYNC with watermark wakeups perf/ring_buffer: Trigger IO signals for watermark_wakeup perf: Move perf_event_fasync() to perf_event.h perf/bpf: Change the !CONFIG_BPF_SYSCALL stubs to static inlines selftest/bpf: Test a perf BPF program that suppresses side effects perf/bpf: Allow a BPF program to suppress all sample side effects perf/bpf: Remove unneeded uses_default_overflow_handler() perf/bpf: Call BPF handler directly, not through overflow machinery perf/bpf: Remove #ifdef CONFIG_BPF_SYSCALL from struct perf_event members perf/bpf: Create bpf_overflow_handler() stub for !CONFIG_BPF_SYSCALL perf/bpf: Reorder bpf_overflow_handler() ahead of __perf_event_overflow() perf/x86/rapl: Add support for Intel Lunar Lake perf/x86/rapl: Add support for Intel Arrow Lake perf/core: Reduce PMU access to adjust sample freq perf/core: Optimize perf_adjust_freq_unthr_context() perf/x86/amd: Don't reject non-sampling events with configured LBR perf/x86/amd: Support capturing LBR from software events perf/x86/amd: Avoid taking branches before disabling LBR perf/x86/amd: Ensure amd_pmu_core_disable_all() is always inlined ...
2024-05-14Merge tag 'locking-core-2024-05-13' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - Over a dozen code generation micro-optimizations for the atomic and spinlock code - Add more __ro_after_init attributes - Robustify the lockdevent_*() macros * tag 'locking-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/pvqspinlock/x86: Use _Q_LOCKED_VAL in PV_UNLOCK_ASM macro locking/qspinlock/x86: Micro-optimize virt_spin_lock() locking/atomic/x86: Merge __arch{,_try}_cmpxchg64_emu_local() with __arch{,_try}_cmpxchg64_emu() locking/atomic/x86: Introduce arch_try_cmpxchg64_local() locking/pvqspinlock/x86: Remove redundant CMP after CMPXCHG in __raw_callee_save___pv_queued_spin_unlock() locking/pvqspinlock: Use try_cmpxchg() in qspinlock_paravirt.h locking/pvqspinlock: Use try_cmpxchg_acquire() in trylock_clear_pending() locking/qspinlock: Use atomic_try_cmpxchg_relaxed() in xchg_tail() locking/atomic/x86: Define arch_atomic_sub() family using arch_atomic_add() functions locking/atomic/x86: Rewrite x86_32 arch_atomic64_{,fetch}_{and,or,xor}() functions locking/atomic/x86: Introduce arch_atomic64_read_nonatomic() to x86_32 locking/atomic/x86: Introduce arch_atomic64_try_cmpxchg() to x86_32 locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64 locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}() locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128() x86/tsc: Make __use_tsc __ro_after_init x86/kvm: Make kvm_async_pf_enabled __ro_after_init context_tracking: Make context_tracking_key __ro_after_init jump_label,module: Don't alloc static_key_mod for __ro_after_init keys locking/qspinlock: Always evaluate lockevent* non-event parameter once
2024-05-14Merge tag 'v6.10-p1' of ↵Linus Torvalds2-0/+32
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Remove crypto stats interface Algorithms: - Add faster AES-XTS on modern x86_64 CPUs - Forbid curves with order less than 224 bits in ecc (FIPS 186-5) - Add ECDSA NIST P521 Drivers: - Expose otp zone in atmel - Add dh fallback for primes > 4K in qat - Add interface for live migration in qat - Use dma for aes requests in starfive - Add full DMA support for stm32mpx in stm32 - Add Tegra Security Engine driver Others: - Introduce scope-based x509_certificate allocation" * tag 'v6.10-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (123 commits) crypto: atmel-sha204a - provide the otp content crypto: atmel-sha204a - add reading from otp zone crypto: atmel-i2c - rename read function crypto: atmel-i2c - add missing arg description crypto: iaa - Use kmemdup() instead of kzalloc() and memcpy() crypto: sahara - use 'time_left' variable with wait_for_completion_timeout() crypto: api - use 'time_left' variable with wait_for_completion_killable_timeout() crypto: caam - i.MX8ULP donot have CAAM page0 access crypto: caam - init-clk based on caam-page0-access crypto: starfive - Use fallback for unaligned dma access crypto: starfive - Do not free stack buffer crypto: starfive - Skip unneeded fallback allocation crypto: starfive - Skip dma setup for zeroed message crypto: hisilicon/sec2 - fix for register offset crypto: hisilicon/debugfs - mask the unnecessary info from the dump crypto: qat - specify firmware files for 402xx crypto: x86/aes-gcm - simplify GCM hash subkey derivation crypto: x86/aes-gcm - delete unused GCM assembly code crypto: x86/aes-xts - simplify loop in xts_crypt_slowpath() hwrng: stm32 - repair clock handling ...
2024-05-14Merge tag 'hardening-6.10-rc1' of ↵Linus Torvalds2-7/+64
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: "The bulk of the changes here are related to refactoring and expanding the KUnit tests for string helper and fortify behavior. Some trivial strncpy replacements in fs/ were carried in my tree. Also some fixes to SCSI string handling were carried in my tree since the helper for those was introduce here. Beyond that, just little fixes all around: objtool getting confused about LKDTM+KCFI, preparing for future refactors (constification of sysctl tables, additional __counted_by annotations), a Clang UBSAN+i386 crash fix, and adding more options in the hardening.config Kconfig fragment. Summary: - selftests: Add str*cmp tests (Ivan Orlov) - __counted_by: provide UAPI for _le/_be variants (Erick Archer) - Various strncpy deprecation refactors (Justin Stitt) - stackleak: Use a copy of soon-to-be-const sysctl table (Thomas Weißschuh) - UBSAN: Work around i386 -regparm=3 bug with Clang prior to version 19 - Provide helper to deal with non-NUL-terminated string copying - SCSI: Fix older string copying bugs (with new helper) - selftests: Consolidate string helper behavioral tests - selftests: add memcpy() fortify tests - string: Add additional __realloc_size() annotations for "dup" helpers - LKDTM: Fix KCFI+rodata+objtool confusion - hardening.config: Enable KCFI" * tag 'hardening-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (29 commits) uapi: stddef.h: Provide UAPI macros for __counted_by_{le, be} stackleak: Use a copy of the ctl_table argument string: Add additional __realloc_size() annotations for "dup" helpers kunit/fortify: Fix replaced failure path to unbreak __alloc_size hardening: Enable KCFI and some other options lkdtm: Disable CFI checking for perms functions kunit/fortify: Add memcpy() tests kunit/fortify: Do not spam logs with fortify WARNs kunit/fortify: Rename tests to use recommended conventions init: replace deprecated strncpy with strscpy_pad kunit/fortify: Fix mismatched kvalloc()/vfree() usage scsi: qla2xxx: Avoid possible run-time warning with long model_num scsi: mpi3mr: Avoid possible run-time warning with long manufacturer strings scsi: mptfusion: Avoid possible run-time warning with long manufacturer strings fs: ecryptfs: replace deprecated strncpy with strscpy hfsplus: refactor copy_name to not use strncpy reiserfs: replace deprecated strncpy with scnprintf virt: acrn: replace deprecated strncpy with strscpy ubsan: Avoid i386 UBSAN handler crashes with Clang ubsan: Remove 1-element array usage in debug reporting ...
2024-05-14Merge tag 'execve-6.10-rc1' of ↵Linus Torvalds2-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull execve updates from Kees Cook: - Provide knob to change (previously fixed) coredump NOTES size (Allen Pais) - Add sched_prepare_exec tracepoint (Marco Elver) - Make /proc/$pid/auxv work under binfmt_elf_fdpic (Max Filippov) - Convert ARCH_HAVE_EXTRA_ELF_NOTES to proper Kconfig (Vignesh Balasubramanian) - Leave a gap between .bss and brk * tag 'execve-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: fs/coredump: Enable dynamic configuration of max file note size binfmt_elf_fdpic: fix /proc/<pid>/auxv binfmt_elf: Leave a gap between .bss and brk Replace macro "ARCH_HAVE_EXTRA_ELF_NOTES" with kconfig tracing: Add sched_prepare_exec tracepoint
2024-05-13Merge tag 'for-6.10/block-20240511' of git://git.kernel.dk/linuxLinus Torvalds4-156/+86
Pull block updates from Jens Axboe: - Add a partscan attribute in sysfs, fixing an issue with systemd relying on an internal interface that went away. - Attempt #2 at making long running discards interruptible. The previous attempt went into 6.9, but we ended up mostly reverting it as it had issues. - Remove old ida_simple API in bcache - Support for zoned write plugging, greatly improving the performance on zoned devices. - Remove the old throttle low interface, which has been experimental since 2017 and never made it beyond that and isn't being used. - Remove page->index debugging checks in brd, as it hasn't caught anything and prepares us for removing in struct page. - MD pull request from Song - Don't schedule block workers on isolated CPUs * tag 'for-6.10/block-20240511' of git://git.kernel.dk/linux: (84 commits) blk-throttle: delay initialization until configuration blk-throttle: remove CONFIG_BLK_DEV_THROTTLING_LOW block: fix that util can be greater than 100% block: support to account io_ticks precisely block: add plug while submitting IO bcache: fix variable length array abuse in btree_iter bcache: Remove usage of the deprecated ida_simple_xx() API md: Revert "md: Fix overflow in is_mddev_idle" blk-lib: check for kill signal in ioctl BLKDISCARD block: add a bio_await_chain helper block: add a blk_alloc_discard_bio helper block: add a bio_chain_and_submit helper block: move discard checks into the ioctl handler block: remove the discard_granularity check in __blkdev_issue_discard block/ioctl: prefer different overflow check null_blk: Fix the WARNING: modpost: missing MODULE_DESCRIPTION() block: fix and simplify blkdevparts= cmdline parsing block: refine the EOF check in blkdev_iomap_begin block: add a partscan sysfs attribute for disks block: add a disk_has_partscan helper ...
2024-05-13Merge tag 'for-6.10/io_uring-20240511' of git://git.kernel.dk/linuxLinus Torvalds5-26/+62
Pull io_uring updates from Jens Axboe: - Greatly improve send zerocopy performance, by enabling coalescing of sent buffers. MSG_ZEROCOPY already does this with send(2) and sendmsg(2), but the io_uring side did not. In local testing, the crossover point for send zerocopy being faster is now around 3000 byte packets, and it performs better than the sync syscall variants as well. This feature relies on a shared branch with net-next, which was pulled into both branches. - Unification of how async preparation is done across opcodes. Previously, opcodes that required extra memory for async retry would allocate that as needed, using on-stack state until that was the case. If async retry was needed, the on-stack state was adjusted appropriately for a retry and then copied to the allocated memory. This led to some fragile and ugly code, particularly for read/write handling, and made storage retries more difficult than they needed to be. Allocate the memory upfront, as it's cheap from our pools, and use that state consistently both initially and also from the retry side. - Move away from using remap_pfn_range() for mapping the rings. This is really not the right interface to use and can cause lifetime issues or leaks. Additionally, it means the ring sq/cq arrays need to be physically contigious, which can cause problems in production with larger rings when services are restarted, as memory can be very fragmented at that point. Move to using vm_insert_page(s) for the ring sq/cq arrays, and apply the same treatment to mapped ring provided buffers. This also helps unify the code we have dealing with allocating and mapping memory. Hard to see in the diffstat as we're adding a few features as well, but this kills about ~400 lines of code from the codebase as well. - Add support for bundles for send/recv. When used with provided buffers, bundles support sending or receiving more than one buffer at the time, improving the efficiency by only needing to call into the networking stack once for multiple sends or receives. - Tweaks for our accept operations, supporting both a DONTWAIT flag for skipping poll arm and retry if we can, and a POLLFIRST flag that the application can use to skip the initial accept attempt and rely purely on poll for triggering the operation. Both of these have identical flags on the receive side already. - Make the task_work ctx locking unconditional. We had various code paths here that would do a mix of lock/trylock and set the task_work state to whether or not it was locked. All of that goes away, we lock it unconditionally and get rid of the state flag indicating whether it's locked or not. The state struct still exists as an empty type, can go away in the future. - Add support for specifying NOP completion values, allowing it to be used for error handling testing. - Use set/test bit for io-wq worker flags. Not strictly needed, but also doesn't hurt and helps silence a KCSAN warning. - Cleanups for io-wq locking and work assignments, closing a tiny race where cancelations would not be able to find the work item reliably. - Misc fixes, cleanups, and improvements * tag 'for-6.10/io_uring-20240511' of git://git.kernel.dk/linux: (97 commits) io_uring: support to inject result for NOP io_uring: fail NOP if non-zero op flags is passed in io_uring/net: add IORING_ACCEPT_POLL_FIRST flag io_uring/net: add IORING_ACCEPT_DONTWAIT flag io_uring/filetable: don't unnecessarily clear/reset bitmap io_uring/io-wq: Use set_bit() and test_bit() at worker->flags io_uring/msg_ring: cleanup posting to IOPOLL vs !IOPOLL ring io_uring: Require zeroed sqe->len on provided-buffers send io_uring/notif: disable LAZY_WAKE for linked notifs io_uring/net: fix sendzc lazy wake polling io_uring/msg_ring: reuse ctx->submitter_task read using READ_ONCE instead of re-reading it io_uring/rw: reinstate thread check for retries io_uring/notif: implement notification stacking io_uring/notif: simplify io_notif_flush() net: add callback for setting a ubuf_info to skb net: extend ubuf_info callback to ops structure io_uring/net: support bundles for recv io_uring/net: support bundles for send io_uring/kbuf: add helpers for getting/peeking multiple buffers io_uring/net: add provided buffer support for IORING_OP_SEND ...
2024-05-13Merge tag 'vfs-6.10.rw' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfsLinus Torvalds1-0/+10
Pull vfs rw iterator updates from Christian Brauner: "The core fs signalfd, userfaultfd, and timerfd subsystems did still use f_op->read() instead of f_op->read_iter(). Convert them over since we should aim to get rid of f_op->read() at some point. Aside from that io_uring and others want to mark files as FMODE_NOWAIT so it can make use of per-IO nonblocking hints to enable more efficient IO. Converting those users to f_op->read_iter() allows them to be marked with FMODE_NOWAIT" * tag 'vfs-6.10.rw' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: signalfd: convert to ->read_iter() userfaultfd: convert to ->read_iter() timerfd: convert to ->read_iter() new helper: copy_to_iter_full()
2024-05-13Merge tag 'vfs-6.10.netfs' of ↵Linus Torvalds3-105/+116
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull netfs updates from Christian Brauner: "This reworks the netfslib writeback implementation so that pages read from the cache are written to the cache through ->writepages(), thereby allowing the fscache page flag to be retired. The reworking also: - builds on top of the new writeback_iter() infrastructure - makes it possible to use vectored write RPCs as discontiguous streams of pages can be accommodated - makes it easier to do simultaneous content crypto and stream division - provides support for retrying writes and re-dividing a stream - replaces the ->launder_folio() op, so that ->writepages() is used instead - uses mempools to allocate the netfs_io_request and netfs_io_subrequest structs to avoid allocation failure in the writeback path Some code that uses the fscache page flag is retained for compatibility purposes with nfs and ceph. The code is switched to using the synonymous private_2 label instead and marked with deprecation comments. The merge commit contains additional details on the new algorithm that I've left out of here as it would probably be excessively detailed. On top of the netfslib infrastructure this contains the work to convert cifs over to netfslib" * tag 'vfs-6.10.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (38 commits) cifs: Enable large folio support cifs: Remove some code that's no longer used, part 3 cifs: Remove some code that's no longer used, part 2 cifs: Remove some code that's no longer used, part 1 cifs: Cut over to using netfslib cifs: Implement netfslib hooks cifs: Make add_credits_and_wake_if() clear deducted credits cifs: Add mempools for cifs_io_request and cifs_io_subrequest structs cifs: Set zero_point in the copy_file_range() and remap_file_range() cifs: Move cifs_loose_read_iter() and cifs_file_write_iter() to file.c cifs: Replace the writedata replay bool with a netfs sreq flag cifs: Make wait_mtu_credits take size_t args cifs: Use more fields from netfs_io_subrequest cifs: Replace cifs_writedata with a wrapper around netfs_io_subrequest cifs: Replace cifs_readdata with a wrapper around netfs_io_subrequest cifs: Use alternative invalidation to using launder_folio netfs, afs: Use writeback retry to deal with alternate keys netfs: Miscellaneous tidy ups netfs: Remove the old writeback code netfs: Cut over to using new writeback code ...
2024-05-13Merge tag 'vfs-6.10.misc' of ↵Linus Torvalds9-50/+82
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "This contains the usual miscellaneous features, cleanups, and fixes for vfs and individual fses. Features: - Free up FMODE_* bits. I've freed up bits 6, 7, 8, and 24. That means we now have six free FMODE_* bits in total (but bit #6 already got used for FMODE_WRITE_RESTRICTED) - Add FOP_HUGE_PAGES flag (follow-up to FMODE_* cleanup) - Add fd_raw cleanup class so we can make use of automatic cleanup provided by CLASS(fd_raw, f)(fd) for O_PATH fds as well - Optimize seq_puts() - Simplify __seq_puts() - Add new anon_inode_getfile_fmode() api to allow specifying f_mode instead of open-coding it in multiple places - Annotate struct file_handle with __counted_by() and use struct_size() - Warn in get_file() whether f_count resurrection from zero is attempted (epoll/drm discussion) - Folio-sophize aio - Export the subvolume id in statx() for both btrfs and bcachefs - Relax linkat(AT_EMPTY_PATH) requirements - Add F_DUPFD_QUERY fcntl() allowing to compare two file descriptors for dup*() equality replacing kcmp() Cleanups: - Compile out swapfile inode checks when swap isn't enabled - Use (1 << n) notation for FMODE_* bitshifts for clarity - Remove redundant variable assignment in fs/direct-io - Cleanup uses of strncpy in orangefs - Speed up and cleanup writeback - Move fsparam_string_empty() helper into header since it's currently open-coded in multiple places - Add kernel-doc comments to proc_create_net_data_write() - Don't needlessly read dentry->d_flags twice Fixes: - Fix out-of-range warning in nilfs2 - Fix ecryptfs overflow due to wrong encryption packet size calculation - Fix overly long line in xfs file_operations (follow-up to FMODE_* cleanup) - Don't raise FOP_BUFFER_{R,W}ASYNC for directories in xfs (follow-up to FMODE_* cleanup) - Don't call xfs_file_open from xfs_dir_open (follow-up to FMODE_* cleanup) - Fix stable offset api to prevent endless loops - Fix afs file server rotations - Prevent xattr node from overflowing the eraseblock in jffs2 - Move fdinfo PTRACE_MODE_READ procfs check into the .permission() operation instead of .open() operation since this caused userspace regressions" * tag 'vfs-6.10.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (39 commits) afs: Fix fileserver rotation getting stuck selftests: add F_DUPDFD_QUERY selftests fcntl: add F_DUPFD_QUERY fcntl() file: add fd_raw cleanup class fs: WARN when f_count resurrection is attempted seq_file: Simplify __seq_puts() seq_file: Optimize seq_puts() proc: Move fdinfo PTRACE_MODE_READ check into the inode .permission operation fs: Create anon_inode_getfile_fmode() xfs: don't call xfs_file_open from xfs_dir_open xfs: drop fop_flags for directories xfs: fix overly long line in the file_operations shmem: Fix shmem_rename2() libfs: Add simple_offset_rename() API libfs: Fix simple_offset_rename_exchange() jffs2: prevent xattr node from overflowing the eraseblock vfs, swap: compile out IS_SWAPFILE() on swapless configs vfs: relax linkat() AT_EMPTY_PATH - aka flink() - requirements fs/direct-io: remove redundant assignment to variable retval fs/dcache: Re-use value stored to dentry->d_flags instead of re-reading ...
2024-05-13Merge tag 'tpmdd-next-6.10-rc1' of ↵Linus Torvalds1-90/+226
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull TPM updates from Jarkko Sakkinen: "These are the changes for the TPM driver with a single major new feature: TPM bus encryption and integrity protection. The key pair on TPM side is generated from so called null random seed per power on of the machine [1]. This supports the TPM encryption of the hard drive by adding layer of protection against bus interposer attacks. Other than that, a few minor fixes and documentation for tpm_tis to clarify basics of TPM localities for future patch review discussions (will be extended and refined over times, just a seed)" Link: https://lore.kernel.org/linux-integrity/20240429202811.13643-1-James.Bottomley@HansenPartnership.com/ [1] * tag 'tpmdd-next-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: (28 commits) Documentation: tpm: Add TPM security docs toctree entry tpm: disable the TPM if NULL name changes Documentation: add tpm-security.rst tpm: add the null key name as a sysfs export KEYS: trusted: Add session encryption protection to the seal/unseal path tpm: add session encryption protection to tpm2_get_random() tpm: add hmac checks to tpm2_pcr_extend() tpm: Add the rest of the session HMAC API tpm: Add HMAC session name/handle append tpm: Add HMAC session start and end functions tpm: Add TCG mandated Key Derivation Functions (KDFs) tpm: Add NULL primary creation tpm: export the context save and load commands tpm: add buffer function to point to returned parameters crypto: lib - implement library version of AES in CFB mode KEYS: trusted: tpm2: Use struct tpm_buf for sized buffers tpm: Add tpm_buf_read_{u8,u16,u32} tpm: TPM2B formatted buffers tpm: Store the length of the tpm_buf data separately. tpm: Update struct tpm_buf documentation comments ...
2024-05-13Merge tag 'kcsan.2024.05.10a' of ↵Linus Torvalds1-0/+7
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull kcsan update from Paul McKenney: "Introduce __data_racy type qualifier This adds a __data_racy type qualifier that enables kernel developers to inform KCSAN that a given variable is a shared variable without needing to mark each and every access. This allows pre-KCSAN code to be correctly (if approximately) instrumented withh very little effort, and also provides people reading the code a clear indication that the variable is in fact shared. In addition, it permits incremental transition to per-access KCSAN marking, so that (for example) a given subsystem can be transitioned one variable at a time, while avoiding large numbers of KCSAN warnings during this transition" * tag 'kcsan.2024.05.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: kcsan, compiler_types: Introduce __data_racy type qualifier
2024-05-13Merge tag 'cmpxchg.2024.05.11a' of ↵Linus Torvalds1-0/+15
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull cmpxchg updates from Paul McKenney: "Provide one-byte and two-byte cmpxchg() support on sparc32, parisc, and csky This provides native one-byte and two-byte cmpxchg() support for sparc32 and parisc, courtesy of Al Viro. This support is provided by the same hashed-array-of-locks technique used for the other atomic operations provided for these two platforms. There is also emulated one-byte cmpxchg() support for csky using a new cmpxchg_emu_u8() function that uses a four-byte cmpxchg() to emulate the one-byte variant. Similar patches for emulation of one-byte cmpxchg() for arc, sh, and xtensa have not yet received maintainer acks, so they are slated for the v6.11 merge window" * tag 'cmpxchg.2024.05.11a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: csky: Emulate one-byte cmpxchg lib: Add one-byte emulation function parisc: add u16 support to cmpxchg() parisc: add missing export of __cmpxchg_u8() parisc: unify implementations of __cmpxchg_u{8,32,64} parisc: __cmpxchg_u32(): lift conversion into the callers sparc32: add __cmpxchg_u{8,16}() and teach __cmpxchg() to handle those sizes sparc32: unify __cmpxchg_u{32,64} sparc32: make the first argument of __cmpxchg_u64() volatile u64 * sparc32: make __cmpxchg_u32() return u32
2024-05-13Merge tag 'rcu.next.v6.10' of https://github.com/urezki/linuxLinus Torvalds3-14/+28
Pull RCU updates from Uladzislau Rezki: - Fix a lockdep complain for lazy-preemptible kernel, remove redundant BH disable for TINY_RCU, remove redundant READ_ONCE() in tree.c, fix false positives KCSAN splat and fix buffer overflow in the print_cpu_stall_info(). - Misc updates related to bpf, tracing and update the MAINTAINERS file. - An improvement of a normal synchronize_rcu() call in terms of latency. It maintains a separate track for sync. users only. This approach bypasses per-cpu nocb-lists thus sync-users do not depend on nocb-list length and how fast regular callbacks are processed. - RCU tasks: switch tasks RCU grace periods to sleep at TASK_IDLE priority, fix some comments, add some diagnostic warning to the exit_tasks_rcu_start() and fix a buffer overflow in the show_rcu_tasks_trace_gp_kthread(). - RCU torture: Increase memory to guest OS, fix a Tasks Rude RCU testing, some updates for TREE09, dump mode information to debug GP kthread state, remove redundant READ_ONCE(), fix some comments about RCU_TORTURE_PIPE_LEN and pipe_count, remove some redundant pointer initialization, fix a hung splat task by when the rcutorture tests start to exit, fix invalid context warning, add '--do-kvfree' parameter to torture test and use slow register unregister callbacks only for rcutype test. * tag 'rcu.next.v6.10' of https://github.com/urezki/linux: (48 commits) rcutorture: Use rcu_gp_slow_register/unregister() only for rcutype test torture: Scale --do-kvfree test time rcutorture: Fix invalid context warning when enable srcu barrier testing rcutorture: Make stall-tasks directly exit when rcutorture tests end rcutorture: Removing redundant function pointer initialization rcutorture: Make rcutorture support print rcu-tasks gp state rcutorture: Use the gp_kthread_dbg operation specified by cur_ops rcutorture: Re-use value stored to ->rtort_pipe_count instead of re-reading rcutorture: Fix rcu_torture_one_read() pipe_count overflow comment rcutorture: Remove extraneous rcu_torture_pipe_update_one() READ_ONCE() rcu: Allocate WQ with WQ_MEM_RECLAIM bit set rcu: Support direct wake-up of synchronize_rcu() users rcu: Add a trace event for synchronize_rcu_normal() rcu: Reduce synchronize_rcu() latency rcu: Fix buffer overflow in print_cpu_stall_info() rcu: Mollify sparse with RCU guard rcu-tasks: Fix show_rcu_tasks_trace_gp_kthread buffer overflow rcu-tasks: Fix the comments for tasks_rcu_exit_srcu_stall_timer rcu-tasks: Replace exit_tasks_rcu_start() initialization with WARN_ON_ONCE() rcu: Remove redundant CONFIG_PROVE_RCU #if condition ...
2024-05-13Merge tag 'asm-generic-alpha' of ↵Linus Torvalds2-16/+4
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull alpha updates from Arnd Bergmann: "I had investigated dropping support for alpha EV5 and earlier a while ago after noticing that this is the only supported CPU family in the kernel without native byte access and that Debian has already dropped support for this generation last year [1] in order to improve performance for the newer machines. This topic came up again when Paul McKenney noticed that parts of the RCU code already rely on byte access and do not work on alpha EV5 reliably, so we decided on using my series to avoid the problem entirely. Al Viro did another series for alpha to address all the known build issues. I rebased his patches without any further changes and included it as a baseline for my work here to avoid conflicts and allow backporting the fixes to stable kernels for the now removed hardware support as well" [ I dearly loved alpha back in the days, but the lack of byte and word operations was a horrible mistake and made everything worse - including very much the crazy IO contortions that resulted from it. It certainly wasn't the only mistake in the architecture, but it's the first-order issue. So while it's a bit sad to see the support for my first alpha go away, if you want to run museum hardware, maybe you should use museum kernels.. - Linus ] * tag 'asm-generic-alpha' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: alpha: drop pre-EV56 support alpha: cabriolet: remove EV5 CPU support alpha: remove LCA and APECS based machines alpha: sable: remove early machine support alpha: remove DECpc AXP150 (Jensen) support alpha: trim the unused stuff from asm-offsets.c alpha: jensen, t2 - make __EXTERN_INLINE same as for the rest alpha: core_lca: take the unused functions out alpha: missing includes alpha: sys_sio: fix misspelled ifdefs alpha: don't make functions public without a reason alpha: add clone3() support alpha: fix modversions for strcpy() et.al. alpha: sort scr_mem{cpy,move}w() out
2024-05-13Merge tag 'soc-drivers-6.10' of ↵Linus Torvalds7-280/+706
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC driver updates from Arnd Bergmann: "As usual, these are updates for drivers that are specific to certain SoCs or firmware running on them. Notable updates include - The new STMicroelectronics STM32 "firewall" bus driver that is used to provide a barrier between different parts of an SoC - Lots of updates for the Qualcomm platform drivers, in particular SCM, which gets a rewrite of its initialization code - Firmware driver updates for Arm FF-A notification interrupts and indirect messaging, SCMI firmware support for pin control and vendor specific interfaces, and TEE firmware interface changes across multiple TEE drivers - A larger cleanup of the Mediatek CMDQ driver and some related bits - Kconfig changes for riscv drivers to prepare for adding Kanaan k230 support - Multiple minor updates for the TI sysc bus driver, memory controllers, hisilicon hccs and more" * tag 'soc-drivers-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (103 commits) firmware: qcom: uefisecapp: Allow on sc8180x Primus and Flex 5G soc: qcom: pmic_glink: Make client-lock non-sleeping dt-bindings: soc: qcom,wcnss: fix bluetooth address example soc/tegra: pmc: Add EQOS wake event for Tegra194 and Tegra234 bus: stm32_firewall: fix off by one in stm32_firewall_get_firewall() bus: etzpc: introduce ETZPC firewall controller driver firmware: arm_ffa: Avoid queuing work when running on the worker queue bus: ti-sysc: Drop legacy idle quirk handling bus: ti-sysc: Drop legacy quirk handling for smartreflex bus: ti-sysc: Drop legacy quirk handling for uarts bus: ti-sysc: Add a description and copyrights bus: ti-sysc: Move check for no-reset-on-init soc: hisilicon: kunpeng_hccs: replace MAILBOX dependency with PCC soc: hisilicon: kunpeng_hccs: Add the check for obtaining complete port attribute firmware: arm_ffa: Fix memory corruption in ffa_msg_send2() bus: rifsc: introduce RIFSC firewall controller driver of: property: fw_devlink: Add support for "access-controller" soc: mediatek: mtk-socinfo: Correct the marketing name for MT8188GV soc: mediatek: mtk-socinfo: Add entry for MT8395AV/ZA Genio 1200 soc: mediatek: mtk-mutex: Add support for MT8188 VPPSYS ...
2024-05-12Merge tag 'x86_urgent_for_v6.9' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Add a new PCI ID which belongs to a new AMD CPU family 0x1a - Ensure that that last level cache ID is set in all cases, in the AMD CPU topology parsing code, in order to prevent invalid scheduling domain CPU masks * tag 'x86_urgent_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/topology/amd: Ensure that LLC ID is initialized x86/amd_nb: Add new PCI IDs for AMD family 0x1a
2024-05-11Merge tag 'mm-hotfixes-stable-2024-05-10-13-14' of ↵Linus Torvalds1-0/+11
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM fixes from Andrew Morton: "18 hotfixes, 7 of which are cc:stable. More fixups for this cycle's page_owner updates. And a few userfaultfd fixes. Otherwise, random singletons - see the individual changelogs for details" * tag 'mm-hotfixes-stable-2024-05-10-13-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mailmap: add entry for Barry Song selftests/mm: fix powerpc ARCH check mailmap: add entry for John Garry XArray: set the marks correctly when splitting an entry selftests/vDSO: fix runtime errors on LoongArch selftests/vDSO: fix building errors on LoongArch mm,page_owner: don't remove __GFP_NOLOCKDEP in add_stack_record_to_list fs/proc/task_mmu: fix uffd-wp confusion in pagemap_scan_pmd_entry() fs/proc/task_mmu: fix loss of young/dirty bits during pagemap scan mm/vmalloc: fix return value of vb_alloc if size is 0 mm: use memalloc_nofs_save() in page_cache_ra_order() kmsan: compiler_types: declare __no_sanitize_or_inline lib/test_xarray.c: fix error assumptions on check_xa_multi_store_adv_add() tools: fix userspace compilation with new test_xarray changes MAINTAINERS: update URL's for KEYS/KEYRINGS_INTEGRITY and TPM DEVICE DRIVER mm: page_owner: fix wrong information in dump_page_owner maple_tree: fix mas_empty_area_rev() null pointer dereference mm/userfaultfd: reset ptes when close() for wr-protected ones
2024-05-10x86/amd_nb: Add new PCI IDs for AMD family 0x1aShyam Sundar S K1-0/+1
Add the new PCI Device IDs to the MISC IDs list to support new generation of AMD 1Ah family 70h Models of processors. [ bp: Massage commit message. ] Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240510111829.969501-1-Shyam-sundar.S-k@amd.com
2024-05-09tpm: disable the TPM if NULL name changesJames Bottomley1-1/+3
Update tpm2_load_context() to return -EINVAL on integrity failures and use this as a signal when loading the NULL context that something might be wrong. If the signal fails, check the name of the NULL primary against the one stored in the chip data and if there is a mismatch disable the TPM because it is likely to have suffered a reset attack. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-09tpm: Add the rest of the session HMAC APIJames Bottomley1-0/+69
The final pieces of the HMAC API are for manipulating the session area of the command. To add an authentication HMAC session tpm_buf_append_hmac_session() is called where tpm2_append_auth() would go. If a non empty password is passed in, this is correctly added to the HMAC to prove knowledge of it without revealing it. Note that if the session is only used to encrypt or decrypt parameters (no authentication) then tpm_buf_append_hmac_session_opt() must be used instead. This functions identically to tpm_buf_append_hmac_session() when TPM_BUS_SECURITY is enabled, but differently when it isn't, because effectively nothing is appended to the session area. Next the parameters should be filled in for the command and finally tpm_buf_fill_hmac_session() is called immediately prior to transmitting the command which computes the correct HMAC and places it in the command at the session location in the tpm buffer Finally, after tpm_transmit_cmd() is called, tpm_buf_check_hmac_response() is called to check that the returned HMAC matched and collect the new state for the next use of the session, if any. The features of the session are controlled by the session attributes set in tpm_buf_append_hmac_session(). If TPM2_SA_CONTINUE_SESSION is not specified, the session will be flushed and the tpm2_auth structure freed in tpm_buf_check_hmac_response(); otherwise the session may be used again. Parameter encryption is specified by or'ing the flag TPM2_SA_DECRYPT and response encryption by or'ing the flag TPM2_SA_ENCRYPT. the various encryptions will be taken care of by tpm_buf_fill_hmac_session() and tpm_buf_check_hmac_response() respectively. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> # crypto API parts Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-09tpm: Add HMAC session name/handle appendJames Bottomley1-0/+26
Add tpm2_append_name() for appending to the handle area of the TPM command. When TPM_BUS_SECURITY is enabled and HMAC sessions are in use this adds the standard u32 handle to the buffer but additionally records the name of the object which must be used as part of the HMAC computation. The name of certain object types (volatile and permanent handles and NV indexes) is a hash of the public area of the object. Since this hash is not known ahead of time, it must be requested from the TPM using TPM2_ReadPublic() (which cannot be HMAC protected, but if an interposer lies about it, the HMAC check will fail and the problem will be detected). Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> # crypto API parts Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-09tpm: Add HMAC session start and end functionsJames Bottomley1-0/+34
Add session based HMAC authentication plus parameter decryption and response encryption using AES. The basic design is to segregate all the nasty crypto, hash and hmac code into tpm2-sessions.c and export a usable API. The API first of all starts off by gaining a session with tpm2_start_auth_session() which initiates a session with the TPM and allocates an opaque tpm2_auth structure to handle the session parameters. The design is that session use will be single threaded from start to finish under the ops lock, so the tpm2_auth structure is stored in struct tpm2_chip to simpify the externally visible API. The session can be ended with tpm2_end_auth_session() which is designed only to be used in error legs. Ordinarily the further session API (future patches) will end or continue the session appropriately without having to call this. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> # crypto API parts Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-09tpm: Add NULL primary creationJames Bottomley1-0/+69
The session handling code uses a "salted" session, meaning a session whose salt is encrypted to the public part of another TPM key so an observer cannot obtain it (and thus deduce the session keys). This patch creates and context saves in the tpm_chip area the primary key of the NULL hierarchy for this purpose. [jarkko@kernel.org: fixed documentation errors] Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-09tpm: add buffer function to point to returned parametersJames Bottomley1-0/+2
Replace all instances of &buf.data[TPM_HEADER_SIZE] with a new function tpm_buf_parameters() because encryption sessions change where the return parameters are located in the buffer since if a return session is present they're 4 bytes beyond the header with those 4 bytes giving the parameter length. If there is no return session, then they're in the usual place immediately after the header. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-09tpm: Add tpm_buf_read_{u8,u16,u32}Jarkko Sakkinen1-0/+5
Declare reader functions for the instances of struct tpm_buf. If the read goes out of boundary, TPM_BUF_BOUNDARY_ERROR is set, and subsequent read will do nothing. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-09tpm: TPM2B formatted buffersJarkko Sakkinen1-0/+4
Declare tpm_buf_init_sized() and tpm_buf_reset_sized() for creating TPM2B formatted buffers. These buffers are also known as sized buffers in the specifications and literature. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-09tpm: Store the length of the tpm_buf data separately.Jarkko Sakkinen1-3/+3
TPM2B buffers, or sized buffers, have a two byte header, which contains the length of the payload as a 16-bit big-endian number, without counting in the space taken by the header. This differs from encoding in the TPM header where the length includes also the bytes taken by the header. Unbound the length of a tpm_buf from the value stored to the TPM command header. A separate encoding and decoding step so that different buffer types can be supported, with variant header format and length encoding. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-09tpm: Update struct tpm_buf documentation commentsJarkko Sakkinen1-5/+4
Remove deprecated portions and document enum values. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-09tpm: Move buffer handling from static inlines to real functionsJames Bottomley1-71/+9
separate out the tpm_buf_... handling functions from static inlines in tpm.h and move them to their own tpm-buf.c file. This is a precursor to adding new functions for other TPM type handling because the amount of code will grow from the current 70 lines in tpm.h to about 200 lines when the additions are done. 200 lines of inline functions is a bit too much to keep in a header file. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-09tpm: Remove tpm_send()Jarkko Sakkinen1-5/+0
Open code the last remaining call site for tpm_send(). Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-09tpm: Remove unused tpm_buf_tag()Jarkko Sakkinen1-7/+0
The helper function has no call sites. Thus, remove it. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-05-09Merge tag 'net-6.9-rc8' of ↵Linus Torvalds1-0/+15
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bluetooth and IPsec. The bridge patch is actually a follow-up to a recent fix in the same area. We have a pending v6.8 AF_UNIX regression; it should be solved soon, but not in time for this PR. Current release - regressions: - eth: ks8851: Queue RX packets in IRQ handler instead of disabling BHs - net: bridge: fix corrupted ethernet header on multicast-to-unicast Current release - new code bugs: - xfrm: fix possible bad pointer derferencing in error path Previous releases - regressionis: - core: fix out-of-bounds access in ops_init - ipv6: - fix potential uninit-value access in __ip6_make_skb() - fib6_rules: avoid possible NULL dereference in fib6_rule_action() - tcp: use refcount_inc_not_zero() in tcp_twsk_unique(). - rtnetlink: correct nested IFLA_VF_VLAN_LIST attribute validation - rxrpc: fix congestion control algorithm - bluetooth: - l2cap: fix slab-use-after-free in l2cap_connect() - msft: fix slab-use-after-free in msft_do_close() - eth: hns3: fix kernel crash when devlink reload during initialization - eth: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 family Previous releases - always broken: - xfrm: preserve vlan tags for transport mode software GRO - tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets - eth: hns3: keep using user config after hardware reset" * tag 'net-6.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits) net: dsa: mv88e6xxx: read cmode on mv88e6320/21 serdes only ports net: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 family net: hns3: fix kernel crash when devlink reload during initialization net: hns3: fix port vlan filter not disabled issue net: hns3: use appropriate barrier function after setting a bit value net: hns3: release PTP resources if pf initialization failed net: hns3: change type of numa_node_mask as nodemask_t net: hns3: direct return when receive a unknown mailbox message net: hns3: using user configure after hardware reset net/smc: fix neighbour and rtable leak in smc_ib_find_route() ipv6: prevent NULL dereference in ip6_output() hsr: Simplify code for announcing HSR nodes timer setup ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action() dt-bindings: net: mediatek: remove wrongly added clocks and SerDes rxrpc: Only transmit one ACK per jumbo packet received rxrpc: Fix congestion control algorithm selftests: test_bridge_neigh_suppress.sh: Fix failures due to duplicate MAC ipv6: Fix potential uninit-value access in __ip6_make_skb() net: phy: marvell-88q2xxx: add support for Rev B1 and B2 appletalk: Improve handling of broadcast packets ...
2024-05-09file: add fd_raw cleanup classChristian Brauner1-0/+1
So we can also use CLASS(fd_raw, f)(fd) for codepaths where we allow FMODE_PATH aka O_PATH file descriptors to be used. Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-05-08fs/coredump: Enable dynamic configuration of max file note sizeAllen Pais1-0/+2
Introduce the capability to dynamically configure the maximum file note size for ELF core dumps via sysctl. Why is this being done? We have observed that during a crash when there are more than 65k mmaps in memory, the existing fixed limit on the size of the ELF notes section becomes a bottleneck. The notes section quickly reaches its capacity, leading to incomplete memory segment information in the resulting coredump. This truncation compromises the utility of the coredumps, as crucial information about the memory state at the time of the crash might be omitted. This enhancement removes the previous static limit of 4MB, allowing system administrators to adjust the size based on system-specific requirements or constraints. Eg: $ sysctl -a | grep core_file_note_size_limit kernel.core_file_note_size_limit = 4194304 $ sysctl -n kernel.core_file_note_size_limit 4194304 $echo 519304 > /proc/sys/kernel/core_file_note_size_limit $sysctl -n kernel.core_file_note_size_limit 519304 Attempting to write beyond the ceiling value of 16MB $echo 17194304 > /proc/sys/kernel/core_file_note_size_limit bash: echo: write error: Invalid argument Signed-off-by: Vijay Nag <nagvijay@microsoft.com> Signed-off-by: Allen Pais <apais@linux.microsoft.com> Link: https://lore.kernel.org/r/20240506193700.7884-1-apais@linux.microsoft.com Signed-off-by: Kees Cook <keescook@chromium.org>
2024-05-07kcsan, compiler_types: Introduce __data_racy type qualifierMarco Elver1-0/+7
Based on the discussion at [1], it would be helpful to mark certain variables as explicitly "data racy", which would result in KCSAN not reporting data races involving any accesses on such variables. To do that, introduce the __data_racy type qualifier: struct foo { ... int __data_racy bar; ... }; In KCSAN-kernels, __data_racy turns into volatile, which KCSAN already treats specially by considering them "marked". In non-KCSAN kernels the type qualifier turns into no-op. The generated code between KCSAN-instrumented kernels and non-KCSAN kernels is already huge (inserted calls into runtime for every memory access), so the extra generated code (if any) due to volatile for few such __data_racy variables are unlikely to have measurable impact on performance. Link: https://lore.kernel.org/all/CAHk-=wi3iondeh_9V2g3Qz5oHTRjLsOpoy83hb58MVh=nRZe0A@mail.gmail.com/ [1] Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Marco Elver <elver@google.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2024-05-07md: Revert "md: Fix overflow in is_mddev_idle"Li Nan1-1/+1
This reverts commit 3f9f231236ce7e48780d8a4f1f8cb9fae2df1e4e. Using 64bit for 'sync_io' is unnecessary from the gendisk side. This overflow will not cause any functional impact, except for a UBSAN warning. Solving this overflow requires introducing additional calculations and checks which are not necessary. So just keep using 32bit for 'sync_io'. Signed-off-by: Li Nan <linan122@huawei.com> Link: https://lore.kernel.org/r/20240507023103.781816-1-linan666@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-05-07block: add a blk_alloc_discard_bio helperChristoph Hellwig1-0/+3
Factor out a helper from __blkdev_issue_discard that chews off as much as possible from a discard range and allocates a bio for it. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240506042027.2289826-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-05-07block: add a bio_chain_and_submit helperChristoph Hellwig1-0/+1
This is basically blk_next_bio just with the bio allocation moved to the caller to allow for more flexible bio handling in the caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240506042027.2289826-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-05-06Reapply "drm/qxl: simplify qxl_fence_wait"Linus Torvalds1-7/+0
This reverts commit 07ed11afb68d94eadd4ffc082b97c2331307c5ea. Stephen Rostedt reports: "I went to run my tests on my VMs and the tests hung on boot up. Unfortunately, the most I ever got out was: [ 93.607888] Testing event system initcall: OK [ 93.667730] Running tests on all trace events: [ 93.669757] Testing all events: OK [ 95.631064] ------------[ cut here ]------------ Timed out after 60 seconds" and further debugging points to a possible circular locking dependency between the console_owner locking and the worker pool locking. Reverting the commit allows Steve's VM to boot to completion again. [ This may obviously result in the "[TTM] Buffer eviction failed" messages again, which was the reason for that original revert. But at this point this seems preferable to a non-booting system... ] Reported-and-bisected-by: Steven Rostedt <rostedt@goodmis.org> Link: https://lore.kernel.org/all/20240502081641.457aa25f@gandalf.local.home/ Acked-by: Maxime Ripard <mripard@kernel.org> Cc: Alex Constantino <dreaming.about.electric.sheep@gmail.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Timo Lindfors <timo.lindfors@iki.fi> Cc: Dave Airlie <airlied@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-05-06Merge tag 'slab-for-6.9-rc7-fixes' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab fixes from Vlastimil Babka: - Fix for cleanup infrastructure (Dan Carpenter) This makes the __free(kfree) cleanup hooks not crash on error pointers. - SLUB fix for freepointer checking (Nicolas Bouchinet) This fixes a recently introduced bug that manifests when init_on_free, CONFIG_SLAB_FREELIST_HARDENED and consistency checks (slub_debug=F) are all enabled, and results in false-positive freepointer corrupt reports for caches that store freepointer outside of the object area. * tag 'slab-for-6.9-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: mm/slab: make __free(kfree) accept error pointers mm/slub: avoid zeroing outside-object freepointer for single free
2024-05-06alpha: drop pre-EV56 supportArnd Bergmann2-16/+4
All EV4 machines are already gone, and the remaining EV5 based machines all support the slightly more modern EV56 generation as well. Debian only supports EV56 and later. Drop both of these and build kernels optimized for EV56 and higher when the "generic" options is selected, tuning for an out-of-order EV6 pipeline, same as Debian userspace. Since this was the only supported architecture without 8-bit and 16-bit stores, common kernel code no longer has to worry about aligning struct members, and existing workarounds from the block and tty layers can be removed. The alpha memory management code no longer needs an abstraction for the differences between EV4 and EV5+. Link: https://lists.debian.org/debian-alpha/2023/05/msg00009.html Acked-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-06kmsan: compiler_types: declare __no_sanitize_or_inlineAlexander Potapenko1-0/+11
It turned out that KMSAN instruments READ_ONCE_NOCHECK(), resulting in false positive reports, because __no_sanitize_or_inline enforced inlining. Properly declare __no_sanitize_or_inline under __SANITIZE_MEMORY__, so that it does not __always_inline the annotated function. Link: https://lkml.kernel.org/r/20240426091622.3846771-1-glider@google.com Fixes: 5de0ce85f5a4 ("kmsan: mark noinstr as __no_sanitize_memory") Signed-off-by: Alexander Potapenko <glider@google.com> Reported-by: syzbot+355c5bb8c1445c871ee8@syzkaller.appspotmail.com Link: https://lkml.kernel.org/r/000000000000826ac1061675b0e3@google.com Cc: <stable@vger.kernel.org> Reviewed-by: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-05Merge tag 'trace-v6.9-rc6-2' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing and tracefs fixes from Steven Rostedt: - Fix RCU callback of freeing an eventfs_inode. The freeing of the eventfs_inode from the kref going to zero freed the contents of the eventfs_inode and then used kfree_rcu() to free the inode itself. But the contents should also be protected by RCU. Switch to a call_rcu() that calls a function to free all of the eventfs_inode after the RCU synchronization. - The tracing subsystem maps its own descriptor to a file represented by eventfs. The freeing of this descriptor needs to know when the last reference of an eventfs_inode is released, but currently there is no interface for that. Add a "release" callback to the eventfs_inode entry array that allows for freeing of data that can be referenced by the eventfs_inode being opened. Then increment the ref counter for this descriptor when the eventfs_inode file is created, and decrement/free it when the last reference to the eventfs_inode is released and the file is removed. This prevents races between freeing the descriptor and the opening of the eventfs file. - Fix the permission processing of eventfs. The change to make the permissions of eventfs default to the mount point but keep track of when changes were made had a side effect that could cause security concerns. When the tracefs is remounted with a given gid or uid, all the files within it should inherit that gid or uid. But if the admin had changed the permission of some file within the tracefs file system, it would not get updated by the remount. This caused the kselftest of file permissions to fail the second time it is run. The first time, all changes would look fine, but the second time, because the changes were "saved", the remount did not reset them. Create a link list of all existing tracefs inodes, and clear the saved flags on them on a remount if the remount changes the corresponding gid or uid fields. This also simplifies the code by removing the distinction between the toplevel eventfs and an instance eventfs. They should both act the same. They were different because of a misconception due to the remount not resetting the flags. Now that remount resets all the files and directories to default to the root node if a uid/gid is specified, it makes the logic simpler to implement. * tag 'trace-v6.9-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: eventfs: Have "events" directory get permissions from its parent eventfs: Do not treat events directory different than other directories eventfs: Do not differentiate the toplevel events directory tracefs: Still use mount point as default permissions for instances tracefs: Reset permissions on remount if permissions are options eventfs: Free all of the eventfs_inode after RCU eventfs/tracing: Add callback for release of an eventfs_inode
2024-05-04fs: WARN when f_count resurrection is attemptedKees Cook1-1/+2
It should never happen that get_file() is called on a file with f_count equal to zero. If this happens, a use-after-free condition has happened[1], and we need to attempt a best-effort reporting of the situation to help find the root cause more easily. Additionally, this serves as a data corruption indicator that system owners using warn_limit or panic_on_warn would like to have detected. Link: https://lore.kernel.org/lkml/7c41cf3c-2a71-4dbb-8f34-0337890906fc@gmail.com/ [1] Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20240503201620.work.651-kees@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-05-04eventfs/tracing: Add callback for release of an eventfs_inodeSteven Rostedt (Google)1-0/+3
Synthetic events create and destroy tracefs files when they are created and removed. The tracing subsystem has its own file descriptor representing the state of the events attached to the tracefs files. There's a race between the eventfs files and this file descriptor of the tracing system where the following can cause an issue: With two scripts 'A' and 'B' doing: Script 'A': echo "hello int aaa" > /sys/kernel/tracing/synthetic_events while : do echo 0 > /sys/kernel/tracing/events/synthetic/hello/enable done Script 'B': echo > /sys/kernel/tracing/synthetic_events Script 'A' creates a synthetic event "hello" and then just writes zero into its enable file. Script 'B' removes all synthetic events (including the newly created "hello" event). What happens is that the opening of the "enable" file has: { struct trace_event_file *file = inode->i_private; int ret; ret = tracing_check_open_get_tr(file->tr); [..] But deleting the events frees the "file" descriptor, and a "use after free" happens with the dereference at "file->tr". The file descriptor does have a reference counter, but there needs to be a way to decrement it from the eventfs when the eventfs_inode is removed that represents this file descriptor. Add an optional "release" callback to the eventfs_entry array structure, that gets called when the eventfs file is about to be removed. This allows for the creating on the eventfs file to increment the tracing file descriptor ref counter. When the eventfs file is deleted, it can call the release function that will call the put function for the tracing file descriptor. This will protect the tracing file from being freed while a eventfs file that references it is being opened. Link: https://lore.kernel.org/linux-trace-kernel/20240426073410.17154-1-Tze-nan.Wu@mediatek.com/ Link: https://lore.kernel.org/linux-trace-kernel/20240502090315.448cba46@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode") Reported-by: Tze-nan wu <Tze-nan.Wu@mediatek.com> Tested-by: Tze-nan Wu (吳澤南) <Tze-nan.Wu@mediatek.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-05-04Merge tag 'ipsec-2024-05-02' of ↵Jakub Kicinski1-0/+15
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2024-05-02 1) Fix an error pointer dereference in xfrm_in_fwd_icmp. From Antony Antony. 2) Preserve vlan tags for ESP transport mode software GRO. From Paul Davey. 3) Fix a spelling mistake in an uapi xfrm.h comment. From Anotny Antony. * tag 'ipsec-2024-05-02' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec: xfrm: Correct spelling mistake in xfrm.h comment xfrm: Preserve vlan tags for transport mode software GRO xfrm: fix possible derferencing in error path ==================== Link: https://lore.kernel.org/r/20240502084838.2269355-1-steffen.klassert@secunet.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-03Merge tag 'sound-6.9-rc7' of ↵Linus Torvalds1-0/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "As usual in a late stage, we received a fair amount of fixes for ASoC, and it became bigger than wished. But all fixes are rather device- specific, and they look pretty safe to apply. A major par of changes are series of fixes for ASoC meson and SOF drivers as well as for Realtek and Cirrus codecs. In addition, recent emu10k1 regression fixes and usual HD-audio quirks are included" * tag 'sound-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (46 commits) ALSA: hda/realtek: Fix build error without CONFIG_PM ALSA: hda/realtek: Fix conflicting PCI SSID 17aa:386f for Lenovo Legion models ALSA: hda/realtek - Set GPIO3 to default at S4 state for Thinkpad with ALC1318 ALSA: hda: intel-sdw-acpi: fix usage of device_get_named_child_node() ALSA: hda: intel-dsp-config: harden I2C/I2S codec detection ASoC: cs35l56: fix usages of device_get_named_child_node() ASoC: da7219-aad: fix usage of device_get_named_child_node() ASoC: meson: cards: select SND_DYNAMIC_MINORS ASoC: meson: axg-tdm: add continuous clock support ASoC: meson: axg-tdm-interface: manage formatters in trigger ASoC: meson: axg-card: make links nonatomic ASoC: meson: axg-fifo: use threaded irq to check periods ALSA: hda/realtek: Fix mute led of HP Laptop 15-da3001TU ALSA: emu10k1: make E-MU FPGA writes potentially more reliable ALSA: emu10k1: fix E-MU dock initialization ALSA: emu10k1: use mutex for E-MU FPGA access locking ALSA: emu10k1: move the whole GPIO event handling to the workqueue ALSA: emu10k1: factor out snd_emu1010_load_dock_firmware() ALSA: emu10k1: fix E-MU card dock presence monitoring ASoC: rt715-sdca: volume step modification ...
2024-05-03block: add a disk_has_partscan helperChristoph Hellwig1-0/+13
Add a helper to check if partition scanning is enabled instead of open coding the check in a few places. This now always checks for the hidden flag even if all but one of the callers are never reachable for hidden gendisks. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240502130033.1958492-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>