summaryrefslogtreecommitdiff
path: root/Documentation
AgeCommit message (Collapse)AuthorFilesLines
2022-03-25tools/vm/page_owner_sort.c: support for user-defined culling rulesJiajian Ye1-1/+28
When viewing page owner information, we may want to cull blocks of information with our own rules. So it is important to enhance culling function to provide the support for customizing culling rules. Therefore, following adjustments are made: 1. Add --cull option to support the culling of blocks of information with user-defined culling rules. ./page_owner_sort <input> <output> --cull=<rules> ./page_owner_sort <input> <output> --cull <rules> <rules> is a single argument in the form of a comma-separated list to specify individual culling rules, by the sequence of keys k1,k2, .... Mixed use of abbreviated and complete-form of keys is allowed. For reference, please see the document(Documentation/vm/page_owner.rst). Now, assuming two blocks in the input file are as follows: Page allocated via order 0, mask xxxx, pid 1, tgid 1 (task_name_demo) PFN xxxx prep_new_page+0xd0/0xf8 get_page_from_freelist+0x4a0/0x1290 __alloc_pages+0x168/0x340 alloc_pages+0xb0/0x158 Page allocated via order 0, mask xxxx, pid 32, tgid 32 (task_name_demo) PFN xxxx prep_new_page+0xd0/0xf8 get_page_from_freelist+0x4a0/0x1290 __alloc_pages+0x168/0x340 alloc_pages+0xb0/0x158 If we want to cull the blocks by stacktrace and task command name, we can use this command: ./page_owner_sort <input> <output> --cull=stacktrace,name The output would be like: 2 times, 2 pages, task_comm_name: task_name_demo prep_new_page+0xd0/0xf8 get_page_from_freelist+0x4a0/0x1290 __alloc_pages+0x168/0x340 alloc_pages+0xb0/0x158 As we can see, these two blocks are culled successfully, for they share the same pid and task command name. However, if we want to cull the blocks by pid, stacktrace and task command name, we can this command: ./page_owner_sort <input> <output> --cull=stacktrace,name,pid The output would be like: 1 times, 1 pages, PID 1, task_comm_name: task_name_demo prep_new_page+0xd0/0xf8 get_page_from_freelist+0x4a0/0x1290 __alloc_pages+0x168/0x340 alloc_pages+0xb0/0x158 1 times, 1 pages, PID 32, task_comm_name: task_name_demo prep_new_page+0xd0/0xf8 get_page_from_freelist+0x4a0/0x1290 __alloc_pages+0x168/0x340 alloc_pages+0xb0/0x158 As we can see, these two blocks are failed to cull, for their PIDs are different. 2. Add explanations of --cull options to the document. This work is coauthored by Yixuan Cao Shenghong Han Yinan Zhang Chongxi Zhao Yuhong Feng Link: https://lkml.kernel.org/r/20220312145834.624-1-yejiajian2018@email.szu.edu.cn Signed-off-by: Jiajian Ye <yejiajian2018@email.szu.edu.cn> Cc: Yixuan Cao <caoyixuan2019@email.szu.edu.cn> Cc: Shenghong Han <hanshenghong2019@email.szu.edu.cn> Cc: Yinan Zhang <zhangyinan2019@email.szu.edu.cn> Cc: Chongxi Zhao <zhaochongxi2019@email.szu.edu.cn> Cc: Yuhong Feng <yuhongf@szu.edu.cn> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Sean Anderson <seanga2@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-25tools/vm/page_owner_sort.c: support for selecting by PID, TGID or task ↵Jiajian Ye1-0/+5
command name When viewing page owner information, we may also need to select the blocks by PID, TGID or task command name, which helps to get more accurate page allocation information as needed. Therefore, following adjustments are made: 1. Add three new options, including --pid, --tgid and --name, to support the selection of information blocks by a specific pid, tgid and task command name. In addtion, multiple options are allowed to be used at the same time. ./page_owner_sort [input] [output] --pid <PID> ./page_owner_sort [input] [output] --tgid <TGID> ./page_owner_sort [input] [output] --name <TASK_COMMAND_NAME> Assuming a scenario when a multi-threaded program, ./demo (PID = 5280), is running, and ./demo creates a child process (PID = 5281). $ps PID TTY TIME CMD 5215 pts/0 00:00:00 bash 5280 pts/0 00:00:00 ./demo 5281 pts/0 00:00:00 ./demo 5282 pts/0 00:00:00 ps It would be better to filter out the records with tgid=5280 and the task name "demo" when debugging the parent process, and the specific usage is ./page_owner_sort [input] [output] --tgid 5280 --name demo 2. Add explanations of three new options, including --pid, --tgid and --name, to the document. This work is coauthored by Shenghong Han <hanshenghong2019@email.szu.edu.cn>, Yixuan Cao <caoyixuan2019@email.szu.edu.cn>, Yinan Zhang <zhangyinan2019@email.szu.edu.cn>, Chongxi Zhao <zhaochongxi2019@email.szu.edu.cn>, Yuhong Feng <yuhongf@szu.edu.cn>. Link: https://lkml.kernel.org/r/1646835223-7584-1-git-send-email-yejiajian2018@email.szu.edu.cn Signed-off-by: Jiajian Ye <yejiajian2018@email.szu.edu.cn> Cc: Sean Anderson <seanga2@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Zhenliang Wei <weizhenliang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-25tools/vm/page_owner_sort: support for sorting by task command nameJiajian Ye1-0/+1
When viewing page owner information, we may also need to the block to be sorted by task command name. Therefore, the following adjustments are made: 1. Add a member variable to record task command name of block. 2. Add a new -n option to sort the information of blocks by task command name. 3. Add -n option explanation in the document. Link: https://lkml.kernel.org/r/20220306030640.43054-2-yejiajian2018@email.szu.edu.cn Signed-off-by: Jiajian Ye <yejiajian2018@email.szu.edu.cn> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Sean Anderson <seanga2@gmail.com> Cc: Yixuan Cao <caoyixuan2019@email.szu.edu.cn> Cc: Zhenliang Wei <weizhenliang@huawei.com> Cc: <zhaochongxi2019@email.szu.edu.cn> Cc: <hanshenghong2019@email.szu.edu.cn> Cc: <zhangyinan2019@email.szu.edu.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-25tools/vm/page_owner_sort.c: support sorting by tgid and update documentationJiajian Ye1-0/+1
When the "page owner" information is read, the information sorted by TGID is expected. As a result, the following adjustments have been made: 1. Add a new -P option to sort the information of blocks by TGID in ascending order. 2. Adjust the order of member variables in block_list strust to avoid one 4 byte hole. 3. Add -P option explanation in the document. Link: https://lkml.kernel.org/r/20220301151438.166118-3-yejiajian2018@email.szu.edu.cn Signed-off-by: Jiajian Ye <yejiajian2018@email.szu.edu.cn> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Yixuan Cao <caoyixuan2019@email.szu.edu.cn> Cc: Zhenliang Wei <weizhenliang@huawei.com> Cc: Yinan Zhang <zhangyinan2019@email.szu.edu.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-25tools/vm/page_owner_sort.c: fix commentsJiajian Ye1-2/+2
Two adjustments are made: 1. Correct a grammatical error: replace the "what" in "Do the job what you want to debug" with "that". 2. Replace "has not been" with "has been" in the description of the -f option: According to Commit b1c9ba071e7d ("tools/vm/page_owner_sort.c: fix the instructions for use"), the description of the "-f" option is "Filter out the information of blocks whose memory has been released." Link: https://lkml.kernel.org/r/20220301151438.166118-1-yejiajian2018@email.szu.edu.cn Signed-off-by: Jiajian Ye <yejiajian2018@email.szu.edu.cn> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Yinan Zhang <zhangyinan2019@email.szu.edu.cn> Cc: Yixuan Cao <caoyixuan2019@email.szu.edu.cn> Cc: Zhenliang Wei <weizhenliang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-25Documentation/vm/page_owner.rst: fix unexpected indentation warnsShuah Khan1-3/+3
Fix Unexpected indentation warns in page_owner: Documentation/vm/page_owner.rst:92: WARNING: Unexpected indentation. Documentation/vm/page_owner.rst:96: WARNING: Unexpected indentation. Documentation/vm/page_owner.rst:107: WARNING: Unexpected indentation. Link: https://lkml.kernel.org/r/20211215001929.47866-1-skhan@linuxfoundation.org Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-25Documentation/vm/page_owner.rst: update the documentationShenghong Han1-2/+21
Update the documentation of ``page_owner``. [akpm@linux-foundation.org: small grammatical tweaks] Link: https://lkml.kernel.org/r/20211214134736.2569-1-hanshenghong2019@email.szu.edu.cn Signed-off-by: Shenghong Han <hanshenghong2019@email.szu.edu.cn> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Georgi Djakov <georgi.djakov@linaro.org> Cc: Liam Mark <lmark@codeaurora.org> Cc: Tang Bin <tangbin@cmss.chinamobile.com> Cc: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Cc: Zhenliang Wei <weizhenliang@huawei.com> Cc: Xiaoming Ni <nixiaoming@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-25Merge branch 'akpm' (patches from Andrew)Linus Torvalds4-3/+16
Merge more updates from Andrew Morton: "Various misc subsystems, before getting into the post-linux-next material. 41 patches. Subsystems affected by this patch series: procfs, misc, core-kernel, lib, checkpatch, init, pipe, minix, fat, cgroups, kexec, kdump, taskstats, panic, kcov, resource, and ubsan" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (41 commits) Revert "ubsan, kcsan: Don't combine sanitizer with kcov on clang" kernel/resource: fix kfree() of bootmem memory again kcov: properly handle subsequent mmap calls kcov: split ioctl handling into locked and unlocked parts panic: move panic_print before kmsg dumpers panic: add option to dump all CPUs backtraces in panic_print docs: sysctl/kernel: add missing bit to panic_print taskstats: remove unneeded dead assignment kasan: no need to unset panic_on_warn in end_report() ubsan: no need to unset panic_on_warn in ubsan_epilogue() panic: unset panic_on_warn inside panic() docs: kdump: add scp example to write out the dump file docs: kdump: update description about sysfs file system support arm64: mm: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef x86/setup: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef riscv: mm: init: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef kexec: make crashk_res, crashk_low_res and crash_notes symbols always visible cgroup: use irqsave in cgroup_rstat_flush_locked(). fat: use pointer to simple type in put_user() minix: fix bug when opening a file with O_DIRECT ...
2022-03-24Merge tag 'net-next-5.18' of ↵Linus Torvalds45-491/+2011
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "The sprinkling of SPI drivers is because we added a new one and Mark sent us a SPI driver interface conversion pull request. Core ---- - Introduce XDP multi-buffer support, allowing the use of XDP with jumbo frame MTUs and combination with Rx coalescing offloads (LRO). - Speed up netns dismantling (5x) and lower the memory cost a little. Remove unnecessary per-netns sockets. Scope some lists to a netns. Cut down RCU syncing. Use batch methods. Allow netdev registration to complete out of order. - Support distinguishing timestamp types (ingress vs egress) and maintaining them across packet scrubbing points (e.g. redirect). - Continue the work of annotating packet drop reasons throughout the stack. - Switch netdev error counters from an atomic to dynamically allocated per-CPU counters. - Rework a few preempt_disable(), local_irq_save() and busy waiting sections problematic on PREEMPT_RT. - Extend the ref_tracker to allow catching use-after-free bugs. BPF --- - Introduce "packing allocator" for BPF JIT images. JITed code is marked read only, and used to be allocated at page granularity. Custom allocator allows for more efficient memory use, lower iTLB pressure and prevents identity mapping huge pages from getting split. - Make use of BTF type annotations (e.g. __user, __percpu) to enforce the correct probe read access method, add appropriate helpers. - Convert the BPF preload to use light skeleton and drop the user-mode-driver dependency. - Allow XDP BPF_PROG_RUN test infra to send real packets, enabling its use as a packet generator. - Allow local storage memory to be allocated with GFP_KERNEL if called from a hook allowed to sleep. - Introduce fprobe (multi kprobe) to speed up mass attachment (arch bits to come later). - Add unstable conntrack lookup helpers for BPF by using the BPF kfunc infra. - Allow cgroup BPF progs to return custom errors to user space. - Add support for AF_UNIX iterator batching. - Allow iterator programs to use sleepable helpers. - Support JIT of add, and, or, xor and xchg atomic ops on arm64. - Add BTFGen support to bpftool which allows to use CO-RE in kernels without BTF info. - Large number of libbpf API improvements, cleanups and deprecations. Protocols --------- - Micro-optimize UDPv6 Tx, gaining up to 5% in test on dummy netdev. - Adjust TSO packet sizes based on min_rtt, allowing very low latency links (data centers) to always send full-sized TSO super-frames. - Make IPv6 flow label changes (AKA hash rethink) more configurable, via sysctl and setsockopt. Distinguish between server and client behavior. - VxLAN support to "collect metadata" devices to terminate only configured VNIs. This is similar to VLAN filtering in the bridge. - Support inserting IPv6 IOAM information to a fraction of frames. - Add protocol attribute to IP addresses to allow identifying where given address comes from (kernel-generated, DHCP etc.) - Support setting socket and IPv6 options via cmsg on ping6 sockets. - Reject mis-use of ECN bits in IP headers as part of DSCP/TOS. Define dscp_t and stop taking ECN bits into account in fib-rules. - Add support for locked bridge ports (for 802.1X). - tun: support NAPI for packets received from batched XDP buffs, doubling the performance in some scenarios. - IPv6 extension header handling in Open vSwitch. - Support IPv6 control message load balancing in bonding, prevent neighbor solicitation and advertisement from using the wrong port. Support NS/NA monitor selection similar to existing ARP monitor. - SMC - improve performance with TCP_CORK and sendfile() - support auto-corking - support TCP_NODELAY - MCTP (Management Component Transport Protocol) - add user space tag control interface - I2C binding driver (as specified by DMTF DSP0237) - Multi-BSSID beacon handling in AP mode for WiFi. - Bluetooth: - handle MSFT Monitor Device Event - add MGMT Adv Monitor Device Found/Lost events - Multi-Path TCP: - add support for the SO_SNDTIMEO socket option - lots of selftest cleanups and improvements - Increase the max PDU size in CAN ISOTP to 64 kB. Driver API ---------- - Add HW counters for SW netdevs, a mechanism for devices which offload packet forwarding to report packet statistics back to software interfaces such as tunnels. - Select the default NIC queue count as a fraction of number of physical CPU cores, instead of hard-coding to 8. - Expose devlink instance locks to drivers. Allow device layer of drivers to use that lock directly instead of creating their own which always runs into ordering issues in devlink callbacks. - Add header/data split indication to guide user space enabling of TCP zero-copy Rx. - Allow configuring completion queue event size. - Refactor page_pool to enable fragmenting after allocation. - Add allocation and page reuse statistics to page_pool. - Improve Multiple Spanning Trees support in the bridge to allow reuse of topologies across VLANs, saving HW resources in switches. - DSA (Distributed Switch Architecture): - replay and offload of host VLAN entries - offload of static and local FDB entries on LAG interfaces - FDB isolation and unicast filtering New hardware / drivers ---------------------- - Ethernet: - LAN937x T1 PHYs - Davicom DM9051 SPI NIC driver - Realtek RTL8367S, RTL8367RB-VB switch and MDIO - Microchip ksz8563 switches - Netronome NFP3800 SmartNICs - Fungible SmartNICs - MediaTek MT8195 switches - WiFi: - mt76: MediaTek mt7916 - mt76: MediaTek mt7921u USB adapters - brcmfmac: Broadcom BCM43454/6 - Mobile: - iosm: Intel M.2 7360 WWAN card Drivers ------- - Convert many drivers to the new phylink API built for split PCS designs but also simplifying other cases. - Intel Ethernet NICs: - add TTY for GNSS module for E810T device - improve AF_XDP performance - GTP-C and GTP-U filter offload - QinQ VLAN support - Mellanox Ethernet NICs (mlx5): - support xdp->data_meta - multi-buffer XDP - offload tc push_eth and pop_eth actions - Netronome Ethernet NICs (nfp): - flow-independent tc action hardware offload (police / meter) - AF_XDP - Other Ethernet NICs: - at803x: fiber and SFP support - xgmac: mdio: preamble suppression and custom MDC frequencies - r8169: enable ASPM L1.2 if system vendor flags it as safe - macb/gem: ZynqMP SGMII - hns3: add TX push mode - dpaa2-eth: software TSO - lan743x: multi-queue, mdio, SGMII, PTP - axienet: NAPI and GRO support - Mellanox Ethernet switches (mlxsw): - source and dest IP address rewrites - RJ45 ports - Marvell Ethernet switches (prestera): - basic routing offload - multi-chain TC ACL offload - NXP embedded Ethernet switches (ocelot & felix): - PTP over UDP with the ocelot-8021q DSA tagging protocol - basic QoS classification on Felix DSA switch using dcbnl - port mirroring for ocelot switches - Microchip high-speed industrial Ethernet (sparx5): - offloading of bridge port flooding flags - PTP Hardware Clock - Other embedded switches: - lan966x: PTP Hardward Clock - qca8k: mdio read/write operations via crafted Ethernet packets - Qualcomm 802.11ax WiFi (ath11k): - add LDPC FEC type and 802.11ax High Efficiency data in radiotap - enable RX PPDU stats in monitor co-exist mode - Intel WiFi (iwlwifi): - UHB TAS enablement via BIOS - band disablement via BIOS - channel switch offload - 32 Rx AMPDU sessions in newer devices - MediaTek WiFi (mt76): - background radar detection - thermal management improvements on mt7915 - SAR support for more mt76 platforms - MBSSID and 6 GHz band on mt7915 - RealTek WiFi: - rtw89: AP mode - rtw89: 160 MHz channels and 6 GHz band - rtw89: hardware scan - Bluetooth: - mt7921s: wake on Bluetooth, SCO over I2S, wide-band-speed (WBS) - Microchip CAN (mcp251xfd): - multiple RX-FIFOs and runtime configurable RX/TX rings - internal PLL, runtime PM handling simplification - improve chip detection and error handling after wakeup" * tag 'net-next-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2521 commits) llc: fix netdevice reference leaks in llc_ui_bind() drivers: ethernet: cpsw: fix panic when interrupt coaleceing is set via ethtool ice: don't allow to run ice_send_event_to_aux() in atomic ctx ice: fix 'scheduling while atomic' on aux critical err interrupt net/sched: fix incorrect vlan_push_eth dest field net: bridge: mst: Restrict info size queries to bridge ports net: marvell: prestera: add missing destroy_workqueue() in prestera_module_init() drivers: net: xgene: Fix regression in CRC stripping net: geneve: add missing netlink policy and size for IFLA_GENEVE_INNER_PROTO_INHERIT net: dsa: fix missing host-filtered multicast addresses net/mlx5e: Fix build warning, detected write beyond size of field iwlwifi: mvm: Don't fail if PPAG isn't supported selftests/bpf: Fix kprobe_multi test. Revert "rethook: x86: Add rethook x86 implementation" Revert "arm64: rethook: Add arm64 rethook implementation" Revert "powerpc: Add rethook support" Revert "ARM: rethook: Add rethook arm implementation" netdevice: add missing dm_private kdoc net: bridge: mst: prevent NULL deref in br_mst_info_size() selftests: forwarding: Use same VRF for port and VLAN upper ...
2022-03-24Merge tag 'vfio-v5.18-rc1' of https://github.com/awilliam/linux-vfioLinus Torvalds3-0/+37
Pull VFIO updates from Alex Williamson: - Introduce new device migration uAPI and implement device specific mlx5 vfio-pci variant driver supporting new protocol (Jason Gunthorpe, Yishai Hadas, Leon Romanovsky) - New HiSilicon acc vfio-pci variant driver, also supporting migration interface (Shameer Kolothum, Longfang Liu) - D3hot fixes for vfio-pci-core (Abhishek Sahu) - Document new vfio-pci variant driver acceptance criteria (Alex Williamson) - Fix UML build unresolved ioport_{un}map() functions (Alex Williamson) - Fix MAINTAINERS due to header movement (Lukas Bulwahn) * tag 'vfio-v5.18-rc1' of https://github.com/awilliam/linux-vfio: (31 commits) vfio-pci: Provide reviewers and acceptance criteria for variant drivers MAINTAINERS: adjust entry for header movement in hisilicon qm driver hisi_acc_vfio_pci: Use its own PCI reset_done error handler hisi_acc_vfio_pci: Add support for VFIO live migration crypto: hisilicon/qm: Set the VF QM state register hisi_acc_vfio_pci: Add helper to retrieve the struct pci_driver hisi_acc_vfio_pci: Restrict access to VF dev BAR2 migration region hisi_acc_vfio_pci: add new vfio_pci driver for HiSilicon ACC devices hisi_acc_qm: Move VF PCI device IDs to common header crypto: hisilicon/qm: Move few definitions to common header crypto: hisilicon/qm: Move the QM header to include/linux vfio/mlx5: Fix to not use 0 as NULL pointer PCI/IOV: Fix wrong kernel-doc identifier vfio/mlx5: Use its own PCI reset_done error handler vfio/pci: Expose vfio_pci_core_aer_err_detected() vfio/mlx5: Implement vfio_pci driver for mlx5 devices vfio/mlx5: Expose migration commands over mlx5 device vfio: Remove migration protocol v1 documentation vfio: Extend the device migration protocol with RUNNING_P2P vfio: Define device migration protocol v2 ...
2022-03-24Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds4-76/+273
Pull kvm updates from Paolo Bonzini: "ARM: - Proper emulation of the OSLock feature of the debug architecture - Scalibility improvements for the MMU lock when dirty logging is on - New VMID allocator, which will eventually help with SVA in VMs - Better support for PMUs in heterogenous systems - PSCI 1.1 support, enabling support for SYSTEM_RESET2 - Implement CONFIG_DEBUG_LIST at EL2 - Make CONFIG_ARM64_ERRATUM_2077057 default y - Reduce the overhead of VM exit when no interrupt is pending - Remove traces of 32bit ARM host support from the documentation - Updated vgic selftests - Various cleanups, doc updates and spelling fixes RISC-V: - Prevent KVM_COMPAT from being selected - Optimize __kvm_riscv_switch_to() implementation - RISC-V SBI v0.3 support s390: - memop selftest - fix SCK locking - adapter interruptions virtualization for secure guests - add Claudio Imbrenda as maintainer - first step to do proper storage key checking x86: - Continue switching kvm_x86_ops to static_call(); introduce static_call_cond() and __static_call_ret0 when applicable. - Cleanup unused arguments in several functions - Synthesize AMD 0x80000021 leaf - Fixes and optimization for Hyper-V sparse-bank hypercalls - Implement Hyper-V's enlightened MSR bitmap for nested SVM - Remove MMU auditing - Eager splitting of page tables (new aka "TDP" MMU only) when dirty page tracking is enabled - Cleanup the implementation of the guest PGD cache - Preparation for the implementation of Intel IPI virtualization - Fix some segment descriptor checks in the emulator - Allow AMD AVIC support on systems with physical APIC ID above 255 - Better API to disable virtualization quirks - Fixes and optimizations for the zapping of page tables: - Zap roots in two passes, avoiding RCU read-side critical sections that last too long for very large guests backed by 4 KiB SPTEs. - Zap invalid and defunct roots asynchronously via concurrency-managed work queue. - Allowing yielding when zapping TDP MMU roots in response to the root's last reference being put. - Batch more TLB flushes with an RCU trick. Whoever frees the paging structure now holds RCU as a proxy for all vCPUs running in the guest, i.e. to prolongs the grace period on their behalf. It then kicks the the vCPUs out of guest mode before doing rcu_read_unlock(). Generic: - Introduce __vcalloc and use it for very large allocations that need memcg accounting" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (246 commits) KVM: use kvcalloc for array allocations KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2 kvm: x86: Require const tsc for RT KVM: x86: synthesize CPUID leaf 0x80000021h if useful KVM: x86: add support for CPUID leaf 0x80000021 KVM: x86: do not use KVM_X86_OP_OPTIONAL_RET0 for get_mt_mask Revert "KVM: x86/mmu: Zap only TDP MMU leafs in kvm_zap_gfn_range()" kvm: x86/mmu: Flush TLB before zap_gfn_range releases RCU KVM: arm64: fix typos in comments KVM: arm64: Generalise VM features into a set of flags KVM: s390: selftests: Add error memop tests KVM: s390: selftests: Add more copy memop tests KVM: s390: selftests: Add named stages for memop test KVM: s390: selftests: Add macro as abstraction for MEM_OP KVM: s390: selftests: Split memop tests KVM: s390x: fix SCK locking RISC-V: KVM: Implement SBI HSM suspend call RISC-V: KVM: Add common kvm_riscv_vcpu_wfi() function RISC-V: Add SBI HSM suspend related defines RISC-V: KVM: Implement SBI v0.3 SRST extension ...
2022-03-24panic: move panic_print before kmsg dumpersGuilherme G. Piccoli1-0/+4
The panic_print setting allows users to collect more information in a panic event, like memory stats, tasks, CPUs backtraces, etc. This is an interesting debug mechanism, but currently the print event happens *after* kmsg_dump(), meaning that pstore, for example, cannot collect a dmesg with the panic_print extra information. This patch changes that in 2 steps: (a) The panic_print setting allows to replay the existing kernel log buffer to the console (bit 5), besides the extra information dump. This functionality makes sense only at the end of the panic() function. So, we hereby allow to distinguish the two situations by a new boolean parameter in the function panic_print_sys_info(). (b) With the above change, we can safely call panic_print_sys_info() before kmsg_dump(), allowing to dump the extra information when using pstore or other kmsg dumpers. The additional messages from panic_print could overwrite the oldest messages when the buffer is full. The only reasonable solution is to use a large enough log buffer, hence we added an advice into the kernel parameters documentation about that. Link: https://lkml.kernel.org/r/20220214141308.841525-1-gpiccoli@igalia.com Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Acked-by: Baoquan He <bhe@redhat.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Feng Tang <feng.tang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24panic: add option to dump all CPUs backtraces in panic_printGuilherme G. Piccoli2-0/+2
Currently the "panic_print" parameter/sysctl allows some interesting debug information to be printed during a panic event. This is useful for example in cases the user cannot kdump due to resource limits, or if the user collects panic logs in a serial output (or pstore) and prefers a fast reboot instead of a kdump. Happens that currently there's no way to see all CPUs backtraces in a panic using "panic_print" on architectures that support that. We do have "oops_all_cpu_backtrace" sysctl, but although partially overlapping in the functionality, they are orthogonal in nature: "panic_print" is a panic tuning (and we have panics without oopses, like direct calls to panic() or maybe other paths that don't go through oops_enter() function), and the original purpose of "oops_all_cpu_backtrace" is to provide more information on oopses for cases in which the users desire to continue running the kernel even after an oops, i.e., used in non-panic scenarios. So, we hereby introduce an additional bit for "panic_print" to allow dumping the CPUs backtraces during a panic event. Link: https://lkml.kernel.org/r/20211109202848.610874-3-gpiccoli@igalia.com Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Reviewed-by: Feng Tang <feng.tang@intel.com> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24docs: sysctl/kernel: add missing bit to panic_printGuilherme G. Piccoli1-0/+1
Patch series "Some improvements on panic_print". This is a mix of a documentation fix with some additions to the "panic_print" syscall / parameter. The goal here is being able to collect all CPUs backtraces during a panic event and also to enable "panic_print" in a kdump event - details of the reasoning and design choices in the patches. This patch (of 3): Commit de6da1e8bcf0 ("panic: add an option to replay all the printk message in buffer") added a new bit to the sysctl/kernel parameter "panic_print", but the documentation was added only in kernel-parameters.txt, not in the sysctl guide. Fix it here by adding bit 5 to sysctl admin-guide documentation. [rdunlap@infradead.org: fix table format warning] Link: https://lkml.kernel.org/r/20220109055635.6999-1-rdunlap@infradead.org Link: https://lkml.kernel.org/r/20211109202848.610874-1-gpiccoli@igalia.com Link: https://lkml.kernel.org/r/20211109202848.610874-2-gpiccoli@igalia.com Fixes: de6da1e8bcf0 ("panic: add an option to replay all the printk message in buffer") Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Reviewed-by: Feng Tang <feng.tang@intel.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Iurii Zaikin <yzaikin@google.com> Cc: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24docs: kdump: add scp example to write out the dump fileTiezhu Yang1-0/+4
Except cp and makedumpfile, add scp example to write out the dump file. Link: https://lkml.kernel.org/r/1644324666-15947-3-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Acked-by: Baoquan He <bhe@redhat.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Marco Elver <elver@google.com> Cc: Xuefeng Li <lixuefeng@loongson.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24docs: kdump: update description about sysfs file system supportTiezhu Yang1-3/+3
Patch series "Update doc and fix some issues about kdump", v2. This patch (of 5): After commit 6a108a14fa35 ("kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT"), "Configure standard kernel features (for small systems)" is not exist, we should use "Configure standard kernel features (expert users)" now. Link: https://lkml.kernel.org/r/1644324666-15947-1-git-send-email-yangtiezhu@loongson.cn Link: https://lkml.kernel.org/r/1644324666-15947-2-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Acked-by: Baoquan He <bhe@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Marco Elver <elver@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Xuefeng Li <lixuefeng@loongson.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24Documentation/sparse: add hints about __CHECKER__Bjorn Helgaas1-0/+2
Several attributes depend on __CHECKER__, but previously there was no clue in the tree about when __CHECKER__ might be defined. Add hints at the most common places (__kernel, __user, __iomem, __bitwise) and in the sparse documentation. Link: https://lkml.kernel.org/r/20220310220927.245704-3-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: "Michael S . Tsirkin" <mst@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24Merge tag 'arm-dt-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds37-413/+1183
Pull ARM devicetree updates from Arnd Bergmann: "After a somewhat quiet 5.17 release, the size of the DT changes is a bit larger again. There are nine new SoC that get added, all of them related to existing platforms: - Airoha (formerly Mediatek/EcoNet) EN7523 networking SoC and EVB - Mediatek mt6582 tablet platform with the Prestigio PMT5008 3G tablet - Microchip Lan966 networking SoC and it evaluation board - Qualcomm Snapdragon 625/632 midrange phone SoCs, with the LG Nexus 5X and Fairphone FP3 phones - Renesas RZ/G2LC and RZ/V2L general-purpose embedded SoCs, along with their evaluation boards - Samsung Exynos 850 phone SoC and reference board - Samsung Exynos7885 with the Samsung Galaxy A8 (2018) phone - Tesla FSD (Fully Self-Driving), an automotive SoC loosely derived from the Samsung Exynos family. - TI K3/AM62 SoC and reference board Support for additional functionality in existing dts files is added all over the place: Samsung, Renesas, Mstar, wpcm450, OMAP, AT91, Allwinner, i.MX, Tegra, Aspeed, Oxnas, Qualcomm, Mediatek, and Broadcom. Samsung has a rework for its pinctrl schema that is a bit tricky and requires driver changes to be included here. A few more platforms only have smaller cleanups and DT Schema fixes, this includes SoCFPGA, ux500, ixp4xx, STi, Xilinx Zynq, LG, and Juno. The new machines are really too many to list, but I'll do it anyway: Allwinner: - A20-Marsboard development board Amlogic: - Amediatek X96-AIR (Amlogic S905X3) - CYX A95XF3-AIR (Amlogic S905X3) - Haochuangy H96-Max (Amlogic S905X3) - Amlogic AQ222 (Amlogic S4) - OSMC Vero 4K+ (Amlogic S905D) Arm Juno: - Separate DT depending on SCMI firmware version Aspeed: - Quanta S6Q BMC (AST2600) - ASRock ROMED8HM3 (AST2500) Broadcom: - Raspberry Pi Zero 2 W Marvell MVEBU/Armada: - Ctera C200 V1 NAS (kirkwood) - Ctera C200 V2 NAS (armada-370) Mstar: - DongShanPiOne, a low-end embedded board - Miyoo Mini handheld game console NXP i.MX: - Numerous i.MX8M Mini based boards in even more variations, but none based on other SoCs this time: Protonic PRT8MM, emCON-MX8M Mini, Toradex Verdin, and Gateworks GW7903 Qualcomm: - Google Herobrine R1 Chromebook platform (Snapdragon 7c Gen 3) - SHIFT6mq phone (Snapdragon 845) - Samsung Galaxy Book2 (Snapdragon 850) - Snapdragon 8 Gen 1 Hardware Development Kit TI OMAP: - SanCloud BeagleBone Enhanced WiFi Rockchip: - Pine64 PineNote ereader tablet (rk356x) - Bananapi-R2-Pro (rk356x) STM32: - emtrion emSBS-Argon embedded board (stm32mp157c)" * tag 'arm-dt-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (627 commits) arm64: dts: n5x: drop invalid property and fix edac node name arm64: dts: fsd: Add the MCT support arm64: dts: stingray: Fix spi clock name arm64: dts: ns2: Fix spi clock name ARM: dts: rockchip: Update regulator name for PX3 ARM: dts: rockchip: Add #clock-cells value for rk805 arm64: dts: rockchip: Add #clock-cells value for rk805 arm64: dts: rockchip: Remove vcc13 and vcc14 for rk808 arm64: dts: rockchip: Fix SDIO regulator supply properties on rk3399-firefly ARM: dts: at91: sama7g5: Add NAND support ARM: dts: at91: sama7g5: add eic node ARM: dts: at91: sama7g5: Remove unused properties in i2c nodes ARM: dts: at91: sam9x60ek: modify vdd_1v5 regulator to vdd_1v15 arm64: dts: lg: align pl330 node name with dtschema arm64: dts: lg: add dma-cells to pl330 node arm64: dts: juno: align pl330 node name with dtschema arm64: dts: broadcom: Fix sata nodename arm64: dts: n5x: add sdr edac support arm64: dts: agilex/stratix10: add clock-names to USB DWC2 node dt-bindings: usb: dwc2: add disable-over-current ...
2022-03-24Merge tag 'arm-drivers-5.18' of ↵Linus Torvalds22-335/+1012
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM driver updates from Arnd Bergmann: "There are a few separately maintained driver subsystems that we merge through the SoC tree, notable changes are: - Memory controller updates, mainly for Tegra and Mediatek SoCs, and clarifications for the memory controller DT bindings - SCMI firmware interface updates, in particular a new transport based on OPTEE and support for atomic operations. - Cleanups to the TEE subsystem, refactoring its memory management For SoC specific drivers without a separate subsystem, changes include - Smaller updates and fixes for TI, AT91/SAMA5, Qualcomm and NXP Layerscape SoCs. - Driver support for Microchip SAMA5D29, Tesla FSD, Renesas RZ/G2L, and Qualcomm SM8450. - Better power management on Mediatek MT81xx, NXP i.MX8MQ and older NVIDIA Tegra chips" * tag 'arm-drivers-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (154 commits) ARM: spear: fix typos in comments soc/microchip: fix invalid free in mpfs_sys_controller_delete soc: s4: Add support for power domains controller dt-bindings: power: add Amlogic s4 power domains bindings ARM: at91: add support in soc driver for new SAMA5D29 soc: mediatek: mmsys: add sw0_rst_offset in mmsys driver data dt-bindings: memory: renesas,rpc-if: Document RZ/V2L SoC memory: emif: check the pointer temp in get_device_details() memory: emif: Add check for setup_interrupts dt-bindings: arm: mediatek: mmsys: add support for MT8186 dt-bindings: mediatek: add compatible for MT8186 pwrap soc: mediatek: pwrap: add pwrap driver for MT8186 SoC soc: mediatek: mmsys: add mmsys reset control for MT8186 soc: mediatek: mtk-infracfg: Disable ACP on MT8192 soc: ti: k3-socinfo: Add AM62x JTAG ID soc: mediatek: add MTK mutex support for MT8186 soc: mediatek: mmsys: add mt8186 mmsys routing table soc: mediatek: pm-domains: Add support for mt8186 dt-bindings: power: Add MT8186 power domains soc: mediatek: pm-domains: Add support for mt8195 ...
2022-03-24Merge tag 'arm-soc-5.18' of ↵Linus Torvalds1-10/+10
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC updates from Arnd Bergmann: "SoC specific code is generally used for older platforms that don't (yet) use device tree to do the same things. - Support is added for i.MXRT10xx, a Cortex-M7 based microcontroller from NXP. At the moment this is still incomplete as other portions are merged through different trees. - Long abandoned support for running NOMMU ARMv4 or ARMv5 platforms gets removed, now the Arm NOMMU platforms are limited to the Cortex-M family of microcontrollers - Two old PXA boards get removed, along with corresponding driver bits. - Continued cleanup of the Intel IXP4xx platforms, removing some remnants of the old board files. - Minor Cleanups and fixes for Orion, PXA, MMP, Mstar, Samsung - CPU idle support for AT91 - A system controller driver for Polarfire" * tag 'arm-soc-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (29 commits) ARM: remove support for NOMMU ARMv4/v5 ARM: PXA: fix up decompressor code soc: microchip: make mpfs_sys_controller_put static ARM: pxa: remove Intel Imote2 and Stargate 2 boards ARM: mmp: Fix failure to remove sram device ARM: mstar: Select ARM_ERRATA_814220 soc: add microchip polarfire soc system controller ARM: at91: Kconfig: select PM_OPP ARM: at91: PM: add cpu idle support for sama7g5 ARM: at91: ddr: fix typo to align with datasheet naming ARM: at91: ddr: align macro definitions ARM: at91: ddr: remove CONFIG_SOC_SAMA7 dependency ARM: ixp4xx: Convert to SPARSE_IRQ and P2V ARM: ixp4xx: Drop all common code ARM: ixp4xx: Drop custom DMA coherency and bouncing ARM: ixp4xx: Remove feature bit accessors net: ixp4xx_hss: Check features using syscon net: ixp4xx_eth: Drop platform data support soc: ixp4xx-npe: Access syscon regs using regmap soc: ixp4xx: Add features from regmap helper ...
2022-03-24Merge tag 'asm-generic-5.18' of ↵Linus Torvalds48-217/+0
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "There are three sets of updates for 5.18 in the asm-generic tree: - The set_fs()/get_fs() infrastructure gets removed for good. This was already gone from all major architectures, but now we can finally remove it everywhere, which loses some particularly tricky and error-prone code. There is a small merge conflict against a parisc cleanup, the solution is to use their new version. - The nds32 architecture ends its tenure in the Linux kernel. The hardware is still used and the code is in reasonable shape, but the mainline port is not actively maintained any more, as all remaining users are thought to run vendor kernels that would never be updated to a future release. - A series from Masahiro Yamada cleans up some of the uapi header files to pass the compile-time checks" * tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (27 commits) nds32: Remove the architecture uaccess: remove CONFIG_SET_FS ia64: remove CONFIG_SET_FS support sh: remove CONFIG_SET_FS support sparc64: remove CONFIG_SET_FS support lib/test_lockup: fix kernel pointer check for separate address spaces uaccess: generalize access_ok() uaccess: fix type mismatch warnings from access_ok() arm64: simplify access_ok() m68k: fix access_ok for coldfire MIPS: use simpler access_ok() MIPS: Handle address errors for accesses above CPU max virtual user address uaccess: add generic __{get,put}_kernel_nofault nios2: drop access_ok() check from __put_user() x86: use more conventional access_ok() definition x86: remove __range_not_ok() sparc64: add __{get,put}_kernel_nofault() nds32: fix access_ok() checks in get/put_user uaccess: fix nios2 and microblaze get_user_8() sparc64: fix building assembly files ...
2022-03-24Merge tag 'sound-5.18-rc1' of ↵Linus Torvalds45-246/+1040
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound updates from Takashi Iwai: "It's been a fairly calm development cycle. There are a few last-minute ALSA core fixes, most notably for covering PCM ioctl races, but the most of rest are device-specific changes. Below are some highlights: ALSA core: - Fixes for PCM ioctl races that may lead to UAF - Fix for oversized allocations in PCM OSS layer ASoC: - Start of moving SoF to support multiple IPC mechanisms - Use of NHLT ACPI table to reduce the amount of quirking required for Intel systems - Preliminary works forthcoming Intel AVS driver for legacy Intel DSP firmwares - Support for AMD PDM, Atmel PDMC, Awinic AW8738, i.MX cards with TLV320AIC31xx, Intel machines with CS35L41 and ESSX8336, Mediatek MT8181 wideband bluetooth, nVidia Tegra234, Qualcomm SC7280, Renesas RZ/V2L, Texas Instruments TAS585M HD-audio: - Driver re-binding fix for HD-audio - Updates for Intel ADL and Tegra234, various platform quirks for Dell, HP, Lenovo, ASUS, Samsung and Clevo machines USB-audio: - Quirk updates for Scarlett2, RODE, Corsair devices" * tag 'sound-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (486 commits) ALSA: hda/realtek: Add alc256-samsung-headphone fixup ALSA: pci: fix reading of swapped values from pcmreg in AC97 codec ALSA: pcm: Add stream lock during PCM reset ioctl operations ALSA: pcm: Fix races among concurrent prealloc proc writes ALSA: pcm: Fix races among concurrent prepare and hw_params/hw_free calls ALSA: pcm: Fix races among concurrent read/write and buffer changes ALSA: pcm: Fix races among concurrent hw_params and hw_free calls ASoC: atmel: mchp-pdmc: print the correct property name MAINTAINERS: Add Shengjiu to maintainer list of sound/soc/fsl ASoC: SOF: Add a new dai_get_clk topology IPC op ASoC: SOF: topology: Add ops for setting up and tearing down pipelines ASoC: SOF: expose sof_route_setup() ASoC: SOF: Add dai_link_fixup PCM op for IPC3 ASoC: SOF: Add trigger PCM op for IPC3 ASoC: SOF: Define hw_params PCM op for IPC3 ASoC: SOF: Introduce IPC3 PCM hw_free op ASoC: SOF: pcm: expose the sof_pcm_setup_connected_widgets() function ASoC: SOF: Introduce IPC-specific PCM ops ASoC: SOF: Add bytes_ext control IPC ops for IPC3 ASoC: SOF: Add bytes_get/put control IPC ops for IPC3 ...
2022-03-24Merge tag 'media/v5.18-1' of ↵Linus Torvalds44-516/+2052
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media updates from Mauro Carvalho Chehab: - a major reorg at platform Kconfig/Makefile files, organizing them per vendor. The other media Kconfig/Makefile files also sorted - New sensor drivers: hi847, isl7998x, ov08d10 - New Amphion vpu decoder stateful driver - New Atmel microchip csi2dc driver - tegra-vde driver promoted from staging - atomisp: some fixes for it to work on BYT - imx7-mipi-csis driver promoted from staging and renamed - camss driver got initial support for VFE hardware version Titan 480 - mtk-vcodec has gained support for MT8192 - lots of driver changes, fixes and improvements * tag 'media/v5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (417 commits) media: nxp: Restrict VIDEO_IMX_MIPI_CSIS to ARCH_MXC or COMPILE_TEST media: amphion: cleanup media device if register it fail media: amphion: fix some issues to improve robust media: amphion: fix some error related with undefined reference to __divdi3 media: amphion: fix an issue that using pm_runtime_get_sync incorrectly media: vidtv: use vfree() for memory allocated with vzalloc() media: m5mols/m5mols.h: document new reset field media: pixfmt-yuv-planar.rst: fix PIX_FMT labels media: platform: Remove unnecessary print function dev_err() media: amphion: Add missing of_node_put() in vpu_core_parse_dt() media: mtk-vcodec: Add missing of_node_put() in mtk_vdec_hw_prob_done() media: platform: amphion: Fix build error without MAILBOX media: spi: Kconfig: Place SPI drivers on a single menu media: i2c: Kconfig: move camera drivers to the top media: atomisp: fix bad usage at error handling logic media: platform: rename mediatek/mtk-jpeg/ to mediatek/jpeg/ media: media/*/Kconfig: sort entries media: Kconfig: cleanup VIDEO_DEV dependencies media: platform/*/Kconfig: make manufacturer menus more uniform media: platform: Create vendor/{Makefile,Kconfig} files ...
2022-03-24Merge tag 'for-5.18/fbdev-1' of ↵Linus Torvalds1-5/+7
git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev Pull fbdev updates from Helge Deller: "Lots of small fixes and code cleanups across most of the fbdev drivers. This includes conversions to use helper functions, const conversions, spelling fixes, help text updates, adding return value checks, small build fixes, and much more" * tag 'for-5.18/fbdev-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: (59 commits) video: fbdev: kyro: make read-only array ODValues static const video: fbdev: offb: fix warning comparing pointer to 0 video: fbdev: omapfb: Add missing of_node_put() in dvic_probe_of video: fbdev: sm712fb: Fix crash in smtcfb_write() video: fbdev: s3c-fb: fix platform_get_irq.cocci warning video: fbdev: sm712fb: Fix crash in smtcfb_read() video: fbdev: via: check the return value of kstrdup() video: fbdev: au1100fb: Spelling s/palette/palette/ video: fbdev: atari: Atari 2 bpp (STe) palette bugfix video: fbdev: atari: Remove unused atafb_setcolreg() video: fbdev: atari: Convert to standard round_up() helper video: fbdev: atari: Fix TT High video mode video: fbdev: udlfb: replace snprintf in show functions with sysfs_emit video: fbdev: omapfb: panel-tpo-td043mtea1: Use sysfs_emit() instead of snprintf() video: fbdev: omapfb: panel-dsi-cm: Use sysfs_emit() instead of snprintf() video: fbdev: omapfb: Use sysfs_emit() instead of snprintf() video: fbdev: s3c-fb: Use platform_get_irq() to get the interrupt video: fbdev: Fix wrong file path for pvr2fb.c in Kconfig help text video: fbdev: pxa3xx-gcu: Remove unnecessary print function dev_err() video: fbdev: pxa168fb: Remove unnecessary print function dev_err() ...
2022-03-24Merge tag 'mmc-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmcLinus Torvalds7-1/+97
Pull MMC updates from Ulf Hansson: "MMC core: - Convert to sysfs_emit() in favor of sprintf() - Improve fallback to speed modes if eMMC HS200 fails MMC host: - dw_mmc: Allow variants to set minimal supported clock rate - dw-mmc-rockchip: Fix problems with invalid clock rates - litex_mmc: Add new DT based driver for the LiteX's LiteSDCard interface - litex_mmc: Add Gabriel Somlo and Joel Stanley as co-maintainers for LiteX - mtk-sd: Add support for the Mediatek MT8186 variant - renesas_sdhi: Add support for RZ/G2UL variant - renesas_sdhi: Add support for RZ/V2L variant - rtsx_pci: Adjust power-on sequence to conform to the SD spec - sdhci-am654: Add support for TI's AM62 variant - sdhci_am654: Fixup support for TI's AM64 variant - sdhci-esdhc-imx: Add support for the imx93 variant - sdhci-msm: Add support for the msm8953 variant - sdhci-pci-gli: Add support for runtime PM for the GL9763E variant - sdhci-pci-gli: Adjustments of the SSC function for the GL975x variants - sdhci-tegra: Add support for wake on SD card event - sunxi-mmc: Add support for Allwinner's F1c100s variant - sunxi-mmc: Add support for D1 MMC variant" * tag 'mmc-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (37 commits) dt-bindings: mmc: renesas,sdhi: Document RZ/G2UL SoC mmc: tmio: remove outdated members from host struct mmc: mtk-sd: Silence delay phase calculation debug log mmc: davinci_mmc: Handle error for clk_enable mmc: sdhci-pci-gli: Add runtime PM for GL9763E mmc: core: Drop HS400 caps unless 8-bit bus is supported too mmc: host: Return an error when ->enable_sdio_irq() ops is missing mmc: core: Improve fallback to speed modes if eMMC HS200 fails dt-bindings: mmc: sunxi: add Allwinner F1c100s compatible mmc: dw-mmc-rockchip: Fix handling invalid clock rates mmc: dw_mmc: Support setting f_min from host drivers mmc: host: Drop commas after SoC match table sentinels mmc: rtsx: add 74 Clocks in power on flow dt-bindings: mmc: renesas,sdhi: Document RZ/V2L SoC mmc: sh_mmcif: Simplify division/shift logic mmc: sdhci_am654: Add Support for TI's AM62 SoC dt-bindings: mmc: imx-esdhc: Add imx93 compatible string dt-bindings: mmc: sdhci-am654: Add compatible string for AM62 SoC mmc: sdhci_am654: Fix the driver data of AM64 SoC mmc: core: use sysfs_emit() instead of sprintf() ...
2022-03-23Merge tag 'trace-v5.18' of ↵Linus Torvalds3-0/+225
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: - New user_events interface. User space can register an event with the kernel describing the format of the event. Then it will receive a byte in a page mapping that it can check against. A privileged task can then enable that event like any other event, which will change the mapped byte to true, telling the user space application to start writing the event to the tracing buffer. - Add new "ftrace_boot_snapshot" kernel command line parameter. When set, the tracing buffer will be saved in the snapshot buffer at boot up when the kernel hands things over to user space. This will keep the traces that happened at boot up available even if user space boot up has tracing as well. - Have TRACE_EVENT_ENUM() also update trace event field type descriptions. Thus if a static array defines its size with an enum, the user space trace event parsers can still know how to parse that array. - Add new TRACE_CUSTOM_EVENT() macro. This acts the same as the TRACE_EVENT() macro, but will attach to an existing tracepoint. This will make one tracepoint be able to trace different content and not be stuck at only what the original TRACE_EVENT() macro exports. - Fixes to tracing error logging. - Better saving of cmdlines to PIDs when tracing (use the wakeup events for mapping). * tag 'trace-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (30 commits) tracing: Have type enum modifications copy the strings user_events: Add trace event call as root for low permission cases tracing/user_events: Use alloc_pages instead of kzalloc() for register pages tracing: Add snapshot at end of kernel boot up tracing: Have TRACE_DEFINE_ENUM affect trace event types as well tracing: Fix strncpy warning in trace_events_synth.c user_events: Prevent dyn_event delete racing with ioctl add/delete tracing: Add TRACE_CUSTOM_EVENT() macro tracing: Move the defines to create TRACE_EVENTS into their own files tracing: Add sample code for custom trace events tracing: Allow custom events to be added to the tracefs directory tracing: Fix last_cmd_set() string management in histogram code user_events: Fix potential uninitialized pointer while parsing field tracing: Fix allocation of last_cmd in last_cmd_set() user_events: Add documentation file user_events: Add sample code for typical usage user_events: Add self-test for validator boundaries user_events: Add self-test for perf_event integration user_events: Add self-test for dynamic_events integration user_events: Add self-test for ftrace integration ...
2022-03-23Merge tag 'trace-rtla-v5.18' of ↵Linus Torvalds3-0/+41
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull RTLA tracing tool updates from Steven Rostedt: "Real Time Analysis Tool updatesfor 5.18: - Support for adjusting tracing_threashold - Add -a (auto) option to make it easier for users to debug in the field - Add -e option to add more events to the trace - Add --trigger option to add triggers to events - Add --filter option to filter events - Add support to save histograms to the file - Add --dma-latency to set /dev/cpu_dma_latency - Other fixes and cleanups" * tag 'trace-rtla-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: rtla: Tools main loop cleanup rtla/timerlat: Add --dma-latency option rtla/osnoise: Fix osnoise hist stop tracing message rtla: Check for trace off also in the trace instance rtla/trace: Save event histogram output to a file rtla: Add --filter support rtla/trace: Add trace event filter helpers rtla: Add --trigger support rtla/trace: Add trace event trigger helpers rtla: Add -e/--event support rtla/trace: Add trace events helpers rtla/timerlat: Add the automatic trace option rtla/osnoise: Add the automatic trace option rtla/osnoise: Add an option to set the threshold rtla/osnoise: Add support to adjust the tracing_thresh
2022-03-23Merge tag 'printk-for-5.18' of ↵Linus Torvalds1-2/+7
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Make %pK behave the same as %p for kptr_restrict == 0 also with no_hash_pointers parameter - Ignore the default console in the device tree also when console=null or console="" is used on the command line - Document console=null and console="" behavior - Prevent a deadlock and a livelock caused by console_lock in panic() - Make console_lock available for panicking CPU - Fast query for the next to-be-used sequence number - Use the expected return values in printk.devkmsg __setup handler - Use the correct atomic operations in wake_up_klogd() irq_work handler - Avoid possible unaligned access when handling %4cc printing format * tag 'printk-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk: fix return value of printk.devkmsg __setup handler vsprintf: Fix %pK with kptr_restrict == 0 printk: make suppress_panic_printk static printk: Set console_set_on_cmdline=1 when __add_preferred_console() is called with user_specified == true Docs: printk: add 'console=null|""' to admin/kernel-parameters printk: use atomic updates for klogd work printk: Drop console_sem during panic printk: Avoid livelock with heavy printk during panic printk: disable optimistic spin during panic printk: Add panic_in_progress helper vsprintf: Move space out of string literals in fourcc_string() vsprintf: Fix potential unaligned access printk: ringbuffer: Improve prb_next_seq() performance
2022-03-23Merge tag 'folio-5.18b' of git://git.infradead.org/users/willy/pagecacheLinus Torvalds3-47/+48
Pull filesystem folio updates from Matthew Wilcox: "Primarily this series converts some of the address_space operations to take a folio instead of a page. Notably: - a_ops->is_partially_uptodate() takes a folio instead of a page and changes the type of the 'from' and 'count' arguments to make it obvious they're bytes. - a_ops->invalidatepage() becomes ->invalidate_folio() and has a similar type change. - a_ops->launder_page() becomes ->launder_folio() - a_ops->set_page_dirty() becomes ->dirty_folio() and adds the address_space as an argument. There are a couple of other misc changes up front that weren't worth separating into their own pull request" * tag 'folio-5.18b' of git://git.infradead.org/users/willy/pagecache: (53 commits) fs: Remove aops ->set_page_dirty fb_defio: Use noop_dirty_folio() fs: Convert __set_page_dirty_no_writeback to noop_dirty_folio fs: Convert __set_page_dirty_buffers to block_dirty_folio nilfs: Convert nilfs_set_page_dirty() to nilfs_dirty_folio() mm: Convert swap_set_page_dirty() to swap_dirty_folio() ubifs: Convert ubifs_set_page_dirty to ubifs_dirty_folio f2fs: Convert f2fs_set_node_page_dirty to f2fs_dirty_node_folio f2fs: Convert f2fs_set_data_page_dirty to f2fs_dirty_data_folio f2fs: Convert f2fs_set_meta_page_dirty to f2fs_dirty_meta_folio afs: Convert afs_dir_set_page_dirty() to afs_dir_dirty_folio() btrfs: Convert extent_range_redirty_for_io() to use folios fs: Convert trivial uses of __set_page_dirty_nobuffers to filemap_dirty_folio btrfs: Convert from set_page_dirty to dirty_folio fscache: Convert fscache_set_page_dirty() to fscache_dirty_folio() fs: Add aops->dirty_folio fs: Remove aops->launder_page orangefs: Convert launder_page to launder_folio nfs: Convert from launder_page to launder_folio fuse: Convert from launder_page to launder_folio ...
2022-03-23Merge tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecacheLinus Torvalds1-9/+9
Pull folio updates from Matthew Wilcox: - Rewrite how munlock works to massively reduce the contention on i_mmap_rwsem (Hugh Dickins): https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/ - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph Hellwig): https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/ - Convert GUP to use folios and make pincount available for order-1 pages. (Matthew Wilcox) - Convert a few more truncation functions to use folios (Matthew Wilcox) - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew Wilcox) - Convert rmap_walk to use folios (Matthew Wilcox) - Convert most of shrink_page_list() to use a folio (Matthew Wilcox) - Add support for creating large folios in readahead (Matthew Wilcox) * tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache: (114 commits) mm/damon: minor cleanup for damon_pa_young selftests/vm/transhuge-stress: Support file-backed PMD folios mm/filemap: Support VM_HUGEPAGE for file mappings mm/readahead: Switch to page_cache_ra_order mm/readahead: Align file mappings for non-DAX mm/readahead: Add large folio readahead mm: Support arbitrary THP sizes mm: Make large folios depend on THP mm: Fix READ_ONLY_THP warning mm/filemap: Allow large folios to be added to the page cache mm: Turn can_split_huge_page() into can_split_folio() mm/vmscan: Convert pageout() to take a folio mm/vmscan: Turn page_check_references() into folio_check_references() mm/vmscan: Account large folios correctly mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios mm/vmscan: Free non-shmem folios without splitting them mm/rmap: Constify the rmap_walk_control argument mm/rmap: Convert rmap_walk() to take a folio mm: Turn page_anon_vma() into folio_anon_vma() mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read() ...
2022-03-23Merge branch 'akpm' (patches from Andrew)Linus Torvalds13-61/+752
Merge updates from Andrew Morton: - A few misc subsystems: kthread, scripts, ntfs, ocfs2, block, and vfs - Most the MM patches which precede the patches in Willy's tree: kasan, pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap, sparsemem, vmalloc, pagealloc, memory-failure, mlock, hugetlb, userfaultfd, vmscan, compaction, mempolicy, oom-kill, migration, thp, cma, autonuma, psi, ksm, page-poison, madvise, memory-hotplug, rmap, zswap, uaccess, ioremap, highmem, cleanups, kfence, hmm, and damon. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (227 commits) mm/damon/sysfs: remove repeat container_of() in damon_sysfs_kdamond_release() Docs/ABI/testing: add DAMON sysfs interface ABI document Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface selftests/damon: add a test for DAMON sysfs interface mm/damon/sysfs: support DAMOS stats mm/damon/sysfs: support DAMOS watermarks mm/damon/sysfs: support schemes prioritization mm/damon/sysfs: support DAMOS quotas mm/damon/sysfs: support DAMON-based Operation Schemes mm/damon/sysfs: support the physical address space monitoring mm/damon/sysfs: link DAMON for virtual address spaces monitoring mm/damon: implement a minimal stub for sysfs-based DAMON interface mm/damon/core: add number of each enum type values mm/damon/core: allow non-exclusive DAMON start/stop Docs/damon: update outdated term 'regions update interval' Docs/vm/damon/design: update DAMON-Idle Page Tracking interference handling Docs/vm/damon: call low level monitoring primitives the operations mm/damon: remove unnecessary CONFIG_DAMON option mm/damon/paddr,vaddr: remove damon_{p,v}a_{target_valid,set_operations}() mm/damon/dbgfs-test: fix is_target_id() change ...
2022-03-23Docs/ABI/testing: add DAMON sysfs interface ABI documentSeongJae Park1-0/+274
This commit adds DAMON sysfs interface ABI document under Documentation/ABI/testing. Link: https://lkml.kernel.org/r/20220228081314.5770-14-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Xin Hao <xhao@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23Docs/admin-guide/mm/damon/usage: document DAMON sysfs interfaceSeongJae Park1-6/+344
This commit adds detailed usage of DAMON sysfs interface in the admin-guide document for DAMON. Link: https://lkml.kernel.org/r/20220228081314.5770-13-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Xin Hao <xhao@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23Docs/damon: update outdated term 'regions update interval'SeongJae Park2-8/+10
Before DAMON is merged in the mainline, the concept of 'regions update interval' has generalized to be used as the time interval for update of any monitoring operations related data structure, but the document has not updated properly. This commit updates the document for better consistency. Link: https://lkml.kernel.org/r/20220222170100.17068-4-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23Docs/vm/damon/design: update DAMON-Idle Page Tracking interference handlingSeongJae Park1-3/+4
In DAMON's early development stage before it be merged in the mainline, it was first designed to work exclusively with Idle page tracking to avoid any interference between each other. Later, but still before be merged in the mainline, because Idle page tracking is fully under the control of sysadmins, we made the resolving of conflict as the responsibility of sysadmins. The document is not updated for the change, though. This commit updates the document for that. Link: https://lkml.kernel.org/r/20220222170100.17068-3-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23Docs/vm/damon: call low level monitoring primitives the operationsSeongJae Park2-13/+13
Patch series "Docs/damon: Update documents for better consistency". Some of DAMON document are not properly updated for latest version. This patchset updates such parts. This patch (of 3): DAMON code calls the low level monitoring primitives implementations the monitoring operations. The documentation would have no problem at still calling those primitives implementation because there is no real difference in the concepts, but making it more consistent with the code would make it better. This commit therefore convert sentences in the doc specifically pointing the implementations of the primitives to call it monitoring operations. Link: https://lkml.kernel.org/r/20220222170100.17068-1-sj@kernel.org Link: https://lkml.kernel.org/r/20220222170100.17068-2-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23Docs/admin-guide/mm/damon/usage: update for changed initail_regions file inputSeongJae Park1-10/+14
A previous commit made init_regions debugfs file to use target index instead of target id for specifying the target of the init regions. This commit updates the usage document to reflect the change. Link: https://lkml.kernel.org/r/20211230100723.2238-3-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23kfence: allow use of a deferrable timerMarco Elver1-0/+12
Allow the use of a deferrable timer, which does not force CPU wake-ups when the system is idle. A consequence is that the sample interval becomes very unpredictable, to the point that it is not guaranteed that the KFENCE KUnit test still passes. Nevertheless, on power-constrained systems this may be preferable, so let's give the user the option should they accept the above trade-off. Link: https://lkml.kernel.org/r/20220308141415.3168078-1-elver@google.com Signed-off-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23mm/zswap.c: allow handling just same-value filled pagesMaciej S. Szmigiero1-3/+19
Zswap has an ability to efficiently store same-value filled pages, which can be turned on and off using the "same_filled_pages_enabled" parameter. However, there is currently no way to enable just this (lightweight) functionality, while not making use of the whole compressed page storage machinery. Add a "non_same_filled_pages_enabled" parameter which allows disabling handling of pages that aren't same-value filled. This way zswap can be run in such lightweight same-value filled pages only mode. Link: https://lkml.kernel.org/r/7dbafa963e8bab43608189abbe2067f4b9287831.1641247624.git.maciej.szmigiero@oracle.com Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Cc: Seth Jennings <sjenning@redhat.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Vitaly Wool <vitaly.wool@konsulko.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23NUMA balancing: optimize page placement for memory tiering systemHuang Ying1-9/+20
With the advent of various new memory types, some machines will have multiple types of memory, e.g. DRAM and PMEM (persistent memory). The memory subsystem of these machines can be called memory tiering system, because the performance of the different types of memory are usually different. In such system, because of the memory accessing pattern changing etc, some pages in the slow memory may become hot globally. So in this patch, the NUMA balancing mechanism is enhanced to optimize the page placement among the different memory types according to hot/cold dynamically. In a typical memory tiering system, there are CPUs, fast memory and slow memory in each physical NUMA node. The CPUs and the fast memory will be put in one logical node (called fast memory node), while the slow memory will be put in another (faked) logical node (called slow memory node). That is, the fast memory is regarded as local while the slow memory is regarded as remote. So it's possible for the recently accessed pages in the slow memory node to be promoted to the fast memory node via the existing NUMA balancing mechanism. The original NUMA balancing mechanism will stop to migrate pages if the free memory of the target node becomes below the high watermark. This is a reasonable policy if there's only one memory type. But this makes the original NUMA balancing mechanism almost do not work to optimize page placement among different memory types. Details are as follows. It's the common cases that the working-set size of the workload is larger than the size of the fast memory nodes. Otherwise, it's unnecessary to use the slow memory at all. So, there are almost always no enough free pages in the fast memory nodes, so that the globally hot pages in the slow memory node cannot be promoted to the fast memory node. To solve the issue, we have 2 choices as follows, a. Ignore the free pages watermark checking when promoting hot pages from the slow memory node to the fast memory node. This will create some memory pressure in the fast memory node, thus trigger the memory reclaiming. So that, the cold pages in the fast memory node will be demoted to the slow memory node. b. Define a new watermark called wmark_promo which is higher than wmark_high, and have kswapd reclaiming pages until free pages reach such watermark. The scenario is as follows: when we want to promote hot-pages from a slow memory to a fast memory, but fast memory's free pages would go lower than high watermark with such promotion, we wake up kswapd with wmark_promo watermark in order to demote cold pages and free us up some space. So, next time we want to promote hot-pages we might have a chance of doing so. The choice "a" may create high memory pressure in the fast memory node. If the memory pressure of the workload is high, the memory pressure may become so high that the memory allocation latency of the workload is influenced, e.g. the direct reclaiming may be triggered. The choice "b" works much better at this aspect. If the memory pressure of the workload is high, the hot pages promotion will stop earlier because its allocation watermark is higher than that of the normal memory allocation. So in this patch, choice "b" is implemented. A new zone watermark (WMARK_PROMO) is added. Which is larger than the high watermark and can be controlled via watermark_scale_factor. In addition to the original page placement optimization among sockets, the NUMA balancing mechanism is extended to be used to optimize page placement according to hot/cold among different memory types. So the sysctl user space interface (numa_balancing) is extended in a backward compatible way as follow, so that the users can enable/disable these functionality individually. The sysctl is converted from a Boolean value to a bits field. The definition of the flags is, - 0: NUMA_BALANCING_DISABLED - 1: NUMA_BALANCING_NORMAL - 2: NUMA_BALANCING_MEMORY_TIERING We have tested the patch with the pmbench memory accessing benchmark with the 80:20 read/write ratio and the Gauss access address distribution on a 2 socket Intel server with Optane DC Persistent Memory Model. The test results shows that the pmbench score can improve up to 95.9%. Thanks Andrew Morton to help fix the document format error. Link: https://lkml.kernel.org/r/20220221084529.1052339-3-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Yang Shi <shy828301@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Rik van Riel <riel@surriel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Wei Xu <weixugc@google.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: zhongjiang-ali <zhongjiang-ali@linux.alibaba.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Feng Tang <feng.tang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23mm: hugetlb: free the 2nd vmemmap page associated with each HugeTLB pageMuchun Song1-1/+1
Patch series "Free the 2nd vmemmap page associated with each HugeTLB page", v7. This series can minimize the overhead of struct page for 2MB HugeTLB pages significantly. It further reduces the overhead of struct page by 12.5% for a 2MB HugeTLB compared to the previous approach, which means 2GB per 1TB HugeTLB. It is a nice gain. Comments and reviews are welcome. Thanks. The main implementation and details can refer to the commit log of patch 1. In this series, I have changed the following four helpers, the following table shows the impact of the overhead of those helpers. +------------------+-----------------------+ | APIs | head page | tail page | +------------------+-----------+-----------+ | PageHead() | Y | N | +------------------+-----------+-----------+ | PageTail() | Y | N | +------------------+-----------+-----------+ | PageCompound() | N | N | +------------------+-----------+-----------+ | compound_head() | Y | N | +------------------+-----------+-----------+ Y: Overhead is increased. N: Overhead is _NOT_ increased. It shows that the overhead of those helpers on a tail page don't change between "hugetlb_free_vmemmap=on" and "hugetlb_free_vmemmap=off". But the overhead on a head page will be increased when "hugetlb_free_vmemmap=on" (except PageCompound()). So I believe that Matthew Wilcox's folio series will help with this. The users of PageHead() and PageTail() are much less than compound_head() and most users of PageTail() are VM_BUG_ON(), so I have done some tests about the overhead of compound_head() on head pages. I have tested the overhead of calling compound_head() on a head page, which is 2.11ns (Measure the call time of 10 million times compound_head(), and then average). For a head page whose address is not aligned with PAGE_SIZE or a non-compound page, the overhead of compound_head() is 2.54ns which is increased by 20%. For a head page whose address is aligned with PAGE_SIZE, the overhead of compound_head() is 2.97ns which is increased by 40%. Most pages are the former. I do not think the overhead is significant since the overhead of compound_head() itself is low. This patch (of 5): This patch minimizes the overhead of struct page for 2MB HugeTLB pages significantly. It further reduces the overhead of struct page by 12.5% for a 2MB HugeTLB compared to the previous approach, which means 2GB per 1TB HugeTLB (2MB type). After the feature of "Free sonme vmemmap pages of HugeTLB page" is enabled, the mapping of the vmemmap addresses associated with a 2MB HugeTLB page becomes the figure below. HugeTLB struct pages(8 pages) page frame(8 pages) +-----------+ ---virt_to_page---> +-----------+ mapping to +-----------+---> PG_head | | | 0 | -------------> | 0 | | | +-----------+ +-----------+ | | | 1 | -------------> | 1 | | | +-----------+ +-----------+ | | | 2 | ----------------^ ^ ^ ^ ^ ^ | | +-----------+ | | | | | | | | 3 | ------------------+ | | | | | | +-----------+ | | | | | | | 4 | --------------------+ | | | | 2MB | +-----------+ | | | | | | 5 | ----------------------+ | | | | +-----------+ | | | | | 6 | ------------------------+ | | | +-----------+ | | | | 7 | --------------------------+ | | +-----------+ | | | | | | +-----------+ As we can see, the 2nd vmemmap page frame (indexed by 1) is reused and remaped. However, the 2nd vmemmap page frame is also can be freed to the buddy allocator, then we can change the mapping from the figure above to the figure below. HugeTLB struct pages(8 pages) page frame(8 pages) +-----------+ ---virt_to_page---> +-----------+ mapping to +-----------+---> PG_head | | | 0 | -------------> | 0 | | | +-----------+ +-----------+ | | | 1 | ---------------^ ^ ^ ^ ^ ^ ^ | | +-----------+ | | | | | | | | | 2 | -----------------+ | | | | | | | +-----------+ | | | | | | | | 3 | -------------------+ | | | | | | +-----------+ | | | | | | | 4 | ---------------------+ | | | | 2MB | +-----------+ | | | | | | 5 | -----------------------+ | | | | +-----------+ | | | | | 6 | -------------------------+ | | | +-----------+ | | | | 7 | ---------------------------+ | | +-----------+ | | | | | | +-----------+ After we do this, all tail vmemmap pages (1-7) are mapped to the head vmemmap page frame (0). In other words, there are more than one page struct with PG_head associated with each HugeTLB page. We __know__ that there is only one head page struct, the tail page structs with PG_head are fake head page structs. We need an approach to distinguish between those two different types of page structs so that compound_head(), PageHead() and PageTail() can work properly if the parameter is the tail page struct but with PG_head. The following code snippet describes how to distinguish between real and fake head page struct. if (test_bit(PG_head, &page->flags)) { unsigned long head = READ_ONCE(page[1].compound_head); if (head & 1) { if (head == (unsigned long)page + 1) ==> head page struct else ==> tail page struct } else ==> head page struct } We can safely access the field of the @page[1] with PG_head because the @page is a compound page composed with at least two contiguous pages. [songmuchun@bytedance.com: restore lost comment changes] Link: https://lkml.kernel.org/r/20211101031651.75851-1-songmuchun@bytedance.com Link: https://lkml.kernel.org/r/20211101031651.75851-2-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Reviewed-by: Barry Song <song.bao.hua@hisilicon.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: David Hildenbrand <david@redhat.com> Cc: Chen Huang <chenhuang5@huawei.com> Cc: Bodeddula Balasubramaniam <bodeddub@amazon.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Xiongchun Duan <duanxiongchun@bytedance.com> Cc: Fam Zheng <fam.zheng@bytedance.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23fs: introduce alloc_inode_sb() to allocate filesystems specific inodeMuchun Song1-0/+6
The allocated inode cache is supposed to be added to its memcg list_lru which should be allocated as well in advance. That can be done by kmem_cache_alloc_lru() which allocates object and list_lru. The file systems is main user of it. So introduce alloc_inode_sb() to allocate file system specific inodes and set up the inode reclaim context properly. The file system is supposed to use alloc_inode_sb() to allocate inodes. In later patches, we will convert all users to the new API. Link: https://lkml.kernel.org/r/20220228122126.37293-4-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Alex Shi <alexs@kernel.org> Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Cc: Chao Yu <chao@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Fam Zheng <fam.zheng@bytedance.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kari Argillander <kari.argillander@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Xiongchun Duan <duanxiongchun@bytedance.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23mm/memcg: disable threshold event handlers on PREEMPT_RTSebastian Andrzej Siewior1-0/+2
During the integration of PREEMPT_RT support, the code flow around memcg_check_events() resulted in `twisted code'. Moving the code around and avoiding then would then lead to an additional local-irq-save section within memcg_check_events(). While looking better, it adds a local-irq-save section to code flow which is usually within an local-irq-off block on non-PREEMPT_RT configurations. The threshold event handler is a deprecated memcg v1 feature. Instead of trying to get it to work under PREEMPT_RT just disable it. There should be no users on PREEMPT_RT. From that perspective it makes even less sense to get it to work under PREEMPT_RT while having zero users. Make memory.soft_limit_in_bytes and cgroup.event_control return -EOPNOTSUPP on PREEMPT_RT. Make an empty memcg_check_events() and memcg_write_event_control() which return only -EOPNOTSUPP on PREEMPT_RT. Document that the two knobs are disabled on PREEMPT_RT. Link: https://lkml.kernel.org/r/20220226204144.1008339-3-bigeasy@linutronix.de Suggested-by: Michal Hocko <mhocko@kernel.org> Suggested-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: kernel test robot <oliver.sang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Waiman Long <longman@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23memcg: add per-memcg total kernel memory statYosry Ahmed1-0/+5
Currently memcg stats show several types of kernel memory: kernel stack, page tables, sock, vmalloc, and slab. However, there are other allocations with __GFP_ACCOUNT (or supersets such as GFP_KERNEL_ACCOUNT) that are not accounted in any of those stats, a few examples are: - various kvm allocations (e.g. allocated pages to create vcpus) - io_uring - tmp_page in pipes during pipe_write() - bpf ringbuffers - unix sockets Keeping track of the total kernel memory is essential for the ease of migration from cgroup v1 to v2 as there are large discrepancies between v1's kmem.usage_in_bytes and the sum of the available kernel memory stats in v2. Adding separate memcg stats for all __GFP_ACCOUNT kernel allocations is an impractical maintenance burden as there a lot of those all over the kernel code, with more use cases likely to show up in the future. Therefore, add a "kernel" memcg stat that is analogous to kmem page counter, with added benefits such as using rstat infrastructure which aggregates stats more efficiently. Additionally, this provides a lighter alternative in case the legacy kmem is deprecated in the future [yosryahmed@google.com: v2] Link: https://lkml.kernel.org/r/20220203193856.972500-1-yosryahmed@google.com Link: https://lkml.kernel.org/r/20220201200823.3283171-1-yosryahmed@google.com Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Acked-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23mm: document and polish read-ahead codeNeilBrown2-8/+27
Add some "big-picture" documentation for read-ahead and polish the code to make it fit this documentation. The meaning of ->async_size is clarified to match its name. i.e. Any request to ->readahead() has a sync part and an async part. The caller will wait for the sync pages to complete, but will not wait for the async pages. The first async page is still marked PG_readahead Note that the current function names page_cache_sync_ra() and page_cache_async_ra() are misleading. All ra request are partly sync and partly async, so either part can be empty. A page_cache_sync_ra() request will usually set ->async_size non-zero, implying it is not all synchronous. When a non-zero req_count is passed to page_cache_async_ra(), the implication is that some prefix of the request is synchronous, though the calculation made there is incorrect - I haven't tried to fix it. Link: https://lkml.kernel.org/r/164549983734.9187.11586890887006601405.stgit@noble.brown Signed-off-by: NeilBrown <neilb@suse.de> Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Cc: Chao Yu <chao@kernel.org> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jeff Layton <jlayton@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Lars Ellenberg <lars.ellenberg@linbit.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Paolo Valente <paolo.valente@linaro.org> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23Merge tag 'sched-core-2022-03-22' of ↵Linus Torvalds3-45/+56
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Cleanups for SCHED_DEADLINE - Tracing updates/fixes - CPU Accounting fixes - First wave of changes to optimize the overhead of the scheduler build, from the fast-headers tree - including placeholder *_api.h headers for later header split-ups. - Preempt-dynamic using static_branch() for ARM64 - Isolation housekeeping mask rework; preperatory for further changes - NUMA-balancing: deal with CPU-less nodes - NUMA-balancing: tune systems that have multiple LLC cache domains per node (eg. AMD) - Updates to RSEQ UAPI in preparation for glibc usage - Lots of RSEQ/selftests, for same - Add Suren as PSI co-maintainer * tag 'sched-core-2022-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (81 commits) sched/headers: ARM needs asm/paravirt_api_clock.h too sched/numa: Fix boot crash on arm64 systems headers/prep: Fix header to build standalone: <linux/psi.h> sched/headers: Only include <linux/entry-common.h> when CONFIG_GENERIC_ENTRY=y cgroup: Fix suspicious rcu_dereference_check() usage warning sched/preempt: Tell about PREEMPT_DYNAMIC on kernel headers sched/topology: Remove redundant variable and fix incorrect type in build_sched_domains sched/deadline,rt: Remove unused parameter from pick_next_[rt|dl]_entity() sched/deadline,rt: Remove unused functions for !CONFIG_SMP sched/deadline: Use __node_2_[pdl|dle]() and rb_first_cached() consistently sched/deadline: Merge dl_task_can_attach() and dl_cpu_busy() sched/deadline: Move bandwidth mgmt and reclaim functions into sched class source file sched/deadline: Remove unused def_dl_bandwidth sched/tracing: Report TASK_RTLOCK_WAIT tasks as TASK_UNINTERRUPTIBLE sched/tracing: Don't re-read p->state when emitting sched_switch event sched/rt: Plug rt_mutex_setprio() vs push_rt_task() race sched/cpuacct: Remove redundant RCU read lock sched/cpuacct: Optimize away RCU read lock sched/cpuacct: Fix charge percpu cpuusage sched/headers: Reorganize, clean up and optimize kernel/sched/sched.h dependencies ...
2022-03-22ALSA: hda/realtek: Add alc256-samsung-headphone fixupMatt Kramer1-0/+4
This fixes the near-silence of the headphone jack on the ALC256-based Samsung Galaxy Book Flex Alpha (NP730QCJ). The magic verbs were found through trial and error, using known ALC298 hacks as inspiration. The fixup is auto-enabled only when the NP730QCJ is detected. It can be manually enabled using model=alc256-samsung-headphone. Signed-off-by: Matt Kramer <mccleetus@gmail.com> Link: https://lore.kernel.org/r/3168355.aeNJFYEL58@linus Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-03-22Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski4-0/+293
Alexei Starovoitov says: ==================== pull-request: bpf-next 2022-03-21 v2 We've added 137 non-merge commits during the last 17 day(s) which contain a total of 143 files changed, 7123 insertions(+), 1092 deletions(-). The main changes are: 1) Custom SEC() handling in libbpf, from Andrii. 2) subskeleton support, from Delyan. 3) Use btf_tag to recognize __percpu pointers in the verifier, from Hao. 4) Fix net.core.bpf_jit_harden race, from Hou. 5) Fix bpf_sk_lookup remote_port on big-endian, from Jakub. 6) Introduce fprobe (multi kprobe) _without_ arch bits, from Masami. The arch specific bits will come later. 7) Introduce multi_kprobe bpf programs on top of fprobe, from Jiri. 8) Enable non-atomic allocations in local storage, from Joanne. 9) Various var_off ptr_to_btf_id fixed, from Kumar. 10) bpf_ima_file_hash helper, from Roberto. 11) Add "live packet" mode for XDP in BPF_PROG_RUN, from Toke. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (137 commits) selftests/bpf: Fix kprobe_multi test. Revert "rethook: x86: Add rethook x86 implementation" Revert "arm64: rethook: Add arm64 rethook implementation" Revert "powerpc: Add rethook support" Revert "ARM: rethook: Add rethook arm implementation" bpftool: Fix a bug in subskeleton code generation bpf: Fix bpf_prog_pack when PMU_SIZE is not defined bpf: Fix bpf_prog_pack for multi-node setup bpf: Fix warning for cast from restricted gfp_t in verifier bpf, arm: Fix various typos in comments libbpf: Close fd in bpf_object__reuse_map bpftool: Fix print error when show bpf map bpf: Fix kprobe_multi return probe backtrace Revert "bpf: Add support to inline bpf_get_func_ip helper on x86" bpf: Simplify check in btf_parse_hdr() selftests/bpf/test_lirc_mode2.sh: Exit with proper code bpf: Check for NULL return from bpf_get_btf_vmlinux selftests/bpf: Test skipping stacktrace bpf: Adjust BPF stack helper functions to accommodate skip > 0 bpf: Select proper size for bpf_prog_pack ... ==================== Link: https://lore.kernel.org/r/20220322050159.5507-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-22Merge tag 'ext4_for_linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Fix some bugs in converting ext4 to use the new mount API, as well as more bug fixes and clean ups in the ext4 fast_commit feature (most notably, in the tracepoints). In the jbd2 layer, the t_handle_lock spinlock has been removed, with the last place where it was actually needed replaced with an atomic cmpxchg" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (35 commits) ext4: fix kernel doc warnings ext4: fix remaining two trace events to use same printk convention ext4: add commit tid info in ext4_fc_commit_start/stop trace events ext4: add commit_tid info in jbd debug log ext4: add transaction tid info in fc_track events ext4: add new trace event in ext4_fc_cleanup ext4: return early for non-eligible fast_commit track events ext4: do not call FC trace event in ext4_fc_commit() if FS does not support FC ext4: convert ext4_fc_track_dentry type events to use event class ext4: fix ext4_fc_stats trace point ext4: remove unused enum EXT4_FC_COMMIT_FAILED ext4: warn when dirtying page w/o buffers in data=journal mode doc: fixed a typo in ext4 documentation ext4: make mb_optimize_scan performance mount option work with extents ext4: make mb_optimize_scan option work with set/unset mount cmd ext4: don't BUG if someone dirty pages without asking ext4 first ext4: remove redundant assignment to variable split_flag1 ext4: fix underflow in ext4_max_bitmap_size() ext4: fix ext4_mb_clear_bb() kernel-doc comment ext4: fix fs corruption when tring to remove a non-empty directory with IO error ...
2022-03-22Merge tag 'nfsd-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds1-3/+3
Pull nfsd updates from Chuck Lever: "New features: - NFSv3 support in NFSD is now always built - Added NFSD support for the NFSv4 birth-time file attribute - Added support for storing and displaying sockaddrs in trace points - NFSD now recognizes RPC_AUTH_TLS probes Performance improvements: - Optimized the svc transport enqueuing mechanism - Added micro-optimizations for the duplicate reply cache Notable bug fixes: - Allocation of the NFSD file cache hash table is more reliable" * tag 'nfsd-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (30 commits) nfsd: fix using the correct variable for sizeof() nfsd: use correct format characters NFSD: prevent integer overflow on 32 bit systems NFSD: prevent underflow in nfssvc_decode_writeargs() fs/lock: documentation cleanup. Replace inode->i_lock with flc_lock. NFSD: Fix nfsd_breaker_owns_lease() return values NFSD: Clean up _lm_ operation names arch: Remove references to CONFIG_NFSD_V3 in the default configs NFSD: Remove CONFIG_NFSD_V3 nfsd: more robust allocation failure handling in nfsd_file_cache_init SUNRPC: Teach server to recognize RPC_AUTH_TLS NFSD: Move svc_serv_ops::svo_function into struct svc_serv NFSD: Remove svc_serv_ops::svo_module SUNRPC: Remove svc_shutdown_net() SUNRPC: Rename svc_close_xprt() SUNRPC: Rename svc_create_xprt() SUNRPC: Remove svo_shutdown method SUNRPC: Merge svc_do_enqueue_xprt() into svc_enqueue_xprt() SUNRPC: Remove the .svo_enqueue_xprt method SUNRPC: Record endpoint information in trace log ...