summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2021-07-02Merge branch 'for-5.14' of ↵Linus Torvalds6-58/+354
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - cgroup.kill is added which implements atomic killing of the whole subtree. Down the line, this should be able to replace the multiple userland implementations of "keep killing till empty". - PSI can now be turned off at boot time to avoid overhead for configurations which don't care about PSI. * 'for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: make per-cgroup pressure stall tracking configurable cgroup: Fix kernel-doc cgroup: inline cgroup_task_freeze() tests/cgroup: test cgroup.kill tests/cgroup: move cg_wait_for(), cg_prepare_for_wait() tests/cgroup: use cgroup.kill in cg_killall() docs/cgroup: add entry for cgroup.kill cgroup: introduce cgroup.kill
2021-07-01Merge tag 'net-next-5.14' of ↵Linus Torvalds192-1669/+8874
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - BPF: - add syscall program type and libbpf support for generating instructions and bindings for in-kernel BPF loaders (BPF loaders for BPF), this is a stepping stone for signed BPF programs - infrastructure to migrate TCP child sockets from one listener to another in the same reuseport group/map to improve flexibility of service hand-off/restart - add broadcast support to XDP redirect - allow bypass of the lockless qdisc to improving performance (for pktgen: +23% with one thread, +44% with 2 threads) - add a simpler version of "DO_ONCE()" which does not require jump labels, intended for slow-path usage - virtio/vsock: introduce SOCK_SEQPACKET support - add getsocketopt to retrieve netns cookie - ip: treat lowest address of a IPv4 subnet as ordinary unicast address allowing reclaiming of precious IPv4 addresses - ipv6: use prandom_u32() for ID generation - ip: add support for more flexible field selection for hashing across multi-path routes (w/ offload to mlxsw) - icmp: add support for extended RFC 8335 PROBE (ping) - seg6: add support for SRv6 End.DT46 behavior - mptcp: - DSS checksum support (RFC 8684) to detect middlebox meddling - support Connection-time 'C' flag - time stamping support - sctp: packetization Layer Path MTU Discovery (RFC 8899) - xfrm: speed up state addition with seq set - WiFi: - hidden AP discovery on 6 GHz and other HE 6 GHz improvements - aggregation handling improvements for some drivers - minstrel improvements for no-ack frames - deferred rate control for TXQs to improve reaction times - switch from round robin to virtual time-based airtime scheduler - add trace points: - tcp checksum errors - openvswitch - action execution, upcalls - socket errors via sk_error_report Device APIs: - devlink: add rate API for hierarchical control of max egress rate of virtual devices (VFs, SFs etc.) - don't require RCU read lock to be held around BPF hooks in NAPI context - page_pool: generic buffer recycling New hardware/drivers: - mobile: - iosm: PCIe Driver for Intel M.2 Modem - support for Qualcomm MSM8998 (ipa) - WiFi: Qualcomm QCN9074 and WCN6855 PCI devices - sparx5: Microchip SparX-5 family of Enterprise Ethernet switches - Mellanox BlueField Gigabit Ethernet (control NIC of the DPU) - NXP SJA1110 Automotive Ethernet 10-port switch - Qualcomm QCA8327 switch support (qca8k) - Mikrotik 10/25G NIC (atl1c) Driver changes: - ACPI support for some MDIO, MAC and PHY devices from Marvell and NXP (our first foray into MAC/PHY description via ACPI) - HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx - Mellanox/Nvidia NIC (mlx5) - NIC VF offload of L2 bridging - support IRQ distribution to Sub-functions - Marvell (prestera): - add flower and match all - devlink trap - link aggregation - Netronome (nfp): connection tracking offload - Intel 1GE (igc): add AF_XDP support - Marvell DPU (octeontx2): ingress ratelimit offload - Google vNIC (gve): new ring/descriptor format support - Qualcomm mobile (rmnet & ipa): inline checksum offload support - MediaTek WiFi (mt76) - mt7915 MSI support - mt7915 Tx status reporting - mt7915 thermal sensors support - mt7921 decapsulation offload - mt7921 enable runtime pm and deep sleep - Realtek WiFi (rtw88) - beacon filter support - Tx antenna path diversity support - firmware crash information via devcoredump - Qualcomm WiFi (wcn36xx) - Wake-on-WLAN support with magic packets and GTK rekeying - Micrel PHY (ksz886x/ksz8081): add cable test support" * tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits) tcp: change ICSK_CA_PRIV_SIZE definition tcp_yeah: check struct yeah size at compile time gve: DQO: Fix off by one in gve_rx_dqo() stmmac: intel: set PCI_D3hot in suspend stmmac: intel: Enable PHY WOL option in EHL net: stmmac: option to enable PHY WOL with PMT enabled net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del} net: use netdev_info in ndo_dflt_fdb_{add,del} ptp: Set lookup cookie when creating a PTP PPS source. net: sock: add trace for socket errors net: sock: introduce sk_error_report net: dsa: replay the local bridge FDB entries pointing to the bridge dev too net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev net: dsa: include fdb entries pointing to bridge in the host fdb list net: dsa: include bridge addresses which are local in the host fdb list net: dsa: sync static FDB entries on foreign interfaces to hardware net: dsa: install the host MDB and FDB entries in the master's RX filter net: dsa: reference count the FDB addresses at the cross-chip notifier level net: dsa: introduce a separate cross-chip notifier type for host FDBs net: dsa: reference count the MDB entries at the cross-chip notifier level ...
2021-06-30Merge tag 'platform-drivers-x86-v5.14-1' of ↵Linus Torvalds4-2/+35
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver updates from Hans de Goede: "Highlights: - New think-lmi driver adding support for changing Lenovo Thinkpad BIOS settings from within Linux using the standard firmware- attributes class sysfs API - MS Surface aggregator-cdev now also supports forwarding events to user-space (for debugging / new driver development purposes only) - New intel_skl_int3472 driver this provides the necessary glue to translate ACPI table information to GPIOs, regulators, etc. for camera sensors on Intel devices with IPU3 attached MIPI cameras - A whole bunch of other fixes + device-specific quirk additions - New devm_work_autocancel() devm-helpers.h function" * tag 'platform-drivers-x86-v5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (83 commits) platform/x86: dell-wmi-sysman: Change user experience when Admin/System Password is modified platform/x86: intel_skl_int3472: Uninitialized variable in skl_int3472_handle_gpio_resources() platform/x86: think-lmi: Move kfree(setting->possible_values) to tlmi_attr_setting_release() platform/x86: think-lmi: Split current_value to reflect only the value platform/x86: think-lmi: Fix issues with duplicate attributes platform/x86: think-lmi: Return EINVAL when kbdlang gets set to a 0 length string platform/x86: intel_cht_int33fe: Move to its own subfolder platform/x86: intel_skl_int3472: Move to intel/ subfolder platform/x86: intel_skl_int3472: Provide skl_int3472_unregister_clock() platform/x86: intel_skl_int3472: Provide skl_int3472_unregister_regulator() platform/x86: intel_skl_int3472: Use ACPI GPIO resource directly platform/x86: intel_skl_int3472: Fix dependencies (drop CLKDEV_LOOKUP) platform/x86: intel_skl_int3472: Free ACPI device resources after use platform/x86: Remove "default n" entries platform/x86: ISST: Use numa node id for cpu pci dev mapping platform/x86: ISST: Optimize CPU to PCI device mapping tools/power/x86/intel-speed-select: v1.10 release tools/power/x86/intel-speed-select: Fix uncore memory frequency display extcon: extcon-max8997: Simplify driver using devm extcon: extcon-max8997: Fix IRQ freeing at error path ...
2021-06-30Merge tag 'fs.openat2.unknown_flags.v5.14' of ↵Linus Torvalds1-1/+6
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull openat2 fixes from Christian Brauner: - Remove the unused VALID_UPGRADE_FLAGS define we carried from an extension to openat2() that we haven't merged. Aleksa might be getting back to it at some point but just not right now. - openat2() used to accidently ignore unknown flag values in the upper 32 bits. The new openat2() syscall verifies that no unknown O-flag values are set and returns an error to userspace if they are while the older open syscalls like open() and openat() simply ignore unknown flag values: #define O_FLAG_CURRENTLY_INVALID (1 << 31) struct open_how how = { .flags = O_RDONLY | O_FLAG_CURRENTLY_INVALID, .resolve = 0, }; /* fails */ fd = openat2(-EBADF, "/dev/null", &how, sizeof(how)); /* succeeds */ fd = openat(-EBADF, "/dev/null", O_RDONLY | O_FLAG_CURRENTLY_INVALID); However, openat2() silently truncates the upper 32 bits meaning: #define O_FLAG_CURRENTLY_INVALID_LOWER32 (1 << 31) #define O_FLAG_CURRENTLY_INVALID_UPPER32 (1 << 40) struct open_how how_lowe32 = { .flags = O_RDONLY | O_FLAG_CURRENTLY_INVALID_LOWER32, }; struct open_how how_upper32 = { .flags = O_RDONLY | O_FLAG_CURRENTLY_INVALID_UPPER32, }; /* fails */ fd = openat2(-EBADF, "/dev/null", &how_lower32, sizeof(how_lower32)); /* succeeds */ fd = openat2(-EBADF, "/dev/null", &how_upper32, sizeof(how_upper32)); Fix this by preventing the immediate truncation in build_open_flags() and add a compile-time check to catch when we add flags in the upper 32 bit range. * tag 'fs.openat2.unknown_flags.v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: test: add openat2() test for invalid upper 32 bit flag value open: don't silently ignore unknown O-flags in openat2() fcntl: remove unused VALID_UPGRADE_FLAGS
2021-06-30Merge tag 'fs.mount_setattr.nosymfollow.v5.14' of ↵Linus Torvalds1-3/+85
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull mount_setattr updates from Christian Brauner: "A few releases ago the old mount API gained support for a mount options which prevents following symlinks on a given mount. This adds support for it in the new mount api through the MOUNT_ATTR_NOSYMFOLLOW flag via mount_setattr() and fsmount(). With mount_setattr() that flag can even be applied recursively. There's an additional ack from Ross Zwisler who originally authored the nosymfollow patch. As I've already had the patches in my for-next I didn't add his ack explicitly" * tag 'fs.mount_setattr.nosymfollow.v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: test MOUNT_ATTR_NOSYMFOLLOW with mount_setattr() mount: Support "nosymfollow" in new mount api
2021-06-30Merge branch 'akpm' (patches from Andrew)Linus Torvalds2-31/+69
Merge misc updates from Andrew Morton: "191 patches. Subsystems affected by this patch series: kthread, ia64, scripts, ntfs, squashfs, ocfs2, kernel/watchdog, and mm (gup, pagealloc, slab, slub, kmemleak, dax, debug, pagecache, gup, swap, memcg, pagemap, mprotect, bootmem, dma, tracing, vmalloc, kasan, initialization, pagealloc, and memory-failure)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (191 commits) mm,hwpoison: make get_hwpoison_page() call get_any_page() mm,hwpoison: send SIGBUS with error virutal address mm/page_alloc: split pcp->high across all online CPUs for cpuless nodes mm/page_alloc: allow high-order pages to be stored on the per-cpu lists mm: replace CONFIG_FLAT_NODE_MEM_MAP with CONFIG_FLATMEM mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMA docs: remove description of DISCONTIGMEM arch, mm: remove stale mentions of DISCONIGMEM mm: remove CONFIG_DISCONTIGMEM m68k: remove support for DISCONTIGMEM arc: remove support for DISCONTIGMEM arc: update comment about HIGHMEM implementation alpha: remove DISCONTIGMEM and NUMA mm/page_alloc: move free_the_page mm/page_alloc: fix counting of managed_pages mm/page_alloc: improve memmap_pages dbg msg mm: drop SECTION_SHIFT in code comments mm/page_alloc: introduce vm.percpu_pagelist_high_fraction mm/page_alloc: limit the number of pages on PCP lists when reclaim is active mm/page_alloc: scale the number of pages that are batch freed ...
2021-06-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-1/+9
Trivial conflict in net/netfilter/nf_tables_api.c. Duplicate fix in tools/testing/selftests/net/devlink_port_split.py - take the net-next version. skmsg, and L4 bpf - keep the bpf code but remove the flags and err params. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-06-29Merge tag 'x86-entry-2021-06-29' of ↵Linus Torvalds1-49/+442
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 entry code related updates from Thomas Gleixner: - Consolidate the macros for .byte ... opcode sequences - Deduplicate register offset defines in include files - Simplify the ia32,x32 compat handling of the related syscall tables to get rid of #ifdeffery. - Clear all EFLAGS which are not required for syscall handling - Consolidate the syscall tables and switch the generation over to the generic shell script and remove the CFLAGS tweaks which are not longer required. - Use 'int' type for system call numbers to match the generic code. - Add more selftests for syscalls * tag 'x86-entry-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/syscalls: Don't adjust CFLAGS for syscall tables x86/syscalls: Remove -Wno-override-init for syscall tables x86/uml/syscalls: Remove array index from syscall initializers x86/syscalls: Clear 'offset' and 'prefix' in case they are set in env x86/entry: Use int everywhere for system call numbers x86/entry: Treat out of range and gap system calls the same x86/entry/64: Sign-extend system calls on entry to int selftests/x86/syscall: Add tests under ptrace to syscall_numbering_64 selftests/x86/syscall: Simplify message reporting in syscall_numbering selftests/x86/syscall: Update and extend syscall_numbering_64 x86/syscalls: Switch to generic syscallhdr.sh x86/syscalls: Use __NR_syscalls instead of __NR_syscall_max x86/unistd: Define X32_NR_syscalls only for 64-bit kernel x86/syscalls: Stop filling syscall arrays with *_sys_ni_syscall x86/syscalls: Switch to generic syscalltbl.sh x86/entry/x32: Rename __x32_compat_sys_* to __x64_compat_sys_*
2021-06-29Merge tag 'x86-irq-2021-06-29' of ↵Linus Torvalds1-2/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 interrupt related updates from Thomas Gleixner: - Consolidate the VECTOR defines and the usage sites. - Cleanup GDT/IDT related code and replace open coded ASM with proper native helper functions. * tag 'x86-irq-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kexec: Set_[gi]dt() -> native_[gi]dt_invalidate() in machine_kexec_*.c x86: Add native_[ig]dt_invalidate() x86/idt: Remove address argument from idt_invalidate() x86/irq: Add and use NR_EXTERNAL_VECTORS and NR_SYSTEM_VECTORS x86/irq: Remove unused vectors defines
2021-06-29Merge tag 'printk-for-5.14' of ↵Linus Torvalds3-1/+6
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Add %pt[RT]s modifier to vsprintf(). It overrides ISO 8601 separator by using ' ' (space). It produces "YYYY-mm-dd HH:MM:SS" instead of "YYYY-mm-ddTHH:MM:SS". - Correctly parse long row of numbers by sscanf() when using the field width. Add extensive sscanf() selftest. - Generalize re-entrant CPU lock that has already been used to serialize dump_stack() output. It is part of the ongoing printk rework. It will allow to remove the obsoleted printk_safe buffers and introduce atomic consoles. - Some code clean up and sparse warning fixes. * tag 'printk-for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk: fix cpu lock ordering lib/dump_stack: move cpu lock to printk.c printk: Remove trailing semicolon in macros random32: Fix implicit truncation warning in prandom_seed_state() lib: test_scanf: Remove pointless use of type_min() with unsigned types selftests: lib: Add wrapper script for test_scanf lib: test_scanf: Add tests for sscanf number conversion lib: vsprintf: Fix handling of number field widths in vsscanf lib: vsprintf: scanf: Negative number must have field width > 1 usb: host: xhci-tegra: Switch to use %ptTs nilfs2: Switch to use %ptTs kdb: Switch to use %ptTs lib/vsprintf: Allow to override ISO 8601 date and time separator
2021-06-29mm/gup_benchmark: support threadingPeter Xu1-31/+65
Patch series "mm/gup: Fix pin page write cache bouncing on has_pinned", v2. This series contains 3 patches, the 1st one enables threading for gup_benchmark in the kselftest. The latter two patches are collected from Andrea's local branch which can fix write cache bouncing issue with pinning fast-gup. To be explicit on the latter two patches: - the 2nd patch fixes the perf degrade when introducing has_pinned, then - the last patch tries to remove the has_pinned with a bit in mm->flags For patch 3: originally I think we had a plan to reuse has_pinned into a counter very soon, however that's not happening at least until today, so maybe it proves that we can remove it until we really want such a counter for whatever reason. As the commit message stated, it saves 4 bytes for each mm without observable regressions. Regarding testing: we can reference to the commit message of patch 2 for some detailed testing with will-is-scale. Meanwhile I did patch 1 just because then we can even easily verify the patchset using the existing kselftest facilities or even regress test it in the future with the repo if we want. Below numbers are extra verification tests that I did besides commit message of patch 2 using the new gup_benchmark and 256 cpus. Below test is done on 40 cpus host with Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz, and I can get similar result (of course the write cache bouncing get severe with even more cores). After patch 1 applied (only test patch, so using old kernel): $ sudo chrt -f 1 ./gup_test -a -m 512 -j 40 PIN_FAST_BENCHMARK: Time: get:459632 put:5990 us PIN_FAST_BENCHMARK: Time: get:461967 put:5840 us PIN_FAST_BENCHMARK: Time: get:464521 put:6140 us PIN_FAST_BENCHMARK: Time: get:465176 put:7100 us PIN_FAST_BENCHMARK: Time: get:465960 put:6733 us PIN_FAST_BENCHMARK: Time: get:465324 put:6781 us PIN_FAST_BENCHMARK: Time: get:466018 put:7130 us PIN_FAST_BENCHMARK: Time: get:466362 put:7118 us PIN_FAST_BENCHMARK: Time: get:465118 put:6975 us PIN_FAST_BENCHMARK: Time: get:466422 put:6602 us PIN_FAST_BENCHMARK: Time: get:465791 put:6818 us PIN_FAST_BENCHMARK: Time: get:467091 put:6298 us PIN_FAST_BENCHMARK: Time: get:467694 put:5432 us PIN_FAST_BENCHMARK: Time: get:469575 put:5581 us PIN_FAST_BENCHMARK: Time: get:468124 put:6055 us PIN_FAST_BENCHMARK: Time: get:468877 put:6720 us PIN_FAST_BENCHMARK: Time: get:467212 put:4961 us PIN_FAST_BENCHMARK: Time: get:467834 put:6697 us PIN_FAST_BENCHMARK: Time: get:470778 put:6398 us PIN_FAST_BENCHMARK: Time: get:469788 put:6310 us PIN_FAST_BENCHMARK: Time: get:488277 put:7113 us PIN_FAST_BENCHMARK: Time: get:486613 put:7085 us PIN_FAST_BENCHMARK: Time: get:486940 put:7202 us PIN_FAST_BENCHMARK: Time: get:488728 put:7101 us PIN_FAST_BENCHMARK: Time: get:487570 put:7327 us PIN_FAST_BENCHMARK: Time: get:489260 put:7027 us PIN_FAST_BENCHMARK: Time: get:488846 put:6866 us PIN_FAST_BENCHMARK: Time: get:488521 put:6745 us PIN_FAST_BENCHMARK: Time: get:489950 put:6459 us PIN_FAST_BENCHMARK: Time: get:489777 put:6617 us PIN_FAST_BENCHMARK: Time: get:488224 put:6591 us PIN_FAST_BENCHMARK: Time: get:488644 put:6477 us PIN_FAST_BENCHMARK: Time: get:488754 put:6711 us PIN_FAST_BENCHMARK: Time: get:488875 put:6743 us PIN_FAST_BENCHMARK: Time: get:489290 put:6657 us PIN_FAST_BENCHMARK: Time: get:490264 put:6684 us PIN_FAST_BENCHMARK: Time: get:489631 put:6737 us PIN_FAST_BENCHMARK: Time: get:488434 put:6655 us PIN_FAST_BENCHMARK: Time: get:492213 put:6297 us PIN_FAST_BENCHMARK: Time: get:491124 put:6173 us After the whole series applied (new fixed kernel): $ sudo chrt -f 1 ./gup_test -a -m 512 -j 40 PIN_FAST_BENCHMARK: Time: get:82038 put:7041 us PIN_FAST_BENCHMARK: Time: get:82144 put:6817 us PIN_FAST_BENCHMARK: Time: get:83417 put:6674 us PIN_FAST_BENCHMARK: Time: get:82540 put:6594 us PIN_FAST_BENCHMARK: Time: get:83214 put:6681 us PIN_FAST_BENCHMARK: Time: get:83444 put:6889 us PIN_FAST_BENCHMARK: Time: get:83194 put:7499 us PIN_FAST_BENCHMARK: Time: get:84876 put:7369 us PIN_FAST_BENCHMARK: Time: get:86092 put:10289 us PIN_FAST_BENCHMARK: Time: get:86153 put:10415 us PIN_FAST_BENCHMARK: Time: get:85026 put:7751 us PIN_FAST_BENCHMARK: Time: get:85458 put:7944 us PIN_FAST_BENCHMARK: Time: get:85735 put:8154 us PIN_FAST_BENCHMARK: Time: get:85851 put:8299 us PIN_FAST_BENCHMARK: Time: get:86323 put:9617 us PIN_FAST_BENCHMARK: Time: get:86288 put:10496 us PIN_FAST_BENCHMARK: Time: get:87697 put:9346 us PIN_FAST_BENCHMARK: Time: get:87980 put:8382 us PIN_FAST_BENCHMARK: Time: get:88719 put:8400 us PIN_FAST_BENCHMARK: Time: get:87616 put:8588 us PIN_FAST_BENCHMARK: Time: get:86730 put:9563 us PIN_FAST_BENCHMARK: Time: get:88167 put:8673 us PIN_FAST_BENCHMARK: Time: get:86844 put:9777 us PIN_FAST_BENCHMARK: Time: get:88068 put:11774 us PIN_FAST_BENCHMARK: Time: get:86170 put:15676 us PIN_FAST_BENCHMARK: Time: get:87967 put:12827 us PIN_FAST_BENCHMARK: Time: get:95773 put:7652 us PIN_FAST_BENCHMARK: Time: get:87734 put:13650 us PIN_FAST_BENCHMARK: Time: get:89833 put:14237 us PIN_FAST_BENCHMARK: Time: get:96186 put:8029 us PIN_FAST_BENCHMARK: Time: get:95532 put:8886 us PIN_FAST_BENCHMARK: Time: get:95351 put:5826 us PIN_FAST_BENCHMARK: Time: get:96401 put:8407 us PIN_FAST_BENCHMARK: Time: get:96473 put:8287 us PIN_FAST_BENCHMARK: Time: get:97177 put:8430 us PIN_FAST_BENCHMARK: Time: get:98120 put:5263 us PIN_FAST_BENCHMARK: Time: get:96271 put:7757 us PIN_FAST_BENCHMARK: Time: get:99628 put:10467 us PIN_FAST_BENCHMARK: Time: get:99344 put:10045 us PIN_FAST_BENCHMARK: Time: get:94212 put:15485 us Summary: Old kernel: 477729.97 (+-3.79%) New kernel: 89144.65 (+-11.76%) This patch (of 3): Add a new parameter "-j N" to support concurrent gup test. Link: https://lkml.kernel.org/r/20210507150553.208763-1-peterx@redhat.com Link: https://lkml.kernel.org/r/20210507150553.208763-2-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Jan Kara <jack@suse.cz> Cc: Michal Hocko <mhocko@suse.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Kirill Shutemov <kirill@shutemov.name> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29tools/vm/page_owner_sort.c: check malloc() returnTang Bin1-0/+4
Link: https://lkml.kernel.org/r/20210506131402.10416-1-tangbin@cmss.chinamobile.com Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-06-29Merge branch 'for-linus' of ↵Linus Torvalds5-0/+171
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace rlimit handling update from Eric Biederman: "This is the work mainly by Alexey Gladkov to limit rlimits to the rlimits of the user that created a user namespace, and to allow users to have stricter limits on the resources created within a user namespace." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: cred: add missing return error code when set_cred_ucounts() failed ucounts: Silence warning in dec_rlimit_ucounts ucounts: Set ucount_max to the largest positive value the type can hold kselftests: Add test to check for rlimit changes in different user namespaces Reimplement RLIMIT_MEMLOCK on top of ucounts Reimplement RLIMIT_SIGPENDING on top of ucounts Reimplement RLIMIT_MSGQUEUE on top of ucounts Reimplement RLIMIT_NPROC on top of ucounts Use atomic_t for ucounts reference counting Add a reference to ucounts for each cred Increase size of ucounts to atomic_long_t
2021-06-29Merge tag 'seccomp-v5.14-rc1' of ↵Linus Torvalds2-6/+55
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: - Add "atomic addfd + send reply" mode to SECCOMP_USER_NOTIF to better handle EINTR races visible to seccomp monitors. (Rodrigo Campos, Sargun Dhillon) - Improve seccomp selftests for readability in CI systems. (Kees Cook) * tag 'seccomp-v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: selftests/seccomp: Avoid using "sysctl" for report selftests/seccomp: Flush benchmark output selftests/seccomp: More closely track fds being assigned selftests/seccomp: Add test for atomic addfd+send seccomp: Support atomic "addfd + send reply"
2021-06-29Merge tag 'docs-5.14' of git://git.lwn.net/linuxLinus Torvalds1-1/+1
Pull documentation updates from Jonathan Corbet: "This was a reasonably active cycle for documentation; this includes: - Some kernel-doc cleanups. That script is still regex onslaught from hell, but it has gotten a little better. - Improvements to the checkpatch docs, which are also used by the tool itself. - A major update to the pathname lookup documentation. - Elimination of :doc: markup, since our automarkup magic can create references from filenames without all the extra noise. - The flurry of Chinese translation activity continues. Plus, of course, the usual collection of updates, typo fixes, and warning fixes" * tag 'docs-5.14' of git://git.lwn.net/linux: (115 commits) docs: path-lookup: use bare function() rather than literals docs: path-lookup: update symlink description docs: path-lookup: update get_link() ->follow_link description docs: path-lookup: update WALK_GET, WALK_PUT desc docs: path-lookup: no get_link() docs: path-lookup: update i_op->put_link and cookie description docs: path-lookup: i_op->follow_link replaced with i_op->get_link docs: path-lookup: Add macro name to symlink limit description docs: path-lookup: remove filename_mountpoint docs: path-lookup: update do_last() part docs: path-lookup: update path_mountpoint() part docs: path-lookup: update path_to_nameidata() part docs: path-lookup: update follow_managed() part docs: Makefile: Use CONFIG_SHELL not SHELL docs: Take a little noise out of the build process docs: x86: avoid using ReST :doc:`foo` markup docs: virt: kvm: s390-pv-boot.rst: avoid using ReST :doc:`foo` markup docs: userspace-api: landlock.rst: avoid using ReST :doc:`foo` markup docs: trace: ftrace.rst: avoid using ReST :doc:`foo` markup docs: trace: coresight: coresight.rst: avoid using ReST :doc:`foo` markup ...
2021-06-29selftests: net: devlink_port_split: check devlink returned an element before ↵Paolo Pisati1-0/+3
dereferencing it And thus avoid a Python stacktrace: ~/linux/tools/testing/selftests/net$ ./devlink_port_split.py Traceback (most recent call last): File "/home/linux/tools/testing/selftests/net/./devlink_port_split.py", line 277, in <module> main() File "/home/linux/tools/testing/selftests/net/./devlink_port_split.py", line 242, in main dev = list(devs.keys())[0] IndexError: list index out of range Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds49-577/+3163
Pull kvm updates from Paolo Bonzini: "This covers all architectures (except MIPS) so I don't expect any other feature pull requests this merge window. ARM: - Add MTE support in guests, complete with tag save/restore interface - Reduce the impact of CMOs by moving them in the page-table code - Allow device block mappings at stage-2 - Reduce the footprint of the vmemmap in protected mode - Support the vGIC on dumb systems such as the Apple M1 - Add selftest infrastructure to support multiple configuration and apply that to PMU/non-PMU setups - Add selftests for the debug architecture - The usual crop of PMU fixes PPC: - Support for the H_RPT_INVALIDATE hypercall - Conversion of Book3S entry/exit to C - Bug fixes S390: - new HW facilities for guests - make inline assembly more robust with KASAN and co x86: - Allow userspace to handle emulation errors (unknown instructions) - Lazy allocation of the rmap (host physical -> guest physical address) - Support for virtualizing TSC scaling on VMX machines - Optimizations to avoid shattering huge pages at the beginning of live migration - Support for initializing the PDPTRs without loading them from memory - Many TLB flushing cleanups - Refuse to load if two-stage paging is available but NX is not (this has been a requirement in practice for over a year) - A large series that separates the MMU mode (WP/SMAP/SMEP etc.) from CR0/CR4/EFER, using the MMU mode everywhere once it is computed from the CPU registers - Use PM notifier to notify the guest about host suspend or hibernate - Support for passing arguments to Hyper-V hypercalls using XMM registers - Support for Hyper-V TLB flush hypercalls and enlightened MSR bitmap on AMD processors - Hide Hyper-V hypercalls that are not included in the guest CPUID - Fixes for live migration of virtual machines that use the Hyper-V "enlightened VMCS" optimization of nested virtualization - Bugfixes (not many) Generic: - Support for retrieving statistics without debugfs - Cleanups for the KVM selftests API" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (314 commits) KVM: x86: rename apic_access_page_done to apic_access_memslot_enabled kvm: x86: disable the narrow guest module parameter on unload selftests: kvm: Allows userspace to handle emulation errors. kvm: x86: Allow userspace to handle emulation errors KVM: x86/mmu: Let guest use GBPAGES if supported in hardware and TDP is on KVM: x86/mmu: Get CR4.SMEP from MMU, not vCPU, in shadow page fault KVM: x86/mmu: Get CR0.WP from MMU, not vCPU, in shadow page fault KVM: x86/mmu: Drop redundant rsvd bits reset for nested NPT KVM: x86/mmu: Optimize and clean up so called "last nonleaf level" logic KVM: x86: Enhance comments for MMU roles and nested transition trickiness KVM: x86/mmu: WARN on any reserved SPTE value when making a valid SPTE KVM: x86/mmu: Add helpers to do full reserved SPTE checks w/ generic MMU KVM: x86/mmu: Use MMU's role to determine PTTYPE KVM: x86/mmu: Collapse 32-bit PAE and 64-bit statements for helpers KVM: x86/mmu: Add a helper to calculate root from role_regs KVM: x86/mmu: Add helper to update paging metadata KVM: x86/mmu: Don't update nested guest's paging bitmasks if CR0.PG=0 KVM: x86/mmu: Consolidate reset_rsvds_bits_mask() calls KVM: x86/mmu: Use MMU role_regs to get LA57, and drop vCPU LA57 helper KVM: x86/mmu: Get nested MMU's root level from the MMU's role ...
2021-06-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller6-255/+74
Daniel Borkmann says: ==================== pull-request: bpf-next 2021-06-28 The following pull-request contains BPF updates for your *net-next* tree. We've added 37 non-merge commits during the last 12 day(s) which contain a total of 56 files changed, 394 insertions(+), 380 deletions(-). The main changes are: 1) XDP driver RCU cleanups, from Toke Høiland-Jørgensen and Paul E. McKenney. 2) Fix bpf_skb_change_proto() IPv4/v6 GSO handling, from Maciej Żenczykowski. 3) Fix false positive kmemleak report for BPF ringbuf alloc, from Rustam Kovhaev. 4) Fix x86 JIT's extable offset calculation for PROBE_LDX NULL, from Ravi Bangoria. 5) Enable libbpf fallback probing with tracing under RHEL7, from Jonathan Edwards. 6) Clean up x86 JIT to remove unused cnt tracking from EMIT macro, from Jiri Olsa. 7) Netlink cleanups for libbpf to please Coverity, from Kumar Kartikeya Dwivedi. 8) Allow to retrieve ancestor cgroup id in tracing programs, from Namhyung Kim. 9) Fix lirc BPF program query to use user-provided prog_cnt, from Sean Young. 10) Add initial libbpf doc including generated kdoc for its API, from Grant Seltzer. 11) Make xdp_rxq_info_unreg_mem_model() more robust, from Jakub Kicinski. 12) Fix up bpfilter startup log-level to info level, from Gary Lin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29Merge tag 'arm64-upstream' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "There's a reasonable amount here and the juicy details are all below. It's worth noting that the MTE/KASAN changes strayed outside of our usual directories due to core mm changes and some associated changes to some other architectures; Andrew asked for us to carry these [1] rather that take them via the -mm tree. Summary: - Optimise SVE switching for CPUs with 128-bit implementations. - Fix output format from SVE selftest. - Add support for versions v1.2 and 1.3 of the SMC calling convention. - Allow Pointer Authentication to be configured independently for kernel and userspace. - PMU driver cleanups for managing IRQ affinity and exposing event attributes via sysfs. - KASAN optimisations for both hardware tagging (MTE) and out-of-line software tagging implementations. - Relax frame record alignment requirements to facilitate 8-byte alignment with KASAN and Clang. - Cleanup of page-table definitions and removal of unused memory types. - Reduction of ARCH_DMA_MINALIGN back to 64 bytes. - Refactoring of our instruction decoding routines and addition of some missing encodings. - Move entry code moved into C and hardened against harmful compiler instrumentation. - Update booting requirements for the FEAT_HCX feature, added to v8.7 of the architecture. - Fix resume from idle when pNMI is being used. - Additional CPU sanity checks for MTE and preparatory changes for systems where not all of the CPUs support 32-bit EL0. - Update our kernel string routines to the latest Cortex Strings implementation. - Big cleanup of our cache maintenance routines, which were confusingly named and inconsistent in their implementations. - Tweak linker flags so that GDB can understand vmlinux when using RELR relocations. - Boot path cleanups to enable early initialisation of per-cpu operations needed by KCSAN. - Non-critical fixes and miscellaneous cleanup" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (150 commits) arm64: tlb: fix the TTL value of tlb_get_level arm64: Restrict undef hook for cpufeature registers arm64/mm: Rename ARM64_SWAPPER_USES_SECTION_MAPS arm64: insn: avoid circular include dependency arm64: smp: Bump debugging information print down to KERN_DEBUG drivers/perf: fix the missed ida_simple_remove() in ddr_perf_probe() perf/arm-cmn: Fix invalid pointer when access dtc object sharing the same IRQ number arm64: suspend: Use cpuidle context helpers in cpu_suspend() PSCI: Use cpuidle context helpers in psci_cpu_suspend_enter() arm64: Convert cpu_do_idle() to using cpuidle context helpers arm64: Add cpuidle context save/restore helpers arm64: head: fix code comments in set_cpu_boot_mode_flag arm64: mm: drop unused __pa(__idmap_text_start) arm64: mm: fix the count comments in compute_indices arm64/mm: Fix ttbr0 values stored in struct thread_info for software-pan arm64: mm: Pass original fault address to handle_mm_fault() arm64/mm: Drop SECTION_[SHIFT|SIZE|MASK] arm64/mm: Use CONT_PMD_SHIFT for ARM64_MEMSTART_SHIFT arm64/mm: Drop SWAPPER_INIT_MAP_SIZE arm64: Conditionally configure PTR_AUTH key of the kernel. ...
2021-06-28Merge tag 'x86-asm-2021-06-28' of ↵Linus Torvalds2-14/+203
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: - Micro-optimize and standardize the do_syscall_64() calling convention - Make syscall entry flags clearing more conservative - Clean up syscall table handling - Clean up & standardize assembly macros, in preparation of FRED - Misc cleanups and fixes * tag 'x86-asm-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Make <asm/asm.h> valid on cross-builds as well x86/regs: Syscall_get_nr() returns -1 for a non-system call x86/entry: Split PUSH_AND_CLEAR_REGS into two submacros x86/syscall: Maximize MSR_SYSCALL_MASK x86/syscall: Unconditionally prototype {ia32,x32}_sys_call_table[] x86/entry: Reverse arguments to do_syscall_64() x86/entry: Unify definitions from <asm/calling.h> and <asm/ptrace-abi.h> x86/asm: Use _ASM_BYTES() in <asm/nops.h> x86/asm: Add _ASM_BYTES() macro for a .byte ... opcode sequence x86/asm: Have the __ASM_FORM macros handle commas in arguments
2021-06-28selftests/seccomp: Avoid using "sysctl" for reportKees Cook1-2/+6
Instead of depending on "sysctl" being installed, just use "grep -H" for sysctl status reporting. Additionally report kernel version for easier comparisons. Signed-off-by: Kees Cook <keescook@chromium.org>
2021-06-28selftests/seccomp: Flush benchmark outputKees Cook1-0/+2
When running the seccomp benchmark under a test runner, it wouldn't provide any feedback on progress. Set stdout unbuffered. Suggested-by: Will Drewry <wad@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2021-06-28selftests/seccomp: More closely track fds being assignedKees Cook1-7/+12
Since the open fds might not always start at "4" (especially when running under kselftest, etc), start counting from the first assigned fd, rather than using the more permissive EXPECT_GE(fd, 0). Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20210527032948.3730953-1-keescook@chromium.org Reviewed-by: Rodrigo Campos <rodrigo@kinvolk.io> Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-06-28selftests/seccomp: Add test for atomic addfd+sendRodrigo Campos1-0/+38
This just adds a test to verify that when using the new introduced flag to ADDFD, a valid fd is added and returned as the syscall result. Signed-off-by: Rodrigo Campos <rodrigo@kinvolk.io> Signed-off-by: Sargun Dhillon <sargun@sargun.me> Acked-by: Tycho Andersen <tycho@tycho.pizza> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210517193908.3113-5-sargun@sargun.me
2021-06-28Merge tag 'sched-core-2021-06-28' of ↵Linus Torvalds5-0/+362
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler udpates from Ingo Molnar: - Changes to core scheduling facilities: - Add "Core Scheduling" via CONFIG_SCHED_CORE=y, which enables coordinated scheduling across SMT siblings. This is a much requested feature for cloud computing platforms, to allow the flexible utilization of SMT siblings, without exposing untrusted domains to information leaks & side channels, plus to ensure more deterministic computing performance on SMT systems used by heterogenous workloads. There are new prctls to set core scheduling groups, which allows more flexible management of workloads that can share siblings. - Fix task->state access anti-patterns that may result in missed wakeups and rename it to ->__state in the process to catch new abuses. - Load-balancing changes: - Tweak newidle_balance for fair-sched, to improve 'memcache'-like workloads. - "Age" (decay) average idle time, to better track & improve workloads such as 'tbench'. - Fix & improve energy-aware (EAS) balancing logic & metrics. - Fix & improve the uclamp metrics. - Fix task migration (taskset) corner case on !CONFIG_CPUSET. - Fix RT and deadline utilization tracking across policy changes - Introduce a "burstable" CFS controller via cgroups, which allows bursty CPU-bound workloads to borrow a bit against their future quota to improve overall latencies & batching. Can be tweaked via /sys/fs/cgroup/cpu/<X>/cpu.cfs_burst_us. - Rework assymetric topology/capacity detection & handling. - Scheduler statistics & tooling: - Disable delayacct by default, but add a sysctl to enable it at runtime if tooling needs it. Use static keys and other optimizations to make it more palatable. - Use sched_clock() in delayacct, instead of ktime_get_ns(). - Misc cleanups and fixes. * tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits) sched/doc: Update the CPU capacity asymmetry bits sched/topology: Rework CPU capacity asymmetry detection sched/core: Introduce SD_ASYM_CPUCAPACITY_FULL sched_domain flag psi: Fix race between psi_trigger_create/destroy sched/fair: Introduce the burstable CFS controller sched/uclamp: Fix uclamp_tg_restrict() sched/rt: Fix Deadline utilization tracking during policy change sched/rt: Fix RT utilization tracking during policy change sched: Change task_struct::state sched,arch: Remove unused TASK_STATE offsets sched,timer: Use __set_current_state() sched: Add get_current_state() sched,perf,kvm: Fix preemption condition sched: Introduce task_is_running() sched: Unbreak wakeups sched/fair: Age the average idle time sched/cpufreq: Consider reduced CPU capacity in energy calculation sched/fair: Take thermal pressure into account while estimating energy thermal/cpufreq_cooling: Update offline CPUs per-cpu thermal_pressure sched/fair: Return early from update_tg_cfs_load() if delta == 0 ...
2021-06-28Merge tag 'locking-core-2021-06-28' of ↵Linus Torvalds6-18/+430
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - Core locking & atomics: - Convert all architectures to ARCH_ATOMIC: move every architecture to ARCH_ATOMIC, then get rid of ARCH_ATOMIC and all the transitory facilities and #ifdefs. Much reduction in complexity from that series: 63 files changed, 756 insertions(+), 4094 deletions(-) - Self-test enhancements - Futexes: - Add the new FUTEX_LOCK_PI2 ABI, which is a variant that doesn't set FLAGS_CLOCKRT (.e. uses CLOCK_MONOTONIC). [ The temptation to repurpose FUTEX_LOCK_PI's implicit setting of FLAGS_CLOCKRT & invert the flag's meaning to avoid having to introduce a new variant was resisted successfully. ] - Enhance futex self-tests - Lockdep: - Fix dependency path printouts - Optimize trace saving - Broaden & fix wait-context checks - Misc cleanups and fixes. * tag 'locking-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits) locking/lockdep: Correct the description error for check_redundant() futex: Provide FUTEX_LOCK_PI2 to support clock selection futex: Prepare futex_lock_pi() for runtime clock selection lockdep/selftest: Remove wait-type RCU_CALLBACK tests lockdep/selftests: Fix selftests vs PROVE_RAW_LOCK_NESTING lockdep: Fix wait-type for empty stack locking/selftests: Add a selftest for check_irq_usage() lockding/lockdep: Avoid to find wrong lock dep path in check_irq_usage() locking/lockdep: Remove the unnecessary trace saving locking/lockdep: Fix the dep path printing for backwards BFS selftests: futex: Add futex compare requeue test selftests: futex: Add futex wait test seqlock: Remove trailing semicolon in macros locking/lockdep: Reduce LOCKDEP dependency list locking/lockdep,doc: Improve readability of the block matrix locking/atomics: atomic-instrumented: simplify ifdeffery locking/atomic: delete !ARCH_ATOMIC remnants locking/atomic: xtensa: move to ARCH_ATOMIC locking/atomic: sparc: move to ARCH_ATOMIC locking/atomic: sh: move to ARCH_ATOMIC ...
2021-06-28Merge tags 'objtool-urgent-2021-06-28' and 'objtool-core-2021-06-28' of ↵Linus Torvalds8-51/+136
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix and updates from Ingo Molnar: "An ELF format fix for a section flags mismatch bug that breaks kernel tooling such as kpatch-build. The biggest change in this cycle is the new code to handle and rewrite variable sized jump labels - which results in slightly tighter code generation in hot paths, through the use of short(er) NOPs. Also a number of cleanups and fixes, and a change to the generic include/linux/compiler.h to handle a s390 GCC quirk" * tag 'objtool-urgent-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Don't make .altinstructions writable * tag 'objtool-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Improve reloc hash size guestimate instrumentation.h: Avoid using inline asm operand modifiers compiler.h: Avoid using inline asm operand modifiers kbuild: Fix objtool dependency for 'OBJECT_FILES_NON_STANDARD_<obj> := n' objtool: Reflow handle_jump_alt() jump_label/x86: Remove unused JUMP_LABEL_NOP_SIZE jump_label, x86: Allow short NOPs objtool: Provide stats for jump_labels objtool: Rewrite jump_label instructions objtool: Decode jump_entry::key addend jump_label, x86: Emit short JMP jump_label: Free jump_entry::key bit1 for build use jump_label, x86: Add variable length patching support jump_label, x86: Introduce jump_entry_size() jump_label, x86: Improve error when we fail expected text jump_label, x86: Factor out the __jump_table generation jump_label, x86: Strip ASM jump_label support x86, objtool: Dont exclude arch/x86/realmode/ objtool: Rewrite hashtable sizing
2021-06-25Merge tag 'kvmarm-5.14' of ↵Paolo Bonzini76-837/+2262
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for v5.14. - Add MTE support in guests, complete with tag save/restore interface - Reduce the impact of CMOs by moving them in the page-table code - Allow device block mappings at stage-2 - Reduce the footprint of the vmemmap in protected mode - Support the vGIC on dumb systems such as the Apple M1 - Add selftest infrastructure to support multiple configuration and apply that to PMU/non-PMU setups - Add selftests for the debug architecture - The usual crop of PMU fixes
2021-06-25selftests: kvm: Allows userspace to handle emulation errors.Aaron Lewis5-0/+317
This test exercises the feature KVM_CAP_EXIT_ON_EMULATION_FAILURE. When enabled, errors in the in-kernel instruction emulator are forwarded to userspace with the instruction bytes stored in the exit struct for KVM_EXIT_INTERNAL_ERROR. So, when the guest attempts to emulate an 'flds' instruction, which isn't able to be emulated in KVM, instead of failing, KVM sends the instruction to userspace to handle. For this test to work properly the module parameter 'allow_smaller_maxphyaddr' has to be set. Signed-off-by: Aaron Lewis <aaronlewis@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Message-Id: <20210510144834.658457-3-aaronlewis@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25KVM: x86/mmu: Rename "nxe" role bit to "efer_nx" for macro shenanigansSean Christopherson1-2/+2
Rename "nxe" to "efer_nx" so that future macro magic can use the pattern <reg>_<bit> for all CR0, CR4, and EFER bits that included in the role. Using "efer_nx" also makes it clear that the role bit reflects EFER.NX, not the NX bit in the corresponding PTE. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622175739.3610207-25-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25KVM: selftests: Add selftest for KVM statistics data binary interfaceJing Zhang5-0/+256
Add selftest to check KVM stats descriptors validity. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com> Tested-by: Fuad Tabba <tabba@google.com> #arm64 Signed-off-by: Jing Zhang <jingzhangos@google.com> Message-Id: <20210618222709.1858088-7-jingzhangos@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24tools/testing: add a selftest for SO_NETNS_COOKIELorenz Bauer4-1/+64
Make sure that SO_NETNS_COOKIE returns a non-zero value, and that sockets from different namespaces have a distinct cookie value. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-24KVM: sefltests: Add x86-64 test to verify MMU reacts to CPUID updatesSean Christopherson4-0/+152
Add an x86-only test to verify that x86's MMU reacts to CPUID updates that impact the MMU. KVM has had multiple bugs where it fails to reconfigure the MMU after the guest's vCPU model changes. Sadly, this test is effectively limited to shadow paging because the hardware page walk handler doesn't support software disabling of GBPAGES support, and KVM doesn't manually walk the GVA->GPA on faults for performance reasons (doing so would large defeat the benefits of TDP). Don't require !TDP for the tests as there is still value in running the tests with TDP, even though the tests will fail (barring KVM hacks). E.g. KVM should not completely explode if MAXPHYADDR results in KVM using 4-level vs. 5-level paging for the guest. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-20-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Add hugepage support for x86-64Sean Christopherson2-25/+68
Add x86-64 hugepage support in the form of a x86-only variant of virt_pg_map() that takes an explicit page size. To keep things simple, follow the existing logic for 4k pages and disallow creating a hugepage if the upper-level entry is present, even if the desired pfn matches. Opportunistically fix a double "beyond beyond" reported by checkpatch. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-19-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Genericize upper level page table entry structSean Christopherson1-65/+26
In preparation for adding hugepage support, replace "pageMapL4Entry", "pageDirectoryPointerEntry", and "pageDirectoryEntry" with a common "pageUpperEntry", and add a helper to create an upper level entry. All upper level entries have the same layout, using unique structs provides minimal value and requires a non-trivial amount of code duplication. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-18-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Add PTE helper for x86-64 in preparation for hugepagesSean Christopherson1-28/+31
Add a helper to retrieve a PTE pointer given a PFN, address, and level in preparation for adding hugepage support. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-17-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Rename x86's page table "address" to "pfn"Sean Christopherson1-25/+22
Rename the "address" field to "pfn" in x86's page table structs to match reality. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-16-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Add wrapper to allocate page table pageSean Christopherson6-39/+23
Add a helper to allocate a page for use in constructing the guest's page tables. All architectures have identical address and memslot requirements (which appear to be arbitrary anyways). No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-15-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Unconditionally allocate EPT tables in memslot 0Sean Christopherson4-22/+17
Drop the EPTP memslot param from all EPT helpers and shove the hardcoded '0' down to the vm_phy_page_alloc() calls. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-14-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Unconditionally use memslot '0' for page table allocationsSean Christopherson16-35/+31
Drop the memslot param from virt_pg_map() and virt_map() and shove the hardcoded '0' down to the vm_phy_page_alloc() calls. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-13-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Unconditionally use memslot 0 for vaddr allocationsSean Christopherson7-21/+18
Drop the memslot param(s) from vm_vaddr_alloc() now that all callers directly specific '0' as the memslot. Drop the memslot param from virt_pgd_alloc() as well since vm_vaddr_alloc() is its only user. I.e. shove the hardcoded '0' down to the vm_phy_pages_alloc() calls. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Use "standard" min virtual address for CPUID test allocSean Christopherson1-2/+1
Use KVM_UTIL_MIN_ADDR as the minimum for x86-64's CPUID array. The system page size was likely used as the minimum because _something_ had to be provided. Increasing the min from 0x1000 to 0x2000 should have no meaningful impact on the test, and will allow changing vm_vaddr_alloc() to use KVM_UTIL_MIN_VADDR as the default. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-11-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Use alloc page helper for xAPIC IPI testSean Christopherson1-1/+1
Use the common page allocation helper for the xAPIC IPI test, effectively raising the minimum virtual address from 0x1000 to 0x2000. Presumably the test won't explode if it can't get a page at address 0x1000... Cc: Peter Shier <pshier@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Use alloc_page helper for x86-64's GDT/IDT/TSS allocationsSean Christopherson1-6/+4
Switch to the vm_vaddr_alloc_page() helper for x86-64's "kernel" allocations now that the helper uses the same min virtual address as the open coded versions. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-9-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Lower the min virtual address for misc page allocationsSean Christopherson1-1/+1
Reduce the minimum virtual address of page allocations from 0x10000 to KVM_UTIL_MIN_VADDR (0x2000). Both values appear to be completely arbitrary, and reducing the min to KVM_UTIL_MIN_VADDR will allow for additional consolidation of code. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-8-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Add helpers to allocate N pages of virtual memorySean Christopherson6-24/+59
Add wrappers to allocate 1 and N pages of memory using de facto standard values as the defaults for minimum virtual address, data memslot, and page table memslot. Convert all compatible users. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Use "standard" min virtual address for Hyper-V pagesSean Christopherson1-1/+1
Use the de facto standard minimum virtual address for Hyper-V's hcall params page. It's the allocator's job to not double-allocate memory, i.e. there's no reason to force different regions for the params vs. hcall page. This will allow adding a page allocation helper with a "standard" minimum address. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Unconditionally use memslot 0 for x86's GDT/TSS setupSean Christopherson1-10/+8
Refactor x86's GDT/TSS allocations to for memslot '0' at its vm_addr_alloc() call sites instead of passing in '0' from on high. This is a step toward using a common helper for allocating pages. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Unconditionally use memslot 0 when loading elf binarySean Christopherson6-10/+7
Use memslot '0' for all vm_vaddr_alloc() calls when loading the test binary. This is the first step toward adding a helper to handle page allocations with a default value for the target memslot. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24KVM: selftests: Zero out the correct page in the Hyper-V features testSean Christopherson1-1/+1
Fix an apparent copy-paste goof in hyperv_features where hcall_page (which is two pages, so technically just the first page) gets zeroed twice, and hcall_params gets zeroed none times. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210622200529.3650424-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>