summaryrefslogtreecommitdiff
path: root/tools/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2024-04-08Merge tag 'fixes-2024-04-08' of ↵Linus Torvalds3-0/+25
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock fixes from Mike Rapoport: "Fix build errors in memblock tests: - add stubs to functions that calls to them were recently added to memblock but they were missing in tests - update gfp_types.h to include bits.h so that BIT() definitions won't depend on other includes" * tag 'fixes-2024-04-08' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: memblock tests: fix undefined reference to `BIT' memblock tests: fix undefined reference to `panic' memblock tests: fix undefined reference to `early_pfn_to_nid'
2024-04-04memblock tests: fix undefined reference to `panic'Wei Yang2-0/+20
commit e96c6b8f212a ("memblock: report failures when memblock_can_resize is not set") introduced the usage of panic, which is not defined in memblock test. Let's define it directly in panic.h to fix it. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> CC: Song Shuai <songshuaishuai@tinylab.org> CC: Mike Rapoport <rppt@kernel.org> Link: https://lore.kernel.org/r/20240402132701.29744-3-richard.weiyang@gmail.com Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
2024-04-04memblock tests: fix undefined reference to `early_pfn_to_nid'Wei Yang1-0/+5
commit 6a9531c3a880 ("memblock: fix crash when reserved memory is not added to memory") introduce the usage of early_pfn_to_nid, which is not defined in memblock tests. The original definition of early_pfn_to_nid is defined in mm.h, so let add this in the corresponding mm.h. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> CC: Yajun Deng <yajun.deng@linux.dev> CC: Mike Rapoport <rppt@kernel.org> Link: https://lore.kernel.org/r/20240402132701.29744-2-richard.weiyang@gmail.com Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
2024-03-29tools/resolve_btfids: fix build with musl libcNatanael Copa1-0/+2
Include the header that defines u32. This fixes build of 6.6.23 and 6.1.83 kernels for Alpine Linux, which uses musl libc. I assume that GNU libc indirecly pulls in linux/types.h. Fixes: 9707ac4fe2f5 ("tools/resolve_btfids: Refactor set sorting with types from btf_ids.h") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218647 Cc: stable@vger.kernel.org Signed-off-by: Natanael Copa <ncopa@alpinelinux.org> Tested-by: Greg Thelen <gthelen@google.com> Link: https://lore.kernel.org/r/20240328110103.28734-1-ncopa@alpinelinux.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-22Merge tag 'loongarch-6.9' of ↵Linus Torvalds1-0/+11
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Add objtool support for LoongArch - Add ORC stack unwinder support for LoongArch - Add kernel livepatching support for LoongArch - Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig - Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig - Some bug fixes and other small changes * tag 'loongarch-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch/crypto: Clean up useless assignment operations LoongArch: Define the __io_aw() hook as mmiowb() LoongArch: Remove superfluous flush_dcache_page() definition LoongArch: Move {dmw,tlb}_virt_to_page() definition to page.h LoongArch: Change __my_cpu_offset definition to avoid mis-optimization LoongArch: Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig LoongArch: Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig LoongArch: Add kernel livepatching support LoongArch: Add ORC stack unwinder support objtool: Check local label in read_unwind_hints() objtool: Check local label in add_dead_ends() objtool/LoongArch: Enable orc to be built objtool/x86: Separate arch-specific and generic parts objtool/LoongArch: Implement instruction decoder objtool/LoongArch: Enable objtool to be built
2024-03-13Merge tag 'slab-for-6.9' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab updates from Vlastimil Babka: - Freelist loading optimization (Chengming Zhou) When the per-cpu slab is depleted and a new one loaded from the cpu partial list, optimize the loading to avoid an irq enable/disable cycle. This results in a 3.5% performance improvement on the "perf bench sched messaging" test. - Kernel boot parameters cleanup after SLAB removal (Xiongwei Song) Due to two different main slab implementations we've had boot parameters prefixed either slab_ and slub_ with some later becoming an alias as both implementations gained the same functionality (i.e. slab_nomerge vs slub_nomerge). In order to eventually get rid of the implementation-specific names, the canonical and documented parameters are now all prefixed slab_ and the slub_ variants become deprecated but still working aliases. - SLAB_ kmem_cache creation flags cleanup (Vlastimil Babka) The flags had hardcoded #define values which became tedious and error-prone when adding new ones. Assign the values via an enum that takes care of providing unique bit numbers. Also deprecate SLAB_MEM_SPREAD which was only used by SLAB, so it's a no-op since SLAB removal. Assign it an explicit zero value. The removals of the flag usage are handled independently in the respective subsystems, with a final removal of any leftover usage planned for the next release. - Misc cleanups and fixes (Chengming Zhou, Xiaolei Wang, Zheng Yejian) Includes removal of unused code or function parameters and a fix of a memleak. * tag 'slab-for-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: slab: remove PARTIAL_NODE slab_state mm, slab: remove memcg_from_slab_obj() mm, slab: remove the corner case of inc_slabs_node() mm/slab: Fix a kmemleak in kmem_cache_destroy() mm, slab, kasan: replace kasan_never_merge() with SLAB_NO_MERGE mm, slab: use an enum to define SLAB_ cache creation flags mm, slab: deprecate SLAB_MEM_SPREAD flag mm, slab: fix the comment of cpu partial list mm, slab: remove unused object_size parameter in kmem_cache_flags() mm/slub: remove parameter 'flags' in create_kmalloc_caches() mm/slub: remove unused parameter in next_freelist_entry() mm/slub: remove full list manipulation for non-debug slab mm/slub: directly load freelist from cpu partial slab in the likely case mm/slub: make the description of slab_min_objects helpful in doc mm/slub: replace slub_$params with slab_$params in slub.rst mm/slub: unify all sl[au]b parameters with "slab_$param" Documentation: kernel-parameters: remove noaliencache
2024-03-11objtool/LoongArch: Implement instruction decoderTiezhu Yang1-0/+11
Only copy the minimal definitions of instruction opcodes and formats in inst.h from arch/loongarch to tools/arch/loongarch, and also copy the definition of sign_extend64() to tools/include/linux/bitops.h to decode the following kinds of instructions: (1) stack pointer related instructions addi.d, ld.d, st.d, ldptr.d and stptr.d (2) branch and jump related instructions beq, bne, blt, bge, bltu, bgeu, beqz, bnez, bceqz, bcnez, b, bl and jirl (3) other instructions break, nop and ertn See more info about instructions in LoongArch Reference Manual: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html Co-developed-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Co-developed-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-03-05slab: remove PARTIAL_NODE slab_stateChengming Zhou1-1/+0
The PARTIAL_NODE slab_state has gone with SLAB removed, so just remove it. Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-03-03Merge tag 'for-netdev' of ↵Jakub Kicinski1-0/+9
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-02-29 We've added 119 non-merge commits during the last 32 day(s) which contain a total of 150 files changed, 3589 insertions(+), 995 deletions(-). The main changes are: 1) Extend the BPF verifier to enable static subprog calls in spin lock critical sections, from Kumar Kartikeya Dwivedi. 2) Fix confusing and incorrect inference of PTR_TO_CTX argument type in BPF global subprogs, from Andrii Nakryiko. 3) Larger batch of riscv BPF JIT improvements and enabling inlining of the bpf_kptr_xchg() for RV64, from Pu Lehui. 4) Allow skeleton users to change the values of the fields in struct_ops maps at runtime, from Kui-Feng Lee. 5) Extend the verifier's capabilities of tracking scalars when they are spilled to stack, especially when the spill or fill is narrowing, from Maxim Mikityanskiy & Eduard Zingerman. 6) Various BPF selftest improvements to fix errors under gcc BPF backend, from Jose E. Marchesi. 7) Avoid module loading failure when the module trying to register a struct_ops has its BTF section stripped, from Geliang Tang. 8) Annotate all kfuncs in .BTF_ids section which eventually allows for automatic kfunc prototype generation from bpftool, from Daniel Xu. 9) Several updates to the instruction-set.rst IETF standardization document, from Dave Thaler. 10) Shrink the size of struct bpf_map resp. bpf_array, from Alexei Starovoitov. 11) Initial small subset of BPF verifier prepwork for sleepable bpf_timer, from Benjamin Tissoires. 12) Fix bpftool to be more portable to musl libc by using POSIX's basename(), from Arnaldo Carvalho de Melo. 13) Add libbpf support to gcc in CORE macro definitions, from Cupertino Miranda. 14) Remove a duplicate type check in perf_event_bpf_event, from Florian Lehner. 15) Fix bpf_spin_{un,}lock BPF helpers to actually annotate them with notrace correctly, from Yonghong Song. 16) Replace the deprecated bpf_lpm_trie_key 0-length array with flexible array to fix build warnings, from Kees Cook. 17) Fix resolve_btfids cross-compilation to non host-native endianness, from Viktor Malik. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (119 commits) selftests/bpf: Test if shadow types work correctly. bpftool: Add an example for struct_ops map and shadow type. bpftool: Generated shadow variables for struct_ops maps. libbpf: Convert st_ops->data to shadow type. libbpf: Set btf_value_type_id of struct bpf_map for struct_ops. bpf: Replace bpf_lpm_trie_key 0-length array with flexible array bpf, arm64: use bpf_prog_pack for memory management arm64: patching: implement text_poke API bpf, arm64: support exceptions arm64: stacktrace: Implement arch_bpf_stack_walk() for the BPF JIT bpf: add is_async_callback_calling_insn() helper bpf: introduce in_sleepable() helper bpf: allow more maps in sleepable bpf programs selftests/bpf: Test case for lacking CFI stub functions. bpf: Check cfi_stubs before registering a struct_ops type. bpf: Clarify batch lookup/lookup_and_delete semantics bpf, docs: specify which BPF_ABS and BPF_IND fields were zero bpf, docs: Fix typos in instruction-set.rst selftests/bpf: update tcp_custom_syncookie to use scalar packet offset bpf: Shrink size of struct bpf_map/bpf_array. ... ==================== Link: https://lore.kernel.org/r/20240301001625.8800-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-10work around gcc bugs with 'asm goto' with outputsLinus Torvalds1-2/+2
We've had issues with gcc and 'asm goto' before, and we created a 'asm_volatile_goto()' macro for that in the past: see commits 3f0116c3238a ("compiler/gcc4: Add quirk for 'asm goto' miscompilation bug") and a9f180345f53 ("compiler/gcc4: Make quirk for asm_volatile_goto() unconditional"). Then, much later, we ended up removing the workaround in commit 43c249ea0b1e ("compiler-gcc.h: remove ancient workaround for gcc PR 58670") because we no longer supported building the kernel with the affected gcc versions, but we left the macro uses around. Now, Sean Christopherson reports a new version of a very similar problem, which is fixed by re-applying that ancient workaround. But the problem in question is limited to only the 'asm goto with outputs' cases, so instead of re-introducing the old workaround as-is, let's rename and limit the workaround to just that much less common case. It looks like there are at least two separate issues that all hit in this area: (a) some versions of gcc don't mark the asm goto as 'volatile' when it has outputs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420 which is easy to work around by just adding the 'volatile' by hand. (b) Internal compiler errors: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422 which are worked around by adding the extra empty 'asm' as a barrier, as in the original workaround. but the problem Sean sees may be a third thing since it involves bad code generation (not an ICE) even with the manually added 'volatile'. but the same old workaround works for this case, even if this feels a bit like voodoo programming and may only be hiding the issue. Reported-and-tested-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/ Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: Jakub Jelinek <jakub@redhat.com> Cc: Andrew Pinski <quic_apinski@quicinc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-02-08tools/resolve_btfids: Refactor set sorting with types from btf_ids.hViktor Malik1-0/+9
Instead of using magic offsets to access BTF ID set data, leverage types from btf_ids.h (btf_id_set and btf_id_set8) which define the actual layout of the data. Thanks to this change, set sorting should also continue working if the layout changes. This requires to sync the definition of 'struct btf_id_set8' from include/linux/btf_ids.h to tools/include/linux/btf_ids.h. We don't sync the rest of the file at the moment, b/c that would require to also sync multiple dependent headers and we don't need any other defs from btf_ids.h. Signed-off-by: Viktor Malik <vmalik@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/bpf/ff7f062ddf6a00815fda3087957c4ce667f50532.1707223196.git.vmalik@redhat.com
2023-12-11maple_tree: update check_forking() and bench_forking()Peng Zhang1-0/+4
Updated check_forking() and bench_forking() to use __mt_dup() to duplicate maple tree. Link: https://lkml.kernel.org/r/20231027033845.90608-9-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-11maple_tree: introduce {mtree,mas}_lock_nested()Peng Zhang1-0/+1
In some cases, nested locks may be needed, so {mtree,mas}_lock_nested is introduced. For example, when duplicating maple tree, we need to hold the locks of two trees, in which case nested locks are needed. At the same time, add the definition of spin_lock_nested() in tools for testing. Link: https://lkml.kernel.org/r/20231027033845.90608-3-zhangpeng.00@bytedance.com Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Mike Christie <michael.christie@oracle.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-11-03Merge tag 'mm-nonmm-stable-2023-11-02-14-08' of ↵Linus Torvalds2-8/+8
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "As usual, lots of singleton and doubleton patches all over the tree and there's little I can say which isn't in the individual changelogs. The lengthier patch series are - 'kdump: use generic functions to simplify crashkernel reservation in arch', from Baoquan He. This is mainly cleanups and consolidation of the 'crashkernel=' kernel parameter handling - After much discussion, David Laight's 'minmax: Relax type checks in min() and max()' is here. Hopefully reduces some typecasting and the use of min_t() and max_t() - A group of patches from Oleg Nesterov which clean up and slightly fix our handling of reads from /proc/PID/task/... and which remove task_struct.thread_group" * tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (64 commits) scripts/gdb/vmalloc: disable on no-MMU scripts/gdb: fix usage of MOD_TEXT not defined when CONFIG_MODULES=n .mailmap: add address mapping for Tomeu Vizoso mailmap: update email address for Claudiu Beznea tools/testing/selftests/mm/run_vmtests.sh: lower the ptrace permissions .mailmap: map Benjamin Poirier's address scripts/gdb: add lx_current support for riscv ocfs2: fix a spelling typo in comment proc: test ProtectionKey in proc-empty-vm test proc: fix proc-empty-vm test with vsyscall fs/proc/base.c: remove unneeded semicolon do_io_accounting: use sig->stats_lock do_io_accounting: use __for_each_thread() ocfs2: replace BUG_ON() at ocfs2_num_free_extents() with ocfs2_error() ocfs2: fix a typo in a comment scripts/show_delta: add __main__ judgement before main code treewide: mark stuff as __ro_after_init fs: ocfs2: check status values proc: test /proc/${pid}/statm compiler.h: move __is_constexpr() to compiler.h ...
2023-10-19compiler.h: move __is_constexpr() to compiler.hDavid Laight2-8/+8
Prior to f747e6667ebb2 __is_constexpr() was in its only user minmax.h. That commit moved it to const.h - but that file just defines ULL(x) and UL(x) so that constants can be defined for .S and .c files. So apart from the word 'const' it wasn't really a good location. Instead move the definition to compiler.h just before the similar is_signed_type() and is_unsigned_type(). This may not be a good long-term home, but the three definitions belong together. Link: https://lkml.kernel.org/r/2a6680bbe2e84459816a113730426782@AcuMS.aculab.com Signed-off-by: David Laight <david.laight@aculab.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-10-18maple_tree: add GFP_KERNEL to allocations in mas_expected_entries()Liam R. Howlett1-0/+40
Users complained about OOM errors during fork without triggering compaction. This can be fixed by modifying the flags used in mas_expected_entries() so that the compaction will be triggered in low memory situations. Since mas_expected_entries() is only used during fork, the extra argument does not need to be passed through. Additionally, the two test_maple_tree test cases and one benchmark test were altered to use the correct locking type so that allocations would not trigger sleeping and thus fail. Testing was completed with lockdep atomic sleep detection. The additional locking change requires rwsem support additions to the tools/ directory through the use of pthreads pthread_rwlock_t. With this change test_maple_tree works in userspace, as a module, and in-kernel. Users may notice that the system gave up early on attempting to start new processes instead of attempting to reclaim memory. Link: https://lkml.kernel.org/r/20230915093243epcms1p46fa00bbac1ab7b7dca94acb66c44c456@epcms1p4 Link: https://lkml.kernel.org/r/20231012155233.2272446-1-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com> Cc: <jason.sim@samsung.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-09-21Merge tag 'net-6.6-rc3' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter and bpf. Current release - regressions: - bpf: adjust size_index according to the value of KMALLOC_MIN_SIZE - netfilter: fix entries val in rule reset audit log - eth: stmmac: fix incorrect rxq|txq_stats reference Previous releases - regressions: - ipv4: fix null-deref in ipv4_link_failure - netfilter: - fix several GC related issues - fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAP - eth: team: fix null-ptr-deref when team device type is changed - eth: i40e: fix VF VLAN offloading when port VLAN is configured - eth: ionic: fix 16bit math issue when PAGE_SIZE >= 64KB Previous releases - always broken: - core: fix ETH_P_1588 flow dissector - mptcp: fix several connection hang-up conditions - bpf: - avoid deadlock when using queue and stack maps from NMI - add override check to kprobe multi link attach - hsr: properly parse HSRv1 supervisor frames. - eth: igc: fix infinite initialization loop with early XDP redirect - eth: octeon_ep: fix tx dma unmap len values in SG - eth: hns3: fix GRE checksum offload issue" * tag 'net-6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits) sfc: handle error pointers returned by rhashtable_lookup_get_insert_fast() igc: Expose tx-usecs coalesce setting to user octeontx2-pf: Do xdp_do_flush() after redirects. bnxt_en: Flush XDP for bnxt_poll_nitroa0()'s NAPI net: ena: Flush XDP packets on error. net/handshake: Fix memory leak in __sock_create() and sock_alloc_file() net: hinic: Fix warning-hinic_set_vlan_fliter() warn: variable dereferenced before check 'hwdev' netfilter: ipset: Fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAP netfilter: nf_tables: fix memleak when more than 255 elements expired netfilter: nf_tables: disable toggling dormant table state more than once vxlan: Add missing entries to vxlan_get_size() net: rds: Fix possible NULL-pointer dereference team: fix null-ptr-deref when team device type is changed net: bridge: use DEV_STATS_INC() net: hns3: add 5ms delay before clear firmware reset irq source net: hns3: fix fail to delete tc flower rules during reset issue net: hns3: only enable unicast promisc when mac table full net: hns3: fix GRE checksum offload issue net: hns3: add cmdq check for vf periodic service task net: stmmac: fix incorrect rxq|txq_stats reference ...
2023-09-15bpf: Fix BTF_ID symbol generation collision in tools/Nick Desaulniers1-1/+1
Marcus and Satya reported an issue where BTF_ID macro generates same symbol in separate objects and that breaks final vmlinux link. ld.lld: error: ld-temp.o <inline asm>:14577:1: symbol '__BTF_ID__struct__cgroup__624' is already defined This can be triggered under specific configs when __COUNTER__ happens to be the same for the same symbol in two different translation units, which is already quite unlikely to happen. Add __LINE__ number suffix to make BTF_ID symbol more unique, which is not a complete fix, but it would help for now and meanwhile we can work on better solution as suggested by Andrii. Cc: stable@vger.kernel.org Reported-by: Satya Durga Srinivasu Prabhala <quic_satyap@quicinc.com> Reported-by: Marcus Seyfarth <m.seyfarth@gmail.com> Closes: https://github.com/ClangBuiltLinux/linux/issues/1913 Debugged-by: Nathan Chancellor <nathan@kernel.org> Co-developed-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/CAEf4Bzb5KQ2_LmhN769ifMeSJaWfebccUasQOfQKaOd0nQ51tw@mail.gmail.com/ Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20230915-bpf_collision-v3-2-263fc519c21f@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-09-14memblock tests: fix warning ‘struct seq_file’ declared inside parameter listMike Rapoport (IBM)1-0/+2
Building memblock tests produces the following warning: cc -I. -I../../include -Wall -O2 -fsanitize=address -fsanitize=undefined -D CONFIG_PHYS_ADDR_T_64BIT -c -o main.o main.c In file included from tests/common.h:9, from tests/basic_api.h:5, from main.c:2: ./linux/memblock.h:601:50: warning: ‘struct seq_file’ declared inside parameter list will not be visible outside of this definition or declaration 601 | static inline void memtest_report_meminfo(struct seq_file *m) { } | ^~~~~~~~ Add declaration of 'struct seq_file' to tools/include/linux/seq_file.h to fix it. Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
2023-09-14memblock tests: fix warning: "__ALIGN_KERNEL" redefinedMike Rapoport (IBM)1-2/+0
Building memblock tests produces the following warning: cc -I. -I../../include -Wall -O2 -fsanitize=address -fsanitize=undefined -D CONFIG_PHYS_ADDR_T_64BIT -c -o main.o main.c In file included from ../../include/linux/pfn.h:5, from ./linux/memory_hotplug.h:6, from ./linux/init.h:7, from ./linux/memblock.h:11, from tests/common.h:8, from tests/basic_api.h:5, from main.c:2: ../../include/linux/mm.h:14: warning: "__ALIGN_KERNEL" redefined 14 | #define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (typeof(x))(a) - 1) | In file included from ../../include/linux/mm.h:6, from ../../include/linux/pfn.h:5, from ./linux/memory_hotplug.h:6, from ./linux/init.h:7, from ./linux/memblock.h:11, from tests/common.h:8, from tests/basic_api.h:5, from main.c:2: ../../include/uapi/linux/const.h:31: note: this is the location of the previous definition 31 | #define __ALIGN_KERNEL(x, a) __ALIGN_KERNEL_MASK(x, (__typeof__(x))(a) - 1) | Remove definitions of __ALIGN_KERNEL and __ALIGN_KERNEL_MASK from tools/include/linux/mm.h to fix it. Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
2023-09-14memblock tests: Fix compilation errors.Rong Tao1-1/+1
This patch fix the follow errors. commit 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()") pass nid parameter to reserve_bootmem_region(), $ make -C tools/testing/memblock/ ... memblock.c: In function ‘memmap_init_reserved_pages’: memblock.c:2111:25: error: too many arguments to function ‘reserve_bootmem_region’ 2111 | reserve_bootmem_region(start, end, nid); | ^~~~~~~~~~~~~~~~~~~~~~ ../../include/linux/mm.h:32:6: note: declared here 32 | void reserve_bootmem_region(phys_addr_t start, phys_addr_t end); | ^~~~~~~~~~~~~~~~~~~~~~ memblock.c:2122:17: error: too many arguments to function ‘reserve_bootmem_region’ 2122 | reserve_bootmem_region(start, end, nid); | ^~~~~~~~~~~~~~~~~~~~~~ commit dcdfdd40fa82 ("mm: Add support for unaccepted memory") call accept_memory() in memblock.c $ make -C tools/testing/memblock/ ... cc -fsanitize=address -fsanitize=undefined main.o memblock.o \ lib/slab.o mmzone.o slab.o tests/alloc_nid_api.o \ tests/alloc_helpers_api.o tests/alloc_api.o tests/basic_api.o \ tests/common.o tests/alloc_exact_nid_api.o -o main /usr/bin/ld: memblock.o: in function `memblock_alloc_range_nid': memblock.c:(.text+0x7ae4): undefined reference to `accept_memory' Signed-off-by: Rong Tao <rongtao@cestc.cn> Fixes: dcdfdd40fa82 ("mm: Add support for unaccepted memory") Fixes: 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()") Link: https://lore.kernel.org/r/tencent_6F19BC082167F15DF2A8D8BEFE8EF220F60A@qq.com Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
2023-08-04tools include: Add some common function attributesMark Brown1-0/+12
We don't have definitions of __always_unused or __noreturn in the tools version of compiler.h, add them so we can use them in kselftests. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230728-arm64-signal-memcpy-fix-v4-3-0c1290db5d46@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-08-04tools compiler.h: Add OPTIMIZER_HIDE_VAR()Mark Brown1-0/+6
Port over the definition of OPTIMIZER_HIDE_VAR() so we can use it in kselftests. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230728-arm64-signal-memcpy-fix-v4-2-0c1290db5d46@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-05-26perf cs-etm: Copy kernel coresight-pmu.h headerJames Clark1-13/+0
Copy the kernel version of the header to fix the header diff build warning. Some new definitions were only added to the tools side header, but these are only used in Perf so move them to a different header. Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: Mike Leach <mike.leach@linaro.org> Link: https://lore.kernel.org/r/20230522102604.1081416-1-james.clark@arm.com Cc: Mark Rutland <mark.rutland@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: acme@kernel.org Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: coresight@lists.linaro.org Cc: siyanteng@loongson.cn Cc: John Garry <john.g.garry@oracle.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-07Merge tag 'perf-tools-for-v6.4-3-2023-05-06' of ↵Linus Torvalds3-17/+40
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tool updates from Arnaldo Carvalho de Melo: "Third version of perf tool updates, with the build problems with with using a 'vmlinux.h' generated from the main build fixed, and the bpf skeleton build disabled by default. Build: - Require libtraceevent to build, one can disable it using NO_LIBTRACEEVENT=1. It is required for tools like 'perf sched', 'perf kvm', 'perf trace', etc. libtraceevent is available in most distros so installing 'libtraceevent-devel' should be a one-time event to continue building perf as usual. Using NO_LIBTRACEEVENT=1 produces tooling that is functional and sufficient for lots of users not interested in those libtraceevent dependent features. - Allow Python support in 'perf script' when libtraceevent isn't linked, as not all features requires it, for instance Intel PT does not use tracepoints. - Error if the python interpreter needed for jevents to work isn't available and NO_JEVENTS=1 isn't set, preventing a build without support for JSON vendor events, which is a rare but possible condition. The two check error messages: $(error ERROR: No python interpreter needed for jevents generation. Install python or build with NO_JEVENTS=1.) $(error ERROR: Python interpreter needed for jevents generation too old (older than 3.6). Install a newer python or build with NO_JEVENTS=1.) - Make libbpf 1.0 the minimum required when building with out of tree, distro provided libbpf. - Use libsdtc++'s and LLVM's libcxx's __cxa_demangle, a portable C++ demangler, add 'perf test' entry for it. - Make binutils libraries opt in, as distros disable building with it due to licensing, they were used for C++ demangling, for instance. - Switch libpfm4 to opt-out rather than opt-in, if libpfm-devel (or equivalent) isn't installed, we'll just have a build warning: Makefile.config:1144: libpfm4 not found, disables libpfm4 support. Please install libpfm4-dev - Add a feature test for scandirat(), that is not implemented so far in musl and uclibc, disabling features that need it, such as scanning for tracepoints in /sys/kernel/tracing/events. perf BPF filters: - New feature where BPF can be used to filter samples, for instance: $ sudo ./perf record -e cycles --filter 'period > 1000' true $ sudo ./perf script perf-exec 2273949 546850.708501: 5029 cycles: ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms]) perf-exec 2273949 546850.708508: 32409 cycles: ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms]) perf-exec 2273949 546850.708526: 143369 cycles: ffffffff82b4cdbf xas_start+0x5f ([kernel.kallsyms]) perf-exec 2273949 546850.708600: 372650 cycles: ffffffff8286b8f7 __pagevec_lru_add+0x117 ([kernel.kallsyms]) perf-exec 2273949 546850.708791: 482953 cycles: ffffffff829190de __mod_memcg_lruvec_state+0x4e ([kernel.kallsyms]) true 2273949 546850.709036: 501985 cycles: ffffffff828add7c tlb_gather_mmu+0x4c ([kernel.kallsyms]) true 2273949 546850.709292: 503065 cycles: 7f2446d97c03 _dl_map_object_deps+0x973 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) - In addition to 'period' (PERF_SAMPLE_PERIOD), the other PERF_SAMPLE_ can be used for filtering, and also some other sample accessible values, from tools/perf/Documentation/perf-record.txt: Essentially the BPF filter expression is: <term> <operator> <value> (("," | "||") <term> <operator> <value>)* The <term> can be one of: ip, id, tid, pid, cpu, time, addr, period, txn, weight, phys_addr, code_pgsz, data_pgsz, weight1, weight2, weight3, ins_lat, retire_lat, p_stage_cyc, mem_op, mem_lvl, mem_snoop, mem_remote, mem_lock, mem_dtlb, mem_blk, mem_hops The <operator> can be one of: ==, !=, >, >=, <, <=, & The <value> can be one of: <number> (for any term) na, load, store, pfetch, exec (for mem_op) l1, l2, l3, l4, cxl, io, any_cache, lfb, ram, pmem (for mem_lvl) na, none, hit, miss, hitm, fwd, peer (for mem_snoop) remote (for mem_remote) na, locked (for mem_locked) na, l1_hit, l1_miss, l2_hit, l2_miss, any_hit, any_miss, walk, fault (for mem_dtlb) na, by_data, by_addr (for mem_blk) hops0, hops1, hops2, hops3 (for mem_hops) perf lock contention: - Show lock type with address. - Track and show mmap_lock, siglock and per-cpu rq_lock with address. This is done for mmap_lock by following the current->mm pointer: $ sudo ./perf lock con -abl -- sleep 10 contended total wait max wait avg wait address symbol ... 16344 312.30 ms 2.22 ms 19.11 us ffff8cc702595640 17686 310.08 ms 1.49 ms 17.53 us ffff8cc7025952c0 3 84.14 ms 45.79 ms 28.05 ms ffff8cc78114c478 mmap_lock 3557 76.80 ms 68.75 us 21.59 us ffff8cc77ca3af58 1 68.27 ms 68.27 ms 68.27 ms ffff8cda745dfd70 9 54.53 ms 7.96 ms 6.06 ms ffff8cc7642a48b8 mmap_lock 14629 44.01 ms 60.00 us 3.01 us ffff8cc7625f9ca0 3481 42.63 ms 140.71 us 12.24 us ffffffff937906ac vmap_area_lock 16194 38.73 ms 42.15 us 2.39 us ffff8cd397cbc560 11 38.44 ms 10.39 ms 3.49 ms ffff8ccd6d12fbb8 mmap_lock 1 5.43 ms 5.43 ms 5.43 ms ffff8cd70018f0d8 1674 5.38 ms 422.93 us 3.21 us ffffffff92e06080 tasklist_lock 581 4.51 ms 130.68 us 7.75 us ffff8cc9b1259058 5 3.52 ms 1.27 ms 703.23 us ffff8cc754510070 112 3.47 ms 56.47 us 31.02 us ffff8ccee38b3120 381 3.31 ms 73.44 us 8.69 us ffffffff93790690 purge_vmap_area_lock 255 3.19 ms 36.35 us 12.49 us ffff8d053ce30c80 - Update default map size to 16384. - Allocate single letter option -M for --map-nr-entries, as it is proving being frequently used. - Fix struct rq lock access for older kernels with BPF's CO-RE (Compile once, run everywhere). - Fix problems found with MSAn. perf report/top: - Add inline information when using --call-graph=fp or lbr, as was already done to the --call-graph=dwarf callchain mode. - Improve the 'srcfile' sort key performance by really using an optimization introduced in 6.2 for the 'srcline' sort key that avoids calling addr2line for comparision with each sample. perf sched: - Make 'perf sched latency/map/replay' to use "sched:sched_waking" instead of "sched:sched_waking", consistent with 'perf record' since d566a9c2d482 ("perf sched: Prefer sched_waking event when it exists"). perf ftrace: - Make system wide the default target for latency subcommand, run the following command then generate some network traffic and press control+C: # perf ftrace latency -T __kfree_skb ^C DURATION | COUNT | GRAPH | 0 - 1 us | 27 | ############# | 1 - 2 us | 22 | ########### | 2 - 4 us | 8 | #### | 4 - 8 us | 5 | ## | 8 - 16 us | 24 | ############ | 16 - 32 us | 2 | # | 32 - 64 us | 1 | | 64 - 128 us | 0 | | 128 - 256 us | 0 | | 256 - 512 us | 0 | | 512 - 1024 us | 0 | | 1 - 2 ms | 0 | | 2 - 4 ms | 0 | | 4 - 8 ms | 0 | | 8 - 16 ms | 0 | | 16 - 32 ms | 0 | | 32 - 64 ms | 0 | | 64 - 128 ms | 0 | | 128 - 256 ms | 0 | | 256 - 512 ms | 0 | | 512 - 1024 ms | 0 | | 1 - ... s | 0 | | # perf top: - Add --branch-history (LBR: Last Branch Record) option, just like already available for 'perf record'. - Fix segfault in thread__comm_len() where thread->comm was being used outside thread->comm_lock. perf annotate: - Allow configuring objdump and addr2line in ~/.perfconfig., so that you can use alternative binaries, such as llvm's. perf kvm: - Add TUI mode for 'perf kvm stat report'. Reference counting: - Add reference count checking infrastructure to check for use after free, done to the 'cpumap', 'namespaces', 'maps' and 'map' structs, more to come. To build with it use -DREFCNT_CHECKING=1 in the make command line to build tools/perf. Documented at: https://perf.wiki.kernel.org/index.php/Reference_Count_Checking - The above caught, for instance, fix, present in this series: - Fix maps use after put in 'perf test "Share thread maps"': 'maps' is copied from leader, but the leader is put on line 79 and then 'maps' is used to read the reference count below - so a use after put, with the put of maps happening within thread__put. Fixed by reversing the order of puts so that the leader is put last. - Also several fixes were made to places where reference counts were not being held. - Make this one of the tests in 'make -C tools/perf build-test' to regularly build test it and to make sure no direct access to the reference counted structs are made, doing that via accessors to check the validity of the struct pointer. ARM64: - Fix 'perf report' segfault when filtering coresight traces by sparse lists of CPUs. - Add support for 'simd' as a sort field for 'perf report', to show ARM's NEON SIMD's predicate flags: "partial" and "empty". arm64 vendor events: - Add N1 metrics. Intel vendor events: - Add graniterapids, grandridge and sierraforrest events. - Refresh events for: alderlake, aldernaken, broadwell, broadwellde, broadwellx, cascadelakx, haswell, haswellx, icelake, icelakex, jaketown, meteorlake, knightslanding, sandybridge, sapphirerapids, silvermont, skylake, tigerlake and westmereep-dp - Refresh metrics for alderlake-n, broadwell, broadwellde, broadwellx, haswell, haswellx, icelakex, ivybridge, ivytown and skylakex. perf stat: - Implement --topdown using JSON metrics. - Add TopdownL1 JSON metric as a default if present, but disable it for now for some Intel hybrid architectures, a series of patches addressing this is being reviewed and will be submitted for v6.5. - Use metrics for --smi-cost. - Update topdown documentation. Vendor events (JSON) infrastructure: - Add support for computing and printing metric threshold values. For instance, here is one found in thesapphirerapids json file: { "BriefDescription": "Percentage of cycles spent in System Management Interrupts.", "MetricExpr": "((msr@aperf@ - cycles) / msr@aperf@ if msr@smi@ > 0 else 0)", "MetricGroup": "smi", "MetricName": "smi_cycles", "MetricThreshold": "smi_cycles > 0.1", "ScaleUnit": "100%" }, - Test parsing metric thresholds with the fake PMU in 'perf test pmu-events'. - Support for printing metric thresholds in 'perf list'. - Add --metric-no-threshold option to 'perf stat'. - Add rand (reverse and) and has_pmem (optane memory) support to metrics. - Sort list of input files to avoid depending on the order from readdir() helping in obtaining reproducible builds. S/390: - Add common metrics: - CPI (cycles per instruction), prbstate (ratio of instructions executed in problem state compared to total number of instructions), l1mp (Level one instruction and data cache misses per 100 instructions). - Add cache metrics for z13, z14, z15 and z16. - Add metric for TLB and cache. ARM: - Add raw decoding for SPE (Statistical Profiling Extension) v1.3 MTE (Memory Tagging Extension) and MOPS (Memory Operations) load/store. Intel PT hardware tracing: - Add event type names UINTR (User interrupt delivered) and UIRET (Exiting from user interrupt routine), documented in table 32-50 "CFE Packet Type and Vector Fields Details" in the Intel Processor Trace chapter of The Intel SDM Volume 3 version 078. - Add support for new branch instructions ERETS and ERETU. - Fix CYC timestamps after standalone CBR ARM CoreSight hardware tracing: - Allow user to override timestamp and contextid settings. - Fix segfault in dso lookup. - Fix timeless decode mode detection. - Add separate decode paths for timeless and per-thread modes. auxtrace: - Fix address filter entire kernel size. Miscellaneous: - Fix use-after-free and unaligned bugs in the PLT handling routines. - Use zfree() to reduce chances of use after free. - Add missing 0x prefix for addresses printed in hexadecimal in 'perf probe'. - Suppress massive unsupported target platform errors in the unwind code. - Fix return incorrect build_id size in elf_read_build_id(). - Fix 'perf scripts intel-pt-events.py' IPC output for Python 2 . - Add missing new parameter in kfree_skb tracepoint to the python scripts using it. - Add 'perf bench syscall fork' benchmark. - Add support for printing PERF_MEM_LVLNUM_UNC (Uncached access) in 'perf mem'. - Fix wrong size expectation for perf test 'Setup struct perf_event_attr' caused by the patch adding perf_event_attr::config3. - Fix some spelling mistakes" * tag 'perf-tools-for-v6.4-3-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (365 commits) Revert "perf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL" Revert "perf build: Warn for BPF skeletons if endian mismatches" perf metrics: Fix SEGV with --for-each-cgroup perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE perf stat: Separate bperf from bpf_profiler perf test record+probe_libc_inet_pton: Fix call chain match on x86_64 perf test record+probe_libc_inet_pton: Fix call chain match on s390 perf tracepoint: Fix memory leak in is_valid_tracepoint() perf cs-etm: Add fix for coresight trace for any range of CPUs perf build: Fix unescaped # in perf build-test perf unwind: Suppress massive unsupported target platform errors perf script: Add new parameter in kfree_skb tracepoint to the python scripts using it perf script: Print raw ip instead of binary offset for callchain perf symbols: Fix return incorrect build_id size in elf_read_build_id() perf list: Modify the warning message about scandirat(3) perf list: Fix memory leaks in print_tracepoint_events() perf lock contention: Rework offset calculation with BPF CO-RE perf lock contention: Fix struct rq lock access perf stat: Disable TopdownL1 on hybrid perf stat: Avoid SEGV on counter->name ...
2023-04-29Merge tag 'objtool-core-2023-04-27' of ↵Linus Torvalds2-200/+57
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: - Mark arch_cpu_idle_dead() __noreturn, make all architectures & drivers that did this inconsistently follow this new, common convention, and fix all the fallout that objtool can now detect statically - Fix/improve the ORC unwinder becoming unreliable due to UNWIND_HINT_EMPTY ambiguity, split it into UNWIND_HINT_END_OF_STACK and UNWIND_HINT_UNDEFINED to resolve it - Fix noinstr violations in the KCSAN code and the lkdtm/stackleak code - Generate ORC data for __pfx code - Add more __noreturn annotations to various kernel startup/shutdown and panic functions - Misc improvements & fixes * tag 'objtool-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits) x86/hyperv: Mark hv_ghcb_terminate() as noreturn scsi: message: fusion: Mark mpt_halt_firmware() __noreturn x86/cpu: Mark {hlt,resume}_play_dead() __noreturn btrfs: Mark btrfs_assertfail() __noreturn objtool: Include weak functions in global_noreturns check cpu: Mark nmi_panic_self_stop() __noreturn cpu: Mark panic_smp_self_stop() __noreturn arm64/cpu: Mark cpu_park_loop() and friends __noreturn x86/head: Mark *_start_kernel() __noreturn init: Mark start_kernel() __noreturn init: Mark [arch_call_]rest_init() __noreturn objtool: Generate ORC data for __pfx code x86/linkage: Fix padding for typed functions objtool: Separate prefix code from stack validation code objtool: Remove superfluous dead_end_function() check objtool: Add symbol iteration helpers objtool: Add WARN_INSN() scripts/objdump-func: Support multiple functions context_tracking: Fix KCSAN noinstr violation objtool: Add stackleak instrumentation to uaccess safe list ...
2023-04-28Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-0/+5
Pull virtio updates from Michael Tsirkin: "virtio,vhost,vdpa: features, fixes, and cleanups: - reduction in interrupt rate in virtio - perf improvement for VDUSE - scalability for vhost-scsi - non power of 2 ring support for packed rings - better management for mlx5 vdpa - suspend for snet - VIRTIO_F_NOTIFICATION_DATA - shared backend with vdpa-sim-blk - user VA support in vdpa-sim - better struct packing for virtio and fixes, cleanups all over the place" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (52 commits) vhost_vdpa: fix unmap process in no-batch mode MAINTAINERS: make me a reviewer of VIRTIO CORE AND NET DRIVERS tools/virtio: fix build caused by virtio_ring changes virtio_ring: add a struct device forward declaration vdpa_sim_blk: support shared backend vdpa_sim: move buffer allocation in the devices vdpa/snet: use likely/unlikely macros in hot functions vdpa/snet: implement kick_vq_with_data callback virtio-vdpa: add VIRTIO_F_NOTIFICATION_DATA feature support virtio: add VIRTIO_F_NOTIFICATION_DATA feature support vdpa/snet: support the suspend vDPA callback vdpa/snet: support getting and setting VQ state MAINTAINERS: add vringh.h to Virtio Core and Net Drivers vringh: address kdoc warnings vdpa: address kdoc warnings virtio_ring: don't update event idx on get_buf vdpa_sim: add support for user VA vdpa_sim: replace the spinlock with a mutex to protect the state vdpa_sim: use kthread worker vdpa_sim: make devices agnostic for work management ...
2023-04-24Merge tag 'docs-6.4' of git://git.lwn.net/linuxLinus Torvalds1-1/+1
Pull documentation updates from Jonathan Corbet: "Commit volume in documentation is relatively low this time, but there is still a fair amount going on, including: - Reorganize the architecture-specific documentation under Documentation/arch This makes the structure match the source directory and helps to clean up the mess that is the top-level Documentation directory a bit. This work creates the new directory and moves x86 and most of the less-active architectures there. The current plan is to move the rest of the architectures in 6.5, with the patches going through the appropriate subsystem trees. - Some more Spanish translations and maintenance of the Italian translation - A new "Kernel contribution maturity model" document from Ted - A new tutorial on quickly building a trimmed kernel from Thorsten Plus the usual set of updates and fixes" * tag 'docs-6.4' of git://git.lwn.net/linux: (47 commits) media: Adjust column width for pdfdocs media: Fix building pdfdocs docs: clk: add documentation to log which clocks have been disabled docs: trace: Fix typo in ftrace.rst Documentation/process: always CC responsible lists docs: kmemleak: adjust to config renaming ELF: document some de-facto PT_* ABI quirks Documentation: arm: remove stih415/stih416 related entries docs: turn off "smart quotes" in the HTML build Documentation: firmware: Clarify firmware path usage docs/mm: Physical Memory: Fix grammar Documentation: Add document for false sharing dma-api-howto: typo fix docs: move m68k architecture documentation under Documentation/arch/ docs: move parisc documentation under Documentation/arch/ docs: move ia64 architecture docs under Documentation/arch/ docs: Move arc architecture docs under Documentation/arch/ docs: move nios2 documentation under Documentation/arch/ docs: move openrisc documentation under Documentation/arch/ docs: move superh documentation under Documentation/arch/ ...
2023-04-21tools/virtio: fix build caused by virtio_ring changesShunsuke Mie1-0/+5
Fix the build dependency for virtio_test. The virtio_ring that is used from the test requires container_of_const(). Change to use container_of.h kernel header directly and adapt related codes. Signed-off-by: Shunsuke Mie <mie@igel.co.jp> Message-Id: <20230417022037.917668-2-mie@igel.co.jp> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-07tools: Rename __fallthrough to fallthroughLiam Howlett2-6/+4
Rename the fallthrough attribute to better align with the kernel version. Copy the definition from include/linux/compiler_attributes.h including the #else clause. Adding the #else clause allows the tools compiler.h header to drop the check for a definition entirely and keeps both definitions together. Change any __fallthrough statements to fallthrough anywhere it was used within perf. This allows other tools to use the same key word as the kernel. Committer notes: Did some missing conversions to: builtin-list.c Also included gtk.h before the 'fallthrough' definition in: tools/perf/ui/gtk/hists.c tools/perf/ui/gtk/helpline.c tools/perf/ui/gtk/browser.c As it is the arg name for a macro in glib.h: /var/home/acme/git/perf-tools-next/tools/include/linux/compiler-gcc.h:16:55: error: missing binary operator before token "(" 16 | # define fallthrough __attribute__((__fallthrough__)) | ^ /usr/include/glib-2.0/glib/gmacros.h:637:28: note: in expansion of macro ‘fallthrough’ 637 | #if g_macro__has_attribute(fallthrough) Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Tom Rix <trix@redhat.com> Cc: linux-sparse@vger.kernel.org <linux-sparse@vger.kernel.org> Cc: llvm@lists.linux.dev <llvm@lists.linux.dev> Link: https://lore.kernel.org/r/20221125154947.2163498-1-Liam.Howlett@oracle.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf cs-etm: Handle PERF_RECORD_AUX_OUTPUT_HW_ID packetMike Leach1-0/+15
When using dynamically assigned CoreSight trace IDs the drivers can output the ID / CPU association as a PERF_RECORD_AUX_OUTPUT_HW_ID packet. Update cs-etm decoder to handle this packet by setting the CPU/Trace ID mapping. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Mike Leach <mike.leach@linaro.org> Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Darren Hart <darren@os.amperecomputing.com> Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230331055645.26918-2-mike.leach@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf cs-etm: Update record event to use new Trace ID protocolMike Leach1-12/+17
Trace IDs are now dynamically allocated. Previously used the static association algorithm that is no longer used. The 'cpu * 2 + seed' was outdated and broken for systems with high core counts (>46). as it did not scale and was broken for larger core counts. Trace ID will now be sent in PERF_RECORD_AUX_OUTPUT_HW_ID record. Legacy ID algorithm renamed and retained for limited backward compatibility use. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Mike Leach <mike.leach@linaro.org> Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Darren Hart <darren@os.amperecomputing.com> Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230331055645.26918-2-mike.leach@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04perf cs-etm: Move mapping of Trace ID and cpu into helper functionMike Leach1-0/+5
The information to associate Trace ID and CPU will be changing. Drivers will start outputting this as a hardware ID packet in the data file which if present will be used in preference to the AUXINFO values. To prepare for this we provide a helper functions to do the individual ID mapping, and one to extract the IDs from the completed metadata blocks. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Mike Leach <mike.leach@linaro.org> Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Darren Hart <darren@os.amperecomputing.com> Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230331055645.26918-2-mike.leach@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-30docs: move x86 documentation into Documentation/arch/Jonathan Corbet1-1/+1
Move the x86 documentation under Documentation/arch/ as a way of cleaning up the top-level directory and making the structure of our docs more closely match the structure of the source directories it describes. All in-kernel references to the old paths have been updated. Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-arch@vger.kernel.org Cc: x86@kernel.org Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/20230315211523.108836-1-corbet@lwn.net/ Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-03-24x86,objtool: Split UNWIND_HINT_EMPTY in twoJosh Poimboeuf1-9/+18
Mark reported that the ORC unwinder incorrectly marks an unwind as reliable when the unwind terminates prematurely in the dark corners of return_to_handler() due to lack of information about the next frame. The problem is UNWIND_HINT_EMPTY is used in two different situations: 1) The end of the kernel stack unwind before hitting user entry, boot code, or fork entry 2) A blind spot in ORC coverage where the unwinder has to bail due to lack of information about the next frame The ORC unwinder has no way to tell the difference between the two. When it encounters an undefined stack state with 'end=1', it blindly marks the stack reliable, which can break the livepatch consistency model. Fix it by splitting UNWIND_HINT_EMPTY into UNWIND_HINT_UNDEFINED and UNWIND_HINT_END_OF_STACK. Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/fd6212c8b450d3564b855e1cb48404d6277b4d9f.1677683419.git.jpoimboe@kernel.org
2023-03-24x86,objtool: Separate unret validation from unwind hintsJosh Poimboeuf1-3/+2
The ENTRY unwind hint type is serving double duty as both an empty unwind hint and an unret validation annotation. Unret validation is unrelated to unwinding. Separate it out into its own annotation. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/ff7448d492ea21b86d8a90264b105fbd0d751077.1677683419.git.jpoimboe@kernel.org
2023-03-24x86,objtool: Introduce ORC_TYPE_*Josh Poimboeuf1-0/+1
Unwind hints and ORC entry types are two distinct things. Separate them out more explicitly. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/cc879d38fff8a43f8f7beb2fd56e35a5a384d7cd.1677683419.git.jpoimboe@kernel.org
2023-03-24objtool: Add objtool_types.hJosh Poimboeuf2-200/+48
Reduce the amount of header sync churn by splitting the shared objtool.h types into a new file. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/dec622720851210ceafa12d4f4c5f9e73c832152.1677683419.git.jpoimboe@kernel.org
2023-03-10Merge tag 'perf-tools-fixes-for-v6.3-1-2023-03-09' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Add Adrian Hunter to MAINTAINERS as a perf tools reviewer - Sync various tools/ copies of kernel headers with the kernel sources, this time trying to avoid first merging with upstream to then update but instead copy from upstream so that a merge is avoided and the end result after merging this pull request is the one expected, tools/perf/check-headers.sh (mostly) happy, less warnings while building tools/perf/ - Fix counting when initial delay configured by setting perf_attr.enable_on_exec when starting workloads from the perf command line - Don't avoid emitting a PERF_RECORD_MMAP2 in 'perf inject --buildid-all' when that record comes with a build-id, otherwise we end up not being able to resolve symbols - Don't use comma as the CSV output separator the "stat+csv_output" test, as comma can appear on some tests as a modifier for an event, use @ instead, ditto for the JSON linter test - The offcpu test was looking for some bits being set on task_struct->prev_state without masking other bits not important for this specific 'perf test', fix it * tag 'perf-tools-fixes-for-v6.3-1-2023-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf tools: Add Adrian Hunter to MAINTAINERS as a reviewer tools headers UAPI: Sync linux/perf_event.h with the kernel sources tools headers x86 cpufeatures: Sync with the kernel sources tools include UAPI: Sync linux/vhost.h with the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources tools headers kvm: Sync uapi/{asm/linux} kvm.h headers with the kernel sources tools include UAPI: Synchronize linux/fcntl.h with the kernel sources tools headers: Synchronize {linux,vdso}/bits.h with the kernel sources tools headers UAPI: Sync linux/prctl.h with the kernel sources tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench' perf stat: Fix counting when initial delay configured tools headers svm: Sync svm headers with the kernel sources perf test: Avoid counting commas in json linter perf tests stat+csv_output: Switch CSV separator to @ perf inject: Fix --buildid-all not to eat up MMAP2 tools arch x86: Sync the msr-index.h copy with the kernel sources perf test: Fix offcpu test prev_state check
2023-03-04tools headers: Synchronize {linux,vdso}/bits.h with the kernel sourcesArnaldo Carvalho de Melo1-1/+0
To pick up the changes in this cset: cbdb1f163af2bb90 ("vdso/bits.h: Add BIT_ULL() for the sake of consistency") That just causes perf to rebuild, the macro included doesn't clash with anything in tools/{perf,objtool,bpf}. This addresses this perf build warning: Warning: Kernel ABI header at 'tools/include/linux/bits.h' differs from latest version at 'include/linux/bits.h' diff -u tools/include/linux/bits.h include/linux/bits.h Warning: Kernel ABI header at 'tools/include/vdso/bits.h' differs from latest version at 'include/vdso/bits.h' diff -u tools/include/vdso/bits.h include/vdso/bits.h Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-23Merge branch 'linus' into objtool/core, to pick up Xen dependenciesIngo Molnar1-0/+9
Pick up dependencies - freshly merged upstream via xen-next - before applying dependent objtool changes. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-02-11x86/unwind/orc: Add 'signal' field to ORC metadataJosh Poimboeuf1-4/+7
Add a 'signal' field which allows unwind hints to specify whether the instruction pointer should be taken literally (like for most interrupts and exceptions) rather than decremented (like for call stack return addresses) when used to find the next ORC entry. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/d2c5ec4d83a45b513d8fd72fab59f1a8cfa46871.1676068346.git.jpoimboe@kernel.org
2023-01-18tools headers: Syncronize linux/build_bug.h with the kernel sourcesArnaldo Carvalho de Melo1-0/+9
To pick up the changes in: 07a368b3f55a79d3 ("bug: introduce ASSERT_STRUCT_OFFSET") This cset only introduces a build time assert macro, that may be useful at some point for tooling, for now it silences this perf build warning: Warning: Kernel ABI header at 'tools/include/linux/build_bug.h' differs from latest version at 'include/linux/build_bug.h' diff -u tools/include/linux/build_bug.h include/linux/build_bug.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Maxim Levitsky <mlevitsk@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lore.kernel.org/lkml/Y8f0jqQFYDAOBkHx@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-15Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2-34/+176
Pull kvm updates from Paolo Bonzini: "ARM64: - Enable the per-vcpu dirty-ring tracking mechanism, together with an option to keep the good old dirty log around for pages that are dirtied by something other than a vcpu. - Switch to the relaxed parallel fault handling, using RCU to delay page table reclaim and giving better performance under load. - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option, which multi-process VMMs such as crosvm rely on (see merge commit 382b5b87a97d: "Fix a number of issues with MTE, such as races on the tags being initialised vs the PG_mte_tagged flag as well as the lack of support for VM_SHARED when KVM is involved. Patches from Catalin Marinas and Peter Collingbourne"). - Merge the pKVM shadow vcpu state tracking that allows the hypervisor to have its own view of a vcpu, keeping that state private. - Add support for the PMUv3p5 architecture revision, bringing support for 64bit counters on systems that support it, and fix the no-quite-compliant CHAIN-ed counter support for the machines that actually exist out there. - Fix a handful of minor issues around 52bit VA/PA support (64kB pages only) as a prefix of the oncoming support for 4kB and 16kB pages. - Pick a small set of documentation and spelling fixes, because no good merge window would be complete without those. s390: - Second batch of the lazy destroy patches - First batch of KVM changes for kernel virtual != physical address support - Removal of a unused function x86: - Allow compiling out SMM support - Cleanup and documentation of SMM state save area format - Preserve interrupt shadow in SMM state save area - Respond to generic signals during slow page faults - Fixes and optimizations for the non-executable huge page errata fix. - Reprogram all performance counters on PMU filter change - Cleanups to Hyper-V emulation and tests - Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest running on top of a L1 Hyper-V hypervisor) - Advertise several new Intel features - x86 Xen-for-KVM: - Allow the Xen runstate information to cross a page boundary - Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured - Add support for 32-bit guests in SCHEDOP_poll - Notable x86 fixes and cleanups: - One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0). - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few years back when eliminating unnecessary barriers when switching between vmcs01 and vmcs02. - Clean up vmread_error_trampoline() to make it more obvious that params must be passed on the stack, even for x86-64. - Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective of the current guest CPUID. - Fudge around a race with TSC refinement that results in KVM incorrectly thinking a guest needs TSC scaling when running on a CPU with a constant TSC, but no hardware-enumerated TSC frequency. - Advertise (on AMD) that the SMM_CTL MSR is not supported - Remove unnecessary exports Generic: - Support for responding to signals during page faults; introduces new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks Selftests: - Fix an inverted check in the access tracking perf test, and restore support for asserting that there aren't too many idle pages when running on bare metal. - Fix build errors that occur in certain setups (unsure exactly what is unique about the problematic setup) due to glibc overriding static_assert() to a variant that requires a custom message. - Introduce actual atomics for clear/set_bit() in selftests - Add support for pinning vCPUs in dirty_log_perf_test. - Rename the so called "perf_util" framework to "memstress". - Add a lightweight psuedo RNG for guest use, and use it to randomize the access pattern and write vs. read percentage in the memstress tests. - Add a common ucall implementation; code dedup and pre-work for running SEV (and beyond) guests in selftests. - Provide a common constructor and arch hook, which will eventually be used by x86 to automatically select the right hypercall (AMD vs. Intel). - A bunch of added/enabled/fixed selftests for ARM64, covering memslots, breakpoints, stage-2 faults and access tracking. - x86-specific selftest changes: - Clean up x86's page table management. - Clean up and enhance the "smaller maxphyaddr" test, and add a related test to cover generic emulation failure. - Clean up the nEPT support checks. - Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values. - Fix an ordering issue in the AMX test introduced by recent conversions to use kvm_cpu_has(), and harden the code to guard against similar bugs in the future. Anything that tiggers caching of KVM's supported CPUID, kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if the caching occurs before the test opts in via prctl(). Documentation: - Remove deleted ioctls from documentation - Clean up the docs for the x86 MSR filter. - Various fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits) KVM: x86: Add proper ReST tables for userspace MSR exits/flags KVM: selftests: Allocate ucall pool from MEM_REGION_DATA KVM: arm64: selftests: Align VA space allocator with TTBR0 KVM: arm64: Fix benign bug with incorrect use of VA_BITS KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow KVM: x86: Advertise that the SMM_CTL MSR is not supported KVM: x86: remove unnecessary exports KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic" tools: KVM: selftests: Convert clear/set_bit() to actual atomics tools: Drop "atomic_" prefix from atomic test_and_set_bit() tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests perf tools: Use dedicated non-atomic clear/set bit helpers tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers KVM: arm64: selftests: Enable single-step without a "full" ucall() KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself KVM: Remove stale comment about KVM_REQ_UNHALT KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR KVM: Reference to kvm_userspace_memory_region in doc and comments KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl ...
2022-12-12Merge remote-tracking branch 'kvm/queue' into HEADPaolo Bonzini1-34/+0
x86 Xen-for-KVM: * Allow the Xen runstate information to cross a page boundary * Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured * add support for 32-bit guests in SCHEDOP_poll x86 fixes: * One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0). * Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few years back when eliminating unnecessary barriers when switching between vmcs01 and vmcs02. * Clean up the MSR filter docs. * Clean up vmread_error_trampoline() to make it more obvious that params must be passed on the stack, even for x86-64. * Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective of the current guest CPUID. * Fudge around a race with TSC refinement that results in KVM incorrectly thinking a guest needs TSC scaling when running on a CPU with a constant TSC, but no hardware-enumerated TSC frequency. * Advertise (on AMD) that the SMM_CTL MSR is not supported * Remove unnecessary exports Selftests: * Fix an inverted check in the access tracking perf test, and restore support for asserting that there aren't too many idle pages when running on bare metal. * Fix an ordering issue in the AMX test introduced by recent conversions to use kvm_cpu_has(), and harden the code to guard against similar bugs in the future. Anything that tiggers caching of KVM's supported CPUID, kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if the caching occurs before the test opts in via prctl(). * Fix build errors that occur in certain setups (unsure exactly what is unique about the problematic setup) due to glibc overriding static_assert() to a variant that requires a custom message. * Introduce actual atomics for clear/set_bit() in selftests Documentation: * Remove deleted ioctls from documentation * Various fixes
2022-12-02tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpersSean Christopherson1-34/+0
Drop tools' non-atomic test_and_set_bit() and test_and_clear_bit() helpers now that all users are gone. The names will be claimed in the future for atomic versions. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221119013450.2643007-8-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-10tools: Copy bitfield.h from the kernel sourcesRicardo Koller1-0/+176
Copy bitfield.h from include/linux/bitfield.h. A subsequent change will make use of some FIELD_{GET,PREP} macros defined in this header. The header was copied as-is, no changes needed. Cc: Jakub Kicinski <kuba@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Ricardo Koller <ricarkol@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221017195834.2295901-6-ricarkol@google.com
2022-10-17objtool: Fix find_{symbol,func}_containing()Peter Zijlstra1-0/+187
The current find_{symbol,func}_containing() functions are broken in the face of overlapping symbols, exactly the case that is needed for a new ibt/endbr supression. Import interval_tree_generic.h into the tools tree and convert the symbol tree to an interval tree to support proper range stabs. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220915111146.330203761@infradead.org
2022-10-11Merge tag 'mm-stable-2022-10-08' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in linux-next for a couple of months without, to my knowledge, any negative reports (or any positive ones, come to that). - Also the Maple Tree from Liam Howlett. An overlapping range-based tree for vmas. It it apparently slightly more efficient in its own right, but is mainly targeted at enabling work to reduce mmap_lock contention. Liam has identified a number of other tree users in the kernel which could be beneficially onverted to mapletrees. Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat at [1]. This has yet to be addressed due to Liam's unfortunately timed vacation. He is now back and we'll get this fixed up. - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses clang-generated instrumentation to detect used-unintialized bugs down to the single bit level. KMSAN keeps finding bugs. New ones, as well as the legacy ones. - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of memory into THPs. - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support file/shmem-backed pages. - userfaultfd updates from Axel Rasmussen - zsmalloc cleanups from Alexey Romanov - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure - Huang Ying adds enhancements to NUMA balancing memory tiering mode's page promotion, with a new way of detecting hot pages. - memcg updates from Shakeel Butt: charging optimizations and reduced memory consumption. - memcg cleanups from Kairui Song. - memcg fixes and cleanups from Johannes Weiner. - Vishal Moola provides more folio conversions - Zhang Yi removed ll_rw_block() :( - migration enhancements from Peter Xu - migration error-path bugfixes from Huang Ying - Aneesh Kumar added ability for a device driver to alter the memory tiering promotion paths. For optimizations by PMEM drivers, DRM drivers, etc. - vma merging improvements from Jakub Matěn. - NUMA hinting cleanups from David Hildenbrand. - xu xin added aditional userspace visibility into KSM merging activity. - THP & KSM code consolidation from Qi Zheng. - more folio work from Matthew Wilcox. - KASAN updates from Andrey Konovalov. - DAMON cleanups from Kaixu Xia. - DAMON work from SeongJae Park: fixes, cleanups. - hugetlb sysfs cleanups from Muchun Song. - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core. Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1] * tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits) hugetlb: allocate vma lock for all sharable vmas hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer hugetlb: fix vma lock handling during split vma and range unmapping mglru: mm/vmscan.c: fix imprecise comments mm/mglru: don't sync disk for each aging cycle mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol mm: memcontrol: use do_memsw_account() in a few more places mm: memcontrol: deprecate swapaccounting=0 mode mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled mm/secretmem: remove reduntant return value mm/hugetlb: add available_huge_pages() func mm: remove unused inline functions from include/linux/mm_inline.h selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd selftests/vm: add thp collapse shmem testing selftests/vm: add thp collapse file and tmpfs testing selftests/vm: modularize thp collapse memory operations selftests/vm: dedup THP helpers mm/khugepaged: add tracepoint to hpage_collapse_scan_file() mm/madvise: add file and shmem support to MADV_COLLAPSE ...
2022-10-10Merge tag 'bitmap-6.1-rc1' of https://github.com/norov/linuxLinus Torvalds1-49/+12
Pull bitmap updates from Yury Norov: - Fix unsigned comparison to -1 in CPUMAP_FILE_MAX_BYTES (Phil Auld) - cleanup nr_cpu_ids vs nr_cpumask_bits mess (me) This series cleans that mess and adds new config FORCE_NR_CPUS that allows to optimize cpumask subsystem if the number of CPUs is known at compile-time. - optimize find_bit() functions (me) Reworks find_bit() functions based on new FIND_{FIRST,NEXT}_BIT() macros. - add find_nth_bit() (me) Adds find_nth_bit(), which is ~70 times faster than bitcounting with for_each() loop: for_each_set_bit(bit, mask, size) if (n-- == 0) return bit; Also adds bitmap_weight_and() to let people replace this pattern: tmp = bitmap_alloc(nbits); bitmap_and(tmp, map1, map2, nbits); weight = bitmap_weight(tmp, nbits); bitmap_free(tmp); with a single bitmap_weight_and() call. - repair cpumask_check() (me) After switching cpumask to use nr_cpu_ids, cpumask_check() started generating many false-positive warnings. This series fixes it. - Add for_each_cpu_andnot() and for_each_cpu_andnot() (Valentin Schneider) Extends the API with one more function and applies it in sched/core. * tag 'bitmap-6.1-rc1' of https://github.com/norov/linux: (28 commits) sched/core: Merge cpumask_andnot()+for_each_cpu() into for_each_cpu_andnot() lib/test_cpumask: Add for_each_cpu_and(not) tests cpumask: Introduce for_each_cpu_andnot() lib/find_bit: Introduce find_next_andnot_bit() cpumask: fix checking valid cpu range lib/bitmap: add tests for for_each() loops lib/find: optimize for_each() macros lib/bitmap: introduce for_each_set_bit_wrap() macro lib/find_bit: add find_next{,_and}_bit_wrap cpumask: switch for_each_cpu{,_not} to use for_each_bit() net: fix cpu_max_bits_warn() usage in netif_attrmask_next{,_and} cpumask: add cpumask_nth_{,and,andnot} lib/bitmap: remove bitmap_ord_to_pos lib/bitmap: add tests for find_nth_bit() lib: add find_nth{,_and,_andnot}_bit() lib/bitmap: add bitmap_weight_and() lib/bitmap: don't call __bitmap_weight() in kernel code tools: sync find_bit() implementation lib/find_bit: optimize find_next_bit() functions lib/find_bit: create find_first_zero_bit_le() ...