summaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)AuthorFilesLines
2023-02-03mm: discard __GFP_ATOMICNeilBrown1-4/+4
__GFP_ATOMIC serves little purpose. Its main effect is to set ALLOC_HARDER which adds a few little boosts to increase the chance of an allocation succeeding, one of which is to lower the water-mark at which it will succeed. It is *always* paired with __GFP_HIGH which sets ALLOC_HIGH which also adjusts this watermark. It is probable that other users of __GFP_HIGH should benefit from the other little bonuses that __GFP_ATOMIC gets. __GFP_ATOMIC also gives a warning if used with __GFP_DIRECT_RECLAIM. There is little point to this. We already get a might_sleep() warning if __GFP_DIRECT_RECLAIM is set. __GFP_ATOMIC allows the "watermark_boost" to be side-stepped. It is probable that testing ALLOC_HARDER is a better fit here. __GFP_ATOMIC is used by tegra-smmu.c to check if the allocation might sleep. This should test __GFP_DIRECT_RECLAIM instead. This patch: - removes __GFP_ATOMIC - allows __GFP_HIGH allocations to ignore watermark boosting as well as GFP_ATOMIC requests. - makes other adjustments as suggested by the above. The net result is not change to GFP_ATOMIC allocations. Other allocations that use __GFP_HIGH will benefit from a few different extra privileges. This affects: xen, dm, md, ntfs3 the vermillion frame buffer hibernation ksm swap all of which likely produce more benefit than cost if these selected allocation are more likely to succeed quickly. [mgorman: Minor adjustments to rework on top of a series] Link: https://lkml.kernel.org/r/163712397076.13692.4727608274002939094@noble.neil.brown.name Link: https://lkml.kernel.org/r/20230113111217.14134-7-mgorman@techsingularity.net Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-03maple_tree: fix comment of mte_destroy_walkVernon Yang1-2/+2
The parameter name of maple tree is mt, make the comment be mt instead of mn, and the separator between the parameter name and the description to be : instead of -. Link: https://lkml.kernel.org/r/20230111135348.803181-1-vernon2gm@gmail.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-03maple_tree: remove the parameter entry of mas_preallocateVernon Yang1-2/+1
The parameter entry of mas_preallocate is not used, so drop it. Link: https://lkml.kernel.org/r/20230110154211.1758562-1-vernon2gm@gmail.com Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-01Sync mm-stable with mm-hotfixes-stable to pick up dependent patchesAndrew Morton3-12/+102
Merge branch 'mm-hotfixes-stable' into mm-stable
2023-02-01Kconfig.debug: fix the help description in SCHED_DEBUGye xingchen1-1/+1
The correct file path for SCHED_DEBUG is /sys/kernel/debug/sched. Link: https://lkml.kernel.org/r/202301291013573466558@zte.com.cn Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Zhaoyang Huang <zhaoyang.huang@unisoc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-01mm: use stack_depot_early_init for kmemleakZhaoyang Huang1-0/+1
Mirsad report the below error which is caused by stack_depot_init() failure in kvcalloc. Solve this by having stackdepot use stack_depot_early_init(). On 1/4/23 17:08, Mirsad Goran Todorovac wrote: I hate to bring bad news again, but there seems to be a problem with the output of /sys/kernel/debug/kmemleak: [root@pc-mtodorov ~]# cat /sys/kernel/debug/kmemleak unreferenced object 0xffff951c118568b0 (size 16): comm "kworker/u12:2", pid 56, jiffies 4294893952 (age 4356.548s) hex dump (first 16 bytes): 6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00 memstick0....... backtrace: [root@pc-mtodorov ~]# Apparently, backtrace of called functions on the stack is no longer printed with the list of memory leaks. This appeared on Lenovo desktop 10TX000VCR, with AlmaLinux 8.7 and BIOS version M22KT49A (11/10/2022) and 6.2-rc1 and 6.2-rc2 builds. This worked on 6.1 with the same CONFIG_KMEMLEAK=y and MGLRU enabled on a vanilla mainstream kernel from Mr. Torvalds' tree. I don't know if this is deliberate feature for some reason or a bug. Please find attached the config, lshw and kmemleak output. [vbabka@suse.cz: remove stack_depot_init() call] Link: https://lore.kernel.org/all/5272a819-ef74-65ff-be61-4d2d567337de@alu.unizg.hr/ Link: https://lkml.kernel.org/r/1674091345-14799-2-git-send-email-zhaoyang.huang@unisoc.com Fixes: 56a61617dd22 ("mm: use stack_depot for recording kmemleak's backtrace") Reported-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Suggested-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: ke.wang <ke.wang@unisoc.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-01maple_tree: should get pivots boundary by typeWei Yang1-2/+3
We should get pivots boundary by type. Fixes a potential overindexing of mt_pivots[]. Link: https://lkml.kernel.org/r/20221112234308.23823-1-richard.weiyang@gmail.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-01maple_tree: fix mas_empty_area_rev() lower bound validationLiam Howlett2-9/+97
mas_empty_area_rev() was not correctly validating the start of a gap against the lower limit. This could lead to the range starting lower than the requested minimum. Fix the issue by better validating a gap once one is found. This commit also adds tests to the maple tree test suite for this issue and tests the mas_empty_area() function for similar bound checking. Link: https://lkml.kernel.org/r/20230111200136.1851322-1-Liam.Howlett@oracle.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=216911 Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: <amanieu@gmail.com> Link: https://lore.kernel.org/linux-mm/0b9f5425-08d4-8013-aa4c-e620c3b10bb2@leemhuis.info/ Tested-by: Holger Hoffsttte <holger@applied-asynchrony.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-19maple_tree: remove GFP_ZERO from kmem_cache_alloc() and kmem_cache_alloc_bulk()Liam Howlett1-37/+43
Preallocations are common in the VMA code to avoid allocating under certain locking conditions. The preallocations must also cover the worst-case scenario. Removing the GFP_ZERO flag from the kmem_cache_alloc() (and bulk variant) calls will reduce the amount of time spent zeroing memory that may not be used. Only zero out the necessary area to keep track of the allocations in the maple state. Zero the entire node prior to using it in the tree. This required internal changes to node counting on allocation, so the test code is also updated. This restores some micro-benchmark performance: up to +9% in mmtests mmap1 by my testing +10% to +20% in mmap, mmapaddr, mmapmany tests reported by Red Hat Link: https://bugzilla.redhat.com/show_bug.cgi?id=2149636 Link: https://lkml.kernel.org/r/20230105160427.2988454-1-Liam.Howlett@oracle.com Signed-off-by: Liam Howlett <Liam.Howlett@oracle.com> Reported-by: Jirka Hladky <jhladky@redhat.com> Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-19maple_tree: refine mab_calc_split functionVernon Yang1-3/+2
Invert the conditional judgment of the mid_split, to focus the return statement in the last statement, which is easier to understand and for better readability. Link: https://lkml.kernel.org/r/20221221060058.609003-8-vernon2gm@gmail.com Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-19maple_tree: refine ma_state init from mas_start()Vernon Yang1-3/+3
If mas->node is an MAS_START, there are three cases, and they all assign different values to mas->node and mas->offset. So there is no need to set them to a default value before updating. Update them directly to make them easier to understand and for better readability. Link: https://lkml.kernel.org/r/20221221060058.609003-7-vernon2gm@gmail.com Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-19maple_tree: use macro MA_ROOT_PARENT instead of numberVernon Yang1-2/+1
When you need to compare whether node->parent is parent of the root node, using macro MA_ROOT_PARENT is easier to understand and for better readability. Link: https://lkml.kernel.org/r/20221221060058.609003-5-vernon2gm@gmail.com Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-19maple_tree: use mt_node_max() instead of direct operations mt_max[]Vernon Yang1-2/+2
Use mt_node_max() to get the maximum number of slots for a node, rather than direct operations mt_max[], makes it better portability. Link: https://lkml.kernel.org/r/20221221060058.609003-4-vernon2gm@gmail.com Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-19maple_tree: remove extra return statementVernon Yang1-3/+0
For functions with a return type of void, it is unnecessary to add a reurn statement at the end of the function, so drop it. Link: https://lkml.kernel.org/r/20221221060058.609003-3-vernon2gm@gmail.com Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-19maple_tree: remove extra space and blank lineVernon Yang1-10/+4
Patch series "Clean up and refinement for maple tree", v2. This patchset cleans up and refines some maple tree code. A few small changes make the code easier to understand and for better readability. This patch (of 7): These extra space and blank lines are unnecessary, so drop them. Link: https://lkml.kernel.org/r/20221221060058.609003-1-vernon2gm@gmail.com Link: https://lkml.kernel.org/r/20221221060058.609003-2-vernon2gm@gmail.com Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-19lib/test_vmalloc.c: add parameter use_huge for fix_size_alloc_testQinglin Pan1-1/+7
Add a parameter `use_huge' for fix_size_alloc_test(), which can be used to test allocation vie vmalloc_huge for both functionality and performance. Link: https://lkml.kernel.org/r/20221212055657.698420-1-panqinglin2020@iscas.ac.cn Signed-off-by: Qinglin Pan <panqinglin2020@iscas.ac.cn> Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-19Sync with v6.2-rc4Andrew Morton1-1/+0
Merge branch 'master' into mm-hotfixes-stable
2023-01-13lockref: stop doing cpu_relax in the cmpxchg loopMateusz Guzik1-1/+0
On the x86-64 architecture even a failing cmpxchg grants exclusive access to the cacheline, making it preferable to retry the failed op immediately instead of stalling with the pause instruction. To illustrate the impact, below are benchmark results obtained by running various will-it-scale tests on top of the 6.2-rc3 kernel and Cascade Lake (2 sockets * 24 cores * 2 threads) CPU. All results in ops/s. Note there is some variance in re-runs, but the code is consistently faster when contention is present. open3 ("Same file open/close"): proc stock no-pause 1 805603 814942 (+%1) 2 1054980 1054781 (-0%) 8 1544802 1822858 (+18%) 24 1191064 2199665 (+84%) 48 851582 1469860 (+72%) 96 609481 1427170 (+134%) fstat2 ("Same file fstat"): proc stock no-pause 1 3013872 3047636 (+1%) 2 4284687 4400421 (+2%) 8 3257721 5530156 (+69%) 24 2239819 5466127 (+144%) 48 1701072 5256609 (+209%) 96 1269157 6649326 (+423%) Additionally, a kernel with a private patch to help access() scalability: access2 ("Same file access"): proc stock patched patched +nopause 24 2378041 2005501 5370335 (-15% / +125%) That is, fixing the problems in access itself *reduces* scalability after the cacheline ping-pong only happens in lockref with the pause instruction. Note that fstat and access benchmarks are not currently integrated into will-it-scale, but interested parties can find them in pull requests to said project. Code at hand has a rather tortured history. First modification showed up in commit d472d9d98b46 ("lockref: Relax in cmpxchg loop"), written with Itanium in mind. Later it got patched up to use an arch-dependent macro to stop doing it on s390 where it caused a significant regression. Said macro had undergone revisions and was ultimately eliminated later, going back to cpu_relax. While I intended to only remove cpu_relax for x86-64, I got the following comment from Linus: I would actually prefer just removing it entirely and see if somebody else hollers. You have the numbers to prove it hurts on real hardware, and I don't think we have any numbers to the contrary. So I think it's better to trust the numbers and remove it as a failure, than say "let's just remove it on x86-64 and leave everybody else with the potentially broken code" Additionally, Will Deacon (maintainer of the arm64 port, one of the architectures previously benchmarked): So, from the arm64 side of the fence, I'm perfectly happy just removing the cpu_relax() calls from lockref. As such, come back full circle in history and whack it altogether. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/all/CAGudoHHx0Nqg6DE70zAVA75eV-HXfWyhVMWZ-aSeOofkA_=WdA@mail.gmail.com/ Acked-by: Tony Luck <tony.luck@intel.com> # ia64 Acked-by: Nicholas Piggin <npiggin@gmail.com> # powerpc Acked-by: Will Deacon <will@kernel.org> # arm64 Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-01-12lib/win_minmax: use /* notation for regular commentsRandy Dunlap1-1/+1
Don't use kernel-doc "/**" notation for non-kernel-doc comments. Prevents a kernel-doc warning: lib/win_minmax.c:31: warning: expecting prototype for lib/minmax.c(). Prototype was for minmax_subwin_update() instead Link: https://lkml.kernel.org/r/20230102211614.26343-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Neal Cardwell <ncardwell@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-01-07Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds1-1/+1
Pull rdma fixes from Jason Gunthorpe: "Most noticeable is that Yishai found a big data corruption regression due to a change in the scatterlist: - Do not wrongly combine non-contiguous pages in scatterlist - Fix compilation warnings on gcc 13 - Oops when using some mlx5 stats - Bad enforcement of atomic responder resources in mlx5" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: lib/scatterlist: Fix to merge contiguous pages into the last SG properly RDMA/mlx5: Fix validation of max_rd_atomic caps for DC RDMA/mlx5: Fix mlx5_ib_get_hw_stats when used for device RDMA/srp: Move large values to a new enum for gcc13
2023-01-05lib/scatterlist: Fix to merge contiguous pages into the last SG properlyYishai Hadas1-1/+1
When sg_alloc_append_table_from_pages() calls to pages_are_mergeable() in its 'sgt_append->prv' flow to check whether it can merge contiguous pages into the last SG, it passes the page arguments in the wrong order. The first parameter should be the next candidate page to be merged to the last page and not the opposite. The current code leads to a corrupted SG which resulted in OOPs and unexpected errors when non-contiguous pages are merged wrongly. Fix to pass the page parameters in the right order. Fixes: 1567b49d1a40 ("lib/scatterlist: add check when merging zone device pages") Link: https://lore.kernel.org/r/20230105112339.107969-1-yishaih@nvidia.com Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-27kunit: alloc_string_stream_fragment error handling bug fixYoungJun.park1-1/+3
When it fails to allocate fragment, it does not free and return error. And check the pointer inappropriately. Fixed merge conflicts with commit 618887768bb7 ("kunit: update NULL vs IS_ERR() tests") Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: YoungJun.park <her0gyugyu@gmail.com> Reviewed-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-12-22test_maple_tree: add test for mas_spanning_rebalance() on insufficient dataLiam Howlett1-0/+23
Add a test to the maple tree test suite for the spanning rebalance insufficient node issue does not go undetected again. Link: https://lkml.kernel.org/r/20221219161922.2708732-3-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-22maple_tree: fix mas_spanning_rebalance() on insufficient dataLiam Howlett1-1/+3
Mike Rapoport contacted me off-list with a regression in running criu. Periodic tests fail with an RCU stall during execution. Although rare, it is possible to hit this with other uses so this patch should be backported to fix the regression. This patchset adds the fix and a test case to the maple tree test suite. This patch (of 2): An insufficient node was causing an out-of-bounds access on the node in mas_leaf_max_gap(). The cause was the faulty detection of the new node being a root node when overwriting many entries at the end of the tree. Fix the detection of a new root and ensure there is sufficient data prior to entering the spanning rebalance loop. Link: https://lkml.kernel.org/r/20221219161922.2708732-1-Liam.Howlett@oracle.com Link: https://lkml.kernel.org/r/20221219161922.2708732-2-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Mike Rapoport <rppt@kernel.org> Tested-by: Mike Rapoport <rppt@kernel.org> Cc: Andrei Vagin <avagin@gmail.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-20Merge tag 'spdx-6.2-rc1' of ↵Linus Torvalds2-22/+2
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx Pull SPDX/License additions from Greg KH: "Here are two small updates for LICENSES and some kernel files that add the Copyleft-next license and use it in a SPDX tag as a dual-license for some kernel files. These have been discussed thoroughly in public on the linux-spdx mailing list, and have the needed acks on them, as well as having been in linux-next with no reported issues for quite some time" * tag 'spdx-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: testing: use the copyleft-next-0.3.1 SPDX tag LICENSES: Add the copyleft-next-0.3.1 license
2022-12-20Merge tag 'asm-generic-6.2-1' of ↵Linus Torvalds1-8/+8
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "There are only three fairly simple patches. The #include change to linux/swab.h addresses a userspace build issue, and the change to the mmio tracing logic helps provide more useful traces" * tag 'asm-generic-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: uapi: Add missing _UAPI prefix to <asm-generic/types.h> include guard asm-generic/io: Add _RET_IP_ to MMIO trace for more accurate debug info include/uapi/linux/swab: Fix potentially missing __always_inline
2022-12-19Merge tag 'kbuild-v6.2' of ↵Linus Torvalds1-2/+27
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Support zstd-compressed debug info - Allow W=1 builds to detect objects shared among multiple modules - Add srcrpm-pkg target to generate a source RPM package - Make the -s option detection work for future GNU Make versions - Add -Werror to KBUILD_CPPFLAGS when CONFIG_WERROR=y - Allow W=1 builds to detect -Wundef warnings in any preprocessed files - Raise the minimum supported version of binutils to 2.25 - Use $(intcmp ...) to compare integers if GNU Make >= 4.4 is used - Use $(file ...) to read a file if GNU Make >= 4.2 is used - Print error if GNU Make older than 3.82 is used - Allow modpost to detect section mismatches with Clang LTO - Include vmlinuz.efi into kernel tarballs for arm64 CONFIG_EFI_ZBOOT=y * tag 'kbuild-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (29 commits) buildtar: fix tarballs with EFI_ZBOOT enabled modpost: Include '.text.*' in TEXT_SECTIONS padata: Mark padata_work_init() as __ref kbuild: ensure Make >= 3.82 is used kbuild: refactor the prerequisites of the modpost rule kbuild: change module.order to list *.o instead of *.ko kbuild: use .NOTINTERMEDIATE for future GNU Make versions kconfig: refactor Makefile to reduce process forks kbuild: add read-file macro kbuild: do not sort after reading modules.order kbuild: add test-{ge,gt,le,lt} macros Documentation: raise minimum supported version of binutils to 2.25 kbuild: add -Wundef to KBUILD_CPPFLAGS for W=1 builds kbuild: move -Werror from KBUILD_CFLAGS to KBUILD_CPPFLAGS kbuild: Port silent mode detection to future gnu make. init/version.c: remove #include <generated/utsrelease.h> firmware_loader: remove #include <generated/utsrelease.h> modpost: Mark uuid_le type to be suitable only for MEI kbuild: add ability to make source rpm buildable using koji kbuild: warn objects shared among multiple modules ...
2022-12-19Merge tag 'zstd-linus-v6.2' of https://github.com/terrelln/linuxLinus Torvalds38-2443/+6689
Pull zstd updates from Nick Terrell: "Update the kernel to upstream zstd v1.5.2 [0]. Specifically to the tag v1.5.2-kernel [1] which includes several cherrypicked fixes for the kernel on top of v1.5.2. Excepting the MAINTAINERS change, all the changes in this can be generated by: git clone https://github.com/facebook/zstd cd zstd/contrib/linux-kernel git checkout v1.5.2-kernel LINUX=/path/to/linux/repo make import Additionally, this includes several minor typo fixes, which have all been fixed upstream so they are maintained on the next import" Link: https://github.com/facebook/zstd/releases/tag/v1.5.2 [0] Link: https://github.com/facebook/zstd/tree/v1.5.2-kernel [1] Link: https://lore.kernel.org/lkml/20221024202606.404049-1-nickrterrell@gmail.com/ Link: https://github.com/torvalds/linux/commit/637a642f5ca5e850186bb64ac75ebb0f124b458d * tag 'zstd-linus-v6.2' of https://github.com/terrelln/linux: zstd: import usptream v1.5.2 zstd: Move zstd-common module exports to zstd_common_module.c lib: zstd: Fix comment typo lib: zstd: fix repeated words in comments MAINTAINERS: git://github -> https://github.com for terrelln lib: zstd: clean up double word in comment.
2022-12-19Merge tag 'mm-nonmm-stable-2022-12-17-20-32' of ↵Linus Torvalds2-9/+14
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull fault-injection updates from Andrew Morton: "Some fault-injection improvements from Wei Yongjun which enable stacktrace filtering on x86_64" * tag 'mm-nonmm-stable-2022-12-17-20-32' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: fault-injection: make stacktrace filter works as expected fault-injection: make some stack filter attrs more readable fault-injection: skip stacktrace filtering by default fault-injection: allow stacktrace filter for x86-64
2022-12-19Merge tag 'mm-stable-2022-12-17-2' of ↵Linus Torvalds2-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more mm updates from Andrew Morton: - A few late-breaking minor fixups - Two minor feature patches which were awkwardly dependent on mm-nonmm. I need to set up a new branch to handle such things. * tag 'mm-stable-2022-12-17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: MAINTAINERS: zram: zsmalloc: Add an additional co-maintainer mm/kmemleak: use %pK to display kernel pointers in backtrace mm: use stack_depot for recording kmemleak's backtrace maple_tree: update copyright dates for test code maple_tree: fix mas_find_rev() comment mm/gup_test: free memory allocated via kvcalloc() using kvfree()
2022-12-16Merge tag 'driver-core-6.2-rc1' of ↵Linus Torvalds1-11/+18
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the set of driver core and kernfs changes for 6.2-rc1. The "big" change in here is the addition of a new macro, container_of_const() that will preserve the "const-ness" of a pointer passed into it. The "problem" of the current container_of() macro is that if you pass in a "const *", out of it can comes a non-const pointer unless you specifically ask for it. For many usages, we want to preserve the "const" attribute by using the same call. For a specific example, this series changes the kobj_to_dev() macro to use it, allowing it to be used no matter what the const value is. This prevents every subsystem from having to declare 2 different individual macros (i.e. kobj_const_to_dev() and kobj_to_dev()) and having the compiler enforce the const value at build time, which having 2 macros would not do either. The driver for all of this have been discussions with the Rust kernel developers as to how to properly mark driver core, and kobject, objects as being "non-mutable". The changes to the kobject and driver core in this pull request are the result of that, as there are lots of paths where kobjects and device pointers are not modified at all, so marking them as "const" allows the compiler to enforce this. So, a nice side affect of the Rust development effort has been already to clean up the driver core code to be more obvious about object rules. All of this has been bike-shedded in quite a lot of detail on lkml with different names and implementations resulting in the tiny version we have in here, much better than my original proposal. Lots of subsystem maintainers have acked the changes as well. Other than this change, included in here are smaller stuff like: - kernfs fixes and updates to handle lock contention better - vmlinux.lds.h fixes and updates - sysfs and debugfs documentation updates - device property updates All of these have been in the linux-next tree for quite a while with no problems" * tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (58 commits) device property: Fix documentation for fwnode_get_next_parent() firmware_loader: fix up to_fw_sysfs() to preserve const usb.h: take advantage of container_of_const() device.h: move kobj_to_dev() to use container_of_const() container_of: add container_of_const() that preserves const-ness of the pointer driver core: fix up missed drivers/s390/char/hmcdrv_dev.c class.devnode() conversion. driver core: fix up missed scsi/cxlflash class.devnode() conversion. driver core: fix up some missing class.devnode() conversions. driver core: make struct class.devnode() take a const * driver core: make struct class.dev_uevent() take a const * cacheinfo: Remove of_node_put() for fw_token device property: Add a blank line in Kconfig of tests device property: Rename goto label to be more precise device property: Move PROPERTY_ENTRY_BOOL() a bit down device property: Get rid of __PROPERTY_ENTRY_ARRAY_EL*SIZE*() kernfs: fix all kernel-doc warnings and multiple typos driver core: pass a const * into of_device_uevent() kobject: kset_uevent_ops: make name() callback take a const * kobject: kset_uevent_ops: make filter() callback take a const * kobject: make kobject_namespace take a const * ...
2022-12-16Merge tag 'char-misc-6.2-rc1' of ↵Linus Torvalds1-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the large set of char/misc and other driver subsystem changes for 6.2-rc1. Nothing earth-shattering in here at all, just a lot of new driver development and minor fixes. Highlights include: - fastrpc driver updates - iio new drivers and updates - habanalabs driver updates for new hardware and features - slimbus driver updates - speakup module parameters added to aid in boot time configuration - i2c probe_new conversions for lots of different drivers - other small driver fixes and additions One semi-interesting change in here is the increase of the number of misc dynamic minors available to 1048448 to handle new huge-cpu systems. All of these have been in linux-next for a while with no reported problems" * tag 'char-misc-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (521 commits) extcon: usbc-tusb320: Convert to i2c's .probe_new() extcon: rt8973: Convert to i2c's .probe_new() extcon: fsa9480: Convert to i2c's .probe_new() extcon: max77843: Replace irqchip mask_invert with unmask_base chardev: fix error handling in cdev_device_add() mcb: mcb-parse: fix error handing in chameleon_parse_gdd() drivers: mcb: fix resource leak in mcb_probe() coresight: etm4x: fix repeated words in comments coresight: cti: Fix null pointer error on CTI init before ETM coresight: trbe: remove cpuhp instance node before remove cpuhp state counter: stm32-lptimer-cnt: fix the check on arr and cmp registers update misc: fastrpc: Add dma_mask to fastrpc_channel_ctx misc: fastrpc: Add mmap request assigning for static PD pool misc: fastrpc: Safekeep mmaps on interrupted invoke misc: fastrpc: Add support for audiopd misc: fastrpc: Rework fastrpc_req_munmap misc: fastrpc: Use fastrpc_map_put in fastrpc_map_create on fail misc: fastrpc: Add fastrpc_remote_heap_alloc misc: fastrpc: Add reserved mem support misc: fastrpc: Rename audio protection domain to root ...
2022-12-16fault-injection: make stacktrace filter works as expectedWei Yongjun1-3/+9
stacktrace filter is checked after others, such as fail-nth, interval and probability. This make it doesn't work well as expected. Fix to running stacktrace filter before other filters. It will speed up fault inject testing for driver modules. Link: https://lkml.kernel.org/r/20220817080332.1052710-5-weiyongjun1@huawei.com Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Isabella Basso <isabbasso@riseup.net> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-16fault-injection: make some stack filter attrs more readableWei Yongjun1-4/+4
Attributes of stack filter are show as unsigned decimal, such as 'require-start', 'require-end'. This patch change to show them as unsigned hexadecimal for more readable. Before: $ echo 0xffffffffc0257000 > /sys/kernel/debug/failslab/require-start $ cat /sys/kernel/debug/failslab/require-start 18446744072638263296 After: $ echo 0xffffffffc0257000 > /sys/kernel/debug/failslab/require-start $ cat /sys/kernel/debug/failslab/require-start 0xffffffffc0257000 [wangyufen@huawei.com: use debugfs_create_xul() instead of debugfs_create_xl()] Link: https://lkml.kernel.org/r/1664331299-4976-1-git-send-email-wangyufen@huawei.com Link: https://lkml.kernel.org/r/20220817080332.1052710-4-weiyongjun1@huawei.com Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Wang Yufen <wangyufen@huawei.com> Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Isabella Basso <isabbasso@riseup.net> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-16fault-injection: skip stacktrace filtering by defaultWei Yongjun1-1/+1
If FAULT_INJECTION_STACKTRACE_FILTER is enabled, the depth is default to 32. This means fail_stacktrace() will iter each entry's stacktrace, even if filter is not configured. This patch changes to quick return from fail_stacktrace() if stacktrace filter is not set. Link: https://lkml.kernel.org/r/20220817080332.1052710-3-weiyongjun1@huawei.com Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Isabella Basso <isabbasso@riseup.net> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-16fault-injection: allow stacktrace filter for x86-64Wei Yongjun1-1/+0
This patchset allow fault injection to run on x86_64 and makes stacktrace filter work as expected. With this, we can test a device driver module with fault injection more easily. This patch (of 4): FAULT_INJECTION_STACKTRACE_FILTER option was apparently disallowed on x86_64 because of problems with the stack unwinder: commit 6d690dcac92a84f98fd774862628ff871b713660 Author: Akinobu Mita <akinobu.mita@gmail.com> Date: Sat May 12 10:36:53 2007 -0700 fault injection: disable stacktrace filter for x86-64 However, there is no problems whatsoever with this today. Let's allow it again. Link: https://lkml.kernel.org/r/20220817080332.1052710-1-weiyongjun1@huawei.com Link: https://lkml.kernel.org/r/20220817080332.1052710-2-weiyongjun1@huawei.com Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kees Cook <keescook@chromium.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Isabella Basso <isabbasso@riseup.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-16mm: use stack_depot for recording kmemleak's backtraceZhaoyang Huang1-0/+1
Using stack_depot to record kmemleak's backtrace which has been implemented on slub for reducing redundant information. [akpm@linux-foundation.org: fix build - remove now-unused __save_stack_trace()] [zhaoyang.huang@unisoc.com: v3] Link: https://lkml.kernel.org/r/1667101354-4669-1-git-send-email-zhaoyang.huang@unisoc.com [akpm@linux-foundation.org: fix v3 layout oddities] [akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/1666864224-27541-1-git-send-email-zhaoyang.huang@unisoc.com Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: ke.wang <ke.wang@unisoc.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Zhaoyang Huang <huangzhaoyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-16maple_tree: fix mas_find_rev() commentLiam Howlett1-1/+1
mas_find_rev() uses mas_prev_entry(), not mas_next_entry(), correct comment. Link: https://lkml.kernel.org/r/20221025173756.2719616-1-Liam.Howlett@oracle.com Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-15Merge tag 'x86_core_for_v6.2' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 core updates from Borislav Petkov: - Add the call depth tracking mitigation for Retbleed which has been long in the making. It is a lighterweight software-only fix for Skylake-based cores where enabling IBRS is a big hammer and causes a significant performance impact. What it basically does is, it aligns all kernel functions to 16 bytes boundary and adds a 16-byte padding before the function, objtool collects all functions' locations and when the mitigation gets applied, it patches a call accounting thunk which is used to track the call depth of the stack at any time. When that call depth reaches a magical, microarchitecture-specific value for the Return Stack Buffer, the code stuffs that RSB and avoids its underflow which could otherwise lead to the Intel variant of Retbleed. This software-only solution brings a lot of the lost performance back, as benchmarks suggest: https://lore.kernel.org/all/20220915111039.092790446@infradead.org/ That page above also contains a lot more detailed explanation of the whole mechanism - Implement a new control flow integrity scheme called FineIBT which is based on the software kCFI implementation and uses hardware IBT support where present to annotate and track indirect branches using a hash to validate them - Other misc fixes and cleanups * tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (80 commits) x86/paravirt: Use common macro for creating simple asm paravirt functions x86/paravirt: Remove clobber bitmask from .parainstructions x86/debug: Include percpu.h in debugreg.h to get DECLARE_PER_CPU() et al x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit x86/Kconfig: Enable kernel IBT by default x86,pm: Force out-of-line memcpy() objtool: Fix weak hole vs prefix symbol objtool: Optimize elf_dirty_reloc_sym() x86/cfi: Add boot time hash randomization x86/cfi: Boot time selection of CFI scheme x86/ibt: Implement FineIBT objtool: Add --cfi to generate the .cfi_sites section x86: Add prefix symbols for function padding objtool: Add option to generate prefix symbols objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf objtool: Slice up elf_create_section_symbol() kallsyms: Revert "Take callthunks into account" x86: Unconfuse CONFIG_ and X86_FEATURE_ namespaces x86/retpoline: Fix crash printing warning x86/paravirt: Fix a !PARAVIRT build warning ...
2022-12-14Merge tag 'v6.2-p1' of ↵Linus Torvalds4-0/+1177
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Optimise away self-test overhead when they are disabled - Support symmetric encryption via keyring keys in af_alg - Flip hwrng default_quality, the default is now maximum entropy Algorithms: - Add library version of aesgcm - CFI fixes for assembly code - Add arm/arm64 accelerated versions of sm3/sm4 Drivers: - Remove assumption on arm64 that kmalloc is DMA-aligned - Fix selftest failures in rockchip - Add support for RK3328/RK3399 in rockchip - Add deflate support in qat - Merge ux500 into stm32 - Add support for TEE for PCI ID 0x14CA in ccp - Add mt7986 support in mtk - Add MaxLinear platform support in inside-secure - Add NPCM8XX support in npcm" * tag 'v6.2-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (184 commits) crypto: ux500/cryp - delete driver crypto: stm32/cryp - enable for use with Ux500 crypto: stm32 - enable drivers to be used on Ux500 dt-bindings: crypto: Let STM32 define Ux500 CRYP hwrng: geode - Fix PCI device refcount leak hwrng: amd - Fix PCI device refcount leak crypto: qce - Set DMA alignment explicitly crypto: octeontx2 - Set DMA alignment explicitly crypto: octeontx - Set DMA alignment explicitly crypto: keembay - Set DMA alignment explicitly crypto: safexcel - Set DMA alignment explicitly crypto: hisilicon/hpre - Set DMA alignment explicitly crypto: chelsio - Set DMA alignment explicitly crypto: ccree - Set DMA alignment explicitly crypto: ccp - Set DMA alignment explicitly crypto: cavium - Set DMA alignment explicitly crypto: img-hash - Fix variable dereferenced before check 'hdev->req' crypto: arm64/ghash-ce - use frame_push/pop macros consistently crypto: arm64/crct10dif - use frame_push/pop macros consistently crypto: arm64/aes-modes - use frame_push/pop macros consistently ...
2022-12-14Merge tag 'hardening-v6.2-rc1' of ↵Linus Torvalds10-343/+1075
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kernel hardening updates from Kees Cook: - Convert flexible array members, fix -Wstringop-overflow warnings, and fix KCFI function type mismatches that went ignored by maintainers (Gustavo A. R. Silva, Nathan Chancellor, Kees Cook) - Remove the remaining side-effect users of ksize() by converting dma-buf, btrfs, and coredump to using kmalloc_size_roundup(), add more __alloc_size attributes, and introduce full testing of all allocator functions. Finally remove the ksize() side-effect so that each allocation-aware checker can finally behave without exceptions - Introduce oops_limit (default 10,000) and warn_limit (default off) to provide greater granularity of control for panic_on_oops and panic_on_warn (Jann Horn, Kees Cook) - Introduce overflows_type() and castable_to_type() helpers for cleaner overflow checking - Improve code generation for strscpy() and update str*() kern-doc - Convert strscpy and sigphash tests to KUnit, and expand memcpy tests - Always use a non-NULL argument for prepare_kernel_cred() - Disable structleak plugin in FORTIFY KUnit test (Anders Roxell) - Adjust orphan linker section checking to respect CONFIG_WERROR (Xin Li) - Make sure siginfo is cleared for forced SIGKILL (haifeng.xu) - Fix um vs FORTIFY warnings for always-NULL arguments * tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (31 commits) ksmbd: replace one-element arrays with flexible-array members hpet: Replace one-element array with flexible-array member um: virt-pci: Avoid GCC non-NULL warning signal: Initialize the info in ksignal lib: fortify_kunit: build without structleak plugin panic: Expose "warn_count" to sysfs panic: Introduce warn_limit panic: Consolidate open-coded panic_on_warn checks exit: Allow oops_limit to be disabled exit: Expose "oops_count" to sysfs exit: Put an upper limit on how often we can oops panic: Separate sysctl logic from CONFIG_SMP mm/pgtable: Fix multiple -Wstringop-overflow warnings mm: Make ksize() a reporting-only function kunit/fortify: Validate __alloc_size attribute results drm/sti: Fix return type of sti_{dvo,hda,hdmi}_connector_mode_valid() drm/fsl-dcu: Fix return type of fsl_dcu_drm_connector_mode_valid() driver core: Add __alloc_size hint to devm allocators overflow: Introduce overflows_type() and castable_to_type() coredump: Proactively round up to kmalloc bucket size ...
2022-12-14Merge tag 'for-linus-iommufd' of ↵Linus Torvalds2-0/+136
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd implementation from Jason Gunthorpe: "iommufd is the user API to control the IOMMU subsystem as it relates to managing IO page tables that point at user space memory. It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO container) which is the VFIO specific interface for a similar idea. We see a broad need for extended features, some being highly IOMMU device specific: - Binding iommu_domain's to PASID/SSID - Userspace IO page tables, for ARM, x86 and S390 - Kernel bypassed invalidation of user page tables - Re-use of the KVM page table in the IOMMU - Dirty page tracking in the IOMMU - Runtime Increase/Decrease of IOPTE size - PRI support with faults resolved in userspace Many of these HW features exist to support VM use cases - for instance the combination of PASID, PRI and Userspace IO Page Tables allows an implementation of DMA Shared Virtual Addressing (vSVA) within a guest. Dirty tracking enables VM live migration with SRIOV devices and PASID support allow creating "scalable IOV" devices, among other things. As these features are fundamental to a VM platform they need to be uniformly exposed to all the driver families that do DMA into VMs, which is currently VFIO and VDPA" For more background, see the extended explanations in Jason's pull request: https://lore.kernel.org/lkml/Y5dzTU8dlmXTbzoJ@nvidia.com/ * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (62 commits) iommufd: Change the order of MSI setup iommufd: Improve a few unclear bits of code iommufd: Fix comment typos vfio: Move vfio group specific code into group.c vfio: Refactor dma APIs for emulated devices vfio: Wrap vfio group module init/clean code into helpers vfio: Refactor vfio_device open and close vfio: Make vfio_device_open() truly device specific vfio: Swap order of vfio_device_container_register() and open_device() vfio: Set device->group in helper function vfio: Create wrappers for group register/unregister vfio: Move the sanity check of the group to vfio_create_group() vfio: Simplify vfio_create_group() iommufd: Allow iommufd to supply /dev/vfio/vfio vfio: Make vfio_container optionally compiled vfio: Move container related MODULE_ALIAS statements into container.c vfio-iommufd: Support iommufd for emulated VFIO devices vfio-iommufd: Support iommufd for physical VFIO devices vfio-iommufd: Allow iommufd to be used in place of a container fd vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for vfio_file_enforced_coherent() ...
2022-12-14Merge tag 'mm-stable-2022-12-13' of ↵Linus Torvalds4-11/+34
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - More userfaultfs work from Peter Xu - Several convert-to-folios series from Sidhartha Kumar and Huang Ying - Some filemap cleanups from Vishal Moola - David Hildenbrand added the ability to selftest anon memory COW handling - Some cpuset simplifications from Liu Shixin - Addition of vmalloc tracing support by Uladzislau Rezki - Some pagecache folioifications and simplifications from Matthew Wilcox - A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use it - Miguel Ojeda contributed some cleanups for our use of the __no_sanitize_thread__ gcc keyword. This series should have been in the non-MM tree, my bad - Naoya Horiguchi improved the interaction between memory poisoning and memory section removal for huge pages - DAMON cleanups and tuneups from SeongJae Park - Tony Luck fixed the handling of COW faults against poisoned pages - Peter Xu utilized the PTE marker code for handling swapin errors - Hugh Dickins reworked compound page mapcount handling, simplifying it and making it more efficient - Removal of the autonuma savedwrite infrastructure from Nadav Amit and David Hildenbrand - zram support for multiple compression streams from Sergey Senozhatsky - David Hildenbrand reworked the GUP code's R/O long-term pinning so that drivers no longer need to use the FOLL_FORCE workaround which didn't work very well anyway - Mel Gorman altered the page allocator so that local IRQs can remnain enabled during per-cpu page allocations - Vishal Moola removed the try_to_release_page() wrapper - Stefan Roesch added some per-BDI sysfs tunables which are used to prevent network block devices from dirtying excessive amounts of pagecache - David Hildenbrand did some cleanup and repair work on KSM COW breaking - Nhat Pham and Johannes Weiner have implemented writeback in zswap's zsmalloc backend - Brian Foster has fixed a longstanding corner-case oddity in file[map]_write_and_wait_range() - sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang Chen - Shiyang Ruan has done some work on fsdax, to make its reflink mode work better under xfstests. Better, but still not perfect - Christoph Hellwig has removed the .writepage() method from several filesystems. They only need .writepages() - Yosry Ahmed wrote a series which fixes the memcg reclaim target beancounting - David Hildenbrand has fixed some of our MM selftests for 32-bit machines - Many singleton patches, as usual * tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (313 commits) mm/hugetlb: set head flag before setting compound_order in __prep_compound_gigantic_folio mm: mmu_gather: allow more than one batch of delayed rmaps mm: fix typo in struct pglist_data code comment kmsan: fix memcpy tests mm: add cond_resched() in swapin_walk_pmd_entry() mm: do not show fs mm pc for VM_LOCKONFAULT pages selftests/vm: ksm_functional_tests: fixes for 32bit selftests/vm: cow: fix compile warning on 32bit selftests/vm: madv_populate: fix missing MADV_POPULATE_(READ|WRITE) definitions mm/gup_test: fix PIN_LONGTERM_TEST_READ with highmem mm,thp,rmap: fix races between updates of subpages_mapcount mm: memcg: fix swapcached stat accounting mm: add nodes= arg to memory.reclaim mm: disable top-tier fallback to reclaim on proactive reclaim selftests: cgroup: make sure reclaim target memcg is unprotected selftests: cgroup: refactor proactive reclaim code to reclaim_until() mm: memcg: fix stale protection of reclaim target memcg mm/mmap: properly unaccount memory on mas_preallocate() failure omfs: remove ->writepage jfs: remove ->writepage ...
2022-12-14Merge branch 'zstd-next' into zstd-linusNick Terrell38-2443/+6689
2022-12-14Merge tag 'net-next-6.2' of ↵Linus Torvalds7-27/+18
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "Core: - Allow live renaming when an interface is up - Add retpoline wrappers for tc, improving considerably the performances of complex queue discipline configurations - Add inet drop monitor support - A few GRO performance improvements - Add infrastructure for atomic dev stats, addressing long standing data races - De-duplicate common code between OVS and conntrack offloading infrastructure - A bunch of UBSAN_BOUNDS/FORTIFY_SOURCE improvements - Netfilter: introduce packet parser for tunneled packets - Replace IPVS timer-based estimators with kthreads to scale up the workload with the number of available CPUs - Add the helper support for connection-tracking OVS offload BPF: - Support for user defined BPF objects: the use case is to allocate own objects, build own object hierarchies and use the building blocks to build own data structures flexibly, for example, linked lists in BPF - Make cgroup local storage available to non-cgroup attached BPF programs - Avoid unnecessary deadlock detection and failures wrt BPF task storage helpers - A relevant bunch of BPF verifier fixes and improvements - Veristat tool improvements to support custom filtering, sorting, and replay of results - Add LLVM disassembler as default library for dumping JITed code - Lots of new BPF documentation for various BPF maps - Add bpf_rcu_read_{,un}lock() support for sleepable programs - Add RCU grace period chaining to BPF to wait for the completion of access from both sleepable and non-sleepable BPF programs - Add support storing struct task_struct objects as kptrs in maps - Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer values - Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions Protocols: - TCP: implement Protective Load Balancing across switch links - TCP: allow dynamically disabling TCP-MD5 static key, reverting back to fast[er]-path - UDP: Introduce optional per-netns hash lookup table - IPv6: simplify and cleanup sockets disposal - Netlink: support different type policies for each generic netlink operation - MPTCP: add MSG_FASTOPEN and FastOpen listener side support - MPTCP: add netlink notification support for listener sockets events - SCTP: add VRF support, allowing sctp sockets binding to VRF devices - Add bridging MAC Authentication Bypass (MAB) support - Extensions for Ethernet VPN bridging implementation to better support multicast scenarios - More work for Wi-Fi 7 support, comprising conversion of all the existing drivers to internal TX queue usage - IPSec: introduce a new offload type (packet offload) allowing complete header processing and crypto offloading - IPSec: extended ack support for more descriptive XFRM error reporting - RXRPC: increase SACK table size and move processing into a per-local endpoint kernel thread, reducing considerably the required locking - IEEE 802154: synchronous send frame and extended filtering support, initial support for scanning available 15.4 networks - Tun: bump the link speed from 10Mbps to 10Gbps - Tun/VirtioNet: implement UDP segmentation offload support Driver API: - PHY/SFP: improve power level switching between standard level 1 and the higher power levels - New API for netdev <-> devlink_port linkage - PTP: convert existing drivers to new frequency adjustment implementation - DSA: add support for rx offloading - Autoload DSA tagging driver when dynamically changing protocol - Add new PCP and APPTRUST attributes to Data Center Bridging - Add configuration support for 800Gbps link speed - Add devlink port function attribute to enable/disable RoCE and migratable - Extend devlink-rate to support strict prioriry and weighted fair queuing - Add devlink support to directly reading from region memory - New device tree helper to fetch MAC address from nvmem - New big TCP helper to simplify temporary header stripping New hardware / drivers: - Ethernet: - Marvel Octeon CNF95N and CN10KB Ethernet Switches - Marvel Prestera AC5X Ethernet Switch - WangXun 10 Gigabit NIC - Motorcomm yt8521 Gigabit Ethernet - Microchip ksz9563 Gigabit Ethernet Switch - Microsoft Azure Network Adapter - Linux Automation 10Base-T1L adapter - PHY: - Aquantia AQR112 and AQR412 - Motorcomm YT8531S - PTP: - Orolia ART-CARD - WiFi: - MediaTek Wi-Fi 7 (802.11be) devices - RealTek rtw8821cu, rtw8822bu, rtw8822cu and rtw8723du USB devices - Bluetooth: - Broadcom BCM4377/4378/4387 Bluetooth chipsets - Realtek RTL8852BE and RTL8723DS - Cypress.CYW4373A0 WiFi + Bluetooth combo device Drivers: - CAN: - gs_usb: bus error reporting support - kvaser_usb: listen only and bus error reporting support - Ethernet NICs: - Intel (100G): - extend action skbedit to RX queue mapping - implement devlink-rate support - support direct read from memory - nVidia/Mellanox (mlx5): - SW steering improvements, increasing rules update rate - Support for enhanced events compression - extend H/W offload packet manipulation capabilities - implement IPSec packet offload mode - nVidia/Mellanox (mlx4): - better big TCP support - Netronome Ethernet NICs (nfp): - IPsec offload support - add support for multicast filter - Broadcom: - RSS and PTP support improvements - AMD/SolarFlare: - netlink extened ack improvements - add basic flower matches to offload, and related stats - Virtual NICs: - ibmvnic: introduce affinity hint support - small / embedded: - FreeScale fec: add initial XDP support - Marvel mv643xx_eth: support MII/GMII/RGMII modes for Kirkwood - TI am65-cpsw: add suspend/resume support - Mediatek MT7986: add RX wireless wthernet dispatch support - Realtek 8169: enable GRO software interrupt coalescing per default - Ethernet high-speed switches: - Microchip (sparx5): - add support for Sparx5 TC/flower H/W offload via VCAP - Mellanox mlxsw: - add 802.1X and MAC Authentication Bypass offload support - add ip6gre support - Embedded Ethernet switches: - Mediatek (mtk_eth_soc): - improve PCS implementation, add DSA untag support - enable flow offload support - Renesas: - add rswitch R-Car Gen4 gPTP support - Microchip (lan966x): - add full XDP support - add TC H/W offload via VCAP - enable PTP on bridge interfaces - Microchip (ksz8): - add MTU support for KSZ8 series - Qualcomm 802.11ax WiFi (ath11k): - support configuring channel dwell time during scan - MediaTek WiFi (mt76): - enable Wireless Ethernet Dispatch (WED) offload support - add ack signal support - enable coredump support - remain_on_channel support - Intel WiFi (iwlwifi): - enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities - 320 MHz channels support - RealTek WiFi (rtw89): - new dynamic header firmware format support - wake-over-WLAN support" * tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2002 commits) ipvs: fix type warning in do_div() on 32 bit net: lan966x: Remove a useless test in lan966x_ptp_add_trap() net: ipa: add IPA v4.7 support dt-bindings: net: qcom,ipa: Add SM6350 compatible bnxt: Use generic HBH removal helper in tx path IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver selftests: forwarding: Add bridge MDB test selftests: forwarding: Rename bridge_mdb test bridge: mcast: Support replacement of MDB port group entries bridge: mcast: Allow user space to specify MDB entry routing protocol bridge: mcast: Allow user space to add (*, G) with a source list and filter mode bridge: mcast: Add support for (*, G) with a source list and filter mode bridge: mcast: Avoid arming group timer when (S, G) corresponds to a source bridge: mcast: Add a flag for user installed source entries bridge: mcast: Expose __br_multicast_del_group_src() bridge: mcast: Expose br_multicast_new_group_src() bridge: mcast: Add a centralized error path bridge: mcast: Place netlink policy before validation functions bridge: mcast: Split (*, G) and (S, G) addition into different functions bridge: mcast: Do not derive entry type from its filter mode ...
2022-12-14Merge tag 'x86_asm_for_v6.2' of ↵Linus Torvalds1-0/+22
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Borislav Petkov: - Move the 32-bit memmove() asm implementation out-of-line in order to fix a 32-bit full LTO build failure with clang where it would fail at register allocation. Move it to an asm file and clean it up while at it, similar to what has been already done on 64-bit * tag 'x86_asm_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mem: Move memmove to out of line assembler
2022-12-13Merge tag 'regulator-v6.2' of ↵Linus Torvalds1-11/+2
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator updates from Mark Brown: "Quite a quiet release for regulator, the diffstat is dominated by the I2C migration to probe_new() and the newly added MT6357 driver. We've just one framework addition and the rest is all new device support, fixes and cleanups. The framework addition is an API for requesting all regulators defined in DT, this isn't great practice but has reasonable applications when there is generic code handling devices on buses where the bus specification doesn't include power. The immediate application is MDIO but I believe there's others, it's another API that'll need an eye keeping on it for undesirable usage. Summary: - An API for requesting all regulators defined in DT - Conversion of lots of drivers to the I2C probe_new() API - Support for Mediatek MT6357, Qualcomm PM8550, PMR735a and Richtek RT6190" * tag 'regulator-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (56 commits) regulator: core: Use different devices for resource allocation and DT lookup dt-bindings: Add missing 'unevaluatedProperties' to regulator nodes regulator: qcom-labibb: Fix missing of_node_put() in qcom_labibb_regulator_probe() regulator: add mt6357 regulator regulator: dt-bindings: Add binding schema for mt6357 regulators regulator: core: fix resource leak in regulator_register() regulator: core: fix module refcount leak in set_supply() regulator: core: fix use_count leakage when handling boot-on regulator: rk808: Use dev_err_probe regulator: rk808: reduce 'struct rk808' usage regulator: Drop obsolete dependencies on COMPILE_TEST regulator: pv88080-regulator: Convert to i2c's .probe_new() regulator: pfuze100-regulator: Convert to i2c's .probe_new() regulator: isl6271a-regulator: Convert to i2c's .probe_new() regulator: fan53555: Convert to i2c's .probe_new() regulator: act8865-regulator: Convert to i2c's .probe_new() regulator: qcom-rpmh: Add support for PM8550 regulators regulator: dt-bindings: qcom,rpmh: Add compatible for PM8550 regulator: tps65023-regulator: Convert to i2c's .probe_new() regulator: tps62360-regulator: Convert to i2c's .probe_new() ...
2022-12-13Merge tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linuxLinus Torvalds5-183/+79
Pull block updates from Jens Axboe: - NVMe pull requests via Christoph: - Support some passthrough commands without CAP_SYS_ADMIN (Kanchan Joshi) - Refactor PCIe probing and reset (Christoph Hellwig) - Various fabrics authentication fixes and improvements (Sagi Grimberg) - Avoid fallback to sequential scan due to transient issues (Uday Shankar) - Implement support for the DEAC bit in Write Zeroes (Christoph Hellwig) - Allow overriding the IEEE OUI and firmware revision in configfs for nvmet (Aleksandr Miloserdov) - Force reconnect when number of queue changes in nvmet (Daniel Wagner) - Minor fixes and improvements (Uros Bizjak, Joel Granados, Sagi Grimberg, Christoph Hellwig, Christophe JAILLET) - Fix and cleanup nvme-fc req allocation (Chaitanya Kulkarni) - Use the common tagset helpers in nvme-pci driver (Christoph Hellwig) - Cleanup the nvme-pci removal path (Christoph Hellwig) - Use kstrtobool() instead of strtobool (Christophe JAILLET) - Allow unprivileged passthrough of Identify Controller (Joel Granados) - Support io stats on the mpath device (Sagi Grimberg) - Minor nvmet cleanup (Sagi Grimberg) - MD pull requests via Song: - Code cleanups (Christoph) - Various fixes - Floppy pull request from Denis: - Fix a memory leak in the init error path (Yuan) - Series fixing some batch wakeup issues with sbitmap (Gabriel) - Removal of the pktcdvd driver that was deprecated more than 5 years ago, and subsequent removal of the devnode callback in struct block_device_operations as no users are now left (Greg) - Fix for partition read on an exclusively opened bdev (Jan) - Series of elevator API cleanups (Jinlong, Christoph) - Series of fixes and cleanups for blk-iocost (Kemeng) - Series of fixes and cleanups for blk-throttle (Kemeng) - Series adding concurrent support for sync queues in BFQ (Yu) - Series bringing drbd a bit closer to the out-of-tree maintained version (Christian, Joel, Lars, Philipp) - Misc drbd fixes (Wang) - blk-wbt fixes and tweaks for enable/disable (Yu) - Fixes for mq-deadline for zoned devices (Damien) - Add support for read-only and offline zones for null_blk (Shin'ichiro) - Series fixing the delayed holder tracking, as used by DM (Yu, Christoph) - Series enabling bio alloc caching for IRQ based IO (Pavel) - Series enabling userspace peer-to-peer DMA (Logan) - BFQ waker fixes (Khazhismel) - Series fixing elevator refcount issues (Christoph, Jinlong) - Series cleaning up references around queue destruction (Christoph) - Series doing quiesce by tagset, enabling cleanups in drivers (Christoph, Chao) - Series untangling the queue kobject and queue references (Christoph) - Misc fixes and cleanups (Bart, David, Dawei, Jinlong, Kemeng, Ye, Yang, Waiman, Shin'ichiro, Randy, Pankaj, Christoph) * tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux: (247 commits) blktrace: Fix output non-blktrace event when blk_classic option enabled block: sed-opal: Don't include <linux/kernel.h> sed-opal: allow using IOC_OPAL_SAVE for locking too blk-cgroup: Fix typo in comment block: remove bio_set_op_attrs nvmet: don't open-code NVME_NS_ATTR_RO enumeration nvme-pci: use the tagset alloc/free helpers nvme: add the Apple shared tag workaround to nvme_alloc_io_tag_set nvme: only set reserved_tags in nvme_alloc_io_tag_set for fabrics controllers nvme: consolidate setting the tagset flags nvme: pass nr_maps explicitly to nvme_alloc_io_tag_set block: bio_copy_data_iter nvme-pci: split out a nvme_pci_ctrl_is_dead helper nvme-pci: return early on ctrl state mismatch in nvme_reset_work nvme-pci: rename nvme_disable_io_queues nvme-pci: cleanup nvme_suspend_queue nvme-pci: remove nvme_pci_disable nvme-pci: remove nvme_disable_admin_queue nvme: merge nvme_shutdown_ctrl into nvme_disable_ctrl nvme: use nvme_wait_ready in nvme_shutdown_ctrl ...
2022-12-13Merge tag 'pull-iov_iter' of ↵Linus Torvalds1-22/+28
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull iov_iter updates from Al Viro: "iov_iter work; most of that is about getting rid of direction misannotations and (hopefully) preventing more of the same for the future" * tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: use less confusing names for iov_iter direction initializers iov_iter: saner checks for attempt to copy to/from iterator [xen] fix "direction" argument of iov_iter_kvec() [vhost] fix 'direction' argument of iov_iter_{init,bvec}() [target] fix iov_iter_bvec() "direction" argument [s390] memcpy_real(): WRITE is "data source", not destination... [s390] zcore: WRITE is "data source", not destination... [infiniband] READ is "data destination", not source... [fsi] WRITE is "data source", not destination... [s390] copy_oldmem_kernel() - WRITE is "data source", not destination csum_and_copy_to_iter(): handle ITER_DISCARD get rid of unlikely() on page_copy_sane() calls
2022-12-13Merge tag 'mm-nonmm-stable-2022-12-12' of ↵Linus Torvalds7-9/+18
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - A ptrace API cleanup series from Sergey Shtylyov - Fixes and cleanups for kexec from ye xingchen - nilfs2 updates from Ryusuke Konishi - squashfs feature work from Xiaoming Ni: permit configuration of the filesystem's compression concurrency from the mount command line - A series from Akinobu Mita which addresses bound checking errors when writing to debugfs files - A series from Yang Yingliang to address rapidio memory leaks - A series from Zheng Yejian to address possible overflow errors in encode_comp_t() - And a whole shower of singleton patches all over the place * tag 'mm-nonmm-stable-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (79 commits) ipc: fix memory leak in init_mqueue_fs() hfsplus: fix bug causing custom uid and gid being unable to be assigned with mount rapidio: devices: fix missing put_device in mport_cdev_open kcov: fix spelling typos in comments hfs: Fix OOB Write in hfs_asc2mac hfs: fix OOB Read in __hfs_brec_find relay: fix type mismatch when allocating memory in relay_create_buf() ocfs2: always read both high and low parts of dinode link count io-mapping: move some code within the include guarded section kernel: kcsan: kcsan_test: build without structleak plugin mailmap: update email for Iskren Chernev eventfd: change int to __u64 in eventfd_signal() ifndef CONFIG_EVENTFD rapidio: fix possible UAF when kfifo_alloc() fails relay: use strscpy() is more robust and safer cpumask: limit visibility of FORCE_NR_CPUS acct: fix potential integer overflow in encode_comp_t() acct: fix accuracy loss for input value of encode_comp_t() linux/init.h: include <linux/build_bug.h> and <linux/stringify.h> rapidio: rio: fix possible name leak in rio_register_mport() rapidio: fix possible name leaks when rio_add_device() fails ...