path: root/mm
Age | Commit message | Author | Files | Lines
2022-07-04 | mm: split huge PUD on wp_huge_pud fallback | Gowans, James | 1 | -13/+14
Currently the implementation will split the PUD when a fallback is taken inside the create_huge_pud function. This isn't where it should be done: the splitting should be done in wp_huge_pud, just like it's done for PMDs. The reason is that if a fallback is taken during create, there is no PUD yet, so there is nothing to split, whereas if a fallback is taken when encountering a write-protection fault there is something to split. It looks like this was the original intention of the commit that introduced the splitting, but somehow it got moved to the wrong place between v1 and v2 of the patch series; a rebase mistake, perhaps.

Link: https://lkml.kernel.org/r/6f48d622eb8bce1ae5dd75327b0b73894a2ec407.camel@amazon.com
Fixes: 327e9fd48972 ("mm: Split huge pages on write-notify or COW")
Signed-off-by: James Gowans <jgowans@amazon.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-04 | mm/rmap: fix dereferencing invalid subpage pointer in try_to_migrate_one() | David Hildenbrand | 1 | -10/+17
The subpage we calculate is an invalid pointer for device private pages, because device private pages are mapped via non-present device private entries, not ordinary present PTEs. Let's just not compute broken pointers and fix them up later. Move the proper assignment of the correct subpage to the beginning of the function and assert that we really only have a single page in our folio. This currently results in a BUG when trying to compute anon_exclusive, because:

[ 528.727237] BUG: unable to handle page fault for address: ffffea1fffffffc0
[ 528.739585] #PF: supervisor read access in kernel mode
[ 528.745324] #PF: error_code(0x0000) - not-present page
[ 528.751062] PGD 44eaf2067 P4D 44eaf2067 PUD 0
[ 528.756026] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 528.760890] CPU: 120 PID: 18275 Comm: hmm-tests Not tainted 5.19.0-rc3-kfd-alex #257
[ 528.769542] Hardware name: AMD Corporation BardPeak/BardPeak, BIOS RTY1002BDS 09/17/2021
[ 528.778579] RIP: 0010:try_to_migrate_one+0x21a/0x1000
[ 528.784225] Code: f6 48 89 c8 48 2b 05 45 d1 6a 01 48 c1 f8 06 48 29 c3 48 8b 45 a8 48 c1 e3 06 48 01 cb f6 41 18 01 48 89 85 50 ff ff ff 74 0b <4c> 8b 33 49 c1 ee 11 41 83 e6 01 48 8b bd 48 ff ff ff e8 3f 99 02
[ 528.805194] RSP: 0000:ffffc90003cdfaa0 EFLAGS: 00010202
[ 528.811027] RAX: 00007ffff7ff4000 RBX: ffffea1fffffffc0 RCX: ffffeaffffffffc0
[ 528.818995] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffc90003cdfaf8
[ 528.826962] RBP: ffffc90003cdfb70 R08: 0000000000000000 R09: 0000000000000000
[ 528.834930] R10: ffffc90003cdf910 R11: 0000000000000002 R12: ffff888194450540
[ 528.842899] R13: ffff888160d057c0 R14: 0000000000000000 R15: 03ffffffffffffff
[ 528.850865] FS: 00007ffff7fdb740(0000) GS:ffff8883b0600000(0000) knlGS:0000000000000000
[ 528.859891] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 528.866308] CR2: ffffea1fffffffc0 CR3: 00000001562b4003 CR4: 0000000000770ee0
[ 528.874275] PKRU: 55555554
[ 528.877286] Call Trace:
[ 528.880016] <TASK>
[ 528.882356] ? lock_is_held_type+0xdf/0x130
[ 528.887033] rmap_walk_anon+0x167/0x410
[ 528.891316] try_to_migrate+0x90/0xd0
[ 528.895405] ? try_to_unmap_one+0xe10/0xe10
[ 528.900074] ? anon_vma_ctor+0x50/0x50
[ 528.904260] ? put_anon_vma+0x10/0x10
[ 528.908347] ? invalid_mkclean_vma+0x20/0x20
[ 528.913114] migrate_vma_setup+0x5f4/0x750
[ 528.917691] dmirror_devmem_fault+0x8c/0x250 [test_hmm]
[ 528.923532] do_swap_page+0xac0/0xe50
[ 528.927623] ? __lock_acquire+0x4b2/0x1ac0
[ 528.932199] __handle_mm_fault+0x949/0x1440
[ 528.936876] handle_mm_fault+0x13f/0x3e0
[ 528.941256] do_user_addr_fault+0x215/0x740
[ 528.945928] exc_page_fault+0x75/0x280
[ 528.950115] asm_exc_page_fault+0x27/0x30
[ 528.954593] RIP: 0033:0x40366b
...

Link: https://lkml.kernel.org/r/20220623205332.319257-1-david@redhat.com
Fixes: 6c287605fd56 ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Tested-by: Alistair Popple <apopple@nvidia.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-04 | mm: sparsemem: fix missing higher order allocation splitting | Muchun Song | 1 | -0/+8
Higher order allocations for vmemmap pages from the buddy allocator must be able to be treated as independent small pages, as they can be freed individually by the caller. There is no problem for higher order vmemmap pages allocated at boot time, since each individual small page will be initialized at boot time. However, it is an issue for the memory hotplug case, since those higher order vmemmap pages are allocated from the buddy allocator without initializing each individual small page's refcount. The system will panic in put_page_testzero() when CONFIG_DEBUG_VM is enabled if such a vmemmap page is freed.

Link: https://lkml.kernel.org/r/20220620023019.94257-1-songmuchun@bytedance.com
Fixes: d8d55f5616cf ("mm: sparsemem: use page table lock to protect kernel pmd operations")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-04 | mm/damon: use set_huge_pte_at() to make huge pte old | Baolin Wang | 1 | -2/+1
huge_ptep_set_access_flags() cannot make a huge pte old, according to the discussion [1]. That means we will always observe the hugetlb page as young even after we have stopped accessing it, so DAMON will gather inaccurate access statistics. Change to using set_huge_pte_at() to make the huge pte old and fix this issue.

[1] https://lore.kernel.org/all/Yqy97gXI4Nqb7dYo@arm.com/

Link: https://lkml.kernel.org/r/1655692482-28797-1-git-send-email-baolin.wang@linux.alibaba.com
Fixes: 49f4203aae06 ("mm/damon: add access checking for hugetlb pages")
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
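A minimal sketch of the mkold logic this describes, assuming the generic hugetlb helpers huge_ptep_get(), pte_young(), pte_mkold() and set_huge_pte_at(); the real DAMON code sits behind its page-table walk and locking, which are omitted here, and the function name is hypothetical:

    /* Hypothetical helper: clear the young bit of a huge PTE in place. */
    static void sketch_hugetlb_mkold(struct mm_struct *mm, unsigned long addr,
                                     pte_t *ptep)
    {
            pte_t entry = huge_ptep_get(ptep);

            if (pte_young(entry))
                    /*
                     * huge_ptep_set_access_flags() may only set bits, so
                     * write back the full PTE with the young bit cleared.
                     */
                    set_huge_pte_at(mm, addr, ptep, pte_mkold(entry));
    }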
2022-07-04 | mm: userfaultfd: fix UFFDIO_CONTINUE on fallocated shmem pages | Axel Rasmussen | 1 | -1/+4
When fallocate() is used on a shmem file, the pages we allocate can end up with !PageUptodate. Since UFFDIO_CONTINUE tries to find the existing page the user wants to map with SGP_READ, we would fail to find such a page, since shmem_getpage_gfp returns with a "NULL" pagep for SGP_READ if it discovers !PageUptodate. As a result, UFFDIO_CONTINUE returns -EFAULT, as it would do if the page wasn't found in the page cache at all. This isn't the intended behavior. UFFDIO_CONTINUE is just trying to find if a page exists, and doesn't care whether it still needs to be cleared or not. So, instead of SGP_READ, pass in SGP_NOALLOC. This is the same, except for one critical difference: in the !PageUptodate case, SGP_NOALLOC will clear the page and then return it. With this change, UFFDIO_CONTINUE works properly (succeeds) on a shmem file which has been fallocated, but otherwise not modified. Link: https://lkml.kernel.org/r/20220610173812.1768919-1-axelrasmussen@google.com Fixes: 153132571f02 ("userfaultfd/shmem: support UFFDIO_CONTINUE for shmem") Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Acked-by: Peter Xu <peterx@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-02 | usercopy: use unsigned long instead of uintptr_t | Jason A. Donenfeld | 1 | -1/+1
A recent commit factored out a series of annoying (unsigned long) casts into a single variable declaration, but made the pointer type a `uintptr_t` rather than the usual `unsigned long`. This patch changes it to be the integer type more typically used by the kernel to represent addresses. Fixes: 35fb9ae4aa2e ("usercopy: Cast pointer to an integer once") Cc: Matthew Wilcox <willy@infradead.org> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Joe Perches <joe@perches.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220616143617.449094-1-Jason@zx2c4.com
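The style point at issue, shown as a small standalone example (a userspace stand-in, not the kernel code): cast the pointer to an integer exactly once, and use the kernel's customary unsigned long for the address rather than uintptr_t.

    #include <stdio.h>

    static void report_range(const void *ptr, unsigned long n)
    {
            unsigned long addr = (unsigned long)ptr;    /* cast exactly once */

            printf("object spans [%#lx, %#lx)\n", addr, addr + n);
    }

    int main(void)
    {
            char buf[64];

            report_range(buf, sizeof(buf));
            return 0;
    }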
2022-06-29 | filemap: Use filemap_read_folio() in do_read_cache_folio() | Matthew Wilcox (Oracle) | 1 | -20/+9
By passing ->read_folio to filemap_read_folio(), we can use filemap_read_folio() in do_read_cache_folio(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-29 | filemap: Handle AOP_TRUNCATED_PAGE in do_read_cache_folio() | Matthew Wilcox (Oracle) | 1 | -1/+3
If the call to filler() returns AOP_TRUNCATED_PAGE, we need to retry the page cache lookup. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-29 | filemap: Move 'filler' case to the end of do_read_cache_folio() | Matthew Wilcox (Oracle) | 1 | -15/+13
No functionality change intended; this simply moves code around to disentangle the function a little. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-29 | filemap: Remove find_get_pages_range() and associated functions | Matthew Wilcox (Oracle) | 2 | -96/+0
All callers of find_get_pages_range(), pagevec_lookup_range() and pagevec_lookup() have now been removed. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-06-29 | shmem: Convert shmem_unlock_mapping() to use filemap_get_folios() | Matthew Wilcox (Oracle) | 1 | -7/+6
This is a straightforward conversion. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-06-29 | vmscan: Add check_move_unevictable_folios() | Matthew Wilcox (Oracle) | 1 | -22/+34
Change the guts of check_move_unevictable_pages() over to use folios and add check_move_unevictable_pages() as a wrapper. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-06-29 | filemap: Add filemap_get_folios() | Matthew Wilcox (Oracle) | 1 | -0/+59
This is the equivalent of find_get_pages() but fills a folio_batch instead of an array of pages. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org>
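A hedged usage sketch of the new helper, assuming the folio_batch API that accompanies it (folio_batch_init(), folio_batch_count(), folio_batch_release()); the walk function and the work done per folio are illustrative:

    /* Hypothetical walk over every folio in 'mapping' using the new API. */
    static void sketch_walk_mapping(struct address_space *mapping)
    {
            struct folio_batch fbatch;
            pgoff_t start = 0;
            unsigned int i;

            folio_batch_init(&fbatch);
            while (filemap_get_folios(mapping, &start, (pgoff_t)-1, &fbatch)) {
                    for (i = 0; i < folio_batch_count(&fbatch); i++) {
                            struct folio *folio = fbatch.folios[i];

                            /* The batch holds a reference on each folio. */
                            folio_mark_accessed(folio);
                    }
                    folio_batch_release(&fbatch);
                    cond_resched();
            }
    }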
2022-06-29 | filemap: Remove add_to_page_cache() and add_to_page_cache_locked() | Matthew Wilcox (Oracle) | 3 | -22/+2
These functions have no more users, so delete them. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
2022-06-29 | hugetlb: Convert huge_add_to_page_cache() to use a folio | Matthew Wilcox (Oracle) | 1 | -4/+10
Remove the last caller of add_to_page_cache() Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
2022-06-29 | mm: Remove __delete_from_page_cache() | Matthew Wilcox (Oracle) | 3 | -3/+3
This wrapper is no longer used. Remove it and all references to it. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-29 | mm: Account dirty folios properly during splits | Matthew Wilcox (Oracle) | 1 | -3/+8
If the last folio in a file is split as a result of truncation, we simply clear the dirty bits for the pages we're discarding. That causes NR_FILE_DIRTY (among other counters) to be thrown off, and eventually Linux will hang in balance_dirty_pages_ratelimited().

Reported-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Dave Chinner <dchinner@redhat.com>
Tested-by: Darrick J. Wong <djwong@kernel.org>
Fixes: d68eccad3706 ("mm/filemap: Allow large folios to be added to the page cache")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-27 | mm: ioremap: Add ioremap/iounmap_allowed() | Kefeng Wang | 1 | -1/+10
Add special hooks for the architecture to verify addr, size or prot when doing ioremap() or iounmap(), which makes the generic ioremap more useful.

ioremap_allowed() returns a bool:
- true means continue to remap
- false means skip remap and return directly

iounmap_allowed() returns a bool:
- true means continue to vunmap
- false means skip vunmap and return directly

Meanwhile, only vunmap the address when it is in the vmalloc area, as the generic ioremap only returns vmalloc addresses.

Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Baoquan He <bhe@redhat.com>
Link: https://lore.kernel.org/r/20220607125027.44946-5-wangkefeng.wang@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
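A simplified sketch of how the generic helpers might consult these hooks; the hook parameter lists shown here are an assumption for illustration, and the actual mapping work is reduced to comments and a placeholder return:

    void __iomem *ioremap_prot(phys_addr_t phys_addr, size_t size,
                               unsigned long prot)
    {
            /* Let the architecture veto or adjust the request. */
            if (!ioremap_allowed(phys_addr, size, prot))
                    return NULL;

            /* ... get_vm_area_caller() + ioremap_page_range() ... */
            return NULL;    /* placeholder in this sketch */
    }

    void iounmap(volatile void __iomem *addr)
    {
            void *vaddr = (void __force *)addr;

            if (!iounmap_allowed(vaddr))
                    return;

            /* Generic ioremap only hands out vmalloc addresses. */
            if (is_vmalloc_addr(vaddr))
                    vunmap(vaddr);
    }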
2022-06-27 | mm: ioremap: Setup phys_addr of struct vm_struct | Kefeng Wang | 1 | -0/+1
Show physical address of each ioremap in /proc/vmallocinfo. Acked-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Link: https://lore.kernel.org/r/20220607125027.44946-4-wangkefeng.wang@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-06-27 | mm: ioremap: Use more sensible name in ioremap_prot() | Kefeng Wang | 1 | -6/+8
Use the more meaningful and sensible name phys_addr instead of addr in ioremap_prot().

Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220610092255.32445-1-wangkefeng.wang@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
2022-06-27 | Merge tag 'mm-hotfixes-stable-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm | Linus Torvalds | 8 | -6/+31
Pull hotfixes from Andrew Morton:
"Minor things, mainly - mailmap updates, MAINTAINERS updates, etc.

Fixes for this merge window:
- fix for a damon boot hang, from SeongJae
- fix for a kfence warning splat, from Jason Donenfeld
- fix for zero-pfn pinning, from Alex Williamson
- fix for fallocate hole punch clearing, from Mike Kravetz

Fixes for previous releases:
- fix for a performance regression, from Marcelo
- fix for a hwpoisoning BUG, from zhenwei pi"

* tag 'mm-hotfixes-stable-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mailmap: add entry for Christian Marangi
  mm/memory-failure: disable unpoison once hw error happens
  hugetlbfs: zero partial pages during fallocate hole punch
  mm: memcontrol: reference to tools/cgroup/memcg_slabinfo.py
  mm: re-allow pinning of zero pfns
  mm/kfence: select random number before taking raw lock
  MAINTAINERS: add maillist information for LoongArch
  MAINTAINERS: update MM tree references
  MAINTAINERS: update Abel Vesa's email
  MAINTAINERS: add MEMORY HOT(UN)PLUG section and add David as reviewer
  MAINTAINERS: add Miaohe Lin as a memory-failure reviewer
  mailmap: add alias for jarkko@profian.com
  mm/damon/reclaim: schedule 'damon_reclaim_timer' only after 'system_wq' is initialized
  kthread: make it clear that kthread_create_on_node() might be terminated by any fatal signal
  mm: lru_cache_disable: use synchronize_rcu_expedited
  mm/page_isolation.c: fix one kernel-doc comment
2022-06-23 | filemap: Fix serialization adding transparent huge pages to page cache | Alistair Popple | 1 | -0/+2
Commit 793917d997df ("mm/readahead: Add large folio readahead") introduced support for using large folios for filebacked pages if the filesystem supports it. page_cache_ra_order() was introduced to allocate and add these large folios to the page cache. However adding pages to the page cache should be serialized against truncation and hole punching by taking invalidate_lock. Not doing so can lead to data races resulting in stale data getting added to the page cache and marked up-to-date. See commit 730633f0b7f9 ("mm: Protect operations adding pages to page cache with invalidate_lock") for more details. This issue was found by inspection but a testcase revealed it was possible to observe in practice on XFS. Fix this by taking invalidate_lock in page_cache_ra_order(), to mirror what is done for the non-thp case in page_cache_ra_unbounded(). Signed-off-by: Alistair Popple <apopple@nvidia.com> Fixes: 793917d997df ("mm/readahead: Add large folio readahead") Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
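The locking pattern being added, sketched here against the invalidate_lock helpers from commit 730633f0b7f9; the folio allocation itself is elided and the function body is illustrative only:

    static void sketch_ra_order(struct readahead_control *ractl)
    {
            struct address_space *mapping = ractl->mapping;

            /*
             * Serialize additions to the page cache against truncation and
             * hole punching, as page_cache_ra_unbounded() already does.
             */
            filemap_invalidate_lock_shared(mapping);
            /* ... allocate large folios and insert them into the cache ... */
            filemap_invalidate_unlock_shared(mapping);
    }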
2022-06-23 | mm: Clear page->private when splitting or migrating a page | Matthew Wilcox (Oracle) | 2 | -0/+2
In our efforts to remove uses of PG_private, we have found folios with the private flag clear and folio->private not-NULL. That is the root cause behind 642d51fb0775 ("ceph: check folio PG_private bit instead of folio->private"). It can also affect a few other filesystems that haven't yet reported a problem. compaction_alloc() can return a page with uninitialised page->private, and rather than checking all the callers of migrate_pages(), just zero page->private after calling get_new_page(). Similarly, the tail pages from split_huge_page() may also have an uninitialised page->private. Reported-by: Xiubo Li <xiubli@redhat.com> Tested-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-20 | filemap: Handle sibling entries in filemap_get_read_batch() | Matthew Wilcox (Oracle) | 1 | -0/+2
If a read races with an invalidation followed by another read, it is possible for a folio to be replaced with a higher-order folio. If that happens, we'll see a sibling entry for the new folio in the next iteration of the loop. This manifests as a NULL pointer dereference while holding the RCU read lock. Handle this by simply returning. The next call will find the new folio and handle it correctly. The other ways of handling this rare race are more complex and it's just not worth it. Reported-by: Dave Chinner <david@fromorbit.com> Reported-by: Brian Foster <bfoster@redhat.com> Debugged-by: Brian Foster <bfoster@redhat.com> Tested-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Fixes: cbd59c48ae2b ("mm/filemap: use head pages in generic_file_buffered_read") Cc: stable@vger.kernel.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-20 | filemap: Correct the conditions for marking a folio as accessed | Matthew Wilcox (Oracle) | 1 | -3/+10
We had an off-by-one error which meant that we never marked the first page in a read as accessed. This was visible as a slowdown when re-reading a file as pages were being evicted from cache too soon. In reviewing this code, we noticed a second bug where a multi-page folio would be marked as accessed multiple times when doing reads that were less than the size of the folio. Abstract the comparison of whether two file positions are in the same folio into a new function, fixing both of these bugs. Reported-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
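The abstraction described above, rendered as a small standalone example: two file positions are "in the same folio" when they fall in the same aligned block of folio-size bytes. Names and sizes here are illustrative, not the kernel's.

    #include <stdbool.h>
    #include <stdio.h>

    /* True if byte offsets pos1 and pos2 land in the same folio-sized block. */
    static bool pos_same_block(unsigned long long pos1, unsigned long long pos2,
                               unsigned long long folio_size)
    {
            return (pos1 / folio_size) == (pos2 / folio_size);
    }

    int main(void)
    {
            /* With a 16KiB folio, offsets 0 and 4096 share a folio... */
            printf("%d\n", pos_same_block(0, 4096, 16384));        /* prints 1 */
            /* ...but offsets 4096 and 16384 do not. */
            printf("%d\n", pos_same_block(4096, 16384, 16384));    /* prints 0 */
            return 0;
    }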
2022-06-20 | Merge tag 'slab-for-5.19-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab | Linus Torvalds | 1 | -7/+36
Pull slab fixes from Vlastimil Babka:
- A slub fix for PREEMPT_RT locking semantics from Sebastian.
- A slub fix for state corruption due to a possible race scenario from Jann.

* tag 'slab-for-5.19-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/slub: add missing TID updates on slab deactivation
  mm/slub: Move the stackdepot related allocation out of IRQ-off section.
2022-06-17 | Merge tag 'fs_for_v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs | Linus Torvalds | 1 | -9/+2
Pull writeback and ext2 fixes from Jan Kara:
"A fix for a writeback bug which prevented machines with kdevtmpfs from booting, and also one small ext2 bugfix in IO error handling"

* tag 'fs_for_v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  init: Initialize noop_backing_dev_info early
  ext2: fix fs corruption when trying to remove a non-empty directory with IO error
2022-06-17 | mm/memory-failure: disable unpoison once hw error happens | zhenwei pi | 3 | -2/+14
Currently unpoison_memory(unsigned long pfn) is designed for soft poison (hwpoison-inject) only. Since 17fae1294ad9d, the KPTE gets cleared on an x86 platform once hardware memory corruption occurs. Unpoisoning a hardware-corrupted page only puts the page back into the buddy allocator, so the kernel has a chance to access the page through the *NOT PRESENT* KPTE. This leads to a BUG when accessing through the corrupted KPTE.

As suggested by David and Naoya, disable the unpoison mechanism when a real HW error happens, to avoid a BUG like this:

 Unpoison: Software-unpoisoned page 0x61234
 BUG: unable to handle page fault for address: ffff888061234000
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 2c01067 P4D 2c01067 PUD 107267063 PMD 10382b063 PTE 800fffff9edcb062
 Oops: 0002 [#1] PREEMPT SMP NOPTI
 CPU: 4 PID: 26551 Comm: stress Kdump: loaded Tainted: G M OE 5.18.0.bm.1-amd64 #7
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ...
 RIP: 0010:clear_page_erms+0x7/0x10
 Code: ...
 RSP: 0000:ffffc90001107bc8 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: 0000000000000901 RCX: 0000000000001000
 RDX: ffffea0001848d00 RSI: ffffea0001848d40 RDI: ffff888061234000
 RBP: ffffea0001848d00 R08: 0000000000000901 R09: 0000000000001276
 R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000001
 R13: 0000000000000000 R14: 0000000000140dca R15: 0000000000000001
 FS: 00007fd8b2333740(0000) GS:ffff88813fd00000(0000) knlGS:0000000000000000
 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffff888061234000 CR3: 00000001023d2005 CR4: 0000000000770ee0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
  <TASK>
  prep_new_page+0x151/0x170
  get_page_from_freelist+0xca0/0xe20
  ? sysvec_apic_timer_interrupt+0xab/0xc0
  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
  __alloc_pages+0x17e/0x340
  __folio_alloc+0x17/0x40
  vma_alloc_folio+0x84/0x280
  __handle_mm_fault+0x8d4/0xeb0
  handle_mm_fault+0xd5/0x2a0
  do_user_addr_fault+0x1d0/0x680
  ? kvm_read_and_reset_apf_flags+0x3b/0x50
  exc_page_fault+0x78/0x170
  asm_exc_page_fault+0x27/0x30

Link: https://lkml.kernel.org/r/20220615093209.259374-2-pizhenwei@bytedance.com
Fixes: 847ce401df392 ("HWPOISON: Add unpoisoning support")
Fixes: 17fae1294ad9d ("x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned")
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org> [5.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-17 | mm: memcontrol: reference to tools/cgroup/memcg_slabinfo.py | Yang Yang | 1 | -1/+1
There is no slabinfo.py in tools/cgroup; it has memcg_slabinfo.py instead.

Link: https://lkml.kernel.org/r/20220610024451.744135-1-yang.yang29@zte.com.cn
Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-17 | mm/kfence: select random number before taking raw lock | Jason A. Donenfeld | 1 | -2/+5
The RNG uses vanilla spinlocks, not raw spinlocks, so kfence should pick its random numbers before taking its raw spinlocks. This also has the nice effect of doing less work inside the lock. It should fix a splat that Geert saw with CONFIG_PROVE_RAW_LOCK_NESTING:

 dump_backtrace.part.0+0x98/0xc0
 show_stack+0x14/0x28
 dump_stack_lvl+0xac/0xec
 dump_stack+0x14/0x2c
 __lock_acquire+0x388/0x10a0
 lock_acquire+0x190/0x2c0
 _raw_spin_lock_irqsave+0x6c/0x94
 crng_make_state+0x148/0x1e4
 _get_random_bytes.part.0+0x4c/0xe8
 get_random_u32+0x4c/0x140
 __kfence_alloc+0x460/0x5c4
 kmem_cache_alloc_trace+0x194/0x1dc
 __kthread_create_on_node+0x5c/0x1a8
 kthread_create_on_node+0x58/0x7c
 printk_start_kthread.part.0+0x34/0xa8
 printk_activate_kthreads+0x4c/0x54
 do_one_initcall+0xec/0x278
 kernel_init_freeable+0x11c/0x214
 kernel_init+0x24/0x124
 ret_from_fork+0x10/0x20

Link: https://lkml.kernel.org/r/20220609123319.17576-1-Jason@zx2c4.com
Fixes: d4150779e60f ("random32: use real rng for non-deterministic randomness")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
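The shape of the fix, as a hedged sketch: draw the random bits before acquiring the raw spinlock, so the RNG's ordinary spinlock is never taken while a raw lock is held. The lock name, slot array and selection logic below are stand-ins, not KFENCE's internals:

    #define SKETCH_NR_SLOTS 255
    static void *sketch_slots[SKETCH_NR_SLOTS];
    static DEFINE_RAW_SPINLOCK(sketch_lock);

    static void *sketch_guarded_alloc(void)
    {
            unsigned long flags;
            u32 random = get_random_u32();  /* before the raw lock */
            void *obj;

            raw_spin_lock_irqsave(&sketch_lock, flags);
            obj = sketch_slots[random % SKETCH_NR_SLOTS];
            raw_spin_unlock_irqrestore(&sketch_lock, flags);
            return obj;
    }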
2022-06-17 | mm/damon/reclaim: schedule 'damon_reclaim_timer' only after 'system_wq' is initialized | SeongJae Park | 1 | -0/+8
Commit 059342d1dd4e ("mm/damon/reclaim: fix the timer always stays active") made DAMON_RECLAIM's 'enabled' parameter store callback, 'enabled_store()', schedule 'damon_reclaim_timer'. The scheduling uses 'system_wq', which is initialized in 'workqueue_init_early()'. As the kernel parameter parsing function ('parse_args()') is called before 'workqueue_init_early()', 'enabled_store()' can be executed before 'workqueue_init_early()' and end up accessing the uninitialized 'system_wq'. As a result, booting hangs [1].

This commit fixes the issue by checking whether the initialization is done before scheduling the timer.

[1] https://lkml.kernel.org/20220604192222.1488-1-sj@kernel.org/

Link: https://lkml.kernel.org/r/20220604195051.1589-1-sj@kernel.org
Fixes: 059342d1dd4e ("mm/damon/reclaim: fix the timer always stays active")
Signed-off-by: SeongJae Park <sj@kernel.org>
Reported-by: Greg White <gwhite@kupulau.com>
Cc: Hailong Tu <tuhailong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-17 | mm: lru_cache_disable: use synchronize_rcu_expedited | Marcelo Tosatti | 1 | -1/+1
commit ff042f4a9b050 ("mm: lru_cache_disable: replace work queue synchronization with synchronize_rcu") replaced lru_cache_disable's usage of work queues with synchronize_rcu. Some users reported large performance regressions due to this commit, for example: https://lore.kernel.org/all/20220521234616.GO1790663@paulmck-ThinkPad-P17-Gen-1/T/ Switching to synchronize_rcu_expedited fixes the problem. Link: https://lkml.kernel.org/r/YpToHCmnx/HEcVyR@fuller.cnet Fixes: ff042f4a9b050 ("mm: lru_cache_disable: replace work queue synchronization with synchronize_rcu") Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Tested-by: Michael Larabel <Michael@MichaelLarabel.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Nicolas Saenz Julienne <nsaenzju@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Minchan Kim <minchan@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Phil Elwell <phil@raspberrypi.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-17 | mm/page_isolation.c: fix one kernel-doc comment | Yang Li | 1 | -0/+2
Remove one warning found by running scripts/kernel-doc, which is caused by using 'make W=1': mm/page_isolation.c:304: warning: Function parameter or member 'skip_isolation' not described in 'isolate_single_pageblock' Link: https://lkml.kernel.org/r/20220602062116.61199-1-yang.lee@linux.alibaba.com Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 | init: Initialize noop_backing_dev_info early | Jan Kara | 1 | -9/+2
noop_backing_dev_info is used by superblocks of various pseudofilesystems such as kdevtmpfs. After commit 10e14073107d ("writeback: Fix inode->i_io_list not be protected by inode->i_lock error") this broke because __mark_inode_dirty() started to access more fields from noop_backing_dev_info and this led to crashes inside locked_inode_to_wb_and_lock_list() called from __mark_inode_dirty(). Fix the problem by initializing noop_backing_dev_info before the filesystems get mounted. Fixes: 10e14073107d ("writeback: Fix inode->i_io_list not be protected by inode->i_lock error") Reported-and-tested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reported-and-tested-by: Alexandru Elisei <alexandru.elisei@arm.com> Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
2022-06-15 | memblock: Disable mirror feature if kernelcore is not specified | Ma Wupeng | 3 | -1/+6
If the system has some mirrored memory and the mirror feature is not specified in the boot parameters, the basic mirror feature will be enabled anyway, and this leads to the following situations:

- memblock memory allocation prefers the mirrored region. This may have some unexpected influence on NUMA affinity.
- contiguous memory will be split into several parts if parts of it are mirrored memory, via memblock_mark_mirror().

To fix this, the variable mirrored_kernelcore is now checked in memblock_mark_mirror(). Memory is marked with the MEMBLOCK_MIRROR flag only if kernelcore=mirror is added to the kernel parameters.

Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20220614092156.1972846-6-mawupeng1@huawei.com
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
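A hedged sketch of the check being added; the flag-setting helper shown is assumed to be memblock's existing internal one, and the exact placement in mm/memblock.c may differ:

    /*
     * Only honour mirrored memory if the administrator actually asked for
     * it with kernelcore=mirror; mirrored_kernelcore is the existing bool
     * parsed from that parameter.
     */
    int memblock_mark_mirror(phys_addr_t base, phys_addr_t size)
    {
            if (!mirrored_kernelcore)
                    return 0;

            return memblock_setclr_flag(base, size, 1, MEMBLOCK_MIRROR);
    }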
2022-06-15 | mm: Limit warning message in vmemmap_verify() to once | Ma Wupeng | 1 | -1/+1
On a system that has only limited mirrored memory, or has some NUMA nodes without mirrored memory, the per-node vmemmap page_structs prefer to be allocated from the mirrored region. This makes vmemmap_verify() in vmemmap_populate_basepages() report lots of "potential offnode page_structs" warning messages.

This patch changes that warning to be printed only once, to avoid a very long print during bootup.

Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Acked-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20220614092156.1972846-4-mawupeng1@huawei.com
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-06-15 | mm: Ratelimited mirrored memory related warning messages | Ma Wupeng | 1 | -2/+2
If the system has mirrored memory, memblock will try to allocate mirrored memory first and fall back to non-mirrored memory when that fails. But on a system with only limited mirrored memory, or with some NUMA nodes without mirrored memory, lots of warning messages about memblock allocation will occur.

This patch ratelimits the warning messages to avoid a very long print during bootup.

Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Link: https://lore.kernel.org/r/20220614092156.1972846-3-mawupeng1@huawei.com
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-06-13 | mm: create security context for memfd_secret inodes | Christian Göttsche | 1 | -0/+9
Create a security context for the inodes created by memfd_secret(2) via the LSM hook inode_init_security_anon to allow a fine grained control. As secret memory areas can affect hibernation and have a global shared limit access control might be desirable. Signed-off-by: Christian Göttsche <cgzones@googlemail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-06-13 | usercopy: Make usercopy resilient against ridiculously large copies | Matthew Wilcox (Oracle) | 1 | -10/+9
If 'n' is so large that it's negative, we might wrap around and mistakenly think that the copy is OK when it's not. Such a copy would probably crash, but just doing the arithmetic in a more simple way lets us detect and refuse this case. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Tested-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220612213227.3881769-4-willy@infradead.org
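The arithmetic hazard described above, as a standalone demonstration (userspace stand-in, not the kernel code): adding a huge n to an address can wrap past the end of the address space, so the safer check is phrased with subtraction, which cannot wrap.

    #include <stdbool.h>
    #include <stdio.h>

    /* Naive check: does [addr, addr + n) fit inside [start, end)? */
    static bool fits_naive(unsigned long addr, unsigned long n,
                           unsigned long start, unsigned long end)
    {
            return addr >= start && addr + n <= end;    /* addr + n may wrap */
    }

    /* Safer: compare against the remaining room, which cannot wrap. */
    static bool fits_safe(unsigned long addr, unsigned long n,
                          unsigned long start, unsigned long end)
    {
            return addr >= start && addr <= end && n <= end - addr;
    }

    int main(void)
    {
            unsigned long start = 0x1000, end = 0x2000, addr = 0x1800;
            unsigned long huge = (unsigned long)-42;    /* "negative" n */

            /* naive wrongly reports 1 because addr + huge wraps below end;
             * safe reports 0. */
            printf("naive: %d  safe: %d\n",
                   fits_naive(addr, huge, start, end),
                   fits_safe(addr, huge, start, end));
            return 0;
    }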
2022-06-13 | usercopy: Cast pointer to an integer once | Matthew Wilcox (Oracle) | 1 | -5/+6
Get rid of a lot of annoying casts by setting 'addr' once at the top of the function. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Tested-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220612213227.3881769-3-willy@infradead.org
2022-06-13 | usercopy: Handle vm_map_ram() areas | Matthew Wilcox (Oracle) | 2 | -7/+5
vmalloc does not allocate a vm_struct for vm_map_ram() areas. That causes us to deny usercopies from those areas. This affects XFS which uses vm_map_ram() for its directories. Fix this by calling find_vmap_area() instead of find_vm_area(). Fixes: 0aef499f3172 ("mm/usercopy: Detect vmalloc overruns") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Tested-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220612213227.3881769-2-willy@infradead.org
2022-06-13 | mm/slub: add missing TID updates on slab deactivation | Jann Horn | 1 | -0/+2
The fastpath in slab_alloc_node() assumes that c->slab is stable as long as the TID stays the same. However, two places in __slab_alloc() currently don't update the TID when deactivating the CPU slab.

If multiple operations race the right way, this could lead to an object getting lost; or, in an even more unlikely situation, it could even lead to an object being freed onto the wrong slab's freelist, messing up the `inuse` counter and eventually causing a page to be freed to the page allocator while it still contains slab objects.

(I haven't actually tested these cases though, this is just based on looking at the code. Writing testcases for this stuff seems like it'd be a pain...)

The race leading to state inconsistency is (all operations on the same CPU and kmem_cache):

- task A: begin do_slab_free():
  - read TID
  - read pcpu freelist (==NULL)
  - check `slab == c->slab` (true)
- [PREEMPT A->B]
- task B: begin slab_alloc_node():
  - fastpath fails (`c->freelist` is NULL)
  - enter __slab_alloc()
  - slub_get_cpu_ptr() (disables preemption)
  - enter ___slab_alloc()
  - take local_lock_irqsave()
  - read c->freelist as NULL
  - get_freelist() returns NULL
  - write `c->slab = NULL`
  - drop local_unlock_irqrestore()
  - goto new_slab
  - slub_percpu_partial() is NULL
  - get_partial() returns NULL
  - slub_put_cpu_ptr() (enables preemption)
- [PREEMPT B->A]
- task A: finish do_slab_free():
  - this_cpu_cmpxchg_double() succeeds()
  - [CORRUPT STATE: c->slab==NULL, c->freelist!=NULL]

From there, the object on c->freelist will get lost if task B is allowed to continue from here: It will proceed to the retry_load_slab label, set c->slab, then jump to load_freelist, which clobbers c->freelist.

But if we instead continue as follows, we get worse corruption:

- task A: run __slab_free() on object from other struct slab:
  - CPU_PARTIAL_FREE case (slab was on no list, is now on pcpu partial)
- task A: run slab_alloc_node() with NUMA node constraint:
  - fastpath fails (c->slab is NULL)
  - call __slab_alloc()
  - slub_get_cpu_ptr() (disables preemption)
  - enter ___slab_alloc()
  - c->slab is NULL: goto new_slab
  - slub_percpu_partial() is non-NULL
  - set c->slab to slub_percpu_partial(c)
  - [CORRUPT STATE: c->slab points to slab-1, c->freelist has objects from slab-2]
  - goto redo
  - node_match() fails
  - goto deactivate_slab
  - existing c->freelist is passed into deactivate_slab()
  - inuse count of slab-1 is decremented to account for object from slab-2

At this point, the inuse count of slab-1 is 1 lower than it should be. This means that if we free all allocated objects in slab-1 except for one, SLUB will think that slab-1 is completely unused, and may free its page, leading to use-after-free.

Fixes: c17dda40a6a4e ("slub: Separate out kmem_cache_cpu processing from deactivate_slab")
Fixes: 03e404af26dc2 ("slub: fast release on full slab")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/20220608182205.2945720-1-jannh@google.com
2022-06-13 | mm/slub: Move the stackdepot related allocation out of IRQ-off section. | Sebastian Andrzej Siewior | 1 | -7/+34
The set_track() invocation in free_debug_processing() is made with slab_lock() acquired. The lock disables interrupts on PREEMPT_RT, and this forbids allocating memory, which is done in stack_depot_save().

Split set_track() into two parts: set_track_prepare(), which allocates memory, and set_track_update(), which only performs the assignment of the trace data structure. Use set_track_prepare() before disabling interrupts.

[ vbabka@suse.cz: make set_track() call set_track_update() instead of open-coded assignments ]

Fixes: 5cf909c553e9e ("mm/slub: use stackdepot to save stack trace in objects")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Link: https://lore.kernel.org/r/Yp9sqoUi4fVa5ExF@linutronix.de
2022-06-09 | mm/huge_memory: Fix xarray node memory leak | Matthew Wilcox (Oracle) | 1 | -2/+1
If xas_split_alloc() fails to allocate the necessary nodes to complete the xarray entry split, it sets the xa_state to -ENOMEM, which xas_nomem() then interprets as "Please allocate more memory", not as "Please free any unnecessary memory" (which was the intended outcome). It's confusing to use xas_nomem() to free memory in this context, so call xas_destroy() instead. Reported-by: syzbot+9e27a75a8c24f3fe75c1@syzkaller.appspotmail.com Fixes: 6b24ca4a1a8d ("mm: Use multi-index entries in the page cache") Cc: stable@vger.kernel.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-09 | filemap: Cache the value of vm_flags | Matthew Wilcox (Oracle) | 1 | -4/+5
After we have unlocked the mmap_lock for I/O, the file is pinned, but the VMA is not. Checking this flag after that can be a use-after-free. It's not a terribly interesting use-after-free as it can only read one bit, and it's used to decide whether to read 2MB or 4MB. But it upsets the automated tools and it's generally bad practice anyway, so let's fix it. Reported-by: syzbot+5b96d55e5b54924c77ad@syzkaller.appspotmail.com Fixes: 4687fdbb805a ("mm/filemap: Support VM_HUGEPAGE for file mappings") Cc: stable@vger.kernel.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-09 | filemap: Don't release a locked folio | Matthew Wilcox (Oracle) | 1 | -0/+2
We must hold a reference over the call to filemap_release_folio(), otherwise the page cache will put the last reference to the folio before we unlock it, leading to splats like this:

 BUG: Bad page state in process u8:5 pfn:1ab1f4
 page:ffffea0006ac7d00 refcount:0 mapcount:0 mapping:0000000000000000 index:0x28b1de pfn:0x1ab1f4
 flags: 0x17ff80000040001(locked|reclaim|node=0|zone=2|lastcpupid=0xfff)
 raw: 017ff80000040001 dead000000000100 dead000000000122 0000000000000000
 raw: 000000000028b1de 0000000000000000 00000000ffffffff 0000000000000000
 page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set

It's an error path, so it doesn't see much testing.

Reported-by: Darrick J. Wong <djwong@kernel.org>
Fixes: a42634a6c07d ("readahead: Use a folio in read_pages()")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-06 | Merge tag 'mm-hotfixes-stable-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm | Linus Torvalds | 4 | -32/+32
Pull mm hotfixes from Andrew Morton:
"Fixups for various recently-added and longer-term issues and a few minor tweaks:
- fixes for material merged during this merge window
- cc:stable fixes for more longstanding issues
- minor mailmap and MAINTAINERS updates"

* tag 'mm-hotfixes-stable-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/oom_kill.c: fix vm_oom_kill_table[] ifdeffery
  x86/kexec: fix memory leak of elf header buffer
  mm/memremap: fix missing call to untrack_pfn() in pagemap_range()
  mm: page_isolation: use compound_nr() correctly in isolate_single_pageblock()
  mm: hugetlb_vmemmap: fix CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON
  MAINTAINERS: add maintainer information for z3fold
  mailmap: update Josh Poimboeuf's email
2022-06-06 | Merge tag 'mm-nonmm-stable-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm | Linus Torvalds | 2 | -0/+16
Pull delay-accounting update from Andrew Morton:
"A single featurette for delay accounting.

Delayed a bit because, unusually, it had dependencies on both the mm-stable and mm-nonmm-stable queues"

* tag 'mm-nonmm-stable-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  delayacct: track delays from write-protect copy
2022-06-05 | Merge tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linux | Linus Torvalds | 1 | -2/+2
Pull bitmap updates from Yury Norov:
- bitmap: optimize bitmap_weight() usage, from me
- lib/bitmap.c make bitmap_print_bitmask_to_buf parseable, from Mauro Carvalho Chehab
- include/linux/find: Fix documentation, from Anna-Maria Behnsen
- bitmap: fix conversion from/to fix-sized arrays, from me
- bitmap: Fix return values to be unsigned, from Kees Cook

It has been in linux-next for at least a week with no problems.

* tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linux: (31 commits)
  nodemask: Fix return values to be unsigned
  bitmap: Fix return values to be unsigned
  KVM: x86: hyper-v: replace bitmap_weight() with hweight64()
  KVM: x86: hyper-v: fix type of valid_bank_mask
  ia64: cleanup remove_siblinginfo()
  drm/amd/pm: use bitmap_{from,to}_arr32 where appropriate
  KVM: s390: replace bitmap_copy with bitmap_{from,to}_arr64 where appropriate
  lib/bitmap: add test for bitmap_{from,to}_arr64
  lib: add bitmap_{from,to}_arr64
  lib/bitmap: extend comment for bitmap_(from,to)_arr32()
  include/linux/find: Fix documentation
  lib/bitmap.c make bitmap_print_bitmask_to_buf parseable
  MAINTAINERS: add cpumask and nodemask files to BITMAP_API
  arch/x86: replace nodes_weight with nodes_empty where appropriate
  mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate
  clocksource: replace cpumask_weight with cpumask_empty in clocksource.c
  genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate
  irq: mips: replace cpumask_weight with cpumask_empty where appropriate
  drm/i915/pmu: replace cpumask_weight with cpumask_empty where appropriate
  arch/x86: replace cpumask_weight with cpumask_empty where appropriate
  ...
2022-06-03 | mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate | Yury Norov | 1 | -2/+2
mm/vmstat.c code calls cpumask_weight() to check if any bit of a given cpumask is set. We can do it more efficiently with cpumask_empty() because cpumask_empty() stops traversing the cpumask as soon as it finds first set bit, while cpumask_weight() counts all bits unconditionally. Signed-off-by: Yury Norov <yury.norov@gmail.com> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
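The transformation, in a hedged sketch (the surrounding function is a stand-in; any call site that only needs to know whether a cpumask is non-empty follows the same pattern):

    static void sketch_refresh(const struct cpumask *mask)
    {
            /* Before: `if (cpumask_weight(mask) == 0)` counted every bit. */
            if (cpumask_empty(mask))    /* stops at the first set bit */
                    return;

            /* ... do the per-CPU work ... */
    }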