summaryrefslogtreecommitdiff
path: root/mm
AgeCommit message (Collapse)AuthorFilesLines
2022-10-04hugetlbfs: revert use i_mmap_rwsem for more pmd sharing synchronizationMike Kravetz3-81/+15
Commit c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") added code to take i_mmap_rwsem in read mode for the duration of fault processing. However, this has been shown to cause performance/scaling issues. Revert the code and go back to only taking the semaphore in huge_pmd_share during the fault path. Keep the code that takes i_mmap_rwsem in write mode before calling try_to_unmap as this is required if huge_pmd_unshare is called. NOTE: Reverting this code does expose the following race condition. Faulting thread Unsharing thread ... ... ptep = huge_pte_offset() or ptep = huge_pte_alloc() ... i_mmap_lock_write lock page table ptep invalid <------------------------ huge_pmd_unshare() Could be in a previously unlock_page_table sharing process or worse i_mmap_unlock_write ... ptl = huge_pte_lock(ptep) get/update pte set_pte_at(pte, ptep) It is unknown if the above race was ever experienced by a user. It was discovered via code inspection when initially addressed. In subsequent patches, a new synchronization mechanism will be added to coordinate pmd sharing and eliminate this race. Link: https://lkml.kernel.org/r/20220914221810.95771-3-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: James Houghton <jthoughton@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Xu <peterx@redhat.com> Cc: Prakash Sangappa <prakash.sangappa@oracle.com> Cc: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04hugetlbfs: revert use i_mmap_rwsem to address page fault/truncate raceMike Kravetz1-11/+11
Patch series "hugetlb: Use new vma lock for huge pmd sharing synchronization", v2. hugetlb fault scalability regressions have recently been reported [1]. This is not the first such report, as regressions were also noted when commit c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") was added [2] in v5.7. At that time, a proposal to address the regression was suggested [3] but went nowhere. The regression and benefit of this patch series is not evident when using the vm_scalability benchmark reported in [2] on a recent kernel. Results from running, "./usemem -n 48 --prealloc --prefault -O -U 3448054972" 48 sample Avg next-20220913 next-20220913 next-20220913 unmodified revert i_mmap_sema locking vma sema locking, this series ----------------------------------------------------------------------------- 498150 KB/s 501934 KB/s 504793 KB/s The recent regression report [1] notes page fault and fork latency of shared hugetlb mappings. To measure this, I created two simple programs: 1) map a shared hugetlb area, write fault all pages, unmap area Do this in a continuous loop to measure faults per second 2) map a shared hugetlb area, write fault a few pages, fork and exit Do this in a continuous loop to measure forks per second These programs were run on a 48 CPU VM with 320GB memory. The shared mapping size was 250GB. For comparison, a single instance of the program was run. Then, multiple instances were run in parallel to introduce lock contention. Changing the locking scheme results in a significant performance benefit. test instances unmodified revert vma -------------------------------------------------------------------------- faults per sec 1 393043 395680 389932 faults per sec 24 71405 81191 79048 forks per sec 1 2802 2747 2725 forks per sec 24 439 536 500 Combined faults 24 1621 68070 53662 Combined forks 24 358 67 142 Combined test is when running both faulting program and forking program simultaneously. Patches 1 and 2 of this series revert c0d0381ade79 and 87bf91d39bb5 which depends on c0d0381ade79. Acquisition of i_mmap_rwsem is still required in the fault path to establish pmd sharing, so this is moved back to huge_pmd_share. With c0d0381ade79 reverted, this race is exposed: Faulting thread Unsharing thread ... ... ptep = huge_pte_offset() or ptep = huge_pte_alloc() ... i_mmap_lock_write lock page table ptep invalid <------------------------ huge_pmd_unshare() Could be in a previously unlock_page_table sharing process or worse i_mmap_unlock_write ... ptl = huge_pte_lock(ptep) get/update pte set_pte_at(pte, ptep) Reverting 87bf91d39bb5 exposes races in page fault/file truncation. When the new vma lock is put to use in patch 8, this will handle the fault/file truncation races. This is explained in patch 9 where code associated with these races is cleaned up. Patches 3 - 5 restructure existing code in preparation for using the new vma lock (rw semaphore) for pmd sharing synchronization. The idea is that this semaphore will be held in read mode for the duration of fault processing, and held in write mode for unmap operations which may call huge_pmd_unshare. Acquiring i_mmap_rwsem is also still required to synchronize huge pmd sharing. However it is only required in the fault path when setting up sharing, and will be acquired in huge_pmd_share(). Patch 6 adds the new vma lock and all supporting routines, but does not actually change code to use the new lock. Patch 7 refactors code in preparation for using the new lock. And, patch 8 finally adds code to make use of this new vma lock. Unfortunately, the fault code and truncate/hole punch code would naturally take locks in the opposite order which could lead to deadlock. Since the performance of page faults is more important, the truncation/hole punch code is modified to back out and take locks in the correct order if necessary. [1] https://lore.kernel.org/linux-mm/43faf292-245b-5db5-cce9-369d8fb6bd21@infradead.org/ [2] https://lore.kernel.org/lkml/20200622005551.GK5535@shao2-debian/ [3] https://lore.kernel.org/linux-mm/20200706202615.32111-1-mike.kravetz@oracle.com/ This patch (of 9): Commit c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") added code to take i_mmap_rwsem in read mode for the duration of fault processing. The use of i_mmap_rwsem to prevent fault/truncate races depends on this. However, this has been shown to cause performance/scaling issues. As a result, that code will be reverted. Since the use i_mmap_rwsem to address page fault/truncate races depends on this, it must also be reverted. In a subsequent patch, code will be added to detect the fault/truncate race and back out operations as required. Link: https://lkml.kernel.org/r/20220914221810.95771-1-mike.kravetz@oracle.com Link: https://lkml.kernel.org/r/20220914221810.95771-2-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: James Houghton <jthoughton@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Xu <peterx@redhat.com> Cc: Prakash Sangappa <prakash.sangappa@oracle.com> Cc: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/hugetlb: remove unnecessary 'NULL' values from pointerXU pengfei1-2/+2
Pointer variables allocate memory first, and then judge. There is no need to initialize the assignment. Link: https://lkml.kernel.org/r/20220914012113.6271-1-xupengfei@nfschina.com Signed-off-by: XU pengfei <xupengfei@nfschina.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/filemap: make folio_put_wait_locked staticKe Sun1-1/+1
It's only used in mm/filemap.c, since commit <ffa65753c431> ("mm/migrate.c: rework migration_entry_wait() to not take a pageref"). Make it static. Link: https://lkml.kernel.org/r/20220914021738.3228011-1-sunke@kylinos.cn Signed-off-by: Ke Sun <sunke@kylinos.cn> Reported-by: k2ci <kernel-bot@kylinos.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm: hugetlb: eliminate memory-less nodes handlingMuchun Song1-41/+29
The memory-notify-based approach aims to handle meory-less nodes, however, it just adds the complexity of code as pointed by David in thread [1]. The handling of memory-less nodes is introduced by commit 4faf8d950ec4 ("hugetlb: handle memory hot-plug events"). >From its commit message, we cannot find any necessity of handling this case. So, we can simply register/unregister sysfs entries in register_node/unregister_node to simlify the code. BTW, hotplug callback added because in hugetlb_register_all_nodes() we register sysfs nodes only for N_MEMORY nodes, seeing commit 9b5e5d0fdc91, which said it was a preparation for handling memory-less nodes via memory hotplug. Since we want to remove memory hotplug, so make sure we only register per-node sysfs for online (N_ONLINE) nodes in hugetlb_register_all_nodes(). https://lore.kernel.org/linux-mm/60933ffc-b850-976c-78a0-0ee6e0ea9ef0@redhat.com/ [1] Link: https://lkml.kernel.org/r/20220914072603.60293-3-songmuchun@bytedance.com Suggested-by: David Hildenbrand <david@redhat.com> Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm: hugetlb: simplify per-node sysfs creation and removalMuchun Song1-12/+23
Patch series "simplify handling of per-node sysfs creation and removal", v4. This patch (of 2): The following commit offload per-node sysfs creation and removal to a kworker and did not say why it is needed. And it also said "I don't know that this is absolutely required". It seems like the author was not sure as well. Since it only complicates the code, this patch will revert the changes to simplify the code. 39da08cb074c ("hugetlb: offload per node attribute registrations") We could use memory hotplug notifier to do per-node sysfs creation and removal instead of inserting those operations to node registration and unregistration. Then, it can reduce the code coupling between node.c and hugetlb.c. Also, it can simplify the code. Link: https://lkml.kernel.org/r/20220914072603.60293-1-songmuchun@bytedance.com Link: https://lkml.kernel.org/r/20220914072603.60293-2-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/mempolicy: use PAGE_ALIGN instead of open-coding itze zuo1-2/+2
Replace the simple calculation with PAGE_ALIGN. Link: https://lkml.kernel.org/r/20220913015505.1998958-1-zuoze1@huawei.com Signed-off-by: ze zuo <zuoze1@huawei.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/page_alloc.c: document bulkfree_pcp_prepare() return valueAndrew Morton1-0/+1
Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: ke.wang <ke.wang@unisoc.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Zhaoyang Huang <huangzhaoyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/page_alloc.c: rename check_free_page() to free_page_is_bad()Andrew Morton1-10/+10
The name "check_free_page()" provides no information regarding its return value when the page is indeed found to be bad. Renaming it to "free_page_is_bad()" makes it clear that a `true' return value means the page was bad. And make it return a bool, not an int. [akpm@linux-foundation.org: don't use bool as int] Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: ke.wang <ke.wang@unisoc.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Zhaoyang Huang <huangzhaoyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/memcontrol: use kstrtobool for swapaccount param parsingLiu Shixin1-4/+4
Use kstrtobool which is more powerful to handle all kinds of parameters like 'Yy1Nn0' or [oO][NnFf] for "on" and "off". Link: https://lkml.kernel.org/r/20220913071358.1812206-1-liushixin2@huawei.com Signed-off-by: Liu Shixin <liushixin2@huawei.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/core: simplify the kdamond stop mechanism by removing 'done'Kaixu Xia1-15/+9
When the 'kdamond_wait_activation()' function or 'after_sampling()' or 'after_aggregation()' DAMON callbacks return an error, it is unnecessary to use bool 'done' to check if kdamond should be finished. This commit simplifies the kdamond stop mechanism by removing 'done' and break the while loop directly in the cases. Link: https://lkml.kernel.org/r/1663060287-30201-4-git-send-email-kaixuxia@tencent.com Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/sysfs: simplify the variable 'pid' assignment operationKaixu Xia1-7/+4
We can initialize the variable 'pid' with '-1' in pid_show() to simplify the variable assignment operation and make the code more readable. Link: https://lkml.kernel.org/r/1663060287-30201-3-git-send-email-kaixuxia@tencent.com Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon: simplify the parameter passing for 'prepare_access_checks'Kaixu Xia2-6/+5
Patch series "mm/damon: code simplifications and cleanups". This patchset contains some code simplifications and cleanups for DAMON. This patch (of 4): The parameter 'struct damon_ctx *ctx' isn't used in the functions __damon_{p,v}a_prepare_access_check(), so we can remove it and simplify the parameter passing. Link: https://lkml.kernel.org/r/1663060287-30201-1-git-send-email-kaixuxia@tencent.com Link: https://lkml.kernel.org/r/1663060287-30201-2-git-send-email-kaixuxia@tencent.com Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/lru_sort: deduplicate hot/cold schemes generatorsSeongJae Park1-24/+21
damon_lru_sort_new_{hot,cold}_scheme() have quite a lot of duplicates. This commit factors out the duplicate to a separate function and use it for reducing the duplicate. Link: https://lkml.kernel.org/r/20220913174449.50645-23-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/lru_sort: use quotas param generatorSeongJae Park1-51/+19
This commit makes DAMON_LRU_SORT to generate the module parameters for DAMOS watermarks using the generator macro to simplify the code and reduce duplicates. Link: https://lkml.kernel.org/r/20220913174449.50645-22-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/reclaim: use the quota params generator macroSeongJae Park1-52/+12
This commit makes DAMON_RECLAIM to generate the module parameters for DAMOS quotas using the generator macro to simplify the code and reduce duplicates. Link: https://lkml.kernel.org/r/20220913174449.50645-21-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/modules-common: implement damos time quota params generatorSeongJae Park1-2/+5
DAMON_LRU_SORT have module parameters for DAMOS time quota only but size quota. This commit implements a macro for generating the module parameters so that we can reuse later. Link: https://lkml.kernel.org/r/20220913174449.50645-20-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/modules-common: implement a damos quota params generatorSeongJae Park1-1/+7
DAMON_RECLAIM and DAMON_LRU_SORT have module parameters for DAMOS quotas that having same names. This commit implements a macro for generating such module parameters so that we can reuse later. Link: https://lkml.kernel.org/r/20220913174449.50645-19-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/lru_sort: use stat generatorSeongJae Park1-71/+12
This commit makes DAMON_LRU_SORT to generate the module parameters for DAMOS statistics using the generator macro to simplify the code and reduce duplicates. Link: https://lkml.kernel.org/r/20220913174449.50645-18-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/reclaim: use stat parameters generatorSeongJae Park1-36/+5
This commit makes DAMON_RECLAIM to generate the module parameters for DAMOS statistics using the generator macro to simplify the code and reduce duplicates. Link: https://lkml.kernel.org/r/20220913174449.50645-17-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/modules-common: implement a stats parameters generator macroSeongJae Park1-0/+12
DAMON_RECLAIM and DAMON_LRU_SORT have module parameters for DAMOS statistics that having same names. This commit implements a macro for generating such module parameters so that we can reuse later. Link: https://lkml.kernel.org/r/20220913174449.50645-16-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/reclaim: use watermarks parameters generator macroSeongJae Park1-47/+9
This commit makes DAMON_RECLAIM to generate the module parameters for DAMOS watermarks using the generator macro to simplify the code and reduce duplicates. Link: https://lkml.kernel.org/r/20220913174449.50645-15-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/lru_sort: use watermarks parameters generator macroSeongJae Park2-56/+12
This commit makes DAMON_LRU_SORT to generate the module parameters for DAMOS watermarks using the generator macro to simplify the code and reduce duplicates. Link: https://lkml.kernel.org/r/20220913174449.50645-14-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/modules-common: implement a watermarks module parameters generator ↵SeongJae Park1-0/+7
macro DAMON_RECLAIM and DAMON_LRU_SORT have module parameters for watermarks that having same names. This commit implements a macro for generating such module parameters so that we can reuse later. Link: https://lkml.kernel.org/r/20220913174449.50645-13-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/reclaim: use monitoring attributes parameters generator macroSeongJae Park1-42/+5
This commit makes DAMON_RECLAIM to generate the module parameters for DAMON monitoring attributes using the generator macro to simplify the code and reduce duplicates. Link: https://lkml.kernel.org/r/20220913174449.50645-12-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/lru_sort: use monitoring attributes parameters generaotr macroSeongJae Park1-42/+5
This commit makes DAMON_LRU_SORT to generate the module parameters for DAMON monitoring attributes using the generator macro to simplify the code and reduce duplicates. Link: https://lkml.kernel.org/r/20220913174449.50645-11-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon: implement a monitoring attributes module parameters generator macroSeongJae Park1-0/+18
DAMON_RECLAIM and DAMON_LRU_SORT have module parameters for monitoring attributes that having same names. This commot implements a macro for generating such module parameters so that we can reuse later. Link: https://lkml.kernel.org/r/20220913174449.50645-10-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/lru_sort: use 'struct damon_attrs' for storing parameters for itSeongJae Park1-19/+21
DAMON_LRU_SORT receives monitoring attributes by parameters one by one to separate variables, and then combines those into 'struct damon_attrs'. This commit makes the module directly stores the parameter values to a static 'struct damon_attrs' variable and use it to simplify the code. Link: https://lkml.kernel.org/r/20220913174449.50645-9-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/reclaim: use 'struct damon_attrs' for storing parameters for itSeongJae Park1-17/+19
DAMON_RECLAIM receives monitoring attributes by parameters one by one to separate variables, and then combine those into 'struct damon_attrs'. This commit makes the module directly stores the parameter values to a static 'struct damon_attrs' variable and use it to simplify the code. Link: https://lkml.kernel.org/r/20220913174449.50645-8-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/core: reduce parameters for damon_set_attrs()SeongJae Park5-27/+35
Number of parameters for 'damon_set_attrs()' is six. As it could be confusing and verbose, this commit reduces the number by receiving single pointer to a 'struct damon_attrs'. Link: https://lkml.kernel.org/r/20220913174449.50645-7-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/core: use a dedicated struct for monitoring attributesSeongJae Park4-24/+24
DAMON monitoring attributes are directly defined as fields of 'struct damon_ctx'. This makes 'struct damon_ctx' a little long and complicated. This commit defines and uses a struct, 'struct damon_attrs', which is dedicated for only the monitoring attributes to make the purpose of the five values clearer and simplify 'struct damon_ctx'. Link: https://lkml.kernel.org/r/20220913174449.50645-6-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/core: factor out 'damos_quota' private fileds initializationSeongJae Park1-9/+14
The 'struct damos' creation function, 'damon_new_scheme()', does initialization of private fileds of 'struct damos_quota' in it. As its verbose and makes the function unnecessarily long, this commit factors it out to separate function. Link: https://lkml.kernel.org/r/20220913174449.50645-5-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/core: copy struct-to-struct instead of field-to-field in ↵SeongJae Park1-17/+4
damon_new_scheme() The function for new 'struct damos' creation, 'damon_new_scheme()', copies each field of the struct one by one, though it could simply copied via struct to struct. This commit replaces the unnecessarily verbose field-to-field copies with struct-to-struct copies to make code simple and short. Link: https://lkml.kernel.org/r/20220913174449.50645-4-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/paddr: deduplicate damon_pa_{mark_accessed,deactivate_pages}()SeongJae Park1-14/+12
The bodies of damon_pa_{mark_accessed,deactivate_pages}() contains duplicates. This commit factors out the common part to a separate function and removes the duplicates. Link: https://lkml.kernel.org/r/20220913174449.50645-3-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/paddr: make supported DAMOS actions of paddr clearSeongJae Park1-0/+3
Patch series "mm/damon: cleanup code". DAMON code was not so clean from the beginning, but it has been too much nowadays, especially due to the duplicates in DAMON_RECLAIM and DAMON_LRU_SORT. This patchset cleans some of the mess. This patch (of 22): The 'switch-case' statement in 'damon_va_apply_scheme()' function provides a 'case' for every supported DAMOS action while all not-yet-supported DAMOS actions fall through the 'default' case, and comment it so that people can easily know which actions are supported. Its counterpart in 'paddr', 'damon_pa_apply_scheme()', however, doesn't. This commit makes the 'paddr' side function follows the pattern of 'vaddr' for better readability and consistency. Link: https://lkml.kernel.org/r/20220913174449.50645-1-sj@kernel.org Link: https://lkml.kernel.org/r/20220913174449.50645-2-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon: simplify scheme create in damon_lru_sort_apply_parametersXin Hao1-6/+4
In damon_lru_sort_apply_parameters(), we can use damon_set_schemes() to replace the way of creating the first 'scheme' in original code, this makes the code look cleaner. Link: https://lkml.kernel.org/r/20220911005917.835-1-xhao@linux.alibaba.com Signed-off-by: Xin Hao <xhao@linux.alibaba.com> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon: improve damon_new_region strategyDawei Li1-2/+17
Kdamond is implemented as a periodical split-merge pattern, which will create and destroy regions possibly at high frequency (hundreds or even thousands of per sec), depending on the number of regions and aggregation period. In that case, kmalloc and kfree could bring speed and space overheads, which can be improved by using a private kmem cache. [set_pte_at@outlook.com: creating kmem cache for damon regions by KMEM_CACHE()] Link: https://lkml.kernel.org/r/Message-ID: Link: https://lkml.kernel.org/r/TYCP286MB2323DA1894FA55BB9CF90978CA449@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM Signed-off-by: Dawei Li <set_pte_at@outlook.com> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/sysfs: use the wrapper directly to check if the kdamond is runningKaixu Xia1-2/+1
We can use the 'damon_sysfs_kdamond_running()' wrapper directly to check if the kdamond is running in 'damon_sysfs_turn_damon_on()'. Link: https://lkml.kernel.org/r/1662995513-24489-1-git-send-email-kaixuxia@tencent.com Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: SeongJae Park <sj@kernel.org> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/sysfs: change few functions execute orderXin Hao1-10/+14
There's no need to run container_of() as early as we do. The compiler figures this out, but the resulting code is more readable. Link: https://lkml.kernel.org/r/20220908081932.77370-1-xhao@linux.alibaba.com Signed-off-by: Xin Hao <xhao@linux.alibaba.com> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/huge_memory: prevent THP_ZERO_PAGE_ALLOC increased twiceLiu Shixin1-1/+1
A user who reads THP_ZERO_PAGE_ALLOC may be more concerned about the huge zero pages that are really allocated for thp. It is misleading to increase THP_ZERO_PAGE_ALLOC twice if two threads call get_huge_zero_page concurrently. Don't increase the value if the huge page is not really used. Update Documentation/admin-guide/mm/transhuge.rst to suit. Link: https://lkml.kernel.org/r/20220909021653.3371879-1-liushixin2@huawei.com Signed-off-by: Liu Shixin <liushixin2@huawei.com> Cc: Alexander Potapenko <glider@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm: use nth_page instead of mem_map_offset mem_map_nextCheng Li3-51/+27
To handle the discontiguous case, mem_map_next() has a parameter named `offset`. As a function caller, one would be confused why "get next entry" needs a parameter named "offset". The other drawback of mem_map_next() is that the callers must take care of the map between parameter "iter" and "offset", otherwise we may get an hole or duplication during iteration. So we use nth_page instead of mem_map_next. And replace mem_map_offset with nth_page() per Matthew's comments. Link: https://lkml.kernel.org/r/1662708669-9395-1-git-send-email-lic121@chinatelecom.cn Signed-off-by: Cheng Li <lic121@chinatelecom.cn> Fixes: 69d177c2fc70 ("hugetlbfs: handle pages higher order than MAX_ORDER") Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon: remove duplicate get_monitoring_region() definitionsXin Hao3-70/+44
In lru_sort.c and reclaim.c, they are all defining get_monitoring_region() function, there is no need to define it separately. As 'get_monitoring_region()' is not a 'static' function anymore, we try to use a prefix to distinguish with other functions, so there rename it to 'damon_find_biggest_system_ram'. Link: https://lkml.kernel.org/r/20220909213606.136221-1-sj@kernel.org Signed-off-by: Xin Hao <xhao@linux.alibaba.com> Signed-off-by: SeongJae Park <sj@kernel.org> Suggested-by: SeongJae Park <sj@kernel.org> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm: kfence: convert to DEFINE_SEQ_ATTRIBUTELiu Shixin1-13/+2
Use DEFINE_SEQ_ATTRIBUTE helper macro to simplify the code. Link: https://lkml.kernel.org/r/20220909083140.3592919-1-liushixin2@huawei.com Signed-off-by: Liu Shixin <liushixin2@huawei.com> Reviewed-by: Marco Elver <elver@google.com> Tested-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04zsmalloc: use correct types in _first_obj_offset functionsAlexey Romanov1-4/+4
Since commit ffedd09fa9b0 ("zsmalloc: Stop using slab fields in struct page") we are using page->page_type (unsigned int) field instead of page->units (int) as first object offset in a subpage of zspage. So get_first_obj_offset() and set_first_obj_offset() functions should work with unsigned int type. Link: https://lkml.kernel.org/r/20220909083722.85024-1-avromanov@sberdevices.ru Fixes: ffedd09fa9b0 ("zsmalloc: Stop using slab fields in struct page") Signed-off-by: Alexey Romanov <avromanov@sberdevices.ru> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Alexey Romanov <avromanov@sberdevices.ru> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/shuffle: convert module_param_call to module_param_cbLiu Shixin1-11/+10
module_param_call is now completely consistent with module_param_cb, so there is no need to keep two macros. Convert module_param_call to module_param_cb since former is obsolete and latter is more kernel-ish. Link: https://lkml.kernel.org/r/20220909083947.3595610-1-liushixin2@huawei.com Signed-off-by: Liu Shixin <liushixin2@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Liu Shixin <liushixin2@huawei.com> Cc: Paul Russel <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/Kconfig: notify debugfs deprecation planSeongJae Park1-0/+3
Commit b18402726bd1 ("Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface") announced the DAMON debugfs interface deprecation plan, but it is not so aggressively announced. As the deprecation time is coming, this commit makes the announce more easy to be found by adding the note to the config menu of DAMON debugfs interface. Link: https://lkml.kernel.org/r/20220909202901.57977-6-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <shuah@kernel.org> Cc: Yun Levi <ppbuk5246@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/core-test: test damon_set_regionsSeongJae Park1-0/+23
Preceding commit fixes a bug in 'damon_set_regions()', which allows holes in the new monitoring target ranges. This commit adds a kunit test case for the problem to avoid any regression. Link: https://lkml.kernel.org/r/20220909202901.57977-4-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <shuah@kernel.org> Cc: Yun Levi <ppbuk5246@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/core: avoid holes in newly set monitoring target rangesSeongJae Park1-0/+30
When there are two or more non-contiguous regions intersecting with given new ranges, 'damon_set_regions()' does not fill the holes. This commit makes the function to fill the holes with newly created regions. [sj@kernel.org: handle error from 'damon_fill_regions_holes()'] Link: https://lkml.kernel.org/r/20220913215420.57761-1-sj@kernel.org Link: https://lkml.kernel.org/r/20220909202901.57977-3-sj@kernel.org Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces") Signed-off-by: SeongJae Park <sj@kernel.org> Reported-by: Yun Levi <ppbuk5246@gmail.com> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04tmpfs: add support for an i_version counterJeff Layton1-3/+28
NFSv4 mandates a change attribute to avoid problems with timestamp granularity, which Linux implements using the i_version counter. This is particularly important when the underlying filesystem is fast. Give tmpfs an i_version counter. Since it doesn't have to be persistent, we can just turn on SB_I_VERSION and sprinkle some inode_inc_iversion calls in the right places. Also, while there is no formal spec for xattrs, most implementations update the ctime on setxattr. Fix shmem_xattr_handler_set to update the ctime and bump the i_version appropriately. Link: https://lkml.kernel.org/r/20220909130031.15477-1-jlayton@kernel.org Signed-off-by: Jeff Layton <jlayton@kernel.org> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-04mm/damon/vaddr: add a comment for 'default' case in damon_va_apply_scheme()Kaixu Xia1-0/+3
The switch case 'DAMOS_STAT' and switch case 'default' have same return value in damon_va_apply_scheme(), and the 'default' case is for DAMOS actions that not supported by 'vaddr'. It might make sense to add a comment here. [akpm@linux-foundation.org: fx comment grammar] Link: https://lkml.kernel.org/r/1662606797-23534-1-git-send-email-kaixuxia@tencent.com Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>