summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorMichal Hocko <mhocko@suse.com>2018-12-28 11:38:29 +0300
committerLinus Torvalds <torvalds@linux-foundation.org>2018-12-28 23:11:50 +0300
commita85009c377929d119fad5a75a16c4b7198946603 (patch)
tree4c014c365e643ed47de58fb3473afcbbf979cdd5
parenta1400af755631f5267f7cc3d0fda5ba72f58d7d3 (diff)
downloadlinux-a85009c377929d119fad5a75a16c4b7198946603.tar.xz
mm, memory_hotplug: try to migrate full pfn range
Patch series "few memory offlining enhancements". I have been chasing memory offlining not making progress recently. On the way I have noticed few weird decisions in the code. The migration itself is restricted without a reasonable justification and the retry loop around the migration is quite messy. This is addressed by patch 1 and patch 2. Patch 3 is targeting on the faultaround code which has been a hot candidate for the initial issue reported upstream [2] and that I am debugging internally. It turned out to be not the main contributor in the end but I believe we should address it regardless. See the patch description for more details. [1] http://lkml.kernel.org/r/20181120134323.13007-1-mhocko@kernel.org [2] http://lkml.kernel.org/r/20181114070909.GB2653@MiWiFi-R3L-srv This patch (of 3): do_migrate_range has been limiting the number of pages to migrate to 256 for some reason which is not documented. Even if the limit made some sense back then when it was introduced it doesn't really serve a good purpose these days. If the range contains huge pages then we break out of the loop too early and go through LRU and pcp caches draining and scan_movable_pages is quite suboptimal. The only reason to limit the number of pages I can think of is to reduce the potential time to react on the fatal signal. But even then the number of pages is a questionable metric because even a single page migration might block in a non-killable state (e.g. __unmap_and_move). Remove the limit and offline the full requested range (this is one memblock worth of pages with the current code). Should we ever get a report that offlining takes too long to react on fatal signal then we should rather fix the core migration to use killable waits and bailout on a signal. Link: http://lkml.kernel.org/r/20181211142741.2607-1-mhocko@kernel.org Link: http://lkml.kernel.org/r/20181211142741.2607-2-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: William Kucharski <william.kucharski@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-rw-r--r--mm/memory_hotplug.c8
1 files changed, 2 insertions, 6 deletions
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 37923f81bfeb..2d589d9f9f9e 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1339,18 +1339,16 @@ static struct page *new_node_page(struct page *page, unsigned long private)
return new_page_nodemask(page, nid, &nmask);
}
-#define NR_OFFLINE_AT_ONCE_PAGES (256)
static int
do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
{
unsigned long pfn;
struct page *page;
- int move_pages = NR_OFFLINE_AT_ONCE_PAGES;
int not_managed = 0;
int ret = 0;
LIST_HEAD(source);
- for (pfn = start_pfn; pfn < end_pfn && move_pages > 0; pfn++) {
+ for (pfn = start_pfn; pfn < end_pfn; pfn++) {
if (!pfn_valid(pfn))
continue;
page = pfn_to_page(pfn);
@@ -1362,8 +1360,7 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
ret = -EBUSY;
break;
}
- if (isolate_huge_page(page, &source))
- move_pages -= 1 << compound_order(head);
+ isolate_huge_page(page, &source);
continue;
} else if (PageTransHuge(page))
pfn = page_to_pfn(compound_head(page))
@@ -1397,7 +1394,6 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
if (!ret) { /* Success */
put_page(page);
list_add_tail(&page->lru, &source);
- move_pages--;
if (!__PageMovable(page))
inc_node_page_state(page, NR_ISOLATED_ANON +
page_is_file_cache(page));