Commit b30c592

davidhildenbrand authored and torvalds committed
mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory
Currently, it can happen that pages are allocated (and freed) via the buddy before we finished basic memory onlining.

For example, pages are exposed to the buddy and can be allocated before we actually mark the sections online. Allocated pages could suddenly fail pfn_to_online_page() checks. We had similar issues with pcp handling, when pages are allocated+freed before we reach zone_pcp_update() in online_pages() [1].

Instead, mark all pageblocks MIGRATE_ISOLATE, such that allocations are impossible. Once done with the heavy lifting, use undo_isolate_page_range() to move the pages to the MIGRATE_MOVABLE freelist, marking them ready for allocation. Similar to offline_pages(), we have to manually adjust zone->nr_isolate_pageblock.

[1] https://lkml.kernel.org/r/[email protected]

Signed-off-by: David Hildenbrand <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Charan Teja Reddy <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: Logan Gunthorpe <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michel Lespinasse <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Tony Luck <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
1 parent d882c00 commit b30c592
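
The ordering described in the commit message can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions: `toy_zone`, `toy_alloc_block()`, `toy_online_begin()` and `toy_online_finish()` are hypothetical names invented for this illustration and are not kernel APIs. The real allocator, zone locking, and freelists are far more involved. The sketch shows only that blocks exposed as isolated cannot be handed out until the final "undo isolation" step runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration; none of these are kernel types. */
enum toy_migratetype { TOY_MOVABLE, TOY_ISOLATE };

#define TOY_MAX_BLOCKS 8

struct toy_zone {
	enum toy_migratetype block_type[TOY_MAX_BLOCKS];
	bool block_present[TOY_MAX_BLOCKS];	/* block was handed to the zone */
	bool sections_online;			/* models "sections marked online" */
	size_t nr_isolate_pageblock;		/* models the zone counter */
};

/* Return the index of an allocatable block, or -1 if there is none. */
static int toy_alloc_block(const struct toy_zone *z)
{
	for (size_t i = 0; i < TOY_MAX_BLOCKS; i++)
		if (z->block_present[i] && z->block_type[i] != TOY_ISOLATE)
			return (int)i;
	return -1;
}

/* Step 1: hand blocks to the zone as isolated and fix up the counter. */
static void toy_online_begin(struct toy_zone *z, size_t first, size_t count)
{
	assert(first + count <= TOY_MAX_BLOCKS);
	for (size_t i = first; i < first + count; i++) {
		z->block_present[i] = true;
		z->block_type[i] = TOY_ISOLATE;
	}
	z->nr_isolate_pageblock += count;
}

/* Step 2: finish the bookkeeping first, then make the blocks allocatable. */
static void toy_online_finish(struct toy_zone *z, size_t first, size_t count)
{
	assert(first + count <= TOY_MAX_BLOCKS);
	z->sections_online = true;	/* the "heavy lifting" is done */
	for (size_t i = first; i < first + count; i++) {
		if (z->block_type[i] == TOY_ISOLATE) {
			z->block_type[i] = TOY_MOVABLE;
			z->nr_isolate_pageblock--;
		}
	}
}
```

Between `toy_online_begin()` and `toy_online_finish()`, `toy_alloc_block()` returns -1 for the new range, which is the window the commit closes. Afterwards the first onlined block becomes allocatable and the counter is back at its previous value.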

2 files changed, +23 -11 lines changed

mm/Kconfig

Lines changed: 1 addition & 1 deletion
@@ -152,6 +152,7 @@ config HAVE_BOOTMEM_INFO_NODE
 # eventually, we can have this option just 'select SPARSEMEM'
 config MEMORY_HOTPLUG
 	bool "Allow for memory hot-add"
+	select MEMORY_ISOLATION
 	depends on SPARSEMEM || X86_64_ACPI_NUMA
 	depends on ARCH_ENABLE_MEMORY_HOTPLUG
 	depends on 64BIT || BROKEN
@@ -178,7 +179,6 @@ config MEMORY_HOTPLUG_DEFAULT_ONLINE
 
 config MEMORY_HOTREMOVE
 	bool "Allow for memory hot remove"
-	select MEMORY_ISOLATION
 	select HAVE_BOOTMEM_INFO_NODE if (X86_64 || PPC64)
 	depends on MEMORY_HOTPLUG && ARCH_ENABLE_MEMORY_HOTREMOVE
 	depends on MIGRATION

mm/memory_hotplug.c

Lines changed: 22 additions & 10 deletions
@@ -813,7 +813,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 
 	/* associate pfn range with the zone */
 	zone = zone_for_pfn_range(online_type, nid, pfn, nr_pages);
-	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_MOVABLE);
+	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
 
 	arg.start_pfn = pfn;
 	arg.nr_pages = nr_pages;
@@ -824,6 +824,14 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 	if (ret)
 		goto failed_addition;
 
+	/*
+	 * Fixup the number of isolated pageblocks before marking the sections
+	 * onlining, such that undo_isolate_page_range() works correctly.
+	 */
+	spin_lock_irqsave(&zone->lock, flags);
+	zone->nr_isolate_pageblock += nr_pages / pageblock_nr_pages;
+	spin_unlock_irqrestore(&zone->lock, flags);
+
 	/*
 	 * If this zone is not populated, then it is not in zonelist.
 	 * This means the page allocator ignores this zone.
@@ -841,21 +849,25 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 	zone->zone_pgdat->node_present_pages += nr_pages;
 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
 
+	node_states_set_node(nid, &arg);
+	if (need_zonelists_rebuild)
+		build_all_zonelists(NULL);
+	zone_pcp_update(zone);
+
+	/* Basic onlining is complete, allow allocation of onlined pages. */
+	undo_isolate_page_range(pfn, pfn + nr_pages, MIGRATE_MOVABLE);
+
 	/*
 	 * When exposing larger, physically contiguous memory areas to the
 	 * buddy, shuffling in the buddy (when freeing onlined pages, putting
 	 * them either to the head or the tail of the freelist) is only helpful
 	 * for maintaining the shuffle, but not for creating the initial
 	 * shuffle. Shuffle the whole zone to make sure the just onlined pages
-	 * are properly distributed across the whole freelist.
+	 * are properly distributed across the whole freelist. Make sure to
+	 * shuffle once pageblocks are no longer isolated.
 	 */
 	shuffle_zone(zone);
 
-	node_states_set_node(nid, &arg);
-	if (need_zonelists_rebuild)
-		build_all_zonelists(NULL);
-	zone_pcp_update(zone);
-
 	init_per_zone_wmark_min();
 
 	kswapd_run(nid);
@@ -1577,9 +1589,9 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages)
 	pr_info("Offlined Pages %ld\n", nr_pages);
 
 	/*
-	 * Onlining will reset pagetype flags and makes migrate type
-	 * MOVABLE, so just need to decrease the number of isolated
-	 * pageblocks zone counter here.
+	 * The memory sections are marked offline, and the pageblock flags
+	 * effectively stale; nobody should be touching them. Fixup the number
+	 * of isolated pageblocks, memory onlining will properly revert this.
 	 */
 	spin_lock_irqsave(&zone->lock, flags);
 	zone->nr_isolate_pageblock -= nr_pages / pageblock_nr_pages;
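
The counter fixups in online_pages() and offline_pages() mirror each other. The following userspace sketch illustrates only the arithmetic and is not kernel code. `toy_counter` and the `toy_*` helpers are hypothetical names. `TOY_PAGEBLOCK_NR_PAGES` is an assumed value (512 pages per pageblock); the real `pageblock_nr_pages` depends on architecture and configuration. Undoing isolation is modelled here as one decrement per pageblock, which is why the bulk increment has to happen first.

```c
#include <assert.h>

/* Assumed pageblock size in pages; the real value is configuration dependent. */
#define TOY_PAGEBLOCK_NR_PAGES 512UL

/* Hypothetical stand-in for the zone's isolated-pageblock counter. */
struct toy_counter {
	unsigned long nr_isolate_pageblock;
};

/* Number of pageblocks covered by a pageblock-aligned range. */
static unsigned long toy_nr_pageblocks(unsigned long nr_pages)
{
	assert(nr_pages % TOY_PAGEBLOCK_NR_PAGES == 0);
	return nr_pages / TOY_PAGEBLOCK_NR_PAGES;
}

/* Onlining: account for all isolated pageblocks of the range in bulk. */
static void toy_online_fixup(struct toy_counter *z, unsigned long nr_pages)
{
	z->nr_isolate_pageblock += toy_nr_pageblocks(nr_pages);
}

/* Offlining: the range stays isolated, so drop its blocks in bulk. */
static void toy_offline_fixup(struct toy_counter *z, unsigned long nr_pages)
{
	assert(z->nr_isolate_pageblock >= toy_nr_pageblocks(nr_pages));
	z->nr_isolate_pageblock -= toy_nr_pageblocks(nr_pages);
}

/* Undoing isolation, modelled as one decrement per pageblock. */
static void toy_undo_isolate(struct toy_counter *z, unsigned long nr_pages)
{
	unsigned long blocks = toy_nr_pageblocks(nr_pages);

	for (unsigned long i = 0; i < blocks; i++) {
		assert(z->nr_isolate_pageblock > 0);	/* would underflow without the fixup */
		z->nr_isolate_pageblock--;
	}
}
```

For a 128 MiB range with 4 KiB pages (32768 pages), the sketch accounts for 64 pageblocks under the assumed pageblock size. Calling `toy_online_fixup()` followed by `toy_undo_isolate()` returns the counter to its starting value.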
