
Commit 04f13d2

yangge authored and committed
mm: replace free hugepage folios after migration
My machine has 4 NUMA nodes, each equipped with 32GB of memory. I have configured each NUMA node with 16GB of CMA and 16GB of in-use hugetlb pages. The allocation of contiguous memory via cma_alloc() can fail probabilistically.

When there are free hugetlb folios in the hugetlb pool, the migration of in-use hugetlb folios allocates new folios from the free hugetlb pool. After the migration is completed, the old folios are released back to the free hugetlb pool instead of being returned to the buddy system. This can cause the test_pages_isolated() check to fail, ultimately leading to the failure of cma_alloc().

Call trace:

  cma_alloc()
      __alloc_contig_migrate_range()          // migrate in-use hugepage
      test_pages_isolated()
          __test_page_isolated_in_pageblock()
              PageBuddy(page)                 // check if the page is in buddy

To address this issue, we introduce a function named replace_free_hugepage_folios(). This function replaces a hugepage in the free hugepage pool with a new one and releases the old one to the buddy system. After the migration of in-use hugetlb pages is completed, we invoke replace_free_hugepage_folios() to ensure that these hugepages are properly released to the buddy system. Following this step, when test_pages_isolated() is executed for inspection, it will pass.

Additionally, when alloc_contig_range() is used to migrate multiple in-use hugetlb pages, some in-use hugetlb pages can be released back to the free hugetlb pool and subsequently be reallocated and used again. For example:

  [huge 0] [huge 1]

To migrate huge 0, we obtain huge x from the pool. After the migration is completed, we return the now-freed huge 0 back to the pool. When it is time to migrate huge 1, we can simply reuse the now-freed huge 0 from the pool. As a result, when replace_free_hugepage_folios() is executed, it cannot release huge 0 back to the buddy system. To address this issue, we also prevent the reuse of isolated free hugepages during the migration process.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: yangge <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Barry Song <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: SeongJae Park <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
1 parent: 424d0e5

File tree: 3 files changed (+60, -1 lines changed)


include/linux/hugetlb.h

Lines changed: 7 additions & 0 deletions
@@ -681,6 +681,7 @@ struct huge_bootmem_page {
 };
 
 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
+int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 				unsigned long addr, int avoid_reserve);
 struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
@@ -1059,6 +1060,12 @@ static inline int isolate_or_dissolve_huge_page(struct page *page,
 	return -ENOMEM;
 }
 
+static inline int replace_free_hugepage_folios(unsigned long start_pfn,
+					       unsigned long end_pfn)
+{
+	return 0;
+}
+
 static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 						unsigned long addr,
 						int avoid_reserve)

mm/hugetlb.c

Lines changed: 42 additions & 0 deletions
@@ -48,6 +48,7 @@
 #include <linux/page_owner.h>
 #include "internal.h"
 #include "hugetlb_vmemmap.h"
+#include <linux/page-isolation.h>
 
 int hugetlb_max_hstate __read_mostly;
 unsigned int default_hstate_idx;
@@ -1336,6 +1337,9 @@ static struct folio *dequeue_hugetlb_folio_node_exact(struct hstate *h,
 		if (folio_test_hwpoison(folio))
 			continue;
 
+		if (is_migrate_isolate_page(&folio->page))
+			continue;
+
 		list_move(&folio->lru, &h->hugepage_activelist);
 		folio_ref_unfreeze(folio, 1);
 		folio_clear_hugetlb_freed(folio);
@@ -2975,6 +2979,44 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
 	return ret;
 }
 
+/*
+ * replace_free_hugepage_folios - Replace free hugepage folios in a given pfn
+ * range with new folios.
+ * @start_pfn: start pfn of the given pfn range
+ * @end_pfn: end pfn of the given pfn range
+ *
+ * Returns 0 on success, otherwise negated error.
+ */
+int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn)
+{
+	struct hstate *h;
+	struct folio *folio;
+	int ret = 0;
+
+	LIST_HEAD(isolate_list);
+
+	while (start_pfn < end_pfn) {
+		folio = pfn_folio(start_pfn);
+		if (folio_test_hugetlb(folio)) {
+			h = folio_hstate(folio);
+		} else {
+			start_pfn++;
+			continue;
+		}
+
+		if (!folio_ref_count(folio)) {
+			ret = alloc_and_dissolve_hugetlb_folio(h, folio,
+							       &isolate_list);
+			if (ret)
+				break;
+
+			putback_movable_pages(&isolate_list);
+		}
+		start_pfn++;
+	}
+
+	return ret;
+}
+
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 				  unsigned long addr, int avoid_reserve)
 {

mm/page_alloc.c

Lines changed: 11 additions & 1 deletion
@@ -6504,7 +6504,17 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
 	ret = __alloc_contig_migrate_range(&cc, start, end, migratetype);
 	if (ret && ret != -EBUSY)
 		goto done;
-	ret = 0;
+
+	/*
+	 * When in-use hugetlb pages are migrated, they may simply be released
+	 * back into the free hugepage pool instead of being returned to the
+	 * buddy system. After the migration of in-use huge pages is completed,
+	 * we will invoke replace_free_hugepage_folios() to ensure that these
+	 * hugepages are properly released to the buddy system.
+	 */
+	ret = replace_free_hugepage_folios(start, end);
+	if (ret)
+		goto done;
 
 	/*
 	 * Pages from [start, end) are within a pageblock_nr_pages
