From 6cbdd9726fb50d749b06ab45a8ef81dff02e69b8 Mon Sep 17 00:00:00 2001 From: "Barry Song (Xiaomi)" Date: Tue, 26 May 2026 21:09:38 +0800 Subject: mm/mglru: use folio_mark_accessed to replace folio_set_active MGLRU gives high priority to folios mapped in page tables. As a result, folio_set_active() is invoked for all folios read during page faults. In practice, however, readahead can bring in many folios that are never accessed via page tables. A previous attempt by Lei Liu proposed introducing a separate LRU for readahead[1] to make readahead pages easier to reclaim, but that approach is likely over-engineered. Before commit 4d5d14a01e2c ("mm/mglru: rework workingset protection"), folios with PG_active were always placed in the youngest generation, leading to over-protection and increased refaults. After that commit, PG_active folios are placed in the second youngest generation, which is still too optimistic given the presence of readahead. In contrast, the classic active/inactive scheme is more conservative. This patch switches to using folio_mark_accessed() and begins prefaulted file folios from the second oldest generation instead of active generations. We should also adjust the following accordingly: - WORKINGSET_ACTIVATE: aligned with setting active for refaulted workingset folios; - lru_gen_folio_seq(): place (pre)faulted file folios into the second oldest generation; - promote second-scanned folios to workingset in folio_check_references(): we now have to depend on folio_lru_refs() > 1, since we previously relied on PG_referenced being set during the first scan, but PG_referenced is now set earlier. On x86, running a kernel build inside a memcg with a 1GB memory limit using 20 threads. w/o patch: real 1m50.764s user 25m32.305s sys 4m0.012s pswpin: 1333245 pswpout: 4366443 pgpgin: 6962592 pgpgout: 17780712 swpout_zero: 1019603 swpin_zero: 14764 refault_file: 287794 refault_anon: 1347963 w/ patch: real 1m48.879s user 25m29.224s sys 3m37.421s pswpin: 568480 pswpout: 2322657 pgpgin: 4073416 pgpgout: 9613408 swpout_zero: 593275 swpin_zero: 9118 refault_file: 262505 refault_anon: 577550 active/inactive LRU: real 1m49.928s user 25m28.196s sys 3m40.740s pswpin: 463452 pswpout: 2309119 pgpgin: 4438856 pgpgout: 9568628 swpout_zero: 743704 swpin_zero: 7244 refault_file: 562555 refault_anon: 470694 Lance and Xueyuan made a huge contribution to this patch through testing. Link: https://lore.kernel.org/20260526130938.66253-1-baohua@kernel.org Link: https://lore.kernel.org/linux-mm/20250916072226.220426-1-liulei.rjpt@vivo.com/ [1] Signed-off-by: Barry Song (Xiaomi) Tested-by: Lance Yang Tested-by: Xueyuan Chen Cc: Pedro Falcato Cc: Kairui Song Cc: Qi Zheng Cc: Shakeel Butt Cc: wangzicheng Cc: Suren Baghdasaryan Cc: Lei Liu Cc: Matthew Wilcox (Oracle) Cc: Axel Rasmussen Cc: Yuanchu Xie Cc: Wei Xu Cc: Will Deacon Cc: Kalesh Singh Signed-off-by: Andrew Morton --- include/linux/mm_inline.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'include/linux/mm_inline.h') diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h index a171070e15f0..a8430a7ae054 100644 --- a/include/linux/mm_inline.h +++ b/include/linux/mm_inline.h @@ -247,7 +247,7 @@ static inline unsigned long lru_gen_folio_seq(const struct lruvec *lruvec, (folio_test_dirty(folio) || folio_test_writeback(folio)))) gen = MIN_NR_GENS; else - gen = MAX_NR_GENS - folio_test_workingset(folio); + gen = MAX_NR_GENS - (folio_test_workingset(folio) || folio_test_referenced(folio)); return max(READ_ONCE(lrugen->max_seq) - gen + 1, READ_ONCE(lrugen->min_seq[type])); } -- cgit v1.2.3