
Commit 6c07643

mm/memcg: revert ("mm/memcg: optimize user context object stock access")
jira KERNEL-325
cve CVE-2023-53401
Rebuild_History Non-Buildable kernel-4.18.0-553.89.1.el8_10
commit-author Michal Hocko <mhocko@suse.com>
commit fead2b8
Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-4.18.0-553.89.1.el8_10/fead2b86.failed

Patch series "mm/memcg: Address PREEMPT_RT problems instead of disabling it", v5.

This series aims to address the memcg-related problems on PREEMPT_RT.

I tested them on CONFIG_PREEMPT and CONFIG_PREEMPT_RT with the tools/testing/selftests/cgroup/* tests and I haven't observed any regressions (other than the lockdep report that is already there).

This patch (of 6):

The optimisation is based on a micro benchmark where local_irq_save() is more expensive than a preempt_disable(). There is no evidence that it is visible in a real-world workload and there are CPUs where the opposite is true (local_irq_save() is cheaper than preempt_disable()).

Based on micro benchmarks, the optimisation makes sense on PREEMPT_NONE, where preempt_disable() is optimized away. There is no improvement with PREEMPT_DYNAMIC since the preemption counter is always available.

The optimisation also makes the PREEMPT_RT integration more complicated, since most of its assumptions are not true on PREEMPT_RT.

Revert the optimisation since it complicates the PREEMPT_RT integration and the improvement is hardly visible.

[bigeasy@linutronix.de: patch body around Michal's diff]

Link: https://lkml.kernel.org/r/20220226204144.1008339-1-bigeasy@linutronix.de
Link: https://lore.kernel.org/all/YgOGkXXCrD%2F1k+p4@dhcp22.suse.cz
Link: https://lkml.kernel.org/r/YdX+INO9gQje6d0S@linutronix.de
Link: https://lkml.kernel.org/r/20220226204144.1008339-2-bigeasy@linutronix.de
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Waiman Long <longman@redhat.com>
Cc: kernel test robot <oliver.sang@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit fead2b8)
Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	mm/memcontrol.c
1 parent 95c46c3 commit 6c07643
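
For orientation, the fast path being reverted looked roughly like the sketch below. This is a simplified illustration reconstructed from the description above, not this tree's exact code: the helper and field names (get_obj_stock(), put_obj_stock(), task_obj, irq_obj) follow the original "optimize user context object stock access" patch and are assumptions here. The idea was that task-context callers could take the per-CPU object stock under preempt_disable(), while interrupt-context callers still needed local_irq_save().

/*
 * Illustrative sketch only -- the reverted optimization kept two per-CPU
 * object stocks so that task context could avoid disabling interrupts.
 */
struct memcg_stock_pcp {
	struct obj_stock task_obj;	/* task context, under preempt_disable() */
	struct obj_stock irq_obj;	/* irq/softirq context, under local_irq_save() */
	/* ... other fields ... */
};

static inline struct obj_stock *get_obj_stock(unsigned long *pflags)
{
	struct memcg_stock_pcp *stock;

	if (likely(in_task())) {
		*pflags = 0UL;
		preempt_disable();	/* compiled away on PREEMPT_NONE */
		stock = this_cpu_ptr(&memcg_stock);
		return &stock->task_obj;
	}

	local_irq_save(*pflags);	/* interrupt context still needs irqs off */
	stock = this_cpu_ptr(&memcg_stock);
	return &stock->irq_obj;
}

static inline void put_obj_stock(unsigned long flags)
{
	if (likely(in_task()))
		preempt_enable();
	else
		local_irq_restore(flags);
}

The revert drops this split because the gain only shows up in micro benchmarks and because the assumptions behind it (cheap preempt_disable(), meaningful in_task() distinction) do not hold on PREEMPT_RT.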

1 file changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
mm/memcg: revert ("mm/memcg: optimize user context object stock access")

jira KERNEL-325
cve CVE-2023-53401
Rebuild_History Non-Buildable kernel-4.18.0-553.89.1.el8_10
commit-author Michal Hocko <mhocko@suse.com>
commit fead2b869764f89d524b79dc8862e61d5191be55
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.89.1.el8_10/fead2b86.failed

Patch series "mm/memcg: Address PREEMPT_RT problems instead of disabling it", v5.

This series aims to address the memcg related problem on PREEMPT_RT.

I tested them on CONFIG_PREEMPT and CONFIG_PREEMPT_RT with the
tools/testing/selftests/cgroup/* tests and I haven't observed any
regressions (other than the lockdep report that is already there).

This patch (of 6):

The optimisation is based on a micro benchmark where local_irq_save() is
more expensive than a preempt_disable(). There is no evidence that it
is visible in a real-world workload and there are CPUs where the
opposite is true (local_irq_save() is cheaper than preempt_disable()).

Based on micro benchmarks, the optimisation makes sense on PREEMPT_NONE
where preempt_disable() is optimized away. There is no improvement with
PREEMPT_DYNAMIC since the preemption counter is always available.

The optimization makes also the PREEMPT_RT integration more complicated
since most of the assumption are not true on PREEMPT_RT.

Revert the optimisation since it complicates the PREEMPT_RT integration
and the improvement is hardly visible.

[bigeasy@linutronix.de: patch body around Michal's diff]

Link: https://lkml.kernel.org/r/20220226204144.1008339-1-bigeasy@linutronix.de
Link: https://lore.kernel.org/all/YgOGkXXCrD%2F1k+p4@dhcp22.suse.cz
Link: https://lkml.kernel.org/r/YdX+INO9gQje6d0S@linutronix.de
Link: https://lkml.kernel.org/r/20220226204144.1008339-2-bigeasy@linutronix.de
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Waiman Long <longman@redhat.com>
Cc: kernel test robot <oliver.sang@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit fead2b869764f89d524b79dc8862e61d5191be55)
Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	mm/memcontrol.c
diff --cc mm/memcontrol.c
index 6e2a077af4c1,7bf204b2b053..000000000000
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@@ -2232,18 -2061,27 +2232,21 @@@ static void __unlock_page_memcg(struct
}

/**
- * folio_memcg_unlock - Release the binding between a folio and its memcg.
- * @folio: The folio.
- *
- * This releases the binding created by folio_memcg_lock(). This does
- * not change the accounting of this folio to its memcg, but it does
- * permit others to change it.
+ * unlock_page_memcg - unlock a page and memcg binding
+ * @page: the page
*/
-void folio_memcg_unlock(struct folio *folio)
-{
- __folio_memcg_unlock(folio_memcg(folio));
-}
-
void unlock_page_memcg(struct page *page)
{
- folio_memcg_unlock(page_folio(page));
+ struct page *head = compound_head(page);
+
+ __unlock_page_memcg(page_memcg(head));
}
+EXPORT_SYMBOL(unlock_page_memcg);

- struct obj_stock {
+ struct memcg_stock_pcp {
+ struct mem_cgroup *cached; /* this never be root cgroup */
+ unsigned int nr_pages;
+
#ifdef CONFIG_MEMCG_KMEM
struct obj_cgroup *cached_objcg;
struct pglist_data *cached_pgdat;
@@@ -2269,12 -2098,13 +2263,12 @@@ static DEFINE_PER_CPU(struct memcg_stoc
static DEFINE_MUTEX(percpu_charge_mutex);

#ifdef CONFIG_MEMCG_KMEM
- static void drain_obj_stock(struct obj_stock *stock);
+ static void drain_obj_stock(struct memcg_stock_pcp *stock);
static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
struct mem_cgroup *root_memcg);
-static void memcg_account_kmem(struct mem_cgroup *memcg, int nr_pages);

#else
- static inline void drain_obj_stock(struct obj_stock *stock)
+ static inline void drain_obj_stock(struct memcg_stock_pcp *stock)
{
}
static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
@@@ -7219,22 -6782,21 +7180,30 @@@ static void uncharge_batch(const struc
css_put(&ug->memcg->css);
}

-static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
+static void uncharge_page(struct page *page, struct uncharge_gather *ug)
{
- long nr_pages;
+ unsigned long nr_pages;
struct mem_cgroup *memcg;
struct obj_cgroup *objcg;
++<<<<<<< HEAD
+ bool use_objcg = PageMemcgKmem(page);
++=======
++>>>>>>> fead2b869764 (mm/memcg: revert ("mm/memcg: optimize user context object stock access"))

- VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
+ VM_BUG_ON_PAGE(PageLRU(page), page);

/*
* Nobody should be changing or seriously looking at
- * folio memcg or objcg at this point, we have fully
- * exclusive access to the folio.
+ * page memcg or objcg at this point, we have fully
+ * exclusive access to the page.
*/
++<<<<<<< HEAD
+ if (use_objcg) {
+ objcg = __page_objcg(page);
++=======
+ if (folio_memcg_kmem(folio)) {
+ objcg = __folio_objcg(folio);
++>>>>>>> fead2b869764 (mm/memcg: revert ("mm/memcg: optimize user context object stock access"))
/*
* This get matches the put at the end of the function and
* kmem pages do not hold memcg references anymore.
@@@ -7259,9 -6821,9 +7228,9 @@@
css_get(&memcg->css);
}

- nr_pages = folio_nr_pages(folio);
+ nr_pages = compound_nr(page);

- if (use_objcg) {
+ if (folio_memcg_kmem(folio)) {
ug->nr_memory += nr_pages;
ug->nr_kmem += nr_pages;
* Unmerged path mm/memcontrol.c
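
After the revert, the per-CPU stock goes back to a single object stock embedded in memcg_stock_pcp, and every access disables interrupts regardless of calling context. A minimal sketch of that shape is below; it is illustrative only and assumes the usual cached_objcg/nr_bytes fields rather than quoting this tree's resulting code verbatim.

/* Illustrative sketch of the post-revert access pattern. */
static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
{
	struct memcg_stock_pcp *stock;
	unsigned long flags;
	bool ret = false;

	local_irq_save(flags);		/* one rule for task, softirq and irq context */

	stock = this_cpu_ptr(&memcg_stock);
	if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
		stock->nr_bytes -= nr_bytes;
		ret = true;
	}

	local_irq_restore(flags);

	return ret;
}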
