@PlaidCat
Collaborator

General Process:

Checking rebuild commits for potentially missing commits:

kernel-4.18.0-553.89.1.el8_10

[jmaple@devbox kernel-src-tree]$ cat ciq/ciq_backports/kernel-4.18.0-553.89.1.el8_10/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v4.18~1..kernel-mainline: 581414
Number of commits in rpm: 14
Number of commits matched with upstream: 8 (57.14%)
Number of commits in upstream but not in rpm: 581406
Number of commits NOT found in upstream: 6 (42.86%)

Rebuilding Kernel on Branch rocky8_10_rebuild_kernel-4.18.0-553.89.1.el8_10 for kernel-4.18.0-553.89.1.el8_10
Clean Cherry Picks: 4 (50.00%)
Empty Cherry Picks: 4 (50.00%)
_______________________________

__EMPTY COMMITS__________________________
5d122db2ff80cd2aed4dcd630befb56b51ddf947 RDMA/rxe: Fix incomplete state save in rxe_requester
6ab26555c9ffef96c56ca16356e55ac5ab61ec93 gfs2: Add proper lockspace locking
fead2b869764f89d524b79dc8862e61d5191be55 mm/memcg: revert ("mm/memcg: optimize user context object stock access")
3b8abb3239530c423c0b97e42af7f7e856e1ee96 mm: kmem: fix a NULL pointer dereference in obj_stock_flush_required()

__CHANGES NOT IN UPSTREAM________________
Adding prod certs and changed cert date to 20210620
Adding Rocky secure boot certs
Fixing vmlinuz removal
Fixing UEFI CA path
Porting to 8.10, debranding and Rocky branding
Fixing pesign_key_name values
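
The percentages in the summary above follow directly from the raw counts. The following is a minimal sketch for cross-checking them, not part of the rebuild tooling; the helper name `pct` and the variable names are hypothetical, and the numbers are copied from the `rebuild.details.txt` output above.

```python
def pct(part: int, whole: int) -> str:
    """Format part/whole as a percentage with two decimals, as in the summary."""
    return f"{100 * part / whole:.2f}%"

# Counts copied from the rebuild.details.txt output above.
upstream_range = 581414   # commits in v4.18~1..kernel-mainline
rpm_commits = 14          # commits listed in the rpm changelog
matched = 8               # matched with upstream
not_upstream = 6          # not found in upstream
clean, empty = 4, 4       # cherry-pick results for the matched commits

# The categories should partition their parent totals.
assert matched + not_upstream == rpm_commits
assert clean + empty == matched
assert upstream_range - matched == 581406  # "in upstream but not in rpm"

print(pct(matched, rpm_commits))       # 57.14%
print(pct(not_upstream, rpm_commits))  # 42.86%
print(pct(clean, matched))             # 50.00%
```

Note that the cherry-pick percentages are relative to the 8 matched commits, not to the 14 rpm changelog entries.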

Build

[jmaple@devbox code]$ egrep -B 5 -A 5 "\[TIMER\]|^Starting Build" $(ls -t kbuild* | head -n1)
/mnt/code/kernel-src-tree-build
Running make mrproper...
  CLEAN   scripts/basic
  CLEAN   scripts/kconfig
[TIMER]{MRPROPER}: 5s
x86_64 architecture detected, copying config
'configs/kernel-x86_64.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-rocky8_10_rebuild-a8a2edf9cea8"
Making olddefconfig
--
  HOSTLD  scripts/kconfig/conf
scripts/kconfig/conf  --olddefconfig Kconfig
#
# configuration written to .config
#
Starting Build
scripts/kconfig/conf  --syncconfig Kconfig
  SYSTBL  arch/x86/include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_32_ia32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_64_x32.h
  SYSTBL  arch/x86/include/generated/asm/syscalls_64.h
--
  LD [M]  sound/usb/usx2y/snd-usb-usx2y.ko
  LD [M]  sound/virtio/virtio_snd.ko
  LD [M]  sound/x86/snd-hdmi-lpe-audio.ko
  LD [M]  sound/xen/snd_xen_front.ko
  LD [M]  virt/lib/irqbypass.ko
[TIMER]{BUILD}: 1449s
Making Modules
  INSTALL arch/x86/crypto/blowfish-x86_64.ko
  INSTALL arch/x86/crypto/camellia-aesni-avx-x86_64.ko
  INSTALL arch/x86/crypto/camellia-aesni-avx2.ko
  INSTALL arch/x86/crypto/camellia-x86_64.ko
--
  INSTALL sound/virtio/virtio_snd.ko
  INSTALL sound/x86/snd-hdmi-lpe-audio.ko
  INSTALL sound/xen/snd_xen_front.ko
  INSTALL virt/lib/irqbypass.ko
  DEPMOD  4.18.0-rocky8_10_rebuild-a8a2edf9cea8+
[TIMER]{MODULES}: 13s
Making Install
sh ./arch/x86/boot/install.sh 4.18.0-rocky8_10_rebuild-a8a2edf9cea8+ arch/x86/boot/bzImage \
        System.map "/boot"
[TIMER]{INSTALL}: 20s
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-4.18.0-rocky8_10_rebuild-a8a2edf9cea8+ and Index to 2
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 5s
[TIMER]{BUILD}: 1449s
[TIMER]{MODULES}: 13s
[TIMER]{INSTALL}: 20s
[TIMER]{TOTAL} 1494s
Rebooting in 10 seconds
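
As a rough cross-check of the timing summary above, the `[TIMER]` lines can be parsed and summed. This is a sketch only: the regular expression and the function name `parse_timers` are assumptions made for illustration, based on the line format visible in this log (note that the `TOTAL` line has no colon).

```python
import re

# Matches "[TIMER]{NAME}: 12s" and the colon-less "[TIMER]{TOTAL} 12s".
TIMER_RE = re.compile(r"\[TIMER\]\{(\w+)\}:?\s*(\d+)s")

def parse_timers(log_text: str) -> dict[str, int]:
    """Return {phase: seconds}; a repeated phase keeps its last value."""
    return {name: int(secs) for name, secs in TIMER_RE.findall(log_text)}

# Timer summary copied from the build log above.
summary = """\
[TIMER]{MRPROPER}: 5s
[TIMER]{BUILD}: 1449s
[TIMER]{MODULES}: 13s
[TIMER]{INSTALL}: 20s
[TIMER]{TOTAL} 1494s
"""

timers = parse_timers(summary)
phase_sum = sum(secs for name, secs in timers.items() if name != "TOTAL")
print(phase_sum, timers["TOTAL"], timers["TOTAL"] - phase_sum)  # 1487 1494 7
```

The four timed phases add up to 1487s, 7s less than the reported total. The gap is presumably time spent in steps without their own timer, though the log does not say so.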

KSelfTests

[jmaple@devbox code]$ ~/workspace/auto_kernel_history_rebuild/Rocky10/rocky10/code/get_kselftest_diff.sh
kselftest.4.18.0-rocky8_10_rebuild-d6bebad94beb+.log
207
kselftest.4.18.0-rocky8_10_rebuild-f01f784daddc+.log
207
kselftest.4.18.0-rocky8_10_rebuild-d79f08d232d6+.log
207
kselftest.4.18.0-rocky8_10_rebuild-a8a2edf9cea8+.log
207
Before: kselftest.4.18.0-rocky8_10_rebuild-d79f08d232d6+.log
After: kselftest.4.18.0-rocky8_10_rebuild-a8a2edf9cea8+.log
Diff:
No differences found.

jira KERNEL-325
cve CVE-2022-50543
Rebuild_History Non-Buildable kernel-4.18.0-553.89.1.el8_10
commit-author Li Zhijian <lizhijian@fujitsu.com>
commit 7d984da

rxe_mr_cleanup() which tries to free mr->map again will be called when
rxe_mr_init_user() fails:

   CPU: 0 PID: 4917 Comm: rdma_flush_serv Kdump: loaded Not tainted 6.1.0-rc1-roce-flush+ #25
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
   Call Trace:
    <TASK>
    dump_stack_lvl+0x45/0x5d
    panic+0x19e/0x349
    end_report.part.0+0x54/0x7c
    kasan_report.cold+0xa/0xf
    rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
    __rxe_cleanup+0x10a/0x1e0 [rdma_rxe]
    rxe_reg_user_mr+0xb7/0xd0 [rdma_rxe]
    ib_uverbs_reg_mr+0x26a/0x480 [ib_uverbs]
    ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x1a2/0x250 [ib_uverbs]
    ib_uverbs_cmd_verbs+0x1397/0x15a0 [ib_uverbs]

This issue was firstly exposed since commit b18c7da ("RDMA/rxe: Fix
memory leak in error path code") and then we fixed it in commit
8ff5f5d ("RDMA/rxe: Prevent double freeing rxe_map_set()") but this
fix was reverted together at last by commit 1e75550 (Revert
"RDMA/rxe: Create duplicate mapping tables for FMRs")

Simply let rxe_mr_cleanup() always handle freeing the mr->map once it is
successfully allocated.

Fixes: 1e75550 ("Revert "RDMA/rxe: Create duplicate mapping tables for FMRs"")
Link: https://lore.kernel.org/r/1667099073-2-1-git-send-email-lizhijian@fujitsu.com
	Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
	Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
(cherry picked from commit 7d984da)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-325
cve CVE-2023-53539
Rebuild_History Non-Buildable kernel-4.18.0-553.89.1.el8_10
commit-author Bob Pearson <rpearsonhpe@gmail.com>
commit 5d122db
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.89.1.el8_10/5d122db2.failed

If a send packet is dropped by the IP layer in rxe_requester()
the call to rxe_xmit_packet() can fail with err == -EAGAIN.
To recover, the state of the wqe is restored to the state before
the packet was sent so it can be resent. However, the routines
that save and restore the state miss a significant part of the
variable state in the wqe, the dma struct which is used to process
through the sge table. And, the state is not saved before the packet
is built which modifies the dma struct.

Under heavy stress testing with many QPs on a fast node sending
large messages to a slow node dropped packets are observed and
the resent packets are corrupted because the dma struct was not
restored. This patch fixes this behavior and allows the test cases
to succeed.

Fixes: 3050b99 ("IB/rxe: Fix race condition between requester and completer")
Link: https://lore.kernel.org/r/20230721200748.4604-1-rpearsonhpe@gmail.com
	Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
	Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
(cherry picked from commit 5d122db)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	drivers/infiniband/sw/rxe/rxe_req.c
jira KERNEL-325
Rebuild_History Non-Buildable kernel-4.18.0-553.89.1.el8_10
commit-author Peter Oberparleiter <oberpar@linux.ibm.com>
commit 9697ca0

Improve the usability of the unit_add sysfs attribute by ensuring that
the associated FCP LUN scan processing is completed synchronously.  This
enables configuration tooling to consistently determine the end of the
scan process to allow for serialization of follow-on actions.

While the scan process associated with unit_add typically completes
synchronously, it is deferred to an asynchronous background process if
unit_add is used before initial remote port scanning has completed.  This
occurs when unit_add is used immediately after setting the associated FCP
device online.

To ensure synchronous unit_add processing, wait for remote port scanning
to complete before initiating the FCP LUN scan.

	Cc: stable@vger.kernel.org
	Reviewed-by: M Nikhil <nikh1092@linux.ibm.com>
	Reviewed-by: Nihar Panda <niharp@linux.ibm.com>
	Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
	Signed-off-by: Nihar Panda <niharp@linux.ibm.com>
Link: https://lore.kernel.org/r/20250603182252.2287285-2-niharp@linux.ibm.com
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 9697ca0)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-325
Rebuild_History Non-Buildable kernel-4.18.0-553.89.1.el8_10
commit-author Andreas Gruenbacher <agruenba@redhat.com>
commit 2309a01

Check for asynchronous completion and clear the GLF_PENDING_REPLY flag
earlier in do_xmote().  This will make future changes more readable.

	Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
	Reviewed-by: Andrew Price <anprice@redhat.com>
(cherry picked from commit 2309a01)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-325
Rebuild_History Non-Buildable kernel-4.18.0-553.89.1.el8_10
commit-author Andreas Gruenbacher <agruenba@redhat.com>
commit 6ab2655
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.89.1.el8_10/6ab26555.failed

GFS2 has been calling functions like dlm_lock() even after the lockspace
that these functions operate on has been released with
dlm_release_lockspace().  It has always assumed that those functions
would return -EINVAL in that case, but that was never guaranteed, and it
certainly is no longer the case since commit 4db41bf ("dlm: remove
ls_local_handle from struct dlm_ls").

To fix that, add proper lockspace locking.

Fixes: 3e11e53 ("GFS2: ignore unlock failures after withdraw")
	Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
	Reviewed-by: Andrew Price <anprice@redhat.com>
(cherry picked from commit 6ab2655)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	fs/gfs2/file.c
#	fs/gfs2/lock_dlm.c
jira KERNEL-325
cve CVE-2023-53401
Rebuild_History Non-Buildable kernel-4.18.0-553.89.1.el8_10
commit-author Michal Hocko <mhocko@suse.com>
commit fead2b8
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.89.1.el8_10/fead2b86.failed

Patch series "mm/memcg: Address PREEMPT_RT problems instead of disabling it", v5.

This series aims to address the memcg related problem on PREEMPT_RT.

I tested them on CONFIG_PREEMPT and CONFIG_PREEMPT_RT with the
tools/testing/selftests/cgroup/* tests and I haven't observed any
regressions (other than the lockdep report that is already there).

This patch (of 6):

The optimisation is based on a micro benchmark where local_irq_save() is
more expensive than a preempt_disable().  There is no evidence that it
is visible in a real-world workload and there are CPUs where the
opposite is true (local_irq_save() is cheaper than preempt_disable()).

Based on micro benchmarks, the optimisation makes sense on PREEMPT_NONE
where preempt_disable() is optimized away.  There is no improvement with
PREEMPT_DYNAMIC since the preemption counter is always available.

The optimization also makes the PREEMPT_RT integration more complicated
since most of the assumptions are not true on PREEMPT_RT.

Revert the optimisation since it complicates the PREEMPT_RT integration
and the improvement is hardly visible.

[bigeasy@linutronix.de: patch body around Michal's diff]

Link: https://lkml.kernel.org/r/20220226204144.1008339-1-bigeasy@linutronix.de
Link: https://lore.kernel.org/all/YgOGkXXCrD%2F1k+p4@dhcp22.suse.cz
Link: https://lkml.kernel.org/r/YdX+INO9gQje6d0S@linutronix.de
Link: https://lkml.kernel.org/r/20220226204144.1008339-2-bigeasy@linutronix.de
	Signed-off-by: Michal Hocko <mhocko@suse.com>
	Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
	Acked-by: Roman Gushchin <guro@fb.com>
	Acked-by: Johannes Weiner <hannes@cmpxchg.org>
	Reviewed-by: Shakeel Butt <shakeelb@google.com>
	Acked-by: Michal Hocko <mhocko@suse.com>
	Cc: Johannes Weiner <hannes@cmpxchg.org>
	Cc: Peter Zijlstra <peterz@infradead.org>
	Cc: Thomas Gleixner <tglx@linutronix.de>
	Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
	Cc: Waiman Long <longman@redhat.com>
	Cc: kernel test robot <oliver.sang@intel.com>
	Cc: Michal Hocko <mhocko@kernel.org>
	Cc: Michal Koutný <mkoutny@suse.com>
	Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
	Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit fead2b8)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	mm/memcontrol.c
jira KERNEL-325
cve CVE-2023-53401
Rebuild_History Non-Buildable kernel-4.18.0-553.89.1.el8_10
commit-author Roman Gushchin <roman.gushchin@linux.dev>
commit 3b8abb3
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.89.1.el8_10/3b8abb32.failed

KCSAN found an issue in obj_stock_flush_required():
stock->cached_objcg can be reset between the check and dereference:

==================================================================
BUG: KCSAN: data-race in drain_all_stock / drain_obj_stock

write to 0xffff888237c2a2f8 of 8 bytes by task 19625 on cpu 0:
 drain_obj_stock+0x408/0x4e0 mm/memcontrol.c:3306
 refill_obj_stock+0x9c/0x1e0 mm/memcontrol.c:3340
 obj_cgroup_uncharge+0xe/0x10 mm/memcontrol.c:3408
 memcg_slab_free_hook mm/slab.h:587 [inline]
 __cache_free mm/slab.c:3373 [inline]
 __do_kmem_cache_free mm/slab.c:3577 [inline]
 kmem_cache_free+0x105/0x280 mm/slab.c:3602
 __d_free fs/dcache.c:298 [inline]
 dentry_free fs/dcache.c:375 [inline]
 __dentry_kill+0x422/0x4a0 fs/dcache.c:621
 dentry_kill+0x8d/0x1e0
 dput+0x118/0x1f0 fs/dcache.c:913
 __fput+0x3bf/0x570 fs/file_table.c:329
 ____fput+0x15/0x20 fs/file_table.c:349
 task_work_run+0x123/0x160 kernel/task_work.c:179
 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
 exit_to_user_mode_loop+0xcf/0xe0 kernel/entry/common.c:171
 exit_to_user_mode_prepare+0x6a/0xa0 kernel/entry/common.c:203
 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
 syscall_exit_to_user_mode+0x26/0x140 kernel/entry/common.c:296
 do_syscall_64+0x4d/0xc0 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

read to 0xffff888237c2a2f8 of 8 bytes by task 19632 on cpu 1:
 obj_stock_flush_required mm/memcontrol.c:3319 [inline]
 drain_all_stock+0x174/0x2a0 mm/memcontrol.c:2361
 try_charge_memcg+0x6d0/0xd10 mm/memcontrol.c:2703
 try_charge mm/memcontrol.c:2837 [inline]
 mem_cgroup_charge_skmem+0x51/0x140 mm/memcontrol.c:7290
 sock_reserve_memory+0xb1/0x390 net/core/sock.c:1025
 sk_setsockopt+0x800/0x1e70 net/core/sock.c:1525
 udp_lib_setsockopt+0x99/0x6c0 net/ipv4/udp.c:2692
 udp_setsockopt+0x73/0xa0 net/ipv4/udp.c:2817
 sock_common_setsockopt+0x61/0x70 net/core/sock.c:3668
 __sys_setsockopt+0x1c3/0x230 net/socket.c:2271
 __do_sys_setsockopt net/socket.c:2282 [inline]
 __se_sys_setsockopt net/socket.c:2279 [inline]
 __x64_sys_setsockopt+0x66/0x80 net/socket.c:2279
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0xffff8881382d52c0 -> 0xffff888138893740

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 19632 Comm: syz-executor.0 Not tainted 6.3.0-rc2-syzkaller-00387-g534293368afa #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023

Fix it by using READ_ONCE()/WRITE_ONCE() for all accesses to
stock->cached_objcg.

Link: https://lkml.kernel.org/r/20230502160839.361544-1-roman.gushchin@linux.dev
Fixes: bf4f059 ("mm: memcg/slab: obj_cgroup API")
	Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
	Reported-by: syzbot+774c29891415ab0fd29d@syzkaller.appspotmail.com
	Reported-by: Dmitry Vyukov <dvyukov@google.com>
  Link: https://lore.kernel.org/linux-mm/CACT4Y+ZfucZhM60YPphWiCLJr6+SGFhT+jjm8k1P-a_8Kkxsjg@mail.gmail.com/T/#t
	Reviewed-by: Yosry Ahmed <yosryahmed@google.com>
	Acked-by: Shakeel Butt <shakeelb@google.com>
	Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
	Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 3b8abb3)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	mm/memcontrol.c
jira KERNEL-325
Rebuild_History Non-Buildable kernel-4.18.0-553.89.1.el8_10
commit-author Roman Gushchin <roman.gushchin@linux.dev>
commit f785a8f

A memcg pointer in the percpu stock can be accessed by drain_all_stock()
from another cpu in a lockless way.  In theory it might lead to an issue,
similar to the one which has been discovered with stock->cached_objcg,
where the pointer was zeroed between the check for being NULL and
dereferencing.  In this case the issue is unlikely a real problem, but to
make it bulletproof and similar to stock->cached_objcg, let's annotate all
accesses to stock->cached with READ_ONCE()/WRITE_ONCE().

Link: https://lkml.kernel.org/r/20230502160839.361544-2-roman.gushchin@linux.dev
	Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
	Acked-by: Shakeel Butt <shakeelb@google.com>
	Cc: Dmitry Vyukov <dvyukov@google.com>
	Cc: Yosry Ahmed <yosryahmed@google.com>
	Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit f785a8f)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v4.18~1..kernel-mainline: 581414
Number of commits in rpm: 14
Number of commits matched with upstream: 8 (57.14%)
Number of commits in upstream but not in rpm: 581406
Number of commits NOT found in upstream: 6 (42.86%)

Rebuilding Kernel on Branch rocky8_10_rebuild_kernel-4.18.0-553.89.1.el8_10 for kernel-4.18.0-553.89.1.el8_10
Clean Cherry Picks: 4 (50.00%)
Empty Cherry Picks: 4 (50.00%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-4.18.0-553.89.1.el8_10/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual empty-commit failures are stored in the same directory.
The git message for each empty commit gives the path of the failed cherry-pick.
File names are the first 8 characters of the upstream SHA, followed by .failed.
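
The naming rule above can be sketched as follows. This is an illustration only, assuming the directory layout and `.failed` suffix seen in the commit messages of this pull request; the function name `failed_ref_path` is hypothetical.

```python
from pathlib import PurePosixPath

def failed_ref_path(kernel_tag: str, upstream_sha: str) -> PurePosixPath:
    """Build the path of the failed cherry-pick reference for an empty commit.

    Assumes the layout ciq/ciq_backports/<kernel tag>/<first 8 SHA chars>.failed.
    """
    return PurePosixPath("ciq/ciq_backports") / kernel_tag / f"{upstream_sha[:8]}.failed"

path = failed_ref_path(
    "kernel-4.18.0-553.89.1.el8_10",
    "5d122db2ff80cd2aed4dcd630befb56b51ddf947",
)
print(path)  # ciq/ciq_backports/kernel-4.18.0-553.89.1.el8_10/5d122db2.failed
```

The result matches the reference recorded in the empty commit for "RDMA/rxe: Fix incomplete state save in rxe_requester".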
@PlaidCat PlaidCat requested a review from a team December 11, 2025 18:46
@PlaidCat PlaidCat self-assigned this Dec 11, 2025
Collaborator

@bmastbergen bmastbergen left a comment

🥌

@bmastbergen bmastbergen requested a review from a team December 11, 2025 19:09

@jdieter jdieter left a comment

🚢

@PlaidCat PlaidCat merged commit a8a2edf into rocky8_10 Dec 12, 2025
2 checks passed
@PlaidCat PlaidCat deleted the rocky8_10_rebuild branch December 12, 2025 17:32