Skip to content

Conversation

@PlaidCat
Copy link
Collaborator

@PlaidCat PlaidCat commented Dec 4, 2025

General Process:

Checking Rebuild Commits for Potentially missing commits:

kernel-4.18.0-553.87.1.el8_10

[jmaple@devbox kernel-src-tree]$ cat ciq/ciq_backports/kernel-4.18.0-553.87.1.el8_10/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v4.18~1..kernel-mainline: 581414
Number of commits in rpm: 21
Number of commits matched with upstream: 14 (66.67%)
Number of commits in upstream but not in rpm: 581400
Number of commits NOT found in upstream: 7 (33.33%)

Rebuilding Kernel on Branch rocky8_10_rebuild_kernel-4.18.0-553.87.1.el8_10 for kernel-4.18.0-553.87.1.el8_10
Clean Cherry Picks: 10 (71.43%)
Empty Cherry Picks: 4 (28.57%)
_______________________________

__EMPTY COMMITS__________________________
c2ed087ed35ca569d8179924ba560be248c758e5 powerpc: Add Power11 architected and raw mode
0af1561b2d60bab2a2b00720a5c7b292ecc549ec smb: client: fix race with concurrent opens in unlink(2)
d84291fc7453df7881a970716f8256273aca5747 smb: client: fix race with concurrent opens in rename(2)
d613f53c83ec47089c4e25859d5e8e0359f6f8da mm/memory-failure: fix VM_BUG_ON_PAGE(PagePoisoned(page)) when unpoison memory

__CHANGES NOT IN UPSTREAM________________
Adding prod certs and changed cert date to 20210620
Adding Rocky secure boot certs
Fixing vmlinuz removal
Fixing UEFI CA path
Porting to 8.10, debranding and Rocky branding
Fixing pesign_key_name values
arch/powerpc: commandline option to enable P11 support

Build

[jmaple@devbox code]$ egrep -B 5 -A 5 "\[TIMER\]|^Starting Build" $(ls -t kbuild* | head -n1)
/mnt/code/kernel-src-tree-build
Running make mrproper...
  CLEAN   scripts/basic
  CLEAN   scripts/kconfig
[TIMER]{MRPROPER}: 5s
x86_64 architecture detected, copying config
'configs/kernel-x86_64.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-rocky8_10_rebuild-d79f08d232d6"
Making olddefconfig
--
  HOSTLD  scripts/kconfig/conf
scripts/kconfig/conf  --olddefconfig Kconfig
#
# configuration written to .config
#
Starting Build
scripts/kconfig/conf  --syncconfig Kconfig
  SYSTBL  arch/x86/include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_32_ia32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_64_x32.h
  SYSTBL  arch/x86/include/generated/asm/syscalls_64.h
--
  LD [M]  sound/usb/usx2y/snd-usb-usx2y.ko
  LD [M]  sound/virtio/virtio_snd.ko
  LD [M]  sound/x86/snd-hdmi-lpe-audio.ko
  LD [M]  sound/xen/snd_xen_front.ko
  LD [M]  virt/lib/irqbypass.ko
[TIMER]{BUILD}: 1463s
Making Modules
  INSTALL arch/x86/crypto/blowfish-x86_64.ko
  INSTALL arch/x86/crypto/camellia-aesni-avx-x86_64.ko
  INSTALL arch/x86/crypto/camellia-aesni-avx2.ko
  INSTALL arch/x86/crypto/camellia-x86_64.ko
--
  INSTALL sound/virtio/virtio_snd.ko
  INSTALL sound/x86/snd-hdmi-lpe-audio.ko
  INSTALL sound/xen/snd_xen_front.ko
  INSTALL virt/lib/irqbypass.ko
  DEPMOD  4.18.0-rocky8_10_rebuild-d79f08d232d6+
[TIMER]{MODULES}: 12s
Making Install
sh ./arch/x86/boot/install.sh 4.18.0-rocky8_10_rebuild-d79f08d232d6+ arch/x86/boot/bzImage \
        System.map "/boot"
[TIMER]{INSTALL}: 21s
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-4.18.0-rocky8_10_rebuild-d79f08d232d6+ and Index to 0
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 5s
[TIMER]{BUILD}: 1463s
[TIMER]{MODULES}: 12s
[TIMER]{INSTALL}: 21s
[TIMER]{TOTAL} 1506s
Rebooting in 10 seconds

KSelfTests

[jmaple@devbox code]$ ~/workspace/auto_kernel_history_rebuild/Rocky10/rocky10/code/get_kselftest_diff.sh
kselftest.4.18.0-rocky8_10_rebuild-48e11f31ca38+.log
207
kselftest.4.18.0-rocky8_10_rebuild-d6bebad94beb+.log
207
kselftest.4.18.0-rocky8_10_rebuild-f01f784daddc+.log
207
kselftest.4.18.0-rocky8_10_rebuild-d79f08d232d6+.log
207
Before: kselftest.4.18.0-rocky8_10_rebuild-f01f784daddc+.log
After: kselftest.4.18.0-rocky8_10_rebuild-d79f08d232d6+.log
Diff:
No differences found.

jira KERNEL-249
Rebuild_History Non-Buildable kernel-4.18.0-553.87.1.el8_10
commit-author Madhavan Srinivasan <maddy@linux.ibm.com>
commit c2ed087
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.87.1.el8_10/c2ed087e.failed

Add CPU table entries for raw and architected mode. Most fields are
copied from the Power10 table entries.

CPU, MMU and user (ELF_HWCAP) features are unchanged vs P10. However
userspace can detect P11 because the AT_PLATFORM value changes to
"power11".

The logical PVR value of 0x0F000007, passed to firmware via the
ibm_arch_vec, indicates the kernel can support a P11 compatible CPU,
which means at least ISA v3.1 compliant.

	Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
	Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240221044623.1598642-1-mpe@ellerman.id.au

(cherry picked from commit c2ed087)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	arch/powerpc/include/asm/cputable.h
#	arch/powerpc/kernel/cpu_specs_book3s_64.h
#	arch/powerpc/kernel/prom_init.c
jira KERNEL-249
Rebuild_History Non-Buildable kernel-4.18.0-553.87.1.el8_10
commit-author Nick Child <nick.child@ibm.com>
commit c49f5d8

Some functions defined in 'arch/powerpc/perf' are deserving of an
`__init` macro attribute. These functions are only called by other
initialization functions and therefore should inherit the attribute.
Also, change function declarations in header files to include `__init`.

	Signed-off-by: Nick Child <nick.child@ibm.com>
	Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211216220035.605465-5-nick.child@ibm.com
(cherry picked from commit c49f5d8)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-249
Rebuild_History Non-Buildable kernel-4.18.0-553.87.1.el8_10
commit-author Madhavan Srinivasan <maddy@linux.ibm.com>
commit b22ea62

Base enablement patch to register performance monitoring
hardware support for Power11. Most of fields are copied
from power10_pmu struct for power11_pmu struct.

	Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
	Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240221044623.1598642-2-mpe@ellerman.id.au

(cherry picked from commit b22ea62)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
…x prefix

jira KERNEL-249
Rebuild_History Non-Buildable kernel-4.18.0-553.87.1.el8_10
commit-author Athira Rajeev <atrajeev@linux.vnet.ibm.com>
commit f6a66ff

Simple expression parser test fails in powerpc as below:

    4: Simple expression parser
    test child forked, pid 170385
    Using CPUID 004e2102
    division by zero
    syntax error
    syntax error
    FAILED tests/expr.c:65 parse test failed
    test child finished with -1
    Simple expression parser: FAILED!

This is observed after commit:
'commit 9d5da30 ("perf jevents: Add a new expression builtin strcmp_cpuid_str()")'

With this commit, a new expression builtin strcmp_cpuid_str
got added. This function takes an 'ID' type value, which is
a string. So expression parse for strcmp_cpuid_str expects
const char * as cpuid value type. In case of powerpc, CPU IDs
are numbers. Hence it doesn't get interpreted correctly by
bison parser. Example in case of power9, cpuid string returns
as: 004e2102

cpuid of string type is expected in two cases:
1. char *get_cpuid_str(struct perf_pmu *pmu __maybe_unused);

   Testcase "tests/expr.c" uses "perf_pmu__getcpuid" which calls
   get_cpuid_str to get the cpuid string.

2. cpuid field in  :struct pmu_events_map

   struct pmu_events_map {
           const char *arch;
	   const char *cpuid;

   Here cpuid field is used in "perf_pmu__find_events_table"
   function as "strcmp_cpuid_str(map->cpuid, cpuid)". The
   value for cpuid field is picked from mapfile.csv.

Fix the mapfile.csv and get_cpuid_str function to prefix
cpuid with 0x so that it gets correctly interpreted by
the bison parser

	Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
	Tested-by: Disha Goel<disgoel@linux.ibm.com>
	Cc: kjain@linux.ibm.com
	Cc: maddy@linux.ibm.com
	Cc: disgoel@linux.vnet.ibm.com
	Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20231009050052.64935-1-atrajeev@linux.vnet.ibm.com
	Signed-off-by: Namhyung Kim <namhyung@kernel.org>
(cherry picked from commit f6a66ff)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
…itecture

jira KERNEL-249
Rebuild_History Non-Buildable kernel-4.18.0-553.87.1.el8_10
commit-author JiaLong.Yang <jialong.yang@shingroup.cn>
commit ac254df

HX-C2000 is a new CPU made by HEXIN Technologies Co., Ltd. And a new PVN
0x0066 has been applied from the OpenPower Community for this CPU.

Here is a patch to make perf tool run in the CPU.

	Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
	Signed-off-by: JiaLong.Yang <jialong.yang@shingroup.cn>
	Cc: Adrian Hunter <adrian.hunter@intel.com>
	Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
	Cc: Ian Rogers <irogers@google.com>
	Cc: Ingo Molnar <mingo@redhat.com>
	Cc: Jiri Olsa <jolsa@kernel.org>
	Cc: Mark Rutland <mark.rutland@arm.com>
	Cc: Namhyung Kim <namhyung@kernel.org>
	Cc: Peter Zijlstra <peterz@infradead.org>
	Cc: shenghui.qu@shingroup.cn
	Cc: Zhao Ke <ke.zhao@shingroup.cn>
	Cc: zhijie.ren@shingroup.cn
Link: https://lore.kernel.org/r/20231221060242.4532-1-jialong.yang@shingroup.cn
	Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
(cherry picked from commit ac254df)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-249
Rebuild_History Non-Buildable kernel-4.18.0-553.87.1.el8_10
commit-author Madhavan Srinivasan <maddy@linux.ibm.com>
commit e024fa6

Update the Power11 PVR to json mapfile to enable
json events. Power11 is PowerISA v3.1 compliant
and support Power10 events.

	Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
	Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
	Cc: atrajeev@linux.vnet.ibm.com
	Cc: disgoel@linux.vnet.ibm.com
	Cc: linuxppc-dev@lists.ozlabs.org
	Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240129120855.551529-1-maddy@linux.ibm.com
(cherry picked from commit e024fa6)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-249
cve CVE-2023-53513
Rebuild_History Non-Buildable kernel-4.18.0-553.87.1.el8_10
commit-author Zhong Jinghua <zhongjinghua@huawei.com>
commit 55793ea

We tested and found an alarm caused by nbd_ioctl arg without verification.
The UBSAN warning calltrace like below:

UBSAN: Undefined behaviour in fs/buffer.c:1709:35
signed integer overflow:
-9223372036854775808 - 1 cannot be represented in type 'long long int'
CPU: 3 PID: 2523 Comm: syz-executor.0 Not tainted 4.19.90 #1
Hardware name: linux,dummy-virt (DT)
Call trace:
 dump_backtrace+0x0/0x3f0 arch/arm64/kernel/time.c:78
 show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x170/0x1dc lib/dump_stack.c:118
 ubsan_epilogue+0x18/0xb4 lib/ubsan.c:161
 handle_overflow+0x188/0x1dc lib/ubsan.c:192
 __ubsan_handle_sub_overflow+0x34/0x44 lib/ubsan.c:206
 __block_write_full_page+0x94c/0xa20 fs/buffer.c:1709
 block_write_full_page+0x1f0/0x280 fs/buffer.c:2934
 blkdev_writepage+0x34/0x40 fs/block_dev.c:607
 __writepage+0x68/0xe8 mm/page-writeback.c:2305
 write_cache_pages+0x44c/0xc70 mm/page-writeback.c:2240
 generic_writepages+0xdc/0x148 mm/page-writeback.c:2329
 blkdev_writepages+0x2c/0x38 fs/block_dev.c:2114
 do_writepages+0xd4/0x250 mm/page-writeback.c:2344

The reason for triggering this warning is __block_write_full_page()
-> i_size_read(inode) - 1 overflow.
inode->i_size is assigned in __nbd_ioctl() -> nbd_set_size() -> bytesize.
We think it is necessary to limit the size of arg to prevent errors.

Moreover, __nbd_ioctl() -> nbd_add_socket(), arg will be cast to int.
Assuming the value of arg is 0x80000000000000001) (on a 64-bit machine),
it will become 1 after the coercion, which will return unexpected results.

Fix it by adding checks to prevent passing in too large numbers.

	Signed-off-by: Zhong Jinghua <zhongjinghua@huawei.com>
	Reviewed-by: Yu Kuai <yukuai3@huawei.com>
	Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20230206145805.2645671-1-zhongjinghua@huawei.com
	Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 55793ea)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-249
cve CVE-2025-38724
Rebuild_History Non-Buildable kernel-4.18.0-553.87.1.el8_10
commit-author Jeff Layton <jlayton@kernel.org>
commit 908e4ea

Lei Lu recently reported that nfsd4_setclientid_confirm() did not check
the return value from get_client_locked(). a SETCLIENTID_CONFIRM could
race with a confirmed client expiring and fail to get a reference. That
could later lead to a UAF.

Fix this by getting a reference early in the case where there is an
extant confirmed client. If that fails then treat it as if there were no
confirmed client found at all.

In the case where the unconfirmed client is expiring, just fail and
return the result from get_client_locked().

	Reported-by: lei lu <llfamsec@gmail.com>
Closes: https://lore.kernel.org/linux-nfs/CAEBF3_b=UvqzNKdnfD_52L05Mqrqui9vZ2eFamgAbV0WG+FNWQ@mail.gmail.com/
Fixes: d20c11d ("nfsd: Protect session creation and client confirm using client_lock")
	Cc: stable@vger.kernel.org
	Signed-off-by: Jeff Layton <jlayton@kernel.org>
	Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
(cherry picked from commit 908e4ea)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-249
cve CVE-2025-39898
Rebuild_History Non-Buildable kernel-4.18.0-553.87.1.el8_10
commit-author Vitaly Lifshits <vitaly.lifshits@intel.com>
commit 90fb7db

Fix a possible heap overflow in e1000_set_eeprom function by adding
input validation for the requested length of the change in the EEPROM.
In addition, change the variable type from int to size_t for better
code practices and rearrange declarations to RCT.

	Cc: stable@vger.kernel.org
Fixes: bc7f75f ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)")
Co-developed-by: Mikael Wessel <post@mikaelkw.online>
	Signed-off-by: Mikael Wessel <post@mikaelkw.online>
	Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
	Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
	Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
(cherry picked from commit 90fb7db)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-249
Rebuild_History Non-Buildable kernel-4.18.0-553.87.1.el8_10
commit-author Paulo Alcantara <pc@manguebit.org>
commit 0af1561
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.87.1.el8_10/0af1561b.failed

According to some logs reported by customers, CIFS client might end up
reporting unlinked files as existing in stat(2) due to concurrent
opens racing with unlink(2).

Besides sending the removal request to the server, the unlink process
could involve closing any deferred close as well as marking all
existing open handles as deleted to prevent them from deferring
closes, which increases the race window for potential concurrent
opens.

Fix this by unhashing the dentry in cifs_unlink() to prevent any
subsequent opens.  Any open attempts, while we're still unlinking,
will block on parent's i_rwsem.

	Reported-by: Jay Shin <jaeshin@redhat.com>
	Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
	Reviewed-by: David Howells <dhowells@redhat.com>
	Cc: Al Viro <viro@zeniv.linux.org.uk>
	Cc: linux-cifs@vger.kernel.org
	Signed-off-by: Steve French <stfrench@microsoft.com>
(cherry picked from commit 0af1561)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	fs/cifs/inode.c
jira KERNEL-249
cve CVE-2025-39825
Rebuild_History Non-Buildable kernel-4.18.0-553.87.1.el8_10
commit-author Paulo Alcantara <pc@manguebit.org>
commit d84291f
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.87.1.el8_10/d84291fc.failed

Besides sending the rename request to the server, the rename process
also involves closing any deferred close, waiting for outstanding I/O
to complete as well as marking all existing open handles as deleted to
prevent them from deferring closes, which increases the race window
for potential concurrent opens on the target file.

Fix this by unhashing the dentry in advance to prevent any concurrent
opens on the target.

	Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
	Reviewed-by: David Howells <dhowells@redhat.com>
	Cc: Al Viro <viro@zeniv.linux.org.uk>
	Cc: linux-cifs@vger.kernel.org
	Signed-off-by: Steve French <stfrench@microsoft.com>
(cherry picked from commit d84291f)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	fs/cifs/inode.c
…on memory

jira KERNEL-249
cve CVE-2025-39883
Rebuild_History Non-Buildable kernel-4.18.0-553.87.1.el8_10
commit-author Miaohe Lin <linmiaohe@huawei.com>
commit d613f53
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.87.1.el8_10/d613f53c.failed

When I did memory failure tests, below panic occurs:

page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page))
kernel BUG at include/linux/page-flags.h:616!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 #40
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 unpoison_memory+0x2f3/0x590
 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
 debugfs_attr_write+0x42/0x60
 full_proxy_write+0x5b/0x80
 vfs_write+0xd5/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f08f0314887
RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887
RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001
RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00
 </TASK>
Modules linked in: hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:unpoison_memory+0x2f3/0x590
RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246
RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0
RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb
R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000
R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe
FS:  00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---

The root cause is that unpoison_memory() tries to check the PG_HWPoison
flags of an uninitialized page.  So VM_BUG_ON_PAGE(PagePoisoned(page)) is
triggered.  This can be reproduced by below steps:

1.Offline memory block:

 echo offline > /sys/devices/system/memory/memory12/state

2.Get offlined memory pfn:

 page-types -b n -rlN

3.Write pfn to unpoison-pfn

 echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn

This scenario can be identified by pfn_to_online_page() returning NULL.
And ZONE_DEVICE pages are never expected, so we can simply fail if
pfn_to_online_page() == NULL to fix the bug.

Link: https://lkml.kernel.org/r/20250828024618.1744895-1-linmiaohe@huawei.com
Fixes: f1dd2cd ("mm, memory_hotplug: do not associate hotadded memory to zones until online")
	Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
	Suggested-by: David Hildenbrand <david@redhat.com>
	Acked-by: David Hildenbrand <david@redhat.com>
	Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
	Cc: <stable@vger.kernel.org>
	Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit d613f53)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	mm/memory-failure.c
jira KERNEL-249
cve CVE-2025-39955
Rebuild_History Non-Buildable kernel-4.18.0-553.87.1.el8_10
commit-author Kuniyuki Iwashima <kuniyu@google.com>
commit 45c8a6c

syzbot reported the splat below where a socket had tcp_sk(sk)->fastopen_rsk
in the TCP_ESTABLISHED state. [0]

syzbot reused the server-side TCP Fast Open socket as a new client before
the TFO socket completes 3WHS:

  1. accept()
  2. connect(AF_UNSPEC)
  3. connect() to another destination

As of accept(), sk->sk_state is TCP_SYN_RECV, and tcp_disconnect() changes
it to TCP_CLOSE and makes connect() possible, which restarts timers.

Since tcp_disconnect() forgot to clear tcp_sk(sk)->fastopen_rsk, the
retransmit timer triggered the warning and the intended packet was not
retransmitted.

Let's call reqsk_fastopen_remove() in tcp_disconnect().

[0]:
WARNING: CPU: 2 PID: 0 at net/ipv4/tcp_timer.c:542 tcp_retransmit_timer (net/ipv4/tcp_timer.c:542 (discriminator 7))
Modules linked in:
CPU: 2 UID: 0 PID: 0 Comm: swapper/2 Not tainted 6.17.0-rc5-g201825fb4278 #62 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:tcp_retransmit_timer (net/ipv4/tcp_timer.c:542 (discriminator 7))
Code: 41 55 41 54 55 53 48 8b af b8 08 00 00 48 89 fb 48 85 ed 0f 84 55 01 00 00 0f b6 47 12 3c 03 74 0c 0f b6 47 12 3c 04 74 04 90 <0f> 0b 90 48 8b 85 c0 00 00 00 48 89 ef 48 8b 40 30 e8 6a 4f 06 3e
RSP: 0018:ffffc900002f8d40 EFLAGS: 00010293
RAX: 0000000000000002 RBX: ffff888106911400 RCX: 0000000000000017
RDX: 0000000002517619 RSI: ffffffff83764080 RDI: ffff888106911400
RBP: ffff888106d5c000 R08: 0000000000000001 R09: ffffc900002f8de8
R10: 00000000000000c2 R11: ffffc900002f8ff8 R12: ffff888106911540
R13: ffff888106911480 R14: ffff888106911840 R15: ffffc900002f8de0
FS:  0000000000000000(0000) GS:ffff88907b768000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8044d69d90 CR3: 0000000002c30003 CR4: 0000000000370ef0
Call Trace:
 <IRQ>
 tcp_write_timer (net/ipv4/tcp_timer.c:738)
 call_timer_fn (kernel/time/timer.c:1747)
 __run_timers (kernel/time/timer.c:1799 kernel/time/timer.c:2372)
 timer_expire_remote (kernel/time/timer.c:2385 kernel/time/timer.c:2376 kernel/time/timer.c:2135)
 tmigr_handle_remote_up (kernel/time/timer_migration.c:944 kernel/time/timer_migration.c:1035)
 __walk_groups.isra.0 (kernel/time/timer_migration.c:533 (discriminator 1))
 tmigr_handle_remote (kernel/time/timer_migration.c:1096)
 handle_softirqs (./arch/x86/include/asm/jump_label.h:36 ./include/trace/events/irq.h:142 kernel/softirq.c:580)
 irq_exit_rcu (kernel/softirq.c:614 kernel/softirq.c:453 kernel/softirq.c:680 kernel/softirq.c:696)
 sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1050 (discriminator 35) arch/x86/kernel/apic/apic.c:1050 (discriminator 35))
 </IRQ>

Fixes: 8336886 ("tcp: TCP Fast Open Server - support TFO listeners")
	Reported-by: syzkaller <syzkaller@googlegroups.com>
	Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250915175800.118793-2-kuniyu@google.com
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 45c8a6c)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira KERNEL-249
Rebuild_History Non-Buildable kernel-4.18.0-553.87.1.el8_10
commit-author Kuniyuki Iwashima <kuniyu@google.com>
commit 2e7cbbb

syzbot reported the splat below in tcp_conn_request(). [0]

If a listener is close()d while a TFO socket is being processed in
tcp_conn_request(), inet_csk_reqsk_queue_add() does not set reqsk->sk
and calls inet_child_forget(), which calls tcp_disconnect() for the
TFO socket.

After the cited commit, tcp_disconnect() calls reqsk_fastopen_remove(),
where reqsk_put() is called due to !reqsk->sk.

Then, reqsk_fastopen_remove() in tcp_conn_request() decrements the
last req->rsk_refcnt and frees reqsk, and __reqsk_free() at the
drop_and_free label causes the refcount underflow for the listener
and double-free of the reqsk.

Let's remove reqsk_fastopen_remove() in tcp_conn_request().

Note that other callers make sure tp->fastopen_rsk is not NULL.

[0]:
refcount_t: underflow; use-after-free.
WARNING: CPU: 12 PID: 5563 at lib/refcount.c:28 refcount_warn_saturate (lib/refcount.c:28)
Modules linked in:
CPU: 12 UID: 0 PID: 5563 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
RIP: 0010:refcount_warn_saturate (lib/refcount.c:28)
Code: ab e8 8e b4 98 ff 0f 0b c3 cc cc cc cc cc 80 3d a4 e4 d6 01 00 75 9c c6 05 9b e4 d6 01 01 48 c7 c7 e8 df fb ab e8 6a b4 98 ff <0f> 0b e9 03 5b 76 00 cc 80 3d 7d e4 d6 01 00 0f 85 74 ff ff ff c6
RSP: 0018:ffffa79fc0304a98 EFLAGS: 00010246
RAX: d83af4db1c6b3900 RBX: ffff9f65c7a69020 RCX: d83af4db1c6b3900
RDX: 0000000000000000 RSI: 00000000ffff7fff RDI: ffffffffac78a280
RBP: 000000009d781b60 R08: 0000000000007fff R09: ffffffffac6ca280
R10: 0000000000017ffd R11: 0000000000000004 R12: ffff9f65c7b4f100
R13: ffff9f65c7d23c00 R14: ffff9f65c7d26000 R15: ffff9f65c7a64ef8
FS:  00007f9f962176c0(0000) GS:ffff9f65fcf00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000200000000180 CR3: 000000000dbbe006 CR4: 0000000000372ef0
Call Trace:
 <IRQ>
 tcp_conn_request (./include/linux/refcount.h:400 ./include/linux/refcount.h:432 ./include/linux/refcount.h:450 ./include/net/sock.h:1965 ./include/net/request_sock.h:131 net/ipv4/tcp_input.c:7301)
 tcp_rcv_state_process (net/ipv4/tcp_input.c:6708)
 tcp_v6_do_rcv (net/ipv6/tcp_ipv6.c:1670)
 tcp_v6_rcv (net/ipv6/tcp_ipv6.c:1906)
 ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:438)
 ip6_input (net/ipv6/ip6_input.c:500)
 ipv6_rcv (net/ipv6/ip6_input.c:311)
 __netif_receive_skb (net/core/dev.c:6104)
 process_backlog (net/core/dev.c:6456)
 __napi_poll (net/core/dev.c:7506)
 net_rx_action (net/core/dev.c:7569 net/core/dev.c:7696)
 handle_softirqs (kernel/softirq.c:579)
 do_softirq (kernel/softirq.c:480)
 </IRQ>

Fixes: 45c8a6c ("tcp: Clear tcp_sk(sk)->fastopen_rsk in tcp_disconnect().")
	Reported-by: syzkaller <syzkaller@googlegroups.com>
	Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20251001233755.1340927-1-kuniyu@google.com
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 2e7cbbb)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v4.18~1..kernel-mainline: 581414
Number of commits in rpm: 21
Number of commits matched with upstream: 14 (66.67%)
Number of commits in upstream but not in rpm: 581400
Number of commits NOT found in upstream: 7 (33.33%)

Rebuilding Kernel on Branch rocky8_10_rebuild_kernel-4.18.0-553.87.1.el8_10 for kernel-4.18.0-553.87.1.el8_10
Clean Cherry Picks: 10 (71.43%)
Empty Cherry Picks: 4 (28.57%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-4.18.0-553.87.1.el8_10/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual Empty Commit failures contained in the same containing directory.
The git message for empty commits will have the path for the failed commit.
File names are the first 8 characters of the upstream SHA
@PlaidCat PlaidCat requested a review from a team December 4, 2025 19:09
@PlaidCat PlaidCat self-assigned this Dec 4, 2025
Copy link
Collaborator

@bmastbergen bmastbergen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥌

Copy link

@jdieter jdieter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@PlaidCat PlaidCat merged commit d79f08d into rocky8_10 Dec 5, 2025
2 checks passed
@PlaidCat PlaidCat deleted the rocky8_10_rebuild branch December 5, 2025 14:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

4 participants