
Conversation

@CV-Bowen (Contributor) commented Jan 4, 2026

Summary

This PR series addresses critical issues in the rpmsg/openamp subsystem related to mutex handling and service lifecycle management.

Changes

1. openamp/libmetal: Change mutex to recursive mutex to prevent deadlock

Changed the libmetal mutex implementation from a regular mutex to a recursive mutex (rmutex) to prevent crashes when the same lock is acquired twice in nested code paths.

Problem: During the destroy process, we hold rdev->lock while iterating rdev->endpoints to destroy all endpoints. However, rpmsg_destroy_ept() may send a name service message, which attempts to acquire rdev->lock again, leading to a deadlock and crash.

Solution: Use recursive mutex in libmetal to allow the same thread to acquire the lock multiple times safely.

This change has already been submitted upstream to the OpenAMP/libmetal community: OpenAMP/libmetal#352
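
A minimal POSIX sketch of the failure mode and the fix (illustrative only, not the libmetal code; destroy_all_endpoints() and destroy_ept() are hypothetical stand-ins for the rpmsg destroy path). With a plain mutex the nested lock would deadlock; with PTHREAD_MUTEX_RECURSIVE, the pthread analogue of NuttX's rmutex_t, the same thread can re-acquire the lock safely:

```c
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t g_lock;

/* Hypothetical stand-in for rpmsg_destroy_ept(): it also needs the lock,
 * e.g. to queue a name-service "destroy" message.
 */

static void destroy_ept(int ept_id)
{
  pthread_mutex_lock(&g_lock);      /* second acquisition by the same thread */
  printf("destroy endpoint %d (nested lock held)\n", ept_id);
  pthread_mutex_unlock(&g_lock);
}

/* Hypothetical stand-in for the device destroy path: hold the lock while
 * iterating the endpoint list and destroy each endpoint in turn.
 */

static void destroy_all_endpoints(void)
{
  pthread_mutex_lock(&g_lock);      /* first acquisition */
  for (int i = 0; i < 3; i++)
    {
      destroy_ept(i);
    }

  pthread_mutex_unlock(&g_lock);
}

int main(void)
{
  pthread_mutexattr_t attr;

  pthread_mutexattr_init(&attr);
  pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
  pthread_mutex_init(&g_lock, &attr);

  destroy_all_endpoints();          /* completes; a non-recursive mutex would hang here */
  return 0;
}
```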

2. drivers/rpmsg: Remove held status check after rmutex change

Since metal_mutex is now backed by rmutex_t, the redundant held-status checks in the rpmsg drivers (rpmsg_port.c, rpmsg_virtio_lite.c, rptun.c) were removed. The recursive mutex handles re-acquisition by the same thread naturally.
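
A rough before/after fragment, assuming the drivers used a metal_mutex_is_acquired() guard pattern; ept_cb_old()/ept_cb_new() are hypothetical names and the actual driver code differs:

```c
#include <stdbool.h>

#include <metal/mutex.h>
#include <openamp/rpmsg.h>

/* Before: guard against self-deadlock by checking whether the lock is
 * already held before taking it (hypothetical shape of the removed check).
 */

static void ept_cb_old(struct rpmsg_device *rdev)
{
  bool acquired = false;

  if (!metal_mutex_is_acquired(&rdev->lock))
    {
      metal_mutex_acquire(&rdev->lock);
      acquired = true;
    }

  /* ... walk rdev->endpoints ... */

  if (acquired)
    {
      metal_mutex_release(&rdev->lock);
    }
}

/* After: metal_mutex is backed by rmutex_t, so the same thread can simply
 * re-acquire the lock and the guard is unnecessary.
 */

static void ept_cb_new(struct rpmsg_device *rdev)
{
  metal_mutex_acquire(&rdev->lock);

  /* ... walk rdev->endpoints ... */

  metal_mutex_release(&rdev->lock);
}
```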

3. rpmsg: Call unbound_cb for server/client services during destruction

Problem: Before this patch, rpmsg services were not properly destroyed after calling rpmsg_device_destroy(). This caused errors when calling rpmsg_device_created() during a second connection attempt with the peer.

Solution: Properly invoke unbound_cb callback for services used as server/client during the destruction process to ensure complete cleanup.
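
A self-contained sketch of the intended behavior; the structures and names below (struct service, device_destroy(), unbound_cb) are hypothetical and only illustrate that every bound server/client service gets its unbound hook on destroy, so a later rpmsg_device_created() starts from a clean state:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical service record: one per server/client service bound to the
 * remote device.  The real NuttX structures and names differ.
 */

struct service
{
  const char *name;
  void (*unbound_cb)(struct service *svc);   /* cleanup hook */
  struct service *next;
};

static void service_unbound(struct service *svc)
{
  printf("%s: release endpoint, reset state\n", svc->name);
}

/* Sketch of the fix: on device destroy, walk every bound service and invoke
 * its unbound callback so the next connection attempt begins with no stale
 * endpoints left behind.
 */

static void device_destroy(struct service *head)
{
  for (struct service *svc = head; svc != NULL; svc = svc->next)
    {
      if (svc->unbound_cb != NULL)
        {
          svc->unbound_cb(svc);
        }
    }
}

int main(void)
{
  struct service syslog = { "rpmsg-syslog", service_unbound, NULL };
  struct service ping   = { "rpmsg-ping",   service_unbound, &syslog };

  device_destroy(&ping);
  return 0;
}
```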

Impact

Rpmsg and Rptun

Testing

cmake -B cmake_out/v8a_server -DBOARD_CONFIG=qemu-armv8a:rpserver_ivshmem -GNinja
cmake --build cmake_out/v8a_server

cmake -B cmake_out/v8a_proxy -DBOARD_CONFIG=qemu-armv8a:rpproxy_ivshmem -GNinja
cmake --build cmake_out/v8a_proxy

qemu-system-aarch64 -cpu cortex-a53 -nographic \
  -machine virt,virtualization=on,gic-version=3 \
  -chardev stdio,id=con,mux=on -serial chardev:con \
  -object memory-backend-file,discard-data=on,id=shmmem-shmem0,mem-path=/dev/shm/my_shmem0,size=4194304,share=yes \
  -device ivshmem-plain,id=shmem0,memdev=shmmem-shmem0,addr=0xb \
  -mon chardev=con,mode=readline -kernel ./nuttx/cmake_out/v8a_server/nuttx \
  -gdb tcp::7775

qemu-system-aarch64 -cpu cortex-a53 -nographic \
  -machine virt,virtualization=on,gic-version=3 \
  -chardev stdio,id=con,mux=on -serial chardev:con \
  -object memory-backend-file,discard-data=on,id=shmmem-shmem0,mem-path=/dev/shm/my_shmem0,size=4194304,share=yes \
  -device ivshmem-plain,id=shmem0,memdev=shmmem-shmem0,addr=0xb \
  -mon chardev=con,mode=readline -kernel ./nuttx/cmake_out/v8a_proxy/nuttx \
  -gdb tcp::7776

server log:

NuttShell (NSH) NuttX-12.10.0
server> 
server> 
server> rptun ping all 1 1 1 1
[    0.000000] [ 5] [ EMERG] [server] ping times: 1
[    0.000000] [ 5] [ EMERG] [server] buffer_len: 2032, send_len: 17
[    0.000000] [ 5] [ EMERG] [server] avg: 0 s, 18603232 ns
[    0.000000] [ 5] [ EMERG] [server] min: 0 s, 18603232 ns
[    0.000000] [ 5] [ EMERG] [server] max: 0 s, 18603232 ns
[    0.000000] [ 5] [ EMERG] [server] rate: 0.007310 Mbits/sec
server> 
server> 
server> ps
  PID GROUP PRI POLICY   TYPE    NPX STATE    EVENT     SIGMASK            STACK    USED FILLED COMMAND
    0     0   0 FIFO     Kthread   - Ready              0000000000000000 0008160 0001792  21.9%  Idle_Task
    1     0 192 FIFO     Kthread   - Waiting  Semaphore 0000000000000000 0008096 0001344  16.6%  hpwork 0x4046dc08 0x4046dc88
    2     0 100 FIFO     Kthread   - Waiting  Semaphore 0000000000000000 0008096 0001344  16.6%  lpwork 0x4046dcb8 0x4046dd38
    4     0 224 FIFO     Kthread   - Waiting  Semaphore 0000000000000000 0008096 0001776  21.9%  rptun proxy 0x40488550
    5     5 100 FIFO     Task      - Running            0000000000000000 0008128 0004240  52.1%  nsh_main
server> 
server> rptun dump all
[    0.000000] [ 5] [ EMERG] [server] Remote: proxy headrx 8
[    0.000000] [ 5] [ EMERG] [server] Dump rpmsg info between cpu (master: yes)server <==> proxy:
[    0.000000] [ 5] [ EMERG] [server] rpmsg vq RX:
[    0.000000] [ 5] [ EMERG] [server] rpmsg vq TX:
[    0.000000] [ 5] [ EMERG] [server]   rpmsg ept list:
[    0.000000] [ 5] [ EMERG] [server]     ept NS
[    0.000000] [ 5] [ EMERG] [server]     ept rpmsg-sensor
[    0.000000] [ 5] [ EMERG] [server]     ept rpmsg-ping
[    0.000000] [ 5] [ EMERG] [server]     ept rpmsg-syslog
[    0.000000] [ 5] [ EMERG] [server]   rpmsg buffer list:
[    0.000000] [ 5] [ EMERG] [server]     RX buffer, total 8, pending 0
[    0.000000] [ 5] [ EMERG] [server]     TX buffer, total 8, pending 0
server> uname -a
NuttX server 12.10.0 0cb3aa6c84 Jan  4 2026 15:10:56 arm64 qemu-armv8a
server> 

@github-actions github-actions bot added Area: Drivers Drivers issues Size: M The size of the change in this PR is medium labels Jan 4, 2026
CV-Bowen and others added 3 commits January 4, 2026 15:28
Avoid the crash when the lock is acquired twice. One case is the
destroy process: we hold rdev->lock while iterating rdev->endpoints to
destroy all the endpoints, but rpmsg_destroy_ept() may send a name
service message and needs to acquire rdev->lock again, which leads to
a crash.

Signed-off-by: Bowen Wang <wangbowen6@xiaomi.com>
metal_mutex has been changed to rmutex_t, so there is no need to check
the held status again.

Signed-off-by: Bowen Wang <wangbowen6@xiaomi.com>
…stroyed

Before this patch, the rpmsg services will not be destroyed after
calling rpmsg_device_destroy(), which will cause errors after calling
rpmsg_device_created() at the second connection with the peer.

Signed-off-by: liaoao <liaoao@xiaomi.com>
Signed-off-by: Bowen Wang <wangbowen6@xiaomi.com>
@CV-Bowen CV-Bowen changed the title Should hold lock when access the rdev->endpoints Rpmsg: rpmsg services should be properly destroyed after calling rpmsg_device_destroy() Jan 4, 2026
@xiaoxiang781216 xiaoxiang781216 merged commit d019a63 into apache:master Jan 4, 2026
40 checks passed
