dmaengine: adi-dma: fix use-after-free in cyclic transfers #3112
pamolloy wants to merge 1 commit into adsp-6.12.0-y
Conversation
Agreed! Claude overdid it IMO. I would go with your proposed fix as something temporary... Looking at the driver it seems like a big refactor is needed anyways.
Yah, I dug a little deeper, especially since it removed the following comment: And after some prompting it concluded: Before I saw that comment I was confused why the cyclic and non-cyclic approaches were different.
Probably over complicating again 😄 . That driver looks like a typical out-of-tree driver, so we'll have to refactor it a bit. So, IMO, I would go with your fix. Yeah, we have a copy in there, but it's a small struct and I would not expect any overhead at all.
A race condition exists in the threaded interrupt handler when processing
cyclic DMA descriptors. The handler accesses channel->current_desc->result
after releasing the channel lock, creating a window where another CPU
executing adi_dma_terminate_all() can free the descriptor.
Race sequence:
  CPU 0 (thread handler)          CPU 1 (terminate_all)
  ----------------------          ---------------------
  Lock channel->lock
  Check current_desc->cyclic
  Get callback pointer
  Unlock channel->lock
                                  Lock channel->lock
                                  Free current_desc
                                  Unlock channel->lock
  Access current_desc->result  <- Use-after-free
This results in accessing freed memory containing list poison values
(0xdead000000000100), leading to kernel crashes.
Fix this by copying the result structure to a local variable while
still holding the lock, then using the copy after unlocking. This
follows the same safe pattern used in the non-cyclic path, which
already uses a local descriptor pointer.
The fix is minimal and avoids the complications warned about in the
original code comment regarding cyclic descriptors and the pending
list during termination.
Signed-off-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Philip Molloy <philip.molloy@analog.com>
Before I force push, here is the more complicated patch for future reference:
Force-pushed 4bd5720 to fd506fb.
@vasbimpikasadi can someone on your team test this? Ideally also with the example code the customer provided to reproduce the problem 🙏
apologies for taking a while. This will be now tested and we'll try to reproduce |
> @vasbimpikasadi can someone on your team test this? Ideally also with the example code the customer provided to reproduce the problem 🙏
Hi @pamolloy, I attempted to run the reproducer (without the fix applied) and got the following splat:
root@adsp-sc598-som-ezkit:~# insmod /dma_race_sim.ko
[ 55.314105] dma_race_sim: loading out-of-tree module taints kernel.
root@adsp-sc598-som-ezkit:~# [ 55.423785] SIM: Terminator done.
[ 55.523793] Unable to handle kernel paging request at virtual address dead000000000108
[ 55.531570] Mem abort info:
[ 55.534356] ESR = 0x0000000096000044
[ 55.538074] EC = 0x25: DABT (current EL), IL = 32 bits
[ 55.543369] SET = 0, FnV = 0
[ 55.546405] EA = 0, S1PTW = 0
[ 55.549544] FSC = 0x04: level 0 translation fault
[ 55.554392] Data abort info:
[ 55.557256] ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
[ 55.562723] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 55.567763] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 55.573052] [dead000000000108] address between user and kernel address ranges
[ 55.580179] Internal error: Oops: 0000000096000044 [#1] PREEMPT SMP
[ 55.586414] Modules linked in: dma_race_sim(O) crct10dif_ce
[ 55.591971] CPU: 0 UID: 0 PID: 271 Comm: sim_handler Tainted: G O 6.12.0-yocto-standard #1
[ 55.601603] Tainted: [O]=OOT_MODULE
[ 55.605073] Hardware name: ADI 64-bit SC598 SOM EZ Kit (DT)
[ 55.610629] pstate: 60000009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 55.617572] pc : handler_fn+0x84/0x94 [dma_race_sim]
[ 55.622520] lr : handler_fn+0x80/0x94 [dma_race_sim]
[ 55.627467] sp : ffff8000819f3e30
[ 55.630765] x29: ffff8000819f3e30 x28: 0000000000000000 x27: 0000000000000000
[ 55.637883] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[ 55.645001] x23: dead000000000122 x22: dead000000000100 x21: ffff000090f33040
[ 55.652118] x20: ffff800079238000 x19: ffff8000792384c0 x18: 0000000000000000
[ 55.659236] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 55.666353] x14: 0000000000000000 x13: 0000000000000008 x12: 0000000c936545c0
[ 55.673471] x11: 0000000000000001 x10: 0000000000000a60 x9 : ffff8000819f3cf0
[ 55.680589] x8 : ffff000090f55e80 x7 : ffff00009df43700 x6 : 0000000000000000
[ 55.687706] x5 : ffff8000792384c0 x4 : ffff00009df340c0 x3 : ffff8000792384c0
[ 55.694824] x2 : 0000000000000000 x1 : dead000000000100 x0 : dead000000000122
[ 55.701942] Call trace:
[ 55.704372] handler_fn+0x84/0x94 [dma_race_sim]
[ 55.708972] kthread+0xd8/0xe8
[ 55.712010] ret_from_fork+0x10/0x20
[ 55.715575] Code: 95ba41a9 aa1303e0 95dbb1b5 a94002a1 (f9000420)
[ 55.721646] {}{}[ end trace 0000000000000000 ]{}{}
[ 55.726282] note: sim_handler[271] exited with preempt_count 1
From the splat, the fault appears to occur in the reproducer module itself rather than directly in the adi-dma driver path.
The faulting PC is reported as handler_fn+0x84/0x94 [dma_race_sim]. I also reran it with CONFIG_DEBUG_INFO enabled to get debug symbols, but the result was the same.

After reviewing the reproducer source, it appears to create a shared linked list and start two kernel threads on it. The handler thread saves a pointer to a list node, drops the lock, sleeps, and later tries to delete that same node. In parallel, the terminator thread removes and frees all nodes from the list. That makes the reproducer itself capable of triggering a stale-pointer/UAF crash independently of the adi-dma driver. So at least from this run, it does not appear to reproduce the failure through the actual adi-dma driver path shown in the customer's original stack trace.

It should probably also be noted this was tested on kernel version 6.12 (yocto 5.0.1), while the customer seems to encounter the issue on kernel version 5.15.78, which would be yocto 3.1.0, released in 2023 iirc.
A customer reported a bug and proposed a fix for a kernel crash. The following stack trace is generated by simulating that crash.
I generated the patch in this PR based on the above. It will require more thorough review and testing. The comments are also fairly verbose and should likely get dropped before merging.
A more trivial fix would be as follows and more clearly identifies the problem: