Wordy

Table of Contents

Bored in class? Try this cryptic Wordle twist and crack the next word!

nc wordy.ctf.pascalctf.it 5005

The challenge implements a Wordle-style game where secret words are generated by an MT19937 PRNG. To obtain the flag, we must correctly predict 5 consecutive future secrets. The twist: we only observe 20 bits of each 32-bit RNG output.

The attack at a high level
#

The core idea is simple: MT19937 is deterministic. If we can figure out its internal state, we can predict all future outputs.

The problem is we don’t see the full outputs. Each secret word encodes only 20 of the 32 bits the RNG produces:

def new_secret():
    out = rng.next_u32()           # 32-bit output
    idx = out & ((1 << 20) - 1)    # keep only lower 20 bits
    current_secret = index_to_word(idx)

So we’re missing 12 bits per output. Can we still recover the state?

Yes, because MT19937’s state is heavily constrained by its structure. Even partial observations, if we collect enough of them, uniquely determine the state. We use Z3 (an SMT solver) to find a state consistent with all our observations.

How MT19937 works
#

MT19937 maintains 624 32-bit words as internal state, plus an index tracking the current position.

Generating output
#

When you call next_u32(), it:

Reads state[index]
Applies a “tempering” transformation (reversible bit scrambling)
Increments the index

def next_u32(self):
    if self.index >= 624:
        self.twist()  # regenerate state when exhausted

    y = self.mt[self.index]
    self.index += 1

    # Tempering: reversible bit scrambling
    y ^= (y >> 11)
    y ^= ((y << 7) & 0x9D2C5680)
    y ^= ((y << 15) & 0xEFC60000)
    y ^= (y >> 18)

    return y

The twist operation
#

After 624 outputs, the state is exhausted and gets regenerated via “twist”:

def twist(self):
    for i in range(624):
        # Combine upper bit of state[i] with lower 31 bits of state[i+1]
        y = (self.mt[i] & 0x80000000) | (self.mt[(i + 1) % 624] & 0x7FFFFFFF)

        # XOR with state[i+397] and conditional term
        self.mt[i] = self.mt[(i + 397) % 624] ^ (y >> 1)
        if y & 1:
            self.mt[i] ^= 0x9908B0DF

    self.index = 0

The key observation: every operation here is XOR, AND with constants, or bit shifts. No multiplication, no addition that carries between bits. This makes the entire system linear over individual bits.

Why partial observations are enough
#

Here’s the crucial insight that makes this attack work.

Tracing bits through tempering
#

Let’s trace exactly what happens to the bits. Say state word mt[0] has bits we’ll call b31, b30, ..., b0. The first tempering step is:

y = mt[0]
y ^= (y >> 11)

What does y ^= (y >> 11) do to each bit position?

Before:  b31 b30 b29 ... b21 b20 b19 b18 ... b11 b10 ... b0
Shifted:   0   0   0 ...   0 b31 b30 b29 ... b22 b21 ... b11
XOR:     b31 b30 b29 ... b21  X   X   X  ...  X   X  ...  X

where X means "XOR of two bits"

So after this step:

Bit 20 of the output = b20 XOR b31
Bit 19 of the output = b19 XOR b30
Bit 10 of the output = b10 XOR b21
Bits 21-31 are unchanged (nothing shifts into them)

Each subsequent tempering step (y ^= ((y << 7) & MASK), etc.) does more mixing, but it’s always XORing bits together. After all tempering steps, each output bit is the XOR of some specific subset of the original state bits.

From bits to equations
#

Here’s where it becomes useful. Suppose after all tempering, output bit 5 equals b17 XOR b8 XOR b3 (I’m making up the exact combination, the real one depends on all the tempering steps).

If we observe that output bit 5 = 1, we now know:

b17 XOR b8 XOR b3 = 1

That’s an equation! It tells us these three state bits can’t all be the same, either one is 1 and two are 0 or all three are 1.

Each observed bit gives us one such equation. If we observe 20 bits of output, we get 20 equations constraining the state bits.

Why XOR makes this “linear”
#

XOR has a nice property: it behaves like addition where 1+1=0 (arithmetic modulo 2). This means our equations are linear, no bit is multiplied by another bit, just added (XORed) together.

Linear systems can be solved efficiently. Given enough equations, you can solve for all unknowns using Gaussian elimination (or in our case, Z3 does something equivalent).

If tempering used AND or OR in ways that weren’t masked to constants, we’d get nonlinear equations like b5 AND b3 = 1, which are much harder to solve in bulk.

Counting constraints vs unknowns
#

The full state is 624 × 32 = 19968 bits. That’s what we need to determine.

Each observation gives us 20 bits of output. Each bit is one linear equation constraining the state. So:

624 observations → 624 × 20 = 12480 constraints (not enough)
1000 observations → 1000 × 20 = 20000 constraints (just enough)

But raw constraint count isn’t the whole story. We also need the constraints to be independent, meaning they need to tell us different things about the state.

Why the twist helps
#

Before the twist, output \(i\) comes directly from state[i] (through tempering). Each observation only constrains bits of a single state word. If we only collected 624 observations from epoch 0, each state word would have 20 known bits and 12 unknown bits, with no way to pin down those 12 bits.

After the twist, everything changes. Look at how twisted state is computed:

twisted[i] = state[(i + 397) % 624] ^ (y >> 1) ^ ...
# where y combines state[i] and state[i+1]

So twisted[i] depends on bits from state[i], state[i+1], AND state[(i+397) % 624]. Three different original state words get mixed together.

When we observe output 624 (first output after twist), we’re constraining a linear combination of bits from multiple original state words. These cross-word constraints tie the whole state together into a connected system.

With observations spanning two twist boundaries (outputs 0-623, 624-1247, 1248+), we get a dense web of constraints where each state word is linked to many others. This makes the system fully determined despite only seeing 20 bits per output.

Why Z3 works here
#

Z3’s bitvector solver is essentially doing Gaussian elimination on a system of linear equations over bits. This isn’t brute force, it’s \(O(n^3)\) linear algebra, which is fast even for 20000 equations.

If MT19937 used multiplication or other nonlinear operations, Z3 would have to fall back to SAT solving, which could take exponential time. The linearity is what makes this tractable.

Solving Wordle efficiently
#

Before recovering the RNG state, we need to extract the secrets. Brute-forcing all \(16^5\) possibilities per round is infeasible, but the Wordle feedback mechanism enables rapid narrowing.

Letter enumeration strategy
#

We probe each letter by guessing 5 copies. For a guess like aaaaa against secret abcda:

Feedback: G___G
Every a in the secret shows as G (green) at its exact position

This means we don’t actually need Yellow feedback at all. Greens tell us both which letters appear and where they are. After probing all 16 letters, we know every position for each letter. The remaining unknowns are just permutations of letters that could fit multiple positions.

Batch sending
#

A key optimization: send all guesses at once instead of waiting for each response. With 1260 rounds of Wordle over a network connection, round-trip latency adds up fast.

# Send all 16 letter probes in one batch
guesses = [f"GUESS {c * 5}" for c in ALPHABET]
io.sendline("\n".join(guesses).encode())

# Then read all 16 responses
for c in ALPHABET:
    fb = io.recvline().decode().split()[1]
    # process feedback...

Same idea for candidate permutations: generate them all, send in one batch and read of G for each position.

Z3-based state recovery
#

What is Z3?
#

Z3 is an SMT (Satisfiability Modulo Theories) solver, a tool that finds values satisfying a set of constraints, or proves no solution exists.

The key idea is symbolic computation: instead of calculating with concrete numbers, we describe operations on unknown variables and let Z3 figure out what those variables must be. This lets us “run MT19937 backwards”: we know the outputs, and Z3 finds the state that produced them.

What “symbolic” actually means
#

When we write normal Python:

x = 5
y = x ^ 3  # y is now 6

Python computes 5 ^ 3 = 6 immediately. The variable y holds the concrete value 6.

Z3 works differently. When we create a Z3 variable:

x = BitVec('x', 32)  # "x is some unknown 32-bit value"

This x isn’t a number. It’s a symbol representing “some 32-bit value we don’t know yet.” Now when we write:

y = x ^ 3

Python doesn’t (can’t!) compute this, because x has no concrete value. Instead, Z3 overloads the ^ operator so that it builds an expression tree rather than computing a result. This is why the code looks like normal math, but Z3 has hijacked every operator to record what’s happening instead of actually doing it:

    XOR
   /   \
  x     3

The variable y now holds this tree structure, not a number. If we keep going:

z = y ^ 7

We get a bigger tree:

      XOR
     /   \
   XOR    7
  /   \
 x     3

This is what “symbolic computation” means: instead of calculating results, we’re building up descriptions of how to calculate results, in terms of unknown variables.

How constraints work
#

Now comes the key part. When we write:

s = Solver()
s.add(z == 10)

We’re telling Z3: “the expression (x ^ 3) ^ 7 must equal 10.” Z3 records this constraint. Later, when we call s.check(), Z3 works backwards: “what value of x makes this true?”

For this simple example: (x ^ 3) ^ 7 = 10 means x ^ 3 = 10 ^ 7 = 13, so x = 13 ^ 3 = 14.

Scaling up to MT19937
#

The same principle applies to our attack. We create 624 symbolic variables:

state = [BitVec(f'mt_{i}', 32) for i in range(N)]

Each state[i] represents “the unknown \(i\)-th word of the MT19937 state.”

Then we symbolically compute tempering:

tempered = state[i]
tempered = tempered ^ LShR(tempered, 11)
tempered = tempered ^ ((tempered << 7) & 0x9D2C5680)
tempered = tempered ^ ((tempered << 15) & 0xEFC60000)
tempered = tempered ^ LShR(tempered, 18)

After this, tempered isn’t a number. It’s an enormous expression tree with state[i] at the leaves, describing exactly how tempering transforms that state word. (Note: LShR is Z3’s logical right shift, which fills with zeros like C does for unsigned integers. Z3’s >> operator always does arithmetic shift (copies the top bit), so 0x80000000 >> 1 would give 0xC0000000 instead of 0x40000000.)

Now we add a constraint:

s.add(Extract(19, 0, tempered) == observations[i])

Extract(19, 0, tempered) builds another expression: “bits 0-19 of the tempered result.” Setting it equal to our observation tells Z3: “whatever state[i] is, when you temper it and take the low 20 bits, you must get this specific value.”

Each observation adds 20 such bit-level constraints. After 1000+ observations, Z3 has enough constraints to uniquely determine all 624 state words.

Handling the twist
#

For observations 624+, the output comes from twisted state. We compute this symbolically too:

MATRIX_A = BitVecVal(0x9908B0DF, 32)  # Concrete constant, not symbolic
twisted = []
for i in range(N):
    y = (state[i] & 0x80000000) | (state[(i + 1) % N] & 0x7FFFFFFF)
    twisted.append(state[(i + 397) % N] ^ LShR(y, 1) ^
                  If((y & 1) == 1, MATRIX_A, BitVecVal(0, 32)))

Now twisted[i] is an expression tree involving multiple original state words: state[i], state[i+1], and state[(i+397) % 624]. The If(cond, then, else) creates a conditional expression that Z3 will resolve once it knows the bit values.

Constraints on twisted outputs therefore link multiple state words together, creating the cross-word dependencies that make the system fully determined.

Solving
#

Finally:

s.check()  # Find values satisfying all constraints
return [s.model()[state[i]].as_long() for i in range(N)]

Z3 internally converts all our expression trees into a system of boolean equations (one per bit) and solves via something like Gaussian elimination. Because XOR is linear, this is efficient, typically under a minute for our ~25000 equations.

s.check()  # Returns sat, unsat, or unknown
return [s.model()[state[i]].as_long() for i in range(N)]

s.check() runs the solver. If satisfiable, s.model() returns a mapping from symbolic variables to concrete values. We extract each state word with .as_long().

The solver typically finishes in under a minute because the constraint system is linear over bits (XOR-based). Z3 essentially performs Gaussian elimination on ~25000 boolean equations.

Putting it together
#

Collection phase
#

We play 1260 rounds of Wordle, recovering each secret and extracting its 20-bit index:

observations = []
for i in range(1260):
    io.sendline(b'NEW')
    io.recvuntil(b'ROUND STARTED')
    secret = solve_wordle(io)
    observations.append(word_to_index(secret))

The choice of 1260 ensures we span two twist boundaries (twists occur at outputs 624 and 1248), maximizing constraint diversity.

Recovery and prediction phase
#

With all observations collected, we recover the state and create a concrete RNG instance:

state = recover_state(observations)
rng = MT19937(state, 0)

We fast-forward past all the outputs we’ve already seen:

for _ in range(len(observations)):
    rng.next_u32()

Now the RNG is positioned exactly where the server’s RNG is. The next 5 outputs will match:

for _ in range(5):
    pred = rng.next_u32() & 0xFFFFF
    io.sendline(f'FINAL {index_to_word(pred)}'.encode())

Final solve script
#

from pwn import *
from z3 import *
import itertools

ALPHABET = "abcdefghijklmnop"
N = 624

def index_to_word(idx):
    word = ""
    for _ in range(5):
        word = ALPHABET[idx % 16] + word
        idx //= 16
    return word

def word_to_index(word):
    x = 0
    for c in word:
        x = x * 16 + ALPHABET.index(c)
    return x

def solve_wordle(io):
    known_pos = [None] * 5
    counts = {}

    for c in ALPHABET:
        io.sendline(f'GUESS {c * 5}'.encode())
        io.recvuntil(b'FEEDBACK ')
        fb = io.recvline().decode().strip()

        if fb == '_____':
            continue

        g_count = fb.count('G')
        y_count = fb.count('Y')
        total = g_count + y_count

        if total > 0:
            counts[c] = total
            for i, f in enumerate(fb):
                if f == 'G':
                    known_pos[i] = c

    if all(p is not None for p in known_pos):
        return ''.join(known_pos)

    unknown = [i for i in range(5) if known_pos[i] is None]
    remaining = dict(counts)
    for p in known_pos:
        if p is not None and p in remaining:
            remaining[p] -= 1
            if remaining[p] == 0:
                del remaining[p]

    letters = []
    for c, n in remaining.items():
        letters += [c] * n

    for perm in itertools.permutations(letters, len(unknown)):
        candidate = list(known_pos)
        for i, pos in enumerate(unknown):
            candidate[pos] = perm[i]
        word = ''.join(candidate)

        io.sendline(f'GUESS {word}'.encode())
        io.recvuntil(b'FEEDBACK ')
        if io.recvline().decode().strip() == 'GGGGG':
            return word

class MT19937:
    def __init__(self, state, index=0):
        self.mt = state[:]
        self.index = index

    def twist(self):
        old = self.mt[:]
        for i in range(N):
            y = (old[i] & 0x80000000) | (old[(i + 1) % N] & 0x7FFFFFFF)
            self.mt[i] = (old[(i + 397) % N] ^ (y >> 1) ^ (0x9908B0DF if y & 1 else 0)) & 0xFFFFFFFF
        self.index = 0

    def next_u32(self):
        if self.index >= N:
            self.twist()
        y = self.mt[self.index]
        self.index += 1
        y ^= (y >> 11)
        y ^= ((y << 7) & 0x9D2C5680)
        y ^= ((y << 15) & 0xEFC60000)
        y ^= (y >> 18)
        return y & 0xFFFFFFFF

def recover_state(observations):
    s = Solver()
    state = [BitVec(f'mt_{i}', 32) for i in range(N)]

    # Epoch 0: outputs 0-623
    for i in range(min(N, len(observations))):
        tempered = state[i]
        tempered = tempered ^ LShR(tempered, 11)
        tempered = tempered ^ ((tempered << 7) & 0x9D2C5680)
        tempered = tempered ^ ((tempered << 15) & 0xEFC60000)
        tempered = tempered ^ LShR(tempered, 18)
        s.add(Extract(19, 0, tempered) == observations[i])

    if len(observations) > N:
        # Compute twisted state symbolically
        MATRIX_A = BitVecVal(0x9908B0DF, 32)
        twisted = []
        for i in range(N):
            y = (state[i] & 0x80000000) | (state[(i + 1) % N] & 0x7FFFFFFF)
            twisted.append(state[(i + 397) % N] ^ LShR(y, 1) ^
                          If((y & 1) == 1, MATRIX_A, BitVecVal(0, 32)))

        # Epoch 1: outputs 624-1247
        for i in range(N, min(2 * N, len(observations))):
            idx = i - N
            tempered = twisted[idx]
            tempered = tempered ^ LShR(tempered, 11)
            tempered = tempered ^ ((tempered << 7) & 0x9D2C5680)
            tempered = tempered ^ ((tempered << 15) & 0xEFC60000)
            tempered = tempered ^ LShR(tempered, 18)
            s.add(Extract(19, 0, tempered) == observations[i])

    if len(observations) > 2 * N:
        # Second twist
        twisted2 = []
        for i in range(N):
            y = (twisted[i] & 0x80000000) | (twisted[(i + 1) % N] & 0x7FFFFFFF)
            twisted2.append(twisted[(i + 397) % N] ^ LShR(y, 1) ^
                           If((y & 1) == 1, MATRIX_A, BitVecVal(0, 32)))

        # Epoch 2: outputs 1248+
        for i in range(2 * N, len(observations)):
            idx = i - 2 * N
            tempered = twisted2[idx]
            tempered = tempered ^ LShR(tempered, 11)
            tempered = tempered ^ ((tempered << 7) & 0x9D2C5680)
            tempered = tempered ^ ((tempered << 15) & 0xEFC60000)
            tempered = tempered ^ LShR(tempered, 18)
            s.add(Extract(19, 0, tempered) == observations[i])

    s.check()
    return [s.model()[state[i]].as_long() for i in range(N)]

io = remote("wordy.ctf.pascalctf.it", 5005)
# io = process(['python3', 'service.py'], env={"FLAG": "test{flag}"})
io.recvuntil(b"READY")

observations = []
for i in range(1260):
    io.sendline(b"NEW")
    io.recvuntil(b"ROUND STARTED")
    secret = solve_wordle(io)
    observations.append(word_to_index(secret))

state = recover_state(observations)
rng = MT19937(state, 0)

for _ in range(len(observations)):
    rng.next_u32()

for _ in range(5):
    pred = rng.next_u32() & 0xFFFFF
    io.sendline(f"FINAL {index_to_word(pred)}".encode())
    print(io.recvline().decode())

io.interactive()

Flag
#

pascalCTF{Y0ur_l1k3_a_3ncycl0p3d14_0f_r4nd0m_w0rds!}

The attack at a high level #

How MT19937 works #

Generating output #

The twist operation #

Why partial observations are enough #

Tracing bits through tempering #

From bits to equations #

Why XOR makes this “linear” #

Counting constraints vs unknowns #

Why the twist helps #

Why Z3 works here #

Solving Wordle efficiently #

Letter enumeration strategy #

Batch sending #

Z3-based state recovery #

What is Z3? #

What “symbolic” actually means #

How constraints work #

Scaling up to MT19937 #

Handling the twist #

Solving #

Putting it together #

Collection phase #

Recovery and prediction phase #

Final solve script #

Flag #