A crab stole my json schema…
The challenge is a Rust binary that reads JSON from stdin and outputs either a crab emoji (success) or a sad face (failure):
$ echo '{"test": 1}' | ./curly-crab
Give me a JSONy flag!
😞
$ echo '???' | ./curly-crab
Give me a JSONy flag!
🦀
We need to figure out what JSON structure makes the crab happy.
Why Rust reversing is painful
Coming from C reversing, Rust binaries have some extra headaches:
- Monomorphization: Generic functions get duplicated for each concrete type. A simple Vec<T> becomes separate code for Vec<i32>, Vec<String>, etc. The binary bloats with near-identical functions.
- Aggressive inlining: Small functions get inlined everywhere. What would be a clean call instruction in C becomes a wall of duplicated code.
- Standard library bloat: Even simple operations pull in tons of library code for error handling, Result unwrapping, iterator machinery, etc. A “hello world” in Rust is 300KB+.
- Name mangling on steroids: Function names become monstrosities like _ZN4core3ptr85drop_in_place$LT$alloc..vec..Vec$LT$u8$GT$$GT$17h3b2c...
- Ownership/borrowing artifacts: The decompiled code is littered with drop_in_place calls, reference counting, and move semantics that obscure the actual logic.
The saving grace here: Rust’s serde library generates predictable patterns for JSON deserialization.
The reality: what you’re actually looking at
Before showing the cleaned-up version, here’s what Rust binaries actually look like in a decompiler. This is the real main function:
int64_t curly_crab::main::h71b58f7aacf87a44()
{
    std::io::stdio::_print::h526c462071e58c18(&data_7a8b[0x91], 0x2d);
    std::io::stdio::stdin::h11deceff11981680();
    void* var_30 = &std::io::stdio::stdin::INSTANCE::h067a27bca4e07de8;
    int32_t* rax;
    rax = std::io::stdio::Stdin::lock::h1079d43173269675(&var_30);
    // ... 50 more lines of Result unwrapping and panic handling ...
    serde_json::de::from_trait::h1f3bcad3bd3177ac(&var_80, &var_d8);
    if (var_80 != -0x8000000000000000) {
        std::io::stdio::_print::h526c462071e58c18(&data_7b32, 0xb); // 🦀
    } else {
        std::io::stdio::_eprint::hbab4723ed852db00(&data_7b37, 0xb); // 😞
    }
    // ... 30 more lines of cleanup ...
}

The useful bits are buried in noise. Here’s how to navigate it.
Practical tips for reversing Rust/serde
1. Use function names as landmarks
Even mangled, the names tell you what’s happening:
- serde_json::de::from_trait → JSON parsing entry point
- deserialize_struct → struct field parsing
- deserialize_bool, deserialize_string → primitive types
- drop_in_place, __rust_dealloc → cleanup (ignore these)
2. Field matching follows a pattern
The compiled field matcher checks the name length first, then compares the bytes as integers:
if (rax_3 == 6) // length == 6
{
    if (!((*(r15_1 + 4) ^ 0x7962) | (*r15_1 ^ 0x62617263)))
        // matched!
}

The XOR-and-OR pattern (a ^ expected1) | (b ^ expected2) equals zero only if both match.
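
To make the check concrete, here is a minimal Rust sketch of the same length-then-XOR/OR test for the 6-byte name "crabby". It assumes a little-endian target such as x86-64, and xor_or_matches is a hypothetical helper name, not a symbol from the binary.

```rust
/// Mirrors the decompiled shape: a length check, then a single
/// XOR/OR expression that is zero only when both chunks match.
fn xor_or_matches(name: &[u8]) -> bool {
    if name.len() != 6 {
        return false; // the length check comes first
    }
    // Read bytes 0..4 and 4..6 as little-endian integers (assumed target).
    let lo = u32::from_le_bytes([name[0], name[1], name[2], name[3]]);
    let hi = u16::from_le_bytes([name[4], name[5]]);
    // Each XOR is zero on a match; OR-ing them is zero only if both are.
    ((u32::from(hi) ^ 0x7962) | (lo ^ 0x6261_7263)) == 0
}

fn main() {
    assert!(xor_or_matches(b"crabby"));
    assert!(!xor_or_matches(b"crabbx")); // last byte differs
    assert!(!xor_or_matches(b"crab")); // wrong length
    println!("crabby matches: {}", xor_or_matches(b"crabby"));
}
```

Reading the constants backwards byte by byte gives the expected text: 0x62617263 is "crab" and 0x7962 is "by".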
3. Search for concatenated field names
Rust string literals are not NUL-terminated, so the field names serde stores end up fused with the neighboring error-message text in a strings view. Search for strings like:
"I_crabbycr4bsstruct Crab with 3 elements"
This tells you the struct has fields I_, crabby, cr4bs and is called Crab.
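
Splitting such a fused string is easier once the field lengths are known from the length checks in the matching code. A small sketch, assuming lengths 2, 6 and 5 for this struct; split_fields is a hypothetical helper name:

```rust
/// Split a fused run of field names into pieces of the given byte lengths.
/// Returns None if the lengths overrun the input or cut through a
/// multi-byte character.
fn split_fields<'a>(blob: &'a str, lengths: &[usize]) -> Option<Vec<&'a str>> {
    let mut fields = Vec::with_capacity(lengths.len());
    let mut start = 0;
    for &len in lengths {
        fields.push(blob.get(start..start + len)?);
        start += len;
    }
    Some(fields)
}

fn main() {
    // Assumption: the lengths come from the field matcher's length checks.
    let fields = split_fields("I_crabbycr4bs", &[2, 6, 5]);
    assert_eq!(fields, Some(vec!["I_", "crabby", "cr4bs"]));
    println!("{:?}", fields);
}
```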
4. Ignore the noise
Most of the code is:
- Result/Option checking (-0x8000000000000000 is the Err/None discriminant)
- Memory cleanup (__rust_dealloc, drop_in_place)
- Whitespace skipping (the TEST_BITQ(0x100002600, ...) pattern)
- Panic handling
Focus on the actual comparisons and function calls.
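
Two of those constants are easier to skip once decoded. The sketch below assumes the TEST_BITQ mask is indexed by byte value (my reading of the pattern, not something the decompiler states) and lists which characters have their bit set. chars_in_mask is a hypothetical helper name.

```rust
/// List the characters (byte values 0..64) whose bit is set in the mask.
fn chars_in_mask(mask: u64) -> Vec<char> {
    (0u8..64)
        .filter(|&bit| (mask & (1u64 << bit)) != 0)
        .map(char::from)
        .collect()
}

fn main() {
    // Bits 9, 10, 13 and 32: exactly the four JSON whitespace characters.
    let ws = chars_in_mask(0x1_0000_2600);
    assert_eq!(ws, vec!['\t', '\n', '\r', ' ']);

    // -0x8000000000000000 is i64::MIN, i.e. only the top bit set.
    assert_eq!(i64::MIN as u64, 0x8000_0000_0000_0000);
    println!("{:?}", ws);
}
```

The second value is larger than any capacity a valid String or Vec can have, which is presumably why the compiler can reuse it as the error marker instead of storing a separate tag.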
Identifying serde in the binary
Signs to look for:
- String references: Search for "expected struct", "missing field", "invalid type".
- Function name fragments: Look for serde, deserialize, Visitor, SeqAccess.
- Concatenated field names: Serde error messages contain field lists like "I_crabbycr4bs".
Searching strings for “struct” reveals the struct names:
"expected struct TopLevel"
"expected struct Crab"
"expected struct Crabby"
This tells us the hierarchy. Now we need the field names.
How serde deserialization works
Serde is Rust’s serialization framework. When you write:
#[derive(Deserialize)]
struct Crab {
    I_: bool,
    crabby: Crabby,
    cr4bs: i32,
}

The #[derive(Deserialize)] macro generates a deserialize function that:
- Expects either { (object) or [ (tuple/array format)
- Reads field names as strings
- Matches them against expected field names
- Recursively deserializes nested types
- Returns an error if anything doesn’t match
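The key insight: in the compiled binary, field name matching shows up as integer comparisons on the raw bytes. The generated code matches the field name against string literals, and the optimizer lowers each comparison to a length check followed by integer compares on chunks of the name.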
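
As a rough picture of what gets compiled, the field identification step behaves like a plain string match. This is a simplified sketch, not the macro's real expansion; CrabField and identify_field are hypothetical names, and the Other variant assumes unknown keys are skipped rather than rejected.

```rust
/// Hypothetical stand-in for the field identifier generated by the derive.
#[derive(Debug, PartialEq)]
enum CrabField {
    I,
    Crabby,
    Cr4bs,
    Other, // assumption: unknown keys are ignored
}

/// A string match like this is what the optimizer turns into a
/// length check plus integer comparisons.
fn identify_field(name: &str) -> CrabField {
    match name {
        "I_" => CrabField::I,
        "crabby" => CrabField::Crabby,
        "cr4bs" => CrabField::Cr4bs,
        _ => CrabField::Other,
    }
}

fn main() {
    assert_eq!(identify_field("crabby"), CrabField::Crabby);
    assert_eq!(identify_field("claws"), CrabField::Other);
    println!("{:?}", identify_field("I_"));
}
```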
Finding the entry point
Starting from main, trace the calls:
curly_crab::main::h71b58f7aacf87a44
│
└── serde_json::de::from_trait::h1f3bcad3bd3177ac
    │
    └── deserialize_struct::he3c85fe01abee1f1 ← top-level struct

The from_trait function is just a wrapper. The real work happens in deserialize_struct.
What you’re actually looking for
Here’s a snippet from the real deserialize_struct for the top-level struct. I’ve annotated the important parts:
// Inside deserialize_struct - the actual decompiled mess
// ... skip past the '{' check and 100 lines of setup ...
// THIS IS THE GOLD - field length switch
if (var_148 == 3) // ← field length check
{
    // Compare bytes: (byte[2] ^ 'F') | (bytes[0:2] ^ 0x5443)
    if ((*(r13_1 + 2) ^ 0x46) | (*r13_1 ^ 0x5443))
        goto label_2c1ab; // unknown field
    // Matched "CTF"! Now deserialize the value...
}
else if (var_148 == 4)
{
    if (*r13_1 != 0x62617263) // ← compare as 4-byte int
        goto label_2c1ab;
    // Matched "crab"! Call nested struct deserializer
    _$LT$RF$mut$u20$serde_j...deserialize_struct::hb5c049ded4c5ad6a(&var_158, r15_1);
}
else if (var_148 == 6)
{
    // Two comparisons: 4 bytes + 2 bytes
    if ((*(r13_1 + 4) ^ 0x6c61) | (*r13_1 ^ 0x63736170))
        goto label_2c1ab;
    // Matched "pascal"! Deserialize string
    _$LT$RF$mut$u20$serde_j...deserialize_string::h4c289388cf84ac5d(&var_d8, r15_1);
}

The pattern to look for:
- Length check in an if/switch
- Byte comparisons using XOR: (a ^ expected) | (b ^ expected) (equals 0 if match)
- Call to another deserialize_* function for the value
Mapping the struct hierarchy
Follow the deserialize_struct calls to find nested structs. Each one has the same pattern of length checks and hex comparisons.
Top-level struct
From the code above:
- CTF (len=3, 0x5443 + 0x46) → integer
- crab (len=4, 0x62617263) → nested struct via deserialize_struct::hb5c049ded4c5ad6a
- pascal (len=6, 0x63736170 + 0x6c61) → string
Nested “crab” struct
Following deserialize_struct::hb5c049ded4c5ad6a, same pattern:
if (rax_3 == 2) // "I_"
    if (*r15_1 != 0x5f49) goto unknown;
    // deserialize_bool
else if (rax_3 == 5) // "cr4bs"
    if ((*(r15_1 + 4) ^ 0x73) | (*r15_1 ^ 0x62347263)) goto unknown;
    // deserialize integer
else if (rax_3 == 6) // "crabby"
    if ((*(r15_1 + 4) ^ 0x7962) | (*r15_1 ^ 0x62617263)) goto unknown;
    // deserialize_struct::h39718c3ed97ba090 (another nested struct!)
Fields: I_ (bool), cr4bs (int), crabby (struct)
Inner “crabby” struct
Following the next deserialize_struct:
- l0v3_ (len=5, 0x7633306c + 0x5f) → array
- r3vv1ng_ (len=8, 0x5f676e3176763372) → integer
Visual hierarchy
Top-level
├── CTF: integer
├── crab: struct
│   ├── I_: boolean
│   ├── crabby: struct
│   │   ├── r3vv1ng_: integer
│   │   └── l0v3_: array
│   └── cr4bs: integer
└── pascal: string
Constructing valid JSON
Based on the schema:
{
  "CTF": 1,
  "crab": {
    "I_": true,
    "crabby": {
      "r3vv1ng_": 1,
      "l0v3_": []
    },
    "cr4bs": 1
  },
  "pascal": "test"
}

Testing:
$ echo '{"CTF":1,"crab":{"I_":true,"crabby":{"r3vv1ng_":1,"l0v3_":[]},"cr4bs":1},"pascal":"x"}' | ./curly-crab
Give me a JSONy flag!
🦀
Extracting the flag
Some people submitted massive JSON documents that happened to work because they included the required fields somewhere. The key is understanding what’s actually being validated: just the field names and types, nothing more.
The field names spell out the flag: I_, l0v3_, r3vv1ng_, cr4bs.
Flag
pascalCTF{I_l0v3_r3vv1ng_cr4bs}