A 3 KB mystery binary, a timer built to waste a trillion CPU cycles, and a PRNG cycle 34 steps long that collapses everything down to one modular lookup. This is how you read a flag off a 1970s virtual computer without waiting until the heat death of the universe.
You get a file called main.bin โ 3,282 bytes, no extension, no readme, just a blob. The flavour text says: "I found this old cassette tape in my drawer XD". That sentence is doing a lot of work.
The binary is a CHIP-8 ROM โ a program for an interpreted virtual machine from the 1970s, historically loaded onto hobbyist computers from cassette tapes. CHIP-8 has a tiny instruction set (~35 opcodes), a 64ร32-pixel display, 16 eight-bit registers (V0โVF, with VF reserved as the carry/borrow flag register), and programs that load at memory address 0x200. It's practically a toy. This one hides a flag inside it.
The ROM draws 32 characters to screen, one at a time. Each character's sprite data is XOR-encrypted using two PRNG registers. Between each draw, the PRNG advances by an exponentially increasing number of steps โ eventually reaching ~1.1 trillion steps per character. The challenge title is also the hint: this is about time, specifically about not wasting all of it.
The PRNG state is two 8-bit registers (VA and VB). That's 256 ร 256 = 65,536 possible states. By the pigeonhole principle, it must cycle within that many steps. Find the cycle length, and a trillion-step timer becomes a single modular division. No simulation needed.
file(1) (the Unix file-type command) is useless here. Read the bytes yourself.The first instinct is to run file (the Unix file-type command). It says data. Not helpful. Run xxd instead and look at the first two bytes:
$ xxd main.bin | head -1
00000000: 00e0 1280 a6a6 a6a6 a6a6 a6a6 a6a6 a6a6 ................
00 E0 is the CHIP-8 opcode CLS โ clear the screen. The bytes 12 80 immediately after it form a jump instruction to 0x280. The flood of 0xA6 bytes starting at offset 4 is sprite data in memory, never executed as code.
CHIP-8 is an interpreted language designed for 1970s hobbyist machines. Think of it as the simplest possible virtual machine: one 64ร32 pixel screen, 16 eight-bit registers (V0โVF, with VF reserved as the flag register), a small stack, and about 35 opcodes. Programs fit in a few kilobytes and were distributed on โ you guessed it โ cassette tape.
Write a minimal Python CHIP-8 emulator โ just enough to decode opcodes and log draw calls. Set the delay timer to always return zero so you don't wait forever. Running it confirms the ROM draws characters:
Cycle 163 โ 'T'
Cycle 517 โ 'TH'
Cycle 1770 โ 'THE'
Cycle 6675 โ 'THEM'
Cycle 26092 โ 'THEM?'
...stalls
Good โ the ROM definitely draws the flag. Bad โ zeroing the timer breaks the decryption. The PRNG advances during the delay loop; each delay iteration triggers one PRNG step. Skip the delay, and the PRNG state diverges from what it would be during actual execution, causing garbage output from character 6 onwards.
Load the binary into a CHIP-8 disassembler (or write one โ the instruction set is small enough that an afternoon covers it). The structure falls out quickly:
The sprite data is split across 8 subtable banks in memory:
subtable_bases = [
0x400, 0x460, 0x4C0, 0x520,
0x600, 0x660, 0x6C0, 0x720
]
Each of the 32 character draws follows the same pattern. The subroutine at 0x2C0 maintains two 8-bit registers, VA and VB, which together form the PRNG state. Before each draw, the PRNG has been stepped some number of times. Then:
plaintext_byte = memory[subtable_bases[VA & 7] + char_offset] ^ VA ^ VB
VA & 7 picks which of the 8 subtables to use (bottom 3 bits of VA). The encrypted sprite byte is XOR'd with both VA and VB to recover the plaintext. XOR is its own inverse โ same key decrypts as encrypts โ so if you know VA and VB at the moment of each draw, you can recover every character.
XOR is a bitwise operation where the same key both encrypts and decrypts. If ciphertext = plaintext XOR key, then plaintext = ciphertext XOR key. The "key" here is the pair (VA, VB). Predict those two bytes at each draw, and the entire cipher falls apart immediately.
The PRNG state is two 8-bit values. State space: 256 ร 256 = 65,536 possible pairs. The PRNG update function at 0x2C0 is deterministic. Therefore, after at most 65,536 steps, it must revisit a state it's already seen โ and from that point it loops forever. This is not an exploit; it's a mathematical certainty (the pigeonhole principle).
The delay between each character draw is controlled by a timer. The timer values scale exponentially as the ROM progresses:
| Character | PRNG steps between draws | If you simulate it at 1B iters/sec |
|---|---|---|
| 1 | 4 | instant |
| 2 | 16 | instant |
| 3 | 64 | instant |
| 10 | 1,048,576 | ~1ms |
| 15 | 1,073,741,824 | ~1 second |
| 16โ32 | ~1,100,000,000,000 | ~18 minutes each |
Characters 16 through 32 each need roughly 1.1 trillion PRNG steps. Even in optimized C, that's 18 minutes per character. 17 late characters ร 18 minutes = over 5 hours of runtime, assuming zero bugs and no off-by-one errors anywhere in your timing logic. This path leads nowhere useful.
Walk the PRNG forward from its initial state, recording every (VA, VB) pair you've seen. The moment you hit a repeat, you've found the cycle. With only 65,536 possible states, this completes in under a millisecond:
def step_prng(VA, VB):
new_VA = (VA + VB) & 0xFF
new_VB = (VA ^ new_VA) & 0xFF
return new_VA, new_VB
VA, VB = 0, 0 # initial state from ROM
precomputed = {} # maps step โ (VA, VB) for steps 0..361
seen = {}
step = 0
while (VA, VB) not in seen:
seen[(VA, VB)] = step
precomputed[step] = (VA, VB) # VA, VB are current state; step counts forward
VA, VB = step_prng(VA, VB)
step += 1
cycle_start = seen[(VA, VB)] # 328
cycle_length = step - cycle_start # 34
Cycle length: 34. Starts at step 328. Once you're past step 328, the PRNG repeats every 34 steps, forever. This means the state at any trillion-step position can be computed from the ROM's initial PRNG state:
def state_at(n):
if n < 328:
return precomputed[n]
idx = 328 + (n - 328) % 34
return precomputed[idx]
Now applying this to the late characters:
timer_per_late_char = 255 * 4_294_967_295 # = 1,095,216,660,225
advance = 1_095_216_660_225 % 34 # = 17
Every late character advances the PRNG exactly 17 positions in the cycle. One division. The entire trillion-step delay reduced to an integer lookup.
flag = []
cumulative = 0
for i in range(32):
VA, VB = state_at(cumulative)
subtable = subtable_bases[VA & 7]
plaintext = memory[subtable + i] ^ VA ^ VB
flag.append(chr(plaintext))
cumulative += timer_steps_for_char(i)
print("".join(flag))
The ROM draws across four rows of the 64ร32 display:
Row 1: T H E M ? ! C T F {
Row 2: 0 L D _ T 4 P 3 _ N
Row 3: 3 V 3 R _ D 1 E 5 K
Row 4: 7 }
The flavour text, the title, the K7 suffix โ all of it was pointing at CHIP-8 and cassette culture the entire time. Once you know K7 is French for cassette, the challenge design clicks: old tape, CHIP-8, loaded from tape. Even the flag text itself is the joke.
file(1) doesn't know CHIP-8. The first two bytes told the whole story. Always xxd before anything else.
16-bit state โ at most 65,536 steps to find the cycle. Check for cycles before simulating anything in the billions.
When delays go 4, 16, 64, 256... the timer is the puzzle. The crypto lives in the delay, not around it.
Cassette โ CHIP-8 โ K7. CTF flavour text is never decorative. It's compressed hints for people who know where to look.
That single line of math replaced 18 minutes of runtime per character. Math beats hardware every time.
XOR encrypts and decrypts with the same key. PRNG state = key. Once you can predict the state, the cipher is transparent.
A trillion-step timer defeated by a 34-step cycle and one % 34. The old tape never died โ it just needed someone to do the modular arithmetic. THEM?!CTF{0LD_T4P3_N3V3R_D1E5K7}