Reverse Engineering Process & Findings

How we extracted game mechanics, data structures, and algorithms from a 1995 DOS binary.

1. The Target

Property          Value
Game              Scorched Earth, "The Mother of All Games"
Version           1.50 (June 4, 1995)
Author            Wendell Hicken, Copyright 1991–1995
Compiler          Borland C++ 1993 (Large memory model)
Graphics Library  Fastgraph V4.02
Platform          MS-DOS, 16-bit real mode
EXE Size          415,456 bytes (MZ format)
Header            27,136 bytes (6,136 relocations)
Code+Data         388,320 bytes
Data Segment      0x4F38 (file base 0x055D80)
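
The size and offset figures in the table can be cross-checked with a few lines of arithmetic. This is a minimal sketch: the helper name file_offset is ours, and it assumes segment values are paragraph numbers counted from the first byte after the MZ header.

```python
HEADER_SIZE = 27_136      # MZ header bytes
EXE_SIZE = 415_456        # total file size in bytes

def file_offset(segment: int, offset: int = 0) -> int:
    """Map a load-image segment:offset pair to a file offset.

    Assumes the segment is relative to the first byte after the MZ header.
    """
    return HEADER_SIZE + segment * 16 + offset

print(hex(HEADER_SIZE))            # 0x6a00
print(EXE_SIZE - HEADER_SIZE)      # 388320 bytes of code and data
print(hex(file_offset(0x4F38)))    # 0x55d80, the data segment file base
```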

Borland's Large memory model means each .cpp file gets its own code segment. This turned out to be a blessing — we could identify source files and map the binary structure by finding debug assertion strings.

2. The Toolchain

radare2 (r2)

Primary disassembler. Loaded with r2 -a x86 -b 16 -s 0x6a00 earth/SCORCH.EXE to skip the MZ header. Good for string cross-references and data exploration, but had a critical limitation with FPU instructions.

The FPU Problem

Borland C++ 1993 does not emit native x87 FPU opcodes. Instead, every single floating-point operation is encoded as an INT 34h–3Dh software interrupt. Radare2 (and IDA, Ghidra, etc.) see these as interrupt calls, not math operations. Any region with physics or trigonometry is completely unreadable.

The encoding scheme:

Interrupt  Decodes To   Purpose
INT 34h    D8 xx        fadd/fmul dword
INT 35h    D9 xx        fld/fst dword
INT 36h    DA xx        fiadd/fimul dword int
INT 37h    DB xx        fild dword
INT 38h    DC xx        fsub/fcomp qword
INT 39h    DD xx        fld/fst qword
INT 3Ah    DE xx        fiadd/fimul word int
INT 3Bh    DF xx        fild/fistp word
INT 3Ch    ES: + DD/DC  Player sub-struct FPU ops
INT 3Dh    9B           fwait

Custom FPU Decoder

We wrote fpu_decode.py to transform the raw binary into readable FPU mnemonics with constant annotations. This was the key breakthrough that unlocked the physics, damage, and AI systems:

python3 disasm/fpu_decode.py earth/SCORCH.EXE 0x24F01 0x2610F -c -f

# Output (example from AI solver):
# 0x24FCF: fld   qword [DS:0x31F4]    ; scaled_gravity
# 0x24FD5: fmul  qword [DS:0x31FC]    ; gravity * multiplier
# 0x24FDB: fstp  qword [DS:0x31F4]    ; store result
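
The core of such a decoder is a byte-level rewrite. The sketch below is not fpu_decode.py itself. It assumes the conventional Borland emulator encoding, in which INT 34h through INT 3Bh stand in for the ESC opcodes D8 through DF in ascending order and INT 3Dh is a lone fwait. The function name is hypothetical, and INT 3Ch (segment override) is deliberately left untouched.

```python
def patch_emulated_fpu(code: bytes) -> bytes:
    """Rewrite emulator interrupts into native x87 bytes of the same length."""
    out = bytearray(code)
    i = 0
    while i + 1 < len(out):
        if out[i] == 0xCD and 0x34 <= out[i + 1] <= 0x3B:
            out[i] = 0x9B                             # fwait
            out[i + 1] = 0xD8 + (out[i + 1] - 0x34)   # ESC opcode D8..DF
            i += 2
        elif out[i] == 0xCD and out[i + 1] == 0x3D:
            out[i], out[i + 1] = 0x90, 0x9B           # nop; fwait
            i += 2
        else:
            i += 1
    return bytes(out)

# CD 39 06 F4 31 is the emulated form of: fld qword [0x31F4]
encoded = bytes([0xCD, 0x39, 0x06, 0xF4, 0x31])
print(patch_emulated_fpu(encoded).hex(" "))   # 9b dd 06 f4 31
```

Because the patched bytes have the same length as the originals, the output can be fed to an ordinary disassembler without shifting any addresses. A linear scan like this can misfire on a CD byte that is really an operand or data, so in practice it should be limited to known code ranges.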

The INT 3Ch Bug

INT 3Ch encodes an ES: segment override prefix followed by a DD or DC opcode. Our decoder initially mapped it to D8, producing wrong mnemonics for any FPU operations on the player sub-struct (accessed via ES:BX). This made the Mag Deflector deflection code at 0x21A80 particularly difficult to trace — the exact field accesses on the 202-byte tank struct remained ambiguous until we cross-referenced with runtime behavior.

3. Finding Source Files

Borland C++ embeds assertion strings containing the source filename. By searching for .cpp in the binary data segment, we mapped 11 source files to their code segments:

Source File   Code Segment  File Base  Purpose
comments.cpp  0x117B        0x17F50    Tank talking / speech bubbles
equip.cpp     0x16BC        0x1D560    Equipment / weapon shop
extras.cpp    0x1895        0x20EA0    Explosions / damage / projectiles
icons.cpp     0x1F7F+       0x263F0    Tank and icon rendering
play.cpp      0x28B9        0x2F830    Main game loop / state machine
player.cpp    0x2B3B+       0x31FB0    Player / tank management
ranges.cpp    0x2CBF        0x33690    Terrain / mountain generation
score.cpp     0x30B2        0x37520    Scoring system
shark.cpp     0x3167        0x38070    AI trajectory solver
shields.cpp   0x31D8        0x38780    Shield system
team.cpp      0x3A56+       0x40F60    Team management

This gave us a roadmap of the entire binary. Instead of navigating 388KB of raw assembly, we could target specific files for specific mechanics.
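
The string search itself is simple. A minimal sketch, assuming DOS 8.3 filenames; the function name and the sample bytes are ours and are not taken from SCORCH.EXE:

```python
import re

# Up to eight filename characters followed by ".cpp" (DOS 8.3 names).
CPP_NAME = re.compile(rb"[A-Za-z0-9_]{1,8}\.cpp", re.IGNORECASE)

def find_source_names(image: bytes) -> list[tuple[int, str]]:
    """Return (offset, filename) for every .cpp name embedded in the image."""
    return [(m.start(), m.group().decode("ascii")) for m in CPP_NAME.finditer(image)]

# Synthetic stand-in for a slice of the data segment.
sample = b"\x00\x00Assertion failed: equip.cpp\x00junk\x00shark.cpp\x00"
for offset, name in find_source_names(sample):
    print(hex(offset), name)
```

Each hit is only a string location. Tying it to a code segment still requires following the cross-references from the assertion call sites.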

4. The Weapon Table

Structure

57 weapons stored as a struct array at file offset 0x056F76, each 52 bytes:

Offset  Size  Field
+00     4     Name pointer (far ptr to name string)
+04     2     Price (uint16)
+06     2     Bundle quantity (uint16)
+08     2     Arms level required (uint16, 0-4)
+0A     2     Behavior type code (uint16)
+0C     2     Behavior handler segment (uint16)
+0E     2     Blast radius / param (int16, signed)
+10-33  36    Runtime fields (zeroed)
Total:  52 bytes
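
The layout maps directly onto a struct format string. A minimal sketch: the field names and the sample record are ours, and the far pointer is split into its offset and segment words.

```python
import struct
from typing import NamedTuple

# Little-endian layout from the field table; "36x" skips the runtime fields.
WEAPON_FORMAT = "<HHHHHHHh36x"
WEAPON_SIZE = struct.calcsize(WEAPON_FORMAT)   # 52

class Weapon(NamedTuple):
    name_offset: int       # +00 low word of the name far pointer
    name_segment: int      # +02 high word of the name far pointer
    price: int             # +04
    bundle: int            # +06
    arms_level: int        # +08
    behavior: int          # +0A
    handler_segment: int   # +0C
    blast_radius: int      # +0E, signed

def read_weapon(table: bytes, index: int) -> Weapon:
    """Decode the record at the given index of a raw weapon table."""
    return Weapon(*struct.unpack_from(WEAPON_FORMAT, table, index * WEAPON_SIZE))

# Synthetic record with made-up values, placed at index 1 of a two-entry table.
record = struct.pack(WEAPON_FORMAT, 0x0010, 0x4F38, 400, 10, 0, 0x0123, 0x1895, -5)
table = bytes(WEAPON_SIZE) + record
print(read_weapon(table, 1))
```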

The Linker Bug

Items 50–56 (Force Shield, Heavy Shield, Super Mag, Patriot Missiles, Auto Defense, Fuel Tank, Contact Trigger) have corrupted data. The Borland linker placed the equip.cpp debug assertion string at file offset 0x05793A — exactly where item 50's struct begins (0x056F76 + 48×52 = 0x05793A). The game reads prices from struct fields with no fallback, so these items have garbage prices at runtime. This is a confirmed linker-era bug in v1.50. We recovered intended prices from the official printed manual.

Weapon Dispatch

Weapons fire through an indirect far call: lcall [weapon_idx * 52 + DS:0x1200]. The struct's +0A field is the code offset within the handler segment, and +0C is the segment paragraph. Together they form a far function pointer dispatched at file 0x1C6C8.

Funky Bomb has BhvType=0x0000 (same as accessories) but a non-zero handler segment 0x1DCE — the entry point IS offset 0, the beginning of the segment. This caused initial confusion since NULL usually means "no handler."
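
The dispatch address lines up with the struct layout, which a short calculation confirms. In this sketch the helper names are ours, and the handler file offset assumes segment values are relative to the first byte after the MZ header.

```python
HEADER_SIZE = 0x6A00           # MZ header size
DATA_SEGMENT_FILE = 0x055D80   # file offset of DS:0000
TABLE_FILE = 0x056F76          # file offset of weapon 0
ENTRY_SIZE = 52

def dispatch_slot(weapon_idx: int) -> int:
    """DS-relative address read by lcall [weapon_idx * 52 + DS:0x1200]."""
    return weapon_idx * ENTRY_SIZE + 0x1200

def handler_file_offset(segment: int, offset: int) -> int:
    """File offset of a handler, assuming load-image-relative segments."""
    return HEADER_SIZE + segment * 16 + offset

table_ds = TABLE_FILE - DATA_SEGMENT_FILE
print(hex(table_ds))                         # 0x11f6, table base within DS
print(hex(dispatch_slot(0) - table_ds))      # 0xa, the +0A field of entry 0
print(hex(handler_file_offset(0x1DCE, 0)))   # 0x246e0, Funky Bomb under this assumption
```

The slot for weapon 0 lands exactly on the +0A field, so the four bytes at +0A and +0C are read as one offset:segment pair.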

5. The AI Solver (shark.cpp)

The AI does NOT use a closed-form ballistic equation. It uses pixel-level ray marching — stepping one pixel at a time along a normalized direction vector, reading screen pixels via fg_getpixel to detect terrain and players. This is why the AI is called "shark" — it swims through the screen pixel by pixel.

Algorithm

  1. Target selection — Find closest alive enemy (stride 0xCA struct iteration)
  2. Direction — Chebyshev-normalize vector from self to target
  3. Ray march — Step one pixel at a time, reading fg_getpixel at each position:
    • Pixel ≥ 105 → terrain (try to navigate around)
    • Pixel 1–79 → player hit (floor(pixel/8) = player index). If it's the target, success!
    • Otherwise → sky, keep stepping
  4. Gravity sweep — Decrement scaled gravity by 1.0 each outer pass, scanning progressively flatter arcs until one reaches the target
  5. Power calculation — Standard ballistic formula: power² = ref_dist × dx² / (cos² × sin_component)
  6. Wind correction — Proportional to gravity / dist^(1/4)
  7. Noise injection — 2–5 sinusoidal harmonics with rejection-sampled frequencies, summed per-column for smooth aim wobble
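
Steps 2 and 3 can be sketched as follows. This is an illustration under stated assumptions, not the recovered routine: get_pixel is a hypothetical stand-in for fg_getpixel, terrain simply ends the march instead of triggering the navigate-around logic, and pixels belonging to non-target players are stepped over.

```python
def classify_pixel(value: int):
    """Apply the color ranges from step 3 to one pixel value."""
    if value >= 105:
        return ("terrain", None)
    if 1 <= value <= 79:
        return ("player", value // 8)
    return ("sky", None)

def chebyshev_normalize(dx: float, dy: float):
    """Scale a vector so that its larger component has magnitude 1."""
    scale = max(abs(dx), abs(dy))
    if scale == 0:
        return 0.0, 0.0
    return dx / scale, dy / scale

def ray_march(get_pixel, start, target, target_index, max_steps=1000):
    """Step pixel by pixel from start toward target; report what is found."""
    step_x, step_y = chebyshev_normalize(target[0] - start[0], target[1] - start[1])
    x, y = float(start[0]), float(start[1])
    for _ in range(max_steps):
        x += step_x
        y += step_y
        kind, index = classify_pixel(get_pixel(round(x), round(y)))
        if kind == "terrain":
            return "blocked"
        if kind == "player" and index == target_index:
            return "hit"
    return "miss"

pixels = {(10, 5): 17}                      # color 17 belongs to player index 2
get_pixel = lambda x, y: pixels.get((x, y), 0)
print(ray_march(get_pixel, (0, 0), (10, 5), target_index=2))   # hit
```

Chebyshev normalization guarantees that the ray advances exactly one pixel along its dominant axis per step, so no pixel column (or row) is skipped.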

The Corrupted Sentient

The AI type table at DS:0x02E2 has 10 entries. Types 7 (Cyborg) and 8 (Unknown) are randomized to types 0–5 at runtime. But type 9 ("Sentient") — which appears in the name string table at 0x058480 — is never randomized. Its vtable pointer reads from the ASCII string "Some dumb tank: %s\n" at DS:0x0300, interpreting characters as far pointers. Selecting Sentient AI in the original game would crash DOS with a wild jump to address 0x2062:0x6D75.
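
The crash address can be reproduced from the string alone. The sketch assumes each table entry is a 4-byte offset:segment far pointer, which is the entry size that makes type 9 land inside the string; the helper name is ours.

```python
import struct

TABLE_BASE = 0x02E2     # AI type table, DS-relative
STRING_BASE = 0x0300    # "Some dumb tank: %s\n", DS-relative
ENTRY_SIZE = 4          # assumption: one 16:16 far pointer per entry
TEXT = b"Some dumb tank: %s\n"

def entry_address(ai_type: int) -> int:
    """DS-relative address of the table entry for an AI type."""
    return TABLE_BASE + ai_type * ENTRY_SIZE

skip = entry_address(9) - STRING_BASE               # bytes into the string
offset, segment = struct.unpack_from("<HH", TEXT, skip)
print(hex(entry_address(9)), TEXT[skip:skip + 4])   # 0x306 b'umb '
print(f"{segment:04X}:{offset:04X}")                # 2062:6D75
```

The four characters "umb " read as a little-endian far pointer give exactly the wild jump target quoted above.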

6. Physics Discoveries

Multiplicative Viscosity

Air viscosity is NOT a drag force subtracted from velocity. It's a multiplicative damping factor applied every step: velocity *= (1.0 - config/10000). At max setting (20), each step retains 99.8% of velocity.
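
A short sketch of the damping rule, with hypothetical function names:

```python
def damping_factor(viscosity_setting: int) -> float:
    """Fraction of velocity kept after one physics step."""
    return 1.0 - viscosity_setting / 10000

def speed_after(speed: float, viscosity_setting: int, steps: int) -> float:
    """Speed remaining after the factor is applied once per step."""
    return speed * damping_factor(viscosity_setting) ** steps

print(f"{damping_factor(20):.3f}")         # 0.998
print(f"{speed_after(100.0, 20, 500):.1f}")
```

Because the factor compounds, the loss is exponential in the number of steps and proportional to the current speed, unlike a constant drag term.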

2D Velocity Rotation Damage

The damage system does not use simple distance falloff. It performs a 2D rotation of the projectile velocity vector. The angle between projectile direction and explosion-to-player direction is doubled, and the velocity is rotated by this amount. Damage = rotated speed magnitude / 100. This creates directional damage weighting — a shot that passes close to a tank deals more damage than one stopped dead next to it.

Adaptive Timestep

The physics dt is NOT a fixed constant. At startup, a MIPS benchmark (get_mips_count() at file 0x20F63) calibrates the timestep to the CPU speed: dt = 1 / (50 × FIRE_DELAY × mips / (projectiles × 100)). The fallback value of 0.02 was used in our reimplementation.
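
The formula translates directly into code. In this sketch the parameter names are ours, and the conditions that trigger the 0.02 fallback (a non-positive benchmark result or projectile count) are an assumption, since the original guard was not recovered here.

```python
FALLBACK_DT = 0.02

def physics_dt(fire_delay: float, mips: float, projectiles: int) -> float:
    """dt = 1 / (50 * FIRE_DELAY * mips / (projectiles * 100))."""
    if projectiles <= 0:
        return FALLBACK_DT            # assumed guard, not from the binary
    rate = 50 * fire_delay * mips / (projectiles * 100)
    return 1.0 / rate if rate > 0 else FALLBACK_DT

print(physics_dt(fire_delay=1, mips=100, projectiles=1))
print(physics_dt(fire_delay=1, mips=200, projectiles=1))
```

A faster CPU (higher mips) yields a smaller dt, and more simultaneous projectiles yield a larger one, so the on-screen speed stays roughly constant.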

PI/180 Constant

A 0.0174532930 value at DS:0x1D08 confirmed degree-to-radian conversion for trigonometry — hidden behind the INT 34h–3Dh emulation layer. This was found by scanning the data segment for known float patterns.
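
A pattern scan of this kind can be sketched as below. The function name and tolerance are ours; the scan tries every byte offset as both a 32-bit and a 64-bit little-endian float and keeps the offsets that decode close to the target.

```python
import math
import struct

def find_float_constant(data: bytes, target: float, rel_tol: float = 1e-6):
    """Return (offset, width) pairs whose bytes decode to roughly target."""
    hits = []
    for fmt, label in (("<f", "float32"), ("<d", "float64")):
        size = struct.calcsize(fmt)
        for offset in range(len(data) - size + 1):
            (value,) = struct.unpack_from(fmt, data, offset)
            if math.isclose(value, target, rel_tol=rel_tol):
                hits.append((offset, label))
    return hits

# Synthetic buffer with pi/180 stored as a float32 at offset 3.
blob = b"\xAA" * 3 + struct.pack("<f", math.pi / 180) + bytes(8)
print(find_float_constant(blob, math.pi / 180))
```

Matching with a tolerance instead of an exact byte pattern also catches constants that the original source spelled with fewer digits.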

Wind Generation

Wind uses a center-biased distribution with cascading random doublings:

wind = random(max/2) - max/4     // centered
if (random(100) < 20) wind *= 2  // 20% chance double
if (random(100) < 40) wind *= 2  // 40% chance double (independent)
// Multiplier: 48% x1, 44% x2 (12% first roll only + 32% second roll only), 8% x4
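
A runnable version of the pseudocode, together with the exact odds of each multiplier. It assumes random(n) returns an integer from 0 to n-1, as Borland's random() does; the function names are ours. Note that the two single-doubling branches (first roll only, second roll only) both give a x2 result.

```python
import random
from fractions import Fraction
from itertools import product

def generate_wind(max_wind: int, rng=random) -> int:
    """Centered base value followed by two independent doubling rolls."""
    wind = rng.randrange(max_wind // 2) - max_wind // 4
    if rng.randrange(100) < 20:
        wind *= 2
    if rng.randrange(100) < 40:
        wind *= 2
    return wind

def multiplier_distribution() -> dict:
    """Exact probability of each total multiplier (1, 2 or 4)."""
    p_first, p_second = Fraction(20, 100), Fraction(40, 100)
    dist = {}
    for first, second in product((False, True), repeat=2):
        prob = (p_first if first else 1 - p_first) * (p_second if second else 1 - p_second)
        mult = (2 if first else 1) * (2 if second else 1)
        dist[mult] = dist.get(mult, 0) + prob
    return dist

for mult, prob in sorted(multiplier_distribution().items()):
    print(f"x{mult}: {float(prob):.0%}")
```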

7. Data Extraction Highlights

War Quotes (with preserved typos)

Fifteen war quotes, displayed between rounds, were extracted from 0x05B580–0x05BC5E with their original typos preserved.

Cheat Codes

Activation            Code       Effect
SET ASGARD=frondheim  frondheim  Monochrome debug overlay; renders to MDA/Hercules VRAM at B000:0000. Wendell Hicken's second-monitor debug console.
SET ASGARD=ragnarok   ragnarok   Debug log to file; opens scorch.dbg for text output
SCORCH.EXE mayhem     mayhem     All weapons, 99 of each
SCORCH.EXE nofloat    nofloat    Disable FPU physics, for machines without an 8087 coprocessor

ASGARD/frondheim/ragnarok are debug modes, not gameplay cheats. Only mayhem affects gameplay. The handler at file 0x02A42D uses getenv() and stricmp() for matching.
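
The environment-variable check amounts to a case-insensitive lookup. A minimal Python equivalent of the getenv() and stricmp() pattern; the mode labels and function name are hypothetical:

```python
import os

# Hypothetical labels for the two debug modes keyed by the ASGARD value.
DEBUG_MODES = {"frondheim": "mono_overlay", "ragnarok": "file_log"}

def asgard_mode(environ=os.environ):
    """Return the debug mode selected by ASGARD, or None if unset or unknown."""
    value = environ.get("ASGARD", "")
    return DEBUG_MODES.get(value.lower())   # lower() mirrors stricmp's case folding

print(asgard_mode({"ASGARD": "Ragnarok"}))   # file_log
```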

Talk Files

54 attack phrases (TALK1.CFG) and 61 death phrases (TALK2.CFG), loaded from paths in ATTACK_COMMENTS and DIE_COMMENTS config variables. Displayed as speech bubbles above tanks via the comments.cpp module.

Shield Configuration Table

Type             Energy  Radius  Color   Behavior
Shield           55 HP   16 px   Yellow  Basic absorption
Warp Shield      100 HP  15 px   White   Random teleport on hit
Teleport Shield  100 HP  15 px   Purple  Teleport when triggered
Force Shield     150 HP  16 px   White   Absorption + deflection
Heavy Shield     200 HP  16 px   Orange  Max absorption + deflection

Flicker Shield has no config table entry — uses probabilistic on/off cycling. Damage absorption is flat 1:1 HP. Shield color fades proportionally: color = energy × configColor / maxEnergy.
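
The absorption and fade rules can be sketched in a few lines. Two details are assumptions on our part: leftover damage passes through once the shield is exhausted, and the color formula uses integer division.

```python
def absorb(energy: int, damage: int):
    """Flat 1:1 absorption. Returns (remaining energy, damage passed through)."""
    absorbed = min(energy, damage)
    return energy - absorbed, damage - absorbed

def faded_color(energy: int, config_color: int, max_energy: int) -> int:
    """color = energy * configColor / maxEnergy, assuming integer division."""
    return energy * config_color // max_energy

print(absorb(55, 20))               # (35, 0)
print(absorb(55, 80))               # (0, 25)
print(faded_color(50, 15, 100))     # 7
```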

8. Challenges & Lessons

The FPU Encoding Problem

Every standard disassembler fails on this binary. Without the custom FPU decoder, the physics engine, damage system, AI solver, and terrain generator are walls of INT 34h / INT 38h calls. The decoder transformed ~2,000 encoded instructions across the analysis.

Pixel Collision = No Collision Mesh

The game has no abstract collision geometry. A crater is just erased pixels. A tank is just colored pixels. Collision detection reads getpixel() and checks color ranges. This means the framebuffer is the physics world — a design that's elegant for a 320×200 game but extremely unusual by modern standards.

Two-Level Player Records

Players use a two-struct architecture: a compact 108-byte record (stride 0x6C) for hot-loop iteration, and a full 202-byte sub-struct (stride 0xCA) for complete tank state. The sub-struct contains turret angle, shield energy, AI targeting data, and linked-list nodes. Cross-referencing between these two levels through far pointer indirection was a persistent challenge.

The Mag Deflector Erroneous ×30

The in-flight Mag Deflector deflection scales as (direction / normDist) × dt with no additional multiplier. Our initial web implementation had an erroneous ×30 factor that caused projectiles to reverse direction when multiple players had Mag Deflectors (e.g., after the MAYHEM cheat). This was only discovered through playtesting and traced back to a misread of the normalized distance calculation at file 0x21B68.