Skip to content

Commit 8c7312d

Browse files
skeptomaiclaude
andcommitted
fix: Restore Z-Machine opcode validation for Zork I compatibility
Fixes critical regression where interpreter incorrectly rejected opcode 0x15 (sub instruction), causing Zork I and other commercial games to fail with "Invalid Long form opcode 0x15" errors. Changes: - Updated src/instruction.rs validation from 0x01-0x14 to 0x01-0x1C - Added comprehensive opcode documentation (0x14=add, 0x15=sub, etc.) - Updated unit tests to reflect correct validation ranges - Documented opcode validation architecture in ARCHITECTURE.md - Confirmed fix with successful Zork I testing Root cause: Commit d7dec5c implemented overly restrictive validation that rejected fundamental Z-Machine arithmetic operations (sub, mul, div, mod). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 4427e2c commit 8c7312d

File tree

3 files changed

+84
-12
lines changed

3 files changed

+84
-12
lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,3 +11,4 @@ debug.log
1111

1212
# Claude Code settings
1313
.claude/settings.local.json
14+
pkg/

docs/ARCHITECTURE.md

Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -386,6 +386,56 @@ Each opcode module follows a consistent pattern:
386386
4. **Code Organization**: Related operations are grouped together logically
387387
5. **Performance**: Direct method calls with minimal routing overhead
388388

389+
### Opcode Validation
390+
391+
The instruction decoder in `src/instruction.rs` performs critical validation of Z-Machine opcodes to ensure only valid instructions are processed. This validation prevents interpreter crashes from malformed or corrupted game files.
392+
393+
#### Valid Opcode Ranges
394+
395+
Z-Machine opcodes are validated based on their instruction form:
396+
397+
```rust
398+
// Long form (2OP) instructions: opcodes 0x01-0x1C
399+
if opcode == 0x00 || opcode > 0x1C {
400+
return Err(format!(
401+
"Invalid Long form opcode 0x{:02x} at address {addr:04x} (valid range: 0x01-0x1C)",
402+
opcode
403+
));
404+
}
405+
```
406+
407+
**Critical Opcode Mappings:**
408+
- `0x14` - add (addition)
409+
- `0x15` - sub (subtraction)
410+
- `0x16` - mul (multiplication)
411+
- `0x17` - div (division)
412+
- `0x18` - mod (modulo)
413+
- `0x19` - call_2s (call function, store result)
414+
- `0x1A` - call_2n (call function, no result)
415+
- `0x1B` - set_colour (set foreground/background colors)
416+
- `0x1C` - throw (exception handling)
417+
418+
#### Historical Issue: Opcode 0x15 Regression
419+
420+
A critical regression was introduced in commit `d7dec5c` that incorrectly rejected opcode 0x15 (sub instruction), causing Zork I and other commercial games to fail with "Invalid Long form opcode 0x15" errors. The issue was caused by implementing overly restrictive validation that assumed opcodes only went up to 0x14 (add).
421+
422+
**Root Cause:** The Z-Machine specification defines fundamental arithmetic operations in opcodes 0x14-0x18 (add, sub, mul, div, mod), but the validation code incorrectly limited the range to 0x01-0x14.
423+
424+
**Resolution:** Updated validation to accept the correct range 0x01-0x1C, ensuring all fundamental Z-Machine operations are properly supported.
425+
426+
**Testing:** The fix was validated by successfully running Zork I (`/Users/cb/Projects/infocom-testing-old/infocom/resources/test/zork1/DATA/ZORK1.DAT`), confirming compatibility with commercial Infocom games.
427+
428+
#### Validation Architecture
429+
430+
The decoder performs form-specific validation:
431+
432+
1. **Long Form (2OP)**: Validates opcodes 0x01-0x1C
433+
2. **Short Form (1OP/0OP)**: Uses different opcode spaces
434+
3. **Variable Form (VAR)**: Extended opcode validation for multi-operand instructions
435+
4. **Extended Form (EXT)**: v5+ extended instruction set
436+
437+
This layered validation ensures that instruction decoding failures are caught early with descriptive error messages, preventing silent corruption or undefined behavior during game execution.
438+
389439
## Core Components Deep Dive
390440

391441
### 1. Game Loading and Version Detection

src/instruction.rs

Lines changed: 33 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -160,18 +160,20 @@ impl Instruction {
160160
let opcode = opcode_byte & 0x1F;
161161

162162
// CRITICAL FIX: Validate opcode per Z-Machine specification
163-
// Long form (2OP) opcodes are only valid for 0x01-0x14 (1-20)
163+
// Long form (2OP) opcodes are valid for 0x01-0x1C (1-28) in Z-Machine spec
164164
// This validation prevents false positives in disassembly where
165165
// data regions were incorrectly decoded as instructions
166166
// Examples of false positives this prevents:
167167
// - Address 33c04 in AMFV: all zeros decoded as Long 0x00
168168
// - Addresses caf8, cafc: data incorrectly interpreted as code
169-
// - Invalid opcodes above 0x14 that don't exist in Z-Machine spec
170-
// ENHANCED VALIDATION: Reject ALL invalid Long form opcodes per Z-Machine spec
171-
// 2OP instructions exist for opcodes 1-20 (0x01-0x14), opcode 0 and 0x15+ are invalid
172-
if opcode == 0x00 || opcode > 0x14 {
169+
//
170+
// Valid 2OP opcodes include fundamental arithmetic operations:
171+
// 0x14 = add, 0x15 = sub, 0x16 = mul, 0x17 = div, 0x18 = mod
172+
// 0x19 = call_2s, 0x1A = call_2n, 0x1B = set_colour, 0x1C = throw
173+
// Zork I and other commercial Infocom games legitimately use these opcodes
174+
if opcode == 0x00 || opcode > 0x1C {
173175
return Err(format!(
174-
"Invalid Long form opcode 0x{:02x} at address {addr:04x} (valid range: 0x01-0x14)",
176+
"Invalid Long form opcode 0x{:02x} at address {addr:04x} (valid range: 0x01-0x1C)",
175177
opcode
176178
));
177179
}
@@ -696,10 +698,10 @@ mod tests {
696698
}
697699

698700
#[test]
699-
fn test_reject_invalid_long_form_opcode_0x15() {
700-
// Test that invalid Long form opcode 0x15 is rejected (above valid range)
701+
fn test_reject_invalid_long_form_opcode_0x1d() {
702+
// Test that invalid Long form opcode 0x1D is rejected (above valid range)
701703
let memory = vec![
702-
0x35, // Long form, 2OP, opcode 0x15 (INVALID), both small constants
704+
0x3D, // Long form, 2OP, opcode 0x1D (INVALID), both small constants
703705
0x34, // First operand (small constant)
704706
0x78, // Second operand (small constant)
705707
0x00, 0x00, // Padding
@@ -708,13 +710,13 @@ mod tests {
708710
let result = Instruction::decode(&memory, 0, 3);
709711
assert!(result.is_err());
710712
let error_msg = result.unwrap_err();
711-
assert!(error_msg.contains("Invalid Long form opcode 0x15"));
712-
assert!(error_msg.contains("valid range: 0x01-0x14"));
713+
assert!(error_msg.contains("Invalid Long form opcode 0x1d"));
714+
assert!(error_msg.contains("valid range: 0x01-0x1C"));
713715
}
714716

715717
#[test]
716718
fn test_accept_valid_long_form_opcodes() {
717-
// Test that valid Long form opcodes 0x01-0x14 are accepted
719+
// Test that valid Long form opcodes 0x01-0x1C are accepted
718720

719721
// Test opcode 0x01 (je)
720722
let memory1 = vec![
@@ -735,5 +737,24 @@ mod tests {
735737
0x00, // Padding
736738
];
737739
assert!(Instruction::decode(&memory2, 0, 3).is_ok());
740+
741+
// Test opcode 0x15 (sub) - critical for Zork I compatibility
742+
let memory3 = vec![
743+
0x35, // Long form, 2OP, opcode 0x15 (sub), both small constants
744+
0x0A, // First operand (small constant)
745+
0x05, // Second operand (small constant)
746+
0x01, // Store variable
747+
0x00, // Padding
748+
];
749+
assert!(Instruction::decode(&memory3, 0, 3).is_ok());
750+
751+
// Test opcode 0x1C (throw) - highest valid 2OP opcode
752+
let memory4 = vec![
753+
0x3C, // Long form, 2OP, opcode 0x1C (throw), both small constants
754+
0x01, // First operand (small constant)
755+
0x02, // Second operand (small constant)
756+
0x00, 0x00, // Padding
757+
];
758+
assert!(Instruction::decode(&memory4, 0, 3).is_ok());
738759
}
739760
}

0 commit comments

Comments
 (0)