The purpose of this repo is to reverse engineer the firmware for the Cummins CM550 ECU well enough to understand its inner workings.
This project has achieved complete naming coverage for all functions and global variables:
| Category | Named | Coverage |
|---|---|---|
| Functions | 793 | 100% ✅ |
| Global Variables | 6,087 | 100% ✅ |
The firmware is now fully navigable with meaningful names throughout. Every function has a descriptive name, and every global variable is documented in the knowledge database.
Understanding boot-time memory initialization is fundamental to reverse engineering this firmware. At startup, the ECU copies critical data from flash/EEPROM to working RAM. This is where everything begins.
| Region | Flash Address | RAM Address | Size | Copy Function |
|---|---|---|---|---|
| Calibration Block 1 Header | 0x4000-0x400B | 0x804882-0x80488D | 12 bytes | calibrationDataCopyWithChecksum |
| Calibration Block 2 Data | 0x4400-0x5A41 | 0x80488E-0x8062CF | 6,722 bytes | calibrationDataCopyWithChecksum |
| Calibration Block 1 Backup | 0x6000-0x600B | 0x804882-0x80488D | 12 bytes | calibrationDataCopySecondary |
| Calibration Block 2 Backup | 0x6400-0x7A41 | 0x80488E-0x8062CF | 6,722 bytes | calibrationDataCopySecondary |
| Firmware Tables | 0x37EAE-0x3A68D | 0x8062D2-0x808AB1 | 10,208 bytes | firmwareDataCopyToWorkingMemory |
| EBI Memory Controller | 0x28C10-0x28E9B | 0xFFE000-0xFFE68B | 652 bytes | initInternalRamAndCAN1 |
| CAN1 Mailboxes | 0x2929C-0x293FB | 0xFFE700-0xFFE7FE | 256 bytes | initInternalRamAndCAN1 |
Total: ~24,584 bytes copied at boot
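The total can be reproduced from the Size column. The sketch below is only a bookkeeping aid: it sums the sizes exactly as listed (primary and backup rows both counted) and spot-checks two rows against their inclusive flash ranges. The helper name `inclusive_size` is hypothetical and not part of the repo.

```python
# Sizes (in bytes) exactly as listed in the table above.
REGION_SIZES = {
    "Calibration Block 1 Header": 12,
    "Calibration Block 2 Data": 6722,
    "Calibration Block 1 Backup": 12,
    "Calibration Block 2 Backup": 6722,
    "Firmware Tables": 10208,
    "EBI Memory Controller": 652,
    "CAN1 Mailboxes": 256,
}


def inclusive_size(start: int, end: int) -> int:
    """Number of bytes in the inclusive address range [start, end]."""
    if end < start:
        raise ValueError("end address precedes start address")
    return end - start + 1


# Sum of the Size column, counting primary and backup rows alike.
total = sum(REGION_SIZES.values())
print(f"total bytes listed: {total:,}")

# Spot-check two rows against their flash address ranges.
print(inclusive_size(0x4000, 0x400B))    # Calibration Block 1 Header
print(inclusive_size(0x37EAE, 0x3A68D))  # Firmware Tables
```

Running the same range check over every row of `flash_to_ram_mappings.csv` is a quick way to spot transcription errors between the address and size columns.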
Memory Map Overview

FLASH MEMORY (256KB)

| Address Range | Contents | Copied to |
|---|---|---|
| 0x00000000-0x00003FFF | Firmware Code (16KB) | |
| 0x00004000-0x00005A41 | EEPROM Calibration Block 1 (Primary) | 0x804882-0x8062CF |
| 0x00006000-0x00007A41 | EEPROM Calibration Block 2 (Backup) | 0x804882-0x8062CF |
| 0x00028C10-0x000293FB | Peripheral Init Data | 0xFFE000-0xFFE7FE |
| 0x00037EAE-0x0003A68D | Firmware Parameter Tables | 0x8062D2-0x808AB1 |

EXTERNAL RAM (1MB @ 0x800000)

| Address Range | Contents |
|---|---|
| 0x00804882-0x008062CF | Calibration Data (working copy) |
| 0x008062D2-0x00808AB1 | Firmware Tables (working copy) |
| 0x0080CFD6-0x0080CFE6 | Parameter Tables (runtime) |
| 0x008086F6-0x008086xx | Reference Tables (scaling factors) |

INTERNAL REGISTERS (256B @ 0xFFE000)

| Address Range | Contents |
|---|---|
| 0x00FFE000-0x00FFE68B | EBI Memory Controller Config |
| 0x00FFE700-0x00FFE7FE | CAN1 Controller Mailboxes |
Boot Sequence
1. Hardware Reset - MC68336 starts execution
2. Memory Controller Init - `initInternalRamAndCAN1` copies EBI config (652 bytes) to 0xFFE000
3. CAN Controller Init - `initInternalRamAndCAN1` copies CAN1 mailbox config (256 bytes) to 0xFFE700
4. Primary Calibration Copy - `calibrationDataCopyWithChecksum` copies EEPROM data with checksum validation
5. Fallback on Failure - If primary checksum fails, `calibrationDataCopySecondary` loads backup block
6. Firmware Tables Copy - `firmwareDataCopyToWorkingMemory` copies CAN/parameter lookup tables
7. Parameter Initialization - `param_interpolate` (0xD8B4) calculates defaults for blank/corrupted EEPROM
8. Sensor System Init - `initADCChannelConfiguration` (0xAC1C) sets up sensor linearization tables
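The primary/backup calibration steps in the boot sequence can be modeled roughly as follows. This is a sketch of the control flow only, under loud assumptions: the real checksum algorithm and where the checksum is stored are not documented here, so a 16-bit additive sum stored in the last two bytes stands in as a placeholder, and validating the backup before use is also an assumption. All function names below are hypothetical.

```python
from typing import Callable, Optional, Tuple


def additive_checksum16(data: bytes) -> int:
    """Placeholder checksum (assumption): 16-bit sum of all bytes."""
    return sum(data) & 0xFFFF


def make_block(payload: bytes) -> bytes:
    """Append the placeholder checksum, big-endian, to a payload."""
    return payload + additive_checksum16(payload).to_bytes(2, "big")


def block_is_valid(block: bytes) -> bool:
    """Check a block against the checksum stored in its last two bytes."""
    if len(block) < 2:
        return False
    payload, stored = block[:-2], int.from_bytes(block[-2:], "big")
    return additive_checksum16(payload) == stored


def load_calibration(
    primary: bytes,
    backup: bytes,
    is_valid: Callable[[bytes], bool],
) -> Optional[Tuple[str, bytes]]:
    """Return (source, working_copy), preferring the primary block.

    Returns None when neither block validates; in the firmware that case
    would correspond to falling back on calculated defaults.
    """
    if is_valid(primary):
        return "primary", bytes(primary)
    if is_valid(backup):
        return "backup", bytes(backup)
    return None
```

For example, `load_calibration(corrupted, good, block_is_valid)` selects the backup copy, while two corrupted blocks yield `None`.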
Parameter System Architecture
The parameter system uses a Block×256+Offset addressing scheme:
Core Functions:
- `param_address_calc` (0x12AFA) - Address calculation engine
- `param_lookup_1/2/3` (0xD632/0xD69C/0xD756) - Parameter retrieval with validation
- `param_interpolate` (0xD8B4) - EEPROM default value calculator (key initialization function)
Memory Layout:
- Parameter Tables: 0x80CFD6, 0x80CFDA, 0x80CFDE - Runtime parameter storage
- Reference Tables: 0x8086F6+ - Scaling factors and parameter definitions
- Safety Limit: All parameters capped at 32,000
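The Block×256+Offset scheme and the 32,000 cap can be illustrated with a few lines of arithmetic. This is a minimal sketch, not the decompiled logic of `param_address_calc`: it assumes the offset always fits inside a 256-entry block and that the cap is a simple upper clamp. The function names are hypothetical.

```python
PARAM_LIMIT = 32_000  # "All parameters capped at 32,000"
BLOCK_SIZE = 256      # the x256 in Block x 256 + Offset


def param_index(block: int, offset: int) -> int:
    """Combine a block number and an in-block offset into a linear index."""
    if block < 0:
        raise ValueError("block must be non-negative")
    if not 0 <= offset < BLOCK_SIZE:
        # Assumption: offsets never spill into the next block.
        raise ValueError("offset must be in the range 0..255")
    return block * BLOCK_SIZE + offset


def split_index(index: int) -> tuple:
    """Inverse of param_index: recover (block, offset)."""
    return divmod(index, BLOCK_SIZE)


def clamp_param(value: int) -> int:
    """Apply the upper safety limit (assumed to be a plain clamp)."""
    return min(value, PARAM_LIMIT)
```

With these definitions, block 2 and offset 0x10 map to index 528, and `split_index(528)` returns `(2, 16)`.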
Initialization Flow:
    param_interpolate (0xD8B4)
    ├─ Reads reference tables (0x8086xx)
    ├─ Calculates default values using Block×256 formula
    ├─ Stores results in parameter tables (0x80CFDx)
    └─ Populates defaults when EEPROM is blank/corrupted
| File | Description |
|---|---|
| `ghidra/CM550.rep/flash_to_ram_mappings.csv` | Source of truth for all copy operations |
| `analysis/complete_parameter_system_map.md` | Full parameter system reverse engineering |
| `analysis/memory_range_analysis.md` | Memory range verification |
| `ghidra/CM550.rep/sensor_system_architecture.md` | ADC sensor initialization details |
This project features a complete CSV-driven automation system that transforms raw firmware into fully analyzed, human-readable code through Ghidra scripting.
For a fresh firmware import, run ONE script in Ghidra:
ghidra_scripts/MasterAnalysisSetup.java
This automatically executes all analysis:
- ✅ Memory map setup (MC68336 architecture)
- ✅ Function renaming (793 functions)
- ✅ Global variable creation (6,087 variables)
- ✅ Structure application (433 structure fields)
- ✅ Label creation (3,495 control flow labels)
- ✅ Constant documentation (73 magic numbers)
- ✅ Enum creation (464 enum entries)
Result: Firmware goes from cryptic to human-readable instantly.
Your discoveries are stored in 9 CSV files in ghidra/CM550.rep/ - these ARE the playbook:
- `function_renames.csv` - Function names (vp44FuelTempHandler, canMessageDispatcher, etc.)
- `function_parameters.csv` - Function parameter names and types
- `global_variables.csv` - Typed variables (param_table_main, sensor_data_buffer, etc.)
- `local_variables.csv` - Decompiler local variable renames (matched by first-use address)
- `structure_definitions.csv` - C-style structures (parameter_table_t, can_param_msg_t, etc.)
- `labels.csv` - Control flow labels (switch_case_16, call_vp44_handler, etc.)
- `constants.csv` - Magic numbers (VP44_FUEL_TEMP_OFFSET=112, RPM_MULTIPLIER=4, etc.)
- `enums.csv` - Logical groupings (CAN_MSG_TYPE, PARAM_VALIDATION, etc.)
- `arrays.csv` - Arrays/buffers (parameter_buffer[16], sensor_data_buffer[256], etc.)
All CSV files are sorted to prevent merge conflicts during team collaboration:
- `function_renames.csv`: Sorted by address (hex)
- `global_variables.csv`: Sorted by address (hex)
- `structure_definitions.csv`: Sorted by struct_name, then field_name
- `constants.csv`: Sorted by address (hex)
- `labels.csv`: Sorted by address (hex)
- `enums.csv`: Sorted by enum_name, then value
- `arrays.csv`: Sorted by address (hex)
Why this matters:
- ✅ No merge conflicts when team members add entries
- ✅ Predictable ordering makes entries easy to find
- ✅ Code reviewable discoveries in consistent format
- ✅ Scalable to large teams with systematic organization
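Sorting "by address (hex)" means comparing the numeric value of each address, not its text, since a plain string sort would place `0x12AFA` before `0xA30C`. The sketch below shows one way to do that. It is not the repo's hook script, and the column names `address` and `name` are assumptions made for the example.

```python
import csv
import io


def sort_rows_by_hex_address(rows, key="address"):
    """Sort CSV rows numerically by a hex address column such as 0x0000a30c."""
    return sorted(rows, key=lambda row: int(row[key], 16))


def sort_csv_text(text: str, key: str = "address") -> str:
    """Return the CSV text with its data rows sorted by hex address."""
    reader = csv.DictReader(io.StringIO(text))
    rows = sort_rows_by_hex_address(list(reader), key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


unsorted = "address,name\n0x12AFA,param_address_calc\n0xA30C,myFunctionName\n"
print(sort_csv_text(unsorted), end="")
```

Files sorted by two text columns (for example `enum_name`, then `value`) would use a tuple key instead of the integer conversion.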
Enum sizes in enums.csv must match the assembly instruction operand size:
- `move.b` (byte operations) → enum size = 1
- `move.w` (word operations) → enum size = 2
- `move.l` (long/dword operations) → enum size = 4
Why this matters: If enum size doesn't match the variable size, Ghidra creates overlapping symbols (shown with underscore prefix like _variable_name), and enum member names won't appear in switch statements.
How to verify: Check the disassembly for how the variable is accessed:

    move.w d0,(main_loop_phase_index).w ; Variable is 2 bytes → enum size = 2
    move.l d1,(fuel_sync_state).l ; Variable is 4 bytes → enum size = 4

Example fix: If `switch(_my_variable)` shows an underscore:
1. Find the variable in `global_variables.csv`
2. Check assembly to determine actual operand size
3. Update both variable size AND enum size to match
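The size rule above is mechanical enough to check with a small helper. The following sketch reads the `.b`/`.w`/`.l` suffix from a disassembly line and returns the matching enum size. It is an illustration only: the function name `operand_size` is hypothetical, and the pattern looks only at the mnemonic suffix at the start of the line.

```python
import re

# Operand-size suffix -> size in bytes, as listed above.
SUFFIX_SIZES = {"b": 1, "w": 2, "l": 4}

# Matches a mnemonic with a size suffix at the start of the line, e.g. "move.w".
_MNEMONIC = re.compile(r"^\s*([a-z]+)\.([bwl])\b", re.IGNORECASE)


def operand_size(instruction: str) -> int:
    """Return the enum size implied by the instruction's size suffix."""
    match = _MNEMONIC.match(instruction)
    if match is None:
        raise ValueError(f"no size suffix found in: {instruction!r}")
    return SUFFIX_SIZES[match.group(2).lower()]


print(operand_size("move.w d0,(main_loop_phase_index).w"))
print(operand_size("move.l d1,(fuel_sync_state).l"))
```

Comparing this value with the size recorded in `global_variables.csv` and `enums.csv` flags mismatches before they show up as underscore-prefixed symbols.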
Never worry about CSV ordering again!
Run the setup script once after cloning:

    ./setup-hooks.sh

This configures a pre-commit hook that automatically:
- ✅ Sorts all CSV files according to standardization rules
- ✅ Prevents merge conflicts before they happen
- ✅ Re-stages sorted files automatically during commit
- ✅ Shows colorful feedback about what was sorted
When adding new entries, they will automatically sort into the correct position during git commit.
NEVER make changes directly in Ghidra (via MCP tools or manual edits). All changes MUST be made in CSV files only, then applied via ApplyAndExport.
Why?
- Direct Ghidra changes are NOT persisted to CSVs (source of truth)
- Creates inconsistency between CSV and Ghidra state
- Can cause unexpected side effects (e.g., variable types becoming `undefined`)
- The Ghidra project can be rebuilt from CSVs, but not vice versa
Pre-Commit Requirements:
- All changes are in CSV files (NOT made directly in Ghidra)
- ApplyAndExport (`Ctrl+Shift+E`) was run in Ghidra
- Verified `ghidra/CM550.rep/working/J90280.05.ghidra.cpp` shows expected changes
- No unexpected type regressions in the output
1. Discover new functions/addresses in Ghidra
2. Update CSVs with findings
3. Press `Ctrl+Shift+E` in Ghidra (or run `ApplyAndExport.java`)
4. Verify output in `working/J90280.05.ghidra.cpp`
5. Claude Code sees changes immediately in exported files
The ApplyAndExport.java script combines both setup and export in one keystroke:
- Keyboard shortcut: `Ctrl+Shift+E`
- Menu: Tools → Apply and Export
- What it does: Runs MasterAnalysisSetup + ExportAnalysisResults automatically
- Log file: Creates `ghidra/CM550.rep/apply_and_export.log` (cleared each run)
1. Discover new functions/addresses in Ghidra
2. Update CSVs with findings
3. Run MasterAnalysisSetup.java - applies changes instantly
4. Run ExportAnalysisResults.java - exports to `working/` for Claude Code
5. Claude Code sees changes immediately in exported files

1. Import firmware → 2. Run MasterAnalysisSetup → 3. Run ExportAnalysisResults → Done!
ghidra_scripts/ExportAnalysisResults.java creates:
- `ghidra/CM550.rep/working/J90280.05.ghidra.asm` - Assembly with meaningful names/comments
- `ghidra/CM550.rep/working/J90280.05.ghidra.cpp` - C++ decompilation with types
Claude Code instantly sees your latest Ghidra analysis!
- `ApplyAndExport.java` - FASTEST WORKFLOW! Combines MasterAnalysisSetup + ExportAnalysisResults
  - Keyboard: `Ctrl+Shift+E`
  - Menu: Tools → Apply and Export
  - Perfect for: Iterative reverse engineering (update CSV → press hotkey → done!)
- `MasterAnalysisSetup.java` - Complete analysis automation (functions, structures, enums, labels, etc.)
- `ExportAnalysisResults.java` - Export analysis to working/ for Claude Code integration
- `ghidra_scripts/SetupMemoryMap.java` - MC68336 memory layout with 8KB EEPROM
- `ghidra_scripts/BulkFunctionRenamer.java` - CSV-driven function renaming
- `ghidra_scripts/BulkVariableCreator.java` - Typed global variables
- `ghidra_scripts/BulkStructureCreator.java` - Structure definitions
- `ghidra_scripts/BulkLabelCreator.java` - Control flow labels
- `ghidra_scripts/BulkConstantCreator.java` - Magic number documentation
- `ghidra_scripts/BulkEnumCreator.java` - Enumeration creation
- `ghidra_scripts/BulkArrayCreator.java` - Array/buffer definitions
- `ghidra_scripts/BulkFunctionParameterRenamer.java` - Function parameter naming
- `ghidra_scripts/BulkLocalVariableRenamer.java` - Decompiler local variable renaming
The local_variables.csv uses first-use address matching for stability. Unlike global variable names, decompiler local variable names (like cVar6, bVar8) can shift when other variables in the same function are renamed. Matching by code address ensures renames are stable.

    function_address,function_name,first_use_address,new_variable_name,type,comment
    0x00012484,diagnosticCommandDispatcher,0x12580,securityCheckResult,char,Result from systemSecurityCheck()

1. Add an entry with `first_use_address=0x0` (placeholder that won't match)
2. Run ApplyAndExport (Ctrl+Shift+E) - the script will output available variables with their first-use addresses
3. Check `ghidra/CM550.rep/apply_and_export.log` for the output:

       Processing function: diagnosticCommandDispatcher @ 0x12484
       No variable found with first-use at 0x0
       Searching for: myVariableName
       Available variables:
       - cVar6 (char) first-use: 0x12580
       - bVar7 (byte) first-use: 0x1256c

4. Update the CSV with the correct address from the log
5. Re-run ApplyAndExport - the variable will be renamed
- Stable: Code addresses never change, even when other variables are renamed
- Unique: Each assignment location is unique in the binary
- Semantic: Ties directly to where the variable gets its value
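The matching idea can be shown without Ghidra: look a variable up by the address of its first use rather than by its decompiler-assigned name. The sketch below is an illustration of that lookup only, not the code in `BulkLocalVariableRenamer.java`. The `LocalVar` type and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LocalVar:
    name: str        # decompiler-assigned name, e.g. cVar6
    type: str        # decompiler type, e.g. char
    first_use: int   # code address where the variable is first used


def find_by_first_use(variables: List[LocalVar], address: int) -> Optional[LocalVar]:
    """Return the variable whose first-use address matches, if any."""
    for var in variables:
        if var.first_use == address:
            return var
    return None


def describe_candidates(variables: List[LocalVar]) -> List[str]:
    """Format the available variables the way the log lists them."""
    return [f"- {v.name} ({v.type}) first-use: {v.first_use:#x}" for v in variables]


candidates = [LocalVar("cVar6", "char", 0x12580), LocalVar("bVar7", "byte", 0x1256C)]
match = find_by_first_use(candidates, 0x12580)
```

Here `match.name` is `cVar6`, and a lookup at the placeholder address `0x0` returns `None`, which is the case where the script falls back to listing the candidates.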
When scripts are modified in the project directory:
1. Copy to Ghidra Scripts Directory:

       # Copy individual script
       cp ghidra_scripts/SetupMemoryMap.java ~/ghidra_scripts/
       cp ghidra_scripts/MasterAnalysisSetup.java ~/ghidra_scripts/
       # Copy all scripts (after updates)
       cp ghidra_scripts/*.java ~/ghidra_scripts/

2. Verify Script Updates:

       ls -la ~/ghidra_scripts/*.java

3. Refresh Ghidra Script Manager:
   - In Ghidra: Window → Script Manager
   - Click Refresh button to reload updated scripts

Always copy scripts from `ghidra_scripts/` to the `~/ghidra_scripts/` directory after modifications to ensure Ghidra uses the latest versions.

Script Organization:
- `ghidra_scripts/` - Project scripts (version controlled)
- `~/ghidra_scripts/` - User Ghidra directory (runtime execution)
Claude Code can interact directly with Ghidra via the MCP (Model Context Protocol) server for reading and analysis only.
DO NOT use MCP tools to modify Ghidra (rename, set types, etc.). Direct modifications:
- Are NOT persisted to CSV files (source of truth)
- Can cause unexpected side effects (variable types becoming `undefined`)
- Create inconsistency between CSVs and Ghidra state
All changes must go through the CSV files → ApplyAndExport workflow.
| Tool | Purpose | Safe to Use |
|---|---|---|
| `decompile_function` | Get C code by function name | ✅ Yes |
| `decompile_function_by_address` | Get C code by hex address | ✅ Yes |
| `search_functions_by_name` | Find functions by pattern | ✅ Yes |
| `get_function_xrefs` | Get cross-references | ✅ Yes |
| `list_strings` | List strings in binary | ✅ Yes |
| `get_xrefs_to` / `get_xrefs_from` | Trace references | ✅ Yes |
| `disassemble_function` | Get assembly code | ✅ Yes |
| `list_functions` | List all functions | ✅ Yes |
| `rename_function_by_address` | Rename function | ❌ NO - Use CSV |
| `rename_variable` | Rename local variable | ❌ NO - Use CSV |
| `rename_data` | Rename global data label | ❌ NO - Use CSV |
| `set_function_prototype` | Set function signature | ❌ NO - Use CSV |
| `set_local_variable_type` | Set variable type | ❌ NO - Use CSV |
Pattern: Rename a Function

    1. mcp__ghidra__decompile_function_by_address("0x0000a30c") ← READ (OK)
    2. Analyze decompiled code, determine name
    3. Edit function_renames.csv (add: 0x0000a30c,myFunctionName) ← CSV ONLY
    4. Run ApplyAndExport in Ghidra (Ctrl+Shift+E)
    5. Verify output in working/J90280.05.ghidra.cpp

Pattern: Add Structure/Variable

    1. mcp__ghidra__decompile_function("myFunction") ← READ (OK)
    2. Analyze code, identify structure patterns
    3. Edit structure_definitions.csv or global_variables.csv ← CSV ONLY
    4. Run ApplyAndExport in Ghidra (Ctrl+Shift+E)
    5. Verify output shows expected field names

- All changes are in CSV files (NOT made via MCP write tools)
- Run `Ctrl+Shift+E` in Ghidra (ApplyAndExport)
- Verify `working/*.cpp` shows expected changes
- Check for unexpected type regressions (e.g., `byte *` → `undefined *`)
- Commit includes synchronized CSVs + exports
- CSV files are the source of truth - ALL changes go through CSVs
- MCP for reading only - use `decompile_function`, `get_xrefs`, etc. for analysis
- Never use MCP write tools - `rename_*`, `set_*` tools cause sync issues
- Always verify before commit - run ApplyAndExport and check the output
- Address references help: `function_name @ 0x12345` for precise location
- Exported files have latest analysis - `working/*.asm` and `working/*.cpp`
The CSV → ApplyAndExport → Verify workflow ensures reproducible, consistent analysis.
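One cheap way to catch the type regressions mentioned in the checklist is to count Ghidra's `undefined` placeholder types (`undefined`, `undefined1`, `undefined2`, `undefined4`, `undefined8`) in the exported C++ before and after a change. The sketch below is a rough heuristic, not part of the repo's tooling. Both function names are hypothetical, and a rising count is only a hint to inspect the diff.

```python
import re

# Ghidra's placeholder types: undefined, undefined1, undefined2, undefined4, ...
_UNDEFINED = re.compile(r"\bundefined\d*\b")


def count_undefined(source: str) -> int:
    """Count occurrences of Ghidra undefined* types in exported source text."""
    return len(_UNDEFINED.findall(source))


def has_type_regression(before: str, after: str) -> bool:
    """True if the new export contains more undefined* usages than the old one."""
    return count_undefined(after) > count_undefined(before)


old_export = "byte *p = param_table_main;"
new_export = "undefined *p = param_table_main;"
print(has_type_regression(old_export, new_export))
```

In practice `before` and `after` would be the committed and freshly exported contents of `working/J90280.05.ghidra.cpp`.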
File: firmware/J90280.05.full.bin (converted from Intel HEX)
Architecture: 68000:BE:32:default
Base Address: 0x00000000
Import Method: Raw Binary
- Internal Flash: 0x000000 - 0x007FFF (32KB, R/X)
- Internal Registers: 0xFFFF00 - 0xFFFFFF (256B, R/W)
- Internal RAM: 0xFFFE00 - 0xFFFEFF (256B, R/W)
- External Memory: 0x800000 - 0x8FFFFF (1MB, R/W) - Parameter system region
- vp44FuelTempHandler @ 0x1C538 - VP44 injection pump fuel temperature processing
- canMessageDispatcher @ 0x1C846 - CAN message routing by type (16=VP44, 17/19=Alt, 255=Error)
- buildCanMessage @ 0x29C52 - J1939 message assembly (fuel%, RPM×4, timing advance)
- param_address_calc @ 0x12AFA - EEPROM parameter address calculation (Block×256+Offset)
- param_lookup_1/2/3 @ 0xd632/0xd69c/0xd756 - Parameter retrieval with validation
- param_interpolate @ 0xd8b4 - EEPROM default value calculator
- Parameter tables @ 0x80CFDx - Runtime parameter storage
- Reference tables @ 0x8086xx - Scaling factors and limits
- J1939 functions: sendJ1939Msg, sendJ1939SingleFrame, sendJ1939MultiFrame
- VP44 network: Separate CAN bus for injection pump communication
- Message formats: 8-byte J1939 frames with engine sensor data
- MAIN_LOOP_PHASE (40 phases) - Main scheduler task phases (PHASE_0_VP44_RPM through PHASE_39)
- ENGINE_OPERATING_MODE (9 states) - IDLE, LOW_RPM_RUNNING, HIGH_RPM_RUNNING, CRANKING, etc.
- VP44_ENGINE_STATE (12 states) - VP44 injection pump state machine
- PROTECTION_STATE (5 states) - Engine protection coordinator (FAULT_DURATION_COUNT, DIAGNOSTIC_VALIDATE, etc.)
- FUEL_SYNC_STATE (5 states) - Fuel pressure synchronization state machine
- RETARDER_MODE_STATE (6 states) - Retarder/engine brake control
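As a small illustration of the routing attributed to `canMessageDispatcher` above (16=VP44, 17/19=Alt, 255=Error), the sketch below maps a message type to a route label. It models only the documented type values. The function name, the label strings, and the `"unhandled"` fallback are assumptions, not behavior confirmed from the firmware.

```python
# Message-type values taken from the canMessageDispatcher description.
ROUTES = {
    16: "VP44",
    17: "Alt",
    19: "Alt",
    255: "Error",
}


def route_message(msg_type: int) -> str:
    """Return the route label for a CAN message type.

    Types not listed in the description fall through to "unhandled",
    which is a placeholder chosen for this example.
    """
    return ROUTES.get(msg_type, "unhandled")


for msg_type in (16, 17, 19, 255, 18):
    print(msg_type, route_message(msg_type))
```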
The CSV files contain the complete reverse engineered knowledge base for this firmware.
| Category | Named | Remaining | Progress |
|---|---|---|---|
| Functions | 793 | 0 | 100% ✅ |
| Global Variables | 6,087 | 3* | 100% ✅ |
| Local Variables | 42 | ~22 | 66% |
| Function Parameters | 9 | TBD | In Progress |
*3 remaining DAT_ references are indirect pointer accesses to already-named variables
| CSV File | Entries | Description |
|---|---|---|
| `function_renames.csv` | 793 | All functions named |
| `global_variables.csv` | 6,087 | All global variables documented |
| `structure_definitions.csv` | 433 | Structure field definitions |
| `labels.csv` | 3,495 | Control flow labels |
| `enums.csv` | 464 | Enumeration entries |
| `constants.csv` | 73 | Magic number documentation |
| `local_variables.csv` | 42 | Local variable renames |
| `function_parameters.csv` | 9 | Function parameter names |
- ~22 unnamed local variables - `local_XX` patterns in key functions
- ~1,050 undefined type usages - Variables needing proper typing
- Function parameter naming - Improve function signatures
The common_parameters.json file contains CalTerm parameter definitions extracted from e2m calibration files. While the parameter names and descriptions are reliable, the memory addresses may be incorrect.
Why addresses may be wrong:
- Addresses were extracted from e2m files, not verified against actual firmware
- Different firmware versions may use different memory layouts
- The extraction process may have introduced errors
Verification Goal: One key objective of this reverse engineering effort is to:
- Verify which addresses in common_parameters.json are correct
- Identify and document incorrect addresses
- Build a verified address mapping through firmware analysis
When verifying addresses:
- Cross-reference decompiled code behavior with parameter descriptions
- Note verified mappings in global_variables.csv comment field
- Mark confirmed matches with "VERIFIED: matches common_parameters.json"
- Mark mismatches with "NOTE: common_parameters.json shows different address"
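The two comment markers can be generated consistently with a small helper. This sketch assumes you already have the address from `common_parameters.json` and the address confirmed in the firmware as integers. The JSON layout itself is not described here, so no parsing is shown, and the function name is hypothetical.

```python
VERIFIED = "VERIFIED: matches common_parameters.json"
MISMATCH = "NOTE: common_parameters.json shows different address"


def verification_comment(json_address: int, firmware_address: int) -> str:
    """Pick the comment marker for the global_variables.csv comment field."""
    if json_address == firmware_address:
        return VERIFIED
    return MISMATCH


# Illustrative values only: 0x80CFD6 appears in this README, 0x80CFDA is
# used here as a stand-in for a differing JSON address.
print(verification_comment(0x80CFD6, 0x80CFD6))
print(verification_comment(0x80CFDA, 0x80CFD6))
```

Keeping the marker strings in one place avoids spelling drift, which makes the verified and mismatched entries easy to grep for later.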