SAM firmware reverse engineering #64

@quo

First of all, @qzed thanks for all the work you've already done reverse engineering and documenting a lot of the SAM stuff. It's been quite helpful!

I've been reverse engineering the SP7 SAM firmware (specifically SurfaceSAM_14.312.139.bin) in an attempt to debug something. Haven't found anything particularly useful so far, but figured I would share regardless. Let me know if you'd like me to look at anything in particular.

Firmware structure

As can be seen on SP7 teardown photos, the SAM microcontroller is an NXP LPC54S001J (Cortex-M4, 360KB RAM), with a separate Winbond 16MB flash chip.

The SVD is for a slightly different part number, but it does the job. There's a script to import SVDs into Ghidra, but it fails due to some overlapping address ranges. You can either remove these from the SVD or hack the script a bit.

The bin file consists of a signature and two firmware images. The images are encoded as arrays of { u32 offset, u8 len = 16, u8[16] data }, so can be extracted fairly easily. The two images are identical, except one is meant to be flashed at 0x10004000 and the other at 0x10084000 (standard A/B update handling), so some addresses differ by one bit. The images have a header and end with a CRC16, which are used by the SAM when flashing. The actual raw firmware image starts at 0x66C.
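A rough sketch of how the record array described above could be reassembled into a flat image. Several things here are my assumptions rather than facts from the firmware: that the records are packed back to back (21 bytes each), that the `u32 offset` is little-endian, and that gaps between records can be filled with 0xFF. Locating where the record array starts inside the bin file is left out, and `extract_image` is just an illustrative name.

```python
import struct
from typing import Tuple

# Assumed layout of one record: { u32 offset, u8 len, u8[16] data },
# little-endian and packed without padding (21 bytes).
RECORD = struct.Struct("<IB16s")


def extract_image(records: bytes) -> Tuple[int, bytes]:
    """Rebuild a flat image from a packed record array.

    Returns (lowest offset seen, image bytes relative to that offset).
    """
    if not records or len(records) % RECORD.size:
        raise ValueError("record array must be a non-empty multiple of 21 bytes")
    chunks = list(RECORD.iter_unpack(records))
    for _, length, _ in chunks:
        if length > 16:
            raise ValueError(f"unexpected record length {length}")
    base = min(offset for offset, _, _ in chunks)
    size = max(offset + length for offset, length, _ in chunks) - base
    image = bytearray(b"\xff" * size)  # assumption: gaps look like erased flash
    for offset, length, data in chunks:
        start = offset - base
        image[start:start + length] = data[:length]
    return base, bytes(image)
```

Normalizing against the lowest offset means the same code works whether the offsets turn out to be relative or absolute flash addresses.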

The raw firmware images start with a standard ARM vector table, and contain an NXP image header which tells the NXP bootloader to load the first 0x29484 bytes into SRAM at address 0. For reverse engineering, you can just split the raw image and load the first 0x30000 bytes at 0, and the remaining 0x50000 bytes at 0x10034000 or 0x100B4000.
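The split described above can be sketched like this. The input is assumed to be the raw firmware image already (i.e. starting at the vector table); the slot names "A" and "B" and the function name are made up for illustration, and the load addresses are simply the ones given in the paragraph above.

```python
SRAM_PART_LEN = 0x30000

# Load address of the remainder, per slot (slot names are hypothetical).
REMAINDER_ADDR = {"A": 0x10034000, "B": 0x100B4000}


def split_raw_image(raw: bytes, slot: str = "A"):
    """Return [(load_address, data), ...] for loading into a disassembler."""
    if slot not in REMAINDER_ADDR:
        raise ValueError("slot must be 'A' or 'B'")
    return [
        (0x0, raw[:SRAM_PART_LEN]),                   # copied to SRAM at address 0
        (REMAINDER_ADDR[slot], raw[SRAM_PART_LEN:]),  # stays in flash
    ]
```

Each returned pair can then be imported as a separate memory block at the given address.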

The firmware contains an RTOS which is internally referred to as Kaos. I can't find anything about it on Google, so I assume it was created by MS. Kaos appears somewhat inspired by FreeRTOS and offers the same basic primitives: tasks, timers, events, semaphores, and message queues. There are dozens of tasks and timers, which communicate through dozens of message queues, which leads to a ton of indirection (and memory overhead) so it can be very difficult to follow what's going on. Some parts of the firmware use vtable-like constructions for some additional indirection. Lots of fun.

SAM protocol

(I'll try to use the terminology from https://github.com/linux-surface/surface-aggregator-module/tree/master/doc.)

The following target(/source) IDs are implemented:

  • 0 = Host
  • 1 = SAM
  • 2 = KIP
  • 3 = Debug
  • 4 = Surflink

I believe the SAM just forwards any messages with TID != 1 to different serial connections. (But I have not yet traced the entire message path.)

For TID == 1, only the following TCs are handled by the SAM on the SP7:

  • 01 = SAM: SAM
  • 02 = BAT: Battery
  • 03 = TMP: Thermal
  • 04 = PMC: Power
  • 05 = FAN: Fan
  • 07 = DBG: Debug
  • 09 = FWU: Firmware update
  • 0c = TCL: (Trace/crash logs?)
  • 0d = SFL: Surflink
  • 10 = BLD: Surface Blades
  • 12 = SEN: Sensors
  • 13 = SRQ: (?)
  • 15 = HID: HID
  • 17 = BKL: Backlight
  • 1b = USC: USB-C

The firmware does not explicitly name the TCs in any way (not even using the abbreviated names), so the above names are based on names of related tasks, message queues, etc. Of course, most of these names were already known.

FWU/TCL/SRQ commands are handled together via the NVM message queue, so presumably they all use the flash storage in some way.

Debug mode

The SAM has a debug mode variable (default 0) and a "safe mode" flag (default 1).

The safe mode flag is read with TC 1, CID 0x27, and written by TC 7, CID 0x5F.
When the safe mode flag is true, all CIDs >= 0x80 are disabled (for all TCs), and various other functionality is disabled.

The debug mode is read with TC 1, CID 0x29, and written by TC 7, CID 0x4E.
The debug mode values are:

  • 0 = disabled
  • 1 = also disabled?
  • 2 = basic
  • 3 = possibly related to firmware update?
  • 4 = full

The debug mode can only be set to 0 or 2 normally. It is currently not known how to set it to other values.
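To make the gating rules above concrete, here is a small model of them. This is not decompiled code; the enum member names and function names are my own labels, and the behaviour is only what is described above (safe mode blocks CIDs >= 0x80, and TC 1 CID 0x26 maps its argument to mode 0 or 2).

```python
from enum import IntEnum


class DebugMode(IntEnum):
    # Member names are my own labels for the values listed above.
    DISABLED = 0
    DISABLED_ALT = 1   # "also disabled?"
    BASIC = 2
    FWU_RELATED = 3    # "possibly related to firmware update?"
    FULL = 4


def cid_allowed(cid: int, safe_mode: bool) -> bool:
    """With the safe mode flag set, CIDs >= 0x80 are rejected for every TC."""
    return not (safe_mode and cid >= 0x80)


def debug_mode_from_cid26(arg: int) -> DebugMode:
    """TC 1, CID 0x26: zero selects mode 0, anything else selects mode 2."""
    return DebugMode.BASIC if arg else DebugMode.DISABLED
```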

There is a command to read RAM in debug mode 2 (TC 7, CID 0x4D, limited to a fixed address range), which is very useful. However, there appears to be no command to write RAM, even in the higher debug modes. Everything seems to be locked down pretty well (range checks on all command arguments), so I've been unable to find a way to write arbitrary memory so far.

Commands

Here's a complete list of command IDs I've found (for TID 1), and descriptions for some of them. This list will probably contain mistakes! Will try to update this as I figure out more.

Format: CID { command data } => { response data } description
"Handled separately" means the switch handling these commands is in a separate function, so presumably the commands have related functionality.
(Please excuse the poor formatting.)

TC 01: SAM

01  (see existing doc)
02  (see existing doc)
03  (see existing doc)
04  {} disable safe mode and set debug mode 4, if device is in a certain power state?
05  {} enable safe mode and set debug mode 0, if device is in a certain power state?
06  {} nop
07  {} nop
08  {} nop
0b  (see existing doc)
0c  (see existing doc)
0d
0e
0f  (see existing doc)
10  (see existing doc)
13  (see existing doc)  {} => { u32 } get SAM firmware version
14  (see existing doc)
15  (see existing doc)
16  (see existing doc)
17  (see existing doc)
18
19
1a  (see existing doc)
1b  (see existing doc)
1c  (see existing doc)
1d  (see existing doc)
1e  (see existing doc)
1f  (see existing doc)
20  (see existing doc)
21
22  (see existing doc)
23  (see existing doc)
24
25
26  { u8 } set debug mode to 0 if zero, or to 2 if nonzero
27  {} => { u8 } get safe mode flag
29  {} => { u8 } get debug mode
2a
2b
2c
2e  {} => { u8 x, u8 0, u16 0x2e, u32 0 } get active firmware image location (x = 0x11 for 0x10004000, or 0x12 for 0x10084000)
2f
33  (see existing doc)
34  (see existing doc)
35
36
37
38  { u16 } => { u16 }
39
3a
3b
81
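As an example of decoding one of the responses above, this sketch parses the CID 2e payload `{ u8 x, u8 0, u16 0x2e, u32 0 }` and maps `x` to the image base address. Little-endian field order is an assumption on my part, and `parse_active_image` is a hypothetical helper name.

```python
import struct

# x value -> flash address of the active image, as listed for CID 2e.
LOCATIONS = {0x11: 0x10004000, 0x12: 0x10084000}


def parse_active_image(payload: bytes) -> int:
    """Return the flash address of the active firmware image."""
    x, _zero, marker, _rest = struct.unpack("<BBHI", payload)  # assumed little-endian
    if marker != 0x2E:
        raise ValueError(f"unexpected marker {marker:#x}")
    if x not in LOCATIONS:
        raise ValueError(f"unknown location code {x:#x}")
    return LOCATIONS[x]
```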

TC 02: Battery

01  (see existing doc)
02  (see existing doc)
03  (see existing doc)
04  (see existing doc)
0b  (see existing doc)
0c  (see existing doc)
0d  (see existing doc)
0f  (see existing doc)
18
2d
2e
2f
30
31
32
33
34
3c
3d
3e
3f
42
50
53
51

Handled separately:

00
07
08
1e,20
1f,21
29
2a
2b
2c
35
36
37
38
39
3a
3b
43
44
45
46
47
48
4d
4e
52
54
55
56
57
58
59
5a
5d
5e
61
80
81
86
87
8c
8d
90
94
95
96
97
98
99
9a
9b

Handled separately:

8b

TC 03: Thermal

01  (see existing doc)
03  (see existing doc) (handled twice)
04  (see existing doc)
0c
0d
0e
17

Handled separately:

02  (see existing doc)
03  (see existing doc) (handled twice)
0f
10
11
14
15
16
83
90
91
92
93
94
95

Handled separately:

09  (see existing doc)
0a  (see existing doc)
12
13

TC 04: Power

01
02
04,8b
05
06
07
09
0a
81
83
8a
8c
8d
8e
8f
90  { u8 } set RTOS idle task enabled
91  {} log and reset idle stats

TC 05: Fan

01
02
03
04
05
80
81
83

TC 07: Debug

3f  { u8 } set debug pins connection mode? 0=SAM_Flash, 2=PCH_Logging, 3=SAM_Debug, 4=Touch_JTAG, 5=Power_Monitor, 6=?, 7=PCH_JTAG, 9=Blade_UART
4b  { u8 } set debug log target (0=Debug, 1=Host, 2=KIP, 3=Surflink)
4e  { u8 } set debug mode
53  { u32 } => { u32 } clear a gpio, sleep N microseconds in ram function, then set gpio again
5f  { u8 } set safe mode flag
80  { u8, u8 } => { u8 } flush logs(?) and optionally reset SAM (first byte & 0xe0 must be zero, second byte must be 2 for reset)

Requires debug mode 2 or 4:

03  {} log full OS state
11  {} log fw version and flash location
18  { u8 cmd, u8 module } cmd 0 = log enabled module bits, 1 = enable logging for module, 2 = disable logging for module; module = 0..127
19  { u8 cmd, u8 level } cmd 0 = log loglevel, cmd 1 = set loglevel
30  {} log safe mode state
41  {} same as TCL CID 86?
42  { u8 cmd, u8 module } cmd 0 = log verbose module bits, 1 = enable verbose logging for module, 2 = disable verbose logging for module; module = 0..127
4d  { u32 addr, u16 len } => { u32 addr, u16 len, u8[] data } read memory; only the region from 0x20001000 to 0x200258f0 can be read, max len 94
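Since CID 4d only returns up to 94 bytes per call, dumping a larger region means issuing several requests. A sketch of building the request payloads and parsing a response follows. Assumptions: fields are little-endian, and the upper bound 0x200258f0 is treated as exclusive. Sending the payloads (TC 7, CID 0x4D) is left to whatever transport you use; the function names are made up.

```python
import struct

READ_START = 0x20001000
READ_END = 0x200258F0  # assumed to be exclusive
MAX_READ_LEN = 94


def read_requests(addr: int, length: int):
    """Split one logical read into { u32 addr, u16 len } request payloads."""
    if length < 0 or addr < READ_START or addr + length > READ_END:
        raise ValueError("range outside the readable region")
    payloads = []
    while length > 0:
        n = min(length, MAX_READ_LEN)
        payloads.append(struct.pack("<IH", addr, n))
        addr += n
        length -= n
    return payloads


def parse_read_response(payload: bytes):
    """Decode { u32 addr, u16 len, u8[] data } into (addr, data)."""
    addr, n = struct.unpack_from("<IH", payload)
    data = payload[6:6 + n]
    if len(data) != n:
        raise ValueError("response shorter than its length field")
    return addr, data
```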

Requires debug mode 4:

0e  { u8 len, u8[] } => { u8 len, u8[] } ping?
0f  { u8 len, u8[] } => { u8 len, u8[] } ping? (different response type?)
20  {} set power related flag and feed watchdog
32  {} toggle debug LED on
4f  { u8 module, u8 addr, u8 register, u8 len } => { u8 error, u8 len, u8[16] data } i2c read; module = 0..4
50  { u8 module, u8 addr, u8 register, u8 len, u8[] data } i2c write
51  { u8 module } i2c bus scan (module 0xff == all)
54  { u8 module?, u8 }
55  { u8, u8 len?, u8[16], u16 resplen } => { u8[] }
5a  { u8 port, u8 pin } log GPIO pin value
5b  { u8 port, u8 pin } set GPIO high
5c  { u8 port, u8 pin } set GPIO low
63  {} crash (call null pointer)
64  {} read invalid(?) peripheral addr 0x402055aa

When log target is set to Host, SAM will send log messages with TID=3. The request ID for these messages (except CID 49) is set to a hash of internal timers, so is effectively random.
When the amount of data in a log record exceeds 40 bytes, it is split over multiple messages with the same request ID.
The first 8 bytes of the log data are always { u32 timestamp_millis, u32 event_code }. For split records, only the first message will have this header.
There are four types of log record: u32 array, string array, error, and buffer. These use the following CIDs:

43  u32 array (start of split record)
44  u32 array (middle of split record)
45  u32 array (end of split record, or non-split record)
46  null-terminated string array (start of split record)
47  null-terminated string array (middle of split record)
48  null-terminated string array (end of split record, or non-split record)
49  error (request ID is 0, data is { u32 timestamp_millis, u32 event_code, u32 value })
4a  buffer (i.e. raw byte array) (same CID is used for all messages if split)
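Here is a sketch of how a host-side tool might reassemble these log records, keyed on the request ID. It follows the description above, but the little-endian encoding, the class name and the output dict layout are my own assumptions. Buffer records (4a) are skipped because the same CID is used for every fragment, so the end of a split record cannot be detected from the CID alone.

```python
import struct

# CID -> (record kind, position within a split record), from the list above.
FRAGMENTS = {
    0x43: ("u32", "start"), 0x44: ("u32", "middle"), 0x45: ("u32", "end"),
    0x46: ("str", "start"), 0x47: ("str", "middle"), 0x48: ("str", "end"),
}


class LogAssembler:
    def __init__(self):
        self.pending = {}  # request ID -> bytes collected so far

    def feed(self, cid: int, request_id: int, payload: bytes):
        """Return a decoded record once it is complete, otherwise None."""
        if cid == 0x49:  # error records are never split
            ts, event, value = struct.unpack_from("<III", payload)
            return {"kind": "error", "timestamp_ms": ts,
                    "event_code": event, "value": value}
        if cid not in FRAGMENTS:
            return None  # includes 0x4a buffers, see note above
        kind, position = FRAGMENTS[cid]
        if position == "start":
            self.pending[request_id] = bytearray(payload)
            return None
        if position == "middle":
            if request_id in self.pending:  # drop fragments whose start was missed
                self.pending[request_id] += payload
            return None
        # "end" CID: either the last fragment or a complete non-split record
        record = self.pending.pop(request_id, bytearray()) + payload
        ts, event = struct.unpack_from("<II", record)
        body = bytes(record[8:])
        if kind == "u32":
            usable = len(body) - len(body) % 4
            values = [v for (v,) in struct.iter_unpack("<I", body[:usable])]
        else:
            # empty strings are dropped for simplicity
            values = [s.decode("ascii", "replace") for s in body.split(b"\0") if s]
        return {"kind": kind, "timestamp_ms": ts,
                "event_code": event, "values": values}
```

If the start of a split record was missed, the "end" fragment gets decoded as if it were a complete record, so its header fields will be garbage in that case.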

TC 09: Firmware update

02  {} => { u8 numarrayentries, u8 0, u8 0, u8 4, { u32 version, u8 location+flags, u8 dest_id, u16 0x2e }[7] } get flash status
03  { u8 ?, u8 ?, u8 dest, u8 cookie?, u32 fwversion, u8 1, u8 ?, u8 ?, u8 ?, u8 ?, u8 ?, u16 0x2e } => { u8 0, u8 0, u8 0, u8 cookie, u8 0, u8 0, u8 0, u8 0, u8 ?, u8 0, u8 0, u8 0, u8 ?, u8 0, u8 0, u8 0, u8 0 } firmware upload setup
04  { u8 flags, u8 len, u16 cookie?, u32 offset, u8[] data } => { u16 cookie, u8 error, u8[13] 0 } firmware upload, flags: 0x80 = start, 0x40 = finish
80  {} switch active firmware location
a0,a1,a2

Requires debug mode 3 or 4:

09
0a
0b
0c

Firmware destinations:

  • 0 = SAM firmware
  • 0x12 = USB-C PD firmware?
  • 0xfe = two 32-byte buffers?

TC 0C: TCL

0b  (see existing doc) { u16 bufid } => { u16 bufid, u8 instanceid?, u8 flags? } erase?
0c  (see existing doc) { u16 bufid, u32 offset, u16 readlen, u8 0 } => { u16 bufid, u32 offset, u16 len, u8 status, u8[] } read, max len = 560, status: 0 = more, 1 = end, 0xfd/0xfe/0xff = error
0d  { u16 bufid } => { u16, u8 error } erase things? disabled in safe mode
0e  (see existing doc) {} => { u16 0xffff, { u16, u8 }[8] }
85  {} => { u16 0xffff, u8 0 } call CID 0D for all buffers
86  {} => { u16 0xffff, u8 0 } does something with buffers 2 and 3

Valid buffer/instance combinations:

  • Buffer 1, instance 1/2 = Crash dump
  • Buffer 2, instance 1/2 = ?
  • Buffer 3, instance 1/2 = ?
  • Buffer 4, instance 1 = Battery?
  • Buffer 5, instance 1 = Blades?
  • Buffer 6, instance 1 = Thermal?
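For reading one of these buffers with CID 0c, the request and response layouts listed above could be packed and unpacked roughly like this. Little-endian packing without padding is assumed, the helper names are hypothetical, and how the instance ID is conveyed is not covered here.

```python
import struct

# { u16 bufid, u32 offset, u16 len, u8 } is shared by request and response header.
TCL_READ = struct.Struct("<HIHB")  # assumed little-endian, packed
MAX_READ_LEN = 560
ERROR_STATUSES = (0xFD, 0xFE, 0xFF)


def tcl_read_request(bufid: int, offset: int, readlen: int = MAX_READ_LEN) -> bytes:
    """Build the CID 0c request payload; the trailing byte is always 0."""
    if not 0 < readlen <= MAX_READ_LEN:
        raise ValueError("readlen must be between 1 and 560")
    return TCL_READ.pack(bufid, offset, readlen, 0)


def parse_tcl_read_response(payload: bytes):
    """Return (bufid, offset, data, done); done is True when status is 1 (end)."""
    bufid, offset, length, status = TCL_READ.unpack_from(payload)
    if status in ERROR_STATUSES:
        raise RuntimeError(f"TCL read failed with status {status:#x}")
    data = payload[TCL_READ.size:TCL_READ.size + length]
    return bufid, offset, data, status == 1
```

A full dump would loop, advancing `offset` by `len(data)` until `done` is true.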

Handled separately (data seems to be all zeroes?):

0f  {} => { u16 error, u32 } get something
10  {} => { u16 error, u32 } get something
11  {} => { u16 error, u32 } get something
12  {} => { u16 error, u32 } get something
13  {} => { u16 error, u32 } get something
14  {} => { u16 error, u32[16] } get something
90  {} => { u16 error, u16 size } get total buffer size for command 91/92
92  { u16 pos, u16 len } => { u16 pos, u16 len, u16 error, u8[] data } read some buffer

Handled separately (setters for above):

80  { u32 } => { u16 error } set something
81  { u32 } => { u16 error } set something
82  { u32 } => { u16 error } set something
8a  { u32 } => { u16 error } set something
8c  { u32 } => { u16 error } set something
8e  { u32[16] } => { u16 error } set something
91  { u16 pos, u16 len, u16 unused, u8[] data } => { u16 pos, u16 len, u16 error, u8[] data } write some buffer

TC 0D: Surflink

(TODO The command decoding seems different here. Maybe not CIDs?)

02
03
06
0c

TC 10: Surface Blades

01
02
03
04
05
06
07
08
0a
0c

Handled separately:
(TODO The command decoding seems different here?)

00
0e
10
15
23
2e
33
34
5a
5b

TC 12: Sensors

Instance ids:

  • 5 = BMA223 accelerometer on I2C

03  {} => { ... } read calib
80  { ... } => { ... } read registers
81  { ... } => { ... } write registers
82  {} => { ... } read sensor values
83  {} => { ... } reset default calib
84  { ... } => { ... } set calib

TC 13: SRQ

02  {} => { u8 1, u8 cid, u8 datalen = 32, u32 status, u32 garbage?, u8[32] data }
03  {} => { u8 1, u8 cid, u8 datalen = 1, u32 status, u32 garbage?, u8 safemodeflag } get safe mode flag?
04  {} => { u8 1, u8 cid, u8 datalen = 4, u32 status, u32 garbage?, u16 ?, u16 ? }
05  {} => { u8 1, u8 cid, u8 datalen = 1, u32 status, u32 garbage?, u8 safemodeflag } set safe mode flag?

TC 15: HID

00
01  (see existing doc)
02  (see existing doc)
03  (see existing doc)
04  (see existing doc)

TC 17: Backlight

02
03
04,87
05,88

TC 1B: USB-C

00
06,80
81
82
83

Handled separately:

04
05
