docs/design-documents/platform/crash-reporting/crash_reporting.md: 77 additions & 11 deletions
@@ -26,7 +26,7 @@
### Overview and background
-MbedOS currently implements error/exception handlers which gets invoked when the system encounters a fatal error/exception. The error handler capture information such as register context/thread info etc and these are valuable information required to debug the issue later. This information is currently printed over the serial port, but in many cases the serial port is not accessible and the serial terminal log is not captured, particularly in the case of field deployed devices. We cannot send this information using mechanisms like Network because the state of the system might be unstable after the fatal error. And thus a different mechanism is needed to record and report this data. So, if we can auto-reboot the system after a fatal error has occured, without losing the RAM contents where we have the error information collected, we can send this information over network or other interfaces to be logged externally(E.g:- ARM Pelion cloud) or can even be written to file system if required.
+MbedOS currently implements error/exception handlers which get invoked when the system encounters a fatal error/exception. The error handler captures information such as register context and thread info, which is valuable for debugging the issue later. This information is currently printed over the serial port, but in many cases the serial port is not accessible and the serial terminal log is not captured, particularly in the case of field-deployed devices. We cannot send this information using mechanisms like the network because the state of the system might be unstable after the fatal error, so a different mechanism is needed to record and report this data. If we can auto-reboot the system after a fatal error has occurred, without losing the RAM contents where the error information is collected, we can send this information over the network or other interfaces to be logged externally (e.g. ARM Pelion cloud), or even write it to the file system if required.
### Requirements and assumptions
@@ -38,7 +38,7 @@ Below are the high-level design goals for "Crash Reporting" feature:
**Error information collection including exception context**
-The current error handling implementation in Mbed OS already collects error and exception context. With this feature the above mentioned data structures should be placed in an uninitialized RAM region so that the data is retained after an auto-reboot(warm-reset).
+The current error handling implementation in MbedOS already collects error and exception context. With this feature, these data structures should be placed in an uninitialized RAM region so that the data is retained after an auto-reboot (warm-reset).
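As a rough illustration of the placement described above, a minimal C sketch might look like the following. The section name `.noinit`, the `crash_report_t` layout, the `CRASH_REPORT_NOINIT` macro and the `crash_report_ram()` accessor are all assumptions made for illustration, not the MbedOS implementation. The target's linker script would also need to map that section to RAM that the startup code neither zeroes nor initializes.

```C
#include <stdint.h>

/* Hypothetical macro: on GCC-compatible ARM toolchains, place the object in a
 * section that the startup code leaves untouched. On other hosts it expands
 * to nothing so that the sketch still compiles. */
#if defined(__GNUC__) && defined(__arm__)
#define CRASH_REPORT_NOINIT __attribute__((section(".noinit")))
#else
#define CRASH_REPORT_NOINIT
#endif

/* Hypothetical layout of the retained crash report. */
typedef struct {
    uint32_t error_status;  /* error code recorded by the error handler */
    uint32_t reboot_count;  /* number of reboots caused by fatal errors */
    uint32_t crc;           /* checksum over the fields above */
} crash_report_t;

/* No initializer on purpose: an initializer would move the object into
 * initialized data, which the startup code overwrites on every boot. */
static crash_report_t crash_report CRASH_REPORT_NOINIT;

/* Accessor so that other modules do not depend on the storage details. */
crash_report_t *crash_report_ram(void)
{
    return &crash_report;
}
```

Because the region is never initialized, its contents after a cold boot are arbitrary, which is why the design pairs it with a CRC check before trusting the data.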
**Mechanism to auto reboot(also called warm-reset) the system without losing RAM contents where error info is stored**
@@ -84,7 +84,7 @@ Note that the actual location of the data should be carefully chosen without aff
### Mechanism to auto reboot(also called warm-reset) the system without losing RAM contents where error info is stored
-The current mbed_error() implementation should be modified to cause an auto-reboot at the end of error handling if this feature is enabled. The mechanism used for rebooting should make sure it doesn't cause a reset of RAM contents. This can be done by calling system_reset() function already implemented by MbedOS which cause the system to reboot without resetting the RAM. The mbed_error() implementation also should make sure it updates the error context stored in Crash-Report RAM with the right CRC value and it should also implement mechanism to track the reboot count caused by fatal errors. The below psuedo-code shows how the mbed_error() implementation should be modified.
+The current mbed_error() implementation should be modified to cause an auto-reboot at the end of error handling if this feature is enabled. The mechanism used for rebooting should make sure it doesn't reset the RAM contents. This can be done by calling the system_reset() function already implemented by MbedOS, which causes the system to reboot without resetting the RAM. The mbed_error() implementation should also make sure it updates the error context stored in Crash-Report RAM with the right CRC value, and it should implement a mechanism to track the reboot count caused by fatal errors. The pseudo-code below shows how the mbed_error() implementation should be modified.
//This is the case when we don't have a crash report already stored.
Update the location with new error information
Set Reboot count to 1
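The bookkeeping in the pseudo-code above can be sketched in plain C as follows. This is a host-side illustration under stated assumptions: `crash_report_t`, `crash_crc32()`, `crash_report_is_valid()` and `crash_report_record()` are hypothetical names, the CRC-32 variant is a stand-in for whatever CRC the platform provides, and the branch for an already stored report (increment the count) is an assumption, since only the "no report stored" branch is visible here.

```C
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the report kept in Crash-Report RAM. */
typedef struct {
    uint32_t error_status;
    uint32_t error_address;
    uint32_t reboot_count;
    uint32_t crc;            /* CRC-32 over all preceding fields */
} crash_report_t;

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), used as a stand-in. */
static uint32_t crash_crc32(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* A report is trusted only if its stored CRC matches its contents. */
static int crash_report_is_valid(const crash_report_t *r)
{
    return r->crc == crash_crc32((const uint8_t *)r,
                                 offsetof(crash_report_t, crc));
}

/* Would be called at the end of fatal error handling, before the reboot. */
void crash_report_record(crash_report_t *r, uint32_t status, uint32_t address)
{
    /* Assumption: a valid stored report means this is a repeated fatal
     * reboot, so the count is incremented; otherwise it starts at 1. */
    uint32_t count = crash_report_is_valid(r) ? r->reboot_count + 1u : 1u;

    r->error_status = status;
    r->error_address = address;
    r->reboot_count = count;
    r->crc = crash_crc32((const uint8_t *)r, offsetof(crash_report_t, crc));

    /* On target, the handler would now call system_reset(). */
}
```

Tracking the count in the same CRC-protected record lets the boot code detect a reboot loop without any additional persistent storage.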
@@ -118,21 +118,21 @@ The below APIs should be implemented.
The API below can be called by an application to retrieve the error context captured in the Crash-Report RAM. The error context is copied into the location pointed to by *error_info*. Note that the caller should allocate the memory for this location.
The function should return MBED_ERROR_NOT_FOUND if there is no error context currently stored.
The API below can be called by an application to retrieve the fault context captured in the Crash-Report RAM. The fault context is copied into the location pointed to by *fault_context*. Note that the caller should allocate the memory for this location, and that the fault context is valid only if the previous reboot was caused by an exception. Whether the previous reboot was caused by an exception can be determined from the error code stored in the error context information retrieved using the mbed_get_reboot_error_info() API above.
The function should return MBED_ERROR_NOT_FOUND if there is no fault context currently stored.
-```
+```C
//Call this function to retrieve the last reboot fault context
The API below can be called by an application to reset the error context captured in the Crash-Report RAM.
The function should return MBED_ERROR_NOT_FOUND if there is no error context currently stored.
-```
+```C
//Reset the reboot error context
mbed_error_status_t mbed_reset_reboot_error_info()
```
@@ -145,10 +145,10 @@ MbedOS initialization sequence should be modified as shown in below diagram to r

-Below should be the siganture of the callback for reporting the error information.
+Below should be the signature of the callback for reporting the error information.
The error handling system in MbedOS will call this callback function if it detects that the current reboot has been caused by a fatal error. This function will be defined with the MBED_WEAK attribute by default, and applications wanting to process the error report should override this function in the application implementation.
@@ -168,13 +168,79 @@ Crash reporting implementation should provide enough parameters to control diffe
# Usage scenarios and examples
-Below (pseudocode) are some common usage scenarios using the new error reporting APIs.
+Below (pseudo code) are some common usage scenarios using the new error reporting APIs.
### Implementing crash reporting callback
+In order to implement the callback, the user can override the default callback function (*mbed_error_reboot_callback()*), implemented with the MBED_WEAK attribute in the platform layer, as below.
The above function will be called during boot with a pointer to *error_context* structure.
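One possible shape for such an override is sketched below. The `mbed_error_ctx` definition here is a simplified stand-in so that the block compiles on its own; a real application would use the types from the MbedOS error headers instead. The `void` return type, the field names, and the `saved_error_ctx` and `callback_invocations` variables are assumptions for illustration.

```C
#include <stdint.h>

/* Simplified stand-ins (assumptions) so that the sketch is self-contained. */
typedef int mbed_error_status_t;

typedef struct {
    mbed_error_status_t error_status;   /* error code of the fatal error */
    uint32_t error_address;             /* where the error was raised */
    uint32_t error_reboot_count;        /* reboots caused by fatal errors */
} mbed_error_ctx;

/* Hypothetical application-side state filled in by the callback. */
static mbed_error_ctx saved_error_ctx;
static int callback_invocations = 0;

/* Application override of the weak default. It runs early during boot, so it
 * only copies the context and defers reporting to normal application code. */
void mbed_error_reboot_callback(mbed_error_ctx *error_context)
{
    if (!error_context) {
        return;
    }
    saved_error_ctx = *error_context;
    callback_invocations++;
}
```

Keeping the callback this small is a deliberate choice in the sketch: networking and file systems may not be up yet when it runs, so sending or storing the report is left to later application code.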
### Retrieving error info after reboot
+The captured error context can be retrieved using the mbed_get_reboot_error_info() API. See the code below
+for example usage of that API. In the example below, a status variable, reboot_error_detected, is used to track whether an error context was captured.
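A host-side sketch of this retrieval flow is shown here. To keep it self-contained, `mbed_get_reboot_error_info()` and `mbed_reset_reboot_error_info()` are replaced by stand-ins backed by a fake Crash-Report RAM, and the numeric values of `MBED_SUCCESS` and `MBED_ERROR_NOT_FOUND`, the `mbed_error_ctx` fields and the `check_last_reboot()` helper are assumptions. Setting `reboot_error_detected` from the return value is also an assumption about how that flag is maintained.

```C
#include <stdint.h>
#include <string.h>

/* --- Host-side stand-ins for the MbedOS API (assumptions) --- */
typedef int mbed_error_status_t;
#define MBED_SUCCESS          0
#define MBED_ERROR_NOT_FOUND  (-1)   /* placeholder value */

typedef struct {
    mbed_error_status_t error_status;
    uint32_t error_address;
    uint32_t error_reboot_count;
} mbed_error_ctx;

static mbed_error_ctx fake_crash_ram;    /* simulates Crash-Report RAM */
static int fake_crash_ram_valid = 0;     /* simulates a passing CRC check */

mbed_error_status_t mbed_get_reboot_error_info(mbed_error_ctx *error_info)
{
    if (!fake_crash_ram_valid || !error_info) {
        return MBED_ERROR_NOT_FOUND;
    }
    /* The caller owns the destination memory, as the design requires. */
    memcpy(error_info, &fake_crash_ram, sizeof *error_info);
    return MBED_SUCCESS;
}

mbed_error_status_t mbed_reset_reboot_error_info(void)
{
    if (!fake_crash_ram_valid) {
        return MBED_ERROR_NOT_FOUND;
    }
    fake_crash_ram_valid = 0;
    return MBED_SUCCESS;
}

/* --- Application side --- */
static int reboot_error_detected = 0;

/* Copies the stored error context into *out and clears it from the
 * Crash-Report RAM. Returns 1 if a context was found, 0 otherwise. */
int check_last_reboot(mbed_error_ctx *out)
{
    if (mbed_get_reboot_error_info(out) != MBED_SUCCESS) {
        reboot_error_detected = 0;
        return 0;
    }
    reboot_error_detected = 1;
    /* Reset so that the same report is not processed on the next boot. */
    mbed_reset_reboot_error_info();
    return 1;
}
```

An application could call `check_last_reboot()` once at startup and, if it returns 1, forward the copied context over the network or write it to a file system.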