Commit 50d1de9

hartmannathan authored and Alan Carvalho de Assis committed
Documentation: Import "Analyzing Cortex-M Hardfaults" from CWIKI
* Documentation/guides/cortexmhardfaults.rst: New. Migrated from [1]
  with conversion to reStructuredText, minor typo fixes, and a link to
  a Narkive archive of the original quoted question.

* Documentation/guides/index.rst: Add above to TOC.

[1] https://cwiki.apache.org/confluence/display/NUTTX/Analyzing+Cortex-M+Hardfaults
1 parent 0cadb0c commit 50d1de9

2 files changed: +204 -0 lines changed

Documentation/guides/cortexmhardfaults.rst

Lines changed: 203 additions & 0 deletions
@@ -0,0 +1,203 @@
=============================
Analyzing Cortex-M Hardfaults
=============================

.. epigraph::

   > I have a build of PX4 (NuttX 6.29 with some patches) with new
   > lpc43xx chip files on 4337 chip running from FLASH (master
   > vanilla NuttX has no such problem). This gives me a hardfault
   > below if I stress NSH console (UART2) with some big output.
   >
   > I read some threads but can't get a clue how to analyze the
   > dump and where to look first:
   >
   > 1bXXX and 1aXXX addresses are FLASH. 100XXX addresses are RAM

.. code-block:: console

   Assertion failed at file:armv7-m/up_hardfault.c line: 184 task: hpwork
   sp: 10001eb4
   IRQ stack:
   base: 10001f00
   size: 000003fc
   10001ea0: 1b02d961 1b03f07e 10001eb4 10005ed8 1a0312ab 1b03f600 000000b8 1b02d961
   10001ec0: 00000010 10001f40 00000003 00000000 1a03721d 1a037209 1b02d93b 00000000
   10001ee0: 1a0371f5 00000000 00000000 00000000 00000000 00000000 1a0314a5 10005d7c
   sp: 10005e50
   User stack:
   base: 10005ed8
   size: 00000f9c
   10005e40: 00000000 00000000 00000000 1b02d587 10004900 00000000 005b8d7f 00000000
   10005e60: 1a030f2e 00000000 00000000 00001388 00000000 00000005 10001994 00000000
   10005e80: 00000000 00000000 00000000 1b02c359 00000000 00000000 00000000 004c4b40
   10005ea0: 000002ff 00000000 00000000 1a030f2f 00000000 00000000 00000000 00000000
   10005ec0: 00000000 1a030f41 00000000 1b02c2a5 00000000 00000000 ffffffff 00bdeb39
   R0: ffffffff 00000000 00000016 00000000 00000000 00000000 00000000 00000000
   R8: 100036d8 00000000 00000000 004c4b40 10001370 10005e50 1b02b20b 1b02d596
   xPSR: 41000000 BASEPRI: 00000000 CONTROL: 00000000
   EXC_RETURN: ffffffe9

This question was asked in the old Yahoo! Group for NuttX, before the
project joined the Apache Software Foundation. The old forum no longer
exists, but the thread has been archived at
`Narkive <https://nuttx.yahoogroups.narkive.com/QNbG3r5l/hardfault-help-analysing-where-to-start>`_
(third party external link).

Analyzing the Register Dump
===========================

First, in the register dump:

.. code-block:: console

   R0: ffffffff 00000000 00000016 00000000 00000000 00000000 00000000 00000000
   R8: 100036d8 00000000 00000000 004c4b40 10001370 10005e50 1b02b20b 1b02d596
   xPSR: 41000000 BASEPRI: 00000000 CONTROL: 00000000

``R15`` is the PC at the time of the crash (``1b02d596``). To see
where this is in the code, I do this:

.. code-block:: console

   arm-none-eabi-objdump -d nuttx | vi -

Of course, you can use any editor you prefer. In any case, this will
provide a full assembly language listing of your FLASH content along
with complete symbolic information.
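
If paging through the whole listing is unwieldy, objdump can be limited
to a window around the address of interest with its ``--start-address``
and ``--stop-address`` options. The following is a minimal sketch that
only builds such a command line. The helper name
``objdump_window_cmd`` and the 0x40-byte window are assumptions made
for illustration; the toolchain prefix and the ``nuttx`` ELF file name
are taken from the command above.

```python
import shlex


def objdump_window_cmd(elf, addr, span=0x40, tool="arm-none-eabi-objdump"):
    """Return an objdump argument list covering addr - span .. addr + span."""
    start = max(addr - span, 0)
    stop = addr + span
    return [
        tool,
        "-d",                           # disassemble, as in the command above
        f"--start-address={start:#x}",  # first address to disassemble
        f"--stop-address={stop:#x}",    # disassembly stops before this address
        elf,
    ]


if __name__ == "__main__":
    # PC (R15) taken from the register dump.
    print(shlex.join(objdump_window_cmd("nuttx", 0x1B02D596)))
```

The window may begin in the middle of a function, or even in the
middle of a 32-bit instruction, so treat the first few decoded lines
with some suspicion.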

**TIP:** Not comfortable with ARM assembly language? Try the
``objdump --source`` (or just ``-S``) option. That will intermix the C
and the assembly language code so that you can see which C statements
the assembly language is implementing.

Once you have the FLASH image in the editor, it is a simple matter to
search for the instruction at ``1b02d596``. The symbolic information
will show you exactly which function the address is in, and the
context of the instruction can be used to associate it with the exact
line of code in the original C source file.

You also have all of the register contents so it is pretty easy to see
what happened (assuming you have some basic knowledge of Thumb-2
assembly language and the ARM EABI). But it is usually not so easy to
see why it happened.

The rest of the instructions apply to finding out why the fault
happened.

``R14`` often contains the return address to the caller of the
offending function. Bit 0 is set in this return address, but ignore
that (i.e., use ``1b02b20a`` instead of ``1b02b20b``). Use the objdump
command above to see where that is.
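
The bit-clearing step is trivial, but it is easy to forget when
copying addresses by hand. A minimal sketch follows; the function name
``lookup_address`` is made up for this example.

```python
def lookup_address(saved_lr):
    """Clear bit 0 (the Thumb state bit) of a saved return address."""
    return saved_lr & ~1


if __name__ == "__main__":
    r14 = 0x1B02B20B  # R14 from the register dump
    print(f"{lookup_address(r14):08x}")
```

The result is the address to search for in the objdump listing.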

Sometimes, however, ``R14`` is not the caller of the offending
function. If the offending function calls some other function, then
``R14`` will be overwritten. But no problem: the offending function
will then have pushed its return address on the stack, where we can
find it by analyzing the stack dump.

Analyzing the Stack Dump
========================

The Task Stack
--------------

To go further back in time, you have to analyze the stack. It is a
push-down stack, so older events are at higher stack addresses; the
most recent things that happened will be at lower stack addresses.

Analyzing the stack is done in basically the same way:

1. Start at the highest stack addresses (oldest) and work forward in
   time (lower addresses),

2. Find interesting addresses,

3. Use ``arm-none-eabi-objdump`` to determine where those addresses
   are in the code.

An interesting address has these properties:

1. It lies in FLASH on your architecture. In your case these are the
   addresses that begin with ``0x1a`` and ``0x1b``. Other
   architectures may have different FLASH addresses or even addresses
   in RAM.

2. The interesting addresses are all odd for Cortex-M, that is, bit 0
   will be set. This is because, as the code progresses, the return
   address (``R14``) will be pushed on the stack. All of the return
   addresses will lie in FLASH and will be odd.

Even-valued FLASH addresses in the stack dump are usually references
to ``.rodata`` in FLASH, but are sometimes of interest as well. Below
are examples of interesting addresses (in brackets):

.. code-block:: console

   sp: 10005e50
   User stack:
   base: 10005ed8
   size: 00000f9c
   10005e40: 00000000 00000000 00000000 [1b02d587] 10004900 00000000 005b8d7f 00000000
   10005e60: 1a030f2e 00000000 00000000 00001388 00000000 00000005 10001994 00000000
   10005e80: 00000000 00000000 00000000 [1b02c359] 00000000 00000000 00000000 004c4b40
   10005ea0: 000002ff 00000000 00000000 [1a030f2f] 00000000 00000000 00000000 00000000
   10005ec0: 00000000 [1a030f41] 00000000 [1b02c2a5] 00000000 00000000 ffffffff 00bdeb39

That will give the full backtrace up to the point of the failure.
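
Picking out the bracketed words by eye gets tedious on a deep stack.
The following is a minimal sketch of the same skimming procedure,
assuming the dump format shown above and the FLASH ranges of this
particular example (top byte ``0x1a`` or ``0x1b``). The function names
are made up for illustration, and the filter only finds candidates; it
cannot tell a real return address from a stale value left on the stack.

```python
import re

# Assumption from the text: FLASH addresses begin with 0x1a or 0x1b.
FLASH_TOP_BYTES = {0x1A, 0x1B}

# "10005e40: 00000000 [1b02d587] ..." (brackets around a word are optional)
DUMP_LINE = re.compile(r"^\s*([0-9a-fA-F]{8}):((?:\s+\[?[0-9a-fA-F]{8}\]?)+)\s*$")


def parse_dump(text):
    """Yield (stack_address, word) for every word in a hex stack dump."""
    for line in text.splitlines():
        match = DUMP_LINE.match(line)
        if not match:
            continue  # skips "sp:", "base:", "size:" and similar lines
        base = int(match.group(1), 16)
        for index, token in enumerate(match.group(2).split()):
            yield base + 4 * index, int(token.strip("[]"), 16)


def is_interesting(word):
    """True for odd (Thumb bit set) words that point into FLASH."""
    return (word >> 24) in FLASH_TOP_BYTES and (word & 1) == 1


def candidate_return_addresses(text):
    """Return (stack_address, word) pairs, oldest (highest address) first."""
    hits = [(addr, word) for addr, word in parse_dump(text) if is_interesting(word)]
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits
```

Each word returned still has bit 0 set, so clear it before searching
for the address in the objdump listing.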

The Interrupt Stack
-------------------

Note that in some cases there are two stacks listed. The interrupt
stack will be present if (1) the interrupt stack is enabled, and (2)
you are in an interrupt handler at the time that the failure occurred:

.. code-block:: console

   Assertion failed at file:armv7-m/up_hardfault.c line: 184 task: hpwork
   sp: 10001eb4
   IRQ stack:
   base: 10001f00
   size: 000003fc
   10001ea0: [1b02d961] 1b03f07e 10001eb4 10005ed8 1a0312ab 1b03f600 000000b8 [1b02d961]
   10001ec0: 00000010 10001f40 00000003 00000000 [1a03721d] [1a037209] [1b02d93b] 00000000
   10001ee0: [1a0371f5] 00000000 00000000 00000000 00000000 00000000 [1a0314a5] 10005d7c

(Interesting addresses again in brackets).

The interrupt stack is sometimes interesting, for example when the
fault was caused by logic operating at the interrupt level. In
this case, it is probably not so interesting since the fault was
probably caused by normal task code and the interrupt stack probably
just shows the normal operation of the interrupt handling logic.
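
One way to cross-check that guess is to decode the ``xPSR`` and
``EXC_RETURN`` values from the dump. The sketch below follows the
ARMv7-M bit layout and assumes that the ``xPSR`` printed in the dump is
the value saved when the fault was taken; the function names are made
up for illustration.

```python
def decode_xpsr(xpsr):
    """Split an ARMv7-M xPSR value into flags, Thumb bit and exception number."""
    return {
        "N": bool(xpsr & (1 << 31)),
        "Z": bool(xpsr & (1 << 30)),
        "C": bool(xpsr & (1 << 29)),
        "V": bool(xpsr & (1 << 28)),
        "Q": bool(xpsr & (1 << 27)),
        "T": bool(xpsr & (1 << 24)),
        "exception": xpsr & 0x1FF,  # IPSR field: 0 means Thread mode
    }


def decode_exc_return(value):
    """Decode the low bits of an ARMv7-M EXC_RETURN value."""
    return {
        "thread_mode": bool(value & (1 << 3)),    # clear: return to Handler mode
        "process_stack": bool(value & (1 << 2)),  # clear: return using the main stack
        "fp_frame": not (value & (1 << 4)),       # bit 4 clear: extended (FP) frame
    }


if __name__ == "__main__":
    print(decode_xpsr(0x41000000))        # xPSR from the dump
    print(decode_exc_return(0xFFFFFFE9))  # EXC_RETURN from the dump
```

For the values in this dump, the exception number is 0 and
``EXC_RETURN`` selects a return to Thread mode, which agrees with the
idea that task code, not an interrupt handler, took the fault.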

Full Stack Analysis
-------------------

What I have proposed here is just skimming through the stack, finding
and interpreting interesting addresses. Sometimes you need more
information and you need to analyze the stack in more detail. That is
also possible because every word on the stack is there because of an
explicit push instruction in the code (usually a ``push`` instruction
on Cortex-M or an ``stmdb`` instruction in other ARM architectures).
This is painstaking work but can also be done to provide a more
detailed answer to "what happened?"

Recovering State at the Time of the Hardfault
=============================================

Here is another tip from Mike Smith:

.. epigraph::

   "... for systems like NuttX where catching hardfaults is difficult,
   you can recover the faulting PC, LR and SP (by examining the
   exception stack), then write these values back into the appropriate
   processor registers (adjust the PC as necessary for the fault).

   "This will put you back in the application code at the point at
   which the fault occurred. Some local variables will show as having
   invalid values (because at the time of the fault they were live in
   registers and have been overwritten by the exception handler), but
   the stack frame, function arguments etc. should all show correctly."
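
As a rough illustration of the "examining the exception stack" step,
the sketch below unpacks a plain ARMv7-M hardware-stacked exception
frame (R0-R3, R12, LR, PC, xPSR) and computes the SP that was in effect
before the exception. This is an assumption-laden sketch: NuttX saves
further registers in software using its own layout, which is not
modeled here, and the names ``ExceptionFrame`` and ``unstack`` are made
up for this example.

```python
from collections import namedtuple

ExceptionFrame = namedtuple("ExceptionFrame", "r0 r1 r2 r3 r12 lr pc xpsr sp")

BASIC_FRAME_BYTES = 0x20     # 8 words: R0-R3, R12, LR, PC, xPSR
EXTENDED_FRAME_BYTES = 0x68  # basic frame + S0-S15, FPSCR and one reserved word


def unstack(frame_addr, words, extended=False):
    """Recover registers and the pre-exception SP from a hardware-stacked frame.

    frame_addr is the address of the stacked R0; words holds at least the
    eight 32-bit values found there, lowest address first.
    """
    if len(words) < 8:
        raise ValueError("need at least 8 stacked words")
    r0, r1, r2, r3, r12, lr, pc, xpsr = words[:8]
    sp = frame_addr + (EXTENDED_FRAME_BYTES if extended else BASIC_FRAME_BYTES)
    if xpsr & (1 << 9):
        sp += 4  # hardware inserted a padding word to align the stack
    return ExceptionFrame(r0, r1, r2, r3, r12, lr, pc, xpsr, sp)
```

With the values in hand, a debugger such as GDB can write them back
with ``set $sp = ...``, ``set $lr = ...`` and ``set $pc = ...``.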

Documentation/guides/index.rst

Lines changed: 1 addition & 0 deletions
@@ -18,3 +18,4 @@ Guides
    customapps.rst
    zerolatencyinterrupts.rst
    nestedinterrupts.rst
+   cortexmhardfaults.rst
