1+ #show link: underline
12
3+ To introduce the assembly and C++ concepts needed to understand functions
4+ which return data, we will start by analyzing the assembly code for
5+ one of the most classic function types in all of programming: the getter.
26
3- Hi mom
7+ A "getter" is a function which simply returns a value.
8+ Getters are common in object-oriented programming languages, where they prevent
9+ unintended use of values that only a particular class or file should
10+ manage, and help create an easy-to-understand interface to those values.
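
As a minimal sketch of that idea, the class below keeps its value private and exposes it only through a getter. The names `Counter`, `Increment`, and `GetCount` are hypothetical and do not come from the BfBB code base.

```cpp
// Hypothetical example: not taken from the BfBB decomp project.
class Counter
{
public:
    Counter() : mCount(0) {}

    // The only way to change the value from outside the class.
    void Increment() { ++mCount; }

    // The getter: simply returns the stored value.
    int GetCount() const { return mCount; }

private:
    int mCount; // cannot be read or written directly by other code
};
```

Code outside the class can read the count through `GetCount()`, but it cannot assign to `mCount` directly.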
11+
12+ Here's an example of a "getter" function from `xLightKit.cpp` in the BfBB
13+ decomp project, which will be the subject of this section:
14+ ```cpp
15+ xLightKit* xLightKit_GetCurrent(RpWorld* world)
16+ {
17+     return gLastLightKit;
18+ }
19+ ```
20+
21+ We will break this function down into its components shortly, but what
22+ you should notice is that the function body consists of a single C++ statement:
23+ return the value stored in `gLastLightKit`. Studying the assembly
24+ that goes into this simple operation will help us understand what is
25+ happening in more complex functions which also return data.
26+
27+ Let's start by creating our initial declaration using Objdiff.
28+
29+ === Creating the Function Declaration
30+
31+ Let's say we have opened up Objdiff and we want to work on the last
32+ missing function in the file `xLightKit.cpp`, `xLightKit_GetCurrent`:
33+
34+ #image("return_imgs/filelist.png")
35+
36+ We can use Objdiff's symbol mapping capabilities to get our declaration started
37+ by right-clicking the function we want to work on and copying the
38+ demangled, mapped function name (the second item in the popup menu):
39+ #image("return_imgs/objdiff_demangle.png")
40+
41+ Doing that, we know that our declaration should start to look like this:
42+ ```cpp
43+ ... xLightKit_GetCurrent(RpWorld*);
44+ ```
45+
46+ However, we are still missing the required return type for this declaration.
47+ Let's inspect the PS2 DWARF data to see if it's any help here:
48+ ```cpp
49+ /*
50+ Compile unit: C:\SB\Core\x\xLightKit.cpp
51+ Producer: MW MIPS C Compiler
52+ Language: C++
53+ Code range: 0x00305BD0 -> 0x00305BD8
54+ */
55+ // Range: 0x305BD0 -> 0x305BD8
56+ class xLightKit * xLightKit_GetCurrent() {
57+     // Blocks
58+     /* anonymous block */ {
59+         // Range: 0x305BD0 -> 0x305BD8
60+     }
61+ }
62+ ```
63+
64+ We're in luck! There is DWARF data for this function.
65+ From it, we can see that `xLightKit_GetCurrent`
66+ has a return type of `class xLightKit*`. We can apply that
67+ to our working function declaration:
68+ ```cpp
69+ class xLightKit* xLightKit_GetCurrent(RpWorld*);
70+ ```
71+
72+ Note that the DWARF data does not show that this function had any parameters
73+ on the PS2 version of the game. We will see soon that the function does not actually
74+ make use of this parameter, but it should be included nonetheless because Objdiff
75+ tells us that it is present in the GCN version of the code, which is our single Source
76+ of Truth. It may be that some debugging code excluded from the GCN release build
77+ of the game made use of this parameter, but it's not possible to say for sure.
78+
79+ There are some final refinements we can make to this declaration.
80+ `class xLightKit*` uses what is known as an _elaborated type specifier_, which is
81+ needed only when the type name is hidden by another name, such as a local variable. Since in practice
82+ this is very rare, we can drop the `class` keyword and keep just the type name:
83+ ```cpp
84+ xLightKit* xLightKit_GetCurrent(RpWorld*);
85+ ```
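
To see the rare case where the `class` keyword does matter, here is a small sketch in which a local variable hides a type name. The `Light` class and the `SumWithShadowedName` function are hypothetical names invented for this illustration.

```cpp
// Hypothetical example: Light and SumWithShadowedName are illustrative only.
class Light
{
public:
    int intensity;
};

int SumWithShadowedName()
{
    Light first;         // the plain type name works here
    first.intensity = 3;

    int Light = 7;       // a local variable now hides the type name

    class Light second;  // the elaborated type specifier still names the type
    second.intensity = 5;

    return first.intensity + second.intensity + Light;
}
```

Once the variable `Light` is declared, writing `Light second;` would no longer compile, while `class Light second;` does. Nothing in `xLightKit_GetCurrent` hides the name `xLightKit`, so the keyword is optional there.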
86+
87+ As a final touch, let's give our unused `RpWorld*` parameter a descriptive name:
88+ ```cpp
89+ xLightKit* xLightKit_GetCurrent(RpWorld* world);
90+ ```
91+
92+ And there is our function declaration! We can now pop this into `xLightKit.h` in
93+ the appropriate spot:
94+ ```cpp
95+ // xLightKit.h
96+ ...
97+
98+ xLightKit* xLightKit_Prepare(void* data);
99+ void xLightKit_Enable(xLightKit* lkit, RpWorld* world);
100+ xLightKit* xLightKit_GetCurrent(RpWorld* world); // Our new declaration!
101+ void xLightKit_Destroy(xLightKit* lkit);
102+
103+ ...
104+ ```
105+
106+ === Calling Conventions and Returning a Register Value
107+
108+ Now that we've got our function declaration set up, we can add our implementation stub
109+ to `xLightKit.cpp`:
110+ ```cpp
111+ xLightKit* xLightKit_GetCurrent(RpWorld* world)
112+ {
113+     // TODO: Implement me
114+ }
115+ ```
116+
117+ Attempting to build at this point yields the following compiler error:
118+ ```
119+
120+ User break, cancelled...
121+ # File: src\SB\Core\x\xLightKit.cpp
122+ # ------------------------------------
123+ # 122: }
124+ # Error: ^
125+ # return value expected
126+ # Too many errors printed, aborting program
127+ ninja: build stopped: subcommand failed.
128+ ```
129+
130+ As the compiler says, a return value is expected for this function, but our stub does not
131+ return one yet, so the build fails.
132+
133+ Let's begin analyzing the assembly code for this function so that we can
134+ return the value the compiler expects!
135+ ```
136+ 0: lwz r3, gLastLightKit@sda21
137+ 4: blr
138+ ```
139+
140+ In the prior case study, we learned that the `blr` instruction exits an
141+ assembly subroutine by jumping back to the address stored in the link
142+ register, which lies in the calling subroutine. Its C++ equivalent is the plain
143+ ```cpp
144+ return;
145+ ```
146+
147+ However, this is not enough to return a value. To do that, we will need to
148+ understand a bit about assembly function calling conventions.
149+
150+ While high-level programming languages like C++ or Java let you clearly state what a
151+ function returns and what parameters it receives, assembly language has no
152+ built-in way to pass information between functions. Instead, functions must use CPU
153+ registers or program memory to share data. To make sure all functions agree on how to do this,
154+ platform and compiler designers define *calling conventions*: rules that say where
155+ to put things like parameters and return values when calling and returning from a function.
156+
157+ The PowerPC CPU architecture gives user programs access to two types of registers:
158+ general-purpose and floating-point registers #footnote("https://datasheets.chipdb.org/IBM/PowerPC/Gekko/gekko_user_manual.pdf").
159+ There are 32 registers of each type (`r0` to `r31` and `f0` to `f31`), although calling conventions
160+ assign specific roles to some of them.
161+
162+ By calling convention, parameters to an assembly function (called a subroutine) are passed
163+ in registers: integer and pointer values starting at `r3`, floating-point values starting at `f1`. For example, the C function:
164+ ```cpp
165+ void exampleFunction(U8 id, S32 isSpunch, F32 dt, F64 yPos);
166+ ```
167+
168+ would have the following values in registers at the beginning of its assembly subroutine
169+ analogue:
170+ ```
171+ r3 - id
172+ r4 - isSpunch
173+ f1 - dt
174+ f2 - yPos
175+ ```
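
The assignment rule above can be sketched as a small helper that walks a parameter list and hands out register names. This is a simplified model rather than a full description of the calling convention: `ParamKind` and `AssignRegisters` are hypothetical names, the code uses modern C++ for brevity, and it ignores what happens once the parameter registers run out.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the rule described above; not part of any real toolchain.
enum class ParamKind { Integer, Float };

// Returns the register each parameter would arrive in, in declaration order.
// Simplification: assumes there are always enough registers available.
std::vector<std::string> AssignRegisters(const std::vector<ParamKind>& params)
{
    std::vector<std::string> registers;
    int nextGeneral = 3; // integer and pointer parameters start at r3
    int nextFloat = 1;   // floating-point parameters start at f1

    for (ParamKind kind : params)
    {
        if (kind == ParamKind::Float)
        {
            registers.push_back("f" + std::to_string(nextFloat++));
        }
        else
        {
            registers.push_back("r" + std::to_string(nextGeneral++));
        }
    }
    return registers;
}
```

Feeding it the parameter kinds of `exampleFunction` (integer, integer, float, float) gives `r3`, `r4`, `f1`, `f2`, matching the listing above. The two counters advance independently, which is why `dt` lands in `f1` even though it is the third parameter.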
176+
177+ Now for return values. By calling convention, a value to be returned to the calling subroutine
178+ is stored in `f1` if it is a floating-point value, or in `r3` otherwise.
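
As a rough illustration of that rule, the getters below are annotated with the register their result would occupy when compiled for PowerPC under the convention just described. The variables and function names are hypothetical, and the comments cover only small integer, pointer, and floating-point results.

```cpp
// Hypothetical globals and getters, used only to illustrate the return rule.
static int sLives = 3;
static float sSpeed = 1.5f;

// Integer result: returned in r3.
int GetLives() { return sLives; }

// Floating-point result: returned in f1.
float GetSpeed() { return sSpeed; }

// Pointer result: not a float, so it is also returned in r3.
int* GetLivesPointer() { return &sLives; }
```

`xLightKit_GetCurrent` returns a pointer, so it falls into the same category as the last function here.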
179+
180+ Knowing this, we can now see how the instruction before `blr` sets up the
181+ return value. That instruction:
182+ ```
183+ 0: lwz r3, gLastLightKit@sda21
184+ ```
185+ loads the value of the global variable `gLastLightKit` into the `r3` register (`lwz` stands for "load word and zero"). Immediately following
186+ this, the `blr` instruction is executed, which exits the subroutine.
187+
188+ Since we know that the assembly code loads `gLastLightKit` into the `r3` register just before
189+ returning from the subroutine, we know that the decompiled C++ equivalent returns
190+ the value of `gLastLightKit`. We can achieve that with the following return statement:
191+ ```cpp
192+ return gLastLightKit;
193+ ```
194+
195+ Combining everything we've done up to this point, we get what we set out to decompile:
196+ ```cpp
197+ xLightKit* xLightKit_GetCurrent(RpWorld* world)
198+ {
199+     return gLastLightKit;
200+ }
201+ ```
202+
203+ Et voila - building this via Objdiff will now produce a 100% match!
204+ #image("return_imgs/final_comparison.png")
205+
206+ Hopefully you're beginning to feel a little more familiar with some assembly
207+ programming constructs by this point. In this decompilation exercise,
208+ we've learned the following:
209+ - How to use the PS2 DWARF data to validate function return types
210+ - PowerPC calling conventions for function parameters and return values
211+ - How to use calling conventions when decompiling functions that return a value
212+
213+ In the next exercise, we will continue by introducing functions which include
214+ branching and conditional operations.