Commit 88d558d

"Add return.typ with image assets"
1 parent a2a2f3d commit 88d558d

5 files changed

+217
-6
lines changed

src/decomp-guide/main.typ

Lines changed: 5 additions & 5 deletions
@@ -2,10 +2,10 @@
 #let functions = (
 ("The Simplest Function: Nothing", <fn.nothing>, "nothing.typ"),
 ("Returning Values", <fn.return>, "return.typ"),
-("A Function with Some Logic", <fn.small>, "return.typ"),
-("A Function with a For Loop", <fn.for>, "return.typ"),
-("A Function with a Switch Statement", <fn.switch>, "return.typ"),
-("A Large Function", <fn.large>, "return.typ"),
+// ("A Function with Some Logic", <fn.small>, "return.typ"),
+// ("A Function with a For Loop", <fn.for>, "return.typ"),
+// ("A Function with a Switch Statement", <fn.switch>, "return.typ"),
+// ("A Large Function", <fn.large>, "return.typ"),
 )


@@ -28,4 +28,4 @@ C++ language features into assembly language.
 == #i. #value.at(0) #value.at(1)
 #include value.at(2)
 #(i += 1)
-]
+]

src/decomp-guide/return.typ

Lines changed: 212 additions & 1 deletion
@@ -1,3 +1,214 @@
1+
#show link: underline
12

3+
To introduce the assembly/C++ concepts required to understand functions
4+
which return data, we will start by analyzing the assembly code for
5+
one of the most classic function types in all of programming: the getter.
26

3-
Hi mom
7+
A "getter" function is a function which simply returns a value.
8+
It is often used in Object-Oriented Programming languages to prevent
9+
unintended usage of values that only a particular class or file should
10+
be using, and helps create an easy-to-understand interface to those values.
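As a minimal sketch of the general pattern, here is what a getter might look like inside a C++ class. The `Counter` class and its members are hypothetical and are not part of the BfBB code base:

```cpp
// Hypothetical example class; not part of the BfBB decomp project.
class Counter
{
public:
    // Getter: exposes the current count without allowing outside writes.
    int GetCount() const { return mCount; }

    // The only sanctioned way to change the value.
    void Increment() { ++mCount; }

private:
    int mCount = 0; // hidden from code outside the class
};
```

Code outside the class can read the count through `GetCount()`, but it cannot assign to `mCount` directly.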
11+
12+
Here's an example of a "getter" function from `xLightKit.cpp` in the BfBB
13+
decomp project, which will be the subject of this section:
14+
```cpp
15+
xLightKit* xLightKit_GetCurrent(RpWorld* world)
16+
{
17+
return gLastLightKit;
18+
}
19+
```
20+
21+
We will break this function down into its components shortly, but what
22+
you should notice is that the function has only a single C++ statement
23+
here: return the data stored in `gLastLightKit`. Studying the assembly
24+
that goes into this simple operation will help us understand what is
25+
happening in more complex functions which also return data.
26+
27+
Let's start by creating our initial declaration using Objdiff.
28+
29+
=== Creating the Function Declaration
30+
31+
Let's say we have opened up Objdiff and we want to work on the last
32+
missing function in the file `xLightKit.cpp`, `xLightKit_GetCurrent`:
33+
34+
#image("return_imgs/filelist.png")
35+
36+
We can use Objdiff's symbol mapping capabilities to get our declaration started
37+
by right-clicking the function we want to work on and copying the
38+
demangled, mapped function name (2nd item in the popup menu):
39+
#image("return_imgs/objdiff_demangle.png")
40+
41+
Doing that, we know that our declaration should start to look like this:
42+
```cpp
43+
... xLightKit_GetCurrent(RpWorld*);
44+
```
45+
46+
However, we are still missing the required return type for this declaration.
47+
Let's inspect the PS2 DWARF data to see if it's any help here:
48+
```cpp
49+
/*
50+
Compile unit: C:\SB\Core\x\xLightKit.cpp
51+
Producer: MW MIPS C Compiler
52+
Language: C++
53+
Code range: 0x00305BD0 -> 0x00305BD8
54+
*/
55+
// Range: 0x305BD0 -> 0x305BD8
56+
class xLightKit * xLightKit_GetCurrent() {
57+
// Blocks
58+
/* anonymous block */ {
59+
// Range: 0x305BD0 -> 0x305BD8
60+
}
61+
}
62+
```
63+
64+
We're in luck! There is DWARF data for this function.
65+
Based on this, we're able to see that `xLightKit_GetCurrent`
66+
has a return type in the DWARF data of `class xLightKit*`. We can apply that
67+
to our working function declaration:
68+
```cpp
69+
class xLightKit* xLightKit_GetCurrent(RpWorld*);
70+
```
71+
72+
Note that the DWARF data does not show that this function had any parameters
73+
on the PS2 version of the game. We will see soon that the function does not actually
74+
make use of this parameter, but it should be included nonetheless because Objdiff
75+
tells us that it is present in the GCN version of the code, which is our single Source
76+
of Truth. It may be that some debugging code excluded from the GCN release build
77+
of the game made use of this parameter, but it's not possible to say for sure.
78+
79+
There are some final refinements we can make to this declaration.
80+
"`class xLightKit*`" is what is known as an _elaborated type specifier_, which can be
81+
used in cases where a type name conflicts with a local variable name. Since in practice
82+
this is very rare, we can shorten this to just the type name:
83+
```cpp
84+
xLightKit* xLightKit_GetCurrent(RpWorld*);
85+
```
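To illustrate when the longer spelling is actually required, here is a small sketch using hypothetical names (`Widget`, `ElaboratedDemo`) that are not from the game's code. A local variable hides the type name, so only the elaborated form can still refer to the class:

```cpp
// Hypothetical type for illustration only.
class Widget
{
public:
    int id;
};

int ElaboratedDemo()
{
    int Widget = 7;       // a local variable that hides the class name
    class Widget w;       // the elaborated type specifier still finds the class
    w.id = 3;
    return Widget + w.id; // 7 + 3
}
```

Without a name clash like this one, `Widget w;` and `class Widget w;` mean the same thing, which is why the shorter form is safe in our declaration.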
86+
87+
As a final touch, let's give our unused `RpWorld*` parameter a descriptive name:
88+
```cpp
89+
xLightKit* xLightKit_GetCurrent(RpWorld* world);
90+
```
91+
92+
And there is our function declaration! We can now pop this into `xLightKit.h` in
93+
the appropriate spot:
94+
```cpp
95+
// xLightKit.h
96+
...
97+
98+
xLightKit* xLightKit_Prepare(void* data);
99+
void xLightKit_Enable(xLightKit* lkit, RpWorld* world);
100+
xLightKit* xLightKit_GetCurrent(RpWorld* world); // Our new declaration!
101+
void xLightKit_Destroy(xLightKit* lkit);
102+
103+
...
104+
```
105+
106+
=== Calling Conventions and Returning a Register Value
107+
108+
Now that we've got our function declaration set up, we can add our implementation stub
109+
to `xLightKit.cpp`:
110+
```cpp
111+
xLightKit* xLightKit_GetCurrent(RpWorld* world)
112+
{
113+
// TODO: Implement me
114+
}
115+
```
116+
117+
Attempting to build at this point will yield the following compiler error:
118+
```
119+
120+
User break, cancelled...
121+
# File: src\SB\Core\x\xLightKit.cpp
122+
# ------------------------------------
123+
# 122: }
124+
# Error: ^
125+
# return value expected
126+
# Too many errors printed, aborting program
127+
ninja: build stopped: subcommand failed.
128+
```
129+
130+
As the compiler says, a return value is expected for this function, but we have yet to
131+
implement that, and so a compiler error occurs.
132+
133+
Let's begin analyzing the assembly code for this function so that we can
134+
return the value the compiler expects!
135+
```
136+
0: lwz r3, gLastLightKit@sda21
137+
4: blr
138+
```
139+
140+
In the prior case study, we learned that the `blr` instruction will exit an
141+
assembly subroutine and jump back to the return address stored in the
142+
link register. The equivalent C++ statement is simply
143+
```cpp
144+
return;
145+
```
146+
147+
However, this is not enough to return a value. To do that, we will need to
148+
understand a bit about assembly function calling conventions.
149+
150+
While high-level programming languages like C++ or Java let you clearly state what a
151+
function returns and what parameters it receives, assembly language does not have
152+
built-in ways to pass information between functions. Instead, functions must use CPU
153+
registers or program memory to share data. To make sure all functions agree on how to do this,
154+
CPU designers define *calling conventions*. Calling conventions are rules that say where
155+
to put things like parameters and return values when calling and returning from a function.
156+
157+
The PowerPC CPU architecture supports user access of two types of registers:
158+
general-purpose and floating point registers #footnote("https://datasheets.chipdb.org/IBM/PowerPC/Gekko/gekko_user_manual.pdf").
159+
There are 32 registers of each type. The hardware lets programmers use any of them as
160+
needed, although by convention a few have fixed roles (for example, `r1` is the stack pointer).
161+
162+
By calling convention, integer and pointer parameters to an assembly function (called a subroutine)
163+
are passed starting at `r3`, and floating point parameters starting at `f1`. For example, the C function:
164+
```cpp
165+
void exampleFunction(U8 id, S32 isSpunch, F32 dt, F64 yPos);
166+
```
167+
168+
would have the following values in registers at the beginning of its assembly subroutine
169+
analogue:
170+
```
171+
r3 - id
172+
r4 - isSpunch
173+
f1 - dt
174+
f2 - yPos
175+
```
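The assignment rule above can be sketched as a small helper that hands out register names in order. This is a simplified model and not a full description of the ABI: the names `ParamKind` and `AssignRegisters` are hypothetical, it assumes the common 32-bit PowerPC convention of `r3` through `r10` and `f1` through `f8` for parameters, and it ignores cases such as 64-bit integers, structs, and variadic functions:

```cpp
#include <string>
#include <vector>

// Hypothetical classification of a parameter for this sketch.
enum class ParamKind { Integer, Float };

// Returns the register each parameter would land in under the simplified rule.
// Assumption: r3-r10 carry integer/pointer parameters, f1-f8 carry floats.
std::vector<std::string> AssignRegisters(const std::vector<ParamKind>& params)
{
    std::vector<std::string> out;
    int nextGpr = 3; // first general-purpose parameter register
    int nextFpr = 1; // first floating point parameter register
    for (ParamKind kind : params)
    {
        if (kind == ParamKind::Float && nextFpr <= 8)
            out.push_back("f" + std::to_string(nextFpr++));
        else if (kind == ParamKind::Integer && nextGpr <= 10)
            out.push_back("r" + std::to_string(nextGpr++));
        else
            out.push_back("stack"); // out of registers: passed in memory
    }
    return out;
}
```

For the `exampleFunction` signature above, passing the kinds integer, integer, float, float gives `r3`, `r4`, `f1`, `f2`, matching the listing.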
176+
177+
Now for return values. By calling convention, values to be returned to the calling subroutine
178+
are stored into `f1` if the value is a float, or `r3` otherwise.
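As a rough illustration, the comments below mark where a caller would expect to find each result under this convention. The globals and functions are hypothetical and do not come from the game's code:

```cpp
// Hypothetical globals for illustration only.
static int gSpatulaCount = 12;
static float gTimeScale = 0.5f;

// Integer (and pointer) results are expected in r3.
int GetSpatulaCount()
{
    return gSpatulaCount;
}

// Floating point results are expected in f1.
float GetTimeScale()
{
    return gTimeScale;
}
```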
179+
180+
Knowing this, we can now see how the subroutine hands back a value before `blr` returns
181+
to the caller. The instruction just before it:
182+
```
183+
0: lwz r3, gLastLightKit@sda21
184+
```
185+
loads the value of the global variable `gLastLightKit` into the `r3` register. Immediately following
186+
this, the `blr` instruction is executed, which will exit the subroutine.
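One way to picture this handoff is with a toy model in which the registers are just an array shared between the caller and the subroutine. Everything here (`Cpu`, `gValue`, `GetValue`, `CallGetValue`) is hypothetical and only mimics the load into `r3` followed by the return; real registers live in hardware, not in a struct:

```cpp
#include <cstdint>

// Toy model of the 32 general-purpose registers.
struct Cpu
{
    std::uint32_t r[32] = {};
};

std::uint32_t gValue = 0x1234; // stand-in for a global such as gLastLightKit

// Models the two-instruction subroutine: a load into r3, then blr.
void GetValue(Cpu& cpu)
{
    cpu.r[3] = gValue; // lwz: load the word into r3
    // blr: nothing else to do, control returns to the caller
}

// Models the caller, which picks the result up from r3 afterwards.
std::uint32_t CallGetValue()
{
    Cpu cpu;
    GetValue(cpu);
    return cpu.r[3];
}
```

The subroutine never "sends" the value anywhere. It simply leaves it in the register where the convention says the caller will look.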
187+
188+
Since we know that the assembly code loads `gLastLightKit` into the `r3` register prior
189+
to returning from the subroutine, we know that the decompiled C++ equivalent will return
190+
the value of `gLastLightKit`. We can achieve that with the following return statement:
191+
```cpp
192+
return gLastLightKit;
193+
```
194+
195+
Combining everything we've done up to this point, we get what we set out to decompile:
196+
```cpp
197+
xLightKit* xLightKit_GetCurrent(RpWorld* world)
198+
{
199+
return gLastLightKit;
200+
}
201+
```
202+
203+
Et voilà! Building this and comparing the result in Objdiff will now show a 100% match!
204+
#image("return_imgs/final_comparison.png")
205+
206+
Hopefully you're beginning to feel a little bit more familiar with some assembly
207+
programming constructs by this point. After completing this decompilation exercise,
208+
we've so far learned the following:
209+
- How to use the PS2 DWARF data to validate function return types
210+
- PowerPC calling conventions for function parameters and return values
211+
- How to use calling conventions when decompiling functions that return a value
212+
213+
In the next exercise, we will continue by introducing functions which include
214+
branching and conditional operations.