If you ran setup.py, you should have a dump of the debug symbols inside of the dump/sym folder. It includes everything you need regarding debug information (prototypes, local variables, globals, etc).
Look for any function that catches your eye, and search for it on the dumped debug symbols (p3_functions.cpp).
Paste it into the corresponding file, and start decompiling.
Let's imagine that you chose the function exh_mbar_time_out from main/scrctrl.c.
Let's begin by first taking a look at the info we have for this function:
- Debug info for the function:
- Name:
exh_mbar_time_out - Signature:
static int exh_mbar_time_out(EXAM_CHECK*) - Arguments:
$a0->EXAM_CHECK *ec_pp
- Local variables:
$a1->int ret
- Name:
- Disassembly (of course):
lw $v1, 0x32c($a0) lw $v0, 0x28($a0) beqz $v0, l_2c negu $a1, $v1 bnezl $v1, l_20 lbu $v1, 0x2c($a0) b l_2c move $a1, $zero l_20: lbu $v0, 0x330($a0) xor $v0, $v0, $v1 movz $a1, $zero, $v0 l_2c: jr $ra move $v0, $a1
Given the above, you can start in two ways:
- Let a decompiler give you a head start (Ghidra, m2c):
- Pros:
- a.) Works great with small functions.
- b.) Ghidra can figure GCC optimizations nicely.
- c.) m2c is designed for matching decomp in mind.
- Cons:
- a.) Not efficient for big functions.
- b.) m2c isn't very good at figuring GCC optimizations.
- c.) Ghidra isn't built for matching decomp--its pseudocode won't match most of the times.
- c.1) m2c isn't perfect for PS2, will sometimes fail.
- I strongly recommend using both decompilers due to its strong and weak points. There are more decompiling tools around but these are the most recommended (for PS1/PS2).
- Pros:
- Translate assembly by hand:
- Pros:
- a.) Very efficient regardless of the function.
- b.) You will have a better view of the function.
- Cons:
- a.) Very time-consuming depending on the length and complexity of the function.
- b.) Requires previous knowledge of how the compiler works and the optimizations it often applies.
- Pros:
I strongly recommend option 1 for starting and later going for option 2.
Here I'll guide through option 2 as decompilers do most* of the work for you, so
let's take another look at the disassembly and slowly translate it to a C-like pseudocode.
But before that, let's take a moment to define what our pseudocode will be like:
- Every GPR will be a local variable.
- We will explicitly add type casts on loads and saves.
- For loads and saves, we will always assume that the base is a struct pointer unless suspected to be a pointer to a base C type.
- Most of the times you won't have to do any assumptions thanks to the debug information (unless you're working with
prlib).
- Most of the times you won't have to do any assumptions thanks to the debug information (unless you're working with
- Struct fields will be prefixed with
unkfollowed by the offset in hexadecimal and uppercase (e.g.:unk1FF). - The function's real signature will be defined after the translation in the case that there is no debug information.
- Labels will be prefixed by
l_followed by the function relative offset (e.g.:l_4c).
An example function would look like this after translation:
void my_func(void) {
a1 = (s32)a0->unk0;
v0 = a1 + 1;
return;
}That out of the way, we can begin. We will explain an instruction once after first encountering it, next instances of said instruction will be skipped to be brief.
- First 'unknown' instruction is
lw:- Defined as the following:
Load Word: To load a word from memory as a signed value.lw rt, offset(base)GPR[rt] = memory[GPR[base] + offset];
- On our pseudocode, it would translate to the following:
v1 = *(s32*)(a0 + 0x32c);
- Assuming
$a0is a struct pointer (we will always assume so from now on), it would translate to the following:v1 = (s32)a0->unk32C;
- Defined as the following:
- Second unknown instruction is
beqz:- Branch instructions are a bit confusing, so hold on tight.
- This is a pseudoinstruction, equivalent to
beq rs, $zero, offset. Thebeqinstruction will be explained instead. - MIPS has what's called a delay slot, it is basically an instruction slot that gets executed after a branch or jump instruction, but before the branch takes effect.
Given this example:You would think thebeq $a0, $a1, label nopnopinstruction wouldn't be executed if the branch is taken--but it does.
Thisnopis executed regardless of whether the branch is taken or not. This slot (thenopin this case) is called the delay slot.
There are other types of delay slots (e.g. load delay slots), though the PS2's EE core only has branch delay slots.- GCC likes to sometimes reuse registers in branches, where you may see code like
the following:
When the branch instruction is executed, it first makes the check (
lw $v0, 0x4($a1) beqz $v0, label lw $v0, 0x4($a2)$v0 == 0) before executing the delay slot. Therefore it would look like this on our pseudocode:Or to make things simpler and avoid making a temporary variable:v0 = (s32)a1->unk4; tmp = (v0 == 0); v0 = (s32)a2->unk4; if (tmp) { goto label; }
It's easy to be confused when looking at this register reuse (I was at the beginning), but once you get the hold of it, it'll be a breeze.v0 = (s32)a1->unk4; if (v0 == 0) { v0 = (s32)a2->unk4; goto label; } v0 = (s32)a2->unk4;
- Pseudocode equivalent:
[delay slot here if applicable] if (v0 == 0) { [delay slot here if applicable] goto l_2c; }
- GCC likes to sometimes reuse registers in branches, where you may see code like
the following:
- Third unknown instruction is
negu:- Pseudoinstruction, equivalent to
subu rd, $zero, rt(GPR[rd] = 0 - GPR[rt]). - Pseudocode equivalent:
a1 = -v1;
- Pseudoinstruction, equivalent to
- 4th unknown instruction is
bnezl:- Pseudoinstruction, equivalent to
bnel rs, $zero, offset. Thebnelinstruction will be explained instead. - MIPS has special 'likely' branches, which means:
- a.) If the branch is taken, the delay slot is executed.
- b.) If the branch is not taken, the delay slot is not executed.
- Example:
bnel $v1, $zero, l_20 lbu $v1, 0x2c($a0)- If
$v1is not zero, the word at0x2c($a0)is loaded to$v1(delay slot executed) and the branch is taken. - If
$v1is zero, the branch isn't taken and the delay slot is skipped, leading to the next instruction.
- If
- Pseudocode equivalent:
if (v1 != 0) { v1 = (u8)a0->unk2C; goto l_20; }
- Pseudoinstruction, equivalent to
- 5th unknown instruction is
lbu:Load Byte Unsigned: To load a byte from memory as an unsigned value.- Pseudocode equivalent:
$v1 = (u8)a0->unk2C;
- 6th unknown instruction is
b:- Pseudoinstruction, equivalent to
beq $zero, $zero, offset(branch always taken). - Pseudocode equivalent:
goto l_offset;- Don't forget the delay slot!
- Pseudoinstruction, equivalent to
- 7th unknown instruction is
move:- Pseudoinstruction, equivalent to
daddu rd, rs, $zero. - Pseudocode equivalent:
rd = rs;/$a1 = 0;(for this case)
- Pseudoinstruction, equivalent to
- 8th unknown instruction is
xor:Exclusive OR: To calculate a bitwise logical EXCLUSIVE OR.- Format:
xor rd, rs, rt - Pseudocode equivalent:
v0 = v0 ^ v1;
- 9th unknown instruction is
movz:Move Conditional on Zero: To move data between GPRs according to the value of a GPR.- Format:
movz rd, rs, rt - Pseudocode equivalent:
if (v0 == 0) { a1 = 0; }
- 10th unknown instruction is
jr:Jump Register: To branch to the address specified by a GPR.- Format:
jr rs - Pseudocode equivalent:
goto rs;- If
rsis$ra(return address register):return;
For more instructions, check the ISA manual.
Having translated everything, the function should look like this:
void exh_mbar_time_out(void) {
v1 = (s32)a0->unk32C;
v0 = (s32)a0->unk28;
a1 = -v1;
if (v0 == 0) {
goto l_2c;
}
if (v1 != 0) {
v1 = (u8)a0->unk2C;
goto l_20;
}
a1 = 0;
goto l_2c;
l_20:
v0 = (u8)a0->unk330;
v0 = v0 ^ v1;
if (v0 == 0) {
a1 = 0;
}
l_2c:
v0 = a1;
return;
}We've got a lot of nasty gotos and labels. Our goal will now be to get rid of those, going branch for branch, sweeping out each goto.
Let's look at the first branch:
if (v0 == 0) goto l_2c;- On this branch, if
v0is zero, it makes the jump. If the condition fails though, no jump is made, and the rest of the code is executed up. A common thing is that the control flow eventually leads tol_2cregardless of the branches that are taken.
So, to remove the first goto, we can invert the condition, and insert all of the code into the if statement.
The function should look like this now:And just like that, the goto is gone. Though, we cannot remove the label yet, as there is another branch pointing to that label. We'll take care of that shortly.void exh_mbar_time_out(void) { v1 = (s32)a0->unk32C; v0 = (s32)a0->unk28; a1 = -v1; if (v0 != 0) { if (v1 != 0) { v1 = (u8)a0->unk2C; goto l_20; } a1 = 0; goto l_2c; l_20: v0 = (u8)a0->unk330; v0 = v0 ^ v1; if (v0 == 0) { a1 = 0; } } l_2c: v0 = a1; return; }
- On this branch, if
if (v1 != 0) { [delay slot]; goto l_20; }- Here we have a likely branch. The delay slot inside the if statement is irrelevant to us right now, our target is the goto--let's ignore it exists for a moment. Much better.
We can do the same thing we did earlier, and we will have gotten rid of the goto andl_20label. But what about the delay slot? We can make an else statement and place it there.
It doesn't feel like it belongs there, though. We'll make a better use of that else statement later.
That done, let's check the pseudocode:Looking better!void exh_mbar_time_out(void) { v1 = (s32)a0->unk32C; v0 = (s32)a0->unk28; a1 = -v1; if (v0 != 0) { if (v1 == 0) { a1 = 0; goto l_2c; } else { v1 = (u8)a0->unk2C; } v0 = (u8)a0->unk330; v0 = v0 ^ v1; if (v0 == 0) { a1 = 0; } } l_2c: v0 = a1; return; }
- Here we have a likely branch. The delay slot inside the if statement is irrelevant to us right now, our target is the goto--let's ignore it exists for a moment. Much better.
goto l_2c;- We're almost there, there's only one goto left.
On a first glance it looks confusing, but let's see it from another point of view. This goto is two blocks deep in the control flow, which jumps out of those two blocks.
Described this way, this seems like the block just ends and also gets back from the other block. But if that was the case, you could just remove the goto, right? We can't just remove it given the code inside the first block, though the second block makes a goto out of the two blocks, and the possibility of it being an actual goto is very small. It's almost as if that code only executed if the(v1 == 0)condition wasn't true, hm.void exh_mbar_time_out(void) { v1 = (s32)a0->unk32C; v0 = (s32)a0->unk28; a1 = -v1; if (v0 != 0) { if (v1 == 0) { a1 = 0; } else { v1 = (u8)a0->unk2C; v0 = (u8)a0->unk330; v0 = v0 ^ v1; if (v0 == 0) { a1 = 0; } } } v0 = a1; return; }
- We're almost there, there's only one goto left.
Now, it doesn't look that confusing anymore, does it?
We've successfully removed all fake gotos and restored the original control flow. Congrats!
Though, while the control flow is much better, the pseudocode is still looking ugly.
We just need to do some cleaning, that's all. This is where we start using the debugging info (if there isn't debugging info, this becomes merely guesswork), so let's have a look:
static int exh_mbar_time_out(/* a0 4 */ EXAM_CHECK *ec_pp) {
/* a1 5 */ int ret;
}We know there's only a real variable, ret, mapped to the $a1 register. There are cases where the register isn't always mapped to that variable, so there is still some guesswork to do. There are also cases where variables are optimized out and won't appear on the debug information. In this case however, the $a1 register is always mapped to the ret variable.
Afterwards, we can get rid of unnecessary variables, and knowing $a0 is a pointer to a EXAM_CHECK object, we can name fields properly. Since we know these field types, we can get rid of the explicit casts as well.
Now, our function will look like this:
void exh_mbar_time_out(void) {
ret = -a0->oth_num;
if (a0->ted_num != 0) {
if (a0->oth_num == 0) {
ret = 0;
} else {
if ((ec_pp->oth[0].th_num ^ ec_pp->ted[0].th_num) == 0) {
ret = 0;
}
}
}
v0 = ret;
return;
}It's slowly taking shape! But what's that strange move to v0? And where does a0 come from?
This code was compiled following the MIPS EABI, where $v0 is used as a return register, and $a0 is used as the first argument. But of course, we already know that thanks to the debugging info.
So let's apply that, and merge the separate else and if statements into an 'else if' statement:
static int exh_mbar_time_out(EXAM_CHECK *ec_pp) {
int ret = -ec_pp->oth_num;
if (ec_pp->ted_num != 0) {
if (ec_pp->oth_num == 0) {
ret = 0;
} else if ((ec_pp->oth[0].th_num ^ ec_pp->ted[0].th_num) == 0) {
ret = 0;
}
}
return ret;
}At last, it finally looks like a function, and it matches! But something looks odd.
That XOR check is weird--could it be that the developers didn't actually write a XOR?
Let's examine it further:
- We have two unsigned bytes, XOR'd and checked if the result is zero.
- The XOR will only result into a zero when both values match.
This is an optimization made by the compiler, where it avoided doing branches in favor of a XOR and a conditional move.
static int exh_mbar_time_out(EXAM_CHECK *ec_pp) {
int ret = -ec_pp->oth_num;
if (ec_pp->ted_num != 0) {
if (ec_pp->oth_num == 0) {
ret = 0;
} else if (ec_pp->oth[0].th_num == ec_pp->ted[0].th_num) {
ret = 0;
}
}
return ret;
}If we replace that XOR check, our function still matches.
Now that we matched our function, we can move on to the next!.
If you need some help with a function or you just want to decomp from the browser (the site does have a very nice-looking diffing view), you'll want to make a decomp(dot)me scratch:
- Obtain the disassembly:
- It is recommendable that you produce a disassembly with line info to make the decompilation process easier, using the following command as an example:
./tools/add_lines_to_asm.py --asm-path asm/nonmatchings/main/scrctrl/exh_nombar_sub.s --stdump-json-path dump/sym/p3.json --asm-out dump/out.asm --offset 10- It is also recommended to keep the offset at 10 or close values.
- Second way (no line numbers):
- Go to the
dump/disasmfolder and find the function you want to decompile on the{NAME}.text.sfile. - If necessary, grab the required data from
.rodata(e.g. strings or jumptables) in order to have a working m2c decompilation.
- Go to the
- It is recommendable that you produce a disassembly with line info to make the decompilation process easier, using the following command as an example:
- Go to the decomp.me website and make a new scratch, using the
PaRappa the Rapper 2preset.- Use the
PaRappa the Rapper 2 (C++)preset if the function comes from prlib (C++ library).
- Use the
- Paste the disassembly into the 'Target assembly' box.
- Fill the necessary context. You can always modify it later if unsure. You can start with this context with the base types and defines.
-
typedef unsigned char u_char; typedef unsigned short u_short; typedef unsigned int u_int; typedef unsigned long u_long; #define NULL (0) #define TRUE (1) #define FALSE (0)
-
- Click the 'Create scratch' button and get decompiling!