This document describes how to annotate the source code of the recompilation so that reccmp can compare the recompilation to the original binary.
All annotations are of the form
// <annotation type>: <target> <address>
For example,
// FUNCTION: LEGO1 0x100b12c0
refers to a function at address 0x100b12c0 in the build target aliased by LEGO1 (since it is possible to build different targets from the same source code).
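To make the annotation form concrete, here is a minimal sketch of how such a comment line could be split into its three fields. It is not reccmp's own parser: the Annotation struct and the parse_annotation function are hypothetical names introduced for illustration, and the sketch assumes 32-bit addresses written with a 0x prefix.

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>
#include <sstream>
#include <string>

// Hypothetical record for one parsed annotation (not reccmp's data model).
struct Annotation {
    std::string type;      // e.g. "FUNCTION"
    std::string target;    // e.g. "LEGO1"
    std::uint32_t address; // e.g. 0x100b12c0
};

// Parses a line such as "// FUNCTION: LEGO1 0x100b12c0".
// Returns false if the line does not have that shape.
// Anything after the address (an option, for instance) is ignored here.
bool parse_annotation(const std::string& line, Annotation& out)
{
    std::istringstream in(line);
    std::string slashes, type, target, address_text;
    if (!(in >> slashes >> type >> target >> address_text)) {
        return false; // fewer than four whitespace-separated tokens
    }
    if (slashes != "//") {
        return false; // not a line comment
    }
    if (type.size() < 2 || type.back() != ':') {
        return false; // the annotation type must end with a colon
    }
    type.pop_back(); // drop the trailing colon

    // Assumption: addresses are always written in hexadecimal with "0x".
    if (address_text.size() < 3 || address_text.compare(0, 2, "0x") != 0) {
        return false;
    }
    std::size_t used = 0;
    unsigned long long value = 0;
    try {
        value = std::stoull(address_text, &used, 16);
    } catch (const std::exception&) {
        return false; // not a number, or out of range
    }
    if (used != address_text.size() || value > 0xFFFFFFFFull) {
        return false; // trailing junk, or wider than 32 bits
    }

    out.type = type;
    out.target = target;
    out.address = static_cast<std::uint32_t>(value);
    return true;
}
```

Returning a bool with an output parameter keeps the sketch compatible with older C++ standards; with C++17 a std::optional return value would be a natural alternative.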
Functions can be annotated by one of the markers below. Each marker contains the address of the function as found in the original binaries. This information is then used to compare the recompiled assembly with the original assembly, resulting in an accuracy score.
Note that functions in a given compilation unit must be ordered by their address in ascending order.
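The ordering rule can be checked mechanically. The following is only a sketch under stated assumptions: first_out_of_order is a hypothetical helper, the addresses are assumed to have been collected already in source order, and equal neighbouring addresses are tolerated (an assumption made here, not a documented reccmp behaviour).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the index of the first address that is lower than its predecessor,
// or addresses.size() when the whole sequence is in ascending order.
std::size_t first_out_of_order(const std::vector<std::uint32_t>& addresses)
{
    for (std::size_t i = 1; i < addresses.size(); ++i) {
        if (addresses[i] < addresses[i - 1]) {
            return i; // this annotation appears too late in the file
        }
    }
    return addresses.size();
}
```

Returning the offending index rather than a plain yes/no makes it easy to report which function needs to be moved.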
Function annotations come in several types, which are explained below.
There are three ways to annotate a function:
The preferable way is to annotate the implementation directly. For example:
// FUNCTION: LEGO1 0x100b12c0
MxCore* MxObjectFactory::Create(const char* p_name)
{
// implementation
}
There are situations where the previous kind of annotation is not possible. Typical examples are:
- templated functions
- synthetic functions (generated by the compiler)
- library functions (like the C++ standard library)
- non-inlined inline functions
In those cases, one can spell out the function's name in a comment:
// TEMPLATE: LEGO1 0x100c4f50
// MxCollection<MxRegionLeftRight *>::`scalar deleting destructor'
In a few cases, two functions with the same name need to be annotated by comment (e.g. overloaded functions). The name alone is then ambiguous, so you can put the function's debug symbol in the comment instead:
// TEMPLATE: LEGO1 0x10035790
// ?_Construct@@YAXPAPAVROI@@ABQAV1@@Z
Names that begin with ? are assumed to be MSVC-like symbols. In all other cases, you should specify that the name is a symbol by adding SYMBOL after the address. For example:
// LIBRARY: TEST 0x10002000 SYMBOL
// __strlwr
The compiler may combine redundant functions that produce the same instructions. If this occurs, the functions will share the same address. In MSVC, this is controlled by the /OPT:ICF option.
To annotate this properly in reccmp, use the FOLDED option after the FUNCTION marker. For example:
// FUNCTION: HELLO 0x4513d0 FOLDED
void NeonCactus7532::VTable0x1c(undefined4)
{
}
// FUNCTION: HELLO 0x4513d0 FOLDED
void NeonCactus7532::VTable0x20(undefined4)
{
}
Functions with a reasonably complete implementation which are not templated or synthetic (see below) should be annotated with FUNCTION. It is preferable to annotate the function's implementation directly.
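As a rough illustration of the FOLDED option described earlier, the sketch below flags addresses that are shared by several FUNCTION markers where at least one marker lacks FOLDED. This is a hypothetical lint written for this document, not a reccmp feature; FunctionMarker and addresses_missing_folded are made-up names.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical summary of one FUNCTION marker.
struct FunctionMarker {
    std::uint32_t address; // address given in the annotation
    bool folded;           // true if the FOLDED option was present
};

// Returns every address that is used by more than one marker while at
// least one of those markers is missing the FOLDED option.
std::vector<std::uint32_t> addresses_missing_folded(
    const std::vector<FunctionMarker>& markers)
{
    struct Tally {
        int count = 0;
        bool all_folded = true;
    };
    std::map<std::uint32_t, Tally> by_address;
    for (const FunctionMarker& marker : markers) {
        Tally& tally = by_address[marker.address];
        ++tally.count;
        tally.all_folded = tally.all_folded && marker.folded;
    }

    std::vector<std::uint32_t> result;
    for (const auto& entry : by_address) {
        if (entry.second.count > 1 && !entry.second.all_folded) {
            result.push_back(entry.first);
        }
    }
    return result; // sorted by address, because std::map is ordered
}
```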
Functions with no or a very incomplete implementation should be annotated with STUB. These will not be compared to the original assembly.
// STUB: LEGO1 0x10011d50
LegoCameraController::LegoCameraController()
{
// TODO
}
Templated functions should be annotated with TEMPLATE:
// TEMPLATE: LEGO1 0x100c0ee0
// list<MxNextActionDataStart *,allocator<MxNextActionDataStart *> >::_Buynode
// TEMPLATE: LEGO1 0x100c0fc0
// MxStreamListMxDSSubscriber::~MxStreamListMxDSSubscriber
// TEMPLATE: LEGO1 0x100c1010
// MxStreamListMxDSAction::~MxStreamListMxDSAction
Synthetic functions should be annotated with SYNTHETIC. A synthetic function is generated by the compiler; the most common is the "scalar deleting destructor" found in virtual tables. Other cases include default destructors and assignment operators. Note: SYNTHETIC takes precedence over TEMPLATE.
// SYNTHETIC: LEGO1 0x10003210
// Helicopter::`scalar deleting destructor'
// SYNTHETIC: LEGO1 0x100c4f50
// MxCollection<MxRegionLeftRight *>::`scalar deleting destructor'
// SYNTHETIC: LEGO1 0x100c4fc0
// MxList<MxRegionLeftRight *>::`scalar deleting destructor'
Functions located in third-party libraries should be annotated with LIBRARY. This can be useful for working towards a full accounting of all the functions present in the binaries.
// LIBRARY: ISLE 0x4061b0
// _MemPoolInit@4
// LIBRARY: ISLE 0x406520
// _MemPoolSetPageSize@8
// LIBRARY: ISLE 0x406630
// _MemPoolSetBlockSizeFS@8
Classes with a virtual table should be annotated using the VTABLE marker, which includes the module name and address of the virtual table:
// VTABLE: LEGO1 0x100dc900
class MxEventManager : public MxMediaManager {
// ...
};
Global variables should be annotated using the GLOBAL marker, which includes the module name and address of the variable.
// GLOBAL: LEGO1 0x100f456c
MxAtomId* g_jukeboxScript = NULL;
// GLOBAL: LEGO1 0x100f4570
MxAtomId* g_pz5Script = NULL;
// GLOBAL: LEGO1 0x100f4574
MxAtomId* g_introScript = NULL;
String values should be annotated using the STRING marker, which includes the module name and address of the text content. Note that this is usually not required since most strings can be auto-detected. If you want, you can use this for bookkeeping, but it will usually not affect the reccmp match.
inline virtual const char* ClassName() const override // vtable+0x0c
{
// STRING: LEGO1 0x100f03fc
return "Act2PoliceStation";
}
String constants can have a distinct STRING and GLOBAL address at the same time. The STRING points at the actual text while the GLOBAL is a pointer to the text:
// GLOBAL: LEGO1 0x10102048
// STRING: LEGO1 0x10102040
const char* g_strACTION = "ACTION";
In this example, the character A (the first byte of the text) is stored at address 0x10102040, and a 32-bit pointer holding the value 0x10102040 is stored at address 0x10102048.
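The difference between the two addresses can be observed in any C++ program. The sketch below reuses the g_strACTION variable from the example; global_address and string_address are hypothetical helpers added for illustration. The actual addresses depend on the build, so only their relationship is shown.

```cpp
// The pointer variable corresponds to GLOBAL,
// the text it points at corresponds to STRING.
const char* g_strACTION = "ACTION";

// Address of the pointer variable itself (what GLOBAL would refer to).
const void* global_address()
{
    return &g_strACTION;
}

// Address of the first character of the text (what STRING would refer to).
const void* string_address()
{
    return g_strACTION;
}
```

The two functions always return different addresses, because the pointer variable and the character array it points at are separate objects.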
Individual code lines can be annotated using the LINE marker:
short token = 0;
// LINE: BETA10 0x1013e643
short xmax = xofs + width - 1;
This may be helpful when the recompiled code does not match the original code very well, or the assembly text diff misdetects which parts correspond to each other. At the moment, this annotation must be followed by a line of code (i.e. not by an empty line or another comment).
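The placement rule for LINE can be sketched as a small check over the lines of a source file. This is an illustration only, not part of reccmp: the function names are hypothetical, and the sketch only recognises // comments (block comments are not handled).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the text with leading spaces and tabs removed.
static std::string ltrim(const std::string& text)
{
    std::size_t pos = text.find_first_not_of(" \t");
    return pos == std::string::npos ? std::string() : text.substr(pos);
}

// Returns true when every "// LINE:" comment is immediately followed by a
// line of code, i.e. not by an empty line, another comment, or end of file.
bool line_markers_are_followed_by_code(const std::vector<std::string>& lines)
{
    for (std::size_t i = 0; i < lines.size(); ++i) {
        if (ltrim(lines[i]).rfind("// LINE:", 0) != 0) {
            continue; // not a LINE annotation
        }
        if (i + 1 == lines.size()) {
            return false; // annotation is the last line of the file
        }
        std::string next = ltrim(lines[i + 1]);
        if (next.empty() || next.rfind("//", 0) == 0) {
            return false; // followed by a blank line or another comment
        }
    }
    return true;
}
```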