Rust is a programming language originally developed at Mozilla that is attracting attention as an alternative to C and C++. It has been increasingly adopted in recent years for its memory safety and performance.
As Rust grows in popularity, however, malware written in Rust is also on the rise. Examples include SysJoker variants and the BlackCat ransomware.
Nevertheless, techniques and knowledge for reverse engineering Rust malware have not accumulated to the same degree as for malware written in other languages. Binaries built with newer languages such as Rust are structured differently from binaries built with conventional languages such as C, so analyzing them requires different approaches and knowledge.
This report summarizes the results of our study of the knowledge required to reverse engineer binaries built with Rust.
This report is intended for the following readers:
- Malware analysts
- Engineers engaged in reverse engineering of Rust binaries
- Engineers who want to understand the internals of Rust in detail
The versions of rustc and cargo used in this study are listed below. All binaries were compiled in an MSVC environment on Windows.
- cargo: 1.82.0
- rustc: 1.82.0
We used IDA Pro to disassemble binaries. The version used in this study is listed below:
- IDA Pro v8.3.230608
This report covers the following items:
| No. | Item | Overview |
|---|---|---|
| 1 | Differences between binaries caused by changes to Cargo profile settings | Study of how far publicly documented cargo-based approaches can reduce binary size, and what information remains in the binary afterward. |
| 2 | Reducing binary sizes | Study of how far publicly documented rustc-based approaches can reduce binary size, and what information remains in the binary afterward. |
| 3 | Identifying Rust binaries | Study on approaches that determine whether a binary is a Rust binary |
| 4 | Exception Directory | Study on information available from an Exception Directory structure |
| 5 | TLS Directory | Study on information available from a TLS Directory structure and contents of a TLS Callback |
| 6 | Identifying the main function and initialization | Approaches for identifying user-defined main functions |
| 7 | Strings | Approaches for handling strings |
| 8 | Mangling function names | Structure of mangled function names and how to demangle mangled function names |
| 9 | Closure | Behavior of closures and memory layouts to be used |
| 10 | Enum types | Study of how Rust enum types are implemented in assembly code |
| 11 | Match statement | Study of how Rust match statements are implemented in assembly code |
| 12 | Panic statement | Differences in assembly code between the "unwind" and "abort" panic behaviors |
| 13 | Iterator | Study on how code using iterators or the "next" function is implemented in assembly code |
| 14 | Trait | Differences between calls to a trait method and calls to an ordinary function |
| 15 | Identifying typical traits | Approaches for identifying, in assembly code, the traits implemented through the #[derive] attribute |
| 16 | Dynamic dispatch reference | Characteristics of assembly code and differences between calls using dynamic and static dispatches |
| 17 | Collection | Memory layout to be used |
| 18 | Identifying functions generated from the same generics | Study of approaches for identifying functions that were generated from the same generic definition |
| 19 | Smart pointer | Characteristics and memory layouts of smart pointers |
| 20 | Inline assembly | Characteristic code patterns |
| 21 | Link attribute | Differences in how to link libraries |
| 22 | Repr attribute | Study on how memory layouts change according to specifiable options |
| 23 | How to identify code in standard and third-party libraries | Approaches for identifying statically linked standard and third-party library functions |
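
To make the scope above more concrete, the following is a minimal sketch of a program that exercises several of the listed language features (enum types, match, closures, iterators, traits with static and dynamic dispatch, and generics). It is not taken from the study; every name in it (`Command`, `Handler`, `Logger`, `run_static`, `run_dynamic`, `describe`, `total_sleep`) is hypothetical and chosen only for illustration.

```rust
use std::fmt::Debug;

// Hypothetical sample types; none of these names come from the report.
// Items 10 and 15: an enum with data-carrying variants and derived traits.
#[derive(Debug, Clone, PartialEq)]
enum Command {
    Ping,
    Sleep(u32),
    Write { path: String, data: Vec<u8> },
}

// Item 14: a trait with a single method.
trait Handler {
    fn handle(&self, cmd: &Command) -> String;
}

struct Logger;

impl Handler for Logger {
    fn handle(&self, cmd: &Command) -> String {
        // Item 11: a match over the enum variants.
        match cmd {
            Command::Ping => String::from("pong"),
            Command::Sleep(secs) => format!("sleep {secs}s"),
            Command::Write { path, data } => {
                format!("write {} bytes to {}", data.len(), path)
            }
        }
    }
}

// Item 16: static dispatch, instantiated once per concrete handler type.
fn run_static<T: Handler>(handler: &T, cmd: &Command) -> String {
    handler.handle(cmd)
}

// Item 16: dynamic dispatch through a trait object.
fn run_dynamic(handler: &dyn Handler, cmd: &Command) -> String {
    handler.handle(cmd)
}

// Item 18: a generic function that is instantiated for each type it is used with.
fn describe<T: Debug>(value: &T) -> String {
    format!("{value:?}")
}

// Items 9 and 13: an iterator chain whose closure captures a local variable.
fn total_sleep(cmds: &[Command]) -> u32 {
    let extra = 1;
    cmds.iter()
        .filter_map(|c| match c {
            Command::Sleep(secs) => Some(*secs + extra),
            _ => None,
        })
        .sum()
}

fn main() {
    let cmds = vec![
        Command::Ping,
        Command::Sleep(2),
        Command::Write { path: String::from("out.bin"), data: vec![1, 2, 3] },
    ];
    for cmd in &cmds {
        println!("{} / {}", run_static(&Logger, cmd), run_dynamic(&Logger, cmd));
    }
    println!("{} {}", describe(&cmds[1]), describe(&7u32));
    println!("total sleep: {}", total_sleep(&cmds));
}
```

Building a small program like this with different profile settings and opening the result in a disassembler is one way to follow along with the individual items; the code actually generated will depend on the compiler version and optimization settings.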
If you would like to request additional studies or have found an error in this report, please let us know by opening an Issue or a Pull Request.