title: Translation of the N64ReCompile Project README
tags: [translation]
Translation
N64: Recompiled
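N64: Recompiled is a tool that statically recompiles N64 binaries into C code that can be compiled for any platform. It can be used for ports or tools, and it simulates behaviors significantly faster than interpreters or dynamic recompilation can. More broadly, it can be used in any context where you want to run part of an N64 binary in a standalone environment.
This is not the first project to use static recompilation on game console binaries. A well-known example is jamulator, which targets NES binaries. It is not even the first project to apply static recompilation to N64-related projects: IDO static recompilation recompiles the SGI IRIX IDO compiler on modern systems to facilitate matching decompilation of N64 games. This project works similarly to the IDO static recomp project in some ways, and that project was one of my main inspirations.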
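Table of Contents
- Translation
- N64: Recompiled
- Table of Contents
- How it Works
- Overlays
- How to Use
- Single File Output Mode (for Patches)
- RSP Microcode Support
- Planned Features
- Building
- Libraries Used
- Original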
- N64: Recompiled
- Table of Contents
- How it Works
- Overlays
- How to Use
- Single File Output Mode (for Patches)
- RSP Microcode Support
- Planned Features
- Building
- Libraries Used
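How it Works
The recompiler accepts a list of symbols and metadata alongside the binary, splits the input binary into functions, and recompiles each one individually into a C function named according to the metadata.
Instructions are processed one at a time, and the corresponding C code is emitted as each one is processed. The translation is very literal in order to keep complexity low. For example, the instruction addiu $r4, $r4, 0x20, which adds 0x20 to the 32-bit value in the low bytes of register $r4 and stores the sign-extended 64-bit result in $r4, is recompiled into: ctx->r4 = ADD32(ctx->r4, 0X20); The jal (jump-and-link) instruction is recompiled directly into a function call, and j or b instructions (unconditional jumps and branches) that can be identified as tail-call optimizations are recompiled into function calls as well. Branch delay slots are handled by duplicating instructions as necessary. Certain instructions get other specific handling; for example, the recompiler attempts to turn a jr instruction into a switch-case statement if it can tell that the instruction is being used with a jump table. The recompiler has mostly been tested on binaries built with old MIPS compilers (e.g. mips gcc 2.7.2 and IDO) as well as modern clang targeting mips. Modern mips gcc may trip up the recompiler due to certain optimizations it can do, but those cases can probably be avoided by setting specific compilation flags.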
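The delay-slot handling described above can be pictured with a small hand-written sketch. It is not actual recompiler output: the `recomp_context` layout, the function name, and the macro definitions are assumptions made for illustration. The branch condition is evaluated first, and the delay-slot instruction is then duplicated on both the taken and the fall-through path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register context holding only the registers used here. */
typedef struct {
    uint64_t r2, r4;
} recomp_context;

/* Assumed macro definitions; a real runtime may define these differently. */
#define S32(val) ((int64_t)(int32_t)(val))
#define ADD32(a, b) ((uint64_t)S32((uint32_t)(a) + (uint32_t)(b)))

/*
 * Original MIPS (illustrative):
 *     beq   $r4, $zero, L_skip
 *     addiu $r2, $r2, 1        # delay slot, runs whether or not the branch is taken
 *     addiu $r2, $r2, 100
 * L_skip:
 */
static void func_with_delay_slot(recomp_context *ctx) {
    assert(ctx != NULL);
    if (ctx->r4 == 0) {
        ctx->r2 = ADD32(ctx->r2, 1);   /* delay slot copy on the taken path */
        goto L_skip;
    }
    ctx->r2 = ADD32(ctx->r2, 1);       /* delay slot copy on the fall-through path */
    ctx->r2 = ADD32(ctx->r2, 100);
L_skip:
    return;
}
```

Evaluating the condition before running the copied instruction matters because the delay-slot instruction may overwrite a register that the branch compares.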
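Every output function created by the recompiler is currently emitted into its own file. An option may be provided in the future to group functions together into output files, which should improve build times of the recompiler output by reducing file I/O in the build process.
Recompiler output can be compiled with any C compiler (tested with msvc, gcc and clang). The output is expected to be used with a runtime that provides the functions and macro implementations needed to run it. A runtime is provided in N64ModernRuntime, which can be seen in action in the Zelda 64: Recompiled project.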
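As a rough idea of what "macro implementations" means here, the following is a minimal sketch of how a runtime could define ADD32 so that the addiu example from earlier compiles. The `recomp_context` struct, the `gpr` typedef, and both macro bodies are assumptions for illustration; the definitions in N64ModernRuntime may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context: general-purpose registers are 64 bits wide. */
typedef uint64_t gpr;
typedef struct {
    gpr r4;
} recomp_context;

/* Sign-extend the low 32 bits of a value to 64 bits (assumed definition). */
#define S32(val) ((int64_t)(int32_t)(val))
/* 32-bit wrapping add whose result is sign-extended to 64 bits (assumed definition). */
#define ADD32(a, b) ((gpr)S32((uint32_t)(a) + (uint32_t)(b)))

/* The statement the README gives for `addiu $r4, $r4, 0x20`. */
static void example_addiu(recomp_context *ctx) {
    assert(ctx != NULL);
    ctx->r4 = ADD32(ctx->r4, 0X20);
}
```

The addition is done in unsigned 32-bit arithmetic so that overflow wraps instead of being undefined, and only then is the result sign-extended into the 64-bit register.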
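Overlays
Statically linked and relocatable overlays can both be handled by this tool. In both cases, the tool emits function lookups for jump-and-link-register instructions (i.e. function pointers or virtual functions), which the provided runtime can implement using any sort of lookup table. For example, the instruction jalr $25 would get recompiled as LOOKUP_FUNC(ctx->r25)(rdram, ctx); The runtime can then maintain a list of which program sections are loaded and at what addresses in order to determine which function to run whenever a lookup is triggered at runtime.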
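One possible shape for such a lookup is sketched below. Everything except the `LOOKUP_FUNC(ctx->r25)(rdram, ctx)` call form is an assumption: the table layout, `register_func`, `lookup_func`, and the linear search are illustrative choices, not the API of any existing runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context and recompiled-function signature. */
typedef struct { uint64_t r2, r25; } recomp_context;
typedef void recomp_func_t(uint8_t *rdram, recomp_context *ctx);

typedef struct {
    uint32_t vram;        /* address the function is currently loaded at */
    recomp_func_t *func;  /* recompiled C function for that address */
} func_entry;

#define MAX_FUNCS 64
static func_entry func_table[MAX_FUNCS];
static size_t func_count = 0;

/* Called (hypothetically) when a section is loaded. Returns 0 on success. */
static int register_func(uint32_t vram, recomp_func_t *func) {
    assert(func != NULL);
    if (func_count >= MAX_FUNCS) return -1;
    func_table[func_count].vram = vram;
    func_table[func_count].func = func;
    func_count++;
    return 0;
}

/* Linear search for brevity; returns NULL when nothing is loaded there. */
static recomp_func_t *lookup_func(uint32_t vram) {
    for (size_t i = 0; i < func_count; i++) {
        if (func_table[i].vram == vram) return func_table[i].func;
    }
    return NULL;
}

/* Registers hold sign-extended 64-bit values, so truncate to 32 bits first. */
#define LOOKUP_FUNC(val) lookup_func((uint32_t)(val))

/* Stand-in for a recompiled function. */
static void func_80001000(uint8_t *rdram, recomp_context *ctx) {
    (void)rdram;
    ctx->r2 = 42;
}

/* What `jalr $25` would become. */
static void call_through_r25(uint8_t *rdram, recomp_context *ctx) {
    LOOKUP_FUNC(ctx->r25)(rdram, ctx);
}
```

A real runtime would likely use a hash map or a sorted table per loaded section and would report an error instead of returning NULL for an unknown address.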
For relocatable overlays, the tool will modify supported instructions possessing relocation data (lui, addiu, load and store instructions) by emitting an extra macro that enables the runtime to relocate the instruction's immediate value field. For example, the instruction lui $24, 0x80C0 in a section beginning at address 0x80BFA100 with a relocation against a symbol with an address of 0x80BFA730 will get recompiled as ctx->r24 = S32(RELOC_HI16(1754, 0X630) << 16);, where 1754 is the index of this section. The runtime can then implement the RELOC_HI16 and RELOC_LO16 macros in order to handle modifying the immediate based on the current loaded address of the section.
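The numbers in that example can be checked with a small sketch of the two macros. In the example, 0X630 is the symbol's offset within its section (0x80BFA730 - 0x80BFA100). The `section_addresses` table and the helper functions are assumptions for illustration, and the S32 definition is likewise assumed; only the macro names and the call form come from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SECTIONS 2048
/* Hypothetical table: current load address of each section, filled in on load. */
static uint32_t section_addresses[NUM_SECTIONS];

/* High half of the relocated address. Adding 0x8000 compensates for the
 * sign-extension of the low half when the two halves are added back together. */
static uint32_t reloc_hi16(uint32_t section, uint32_t offset) {
    assert(section < NUM_SECTIONS);
    uint32_t addr = section_addresses[section] + offset;
    return ((addr + 0x8000u) >> 16) & 0xFFFFu;
}

/* Low half of the relocated address as a sign-extended 16-bit immediate. */
static int32_t reloc_lo16(uint32_t section, uint32_t offset) {
    assert(section < NUM_SECTIONS);
    uint32_t addr = section_addresses[section] + offset;
    return (int16_t)(addr & 0xFFFFu);
}

#define RELOC_HI16(section, offset) reloc_hi16((section), (offset))
#define RELOC_LO16(section, offset) reloc_lo16((section), (offset))
#define S32(val) ((int64_t)(int32_t)(val))   /* assumed definition */

/* The statement from the text, wrapped in a function returning the new $24. */
static uint64_t load_r24(void) {
    return (uint64_t)S32(RELOC_HI16(1754, 0X630) << 16);
}
```

With section 1754 loaded at its original address 0x80BFA100, RELOC_HI16 yields 0x80C0, the immediate of the original lui. If the section is loaded elsewhere, the same call yields the high half of the new address without the function being recompiled.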
Support for relocations for TLB mapping is coming in the future, which will add the ability to provide a list of MIPS32 relocations so that the runtime can relocate them on load. Combining this with the functionality used for relocatable overlays should allow running most TLB mapped code without incurring a performance penalty on every RAM access.
How to Use
The recompiler is configured with a toml file, whose path is the only argument passed to the recompiler. The toml is where you specify input and output file paths, as well as optionally stub out specific functions, skip recompilation of specific functions, and patch single instructions in the target binary. There is also planned functionality to emit hooks in the recompiler output by adding them to the toml (the [[patches.func]] and [[patches.hook]] sections of the linked toml below), but this is currently unimplemented. Documentation on every option that the recompiler provides is not currently available, but an example toml can be found in the Zelda 64: Recompiled project here.
Currently, the only way to provide the required metadata is by passing an elf file to this tool. The easiest way to get such an elf is to set up a disassembly or decompilation of the target binary, but there will be support for providing the metadata via a custom format to bypass the need to do so in the future.
Single File Output Mode (for Patches)
This tool can also be configured to recompile in “single file output” mode via an option in the configuration toml. This will emit all of the functions in the provided elf into a single output file. The purpose of this mode is to be able to compile patched versions of functions from the target binary.
This mode can be combined with the functionality provided by almost all linkers (ld, lld, MSVC's link.exe, etc.) to replace functions from the original recompiler output with modified versions. Those linkers only look for symbols in a static library if they weren't already found in a previous input file, so providing the recompiled patches to the linker before providing the original recompiler output will result in the patches taking priority over functions with the same names from the original recompiler output.
This saves a tremendous amount of time while iterating on patches for the target binary, as you can bypass rerunning the recompiler on the target binary as well as compiling the original recompiler output. An example of using this single file output mode for that purpose can be found in the Zelda 64: Recompiled project here, with the corresponding Makefile that gets used to build the elf for those patches here.
RSP Microcode Support
RSP microcode can also be recompiled with this tool. Currently there is no support for recompiling RSP overlays, but it may be added in the future if desired. Documentation on how to use this functionality will be coming soon.
Planned Features
- Custom metadata format to provide symbol names, relocations, and any other necessary data in order to operate without an elf
- Emitting multiple functions per output file to speed up compilation
- Support for recording MIPS32 relocations to allow runtimes to relocate them for TLB mapping
- Ability to recompile into a dynamic language (such as Lua) to be able to load code at runtime for mod support
Building
This project can be built with CMake 3.20 or above and a C++ compiler that supports C++20. This repo uses git submodules, so be sure to clone recursively (git clone --recurse-submodules) or initialize submodules recursively after cloning (git submodule update --init --recursive). From there, building is identical to any other cmake project, e.g. run cmake in the target build folder and point it at the root of this repo, then run cmake --build . from that target folder.
Libraries Used
- rabbitizer for instruction decoding/analysis
- ELFIO for elf parsing
- toml11 for toml parsing
- fmtlib
Original
N64: Recompiled
N64: Recompiled is a tool to statically recompile N64 binaries into C code that can be compiled for any platform. This can be used for ports or tools as well as for simulating behaviors significantly faster than interpreters or dynamic recompilation can. More widely, it can be used in any context where you want to run some part of an N64 binary in a standalone environment.
This is not the first project that uses static recompilation on game console binaries. A well-known example is jamulator, which targets NES binaries. Additionally, this is not even the first project to apply static recompilation to N64-related projects: the IDO static recompilation recompiles the SGI IRIX IDO compiler on modern systems to facilitate matching decompilation of N64 games. This project works similarly to the IDO static recomp project in some ways, and that project was my main inspiration for making this.
How it Works
The recompiler works by accepting a list of symbols and metadata alongside the binary with the goal of splitting the input binary into functions that are each individually recompiled into a C function, named according to the metadata.
Instructions are processed one-by-one and corresponding C code is emitted as each one gets processed. This translation is very literal in order to keep complexity low. For example, the instruction addiu $r4, $r4, 0x20, which adds 0x20 to the 32-bit value in the low bytes of register $r4 and stores the sign-extended 64-bit result in $r4, gets recompiled into: ctx->r4 = ADD32(ctx->r4, 0X20); The jal (jump-and-link) instruction is recompiled directly into a function call, and j or b instructions (unconditional jumps and branches) that can be identified as tail-call optimizations are recompiled into function calls as well. Branch delay slots are handled by duplicating instructions as necessary. There are other specific behaviors for certain instructions, such as the recompiler attempting to turn a jr instruction into a switch-case statement if it can tell that it's being used with a jump table. The recompiler has mostly been tested on binaries built with old MIPS compilers (e.g. mips gcc 2.7.2 and IDO) as well as modern clang targeting mips. Modern mips gcc may trip up the recompiler due to certain optimizations it can do, but those cases can probably be avoided by setting specific compilation flags.
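To make the jump-table case concrete, here is a hand-written sketch of what a jr through a three-entry jump table could look like after being turned into a switch-case. It is not real recompiler output: the context struct, the label names, the use of $r4 as the table index, and the `jump_table_errors` counter are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register context holding only the registers used here. */
typedef struct {
    uint64_t r2, r4;
} recomp_context;

/* Hypothetical counter for indices that do not match any table entry. */
static int jump_table_errors = 0;

/*
 * Original MIPS (illustrative): the index in $r4 is scaled by 4, a target
 * address is loaded from the jump table, and `jr` jumps to it.
 */
static void func_with_jump_table(recomp_context *ctx) {
    assert(ctx != NULL);
    switch ((uint32_t)ctx->r4) {
        case 0: goto L_case_0;
        case 1: goto L_case_1;
        case 2: goto L_case_2;
        default:
            /* Not a known jump table entry. */
            jump_table_errors++;
            return;
    }
L_case_0:
    ctx->r2 = 10;
    return;
L_case_1:
    ctx->r2 = 20;
    return;
L_case_2:
    ctx->r2 = 30;
    return;
}
```

Each case jumps to a label inside the same C function, so the indirect jump stays local and does not need a runtime function lookup.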
Every output function created by the recompiler is currently emitted into its own file. An option may be provided in the future to group functions together into output files, which should help improve build times of the recompiler output by reducing file I/O in the build process.
Recompiler output can be compiled with any C compiler (tested with msvc, gcc and clang). The output is expected to be used with a runtime that can provide the necessary functionality and macro implementations to run it. A runtime is provided in N64ModernRuntime which can be seen in action in the Zelda 64: Recompiled project.
Overlays
Statically linked and relocatable overlays can both be handled by this tool. In both cases, the tool emits function lookups for jump-and-link-register instructions (i.e. function pointers or virtual functions), which the provided runtime can implement using any sort of lookup table. For example, the instruction jalr $25 would get recompiled as LOOKUP_FUNC(ctx->r25)(rdram, ctx); The runtime can then maintain a list of which program sections are loaded and at what addresses in order to determine which function to run whenever a lookup is triggered during runtime.
For relocatable overlays, the tool will modify supported instructions possessing relocation data (lui, addiu, load and store instructions) by emitting an extra macro that enables the runtime to relocate the instruction's immediate value field. For example, the instruction lui $24, 0x80C0 in a section beginning at address 0x80BFA100 with a relocation against a symbol with an address of 0x80BFA730 will get recompiled as ctx->r24 = S32(RELOC_HI16(1754, 0X630) << 16);, where 1754 is the index of this section. The runtime can then implement the RELOC_HI16 and RELOC_LO16 macros in order to handle modifying the immediate based on the current loaded address of the section.
Support for relocations for TLB mapping is coming in the future, which will add the ability to provide a list of MIPS32 relocations so that the runtime can relocate them on load. Combining this with the functionality used for relocatable overlays should allow running most TLB mapped code without incurring a performance penalty on every RAM access.
How to Use
The recompiler is configured with a toml file, whose path is the only argument passed to the recompiler. The toml is where you specify input and output file paths, as well as optionally stub out specific functions, skip recompilation of specific functions, and patch single instructions in the target binary. There is also planned functionality to emit hooks in the recompiler output by adding them to the toml (the [[patches.func]] and [[patches.hook]] sections of the linked toml below), but this is currently unimplemented. Documentation on every option that the recompiler provides is not currently available, but an example toml can be found in the Zelda 64: Recompiled project here.
Currently, the only way to provide the required metadata is by passing an elf file to this tool. The easiest way to get such an elf is to set up a disassembly or decompilation of the target binary, but there will be support for providing the metadata via a custom format to bypass the need to do so in the future.
Single File Output Mode (for Patches)
This tool can also be configured to recompile in “single file output” mode via an option in the configuration toml. This will emit all of the functions in the provided elf into a single output file. The purpose of this mode is to be able to compile patched versions of functions from the target binary.
This mode can be combined with the functionality provided by almost all linkers (ld, lld, MSVC's link.exe, etc.) to replace functions from the original recompiler output with modified versions. Those linkers only look for symbols in a static library if they weren't already found in a previous input file, so providing the recompiled patches to the linker before providing the original recompiler output will result in the patches taking priority over functions with the same names from the original recompiler output.
This saves a tremendous amount of time while iterating on patches for the target binary, as you can bypass rerunning the recompiler on the target binary as well as compiling the original recompiler output. An example of using this single file output mode for that purpose can be found in the Zelda 64: Recompiled project here, with the corresponding Makefile that gets used to build the elf for those patches here.
RSP Microcode Support
RSP microcode can also be recompiled with this tool. Currently there is no support for recompiling RSP overlays, but it may be added in the future if desired. Documentation on how to use this functionality will be coming soon.
Planned Features
- Custom metadata format to provide symbol names, relocations, and any other necessary data in order to operate without an elf
- Emitting multiple functions per output file to speed up compilation
- Support for recording MIPS32 relocations to allow runtimes to relocate them for TLB mapping
- Ability to recompile into a dynamic language (such as Lua) to be able to load code at runtime for mod support
Building
This project can be built with CMake 3.20 or above and a C++ compiler that supports C++20. This repo uses git submodules, so be sure to clone recursively (git clone --recurse-submodules) or initialize submodules recursively after cloning (git submodule update --init --recursive). From there, building is identical to any other cmake project, e.g. run cmake in the target build folder and point it at the root of this repo, then run cmake --build . from that target folder.
Libraries Used
- rabbitizer for instruction decoding/analysis
- ELFIO for elf parsing
- toml11 for toml parsing
- fmtlib