Mikage Progress Report: January + February 2020

We have another update in store, this time with major optimization work going on and of course the obligatory compatibility improvements. Let's dive right in!

AArch64 JIT for super-fast CPU emulation

Every emulator project has to start somewhere, and in terms of CPU emulation that's usually an interpreter core: Easy to get early results from, with little complexity, and adaptable when surprising hardware behavior is discovered. The big weakness of interpreters is performance: Emulating every 3DS CPU instruction with a C++ function call flat out isn't going to run at great speeds.
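
To illustrate the idea, here is a minimal sketch of an interpreter core for a made-up three-instruction machine. It is not Mikage's code: the instruction format, `CpuState`, `Encode`, and the handler names are all assumptions chosen for brevity. The point is just the shape of the loop: fetch an instruction, decode it, and call one C++ function to execute it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy CPU state; a real ARM core has far more (flags, modes, ...).
struct CpuState {
    std::uint32_t regs[4] = {};
    std::size_t pc = 0;  // index into the program, not a byte address
    bool halted = false;
};

// Toy encoding (NOT ARM): opcode in bits 31-24, register in bits 23-16,
// 16-bit immediate in bits 15-0.
enum Opcode : std::uint8_t { OpLoadImm = 0, OpAddImm = 1, OpHalt = 2 };

constexpr std::uint32_t Encode(Opcode op, std::uint8_t rd, std::uint16_t imm) {
    return (static_cast<std::uint32_t>(op) << 24) |
           (static_cast<std::uint32_t>(rd) << 16) | imm;
}

// One C++ function per emulated instruction.
void HandleLoadImm(CpuState& cpu, std::uint32_t instr) {
    cpu.regs[(instr >> 16) & 3] = instr & 0xFFFF;
}
void HandleAddImm(CpuState& cpu, std::uint32_t instr) {
    cpu.regs[(instr >> 16) & 3] += instr & 0xFFFF;
}
void HandleHalt(CpuState& cpu, std::uint32_t /*instr*/) {
    cpu.halted = true;
}

// Fetch, decode, execute, repeat until the program halts or runs out.
void Run(CpuState& cpu, const std::vector<std::uint32_t>& program) {
    while (!cpu.halted && cpu.pc < program.size()) {
        const std::uint32_t instr = program[cpu.pc++];
        switch (instr >> 24) {
        case OpLoadImm: HandleLoadImm(cpu, instr); break;
        case OpAddImm:  HandleAddImm(cpu, instr);  break;
        case OpHalt:    HandleHalt(cpu, instr);    break;
        default:        assert(false && "unknown opcode"); break;
        }
    }
}
```

Running the program `{Encode(OpLoadImm, 0, 40), Encode(OpAddImm, 0, 2), Encode(OpHalt, 0, 0)}` leaves 42 in `regs[0]`. Every emulated instruction pays for a fetch, a branch, and a function call, and that per-instruction overhead is what makes interpreters slow.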

A more performant solution is called Just-In-Time binary translation (commonly abbreviated to JIT in emulation circles), where you disassemble and recompile functions executed by an emulated game to equivalent binary code that can be executed on the target device (i.e. your phone). This method of CPU emulation can provide massive speedups (up to 10x) depending on the situation, so logically I assigned high priority to the JIT on Mikage's roadmap.

That said, writing a JIT is a complex undertaking, certainly not something you can pull off in a couple of days and immediately reap the rewards of. It takes dedicated experimentation and research over several weeks, and then there are still several components that need to be written independently. I mentioned my early tinkering with a JIT back when the project was announced, and I had been working on it quietly until it was ready for a general showcase.

And finally, it's here: The first JIT-enabled Mikage release shipped in February! The translation process doesn't cover the entire ARM instruction set yet, so it will be another couple of weeks until all commercial games benefit from the JIT. But the potential is very visible in yeti3DS and Retro City Rampage DX - the former getting a mind-blowing 650% framerate improvement! See for yourself:

yeti3DS runs at 7.5 times the interpreter's framerate now!

Of course, this JIT is just the basis to build on, and it will be refined and extended over the coming weeks to make sure the speedups translate to all other titles as well.

If you're curious about the inner workings of this new JIT, I shared a more technical overview in Development Update 9.

Interpreter speedups

In preparing the JIT, I had to rework the interpreter's instruction dispatcher so it could be used as a fallback by the JIT for instructions it doesn't cover yet. The new dispatcher replaces some ugly logic with a much neater approach that also happens to improve performance a little.

If you've seen my professional work, you'll know I'm a big proponent of applying C++ metaprogramming when a good chance comes about. In this case, decoding ARM instructions by hand was cumbersome and error-prone, and it introduced unnecessary branching that slowed down interpretation. There is now a simple table that describes the various ARM opcodes, and we have the C++ compiler conveniently generate a lookup table from that, which can be used to map a CPU instruction's binary encoding to the interpreter's instruction handler. Very convenient and fast :)

This new dispatch table allows for cleaner and faster CPU emulation
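
As a rough illustration of the technique (not the actual Mikage dispatcher), the sketch below lets the compiler expand a short list of opcode patterns into a 4096-entry lookup table, indexed by bits 27-20 and 7-4 of an ARM instruction. The two encodings are the standard ARM `ADD` and `MOV` immediate forms without flag updates; everything else (`CpuState`, `Pattern`, `kPatterns`, the handler names, the `undefined_count` field) is an assumption made for this example, and condition codes and PC special cases are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical CPU state for this sketch.
struct CpuState {
    std::uint32_t regs[16] = {};
    unsigned undefined_count = 0;
};
using Handler = void (*)(CpuState&, std::uint32_t);

// ARM immediate operand: 8-bit value rotated right by twice the 4-bit rotate field.
std::uint32_t DecodeImmediate(std::uint32_t instr) {
    const std::uint32_t imm8 = instr & 0xFF;
    const unsigned rot = ((instr >> 8) & 0xF) * 2;
    return rot == 0 ? imm8 : ((imm8 >> rot) | (imm8 << (32 - rot)));
}

void HandleAddImm(CpuState& cpu, std::uint32_t instr) {
    cpu.regs[(instr >> 12) & 0xF] = cpu.regs[(instr >> 16) & 0xF] + DecodeImmediate(instr);
}
void HandleMovImm(CpuState& cpu, std::uint32_t instr) {
    cpu.regs[(instr >> 12) & 0xF] = DecodeImmediate(instr);
}
void HandleUndefined(CpuState& cpu, std::uint32_t /*instr*/) {
    ++cpu.undefined_count;
}

// Human-readable opcode description: "index & mask == value" selects the handler.
struct Pattern {
    std::uint32_t mask;
    std::uint32_t value;
    Handler handler;
};
constexpr Pattern kPatterns[] = {
    {0xFF0, 0x280, HandleAddImm},  // ADD Rd, Rn, #imm (bits 7-4 belong to the immediate)
    {0xFF0, 0x3A0, HandleMovImm},  // MOV Rd, #imm
};

struct DispatchTable {
    Handler entries[4096];
};

// Evaluated by the compiler: expands the pattern list into a flat table.
constexpr DispatchTable BuildDispatchTable() {
    DispatchTable table{};
    for (std::size_t index = 0; index < 4096; ++index) {
        table.entries[index] = HandleUndefined;
        for (const Pattern& pattern : kPatterns) {
            if ((index & pattern.mask) == pattern.value) {
                table.entries[index] = pattern.handler;
                break;
            }
        }
    }
    return table;
}
constexpr DispatchTable kDispatchTable = BuildDispatchTable();

// 12-bit table index: instruction bits 27-20 followed by bits 7-4.
constexpr std::size_t TableIndex(std::uint32_t instr) {
    return ((instr >> 16) & 0xFF0) | ((instr >> 4) & 0xF);
}

// Decoding is now a single indexed load plus an indirect call.
void Dispatch(CpuState& cpu, std::uint32_t instr) {
    const Handler handler = kDispatchTable.entries[TableIndex(instr)];
    assert(handler != nullptr);
    handler(cpu, instr);
}
```

For example, dispatching `0xE3A01025` (`mov r1, #37`) and then `0xE2810005` (`add r0, r1, #5`) leaves 42 in `r0`. Adding an instruction means adding one line to the pattern list; the branching lives in the table generator, which runs at compile time rather than on every emulated instruction.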

New challenger approaching: Steel Diver: Sub Wars

Steel Diver: Sub Wars now reaches the title screen! Other than many small fixes, I uncovered two major bugs:

  • When emulating IPC requests, Mikage considered it an error when a game provided a zero-sized buffer. After all, if you're asking the system to do work, why would you request no data to be returned? Well, my educated guess is that it was just the easier thing to do in the game's source code, so the programmers opted for leaving these seemingly pointless requests in. It took some creativity to find a clean way of supporting zero-sized buffers, but affected games no longer error out.
  • During startup, Steel Diver requests a memory transfer ("DMA") from a seemingly unknown memory address. It turns out this was a bug in the GSP emulation, where DMAs were assumed to refer to memory with process-agnostic virtual-to-physical address mapping. By referring to the game's own memory space instead, the input memory address is recognized properly and hence the DMA runs through just fine now. Ironically, Citra got this wrong too: They implemented what was easiest (and correct, in the common case) and eventually concluded it was incorrect behavior, leaving a note in the source code that suggests moving to the wrong behavior in the future.
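
The address-space fix in the second bullet can be pictured with a small sketch. Everything here is hypothetical: `Process`, `Map`, `Translate`, and the example addresses are invented for illustration and are not Mikage's or the 3DS system software's real interfaces. It only shows why a DMA source address has to be resolved through the requesting process's own mappings rather than through a single global one.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>

// Hypothetical description of one contiguous virtual-to-physical mapping.
struct Mapping {
    std::uint32_t paddr;
    std::uint32_t size;
};

// Hypothetical per-process address space (overlap checks omitted for brevity).
struct Process {
    std::map<std::uint32_t, Mapping> mappings;  // keyed by virtual start address

    void Map(std::uint32_t vaddr, std::uint32_t paddr, std::uint32_t size) {
        assert(size != 0);
        mappings[vaddr] = Mapping{paddr, size};
    }

    // Resolves a virtual address using THIS process's mappings only.
    std::optional<std::uint32_t> Translate(std::uint32_t vaddr) const {
        auto it = mappings.upper_bound(vaddr);  // first mapping starting after vaddr
        if (it == mappings.begin()) {
            return std::nullopt;                // nothing starts at or before vaddr
        }
        --it;
        const std::uint32_t offset = vaddr - it->first;
        if (offset >= it->second.size) {
            return std::nullopt;                // vaddr lies past the end of the mapping
        }
        return it->second.paddr + offset;
    }
};

// A DMA request is resolved against the process that issued it.
std::optional<std::uint32_t> ResolveDmaSource(const Process& requester, std::uint32_t vaddr) {
    return requester.Translate(vaddr);
}
```

With two processes that map the same virtual address to different physical memory, `ResolveDmaSource` returns a different physical address for each of them, and an address that only one process has mapped is "unknown" when looked up anywhere else.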

As you can guess, these issues by no means only affected Steel Diver. I'll have more news in the next Progress Report about other titles :)

Better graphics in Zelda OoT3D & others

The Legend of Zelda: Ocarina of Time 3D and Nano Assault EX got a nice update to their visuals: You might have noticed that various objects looked a bit dull or boring compared to how they look on hardware. Indeed, this was because Mikage didn't implement a feature called texture combiner scalers!
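
For readers unfamiliar with the feature, here is a minimal sketch of what such a scaler does, assuming (as I understand the 3DS GPU) that each combiner stage can multiply its color and alpha outputs by 1, 2, or 4 and then clamp the result. The `Color` type and the function names are made up for this example and are not taken from Mikage.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

using Color = std::array<std::uint8_t, 4>;  // R, G, B, A

// Multiplies one 8-bit channel and clamps to the representable range.
std::uint8_t ScaleAndClamp(std::uint8_t value, unsigned multiplier) {
    return static_cast<std::uint8_t>(std::min(value * multiplier, 255u));
}

// Applies separate multipliers to the color channels and the alpha channel
// of a combiner stage's output (assumed valid multipliers: 1, 2, 4).
Color ApplyCombinerScale(const Color& combined, unsigned color_multiplier,
                         unsigned alpha_multiplier) {
    assert(color_multiplier == 1 || color_multiplier == 2 || color_multiplier == 4);
    assert(alpha_multiplier == 1 || alpha_multiplier == 2 || alpha_multiplier == 4);
    return Color{{ScaleAndClamp(combined[0], color_multiplier),
                  ScaleAndClamp(combined[1], color_multiplier),
                  ScaleAndClamp(combined[2], color_multiplier),
                  ScaleAndClamp(combined[3], alpha_multiplier)}};
}
```

With a color multiplier of 2 and an alpha multiplier of 1, the input `{100, 60, 200, 128}` becomes `{200, 120, 255, 128}`. An emulator that skips this step effectively always uses a multiplier of 1, which is why affected objects come out darker than on hardware.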

Adding support for this feature luckily wasn't too difficult and the games now look fantastic. See for yourself:

What's next?

There's a big stream of reveals coming up, especially in terms of compatibility! Various new titles are on the verge of being functionally playable in the next version. I'm also super excited to finally announce a feature that actually has been part of Mikage ever since its early days, but for which there had never been a good chance to show it off... until very recently ;)

Meanwhile, the optimization work continues, of course. More refinements to the JIT are in progress, and there are a lot of optimizations in the pipeline for the Vulkan renderer.

That's it for now though - stay tuned for the coming months! Be safe, and remember to wash your hands and stay inside ✌️
