Unicorn2: Breakages, Migration, Future and Community
Introduction
Over the past week, I went through many downstream projects of Unicorn Engine to gather feedback on our recent move. Surprisingly, migrating from v1 or 2.0.x turned out to be harder than I expected. Therefore, as a complement to the release notes, I’m writing this post to explain the possible breakages and how to migrate, what we have done, and the future of Unicorn.
Breakages && Migration
Unfortunately, breakages are inevitable for such a big project, especially given the complex semantics of emulation. Sometimes a “breakage” is really a fix that happens to break a previous workaround.
One common use of Unicorn is to emulate just a few instructions and check the results. Unicorn v1 hacked QEMU's internal implementation to provide a 1:1 mapping from virtual addresses to physical addresses, which is very convenient in most cases. The problem is that this causes incorrect instruction semantics when the MMU is involved. Therefore, any code that expects such a mapping might break.
Fortunately, we have a specific workaround for this, called virtual TLB mode. In short, enable it with the following call, where uc is your engine handle:

    uc_ctl_tlb_mode(uc, UC_TLB_VIRTUAL);

Unicorn Engine will then continue to provide the 1:1 mapping. The function is available in the popular bindings as well. Technically, this skips the MMU implementation by installing trivial TLBs, which is much cleaner than the v1 approach.
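To picture what "installing trivial TLBs" means, here is a toy software TLB in plain C. It is only a sketch of the idea, not Unicorn or QEMU code: the names soft_tlb, tlb_fill_identity and translate are hypothetical, and the direct-mapped layout and 4 KiB page size are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS 12 /* assumed 4 KiB pages */
#define PAGE_MASK (~((UINT64_C(1) << PAGE_BITS) - 1))
#define TLB_SIZE 256

/* Hypothetical TLB entry: maps one virtual page to one physical page. */
typedef struct {
    bool valid;
    uint64_t vpage;
    uint64_t ppage;
} tlb_entry;

typedef struct {
    tlb_entry entries[TLB_SIZE];
    unsigned fills; /* number of misses that had to be filled */
} soft_tlb;

/* Trivial fill: physical page == virtual page, no page-table walk. */
void tlb_fill_identity(soft_tlb *tlb, uint64_t vaddr)
{
    uint64_t vpage = vaddr & PAGE_MASK;
    tlb_entry *e = &tlb->entries[(vpage >> PAGE_BITS) % TLB_SIZE];

    e->valid = true;
    e->vpage = vpage;
    e->ppage = vpage;
    tlb->fills++;
}

/* Translate an address; on a miss, install a trivial entry first. */
uint64_t translate(soft_tlb *tlb, uint64_t vaddr)
{
    assert(tlb != NULL);

    uint64_t vpage = vaddr & PAGE_MASK;
    tlb_entry *e = &tlb->entries[(vpage >> PAGE_BITS) % TLB_SIZE];

    if (!e->valid || e->vpage != vpage) {
        tlb_fill_identity(tlb, vaddr);
    }
    return e->ppage | (vaddr & ~PAGE_MASK);
}
```

With a zero-initialized soft_tlb, translate(&tlb, 0x401234) fills one entry and returns the same address; a second access to the same page hits the cached entry. A real MMU would replace tlb_fill_identity with a page-table walk.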
We also made uc_emu_start return an error if any error raised within memory hooks is left unhandled. Migrating is easy: map the affected regions correctly.
Another notable breakage is how we handle MIPS delay slots and ARM IT blocks. In all previous versions, if Unicorn happened to stop inside an IT block or a delay slot, the branch information was lost, which led to incorrect emulation results. In Unicorn 2, emulation cannot be stopped inside such blocks. This ensures correct semantics at the cost of coarser tracing.
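The effect of "unstoppable" delay slots can be illustrated with a toy run loop. This is a simplified model written for this post, not Unicorn's implementation: toy_insn and run_with_budget are hypothetical names, control flow is not modelled (instructions run in straight-line order), and the instruction budget stands in for any kind of stop request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical decoded instruction of a MIPS-like toy machine. */
typedef struct {
    bool is_branch; /* if set, the next instruction is its delay slot */
} toy_insn;

/*
 * "Execute" instructions from index 0 and try to stop after `budget`
 * instructions. A stop request that lands between a branch and its
 * delay slot is deferred until the slot has run, so the pair is never
 * split. Returns the number of instructions actually executed.
 */
size_t run_with_budget(const toy_insn *code, size_t len, size_t budget)
{
    assert(code != NULL);

    size_t executed = 0;
    bool in_delay_slot = false; /* is code[pc] a delay slot? */

    for (size_t pc = 0; pc < len; pc++) {
        if (executed >= budget && !in_delay_slot) {
            break; /* safe stopping point */
        }
        executed++;                         /* run code[pc] */
        in_delay_slot = code[pc].is_branch; /* next one is its slot */
    }
    return executed;
}
```

For a program whose second instruction is a branch, a budget of 2 executes three instructions in this model, because the delay slot is pulled in with its branch. This is the kind of coarser stopping granularity the trade-off above refers to.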
I will keep updating this section as I come across more.
New Features
We developed many features for Unicorn 2, and this is a good opportunity to summarize them.
164 More Unit Tests & CI
Though not a real feature, more unit tests ensure consistent behavior between releases. Importantly, Unicorn 2 also has CI that tests on all platforms Unicorn supports. This also helps us bump QEMU versions.
Fewer Hacks
We are working on removing Unicorn's dirty hacks to QEMU internals, which helps us bump the QEMU version and speeds up execution.
Performance Gain
Upgrading from Unicorn 1 to Unicorn 2 brings a performance gain of at least 15%, thanks to the QEMU version bump and the cleanup of old Unicorn hacks. This is crucial for use cases like fuzzing.
Copy-on-write Snapshots
Unicorn 2 now supports copy-on-write snapshots! Enable this by:
    uc_ctl_context_mode(uc, UC_CTL_CONTEXT_MEMORY);
In this mode, saved contexts also contain the contents of memory, stored in a copy-on-write fashion. This is implemented by placing overlay memory regions and copying only when necessary. It can significantly increase fuzzing speed.
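For readers unfamiliar with copy-on-write, the following is a minimal sketch of the general technique in plain C. Note the swap: it uses page-level reference counting rather than the overlay memory regions Unicorn uses, and every name in it (page, mem_view, mem_snapshot, and so on) is hypothetical. The page size and page count are arbitrary assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096
#define NUM_PAGES 4

/* A page shared by every view pointing at it; refs counts those views. */
typedef struct {
    uint8_t data[PAGE_SIZE];
    unsigned refs;
} page;

/* Either the live memory or a saved snapshot of it. */
typedef struct {
    page *pages[NUM_PAGES];
} mem_view;

/* Drop this view's references and free pages nobody else uses. */
void mem_release(mem_view *m)
{
    for (size_t i = 0; i < NUM_PAGES; i++) {
        if (m->pages[i] != NULL && --m->pages[i]->refs == 0) {
            free(m->pages[i]);
        }
        m->pages[i] = NULL;
    }
}

/* Give the view zero-filled private pages. Returns 0, or -1 on OOM. */
int mem_init(mem_view *m)
{
    *m = (mem_view){0};
    for (size_t i = 0; i < NUM_PAGES; i++) {
        m->pages[i] = calloc(1, sizeof(page));
        if (m->pages[i] == NULL) {
            mem_release(m);
            return -1;
        }
        m->pages[i]->refs = 1;
    }
    return 0;
}

/* Snapshot: share every page instead of copying it. */
void mem_snapshot(const mem_view *live, mem_view *snap)
{
    for (size_t i = 0; i < NUM_PAGES; i++) {
        snap->pages[i] = live->pages[i];
        snap->pages[i]->refs++;
    }
}

/* Write one byte, copying the page first if another view shares it. */
int mem_write(mem_view *m, size_t addr, uint8_t value)
{
    assert(addr < (size_t)NUM_PAGES * PAGE_SIZE);

    page **slot = &m->pages[addr / PAGE_SIZE];
    if ((*slot)->refs > 1) {
        page *copy = malloc(sizeof(page));
        if (copy == NULL) {
            return -1;
        }
        memcpy(copy->data, (*slot)->data, PAGE_SIZE);
        copy->refs = 1;
        (*slot)->refs--;
        *slot = copy;
    }
    (*slot)->data[addr % PAGE_SIZE] = value;
    return 0;
}

uint8_t mem_read(const mem_view *m, size_t addr)
{
    assert(addr < (size_t)NUM_PAGES * PAGE_SIZE);
    return m->pages[addr / PAGE_SIZE]->data[addr % PAGE_SIZE];
}
```

Taking a snapshot costs only NUM_PAGES pointer copies, and a later write copies just the one page it touches. That is why this style of snapshot suits fuzzing loops, which restore the same state many times but dirty only a small part of memory per run.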
Full System Emulation
Unicorn 2 removes almost all the hacks to the MMU, so full system emulation is finally possible. I have already seen some interesting projects booting real-world operating systems with Unicorn.
Better Distributions
Unicorn 2 can now be distributed on more platforms and architectures, such as Alpine Linux with musl. We are also actively porting Unicorn to more platforms.
Future
So far, Unicorn has been moving toward a stable state by eliminating obvious bugs. I acknowledge and thank all of our users for their efforts and feedback. Here I would like to share my plans for Unicorn Engine:
- Community
I would like to hear your feedback! If you prefer instant feedback, join our Telegram group. For a formal report, please open an issue on GitHub.
- Patch release roughly every 2-3 months, ideally a minor release every year
Please also note that a minor release of Unicorn might contain semantic breakages, because our major version number is only bumped when we bump the QEMU major version (in most cases).
- Catch up with the latest QEMU
We will probably bump to QEMU 5.1.0 in 2.2.0 and then try to keep up with the latest QEMU.
- More target/backend support
We will at least add AVR/RH850 target support and LoongArch backend support. Ideally, we will also backport loongarch64 target support.
- Stability fixes
My top priority is still making Unicorn stable by fixing any security vulnerabilities and segmentation faults.
- Better tracing
In many cases, Unicorn-related issues are hard to diagnose because the emulated code might be complex and private. We will add more support for tracing Unicorn's internal state.
- Fuzzing for semantics
Given that there are already a few emulator prototypes, it would be interesting to fuzz different emulators against each other to find potentially incorrect semantics.
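The last item, fuzzing emulators against each other, is usually called differential fuzzing. A toy version of the idea, assuming nothing about how it would be done for Unicorn: two implementations of an 8-bit ADD (one deliberately wrong) are fed the same pseudo-random inputs and their observable outputs are compared. All names here (add8_out, add8_reference, add8_buggy, diff_fuzz) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Observable result of a toy 8-bit ADD: the sum and the carry flag. */
typedef struct {
    uint8_t result;
    bool carry;
} add8_out;

typedef add8_out (*add8_fn)(uint8_t a, uint8_t b);

/* Reference semantics: carry is set when the sum does not fit in 8 bits. */
add8_out add8_reference(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    add8_out out = { (uint8_t)sum, sum > 0xFFu };
    return out;
}

/* Deliberately wrong: takes the carry out of bit 6 instead of bit 7. */
add8_out add8_buggy(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + b;
    add8_out out = { (uint8_t)sum, ((a & 0x7Fu) + (b & 0x7Fu)) > 0x7Fu };
    return out;
}

/* xorshift32: a small deterministic PRNG so that runs are reproducible. */
static uint32_t next_random(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Run both implementations on the same inputs and count disagreements. */
size_t diff_fuzz(add8_fn lhs, add8_fn rhs, uint32_t seed, size_t iterations)
{
    assert(seed != 0); /* xorshift must not start from zero */

    uint32_t state = seed;
    size_t mismatches = 0;

    for (size_t i = 0; i < iterations; i++) {
        uint32_t r = next_random(&state);
        uint8_t a = (uint8_t)(r & 0xFFu);
        uint8_t b = (uint8_t)((r >> 8) & 0xFFu);
        add8_out x = lhs(a, b);
        add8_out y = rhs(a, b);

        if (x.result != y.result || x.carry != y.carry) {
            mismatches++;
        }
    }
    return mismatches;
}
```

Calling diff_fuzz(add8_reference, add8_buggy, 1, 1000) counts the inputs on which the two disagree; for example, 0x40 + 0x40 sets the carry only in the buggy version. With real emulators, the compared state would be registers and memory after executing the same instruction bytes.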