Unicorn Devblog: setjmp/longjmp on Windows
Introduction
This post explains in more detail why Unicorn needs a wrapper for setjmp on Windows x86_64.
For the corresponding pull request, see this.
Story
The story starts with Qiling Framework. One day, when I ran the Qiling tests on native Windows, the whole Python process exited silently. After some investigation and debugging, I was sure that the crash happened in Unicorn, not Qiling, so I submitted an issue to Unicorn. Recently we wanted to make Qiling run on Windows, so I decided to solve the issue.
What happened?
At first glance, the stack trace shows the crash is in uc_version:
0:000> kv
But that is quite confusing, since the crashing context is a hook callback which never calls uc_version. The stack trace above is not very helpful, so a minimal reproduction is required, as @aquynh also suggested.
Reproduction
The good news is that this bug also shows up in real mode, which I was implementing for Qiling Framework. The crash usually happens after a hook callback has been called multiple times, but the exact moment of the crash differs from run to run. Thus, I write a reproduction snippet in C.
The exact crash point is inside the RtlUnwindEx function, which is part of the setjmp/longjmp implementation. However, this code only reproduces the bug in a Debug build with SEH enabled. If it is built in Release mode or SEH is disabled, no crash happens. At this point, I guess that it is very likely related to some undefined behavior, so I turn to the docs.
MSDN & ctypes hint
After some googling, MSDN gives me a hint.
In Microsoft C++ code on Windows, longjmp uses the same stack-unwinding semantics as exception-handling code. It is safe to use in the same places that C++ exceptions can be raised.
After checking the properties of the VS project, I find that Unicorn does indeed disable exceptions, so calling longjmp is not safe. In addition, the ctypes docs state:
On Windows, ctypes uses win32 structured exception handling to prevent crashes from general protection faults when functions are called with invalid argument values.
Wow! Looks like it’s the root cause, right? Unfortunately, after enabling exceptions for Unicorn, the program still crashes.
I’m really lost in thought… Why is setjmp/longjmp not safe with exceptions disabled? Why are these two library functions tied to a platform-dependent mechanism?
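For reference, here is what the pair is supposed to do according to the C standard. This is a minimal sketch of my own, and the names recovery_point, last_error, fail_deep_inside and run_guarded are made up for illustration (they are not Unicorn code). setjmp records a point to return to, and a later longjmp from a deeper call jumps straight back to it. Nothing in this contract mentions exceptions or unwinding, which is why the MSDN note is so surprising.

```c
#include <setjmp.h>

/* Hypothetical names, for illustration only. */
static jmp_buf recovery_point;  /* where longjmp will land */
static int last_error;          /* error code handed over by the failing code */

/* Somewhere deep in the call chain: give up and jump back. */
static void fail_deep_inside(int code)
{
    last_error = code;
    longjmp(recovery_point, 1); /* does not return */
}

/* Returns 0 on the normal path, or the error code after a longjmp. */
int run_guarded(int should_fail)
{
    if (setjmp(recovery_point) != 0) {
        /* We arrive here a second time, via longjmp. */
        return last_error;
    }
    if (should_fail) {
        fail_deep_inside(42);
    }
    return 0;
}
```

Calling run_guarded(0) returns 0 on the normal path, while run_guarded(1) returns 42 after the jump.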
RtlUnwindEx
The questions above bring me to RtlUnwindEx for an answer. The documentation of the function on MSDN is pretty brief:
Initiates an unwind of procedure call frames.
After some analysis, I find that RtlUnwindEx first walks the previous frames to locate where the exception (in our case, setjmp) is called. So, the question is: how does RtlUnwindEx parse previous frames? As we know, on x86_64 Windows, all functions share the same calling convention, so it’s easy to identify such frames… Wait, Unicorn supports JIT, so what about the generated code?
Bingo! The JIT-ed code doesn’t follow any existing calling convention, which confuses RtlUnwindEx and results in a crash. To confirm this, I write a PoC in the following steps:
- Create a normal DLL which calls setjmp.
- Then use IDA Pro to insert some instructions like the generated code above and call longjmp.
- Write another program that loads the DLL and runs our target function.
As expected, the program crashes in RtlUnwindEx, which proves my guess.
Musl implementation
Since the native implementation is not compatible with JIT, I consider replacing it with another implementation. The musl implementation is an extremely simple one:
/* Copyright 2011-2012 Nicholas J. Kain, licensed under standard MIT license */
After translating the code to Intel syntax, Unicorn works like a charm and the problem seems to be really resolved. However, some comments in QEMU suggest that this bug can be resolved in a more proper way.
In Unicorn, the corresponding QEMU code was patched out because _setjmp(env, NULL) doesn’t exist in MSVC, whereas QEMU is built by mingw cross-compilation. But wait, if QEMU can work with the native longjmp implementation, why not Unicorn?
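To make the QEMU idea concrete, here is a rough sketch written from memory, not a verbatim copy of QEMU's header. It assumes a mingw-w64 toolchain, whose setjmp.h declares _setjmp with a second parameter. The __MINGW32__ guard and the names cpu_loop_env, leave_generated_code and run_once are my own additions for illustration. On any other platform the macro is skipped and the plain setjmp is used.

```c
#include <setjmp.h>
#include <stddef.h>

#if defined(_WIN64) && defined(__MINGW32__)
/* Assumption: mingw-w64 exposes _setjmp(jmp_buf, void *) on 64-bit Windows.
 * Passing NULL as the second argument asks longjmp not to unwind the stack. */
# undef setjmp
# define setjmp(env) _setjmp((env), NULL)
#endif

/* Hypothetical names, for illustration only. */
static jmp_buf cpu_loop_env;

/* Stands in for generated code bailing out to the main loop. */
static void leave_generated_code(void)
{
    longjmp(cpu_loop_env, 1);
}

/* Returns 1 when control came back through longjmp. */
int run_once(void)
{
    if (setjmp(cpu_loop_env) == 0) {
        leave_generated_code();
        return 0; /* not reached */
    }
    return 1;
}
```

The macro only changes which arguments reach _setjmp; the calling code keeps using setjmp and longjmp as usual, which is why this approach is so attractive where the toolchain allows it.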
setjmp/longjmp implementation
Finally, after debugging qemu-2.2.0-win32 for about an hour, I find the reason. Below is the disassembly of longjmp.
Microsoft puts two longjmp implementations in one function, and the first field of jmp_buf (rcx in the figure above) decides which implementation to use. Where does this flag come from?
Look into the setjmp implementation and the answer is quite simple.
Yes, QEMU is right. If the second parameter is NULL (0), longjmp will use the common implementation instead of unwinding the stack with RtlUnwindEx. Okay, that makes sense, but how can we call _setjmp(env, NULL) in MSVC? After reading numerous MSDN docs and Windows SDK headers, my answer is:
NO WAY.
Microsoft writes a function with two parameters but only gives you a signature with one argument. Nice work again, Microsoft. :/
So the only way is to write a wrapper in a standalone assembly file, since Microsoft has also removed support for inline assembly on x64.
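To recap the mechanism, the decision inside longjmp can be modelled in a few lines of C. This is only a toy model of what the disassembly shows, not Microsoft's code: the type model_jmp_buf, the enum values and both functions are hypothetical names, and the real jump buffer stores the saved registers after the first field.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the x64 jump buffer. Only the first field matters for
 * this model; the saved registers that follow it are omitted. */
typedef struct {
    uint64_t frame; /* the value setjmp received as its second argument */
} model_jmp_buf;

typedef enum {
    PATH_PLAIN_RESTORE, /* just restore registers and jump */
    PATH_RTL_UNWIND     /* walk the frames with RtlUnwindEx */
} longjmp_path;

/* Models setjmp: remember the second argument in the first field. */
void model_setjmp(model_jmp_buf *buf, void *frame)
{
    buf->frame = (uint64_t)(uintptr_t)frame;
}

/* Models the branch at the top of longjmp. */
longjmp_path model_longjmp_path(const model_jmp_buf *buf)
{
    return buf->frame == 0 ? PATH_PLAIN_RESTORE : PATH_RTL_UNWIND;
}
```

With a NULL second argument the model picks PATH_PLAIN_RESTORE, which is the path that is safe across generated code. With a real frame address it picks PATH_RTL_UNWIND, the path that crashes. The assembly wrapper exists only to force the first case from MSVC-built code.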
Conclusion
I HATE WIN32.