Various fixes and improvements for Windows emulation #1552
|
Some more information, the |
|
Regarding |
|
I would like to elaborate on my previous comments. After doing some more investigation, I can provide very detailed information about the problems that were occurring previously, and why these changes are beneficial. First, the An
Inside We can see that Previously, without the The ntdll implementation of With a correct implementation of This also means that the stub hooks for For msvcp140.dll, the situation with However, with fixed Note that And |
|
Update: I have since built further on this work and added some preliminary support for user-mode C++ runtime exceptions on 64-bit Windows. I have turned this PR into a draft for now and will update it with more changes soon. |
|
There is an issue currently where C++ exceptions are still not quite working because of the RaiseException hook. If the RaiseException hook is removed then C++ exceptions are emulated correctly, however I am pretty sure this is incompatible with the previous code for setting unhandled exception filters, which I did not want to break. Likewise, typeid and dynamic_cast are still not 100% there, but they're very close to working properly, I think these can be fixed with maybe another little hook or two. I will try to figure these out later. |
|
After the latest changes, some of the tests were failing on x86. It turned out that this was because user32 DllMain was now running further than before, since some APIs are now functioning properly, and it then crashed in a different manner that made the tests fail. I added user32 to the DllMain blacklist because it was expected to fail in the tests before anyway, and allowing it to run at all was causing more issues than it was worth fixing. |
|
To resolve the conflict with the existing unhandled exception filter code, a new hook for The result of this work is now that C++ exceptions and try.. catch are functioning more or less correctly. |
elicn
left a comment
This is a great contribution, thank you!
Really appreciating the time you took to comment and clarify what the code is doing and why.
Please see my comments and questions.
|
Would you be able to create test cases for the exception handling? |
I've been working on some test cases, I will add them soon :) Thanks for looking it over, I will incorporate your feedback and try to answer your questions. |
|
Just note |
Tagging a hook as a Maybe worth documenting that is your desired behavior, so it won't fail when the code changes in the future. |
Thanks! We'll need to author test cases for them. |
|
|
Hi @sakura57, thanks for the contribution and welcome to Qiling. This looks amazing. It's a huge PR and I might need some time to take a look. For now I will go ahead and approve the PR in rootfs and enable the CI. |
Thanks for the reply, I am looking forward to your feedback. :) I think I can see why the tests are failing also. There is a problem related to native |
|
After some changes yesterday, wannacry is working properly, and al-khaser is starting. I know why al-khaser is failing; it won't be a big problem to fix. There is an issue with the clipboard test, though. The clipboard test binary was compiled in Debug mode and uses debug versions of the C/C++ runtime DLLs. These include a lot of additional error checking. For example, there are debug versions of all the memory functions which allocate a little extra space at the beginning and end of every memory buffer and fill it with special values. If any changes to those values are detected, the heap is assumed to be corrupted, and the CRT terminates with assertion failures. This is the main reason why the clipboard test is not working right now. I think a solution could be to allow all the native CRT memory functions to run. These ultimately depend on the kernelbase Heap* functions. As long as those implementations are robust, all the error checking logic should work properly in theory. The test is still failing after I tried these changes, though. I see two possibilities: either the CRT is doing its job correctly and this is revealing a real buffer overrun somewhere in the Qiling hooks, or somewhere Qiling is passing allocated memory to the program which the CRT debug libs expect to have the debug header/footer, but it doesn't. That said, I do think it is slightly strange to include a debug build in the test suites, since debug builds are not suitable for production, even for malware. Not everyone is guaranteed to have the -d versions of the CRT DLLs on their system; they are usually included with Visual Studio. This is another reason the test could ultimately be failing: debug binaries should really be run with the exact same builds of the debug libs they were linked against. @xwings was it intentional to include a debug binary in the tests? |
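The guard-byte check described in this comment can be illustrated with a small, self-contained Python sketch. This is not the CRT's code and not Qiling's heap manager: the `GuardedHeap` class and its methods are hypothetical names made up for illustration. The 0xFD fill value is the one the MSVC debug heap is documented to use for its guard regions; the 4-byte guard size and the omission of the real block header are simplifying assumptions.

```python
NO_MANS_LAND = 0xFD  # fill byte used for guard regions (MSVC debug heap value)
GUARD_SIZE = 4       # assumed number of guard bytes on each side of the data


class GuardedHeap:
    """Toy bump allocator that brackets every block with guard bytes."""

    def __init__(self, size=0x1000):
        self.mem = bytearray(size)
        self.next_free = 0
        self.blocks = {}  # user pointer -> requested size

    def alloc(self, size):
        start = self.next_free
        total = size + 2 * GUARD_SIZE
        if start + total > len(self.mem):
            raise MemoryError("toy heap exhausted")
        guard = bytes([NO_MANS_LAND]) * GUARD_SIZE
        user = start + GUARD_SIZE
        self.mem[start:user] = guard                             # leading guard
        self.mem[user + size:user + size + GUARD_SIZE] = guard   # trailing guard
        self.next_free = start + total
        self.blocks[user] = size
        return user

    def write(self, addr, data):
        # Deliberately unchecked, like a raw memcpy performed by a buggy hook.
        self.mem[addr:addr + len(data)] = data

    def check(self):
        """Return the user pointers whose guard bytes were modified."""
        guard = bytes([NO_MANS_LAND]) * GUARD_SIZE
        corrupted = []
        for user, size in self.blocks.items():
            head = bytes(self.mem[user - GUARD_SIZE:user])
            tail = bytes(self.mem[user + size:user + size + GUARD_SIZE])
            if head != guard or tail != guard:
                corrupted.append(user)
        return corrupted


heap = GuardedHeap()
buf = heap.alloc(8)
heap.write(buf, b"A" * 8)   # fits exactly: guards stay intact
in_bounds = heap.check()
heap.write(buf, b"B" * 9)   # one byte too many: trailing guard is hit
overrun = heap.check()
```

The point of the sketch is that a hook which writes even one byte past the buffer it was given is invisible to a release CRT but is caught by the debug CRT the next time it validates the block. It also shows why memory handed to the program without these guards would be flagged as corrupt.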
|
At the end of the day this is an emulation framework, so we strive to emulate as much as we can rather than simulate. If you notice a working API implementation, we can safely remove the hook -- or better, make it a pass-through stub. As for the debug versions, we need to support them too. There is no good reason for us to exclude them. We can try to work around problematic APIs, or admit we cannot guarantee a flawless emulation, but we need to support it. |
|
To ease your debugging work, be sure to use the I am attaching here a small script that helped me debug al-khaser back then, by controlling the trace output. File extension changed to allow upload. |
|
My feeling was correct, the CRT assertions were detecting a real buffer overrun. It turned out to be from the LCMapString* functions, this has been addressed in a3bccf5. I have cleaned up much of the code for msvcrt, keeping in mind to make hooks passthru when native APIs can be used instead of removing them, and allowing the native CRT memory functions to run. This is important for the debug CRT and its additional heap error checking to work correctly. With these changes the tests are passing again, on my end at least. For al-khaser, I reverted some of the original hook for |
|
@elicn are we good to merge ? |
|
@sakura57, is this PR ready for merge, or are you still working on it? |
@elicn Ready, I think it's in a good state right now. It has been getting some "battle testing" over the past few weeks as I'm using this branch for my particular use case, and I haven't noted any problems since the last fixes. |
|
@xwings we are good to go. |
Checklist
Which kind of PR do you create?
Coding convention?
Extra tests?
Changelog?
Target branch?
One last thing
Summary
In this PR, the following is addressed for Windows emulation on x86 and x86_64:
typeid and dynamic_cast.
InterlockedPushEntrySList, etc) are now functioning properly.
RtlAllocateHeap and HeapAlloc will trigger the same Qiling hook.
The following is also addressed, on x86_64 only:
Details
(Original 24 March)
Previously, Qiling hooked some functions in msvcrt.dll (_initterm, _initterm_e and __acrt_iob_func) so that they would return right away. This worked fine for programs using the C runtime library, as the startup code was not necessary when running in emulation. However, this caused some problems for C++ programs, namely programs using C++ runtime DLLs, for example msvcp140.dll. During my testing, a program printing to std::cout or std::cerr using operator<< was unable to print to the terminal. This was because some global variables which are necessary for the standard streams to function properly were left uninitialized, as the CRT startup code did not fully complete due to the hooks.
Leaving these functions unhooked, and implementing HeapReAlloc and _realloc_base, allowed my test C++ program to run without crashing. Qiling loads the C++ runtime library without crashing, and the program is able to print using the standard input/output streams. However, removing the CRT startup hooks breaks msvcrt.dll DllMain, which crashes when a program that uses msvcrt loads the module.
After some analysis, I have found that the error during CRT startup in programs using msvcrt.dll, when the Qiling CRT startup hooks are removed, is due to a part of the C runtime startup code which obtains an invalid pointer from RtlPcToFileHeader.
After implementing a hook for RtlPcToFileHeader in Qiling, the issue is resolved. Both msvcrt and msvcp140 DllMain load without errors, and my test C++ program can successfully print to stdout and stderr. C and C++ programs, even those compiled with recent compilers, initialize the CRT successfully and can print to the terminal.
With these additional winapi implementations, the stub hooks for _initterm, _initterm_e and __acrt_iob_func are no longer needed as a stopgap solution, and the library versions of these functions can run without problems.
(Update 27 March)
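As a reference for the RtlPcToFileHeader discussion in the 24 March notes, here is a minimal sketch of the lookup such a hook has to perform: given a program counter, find the loaded image whose address range contains it and return that image's base, or 0 (NULL) when no image matches. It is plain Python and does not use Qiling's loader API. The `LoadedImage` type, the `pc_to_file_header` helper, and all addresses and sizes are hypothetical, chosen only to make the example runnable.

```python
from typing import Iterable, NamedTuple


class LoadedImage(NamedTuple):
    name: str
    base: int  # load address of the image
    size: int  # size of the mapped image in bytes


def pc_to_file_header(images: Iterable[LoadedImage], pc: int) -> int:
    """Return the base of the image containing pc, or 0 if there is none."""
    for image in images:
        if image.base <= pc < image.base + image.size:
            return image.base
    return 0


# Hypothetical module list; a real hook would take this from the loader state.
images = [
    LoadedImage("test.exe", 0x140000000, 0x20000),
    LoadedImage("msvcrt.dll", 0x180000000, 0x9E000),
]

inside_msvcrt = pc_to_file_header(images, 0x180001234)
unmapped = pc_to_file_header(images, 0x7FFF0000)
```

The native function also writes the result through its BaseOfImage out parameter; a hook would need to store the same value at that pointer before returning it.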
Currently, Qiling has limited support for software exceptions. Windows programs using SEH or C++ language features such as try.. catch are not emulated correctly.
After some analysis, I have found that the main obstacle in emulating software exception handling in Qiling is in several functions in ntdll which make use of various global structures, caches, or loader data, which are in an invalid state during emulation because Qiling does not fully emulate Windows kernel initialization or the loader process.
The most important functions in question are RtlLookupFunctionEntry and RtlLookupFunctionTable. These functions are used for looking up compiler-generated data used during exception handling, including function locations and stack unwinding instructions.
After implementing hooks for these functions, many of the functions involved with exception handling, including RtlRaiseException, RtlDispatchException, and stack unwinding functions such as RtlVirtualUnwind and RtlUnwindEx, are actually emulated correctly. Handler routines are correctly located and executed.
However, there is another obstacle. After the exception is handled and control is returned to the dispatcher, RtlUnwindEx calls RtlRestoreContext. In the case of C++ exceptions, this function calls RcFrameConsolidation which recursively consolidates stack frames before finally restoring the new context with an IRETQ instruction.
Due to the way Qiling sets up the GDT on x86_64, the IRETQ instruction at this point resulted in CPU faults. After making some adjustments to the GDT setup on x86_64, the context switch occurs as expected. All of the tests (at least those I have access to) are still passing, so it seems this new GDT setup does not break anything for other platforms. I believe it is technically more correct on x86_64, also.
Regardless, in my tests, some simple programs which make use of C++ exceptions are now correctly emulated.
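To make the RtlLookupFunctionEntry step more concrete, the following sketch shows the kind of lookup involved on x86_64: the image's .pdata section holds a table of RUNTIME_FUNCTION records (three 32-bit RVAs: begin address, end address, unwind data) sorted by begin address, and the function entry for a program counter is found by searching that table. This is an illustration under stated assumptions, not the hook from this PR. The `parse_pdata` and `lookup_function_entry` helpers and the sample table are hypothetical, and chained unwind info and dynamically registered function tables are ignored.

```python
import bisect
import struct
from typing import List, NamedTuple, Optional


class RuntimeFunction(NamedTuple):
    begin: int   # RVA of the first byte of the function
    end: int     # RVA one past the last byte of the function
    unwind: int  # RVA of the unwind information


def parse_pdata(raw: bytes) -> List[RuntimeFunction]:
    """Decode raw .pdata bytes into RUNTIME_FUNCTION records."""
    return [RuntimeFunction(*fields) for fields in struct.iter_unpack("<III", raw)]


def lookup_function_entry(table: List[RuntimeFunction], image_base: int,
                          pc: int) -> Optional[RuntimeFunction]:
    """Binary-search the sorted table for the function that contains pc."""
    rva = pc - image_base
    begins = [entry.begin for entry in table]
    index = bisect.bisect_right(begins, rva) - 1
    if index >= 0 and table[index].begin <= rva < table[index].end:
        return table[index]
    return None


# Hypothetical two-entry table and image base, for illustration only.
IMAGE_BASE = 0x140000000
raw_pdata = struct.pack("<IIIIII",
                        0x1000, 0x1040, 0x5000,
                        0x1040, 0x10A0, 0x500C)
table = parse_pdata(raw_pdata)

found = lookup_function_entry(table, IMAGE_BASE, IMAGE_BASE + 0x1050)
missing = lookup_function_entry(table, IMAGE_BASE, IMAGE_BASE + 0x2000)
```

A lookup that returns no entry is treated by the unwinder as a leaf function, which is one reason an invalid or empty function table leads to handlers not being found.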