IL2CPP Internals – Debugging tips for generated code
This is the third blog post in the IL2CPP Internals series. In this post, we will explore some tips which make debugging C++ code generated by IL2CPP a little bit easier. We will see how to set breakpoints, view the content of strings and user defined types and determine where exceptions occur.
As we get into this, consider that we are debugging generated C++ code created from .NET IL code. So debugging it will likely not be the most pleasant experience. However, with a few of these tips, it is possible to gain meaningful insight into how the code for a Unity project executes on the actual target device (we’ll talk a little bit about debugging managed code at the end of the post).
Also, be prepared for the generated code in your project to differ from this code. With each new version of Unity, we are looking for ways to make the generated code better, faster and smaller.
For this post, I’m using Unity 5.0.1p3 on OS X. I’ll use the same example project as in the post about generated code, but this time I’ll build for the iOS target using the IL2CPP scripting backend. As I did in the previous post, I’ll build with the “Development Player” option selected, so that il2cpp.exe will generate C++ code with type and method names based on the names in the IL code.
After Unity is finished generating the Xcode project, I can open it in Xcode (I have version 6.3.1, but any recent version should work), choose my target device (an iPad Mini 3, but any iOS device should work) and build the project in Xcode.
Before running the project, I’ll first set a breakpoint at the top of the Start method in the HelloWorld class. As we saw in the previous post, the name of this method in the generated C++ code is HelloWorld_Start_m3. We can use Cmd+Shift+O and start typing the name of this method to find it in Xcode, then set a breakpoint in it.
We can also choose Debug > Breakpoints > Create Symbolic Breakpoint in Xcode, and set it to break at this method.
Now when I run the Xcode project, I immediately see it break at the start of the method.
We can set breakpoints on other methods in the generated code like this if we know the name of the method. We can also set breakpoints in Xcode at a specific line in one of the generated code files. In fact, all of the generated files are part of the Xcode project. You will find them in the Project Navigator in the Classes/Native directory.
There are two ways to view the representation of an IL2CPP string in Xcode. We can view the memory of a string directly, or we can call one of the string utilities in libil2cpp to convert the string to a std::string, which Xcode can display. Let’s look at the value of the string named _stringLiteral1 (spoiler alert: its contents are “Hello, IL2CPP!”).
In the generated code with Ctags built (or using Cmd+Ctrl+J in Xcode), we can jump to the definition of _stringLiteral1 to see its type.
In fact, all strings in IL2CPP share the same representation. You can find the definition of Il2CppString in the object-internals.h header file. These strings include the standard header part of any managed type in IL2CPP, Il2CppObject (which is accessed via the Il2CppDataSegmentString typedef), followed by a four-byte length, then an array of two-byte characters. Strings defined at compile time, like _stringLiteral1, end up with a fixed-length chars array, whereas strings created at runtime have an allocated array. The characters in the string are encoded as UTF-16.
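To make that layout concrete, here is a minimal sketch of a structure with the same shape: a header, a four-byte length, then two-byte characters. The names SketchObjectHeader and SketchString are hypothetical, and the two pointer-sized header fields are an assumption chosen to match the 16-byte header seen on a 64-bit device. The real definitions live in object-internals.h and may differ between Unity versions.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the managed object header. The two pointer-sized
// fields are an assumption; see object-internals.h for the real Il2CppObject.
struct SketchObjectHeader {
    void* klass;    // assumed: pointer to type information
    void* monitor;  // assumed: pointer to synchronization data
};

// Simplified model of the string layout described above:
// header, then a four-byte length, then two-byte UTF-16 characters.
struct SketchString {
    SketchObjectHeader header;
    int32_t length;     // number of UTF-16 code units
    uint16_t chars[1];  // first character; the rest follow it in memory
};

// Byte offset of the length field, which equals the size of the header.
constexpr std::size_t kLengthOffset = offsetof(SketchString, length);

// Byte offset of the first character, right after the four-byte length.
constexpr std::size_t kCharsOffset = offsetof(SketchString, chars);
```

On a 64-bit build the two offsets come out as 16 and 20, which is where the length and the first character should appear when we look at the raw memory of a string.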
If we add _stringLiteral1 to the watch window in Xcode, we can select the View Memory of “_stringLiteral1” option to see the layout of the string in memory.
Then in the memory viewer, we can see this:
The header member of the string is 16 bytes, so after we skip past that, we can see that the four bytes for the size have a value of 0x000E (14). The next two bytes after the length are the first character of the string, 0x0048 (‘H’). Since each character is two bytes wide, but in this string all of the characters fit in only one byte, Xcode displays them on the right with dots in between each character. Still, the content of the string is clearly visible. This method of viewing strings does work, but it is a bit difficult for more complex strings.
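The dots between characters are easy to reproduce. The sketch below lays out UTF-16 code units as little-endian bytes (the byte order on iOS devices) and renders them the way the text column of a memory viewer does. Both helper names are hypothetical, and the rule "printable ASCII as-is, everything else as a dot" is an assumption about how such viewers typically behave.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Lay out UTF-16 code units as little-endian bytes: low byte first, then high byte.
std::vector<uint8_t> toLittleEndianBytes(const std::u16string& text) {
    std::vector<uint8_t> bytes;
    for (char16_t unit : text) {
        bytes.push_back(static_cast<uint8_t>(unit & 0xFF));
        bytes.push_back(static_cast<uint8_t>(unit >> 8));
    }
    return bytes;
}

// Render bytes like the text column of a memory viewer (assumed behavior):
// printable ASCII is shown as-is, every other byte becomes a dot.
std::string asciiColumn(const std::vector<uint8_t>& bytes) {
    std::string out;
    for (uint8_t b : bytes) {
        out += (b >= 0x20 && b < 0x7F) ? static_cast<char>(b) : '.';
    }
    return out;
}
```

For a string made only of ASCII characters, every high byte is zero, so asciiColumn(toLittleEndianBytes(u"Hi")) yields "H.i.". Characters outside the ASCII range have a non-zero high byte, which is why this view gets harder to read for more complex strings.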
We can also view the content of a string from the lldb prompt in Xcode. The utils/StringUtils.h header gives us the interface for some string utilities in libil2cpp that we can use. Specifically, let’s call the Utf16ToUtf8 method from the lldb prompt. Its interface looks like this:
static std::string Utf16ToUtf8 (const uint16_t* utf16String);
We can pass the chars member of the C++ structure to this method, and it will return a UTF-8 encoded std::string. Then, at the lldb prompt, if we use the p command, we can print the content of the string.
(lldb) p il2cpp::utils::StringUtils::Utf16ToUtf8(_stringLiteral1.chars)
(std::__1::string) $1 = "Hello, IL2CPP!"
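For a feel of what a conversion like this has to do, here is a minimal standalone sketch. The function sketchUtf16ToUtf8 is hypothetical and is not the libil2cpp implementation. It assumes the input is terminated by a zero code unit, and it encodes unpaired surrogates as three-byte sequences instead of rejecting them.

```cpp
#include <cstdint>
#include <string>

// Hypothetical re-implementation for illustration only; not the libil2cpp code.
// Converts a zero-terminated UTF-16 buffer to a UTF-8 std::string.
std::string sketchUtf16ToUtf8(const uint16_t* utf16) {
    std::string out;
    while (*utf16 != 0) {
        uint32_t cp = *utf16++;
        // Combine a high/low surrogate pair into a single code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && *utf16 >= 0xDC00 && *utf16 <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (*utf16++ - 0xDC00);
        }
        if (cp < 0x80) {            // one byte: plain ASCII
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {    // two bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {  // three bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                    // four bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The result is an ordinary std::string, which is the reason the debugger can print it directly, unlike the raw two-byte character array.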
Viewing user defined types
We can also view the contents of a user defined type. In the simple script code in this project, we have created a C# type named Important with a field named InstanceIdentifier. If I set a breakpoint just after we create the second instance of the Important type in the script, I can see that the generated code has set InstanceIdentifier to a value of 1, as expected.
So viewing the contents of user defined types in generated code is done the same way as you normally would in C++ code in Xcode.
Breaking on exceptions in generated code
Often I find myself debugging generated code to try to track down the cause of a bug. In many cases these bugs are manifested as managed exceptions. As we discussed in the last post, IL2CPP uses C++ exceptions to implement managed exceptions, so we can break when a managed exception occurs in Xcode in a few ways.
The easiest way to break when a managed exception is thrown is to set a breakpoint on the il2cpp_codegen_raise_exception function, which is used by il2cpp.exe any place where a managed exception is explicitly thrown.
If I then let the project run, Xcode will break when the code in Start throws an InvalidOperationException. This is a place where viewing string content can be very useful. If I dig into the members of the ex argument, I can see that it has a ___message_2 member, which is a string representing the message of the exception.
With a little bit of fiddling, we can print the value of this string and see what the problem is:
(lldb) p il2cpp::utils::StringUtils::Utf16ToUtf8(&ex->___message_2->___start_char_1)
(std::__1::string) $88 = "Don't panic"
Note that the string here has the same layout as above, but the names of the generated fields are slightly different. The chars field is named ___start_char_1 and its type is uint16_t. It is still the first character of an array though, so we can pass its address to the conversion function, and we find that the message in this exception is rather comforting.
But not all managed exceptions are explicitly thrown by generated code. The libil2cpp runtime code will throw managed exceptions in some cases, and it does not call il2cpp_codegen_raise_exception to do so. How can we catch these exceptions?
If we use Debug > Breakpoints > Create Exception Breakpoint in Xcode, then edit the breakpoint, we can choose C++ exceptions and break when an exception of type Il2CppExceptionWrapper is thrown. Since this C++ type is used to wrap all managed exceptions, it will allow us to catch all managed exceptions.
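The idea of one C++ type wrapping every managed exception can be sketched in a few lines. Everything below is a hypothetical simplification: SketchManagedException, SketchExceptionWrapper, sketchRaiseException and runAndReport are made-up names, and the real types and functions in libil2cpp are more involved.

```cpp
#include <string>

// Hypothetical stand-in for a managed exception object.
struct SketchManagedException {
    std::string typeName;
    std::string message;
};

// Hypothetical stand-in for the wrapper: a C++ object that carries a
// pointer to the managed exception it represents.
struct SketchExceptionWrapper {
    SketchManagedException* ex;
};

// Models an explicit managed throw: wrap the managed exception and
// throw the wrapper as an ordinary C++ exception.
[[noreturn]] void sketchRaiseException(SketchManagedException* ex) {
    throw SketchExceptionWrapper{ex};
}

// Models a catch site: whatever the managed exception type is, it always
// arrives as the same C++ wrapper type.
std::string runAndReport(void (*body)()) {
    try {
        body();
    } catch (const SketchExceptionWrapper& wrapper) {
        return wrapper.ex->typeName + ": " + wrapper.ex->message;
    }
    return "no exception";
}
```

Because every throw site ends up throwing the same C++ type in this model, a breakpoint keyed on that type fires no matter whether the throw came from generated code or from runtime code.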
Let’s prove this works by adding the following two lines of code to the top of the Start method in our script:
Important boom = null;
The second line here will cause a NullReferenceException to be thrown. If we run this code in Xcode with the exception breakpoint set, we’ll see that Xcode will indeed break when the exception is thrown. However, the breakpoint is in code in libil2cpp, so all we see is assembly code. If we take a look at the call stack, we can see that we need to move up a few frames to the NullCheck method, which is injected by il2cpp.exe into the generated code.
From there, we can move back up one more frame, and see that our instance of the Important type is indeed null.
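As a rough picture of why the call stack looks this way, here is a minimal sketch of an injected null check. The names sketchNullCheck, SketchImportant, readInstanceIdentifier and throwsOnRead are hypothetical, and std::runtime_error stands in for the managed NullReferenceException. The real NullCheck calls into libil2cpp rather than throwing a standard library exception.

```cpp
#include <stdexcept>

// Hypothetical stand-in for the generated Important type.
struct SketchImportant {
    int InstanceIdentifier;
};

// Hypothetical injected check: throws before a null pointer is dereferenced.
// std::runtime_error stands in for the managed NullReferenceException.
inline void sketchNullCheck(const void* p) {
    if (p == nullptr) {
        throw std::runtime_error("NullReferenceException");
    }
}

// Models generated code for a field read: the check runs first, so the
// throw originates one frame below the code that holds the null value.
int readInstanceIdentifier(const SketchImportant* obj) {
    sketchNullCheck(obj);
    return obj->InstanceIdentifier;
}

// Helper that reports whether reading through the pointer throws.
bool throwsOnRead(const SketchImportant* obj) {
    try {
        readInstanceIdentifier(obj);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

In this model the throw happens inside sketchNullCheck, and the null pointer itself is a local of the calling frame, which mirrors the need to move up the call stack to inspect the offending value.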
After discussing a few tips for debugging generated code, I hope that you have a better understanding about how to track down possible problems using the C++ code generated by IL2CPP. I encourage you to investigate the layout of other types used by IL2CPP to learn more about how to debug the generated code.
Where is the IL2CPP managed code debugger though? Shouldn’t we be able to debug managed code running via the IL2CPP scripting backend on a device? In fact, this is possible. We have an internal, alpha-quality managed code debugger for IL2CPP now. It’s not ready for release yet, but it is on our roadmap, so stay tuned.
The next post in this series will investigate the different ways the IL2CPP scripting backend implements various types of method invocations present in managed code. We will look at the runtime cost of each type of method invocation.