
This is the sixth post in the IL2CPP Internals series. In this post, we will explore how il2cpp.exe generates the wrapper methods and types used for interop between managed and native code. Specifically, we will look at the difference between blittable and non-blittable types, understand string and array marshaling, and learn about the cost of marshaling.

I’ve written a good bit of managed to native interop code in my days, but getting p/invoke declarations right in C# is still difficult, to say the least. Understanding what the runtime is doing to marshal my objects is even more of a mystery. Since IL2CPP does most of its marshaling in generated C++ code, we can see (and even debug!) its behavior, providing much better insight for troubleshooting and performance analysis.

This post does not aim to provide general information about marshaling and native interop. That is a wide topic, too large for one post. The Unity documentation discusses how native plugins interact with Unity. Both Mono and Microsoft provide plenty of excellent information about p/invoke in general.

As with all of the posts in this series, we will be exploring code that is subject to change and, in fact, is likely to change in a newer version of Unity. However, the concepts should remain the same. Please take everything discussed in this series as implementation details. We like to expose and discuss details like this when it is possible though!

The setup

For this post, I’m using Unity 5.0.2p4 on OSX. I’ll build for the iOS platform, using an “Architecture” value of “Universal”. I’ve built my native code for this example in Xcode 6.3.2 as a static library for both ARMv7 and ARM64.

The native code looks like this:



The scripting code in Unity is again in the HelloWorld.cs file. It looks like this:


Each of the method calls in this code is made into the native code shown above. We will look at the managed declaration for each method when we reach it later in the post.

Why do we need marshaling?

Since IL2CPP is already generating C++ code, why do we need marshaling from C# to C++ code at all? Although the generated C++ code is native code, the representation of types in C# differs from C++ in a number of cases, so the IL2CPP runtime must be able to convert back and forth from representations on both sides. The il2cpp.exe utility does this both for types and methods.

In managed code, all types can be categorized as either blittable or non-blittable. Blittable types have the same representation in managed and native code (e.g. byte, int, float). Non-blittable types have a different representation in managed and native code (e.g. bool, string, array types). As such, blittable types can be passed to native code directly, but non-blittable types require some conversion before they can be passed to native code. Often this conversion involves new memory allocation.

In order to tell the managed code compiler that a given method is implemented in native code, the extern keyword is used in C#. This keyword, along with a DllImport attribute, allows the managed code runtime to find the native method definition and call it. The il2cpp.exe utility generates a wrapper C++ method for each extern method. This wrapper performs a few important tasks:

  • It defines a typedef for the native method which is used to invoke the method via a function pointer.
  • It resolves the native method by name, getting a function pointer to that method.
  • It converts the arguments from their managed representation to their native representation (if necessary).
  • It calls the native method.
  • It converts the return value of the method from its native representation to its managed representation (if necessary).
  • It converts any out or ref arguments from their native representation to their managed representation (if necessary).
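The steps in this list can be sketched in a few lines of C++. This is a hand-written illustration, not output from il2cpp.exe: NativeTryHalve and TryHalveWrapperSketch are hypothetical names, and the native function is defined in the same file only so the sketch is self-contained.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical native function (a stand-in for plugin code): returns 1 and
// writes value / 2 to *half when value is even, otherwise returns 0.
extern "C" int32_t NativeTryHalve(int32_t value, int32_t* half) {
    if (value % 2 != 0) {
        *half = 0;
        return 0;
    }
    *half = value / 2;
    return 1;
}

// Hypothetical wrapper that walks through the six tasks in order.
bool TryHalveWrapperSketch(int32_t value, int32_t* managedHalf) {
    assert(managedHalf != nullptr);

    // 1. A typedef describing the native signature.
    typedef int32_t (*PInvokeFunc)(int32_t, int32_t*);

    // 2. Resolve the native method once and cache the function pointer.
    static PInvokeFunc func = nullptr;
    if (func == nullptr) {
        func = &NativeTryHalve;
    }

    // 3. Convert the arguments. int32_t is blittable, so only the out
    //    argument needs a native-side local here.
    int32_t nativeHalf = 0;

    // 4. Call the native method through the function pointer.
    int32_t nativeResult = func(value, &nativeHalf);

    // 5. Convert the return value (a 4-byte integer here) to a managed bool.
    bool managedResult = (nativeResult != 0);

    // 6. Copy the out argument back to its managed representation.
    *managedHalf = nativeHalf;
    return managedResult;
}
```

Calling TryHalveWrapperSketch(10, &half) returns true and leaves 5 in half. The shape is the point here; the real wrappers differ in naming and detail.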

We’ll take a look at the generated wrapper methods for some extern method declarations next.

Marshaling a blittable type

The simplest kind of extern wrapper only deals with blittable types.



In the Bulk_Assembly-CSharp_0.cpp file, search for the string “HelloWorld_Increment_m3”. The wrapper function for the Increment method looks like this:


First, note the typedef for the native function signature:


Something similar will show up in each of the wrapper functions. This native function accepts a single int32_t and returns an int32_t.

Next, the wrapper finds the proper function pointer and stores it in a static variable:


Here the Increment function actually comes from an extern statement (in the C++ code):


On iOS, native methods are statically linked into a single binary (indicated by the “__Internal” string in the DllImport attribute), so the IL2CPP runtime does nothing to look up the function pointer. Instead, this extern statement informs the linker to find the proper function at link time. On other platforms, the IL2CPP runtime may perform a lookup (if necessary) using a platform-specific API method to obtain this function pointer.

Practically, this means that on iOS, an incorrect p/invoke signature in managed code will show up as a linker error in the generated code. The error will not occur at runtime. So all p/invoke signatures need to be correct, even if they are not used at runtime.

Finally, the native method is called via the function pointer, and the return value is returned. Notice that the argument is passed to the native function by value, so any changes to its value in the native code will not be available in the managed code, as we would expect.
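As a rough illustration of the statically linked, pass-by-value case, consider the following sketch. It is not the generated code: HelloWorld_Increment_sketch is a hypothetical name, and the body of Increment is an assumption that deliberately modifies its parameter to show that the caller's value is untouched.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the native Increment function (assumed body). It changes
// only its own copy of the argument.
extern "C" int32_t Increment(int32_t i) {
    i = i + 1;
    return i;
}

// Hypothetical wrapper for an all-blittable signature. No conversion code
// is needed, and taking the address of Increment is resolved at link time
// rather than by a runtime lookup.
int32_t HelloWorld_Increment_sketch(int32_t managedValue) {
    typedef int32_t (*PInvokeFunc)(int32_t);
    static PInvokeFunc func = &Increment;
    return func(managedValue);  // the argument is passed by value
}
```

If a caller holds a variable with the value 41 and passes it in, the wrapper returns 42 while the caller's variable still holds 41.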

Marshaling a non-blittable type

Things get a little more exciting with a non-blittable type, like string. Recall from an earlier post that strings in IL2CPP are represented as an array of two-byte characters encoded via UTF-16, prefixed by a 4-byte length value. This representation does not match either the char* or wchar_t* representations of strings in C on iOS, so we have to do some conversion. If we look at the StringsMatch method (HelloWorld_StringsMatch_m4 in the generated code):



We can see that each string argument will be converted to a char* (due to the UnmanagedType.LPStr directive).


The conversion looks like this (for the first argument):


A new char buffer of the proper length is allocated, and the contents of the string are copied into the new buffer. Of course, after the native method is called we need to clean up those allocated buffers:


So marshaling a non-blittable type like string can be costly.
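To see where that cost comes from, here is a minimal sketch of the allocate, convert, call, and free pattern. Everything in it is an assumption made for illustration: ManagedString is a simplified stand-in for the managed string layout, MarshalStringSketch narrows only ASCII code units (a real marshaler would transcode properly), and the native StringsMatch body is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Simplified stand-in for a managed string: a 4-byte length plus UTF-16
// code units (held by pointer here to keep the sketch short).
struct ManagedString {
    int32_t length;
    const char16_t* chars;
};

// Allocates a new NUL-terminated char buffer and converts each code unit.
// Simplification: non-ASCII code units are replaced with '?'.
char* MarshalStringSketch(const ManagedString& s) {
    char* buffer =
        static_cast<char*>(std::malloc(static_cast<std::size_t>(s.length) + 1));
    assert(buffer != nullptr);
    for (int32_t i = 0; i < s.length; ++i) {
        char16_t unit = s.chars[i];
        buffer[i] = unit < 0x80 ? static_cast<char>(unit) : '?';
    }
    buffer[s.length] = '\0';
    return buffer;
}

// Stand-in for the native function, which only understands C strings.
extern "C" bool StringsMatch(const char* l, const char* r) {
    return std::strcmp(l, r) == 0;
}

// Hypothetical wrapper: two allocations and two frees for a single call.
bool StringsMatchWrapperSketch(const ManagedString& l, const ManagedString& r) {
    char* nativeL = MarshalStringSketch(l);
    char* nativeR = MarshalStringSketch(r);
    bool result = StringsMatch(nativeL, nativeR);
    std::free(nativeL);  // clean up the temporary native buffers
    std::free(nativeR);
    return result;
}
```

Every call pays for two heap allocations, two copies, and two frees before and after the comparison itself, which is why string arguments are worth avoiding in hot interop paths.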

Marshaling a user-defined type

Simple types like int and string are nice, but what about a more complex, user-defined type? Suppose we want to marshal the Vector structure above, which contains three float values. It turns out that a user-defined type is blittable if and only if all of its fields are blittable. So we can call ComputeLength (HelloWorld_ComputeLength_m5 in the generated code) without any need to convert the argument:


Notice that the argument is passed by value, just as it was for the initial example when the argument type was int. If we want to modify the instance of Vector and see those changes in managed code, we need to pass it by reference, as in the SetX method (HelloWorld_SetX_m6):



Here the Vector argument is passed as a pointer to native code. The generated code goes through a bit of a rigmarole, but it is basically creating a local variable of the same type, copying the value of the argument to the local, then calling the native method with a pointer to that local variable. After the native function returns, the value in the local variable is copied back into the argument, so the updated value is then available in managed code.
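The copy-in, call, copy-out sequence described here can be sketched as follows. The Vector layout of three floats comes from the description above; SetXWrapperSketch is a hypothetical name and the body of SetX is an assumption.

```cpp
#include <cassert>

// Three floats, matching the description of the Vector structure.
struct Vector {
    float x;
    float y;
    float z;
};

// Stand-in for the native function (assumed body).
extern "C" void SetX(Vector* v, float value) {
    v->x = value;
}

// Hypothetical wrapper for a blittable struct passed by reference.
void SetXWrapperSketch(Vector* managedVector, float value) {
    assert(managedVector != nullptr);
    typedef void (*PInvokeFunc)(Vector*, float);
    static PInvokeFunc func = &SetX;

    Vector local = *managedVector;  // copy the argument into a local
    func(&local, value);            // native code writes to the local
    *managedVector = local;         // copy the result back to the argument
}
```

Because Vector is blittable, the two copies are plain struct assignments with no per-field conversion.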

Marshaling a non-blittable user defined type

A non-blittable user-defined type, like the Boss type defined above, can also be marshaled, but with a little more work. Each field of this type must be marshaled to its native representation. Also, the generated C++ code needs a representation of the managed type that matches the representation in the native code.

Let’s take a look at the IsBossDead extern declaration:


The wrapper for this method is named HelloWorld_IsBossDead_m7:


The argument is passed to the wrapper function as type Boss_t2, which is the generated type for the Boss struct. Notice that it is passed to the native function with a different type: Boss_t2_marshaled. If we jump to the definition of this type, we can see that it matches the definition of the Boss struct in our C++ static library code:

We again used the UnmanagedType.LPStr directive in C# to indicate that the string field should be marshaled as a char*. If you find yourself debugging a problem with a non-blittable user-defined type, it is very helpful to look at this _marshaled struct in the generated code. If the field layout does not match the native side, then a marshaling directive in managed code might be incorrect.

The Boss_t2_marshal function is a generated function which marshals each field, and the Boss_t2_marshal_cleanup frees any memory allocated during that marshaling process.
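The division of labor between a marshal function and a cleanup function might look roughly like the sketch below. All names ending in Sketch are hypothetical, the field names and types are assumptions (a string name marshaled as char*, plus an integer health), and std::u16string merely stands in for a managed string.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <string>

// Assumed managed-side view of the struct.
struct ManagedBoss {
    std::u16string name;
    int32_t health;
};

// Assumed native-side layout: the string field becomes a char*.
struct BossMarshaled {
    char* name;
    int32_t health;
};

// Marshals each field in turn. The string needs a new buffer; the
// integer is blittable and is copied directly.
void BossMarshalSketch(const ManagedBoss& managed, BossMarshaled& native) {
    const std::size_t length = managed.name.size();
    native.name = static_cast<char*>(std::malloc(length + 1));
    assert(native.name != nullptr);
    for (std::size_t i = 0; i < length; ++i) {
        char16_t unit = managed.name[i];
        // Simplification: only ASCII code units are preserved.
        native.name[i] = unit < 0x80 ? static_cast<char>(unit) : '?';
    }
    native.name[length] = '\0';
    native.health = managed.health;
}

// Frees whatever the marshal function allocated.
void BossMarshalCleanupSketch(BossMarshaled& native) {
    std::free(native.name);
    native.name = nullptr;
}

// Stand-in for the native function (assumed body).
extern "C" bool IsBossDead(BossMarshaled b) {
    return b.health == 0;
}

bool IsBossDeadWrapperSketch(const ManagedBoss& boss) {
    BossMarshaled native;
    BossMarshalSketch(boss, native);
    bool dead = IsBossDead(native);
    BossMarshalCleanupSketch(native);
    return dead;
}
```

Note that the name is allocated and freed even though this particular native function never reads it. The marshaling cost depends on the type, not on what the callee uses.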

Marshaling an array

Finally, we will explore how arrays of blittable and non-blittable types are marshaled. The SumArrayElements method is passed an array of integers:


This array is marshaled, but since the element type of the array (int) is blittable, the cost to marshal it is very small:


The il2cpp_codegen_marshal_array function simply returns a pointer to the existing managed array memory. That’s it!
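A sketch of why this case is cheap: the native code can read the managed elements in place. Here std::vector stands in for a managed int array, MarshalBlittableArraySketch is a hypothetical helper (it is not il2cpp_codegen_marshal_array), and the native function body is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for the native function (assumed body).
extern "C" int32_t SumArrayElements(const int32_t* elements, int32_t size) {
    int32_t sum = 0;
    for (int32_t i = 0; i < size; ++i) {
        sum += elements[i];
    }
    return sum;
}

// Hypothetical helper: no allocation and no copy, just the address of the
// existing element storage.
int32_t* MarshalBlittableArraySketch(std::vector<int32_t>& managedArray) {
    return managedArray.data();
}

int32_t SumArrayElementsWrapperSketch(std::vector<int32_t>& managedArray) {
    int32_t* nativeArray = MarshalBlittableArraySketch(managedArray);
    return SumArrayElements(nativeArray,
                            static_cast<int32_t>(managedArray.size()));
}
```

The pointer handed to native code is the same one the managed side already owns, so the cost does not grow with the length of the array.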

However, marshaling an array of non-blittable types is much more expensive. The SumBossHealth method passes an array of Boss instances:


Its wrapper has to allocate a new array, then marshal each element individually:


Of course all of these allocations are cleaned up after the native method call is completed as well.
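For contrast with the blittable case, here is a sketch of the allocate, marshal each element, call, and clean up sequence. The struct layouts are assumptions, std::string stands in for the managed string field to keep the example short, and SumBossHealthWrapperSketch is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Assumed managed-side and native-side layouts.
struct ManagedBoss {
    std::string name;
    int32_t health;
};

struct BossMarshaled {
    char* name;
    int32_t health;
};

// Stand-in for the native function (assumed body).
extern "C" int32_t SumBossHealth(const BossMarshaled* bosses, int32_t size) {
    int32_t sum = 0;
    for (int32_t i = 0; i < size; ++i) {
        sum += bosses[i].health;
    }
    return sum;
}

int32_t SumBossHealthWrapperSketch(const std::vector<ManagedBoss>& managed) {
    const std::size_t count = managed.size();

    // One allocation for the native array itself...
    BossMarshaled* native = new BossMarshaled[count];

    // ...plus one allocation and one copy per element for the string field.
    for (std::size_t i = 0; i < count; ++i) {
        const std::string& name = managed[i].name;
        native[i].name = new char[name.size() + 1];
        std::memcpy(native[i].name, name.c_str(), name.size() + 1);
        native[i].health = managed[i].health;
    }

    int32_t sum = SumBossHealth(native, static_cast<int32_t>(count));

    // Clean up every per-element buffer, then the array.
    for (std::size_t i = 0; i < count; ++i) {
        delete[] native[i].name;
    }
    delete[] native;
    return sum;
}
```

For an array of N elements this sketch performs N + 1 allocations and N string copies per call, which is the kind of cost that adds up quickly if the call sits in a per-frame code path.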


The IL2CPP scripting backend supports the same marshaling behaviors as the Mono scripting backend. Because IL2CPP produces generated wrappers for extern methods and types, it is possible to see the cost of managed-to-native interop calls. For blittable types, this cost is often not too bad, but non-blittable types can quickly make interop very expensive. As usual, we’ve just scratched the surface of marshaling in this post. Please explore the generated code more to see how marshaling is done for return values and out parameters, native function pointers and managed delegates, and user-defined reference types.

Next time we will explore how IL2CPP integrates with the garbage collector.




  1. Hi, it’s become something of a tradition for me to post this bug and ask you if you intend to fix it someday?

  2. a user defined type is blittable if and only if all of its fields are blittable

    I assume that is only true as long as you don’t use the StructLayoutAttribute?

    but since the element type of the array (int) is blittable, the cost to marshal it is very small … simply returns a pointer to the existing managed array memory, that’s it!

    How does the array get pinned automatically? And if so, how does the C# side know when it’s safe to move the array again? Is it tied to the call of the C++ function, so you are not allowed to store any array pointer for longer than the duration of the function call?

    1. Josh Peterson

      July 7, 2015 at 12:10 pm

      > I assume that is only true as long as you don’t use the StructLayoutAttribute?

      Yes, once you get into the use of StructLayout and things like packing, IL2CPP has to do special layout and allocation, often making the types no longer blittable.

      > How does the array get pinned automatically? And if so, how does the C# side know when it’s safe to move the array again? Is it tied to the call of the C++ function, so you are not allowed to store any array pointer for longer than the duration of the function call?

      Yes, it is definitely not a good idea to store the array pointer on the native side longer than the duration of the function call. This is not only because the memory could theoretically be moved, but also because it could be freed.

      In practice, IL2CPP does not use a moving GC now, so we have no worries about the memory being moved. But at least we can be sure that it won’t be freed during the duration of the native call, since the managed code holds on to a reference to the memory.

      If or when we have a moving GC in IL2CPP, then the implementation of this marshaling code may need to change, to ensure that the memory is not moved during the native call. But that will be handled on the IL2CPP side; it should not require any changes to user code.

  3. Eugene Ovcharenko

    July 6, 2015 at 8:48 am

    Thank you.
    il2cpp internals series is great. Interesting to read as always.

    Would be great to hear a bit about marshaling in WebGL, but that will come around in other posts I suppose.
    (we’re using Convert.ToBase64String(byteArray); and passing a string to the browser for now)

    1. Josh Peterson

      July 6, 2015 at 12:24 pm

      We’re not planning now to do a marshaling post about WebGL interaction, because from the IL2CPP side there is nothing different going on with WebGL from any other platform. Is there something specific you would like to see related to WebGL and marshaling though? Thanks for the suggestion.

  4. Nice that this is making progress!

    Since IL is .NET (or Mono), this means IL2CPP will support F#, right?

    1. Josh Peterson

      July 6, 2015 at 12:22 pm

      We’re not officially supporting F# now, and I can’t say yet if or when we will support it. As you mentioned, it is IL code, so I suspect that some or most of it might just work. It would be fun to try though!

  5. All this stuff is Chinese X Aramaic X binary language of moisture evaporators … so thanks a lot for making it just work :)

  6. Linear on mobile bro

    July 3, 2015 at 6:40 am

    This is all very interesting. I am just curious when will linear color space be available on mobile? Thanks bros

  7. Corey St. Pierre

    July 2, 2015 at 5:32 pm

    Thanks for going into such detail. This was a great post. I’m looking forward to the GC blog! Keep up the awesome work.