In this final episode of our IL2CPP micro-optimization miniseries, we’ll explore the high cost of something called “boxing”, and we’ll see how IL2CPP can avoid it when it is done unnecessarily.

Heap allocations are slow

Like many programming languages, C# allows the memory for objects to be allocated on the stack (a small, “fast”, scope-specific block of memory) or the heap (a large, “slow”, global block of memory). Allocating space for an object on the heap is usually much more expensive than allocating space on the stack. It also involves tracking that allocated memory in the garbage collector, which has an additional cost. So we try to avoid heap allocations where possible.

C# lets us do this by separating types into value types (which can be allocated on the stack) and reference types (which must be allocated on the heap). Types like int and float are value types, while string and object are reference types. User-defined value types use the struct keyword. User-defined reference types use the class keyword. Note that a value type can never hold the value null. In C#, the null value can only be assigned to reference types. Keep this distinction in mind as we continue.

Being good performance citizens, we try to avoid heap allocations unless they are necessary. But sometimes we need to convert a value type on the stack into a reference type on the heap. This process is called boxing. Boxing:

  1. Allocates space on the heap
  2. Informs the garbage collector about the new object
  3. Copies the data from the value type object into the new reference type object
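The steps above can be seen in a minimal sketch. The class and method names here (BoxingDemo, BoxedCopyIsIndependent) are made up for illustration. Assigning an int to an object variable boxes it, and because the box holds its own copy of the data, later changes to the original local do not affect it:

```csharp
using System;

public static class BoxingDemo
{
    // Hypothetical helper: boxes a value, then changes the original local
    // to show that the box holds an independent copy.
    public static int BoxedCopyIsIndependent()
    {
        int age = 5;          // value type stored in the local itself
        object boxed = age;   // boxing: a heap object is allocated and 5 is copied into it
        age = 6;              // changes only the local, not the boxed copy
        return (int)boxed;    // unboxing copies the value back out of the box
    }
}
```

Calling BoxingDemo.BoxedCopyIsIndependent() returns 5, not 6, because the assignment to age happens after the copy into the box was made.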

Ugh, let’s add boxing to our list of things to avoid!

That pesky compiler

Suppose we are happily writing code, avoiding unnecessary heap allocations and boxing. Maybe we have some trees for our world, and each has a size which scales with its age:
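A minimal sketch of such a type might look like the following. The interface name HasSize matches the one mentioned in the comments on this post, but the CalculateSize method name and the scaling factor of 3 are assumptions made purely for illustration:

```csharp
// Assumed interface; CalculateSize is a hypothetical member name.
public interface HasSize
{
    int CalculateSize();
}

// A value type (struct), so an array of Tree stores the trees inline
// rather than as separate heap objects.
public struct Tree : HasSize
{
    private readonly int years;

    public Tree(int age)
    {
        years = age;
    }

    // Assumed scaling rule: size grows linearly with age.
    public int CalculateSize()
    {
        return years * 3;
    }
}
```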

Elsewhere in our code, we have this convenient method to sum up the size of many things (including possibly Tree objects):
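A sketch of what such a method could look like is below. Only the TotalSize<T> name, the things array, and the null check are taken from the text. The HasSize interface, the Tree struct, the CalculateSize method, and the SizeMath class are assumed, and are declared compactly here so the snippet compiles on its own:

```csharp
// Assumed supporting types (hypothetical member names and scaling rule).
public interface HasSize { int CalculateSize(); }

public struct Tree : HasSize
{
    private readonly int years;
    public Tree(int age) { years = age; }
    public int CalculateSize() { return years * 3; }
}

public static class SizeMath
{
    // Works for any T that implements HasSize, whether class or struct.
    public static int TotalSize<T>(T[] things) where T : HasSize
    {
        var total = 0;
        for (var i = 0; i < things.Length; ++i)
        {
            // Needed when T is a reference type; always true when T is a struct.
            if (things[i] != null)
                total += things[i].CalculateSize();
        }
        return total;
    }
}
```

The null check is what makes the method safe for reference types, and it is also the line the rest of this post is concerned with.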

This looks safe enough, but let’s peer into a little bit of the Intermediate Language (IL) code that the C# compiler generates:

The C# compiler has implemented the if (things[i] != null) check using boxing! If the type T is already a reference type, then the box opcode is pretty cheap – it just returns the existing pointer to the array element. But if type T is a value type (like our Tree type), then that box opcode is very costly. Of course, value types can never be null, so why do we need to implement the check in the first place? And what if we need to compute the size of one hundred Tree objects, or maybe one thousand Tree objects? The cost of that unnecessary boxing will quickly add up.
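To see why the check is redundant for value types, consider a small generic helper. The names NullCheckDemo and IsNull are hypothetical. C# accepts a comparison with null for an unconstrained T, but for a non-nullable value type the result is always false:

```csharp
public static class NullCheckDemo
{
    // Compiles for any T. When T is a non-nullable value type,
    // the comparison can never be true.
    public static bool IsNull<T>(T value)
    {
        return value == null;
    }
}
```

IsNull(42) is false for every possible int, while IsNull<string>(null) is true. Only the reference-type case needs the check at run time.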

The fastest code is anything you don’t execute

The C# compiler needs to provide a general implementation that works for any possible type T, so it is stuck with this slower code. But a compiler like IL2CPP can be a bit more aggressive: it generates the code that will be executed, and it doesn’t generate the code that won’t!

IL2CPP will create an implementation of the TotalSize<T> method specifically for the case where T is a Tree. The IL code above looks like this in the generated C++ code:

IL2CPP recognized that the box opcode is unnecessary for a value type, because we can prove ahead of time that a value type object will never be null. In a tight loop, this removal of an unnecessary allocation and copy of data can have a significant positive impact on performance.
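Conceptually, the specialized method behaves as if we had written a non-generic version by hand, with the null check removed. The sketch below is a C# illustration of that idea, not IL2CPP's actual output. The TreeMath class, the TotalSizeOfTrees name, and the Tree and HasSize definitions are assumptions:

```csharp
// Assumed supporting types (hypothetical member names and scaling rule).
public interface HasSize { int CalculateSize(); }

public struct Tree : HasSize
{
    private readonly int years;
    public Tree(int age) { years = age; }
    public int CalculateSize() { return years * 3; }
}

public static class TreeMath
{
    // Hand-written equivalent of TotalSize<T> with T fixed to Tree.
    // A Tree can never be null, so there is no check and nothing to box.
    public static int TotalSizeOfTrees(Tree[] things)
    {
        var total = 0;
        for (var i = 0; i < things.Length; ++i)
            total += things[i].CalculateSize();
        return total;
    }
}
```

The difference is that we never have to write or maintain this second method ourselves, because the code generator produces the equivalent from the generic one.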

Wrapping up

As with the other micro-optimizations discussed in this series, this one is a common optimization for .NET code generators. All of the scripting backends used by Unity currently perform this optimization for you, so you can get back to writing your code.

We hope you have enjoyed this miniseries about micro-optimizations. As we continue to improve the code generators and runtimes used by Unity, we’ll offer more insight into the micro-optimizations that go on behind the scenes.

17 replies on “IL2CPP Optimizations: Avoid Boxing”

“IL2CPP will create an implementation of The TotalSize method specifically for the case where T is a Tree.”

The method is generic, so why does it create an implementation for this specific case? It’s not really clear from the post.

Also, if IL2CPP can make these assumptions, why doesn’t the Mono compiler work in the same way?

I think what they mean is “IL2CPP will create an implementation of The TotalSize method specifically for the case where T is a value type (as is the case for Tree).”

So they mean, IL2CPP will recognise that Tree is a value type, and react accordingly by omitting the boxing line.

Unfortunately, C# does not treat generic constraints as part of a method’s signature when checking overloads. So if you write two TotalSize methods, one with the constraint “where T: struct, HasSize” and the other with “where T: class, HasSize”, the compiler will claim that the same method is defined twice.

Although some may argue about whether the “if (things[i] != null)” check should be included in TotalSize, or whether the two methods should be given different names (one for class types and another for structs), that optimization is indeed imvho a great feature in IL2CPP.

So, nice addition. Thanks a lot!

Can IL2CPP prevent boxing when enums are used as keys for dictionaries? I can of course implement the IEqualityComparer, but I’m just curious whether the compiler could do something in this case.
