The Unity Burst Compiler transforms your C# code into highly optimized machine code. Since the first stable release of the Burst Compiler a year ago, we have been working to improve the quality, experience, and robustness of the compiler. Now that we’ve released a major new version, Burst 1.3, we would like to take this opportunity to give you more insight into why we are excited about a key performance-focused feature: our new enhanced aliasing support.

The new compiler intrinsics Unity.Burst.CompilerServices.Aliasing.ExpectAliased and Unity.Burst.CompilerServices.Aliasing.ExpectNotAliased allow users to gain deep insight into how the compiler understands the code they write. By combining these new intrinsics with extended support for the [Unity.Burst.NoAlias] attribute, we’ve given our users a new superpower in the quest for performance.

Takeaways

In this blog post we will explain the concept of aliasing, how to use the [NoAlias] attribute to describe how the memory in your data structures aliases, and how to use our new aliasing compiler intrinsics to be certain the compiler understands your code the way you do.

Aliasing

Aliasing is when two pointers to data happen to be pointing to the same memory allocation.
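As a minimal sketch of the kind of function this section discusses: the signature and the AliasingExamples wrapper class are assumptions, reconstructed from the description of the generated code below rather than taken from a listing.

```csharp
public static class AliasingExamples
{
    // 'a' and 'b' may or may not refer to the same int; the compiler cannot tell.
    public static int Foo(ref int a, ref int b)
    {
        b = 13;
        a = 42;
        return b; // 13 if a and b are distinct, 42 if they are the same variable
    }
}
```

Calling Foo(ref x, ref y) with two different variables returns 13, while Foo(ref x, ref x) returns 42, which is exactly why the compiler cannot skip the reload.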

This is a classic performance-related aliasing problem: for a function Foo that takes two references, a and b, the compiler cannot assume, without any external information, whether a aliases with b, and so it produces nonoptimal assembly.

The generated code:

  • Stores 13 into b.
  • Stores 42 into a.
  • Reloads the value from b to return it.

It has to reload b because the compiler does not know whether a and b are backed by the same memory. If they are, b will contain the value 42; if they are not, it will contain the value 13.

A More Complex Example

Let’s look at the following simple job:
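A sketch of what such a job might look like. The field names Input and Output come from the text; the job name CopyJob, the float element type, and the [ReadOnly]/[WriteOnly] attributes are assumptions for illustration.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;

[BurstCompile]
public struct CopyJob : IJob
{
    [ReadOnly] public NativeArray<float> Input;
    [WriteOnly] public NativeArray<float> Output;

    public void Execute()
    {
        // Element-by-element copy from one buffer to the other.
        for (int i = 0; i < Input.Length; i++)
        {
            Output[i] = Input[i];
        }
    }
}
```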

The above job is simply copying from one buffer to another. If Input and Output do not alias, i.e. none of the memory locations backing them overlap, then the job produces an element-by-element copy of Input in Output.

If a compiler is aware that these two buffers do not alias, as Burst is for a job like this one, then it can vectorize the code so that it copies N elements at a time instead of one.

Let’s look at what would happen if Input and Output happened to alias above. Firstly, the safety system will catch these common kinds of cases and provide user feedback if a mistake has been made. But let’s assume you’ve turned safety checks off. What would happen?

Because the memory locations slightly overlap, the value a from the Input ends up propagated across the entirety of Output. Now let’s assume that the compiler also vectorized this example because it wrongly thought the memory locations did not alias. What would happen then?

Very bad things happen – the Output will not contain the data you expected.
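The difference can be simulated in plain C#, without Burst. This is only an illustrative model: the OverlapDemo class, the five-element buffer, and the four-wide "vector" are assumptions, and a real vectorized loop uses SIMD registers rather than a temporary array.

```csharp
using System;

public static class OverlapDemo
{
    // Scalar copy: load one element, store it, then move to the next.
    public static void ScalarCopy(char[] memory, int input, int output, int count)
    {
        for (int i = 0; i < count; i++)
        {
            memory[output + i] = memory[input + i];
        }
    }

    // Simulated vectorized copy: load 'width' elements, then store them all.
    public static void VectorCopy(char[] memory, int input, int output, int count, int width)
    {
        var lanes = new char[width];
        for (int i = 0; i < count; i += width)
        {
            int n = Math.Min(width, count - i);
            Array.Copy(memory, input + i, lanes, 0, n);   // wide load
            Array.Copy(lanes, 0, memory, output + i, n);  // wide store
        }
    }

    // Input is memory[0..4) and Output is memory[1..5), so they overlap.
    public static (string Scalar, string Vector) Run()
    {
        var scalar = "abcde".ToCharArray();
        ScalarCopy(scalar, 0, 1, 4);

        var vector = "abcde".ToCharArray();
        VectorCopy(vector, 0, 1, 4, 4);

        return (new string(scalar), new string(vector)); // ("aaaaa", "aabcd")
    }
}
```

The scalar loop smears the first value across the buffer, while the wide version shifts the data by one element. The same source code gives two different results, which is why the compiler must not vectorize unless it can rule out the overlap.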

Aliasing limits the Burst compiler’s ability to optimize code. It takes an especially heavy toll on vectorization: if the compiler thinks that any of the variables being used in the loop can alias, it generally cannot safely vectorize the loop. In Burst 1.3.0 and later, with our extended and improved aliasing support, we have vastly improved our performance story around aliasing.

Introducing the [NoAlias] Attribute

In Burst 1.3.0 we’ve extended where the [NoAlias] attribute can be placed to four places:

  • On a function parameter it signifies that the parameter does not alias with any other parameter to the function, or with the ‘this’ pointer.
  • On a field it signifies that the field does not alias with any other field of the struct.
  • On a struct itself it signifies that the address of the struct cannot appear within the struct itself.
  • On a function return value it signifies that the returned pointer does not alias with any other pointer ever returned from the same function.

In cases of fields and parameters, if the field type or parameter type is a struct, “does not alias with X” means that all pointers that can be found through any of the fields (even indirectly) of that struct are guaranteed not to alias with X.

In cases of parameters, note that a [NoAlias] attribute on a parameter guarantees it does not alias with this, which is often a job struct that contains all the data for the job. In Entities.ForEach() scenarios, this will contain all the variables that were captured by the lambda.

We will now go through an example of each of these uses in turn.

NoAlias Function Parameter

If we look again at the example with Foo above, we can now add a [NoAlias] attribute and see what we get:
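A sketch of the attributed version, under the same assumptions as before (a hypothetical AliasingExamples wrapper class and ref int parameters). Marking one parameter is enough, since [NoAlias] on a parameter says it does not alias with any other parameter.

```csharp
using Unity.Burst;

public static class AliasingExamples
{
    // [NoAlias] promises that 'a' does not alias 'b' (or 'this').
    public static int Foo([NoAlias] ref int a, ref int b)
    {
        b = 13;
        a = 42;
        return b;
    }
}
```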

In the resulting assembly, the load from ‘b’ has been replaced with moving the constant 13 into the return register.

NoAlias Struct Field

Let’s take the same example from above but apply it to a struct instead:
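A sketch consistent with the walkthrough that follows. The field names a and b and the int/float element types are inferred from that description; the struct name Bar and the StructFieldExample class are assumptions.

```csharp
using Unity.Collections;

public struct Bar
{
    public NativeArray<int> a;
    public NativeArray<float> b;
}

public static class StructFieldExample
{
    public static int Foo(ref Bar bar)
    {
        bar.b[0] = 42.0f;
        bar.a[0] = 13;
        // Without aliasing information, b[0] must be reloaded here.
        return (int)bar.b[0];
    }
}
```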

In plain terms, the generated assembly:

  • Loads the address of the data in ‘b’ into rax.
  • Stores 42 into it (1109917696 is 0x42280000 which is 42.0f).
  • Loads the address of the data in ‘a’ into rcx.
  • Stores 13 into it.
  • Reloads the data in ‘b’ and converts it to an integer for returning.

Let’s assume that you as the user know that the two NativeArrays are not backed by the same memory. You could then mark both fields with [NoAlias].
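Sketched on a hypothetical Bar struct with two NativeArray fields named a and b, that would look like:

```csharp
using Unity.Burst;
using Unity.Collections;

public struct Bar
{
    // Each attributed field is promised not to alias any other field of Bar.
    [NoAlias] public NativeArray<int> a;
    [NoAlias] public NativeArray<float> b;
}
```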

By attributing both a and b with [NoAlias] we have told the compiler that they definitely do not alias with each other within the struct. The compiler can now just return the integer constant 42!

NoAlias on a Struct

For nearly all structs you create as a user, it is safe to assume that the pointer to the struct does not appear within the struct itself. Let’s take a look at a classic example where this is not true:
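One possible illustration is a circular list whose empty state points at itself. The CircularList name and the InitEmpty helper are assumptions made for this sketch, and it needs unsafe code enabled.

```csharp
public unsafe struct CircularList
{
    public CircularList* next;

    // An 'empty' list just points to itself, so the struct's own
    // address is stored inside the struct.
    public static void InitEmpty(CircularList* list)
    {
        list->next = list;
    }
}
```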

Lists are one of the few structures where it is normal to have the pointer to the struct accessible from somewhere within the struct itself.

Now onto a more concrete example of where [NoAlias] on a struct can help:
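A sketch matching the walkthrough below. The names bar, p and i come from the text; the concrete types and the StructExample wrapper class are assumptions.

```csharp
public unsafe struct Bar
{
    public int i;
    public void* p;
}

public static class StructExample
{
    public static unsafe float Foo(ref Bar bar)
    {
        // This store could, in principle, overwrite bar.p itself...
        *(int*)bar.p = 42;
        // ...so bar.p has to be loaded again before indexing.
        return ((float*)bar.p)[bar.i];
    }
}
```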

The generated assembly:

  • Loads ‘p’ into rax.
  • Stores 42 into ‘p’.
  • Loads ‘p’ into rax again!
  • Loads ‘i’ into ecx.
  • Returns the element of ‘p’ at index ‘i’.

Notice that it loaded ‘p’ twice – why? The reason is that the compiler does not know whether ‘p’ points to the address of the struct bar itself – so once it has stored 42 into ‘p’, it has to reload the address of ‘p’ from ‘bar’, just in case. A wasted load!

Let’s add [NoAlias] now:
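On the hypothetical Bar struct used for illustration here, the attribute goes on the struct declaration itself:

```csharp
using Unity.Burst;

// Promises that the address of a Bar never appears inside that Bar.
[NoAlias]
public unsafe struct Bar
{
    public int i;
    public void* p;
}
```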

With [NoAlias] on the struct, the generated assembly only loads the address of ‘p’ once, because we’ve told the compiler that ‘p’ cannot be the pointer to ‘bar’.

NoAlias Function Return

Some functions can only return a unique pointer. For instance, malloc will only ever give you a unique pointer. For these cases [return:NoAlias] can provide the compiler with some useful information.

Let’s take an example using a bump allocator backed with a stack allocation:
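A sketch of such an allocator. The names ptr1 and ptr2 come from the walkthrough below; BumpAlloc, Func, the 128-element buffer, and the convention of keeping the next free index in slot 0 are assumptions for illustration.

```csharp
using System.Runtime.CompilerServices;

public static class BumpAllocatorExample
{
    // Hands out the next free slot; slot 0 holds the index of that slot.
    // No-inline, otherwise the compiler would inline it and see everything.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static unsafe int* BumpAlloc(int* alloca)
    {
        int location = alloca[0]++;
        return alloca + location;
    }

    public static unsafe int Func()
    {
        int* alloca = stackalloc int[128];
        alloca[0] = 1; // first free slot, just after the counter

        int* ptr1 = BumpAlloc(alloca);
        int* ptr2 = BumpAlloc(alloca);

        *ptr1 = 42;
        *ptr2 = 13;

        // Without aliasing information, *ptr1 must be reloaded here.
        return *ptr1;
    }
}
```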

This produces quite a lot of assembly, but the key bit is that it:

  • Has ‘ptr1’ in rdi.
  • Has ‘ptr2’ in rax.
  • Stores 42 into ‘ptr1’.
  • Stores 13 into ‘ptr2’.
  • Loads ‘ptr1’ again to return it.

Let’s now add our [return: NoAlias] attribute:
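Applied to a hypothetical BumpAlloc function like the one described here, only the declaration changes:

```csharp
using System.Runtime.CompilerServices;
using Unity.Burst;

public static class BumpAllocatorExample
{
    // Every pointer returned from this function is promised to be unique.
    [MethodImpl(MethodImplOptions.NoInlining)]
    [return: NoAlias]
    static unsafe int* BumpAlloc(int* alloca)
    {
        int location = alloca[0]++;
        return alloca + location;
    }
}
```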

With the attribute in place, the compiler doesn’t reload ‘ptr1’ but simply moves 42 into the return register.

[return: NoAlias] should only ever be used on functions that are 100% guaranteed to produce a unique pointer, like with the bump-allocating example above, or with things like malloc. It is also important to note that the compiler aggressively inlines functions for performance considerations, and so small functions like the above will likely be inlined into their parents and produce the same result without the attribute (which is why we had to force no-inlining on the called function).

Function Cloning for Better Aliasing Deduction

For function calls where Burst knows about the aliasing between parameters to the function, Burst can infer the aliasing and propagate this onto the called function to allow for greater optimization opportunities. Let’s look at an example:
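A sketch of the situation. The function name Bar and the parameters a and b come from the text; the caller Foo, the constants, and the CloningExample wrapper class are assumptions.

```csharp
using System.Runtime.CompilerServices;

public static class CloningExample
{
    // Kept out-of-line so the call below is not simply inlined away.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int Bar(ref int a, ref int b)
    {
        a = 42;
        b = 13;
        return a;
    }

    public static int Foo()
    {
        // Two distinct locals, so the caller knows 'a' and 'b' cannot alias.
        var a = 53;
        var b = -2;
        return Bar(ref a, ref b);
    }
}
```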

Compiled on its own, Bar has to load ‘a’ a second time, because within the Bar function the compiler does not know the aliasing of ‘a’ and ‘b’. This is in line with what other compiler technologies will do with this code snippet.

Burst is smarter than this though. Through a process of function cloning, Burst will create a copy of Bar in which ‘a’ and ‘b’ are known not to alias, and replace the original call to Bar with a call to the copy. The resulting assembly doesn’t perform the second load from ‘a’.

Aliasing Checks

Since aliasing is so key to the compiler’s ability to optimize for performance, we’ve added some aliasing intrinsics:

  • Unity.Burst.CompilerServices.Aliasing.ExpectAliased expects that the two pointers do alias, and generates a compiler error if not.
  • Unity.Burst.CompilerServices.Aliasing.ExpectNotAliased expects that the two pointers do not alias, and generates a compiler error if not.

An example:
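A minimal sketch of the intrinsics in a job. The job name and fields are assumptions, and the method needs unsafe code enabled because GetUnsafePtr returns a raw pointer.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Collections.LowLevel.Unsafe;
using Unity.Jobs;
using static Unity.Burst.CompilerServices.Aliasing;

[BurstCompile]
public struct AliasingCheckJob : IJob
{
    public NativeArray<float> Input;
    public NativeArray<float> Output;

    public unsafe void Execute()
    {
        // Compile error if Burst cannot prove these two buffers are distinct.
        ExpectNotAliased(Input.GetUnsafePtr(), Output.GetUnsafePtr());

        // A pointer always aliases with itself.
        ExpectAliased(Input.GetUnsafePtr(), Input.GetUnsafePtr());

        Output[0] = Input[0];
    }
}
```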

These intrinsics allow you to be certain that the compiler has all the information that you as the user know. They are compile-time checks: as long as the code that produces the arguments for the intrinsics has no side effects, the intrinsics have no runtime cost. They are particularly useful for performance-sensitive code where you want to be sure that later changes do not alter the assumptions the compiler can make about aliasing. With Burst, and the control we have over the compiler, we can provide this sort of in-depth feedback from the compiler to our users to ensure your code remains as optimized as you intended.

Job System Aliasing

The Unity Job System has some built-in assumptions it can make about aliasing. The rules are:

  1. Any struct with a [JobProducerType] (e.g. anything like IJob, IJobParallelFor, etc.) knows that any field of that struct that is a [NativeContainer] (e.g. NativeArray, NativeSlice, etc.) cannot alias with any other field that is also a [NativeContainer].
  2. The above is true except for fields that have the [NativeDisableContainerSafetyRestriction] attribute on them. For these fields, the user has explicitly told the Job System that this field can alias with any other field of the struct.
  3. Any struct with a [NativeContainer] cannot have the ‘this’ pointer of that struct within the struct itself.

Ok all the formal definitions over, let’s look at some code to better explain the above rules:
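A sketch of a job that exercises each rule. The field names a, b and c follow the walkthrough below; the job name and element types are assumptions.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Collections.LowLevel.Unsafe;
using Unity.Jobs;
using static Unity.Burst.CompilerServices.Aliasing;

[BurstCompile]
public struct JobSystemAliasingJob : IJob
{
    public NativeArray<float> a;
    public NativeArray<float> b;

    [NativeDisableContainerSafetyRestriction]
    public NativeArray<float> c;

    public unsafe void Execute()
    {
        // Rule 1: two [NativeContainer] fields of a job struct do not alias.
        ExpectNotAliased(a.GetUnsafePtr(), b.GetUnsafePtr());

        // Rule 2: c opted out, so it may alias with a or b.
        ExpectAliased(a.GetUnsafePtr(), c.GetUnsafePtr());
        ExpectAliased(b.GetUnsafePtr(), c.GetUnsafePtr());

        // Rule 3: a container struct cannot live inside its own data.
        ExpectNotAliased(in a, a.GetUnsafePtr());
        ExpectNotAliased(in b, b.GetUnsafePtr());
        ExpectNotAliased(in c, c.GetUnsafePtr());
    }
}
```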

Walking through the above aliasing checks:

  • a and b do not alias since they are both [NativeContainer]s contained within a [JobProducerType] struct.
  • But since c has the field attribute [NativeDisableContainerSafetyRestriction] it can alias with a or b.
  • And the pointers to each of a, b, or c cannot appear within them (e.g. in this case the NativeArray struct itself cannot live in the memory backing the contents of the array).

These built-in aliasing rules allow Burst to perform pretty darn good optimizations for most user code, allowing the performance by default that we strive for.

Common Use Case Scenario

Many users will write code along the lines of BasicJob below:
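A sketch of such a job. The name BasicJob and the fields a, b, c and o come from the text; the job interface, the element type, and the exact arithmetic are assumptions.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;

[BurstCompile]
public struct BasicJob : IJobParallelFor
{
    public NativeArray<float> a;
    public NativeArray<float> b;
    public NativeArray<float> c;
    public NativeArray<float> o;

    public void Execute(int i)
    {
        // Combine three inputs and store the result in the fourth array.
        o[i] = a[i] * b[i] + c[i];
    }
}
```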

The code is loading from three arrays, combining their values, and storing the result to a fourth array. This kind of code is great for the compiler because it allows it to generate vectorized code, making the most of the powerful CPUs we all have in our mobiles and desktop computers today.

If we look at the Burst Inspector view of the above job, we can see the code is vectorized – the compiler has done a good job here! The compiler is able to vectorize because, as we explained above, the Unity Job System has rules that each [NativeContainer] field in a job struct cannot alias any other [NativeContainer] field in the struct.

But there are cases that can be seen in the wild where developers are building up data structures where Burst has no information on how the aliasing works with those structures, for example:
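A sketch of that pattern. The struct name Data and the fields a, b, c and o come from the text; the job name JobWithData and the field name d are assumptions.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;

public struct Data
{
    public NativeArray<float> a;
    public NativeArray<float> b;
    public NativeArray<float> c;
    public NativeArray<float> o;
}

[BurstCompile]
public struct JobWithData : IJobParallelFor
{
    // The arrays are no longer direct fields of the job struct.
    public Data d;

    public void Execute(int i)
    {
        d.o[i] = d.a[i] * d.b[i] + d.c[i];
    }
}
```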

In the above example we’ve just wrapped the data members from the BasicJob in a new struct Data, and stored this struct as the only variable in the parent job struct. Looking at the Burst Inspector now, Burst has been smart enough to vectorize this example, but at the cost of having to check at the start of the loop that none of the pointers being used overlap.

This is because the Job System aliasing rules only give Burst guarantees about direct variable members of a struct, not anything derived from them. So Burst has to assume that the native arrays behind a, b, c, and o could be backed by the same memory, which leads to the complicated and performance-draining dance of ‘Do any of these pointers actually equal each other?’. So how can we fix this? By using our [NoAlias] attribute to explain this to Burst!
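A sketch of the attributed version. WithAliasingInformationJob and Data are named in the text; the field name d, the job interface, and the arithmetic are assumptions.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;

public struct Data
{
    [NoAlias] public NativeArray<float> a;
    [NoAlias] public NativeArray<float> b;
    [NoAlias] public NativeArray<float> c;
    [NoAlias] public NativeArray<float> o;
}

[BurstCompile]
public struct WithAliasingInformationJob : IJobParallelFor
{
    public Data d;

    public void Execute(int i)
    {
        d.o[i] = d.a[i] * d.b[i] + d.c[i];
    }
}
```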

In the WithAliasingInformationJob job above, we can see that there are new [NoAlias] attributes set on the fields of Data. These [NoAlias] attributes are telling Burst that:

  • a, b, c, and o do not alias with any other member of Data that has a [NoAlias] attribute.
  • So each variable does not alias with any other variable in the struct because they all have the [NoAlias] attribute.

Looking at the Burst Inspector again, this change has removed all those expensive runtime pointer checks, and we can just get on with running the vectorized loop – nice!

Using the new Unity.Burst.CompilerServices.Aliasing intrinsics will ensure that you never accidentally change the code to affect aliasing again in the future. For example:
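A sketch of such guard checks on a job whose arrays are wrapped in a [NoAlias]-attributed struct. The job name CheckedAliasingJob and the field name d are assumptions, and only a few of the possible pairs are shown.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Collections.LowLevel.Unsafe;
using Unity.Jobs;
using static Unity.Burst.CompilerServices.Aliasing;

public struct Data
{
    [NoAlias] public NativeArray<float> a;
    [NoAlias] public NativeArray<float> b;
    [NoAlias] public NativeArray<float> c;
    [NoAlias] public NativeArray<float> o;
}

[BurstCompile]
public struct CheckedAliasingJob : IJobParallelFor
{
    public Data d;

    public unsafe void Execute(int i)
    {
        // If a later edit drops a [NoAlias], these stop compiling.
        ExpectNotAliased(d.a.GetUnsafePtr(), d.b.GetUnsafePtr());
        ExpectNotAliased(d.a.GetUnsafePtr(), d.c.GetUnsafePtr());
        ExpectNotAliased(d.a.GetUnsafePtr(), d.o.GetUnsafePtr());
        ExpectNotAliased(d.b.GetUnsafePtr(), d.o.GetUnsafePtr());
        ExpectNotAliased(d.c.GetUnsafePtr(), d.o.GetUnsafePtr());

        d.o[i] = d.a[i] * d.b[i] + d.c[i];
    }
}
```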

These checks do not cause a compiler error in the above job, which means, as we have already seen, that Burst has enough information from the added [NoAlias] attributes to detect and optimize this case.

Now while this is a contrived example for the sake of conciseness in this blog, these kinds of aliasing hints can provide real-world performance benefits in your code. As we always recommend, using the Burst Inspector when iterating on code modifications you have made will ensure that you keep stepping towards a more optimized future.

Conclusion

With the release of Burst 1.3.0 we provided you another set of tools to get the maximum performance from your code. With the extended and enhanced [NoAlias] support you can perfectly control how your data structures work. And the new compiler intrinsics give you a meaningful insight into how the compiler understands your code.

If you haven’t started with Burst yet and would like to learn more about our work on the new Data-Oriented Technology Stack (DOTS), head over to our DOTS pages, where we will be adding more learning resources and links to talks from our teams as more becomes available. 

We always welcome your feedback – join the forum here to let us know how we can help you level up your Burst code in future.

16 replies on “Enhanced Aliasing with Burst”

This is like having your whole car covered in bird’s cr*p, and freaking out because you saw ant and you didn’t want it to dirty your car. Can the ant make your car more dirty? Yeah….
will it make any real difference? F*** no….
99% of unity api is C# manager layer to CPP layer call and is almost as slow as reflection(very, VERY slow compared to any normal direct method call), and they are optimising some memory aliasing for burst that will be used in 0.01% cases by 0.1% users.
Thats typical unity for ya rotfl.

Most of the engine features are being rewritten to make use of burst, and many of the old APIs are getting updated with support for the Job System. You can definitely make use of those optimizations today. In the title I’m working on, Burst is a life saver. Btw, if you’re having problems with extern call performance, you’re definitely doing something wrong – you can relatively easily refactor your code to mostly avoid those.

Are there any cleaner way of writing these “ExpectNotAliased” checks? As the code is in your example, it sort of feels like you have unit tests in your production code…

Also, some sort of tool that would highlight which jobs the compiler expects to have Aliasing in would be super useful, instead of having to read machine code to figure out which jobs can be optimized.

Very interesting. It’s great that you provide this power and also the detailed explanation. The doc in the burst manual was already really good but it is great that you take the time to write these in depth blogs.

I showed burst to a friend last night who is not a game developer and he was like. Can i generate code with this and call in a normal .NET core app? :D

Burst and all DOTS stack is fantastic. Never had so much pleasure while coding stuff. Good job.
P.S. quaterniond or ability to multiply quaternion with double3 would be great too.

Sorry guys but you are creating an engine which is harder than to learn unreal\c++ while lots of missing and incomplete features with worse graphics except the tech demos you created inhouse

I would strongly have to disagree here, we are using DOTS and it has reduced complexity by order of magnitudes in comparison to C++. I’m really surprised at the comment.

Nonsense, have you actually used it? Its many times easier to learn than c++, and very different. Your comment makes it sound like you have only briefly glanced over DOTS related things rather than used them yourself.

DOTS is not only much simpler to use than UE4 c++, but it’s also easier to make *robust/scalable* code than in Monobehaviour. DOTS is code architecture heaven and makes you save a tremendous amount of time

It really feels like most of the people who actually tried DOTS in practice love it, and most of the people who’ve never tried it think it’s too scary and complicated

Haha, it’s funny that you use monospaced font in your text output with assembly code, but still use a non monospaced code in console output and test runner, is that a very hard to do? This is are a rhetorical question, since I ll made my test runner write result in monospaced font, and this is awesome, but console does not supplied with package manager.
