
Unity, as you may know, helps you build for many different platforms, be it consoles, computers, mobile devices or XR. To do this, it relies on things like a compiler, specific compiler flags, various SDKs, third-party libraries and maybe even a specific OS required by the platform. We call this combination of software a Toolchain.

So as consoles, VR headsets, and phone or smart TV operating systems are introduced, updated or made obsolete, we need to tinker with these Toolchains and/or make the necessary changes to the Unity codebase to keep everything working as intended.

What Are We Trying to Solve?

As with all software, unexpected behaviour occurs more often than you’d think. This can be anything from bugs to undocumented behaviour. Compilers and SDKs aren’t immune to this, and like any other software they are affected by updates: undocumented behaviour you rely on may suddenly change, and bugs may be fixed or introduced when you update. Add to this that different platforms may support wildly different feature sets, with some being bleeding edge and others more archaic.

Having some order in this and keeping track of changes seems like a good thing.

Over the years we have had to handle many scenarios for our Toolchains, such as bugs in specific compilers or certain Toolchains lacking some compiler features. Even the now somewhat old C++11 isn’t something we can use across our entire code base, simply because some of our platforms do not, and cannot, support it.

Depending on the state of these different Toolchains at a given point in time, we need to adopt different strategies.

This means that a workaround for a problem can long outlive the problem itself and remain in the code, warning future developers about something that really isn’t an issue anymore.

So we need to stay on top of what is fixed, broken, supported and unsupported for our Toolchains as they get added, updated or left by the wayside.

Another issue is the round-trip time for our developers who work at the lower levels of our code base. Today, if they are for example writing some new macros and want to be sure these will work across all our Toolchains, they end up having to build the editor on our build system for each platform just to find out whether it compiles. That is very time consuming and wastes a lot of resources on our build farm just to see if a macro will compile or not.

So this is also something we want to try to improve.

What Is the Compiler Test Framework?

It’s a series of C# NUnit tests, each built around a small C or C++ snippet that gets compiled against all our Toolchains. Each test contains one code snippet which asserts on the behaviour we’re expecting. Even if this behaviour is in fact a bug and the snippet doesn’t compile, it is still a behaviour we expect, and we will write a test for it. So if it’s fixed down the line by a compiler or SDK update, we will be notified, and maybe that will mean we can finally use that compiler feature across all our platforms.

For now the framework only compiles the snippets; it doesn’t run the binary it generates or analyse it in any way. So the tests can’t make runtime assertions, only static ones.
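To make "static assertions only" concrete, here is a minimal sketch of what such a snippet could look like in C++11. It is an illustration and not taken from the actual test suite; AlignUp and the specific checks are made up. If the file compiles, every assumption in it held, and nothing needs to be executed.

```cpp
#include <climits>
#include <cstdint>
#include <type_traits>

// Hypothetical snippet: every check is evaluated by the compiler,
// so a successful build means all assumptions below are true.
static_assert(sizeof(std::uint32_t) * CHAR_BIT == 32,
              "uint32_t must be exactly 32 bits wide");
static_assert(std::is_unsigned<std::uint32_t>::value,
              "uint32_t must be unsigned");

// Hypothetical helper: rounds value up to the next multiple of alignment.
// Written as a single return statement so it is valid C++11 constexpr.
constexpr int AlignUp(int value, int alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

// Computed values can be checked at compile time as well.
static_assert(AlignUp(13, 8) == 16, "AlignUp must round up");
static_assert(AlignUp(16, 8) == 16, "aligned values must stay unchanged");
```

A toolchain without C++11 support would reject this file outright, which is itself a result a test could record.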

The compiler tests use our build system, which is written in C#, for all the heavy lifting: it runs the build for each of our Toolchains, using the snippet in the test as the source file.

Doing this lets us assert on compiler output, including build errors and warnings. So we can do things like saying “This code works for all our Toolchains except for these two, and for these two it will print this error message” and mark the test as successful as long as this remains true.

You may think it’s weird that a build failure should still let a test succeed, but the compiler tests are all about writing down our assumptions about the toolchains.

If we know that a specific feature or corner case currently isn’t supported on one platform, we make sure there is a test added which says so.

This means that we can easily look up in our code which features are supported by our Toolchains. It will also let us know if this changes at some point as we update our Toolchains.
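The idea that an expected build failure counts as a pass can be sketched as follows. This is only an illustration in C++11, not the framework's C# code: CompileResult, Expectation and MatchesExpectation are hypothetical names, and the error text is a placeholder rather than real compiler output. Everything is constexpr so that the sketch checks itself at compile time.

```cpp
// Hypothetical record of what one toolchain did with one snippet.
struct CompileResult
{
    bool succeeded;
    const char* output; // compiler output, empty on a clean build
};

// Hypothetical record of what we currently believe about that toolchain.
struct Expectation
{
    bool shouldCompile;
    const char* expectedError; // text the output must contain on failure
};

// True if text begins with prefix.
constexpr bool StartsWith(const char* text, const char* prefix)
{
    return *prefix == '\0'
        ? true
        : (*text == *prefix && StartsWith(text + 1, prefix + 1));
}

// True if needle occurs anywhere in text.
constexpr bool Contains(const char* text, const char* needle)
{
    return StartsWith(text, needle)
        ? true
        : (*text == '\0' ? false : Contains(text + 1, needle));
}

// The test passes when reality matches the recorded assumption,
// whether that assumption is "compiles" or "fails with this error".
constexpr bool MatchesExpectation(CompileResult result, Expectation expected)
{
    return expected.shouldCompile
        ? result.succeeded
        : (!result.succeeded && Contains(result.output, expected.expectedError));
}

// A known failure that still fails as recorded: the test passes.
static_assert(MatchesExpectation(CompileResult{false, "error: placeholder message"},
                                 Expectation{false, "placeholder"}),
              "an expected failure should count as a pass");

// The toolchain got fixed: the build now succeeds, so the test fails and tells us.
static_assert(!MatchesExpectation(CompileResult{true, ""},
                                  Expectation{false, "placeholder"}),
              "an unexpected success should be reported");
```

The second check is the interesting one: a build that starts succeeding is flagged just like one that starts failing, because either way an assumption has changed.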

This will also help the low level developers with their round-trip time for testing that changes compile on all platforms and that any compile time assertions are valid.

The build system knows about our code base, so the developers can include the header files for what they have implemented and do extra assertions if necessary in the test. This also allows them to explicitly test for things that shouldn’t compile, which our normal builds wouldn’t allow for obvious reasons.

In Practice

Take the example below which is one of our tests:

What you see here is an unexpected behaviour we found with the MSVC compiler.

If we compile the code in the snippet, it will fully expand the macro on all our Toolchains, except for our Visual Studio Toolchain.

As you can see in the comments in the code, we have made some assumptions in our macros elsewhere based on this behaviour. So if this test starts failing for a future Visual Studio version because the compilation succeeds (strange, huh?), we can get rid of this special consideration in our macros and no longer have to maintain it.
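To illustrate the kind of preprocessor difference involved (this is an assumed example, not the snippet from the test above): one widely known case is that Visual Studio's traditional preprocessor forwards __VA_ARGS__ to a nested macro as a single argument instead of splitting it at the commas. The sketch below shows the usual workaround, an extra expansion pass through an identity macro. EXPAND, PICK_FOURTH and COUNT_ARGS are hypothetical names.

```cpp
// Identity macro: forces one more expansion pass over its argument,
// which makes the traditional MSVC preprocessor split __VA_ARGS__
// into separate arguments. Harmless on conforming preprocessors.
#define EXPAND(x) x

// Selects its fourth argument and discards everything else.
#define PICK_FOURTH(a, b, c, n, ...) n

// Counts one to three arguments by shifting a descending list of
// numbers so that the right one lands in the fourth position.
#define COUNT_ARGS(...) EXPAND(PICK_FOURTH(__VA_ARGS__, 3, 2, 1, 0))

// The arguments are only counted, never evaluated, so they do not
// need to be declared.
static_assert(COUNT_ARGS(x) == 1, "one argument");
static_assert(COUNT_ARGS(x, y) == 2, "two arguments");
static_assert(COUNT_ARGS(x, y, z) == 3, "three arguments");
```

Without EXPAND, a conforming preprocessor still yields 2 for COUNT_ARGS(x, y), while the traditional MSVC preprocessor would hand "x, y" to PICK_FOURTH as its first argument and yield 1. A snippet asserting the unwrapped form would therefore compile everywhere except on that toolchain.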

This holds true for other Toolchains as well. Just in terms of the C++ language features we can use, there is a huge disparity between our different Toolchains. Some features we can’t use at all because a specific Toolchain doesn’t support them, and for some of these we have written our own emulations. With these tests helping us keep track, maybe we can even go full C++11 some day, and drop many of the emulations we have been forced to create for some of our platforms.
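An emulation of this kind might look like the sketch below, which falls back to a C++03 trick when static_assert isn't available. It is an assumed example and not code from the Unity code base; all SKETCH_ macro names are hypothetical.

```cpp
// Use the real static_assert when the compiler reports C++11 or newer.
// Note: some compilers report an older __cplusplus value by default
// (MSVC does unless /Zc:__cplusplus is set); they take the fallback
// branch, which still works.
#if defined(__cplusplus) && __cplusplus >= 201103L
    #define SKETCH_STATIC_ASSERT(cond, msg) static_assert(cond, msg)
#else
    // C++03 fallback: an array type with a negative size is ill-formed,
    // so a false condition stops the build. The message is not shown.
    #define SKETCH_CONCAT_IMPL(a, b) a##b
    #define SKETCH_CONCAT(a, b) SKETCH_CONCAT_IMPL(a, b)
    #define SKETCH_STATIC_ASSERT(cond, msg) \
        typedef char SKETCH_CONCAT(sketch_static_assert_, __LINE__)[(cond) ? 1 : -1]
#endif

// Usage is identical on both paths.
SKETCH_STATIC_ASSERT(sizeof(char) == 1, "char is one byte by definition");
SKETCH_STATIC_ASSERT(sizeof(long) >= sizeof(char), "long is at least as large as char");
```

Once tests show that every Toolchain accepts the native feature, the fallback branch and the macro itself can be retired.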

So for the framework we have created some attributes to facilitate this for us. The CompilerTestAttribute essentially just figures out which Toolchains are supported on the current OS and makes sure the test gets run multiple times with a different Toolchain selected each time.

This is coupled to the CompilerTestBase class we inherit from, which just sets things up for the currently requested Toolchain to make sure Compile() only compiles for the correct toolchain and reports back the results.
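The "run once per Toolchain supported on this OS" idea can be sketched like this. It is a C++11 illustration only, not the C# attribute itself: Host, Toolchain, RunsOn and RunCount are hypothetical names, and the table lists just the Toolchains and host systems mentioned in this post, so it is surely incomplete.

```cpp
// Hypothetical host flags, one bit per operating system.
enum Host
{
    Windows = 1,
    Mac = 2,
    Linux = 4
};

// Hypothetical toolchain entry: a name plus the hosts it can build on.
struct Toolchain
{
    const char* name;
    unsigned hosts;
};

// Illustrative table only; the host assignments are assumptions.
constexpr Toolchain kToolchains[] = {
    {"Visual Studio 2010", Windows},
    {"Visual Studio 2015 x86", Windows},
    {"Visual Studio 2015 x64", Windows},
    {"Emscripten", Windows},
    {"Mac Editor", Mac},
};
constexpr int kToolchainCount = sizeof(kToolchains) / sizeof(kToolchains[0]);

// True if the toolchain can be used on the given host.
constexpr bool RunsOn(Toolchain toolchain, Host host)
{
    return (toolchain.hosts & host) != 0;
}

// How many times a test would run on a host: once per supported toolchain.
constexpr int RunCount(Host host, int index = 0)
{
    return index == kToolchainCount
        ? 0
        : (RunsOn(kToolchains[index], host) ? 1 : 0) + RunCount(host, index + 1);
}

static_assert(RunCount(Windows) == 4, "four toolchains in this table run on Windows");
static_assert(RunCount(Mac) == 1, "one toolchain in this table runs on Mac");
```

Keeping the selection in one place means a test author writes the test once and never has to think about which machine it happens to run on.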

We also have some custom documentation attributes which we use to generate a report for our tests. You can read more about this further down.

Below you can see how it looks if I run the test on my Windows machine:

It runs on our Visual Studio 2010 and 2015 (x86 and x64) and Emscripten Toolchains. Since this is a Windows machine, only the Toolchains supported on Windows are run. If I were to run this on my Mac, it would show our Mac Editor Toolchain instead.

Our build runners on Katana are configured to run these tests on Mac, Windows and Linux, so together they cover all our Toolchains. If something changes in our Toolchains that is covered by our tests, the build will fail and let us know. This also allows developers to easily make sure that the tests they’re writing work across all our supported Toolchains. As of right now, it takes around 8 minutes to run the test suite for all the Toolchains we support (1 minute to check out the code, 6 minutes to prepare and build the build system and 1 minute to run all build system tests).

We also created a little tool that will generate some information on the current status of the compiler tests in trunk, which we then use to generate a report on an internal website as you can see here.

Compiler Test Report Site

Here you can see all the information about the tests we have now. We only have one case where it intentionally fails so far, and that is the Macro Expansions test for the Visual Studio toolchain that I showed you above.

So our devs can just visit this page to get a better picture of the situation if they feel unsure. The custom documentation attributes I mentioned earlier were meant for this purpose. As you see below, you can hover over the various sections and get information about the individual entries.

Each row in the report is the result for one test class, which may contain multiple tests. Each class should represent one high-level feature.

Our devs can also use the handy links on the right-hand side to browse the test class in our code repository if they want to take a closer look at what is actually tested.

Conclusion

You can see that in truth we don’t have that many tests and Toolchains added yet. This has much to do with the fact that Unity’s entire build system hasn’t yet been moved over to the new C#-based one, and the framework is highly dependent on it. So these are still very early days for the Compiler Test Framework. We hope it will grow steadily as our Toolchains are ported over to C#, until we have mapped out all the anomalies we have detected, as well as all the compiler features we use and want to use. You can expect to see a future blog post on this with a status update.

 

Leave a reply


  1. Great insight into how Unity Technologies manages things. I did not know about JetBrains Katana; very nice. Looking forward to how ML-Agents integrates into all this while preserving the use of Python and TensorFlow; which has great Serving of models.

  2. Thanks for posting this, it was an interesting read.

    I was wondering how long you have been using this test framework, but then read in your conclusion that it’s still very early days for the Compiler Test Framework.

    How often did you run into the kind of issues that you now expect to catch with the new test framework?

    1. Thanks for reading, glad you liked it.
      We got the initial set of tests added 5 months or so ago.

      Finding new issues is in itself not very frequent. However, it is fairly regular that people run into issues we already know about, or ask “Can we use X on all platforms?”. It is easy to lose track of these findings over time, and if it has been years since someone noticed one, you may have to investigate it again to know if it still applies.
      So a large part of it is just automatically keeping track of behavior we know and make assumptions about, so we can easily find this information as well as know if something suddenly changes it.

      For example the code you find in the blog post was one such scenario.
      We were discussing some macros that seemed to be doing some extra work, and I couldn’t understand why. The reason turned out to be the case the test now covers for MSVC.
      While discussing it though we did spend a little time on checking that the issue was in fact still there, since we had updated the compiler since the macros were originally written. It could potentially be code that was protecting against something that wasn’t an issue anymore and we could have removed it. The issue was still there however, so now the test will let us know when we can remove that extra code without anyone having to keep track of it.