How to test Unity in a week?
That is the question we constantly face as Quality Experts in Sustained Engineering. Every week we release two new minor versions of Unity with fixes and improvements. A crucial part of our job is to assess the quality of these builds, identify possible issues, provide feedback to the development teams, and give the green light for the update to be released. This question became even more prominent with the introduction of LTS, our Long Term Support version. But how can you test something as complex as Unity in a week and make sure that what ends up in the hands of hundreds of thousands of users will not turn everything magenta pink?
Let the robots take over
The first part of the answer to this question is a bit obvious: automation. Lots and lots of automation, actually! Our builds have to pass around 58,000 unique tests. Take note of the word “unique”, because the actual number of test runs is much bigger: many of these tests run on different configurations and different platforms.
So before it even reaches human hands, an update release for LTS gets hammered by tests to make sure that it works as expected. These tests come in a lot of different flavors. The majority of automated tests are Editor and Native tests. The Editor tests cover the editor’s functionality. The Native tests cover the C++ side of Unity’s code. On top of those, we also have Playmode tests, Graphics tests, and Performance tests, to name a few. If even one of those tests fails, the build goes back to cooking until we discover what is wrong with the failed test, incorporate the fix, and create a new build.
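To make the gate concrete, here is a minimal sketch in Python. It is not our actual tooling: the `RunResult` record and the function names are hypothetical, made up for illustration. It shows the two ideas from above: the same unique test can run on several platforms, and a single failed run is enough to send the build back.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """One run of one test on one platform (hypothetical record)."""
    suite: str      # e.g. "Editor", "Native", "Playmode"
    name: str       # name of the test within its suite
    platform: str   # configuration or platform the run executed on
    passed: bool


def unique_test_count(results):
    """Count distinct tests, ignoring how many platforms each one ran on."""
    return len({(r.suite, r.name) for r in results})


def failed_runs(results):
    """Return every run that did not pass."""
    return [r for r in results if not r.passed]


def ready_for_manual_qa(results):
    """The build moves on only if tests ran and none of the runs failed."""
    return len(results) > 0 and not failed_runs(results)


if __name__ == "__main__":
    results = [
        RunResult("Editor", "OpenScene", "Windows", True),
        RunResult("Editor", "OpenScene", "macOS", True),
        RunResult("Native", "AllocatorAlignment", "Windows", True),
        RunResult("Playmode", "PhysicsStep", "Android", False),
    ]
    print("unique tests:", unique_test_count(results))
    print("total runs:", len(results))
    print("ready for manual QA:", ready_for_manual_qa(results))
    for run in failed_runs(results):
        print("failed:", run.suite, run.name, "on", run.platform)
```

In this sample there are three unique tests but four runs, and the one failing Playmode run keeps the build from moving on.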
After the build passes all the tests, it lands in our hands and the manual quality assurance process starts. There are two main phases in this process: the case verification phase, where we verify that all the bugs we believe are fixed are actually fixed, and the “no regression” phase, where we check whether the fixes have created new bugs. Note that the process we follow is the same for both the LTS and the Tech Releases, so they both adhere to the same standards of quality.
Every fix included in an update is verified by the person responsible for the associated bug. This could be anyone in the Unity organization: Embedded Quality Engineers who specialise in certain areas like Graphics or 2D, Field Engineers who filed a bug while getting feedback from a client, or the QA Student Worker team, the frontline of Unity’s QA, which is responsible for converting the hundreds of reports we get from Unity’s users into bug reports.
Each of these people is responsible for checking that the bug has been fixed. This is usually the second or third time a fix is verified, because someone also verifies it on a development branch before it lands in a particular release version.
The above process is how we strive hard to ensure that when we say something is fixed, it is actually fixed. If the fix doesn’t work, the bug goes back to the developer who originally added the fix.
The positive side effect of the above process is that we get a lot of eyes on the build. Different people from different areas take a look at the build and assess its quality.
Release Acceptance Test
This is our baseline test phase. It’s an all-around manual testing phase that pokes different areas of Unity and makes sure that all the basic parts are working as they are supposed to. It covers main areas like 3D, 2D, Animation, XR, and others. We also test that everything builds and runs correctly on all the popular platforms. We have tests for Windows, macOS, Android, iOS, and the consoles that we support.
Our Release Acceptance Test suite is evolving constantly and we change it depending on the changing feature set. Moreover, we do major overhauls from time to time based on how effective certain cases are in catching issues.
Targeted Exploratory Testing
The Release Acceptance Test described above is our “nothing major is broken” phase. It’s usually the same across different builds and isn’t affected that much by the type of fixes that go into a release.
In order to identify possible regressions that might come in a build with a fix, we have a targeted exploratory testing phase. Every week we have a meeting where we go over the different types of fixes that are going into an update and assess them. Depending on this assessment, we decide which areas need a bit more poking around. If we see that a build has a major 2D fix that affects a lot of areas, we’re going to do a round of exploratory testing around the 2D features. To make the testing even more targeted, the person responsible for a specific area also takes a look at the fixes and the code that was added.
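The assessment itself is a conversation between people, not a formula, but the idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `Fix` record, the 1 to 3 `impact` score, and the threshold are hypothetical and not part of our real process.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Fix:
    """One fix going into an update (hypothetical record)."""
    case_id: str
    area: str                 # primary area of the fix, e.g. "2D"
    impact: int               # hypothetical 1-3 score given in the weekly meeting
    also_touches: tuple = ()  # other areas the fix is believed to affect


def areas_to_explore(fixes, threshold=3):
    """Return areas whose summed impact reaches the threshold, riskiest first."""
    score = Counter()
    for fix in fixes:
        # A fix adds its impact to its own area and to every area it touches.
        for area in (fix.area, *fix.also_touches):
            score[area] += fix.impact
    flagged = [area for area, total in score.items() if total >= threshold]
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(flagged, key=lambda area: (-score[area], area))


if __name__ == "__main__":
    fixes = [
        Fix("case-A", "2D", 3, also_touches=("Physics",)),
        Fix("case-B", "Animation", 1),
        Fix("case-C", "Physics", 1),
    ]
    print(areas_to_explore(fixes))
```

With this sample data, the major 2D fix flags both 2D and Physics for an extra round of exploratory testing, while the small Animation fix stays below the threshold.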
We have Unity projects of several popular games and applications that game studios were kind enough to make available to us. We use those to test that nothing breaks when upgrading from one version to another. We import these projects into the new build and then run them in the editor to make sure that we don’t get any errors. In most cases, we also create a build to verify that nothing is broken across the Unity pipeline.
Moreover, we test with various popular Asset Store packages, with content that our Content team has created, like the 3D Game Kit and the Tower Defense Template, and also with packages that have been developed internally to test specific areas like Particles and 2D Physics.
All of the above is what we do at the moment to make sure that the LTS is as stable as possible and that moving along the LTS line will not create any issues for users. We’re doing fairly well, but we’re trying to get even better. That is why whatever you read above might not be completely accurate even three months from now. We track how well we’re doing and adjust our process accordingly.
Our philosophy is to catch problems as early as possible and even eliminate the factors that might cause them if that is possible. Our end goal is to have “almost zero” regressions on an LTS release, no matter how many fixes land on it.
One example of the things we do to reach that point is a Root Cause Analysis. We identify the critical issues that we missed, try to figure out why we missed them, and more importantly, why the issue was created in the first place. Then depending on the findings, we optimize our process to handle similar issues better in the future.
A big chunk of our time is also spent identifying tools that might make the overall quality of every Unity release better. These could range from a tool that could give us valuable feedback when a developer commits code to data analytics tools that would make it easier for us to identify riskier areas of the code and optimize our testing. Some of these tools are already helping us, others are prototypes and others are still in our heads.
Unity is a complex piece of software that is used by millions of users in all sorts of exciting and creative ways. Is it even possible to test a piece of software like that in a week and make sure that the fixes you put in didn’t create any regression? It’s a hard problem that we are very eager to solve.