Graphics tests, the last line of automated testing
There are dozens of platforms you can deploy to with Unity. It can be difficult for us developers at Unity to maintain visual integrity on all of them. Here is a quick peek into how we ensure graphics features don’t get unintentionally broken.
We have a lot of automated tests: unit tests, integration tests, system tests inside Unity itself in the form of Editor and Playmode tests, and finally graphics tests. A graphics test sets up a scene with specific graphical features turned on or off, builds that scene, runs it on all supported devices and finally renders the output as an image.
The resulting image is then compared with a previously approved reference image for that combination of scene, graphics settings and device. Should any of the resulting images differ from the reference images, we flag the test as failed, and someone needs to manually verify whether the failure comes from an intentional change or from an unintentional one that needs to get fixed.
Since it’s not always easy to spot changes from the reference image to the resulting test image (see the example below), we also provide the failed test with a diff image.
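To make the comparison and diff step concrete, here is a minimal sketch in Python. It is not the code our test runner uses; the image representation (nested lists of RGB tuples), the function names and the `tolerance` parameter are all assumptions made for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels; a stand-in for a real image type


def diff_image(reference: Image, result: Image) -> Image:
    """Return the per-channel absolute difference of two same-sized images."""
    same_height = len(reference) == len(result)
    same_widths = all(len(a) == len(b) for a, b in zip(reference, result))
    if not (same_height and same_widths):
        raise ValueError("images must have identical dimensions")
    return [
        [
            tuple(abs(a - b) for a, b in zip(ref_px, res_px))
            for ref_px, res_px in zip(ref_row, res_row)
        ]
        for ref_row, res_row in zip(reference, result)
    ]


def images_match(reference: Image, result: Image, tolerance: int = 0) -> bool:
    """True when no channel differs by more than `tolerance` (hypothetical threshold)."""
    diff = diff_image(reference, result)
    return all(channel <= tolerance for row in diff for px in row for channel in px)


# A one-row, two-pixel example: the second pixel is slightly darker in the result.
reference = [[(10, 10, 10), (200, 0, 0)]]
result = [[(10, 10, 10), (190, 0, 0)]]
print(diff_image(reference, result))
print(images_match(reference, result))
```

In the diff, unchanged pixels come out black and changed pixels light up, which is why a diff image makes small regressions much easier to find than eyeballing two near-identical renders.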
What makes graphics tests more difficult to work with than normal tests is that they are brittle. Different platforms, device models and graphics cards produce slightly different results. So to get consistent results, graphics tests must be executed on the test farm, where we are sure the hardware remains the same. This makes the workflow for updating tests or adding a new one a bit convoluted. The developer has to:
- Make and push their changes
- Run the graphics tests on all appropriate devices
- Wait for the tests to complete and fail
- Download the failed reference images from each of the builds
- Compare each reference image with the resulting image to ensure that the changes made are the expected changes
- Copy all the new images that need to be updated into the graphics tests repository
- Commit and push the changes to the graphics tests repository
- Run the graphics tests again
This entire process can be very time consuming. To make it a bit easier, we made a small Polymer application with an ASP.NET Core backend that queries our build statistics system, Hoarder, and finds all the graphics tests on a specific revision. It then downloads the graphics test artifacts from the build system for each of the builds and presents the results on a single web page.
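The aggregation step can be pictured roughly like the sketch below. It deliberately does not call Hoarder or the build system, since their APIs are not described here; instead it works on an in-memory list of records. The `GraphicsTestResult` type, its fields and the configuration names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class GraphicsTestResult:
    """One test outcome on one build configuration (hypothetical record shape)."""
    revision: str
    build_config: str
    test_name: str
    passed: bool


def failed_tests_by_config(
    results: Iterable[GraphicsTestResult], revision: str
) -> Dict[str, List[str]]:
    """Group the failed tests of one revision by build configuration."""
    failures: Dict[str, List[str]] = defaultdict(list)
    for r in results:
        if r.revision == revision and not r.passed:
            failures[r.build_config].append(r.test_name)
    # Sort for a stable, page-friendly ordering.
    return {config: sorted(names) for config, names in sorted(failures.items())}


results = [
    GraphicsTestResult("abc123", "Windows_DX11", "Shadows", False),
    GraphicsTestResult("abc123", "Windows_DX11", "Fog", True),
    GraphicsTestResult("abc123", "Android_GLES3", "Shadows", False),
    GraphicsTestResult("def456", "Windows_DX11", "Fog", False),
]
print(failed_tests_by_config(results, "abc123"))
```

Collecting everything for one revision into a single structure is what lets the page show all failures side by side instead of one build at a time.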
The developer can then see the failed tests and compare them with the reference images and diff images. However, the changes between two images aren’t always easy to spot; see the two images below:
So the tool lets the developer toggle between the test image and the reference image and/or the diff image to quickly see what changed. Swapping back and forth between the images makes even subtle differences stand out.
The developer can then select the tests they want to update and either get a command line that automatically downloads the selected images and updates them in their graphics tests repository, or download a combined zip file with the correct directory structure and copy the images into the repository manually.
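As an illustration of the zip option, the following sketch packs a set of selected images into an archive whose paths mirror a repository layout. The layout used here, `<build_config>/<test_name>.png`, is an assumption for the example and not the actual structure of the graphics tests repository; the function name is hypothetical as well.

```python
import io
import zipfile
from typing import Dict, Tuple


def build_update_zip(selected: Dict[Tuple[str, str], bytes]) -> bytes:
    """Pack selected images into a zip laid out as <build_config>/<test_name>.png.

    `selected` maps (build_config, test_name) to the raw image bytes.
    The directory layout is an assumed one, chosen for illustration.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for (build_config, test_name), image_bytes in sorted(selected.items()):
            archive.writestr(f"{build_config}/{test_name}.png", image_bytes)
    return buffer.getvalue()


# Placeholder bytes stand in for real PNG data.
selected = {
    ("Windows_DX11", "Shadows"): b"new-shadows-image",
    ("Android_GLES3", "Fog"): b"new-fog-image",
}
zip_bytes = build_update_zip(selected)
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
    print(archive.namelist())
```

Because the archive already carries the directory structure, extracting it at the root of the repository puts every image where it belongs, which removes the error-prone step of copying files into the right folders one by one.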
With over 13,700 graphics tests distributed among 33 build configurations, and several updates every day to the graphics repository, this tool makes a developer’s life a bit better and reduces some of the manual overhead of working with graphics tests.