On Web Player Regression Testing
One of the unwritten commitments we make to our customers is that our webplayer will play back their content identically in every version we publish. We constantly tweak and improve the webplayer. For example, webplayer 2.6 added threaded background loading and improved animation culling. Each change that we make to improve the webplayer has the potential to break a customer game, and we don’t like that. Remember, webplayer 2.6.1 can play content authored with Unity 2.0, 2.1, 2.5 and so on. To make sure that we don’t break customer games with new webplayers, we have to do a huge amount of regression testing. Lots and lots and lots. So much, in fact, that we have built a dedicated test rig to do the testing for us.
The regression rig has a server and multiple clients. The clients are PCs and Macs with our internal daily build webplayer installed. The server has a big list of our customer games, and sends jobs to each client that basically say “play this customer game”. Through a little bit of magic the clients take a screenshot every second that the game runs, and these screenshots are forwarded on to the server as a record of the test execution.
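To make the server/client flow concrete, here is a minimal sketch in Python. It is not our rig's actual code: the names `Job`, `RunRecord`, `run_job` and `run_rig` are made up for illustration, the `capture` callback stands in for the screenshot "magic", and it assumes every game is played on every client, which the real scheduler may not do.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    game: str          # hypothetical identifier for a customer game
    duration_s: int    # how many seconds the game is played for


@dataclass
class RunRecord:
    client: str
    game: str
    screenshots: list = field(default_factory=list)


def run_job(client, job, capture):
    """Play one game on one client, taking one screenshot per second."""
    record = RunRecord(client=client, game=job.game)
    for second in range(job.duration_s):
        # capture() is a stand-in for whatever grabs the frame on the client.
        record.screenshots.append(capture(job.game, second))
    return record


def run_rig(clients, jobs, capture):
    """Server side: send every job to every client and collect the records.

    Assumption: each game runs on each client; a real rig could split the
    list differently.
    """
    return [run_job(client, job, capture) for client in clients for job in jobs]
```

Calling `run_rig(["pc-1", "mac-1"], [Job("demo", 3)], capture)` would hand back two records, one per client, each holding three screenshots.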
Currently we have nine clients (seven Windows and two Mac) in the rig. Seven of these are physically located in our UK “test room”; the other two machines are under developers’ desks. We are slowly adding more machines. Whenever a webplayer is installed, we get a snapshot of that machine’s configuration. Using the resulting database, we can make sure we have the most popular machine specs represented in the rig.
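Picking machines from that database boils down to counting configurations. The sketch below shows the idea under some loud assumptions: the snapshot fields `os` and `gpu` are hypothetical (the real snapshot format is not described here), and the function names are invented for this example.

```python
from collections import Counter


def most_common_specs(snapshots, top_n):
    """Return the top_n most frequent (os, gpu) pairs among the snapshots.

    The "os" and "gpu" keys are hypothetical; a real snapshot would likely
    carry more detail.
    """
    counts = Counter((s["os"], s["gpu"]) for s in snapshots)
    return [spec for spec, _ in counts.most_common(top_n)]


def missing_from_rig(popular_specs, rig_specs):
    """List the popular specs that no machine in the rig covers yet."""
    covered = set(rig_specs)
    return [spec for spec in popular_specs if spec not in covered]
```

Anything returned by `missing_from_rig` is a candidate for the next machine we add.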
Obviously, the games require inputs such as keyboard and mouse entry, random numbers, and assets fetched from remote websites. These inputs are all faked when the regression rig runs. When we first get our hands on a customer game, we give it to one of our testers, who plays the game for us on the rig using the latest released webplayer. In this manual first-run mode we capture all the inputs and store them in an input file. Once we have this, we generate a “golden” set of screenshots, so we know which images the game should produce for that set of inputs. The regression rig can then compare images obtained from a customer game played back using the daily build webplayer against the golden set, and we can tell quickly when one of our talented developers has broken something. When we find we have broken the webplayer, we can iterate back through code checkins looking for the commit that first introduced the problem.
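The comparison and the hunt for the offending checkin can be sketched as follows. This is an illustration, not the rig's real logic. It assumes screenshots are compared for exact equality (a real comparison may need some tolerance), and it swaps in a binary search where the text above only says we iterate back through checkins. The search further assumes the oldest commit in the list is good, the newest is bad, and the breakage persists once introduced. Both function names are hypothetical.

```python
def first_mismatch(golden, candidate):
    """Return the index of the first frame that differs, or None if all match.

    Frames are compared with ==, which is an assumption made for this sketch.
    A run that is shorter or longer than the golden set counts as a mismatch
    at the first missing or extra frame.
    """
    for index, (expected, actual) in enumerate(zip(golden, candidate)):
        if expected != actual:
            return index
    if len(golden) != len(candidate):
        return min(len(golden), len(candidate))
    return None


def first_bad_commit(commits, is_good):
    """Binary search for the first commit whose build fails the comparison.

    Assumes commits are ordered oldest to newest, commits[0] is good,
    commits[-1] is bad, and every commit after the first bad one is bad too.
    is_good(commit) would build that commit and replay the recorded inputs.
    """
    lo, hi = 0, len(commits) - 1   # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]
```

Here `is_good` would wrap `first_mismatch`: build the webplayer at that commit, replay the input file, and return `True` when the result is `None`.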
The regression rig should help us find problems with the webplayer before you do, but as always, please let us know (by sending in a detailed bug report) if you have issues when you publish your game to the web. Oh, and for the record, we do have people employed to do testing; it is not all done with the rig.