How Metric Validation can help you fine-tune your game

Willis Kennedy

November 13, 2020 in Technology | 7 min. read

Over the past year, Unity Game Simulation has enabled developers to balance their games during development by running multiple playthroughs in parallel in the cloud. Today we are excited to share the next step in automating aspects of the balancing workflow by releasing Metric Validation, a precursor to our upcoming Optimization feature. In this blog post, we will review the balancing framework within Unity Game Simulation, introduce the new Metric Validation feature, and share a case study with our partner Furyion where our upcoming Optimization feature enabled them to balance their game 10x faster.

Try Game Simulation for free today!

When launching a game, one of the best ways to delight your players is to include them in the design process, through soft launches, mass playtesting, and early betas.

These techniques work, but as games become more complex over time, asking players to iteratively validate all the different paths and strategies within a game can become tedious, costly, or just impossible. The Unity Game Simulation team believes that a solid framework for measuring your game’s balance with automated playthroughs can drastically reduce the player data required from playtesting.

A framework for game balance

Three key elements

The framework we’ve been developing focuses on three key elements of a game – metrics, test cases, and parameters. Metrics are the results of a playthrough that are important to the balance or health of the game. Test cases are the different scenarios that you’d like to measure within a game. Finally, parameters are the configurations that can change your game, thus directly impacting your metrics.

Let’s consider balancing a simple racing game with many different vehicles. Our goal is to provide fun and distinct experiences without letting a player’s vehicle choice confer too much of an advantage. We can measure this advantage by having similarly skilled bots or players race each other: the difference in completion time becomes the metric for a vehicle’s advantage. To evaluate all of our vehicles with this metric, we can set up a series of test cases in which every combination of two vehicles competes against each other. When our test cases don’t meet expectations, we need a way to change the performance of those vehicles. By creating parameters that affect each vehicle’s speed, acceleration, or special power-ups, we can alter the performance of the vehicles until we get a better result. Taken together, these three elements let us evaluate and change our game’s balance to meet design goals.
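To make the racing example concrete, here is a minimal Python sketch of the test cases and the metric. The vehicle names and function names are hypothetical placeholders chosen for illustration; they are not part of Unity Game Simulation.

```python
from itertools import combinations

# Hypothetical vehicle roster; the article does not name specific vehicles.
VEHICLES = ["kart", "buggy", "truck", "bike"]


def build_test_cases(vehicles):
    """Return every unordered pair of distinct vehicles as one test case."""
    return list(combinations(vehicles, 2))


def completion_time_gap(time_a, time_b):
    """Metric: absolute difference in completion time, in seconds."""
    return abs(time_a - time_b)


test_cases = build_test_cases(VEHICLES)
print(len(test_cases))  # 4 vehicles give 6 pairings
```

Each pairing is raced by similarly skilled bots, and the gap between the two completion times is the value that gets evaluated against the design goal.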

Unity Game Simulation supports these kinds of experiments by letting you list all of your test cases and parameters in a single experiment. It then performs a grid search: it generates every parameter combination for every test case and runs an automated playthrough for each one. Finally, the test case and parameters are returned with each metric so every result can be connected with its configuration.
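The grid search described above can be sketched in a few lines of Python. This is an illustration of the idea only, not the Unity Game Simulation API: the parameter names, the values, and the `play` callable (a stand-in for one automated playthrough) are all assumptions.

```python
from itertools import product

# Hypothetical parameter grid and test cases for the racing example.
PARAMETER_GRID = {
    "top_speed": [40, 50, 60],
    "acceleration": [2.0, 3.0],
}
TEST_CASES = [("kart", "buggy"), ("kart", "truck"), ("buggy", "truck")]


def expand_grid(grid):
    """Return every combination of parameter values as a dict."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


def plan_runs(test_cases, grid):
    """Pair every test case with every parameter combination."""
    return [
        {"test_case": case, "parameters": params}
        for case in test_cases
        for params in expand_grid(grid)
    ]


def run_experiment(test_cases, grid, play):
    """Run each planned playthrough and keep its configuration with the metric."""
    results = []
    for run in plan_runs(test_cases, grid):
        metric = play(run["test_case"], run["parameters"])
        results.append({**run, "metric": metric})
    return results
```

With 3 test cases and 3 x 2 parameter combinations, `plan_runs` yields 18 playthroughs. The number of runs grows multiplicatively with each added parameter, which is why exhaustive grid search gets expensive quickly.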

Taking action with your metrics

The design goal above can be reduced to a simple expression: the difference in completion time between any pair of vehicles should be between 0 and 20 seconds. Our new feature, Metric Validation, enables you to define these expressions and automatically evaluate them. The metric generated from an automated playthrough is scored with a boolean system evaluating to 1 or 0 depending on whether a metric is within its bounds. More generally, Metric Validation lets a developer put a lower and upper bound on each of their metrics. Finding parameter combinations that satisfy these metric bounds becomes easier with this simple and consistent score. The best parameter combinations result from every test case having a score of 1. These best parameters can then be passed on to playtesters for deeper analysis, or A/B tested with real players. Additionally, setting up a recurrent balance test to verify changes in your game becomes a simple exercise of rerunning chosen parameter combinations and verifying that they still receive scores of 1.
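As a rough sketch of this scoring scheme, the Python below scores a metric as 1 or 0 against its bounds and keeps only the parameter sets that pass every test case. The record layout (`params_id`, `test_case`, `metric`), the sample values, and the choice of inclusive bounds are assumptions made for illustration; the article does not specify them.

```python
from collections import defaultdict


def score(metric, lower, upper):
    """Return 1 if the metric lies within [lower, upper], else 0.

    Inclusive bounds are an assumption for this sketch.
    """
    return 1 if lower <= metric <= upper else 0


def passing_parameter_sets(results, lower, upper):
    """Return the IDs of parameter sets whose every test case scored 1."""
    scores = defaultdict(list)
    for record in results:
        scores[record["params_id"]].append(score(record["metric"], lower, upper))
    return [params_id for params_id, values in scores.items() if all(values)]


# Hypothetical results: completion-time gaps in seconds for two parameter sets.
RESULTS = [
    {"params_id": "A", "test_case": ("kart", "buggy"), "metric": 5.0},
    {"params_id": "A", "test_case": ("kart", "truck"), "metric": 12.0},
    {"params_id": "B", "test_case": ("kart", "buggy"), "metric": 8.0},
    {"params_id": "B", "test_case": ("kart", "truck"), "metric": 27.0},
]

print(passing_parameter_sets(RESULTS, 0, 20))  # only "A" passes every test case
```

A recurrent balance test then amounts to rerunning the chosen parameter sets after a change and checking that they still appear in this list.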

These scores enable a more efficient system than Grid Search and will be part of our upcoming Optimization feature. If we process the scores in real time, we can learn the relationship between the parameters and metrics. This relationship lets our Optimization system switch from random exploration to targeted exploration, focusing on the parameters that are most likely to make all the test cases pass. Optimization can greatly improve the efficiency of initial or recurrent balance tests. Additionally, this system offers the opportunity to connect your game more directly to a balancing framework, so you can verify balance every step of the way from early development to release.
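The article does not describe how the Optimization system works internally, so the Python below is not that system. It is a much simpler stand-in, an explore-then-exploit local search, meant only to illustrate what moving from random to targeted exploration can look like. The function name, the `evaluate` callable (assumed to return the fraction of test cases that pass), and the default settings are all hypothetical.

```python
import random


def targeted_search(grid, evaluate, budget, explore_fraction=0.3, seed=0):
    """Spend part of the budget on random picks, then probe near the best pick.

    grid: dict mapping each parameter name to its list of allowed values.
    evaluate: callable taking a parameter dict and returning a pass rate (0.0 to 1.0).
    Returns a dict mapping each tried combination (a tuple of values) to its score.
    """
    rng = random.Random(seed)
    names = list(grid)
    seen = {}

    def random_candidate():
        return tuple(rng.choice(grid[name]) for name in names)

    def neighbour(candidate):
        # Re-draw one parameter; the new draw may land on the same value.
        index = rng.randrange(len(names))
        values = list(candidate)
        values[index] = rng.choice(grid[names[index]])
        return tuple(values)

    for step in range(budget):
        if step < budget * explore_fraction or not seen:
            candidate = random_candidate()  # random exploration
        else:
            best = max(seen, key=seen.get)  # best-scoring combination so far
            candidate = neighbour(best)  # targeted exploration around it
        if candidate not in seen:  # never pay for the same playthrough twice
            seen[candidate] = evaluate(dict(zip(names, candidate)))
    return seen
```

Because repeated candidates are skipped, the search never evaluates more than `budget` combinations. A production system would more plausibly fit a model of how parameters affect the metrics, which this sketch does not attempt.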

Case study: Balancing Furyion’s Death Carnival

We’ve worked through several case studies with studios like iLLOGIKA and Furyion to better our understanding of game balance. When developing this framework, we returned to Furyion’s Death Carnival to demonstrate how the three key elements of our balance framework can work together to improve a game’s balance.

Death Carnival’s metrics were the number of deaths for an NPC on a sample level and how long the sample level took to complete. Furyion provided us with bounds of 16 to 22 for the number of deaths and 500 to 750 seconds for level completion time. These bounds eliminated negative gameplay experiences but were wide enough for players to feel like their choices impacted the gameplay. The parameters included weapon damage, special weapon effects, and the strength of bonuses from different weapon modifiers. These parameters affected the strength of a weapon and therefore reduced level times and deaths. Finally, test cases included weapon modifiers like spread, piercing, and fiery. Furyion wanted any choice of weapon modifier to fall within the provided metric bounds. The resulting experiment scored more than 110,000 playthroughs (22,000 combinations with five trials each) and found just over 200 combinations of parameters that resulted in valid metrics across all test cases.

If using human playtesters, Furyion’s case would have required at least 3,650 hours of gameplay, not including rerunning it for new content or major changes. For an indie studio, that’s an infeasible proposition. Unity Game Simulation can measure all this gameplay data by deploying many simultaneous builds to our cloud simulation platform, supporting millions of different parameter combinations. For this Furyion experiment, we ran the 3,650 hours of gameplay in about 4 hours, which let us start analyzing the data that same day. Metric Validation provides a more cost-effective solution to measuring the balance and playability of a game across all of the distinct choices that a player could make.
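For a sense of scale, the figures quoted in this case study can be worked through directly. The numbers below are the rounded values from the text, so the results are approximate; the variable names are ours.

```python
# Rounded figures quoted in the case study.
combinations = 22_000
trials_per_combination = 5
playthroughs = combinations * trials_per_combination  # 110,000 scored playthroughs

human_hours = 3_650  # estimated playtesting time with human players
cloud_hours = 4  # "about 4 hours" on the cloud simulation platform
speedup = human_hours / cloud_hours  # roughly 900x less wall-clock time

valid_combinations = 200  # "just over 200", so this is a lower bound
valid_share = valid_combinations / combinations  # a little under 1%

print(f"{playthroughs:,} playthroughs")
print(f"about {speedup:.0f}x faster than human playtesting")
print(f"about {valid_share:.1%} of combinations were valid")
```

Fewer than one in a hundred combinations satisfied every bound, which is why an exhaustive sweep spends most of its budget on configurations that fail.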

We wanted to go one step further and begin measuring the final piece of our framework. This meant evaluating our Optimization system on Death Carnival. We gathered scores in real time and used those results to inform which parameter combinations were worth testing. We identified five valid parameter combinations out of the 22,000 possible, and we found them in just 2,000 playthroughs of the game. This cut simulation costs by 10x compared to Grid Search and 2–3x compared to a Random Search technique. Improvements to the system’s efficiency can empower developers to explore the many paths that could exist within their games and allow game balancing to occur with every build.

The Optimization system is still in the development phase, and we are opening it up to a small number of studios to try. If you are interested, please reach out to us at gamesimulation@unity3d.com.

Metric Validation for better balancing

Metric Validation provides a way to measure your game balance using automated playthroughs. It can greatly reduce the amount of energy and time required to playtest a game. These savings will enable small studios to playtest their games with just a few developers, larger studios to go through fewer rounds of soft launch, and playtesters to spend more time on the human elements of a game, like how fun it is to play. We look forward to iterating on Metric Validation with game studios to enable more balance testing.

Try Metric Validation today.