As developers, we’re always aware of performance, in terms of both CPU and GPU. Maintaining good performance gets more challenging as scenes get larger and more complex, especially as we add more and more characters. My colleague in Shanghai and I come across this problem often when helping customers, so we decided to dedicate a few weeks to a project aimed at improving performance when instancing characters. We call the resulting technique Animation Instancing.

We often implement outdoor scenery, such as grass and trees, with GPU Instancing. But for SkinnedMeshRenderers (characters, for example) we can’t use instancing, because the skinning is calculated on the CPU and each mesh is submitted to the GPU one by one. In general, we can’t draw all characters in one submission. When there are lots of SkinnedMeshRenderers in the scene, this results in lots of draw calls and animation calculations.

We have found a way to reduce CPU cost and supplement GPU Instancing in Unity with Animation Instancing. You can get our code on GitHub. Be aware that this is a custom experimental solution; until recently, we had only shared it with a few of our enterprise support customers. Now we’re ready for more feedback. Please let us know what you think directly in the project comments!

Goals

Our initial goals for this experimental project were:

  • Instancing SkinnedMeshRenderer
  • Implement as many animation features as possible
  • LOD
  • Support mobile platform
  • Culling

Not all of our goals were reached due to time constraints. The supported animation features are Root Motion, Attachment, and Animation Events; Transitions and Animation Layers are not yet supported. Also, bear in mind that on mobile this only works on platforms with OpenGL ES 3.0 or newer.

However, we felt that the experiment was successful in proving that this approach can have interesting results. Let’s dig into some of the details.

Animation Generation

Before using instancing for characters, we need to generate the animations. We bake the animations of a character into textures, which we call Animation Textures. These textures are used for skinning on the GPU.

The generator collects animations from the Animator component attached to the GameObject in question. It collects the animation events as well, which makes it convenient to move from the Mecanim system to Animation Instancing. If you want to attach something to a character, you need to specify the bones it can attach to in the Attachment settings.

When we finish generating the Animation Texture, the Animation Instancing script will load the animation information at runtime. Note that this animation information is not the animation clip files.
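To make the idea of an Animation Texture a little more concrete, here is a minimal sketch in Python of how bone matrices could be flattened into RGBA texels and read back by frame and bone index. The layout (one texel per matrix row, bones stored consecutively within each keyframe) and the names `pack_animation`, `fetch_bone_matrix`, and `translation` are assumptions made for illustration; the project's actual texture layout may differ.

```python
from typing import List, Tuple

Matrix = List[List[float]]                  # 4x4 matrix, row-major
Texel = Tuple[float, float, float, float]   # one RGBA pixel

TEXELS_PER_BONE = 4  # assumed layout: one texel per matrix row


def pack_animation(frames: List[List[Matrix]]) -> List[Texel]:
    """Flatten frames[frame][bone] matrices into a flat list of RGBA texels."""
    texels: List[Texel] = []
    for bones in frames:
        for matrix in bones:
            for row in matrix:
                texels.append((row[0], row[1], row[2], row[3]))
    return texels


def fetch_bone_matrix(texels: List[Texel], bone_count: int,
                      frame: int, bone: int) -> Matrix:
    """Read one bone matrix back, the way a vertex shader would sample it."""
    start = (frame * bone_count + bone) * TEXELS_PER_BONE
    return [list(texels[start + r]) for r in range(TEXELS_PER_BONE)]


def translation(x: float, y: float, z: float) -> Matrix:
    """Helper for the example: a pure translation matrix."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


# Two keyframes of a hypothetical two-bone skeleton.
frames = [
    [translation(0.0, 0.0, 0.0), translation(0.0, 1.0, 0.0)],
    [translation(0.1, 0.0, 0.0), translation(0.1, 1.0, 0.0)],
]
texels = pack_animation(frames)
second_bone_at_frame_1 = fetch_bone_matrix(texels, bone_count=2, frame=1, bone=1)
```

Because every instance reads from the same texel list, each character only needs its own frame index to look up its pose, which is what makes drawing many characters in one batch possible.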

Instancing

It’s simple to apply Animation Instancing. Add the Animation Instancing script to the generated GameObject. The Bone Per Vertex parameter controls the number of bones that are calculated per vertex. The important thing to be aware of here is that using fewer bones improves performance but decreases accuracy.
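The trade-off behind Bone Per Vertex can be illustrated with generic linear blend skinning. The sketch below, in Python, keeps only the strongest influences for a vertex, renormalizes their weights, and compares the result with the full blend. The function names and the renormalization step are assumptions for illustration, not the project's implementation.

```python
from typing import List, Sequence, Tuple

Influence = Tuple[int, float]  # (bone index, weight)


def limit_bone_weights(influences: Sequence[Influence],
                       bones_per_vertex: int) -> List[Influence]:
    """Keep the strongest influences and renormalize so the weights sum to 1."""
    strongest = sorted(influences, key=lambda bw: bw[1], reverse=True)
    strongest = strongest[:bones_per_vertex]
    total = sum(weight for _, weight in strongest)
    if total == 0.0:
        return list(strongest)
    return [(bone, weight / total) for bone, weight in strongest]


def skin_vertex(position, influences, bone_matrices):
    """Linear blend skinning: weighted sum of the bone-transformed position."""
    x, y, z = position
    out = [0.0, 0.0, 0.0]
    for bone, weight in influences:
        m = bone_matrices[bone]  # 4x4, row-major
        for r in range(3):
            out[r] += weight * (m[r][0] * x + m[r][1] * y + m[r][2] * z + m[r][3])
    return tuple(out)


def translate_x(dx: float):
    """Helper for the example: translation along the x axis."""
    return [[1.0, 0.0, 0.0, dx],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


bone_matrices = [translate_x(0.0), translate_x(1.0), translate_x(2.0)]
influences = [(0, 0.5), (1, 0.3), (2, 0.2)]

full = skin_vertex((0.0, 0.0, 0.0), influences, bone_matrices)
reduced = skin_vertex((0.0, 0.0, 0.0),
                      limit_bone_weights(influences, 2), bone_matrices)
```

With three bones the vertex moves about 0.7 units along x; with two bones it moves about 0.375. The reduced version needs one fewer matrix lookup per vertex, at the price of a visibly different position.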

Next, we need to modify the shader in order to support instancing. Basically, what you need is to add these lines to your shaders. This doesn’t affect your shading; it only adds skinning to the vertex shader.

Performance Analysis

We used a slightly modified version of a demo scene from the Mecanim Example Scenes and tested its performance on an iPhone 6. Let’s take a closer look at the profiler views for both the original and the instancing example.

CPU

The original project spawns 300 characters, and the frame rate is around 15 FPS. To get to at least 30 FPS, we have to limit the number of characters to about 150. In the Animation Instancing version we can spawn 900 characters while maintaining 30 FPS.

As you can see, calculations on the CPU slow the project down.

With the instancing project, we greatly reduced the animation calculations (skeleton, skinning, etc.) on the CPU. That way, we can spawn five or six times as many characters!

In the test scene, drawing the environment requires around 80 draw calls. The character has three materials, so we need three draw calls to render one character.

Without instancing, spawning 250 characters requires around 1100 draw calls (3 * 250 characters + their shadows).

When using Animation Instancing, after spawning 800 characters, the number of draw calls only increases by about 50. As you can see, the instancing column shows 4800 batched draw calls in 48 batches (3 * 8 for the characters + 3 * 8 for their shadows). That is because we submit 100 characters per batch.
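The batch numbers above follow from simple arithmetic, sketched here in Python. The function names are hypothetical, and the sketch assumes that the shadow pass issues the same number of batches as the main pass, which matches the 3 * 8 + 3 * 8 breakdown reported for the instanced case.

```python
import math


def instanced_batches(characters: int, materials: int,
                      characters_per_batch: int, passes: int = 2) -> int:
    """Batches submitted when characters are grouped per material and pass.

    passes=2 assumes one main pass and one shadow pass.
    """
    groups = math.ceil(characters / characters_per_batch)
    return groups * materials * passes


def batched_draw_calls(characters: int, materials: int, passes: int = 2) -> int:
    """Individual draws that were merged into the batches above."""
    return characters * materials * passes


batches = instanced_batches(characters=800, materials=3, characters_per_batch=100)
merged = batched_draw_calls(characters=800, materials=3)
print(batches, merged)
```

For 800 characters, three materials, and 100 characters per batch, this gives 48 batches covering 4800 individual draws.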

GPU

This technique increases GPU cost a little, because we move skinning to the GPU. If the characters have shadows, we have to skin the characters again in the shadow pass. However, it improves the overall frame rate because it reduces CPU cost, which is usually the biggest issue in crowd simulations in games.

Memory

The additional memory is used to store the Animation Textures, which hold the skinning matrices. We use the RGBAHalf texture format, which takes 8 bytes per pixel (four 16-bit channels). Let’s assume a character has N bones at four pixels per bone (one matrix), and we generate one animation as M keyframes. One animation then takes N * 4 * M pixels, or N * 4 * M * 8 = 32NM bytes. If a character has 50 bones and we generate 30 keyframes, one animation has 50 * 4 * 30 = 6000 pixels, so a 1024 * 1024 texture can store up to 174 animations.
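The pixel budget can be checked with a few lines of Python. The helper names are hypothetical, and the capacity figure assumes animations are packed tightly with no padding between them. The 8 bytes per pixel reflects RGBAHalf's four 16-bit channels.

```python
PIXELS_PER_BONE = 4   # one 4x4 matrix stored as four RGBA pixels
BYTES_PER_PIXEL = 8   # RGBAHalf: four 16-bit channels


def animation_pixels(bones: int, keyframes: int) -> int:
    """Pixels needed to store one baked animation."""
    return bones * PIXELS_PER_BONE * keyframes


def animations_per_texture(width: int, height: int,
                           bones: int, keyframes: int) -> int:
    """How many animations fit, assuming tight packing without padding."""
    return (width * height) // animation_pixels(bones, keyframes)


def texture_bytes(width: int, height: int) -> int:
    """Memory footprint of the whole texture."""
    return width * height * BYTES_PER_PIXEL


pixels = animation_pixels(bones=50, keyframes=30)
capacity = animations_per_texture(1024, 1024, bones=50, keyframes=30)
footprint = texture_bytes(1024, 1024)
print(pixels, capacity, footprint)
```

For 50 bones and 30 keyframes this gives 6000 pixels per animation and room for 174 animations in a 1024 * 1024 texture, which itself occupies 8 MiB.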

Conclusion

We’ve found that Animation Instancing can significantly reduce CPU cost if you have lots of SkinnedMeshRenderers. It’s suitable for crowds of similar enemies, such as zombies.

We hope this experimental project provides some insight that you can apply to your own project’s performance challenges and helps you build more elaborate scenes. Certainly, there are many avenues for future work, such as support for transitions and animation layers.

Please check out the code on GitHub and post any comments or issues you have directly to the project!

21 Comments

  1. Can I attach an AI to them and use nav mesh?

  2. Would be interesting to see performance comparison on PC between built-in SkinnedMeshRenderer, this technique, and Unity Austin demo technique.

  3. Isn’t this just an animated impostors technique?
    What you do is rendering the animation into a sprite sheet, then use the sprite to do fewer draw calls, isn’t it?

    1. No. The characters are not sprites.
      In a few words, we bake the skeleton animations into textures in order to do the skinning on the GPU.
      So we can draw many characters as a batch.

  4. I rolled my own solution like this using the GPU instancing API and a custom palette skinning shader. I guess in future it will be a lot less work for developers.
    In the meantime it was very interesting to learn about, and our game will be released very soon with around 1000 soldiers animated at once. They are also many different characters and soldiers mounted on horse back. I think these things do add to the CPU workload.

    https://youtu.be/-1d0HdstoHw

  5. I do something similar in my asset Mesh Animator, but achieve instancing by display mesh snapshots and can get 5000+ characters at 60fps, GPU skinning was my next step but looks like I don’t have to do it now! https://assetstore.unity.com/packages/tools/animation/mesh-animator-26009

  6. Why doesn’t Unity support vertex animation as usual?

  7. This’ll be great once you get Transitions working. Hope work continues on the project :)

  8. Doesn’t Unity already have a “use GPU skinning” checkbox? why not just use that?

    1. I’d like more explanation on that too.

      1. The “use GPU skinning” checkbox does exactly that: it moves the skinning calculation to the GPU, which saves a little time on the CPU. But the key to Animation Instancing is instancing; GPU skinning is a necessary component of it.
        Note: this GPU skinning is a little different from the “use GPU skinning” checkbox.

  9. On 2017.3.1 I get some really bad framerates on a very capable gaming PC (5 fps for 500 characters). I did enable instancing with the in game button. The main culprits seem to be “PreLateUpdate.DirectorUpdateAnimationEnd/Begin”. Disabling rootMotion seems to improve things a bit, but it’s still far from the kind of performance we should expect from this (1000 characters at 20 fps)

    I think if other Unity projects like Ultimate Epic Battle Simulator and the Nordeus demo could have something like 100k units at 30 fps, we should be able to reach that ballpark with such a system

    1. Did you get any errors in the console? Please open an issue on GitHub, and I will have a look at it.

  10. Mecanim needs to improve.

  11. Is it compatible with 2018? Also, is it useful for or compatible with SpeedTree wind movement?

    1. Yes, it is supported on Unity 5.4 or newer. For SpeedTree, I think GPU instancing is enough.

  12. Awesome! Do I understand correctly that this is the technique that was shown at Unite Austin 2017 in the Spellsouls Universe massive battle demo?

    1. Since the Unite demo was 60000+ units at 60fps, and this here is 900 units at 30fps, I think it’s safe to assume this is not the same technique.

      In the Unite demo they used a different approach that would be completely inaccessible for most of the Unity community. And it’s not an approach that’s easily “systemifiable” for more accessibility, because it has to be implemented very specifically for your specific use cases.

      1. Yes, they are different. But the difference is not the number of units. The analysis in the blog is on an iPhone 6, not a PC. Of course you can run it on a PC to support thousands and thousands of units.

    2. I believe that was the ‘Job System’: https://create.unity3d.com/jobsystem

    3. The technique you mentioned from Unite Austin 2017 is the C# Job System. It’s different from Animation Instancing, but the two don’t conflict. You can use both of them on Unity 2018. The C# Job System is a lower-level technique, so we will implement Animation Instancing based on the Job System on Unity 2018.