On DOTS: Entity Component System
This is one of several posts about our new Data-Oriented Tech Stack (DOTS), sharing some insights into how and why we got to where we are today, and where we’re going next.
In my last post, I talked about HPC# and Burst as low-level foundational technologies for Unity going forward. I like to refer to this level of our stack as the “game engine engine”. Anyone can use this stack to write a game engine. We can. We will. You can too. Don’t like ours? Write your own, or modify ours to your liking.
Unity’s Component System
The next layer we’re building on top is a new component system. Unity has always been centered around the concepts of components. You add a Rigidbody component to a GameObject and it will start falling. You add a Light component to a GameObject and it will start emitting light. Add an AudioEmitter component and the GameObject will start producing sound.
It’s a very natural concept for programmers and non-programmers alike, and easy to build intuitive UIs for. I’m actually quite amazed at how well this concept has aged. So well that we want to keep it.
What hasn’t aged well is how we implemented our component system. It was written with an object-oriented mindset. Components and GameObjects are “heavy c++” objects. Creating/destroying them requires a mutex lock to modify the global list of id->objectpointers. All GameObjects have a name. Each one gets a C# wrapper object that points to the C++ one. That C# object could be anywhere in memory. The C++ object can also be anywhere in memory. Cache misses galore. We try to mitigate the symptoms as best we can, but there’s only so much you can do.
With a data-oriented mindset, we can do much better. We can keep the same nice properties from a user point of view (add a Rigidbody component, and the thing will fall), but also get amazing performance and parallelism with our new component system.
This new component system is our Entity Component System (ECS). Very roughly speaking, what you do with a GameObject today you do with an Entity in the new system. Components are still called components. So what’s different? The data layout.
Let’s look at some common data access patterns
A typical component that you would write in Unity in the traditional way might look like this:
class Orbit : MonoBehaviour
public Transform _objectToOrbitAround;
//please ignore this math is all broken, that's not the point here :)
var currentPos = GetComponent<Transform>().position;
var targetPos = _objectToOrbitAround.position;
GetComponent<RigidBody>().velocity += SomehowSteerTowards(currentPos,targetPos)
This pattern comes back over and over. A component has to find one or more other components on the same GameObject and read/write some values on it.
There are a lot of things wrong with this:
- The Update() method gets called for a single orbit component. The next Update() call might be for a completely different component, likely causing this code to be evicted from the cache the next time it has to run this frame for another Orbit component.
- Update() has to use GetComponent() to go and find its Rigidbody. (It could be cached instead, but then you have to be careful about the Rigidbody component not being destroyed).
- The other components we’re operating on are in completely different places in memory.
The data layout ECS uses recognizes that this is a very common pattern and optimizes memory layout to make operations like this fast.
ECS Data Layout
ECS groups all entities that have the exact same set of components together in memory. It calls such a set an archetype. An example of an archetype is: “Position & Velocity & Rigidbody & Collider”. ECS allocates memory in chunks of 16k. Each chunk will only contain the component data for entities of a single archetype.
Instead of having the user Update method searching for other components to operate on at runtime, per Orbit instance, in ECS land you have to statically declare “I want to run some operations on all entities that have both a Velocity and a Rigidbody and an Orbit component. To find all those entities, we simply find all archetypes that match a specific “component search query”. Each archetype has a list of Chunks where entities of that archetype are stored. We loop over all those chunks, and inside each of the chunks, we’re doing a linear loop of tightly packed memory, to read and write the component data. This linear loop that runs the same code on each entity also makes for a likely vectorization opportunity for Burst.
In many cases, this process can be trivially split up into several jobs, making the code operating the ECS component run on nearly 100% core utilization.
ECS does all this work for you, you just need to supply the code that you want to run on each entity. (You can do the chunk iteration manually if you want to though.)
When you add/remove a component from an Entity, it switches archetype. We move it from its current chunk to a chunk of the new archetype, and back swap the last entity of the previous chunk to “fill the hole”.
In ECS, you also statically declare what you intend to do with the component data. ReadOnly or ReadWrite. By promising (the promise is verified) to only read from the Position component, ECS can get more efficient scheduling of its jobs. Other jobs that also want to read from the Position component won’t have to wait.
This data layout also allows us to deal with a long-standing frustration we’ve had, which are load times and serialization performance. Loading/streaming ECS data for a big scene isn’t much more than just loading raw bytes from disk and using them as is.
This is the reason the Megacity demo loads in a few seconds on a phone.
While entities can do what game objects do today, they can do more because they are so lightweight. In fact, what really is an Entity? In an earlier draft of this post I wrote “we store entities in chunks”, and later changed it to “we store component data for entities in chunks”. It’s an important distinction to make, to realize that an Entity is just a 32-bit integer. There is nothing to store or allocate for it, other than the data of its components. Because they’re so cheap, you can use them for scenarios that game objects weren’t suitable for. Like using an entity for each individual particle in a particle system.
HPC#, Burst, ECS. Awesome, but where’s my game engine?
The next layer we need to build is very big. It’s the “game engine” layer composed of features like “renderer”, “physics”, “networking”, “input”, “animation”, etc. This is roughly where we are today. We have started to work on these pieces, but they won’t be ready overnight.
That might sound like a bummer. In a way it is, but in another way, it’s not. Because ECS and everything built on top of it are written in C#, it can run inside of traditional Unity. Because it runs inside of Unity, you can write ECS components that use pre-ECS functionality. There is no pure ECS mesh drawing system right now. However, you can write an ECS MeshRenderSystem that uses pre-ECS Graphics.DrawMeshIndirect API as an implementation, while you wait for a pure ECS version to ship. This is exactly the technique that our Megacity demo uses. Loading/Streaming/Culling/LODding/Animation is done with pure ECS systems, but the final drawing is not.
So you can mix & match. What’s great about that is you can already reap the benefits of Burst codegen, and ECS performance for your game code, instead of having to wait for us to ship pure ECS versions of all subsystems. What’s not great about it is that in this transition phase, you can see and feel this friction that you’re “using two different worlds that are glued together”.
We will ship all the source code to our ECS HPC# subsystems in packages. You can inspect, debug, modify each subsystem, as well as have more fine-grained control over when you want to upgrade which subsystem. You could, for example, upgrade the Physics subsystem package without upgrading anything else.
What will happen to Game Objects?
Game Objects aren’t going anywhere. People have successfully shipped amazing games on it for over a decade. That foundation isn’t going anywhere.
What will change is that you will over time see our energy to make improvements tilt from going exclusively into the game object world, towards the ECS world.
API Usability / Boilerplate
A common, very valid, point people bring up when looking at ECS, is that there’s a lot of typing. A lot of boilerplate code that stands in between you and what you’re trying to achieve.
There are a lot of improvements on the horizon that aim to remove the need for most boilerplate and make it simpler to express your intent. We haven’t implemented many of them yet as we’ve been focussing on the foundational performance, but we believe there is no good reason for ECS game code to have much boilerplate code, or be particularly more work to write than writing a MonoBehaviour.
Project Tiny has already implemented some of these improvements (like a lambda based iteration API). Speaking of which..
How does Project Tiny’s ECS fit into all this?
Project Tiny will ship on top of the same C# ECS as this blog post has been talking about. Project Tiny will be a big ECS milestone for us in several ways:
- It will be able to run in a complete ECS-only environment. A new player with no baggage from the past.
- That means it’s also pure-ECS and has to ship with all ECS subsystems a real (tiny) game needs.
- We’ll adopt Project Tiny’s Editor support for Entity editing for all ECS scenarios, not just tiny.
We have job openings for all the different parts of the DOTS stack, particularly in Burbank and Copenhagen, check out careers.unity.com.
Also, make sure to join us on Unity Entity Component System and C# Job System forum to give feedback and get information on experimental and preview features.