Metal, a new graphics API for iOS 8

July 3, 2014 in Technology by

Exciting times for graphics on iOS 8!

At its recent Worldwide Developers Conference, Apple introduced Metal, a new graphics API that’s low-overhead, highly efficient, and designed specifically for the A7 chip. It gives game makers a way to take full advantage of iOS hardware and achieve far greater realism, detail, and interactivity in their games than before.

We’ll be adding support for Metal soon, but in the meantime we wanted to walk you through the new technology and explain why it is such a big deal.

Metal at a glance

Metal has several key ideas in it that enable lower overhead, more predictable performance and better programmability:

  • Create and validate as much state up-front as possible. Shaders can be compiled and partially optimized offline. Everything related to rendering pipeline state (shaders, vertex layout, blending modes, render target formats and so on) can be created and validated before rendering even starts. This means no more state checks on every draw call, which frees up a lot of CPU processing power.
  • Enable much more versatile multi-threading. Resources can be created from any thread and there are several ways to prepare draw call submission from multiple threads in parallel.
  • All iOS devices have shared memory for CPU & GPU. There’s no need to pretend that data from the CPU has to be “copied” into some video memory anymore. When you create a buffer, you just get a pointer to it, and that’s the same memory that the GPU sees.
  • Let the user (engine) handle synchronization. OpenGL ES has to jump through lots of hoops and do lots of guesswork in order to behave correctly in every imaginable scenario. In Metal, synchronization of data between CPU & GPU is the user’s responsibility. The engine has much better knowledge of what it is trying to do, after all!
  • All GPUs in iOS devices use a Tile-Based Deferred Rendering architecture. This is explicitly reflected in the Metal API, particularly when it comes to render targets. The API does not try to guess anything anymore – all framebuffer-related actions, such as tile loads & stores and anti-aliasing resolves, are done explicitly.
  • All the points above translate to much lower CPU overhead and much more predictable performance.
  • A new C/C++11-based language is introduced for both graphics & compute shaders. This also means that iOS can do compute shaders, atomics, arbitrary buffer writes and similar fancy sounding tricks on the GPU now.
  • No legacy baggage: the API is very simple & streamlined. Oh, and it also has a super-helpful optional “debug layer” that does extra validation and notifies you of any errors or mistakes you make.
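
To make the first point concrete, here is a toy model of up-front state validation. It is written in Python for brevity and is not the Metal API: `PipelineDesc`, `Pipeline`, `Renderer` and the validation rules are all hypothetical names made up for illustration. The only thing it tries to show is that the checks are paid once, when the pipeline object is built, rather than on every draw call.

```python
from dataclasses import dataclass

# Made-up rule set, purely for illustration.
VALID_BLEND_MODES = {"opaque", "alpha", "additive"}


@dataclass(frozen=True)
class PipelineDesc:
    """Everything that affects rendering state, described up-front."""
    shader: str
    vertex_layout: tuple
    blend_mode: str
    target_format: str


class Pipeline:
    """An already-validated state object (hypothetical, not a Metal type)."""

    def __init__(self, desc):
        self.desc = desc


class Renderer:
    def __init__(self):
        self.validations = 0
        self.draws = 0

    def build_pipeline(self, desc):
        # All checks happen here, before rendering starts.
        self.validations += 1
        if not desc.shader:
            raise ValueError("pipeline needs a shader")
        if desc.blend_mode not in VALID_BLEND_MODES:
            raise ValueError(f"unknown blend mode: {desc.blend_mode}")
        return Pipeline(desc)

    def draw(self, pipeline, mesh):
        # No state checks: the pipeline was validated when it was built.
        self.draws += 1


renderer = Renderer()
pipeline = renderer.build_pipeline(
    PipelineDesc("lit_shader", ("position", "normal", "uv"), "opaque", "bgra8")
)
for mesh in range(500):
    renderer.draw(pipeline, mesh)

print(renderer.validations, "validation for", renderer.draws, "draw calls")
```

In an API that validates lazily, the body of `draw` would have to repeat some or all of those checks for each of the 500 objects.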

Now let’s go into more detail!

The Draw Call Problem

If you’re making games, particularly mobile games, you’re probably aware of The Draw Call Problem. Each and every object rendered in the game has some CPU cost, and realistically on mobile right now you cannot afford more than a few hundred rendered objects. In a real game, you also very much want to use the CPU for other things – gameplay logic, physics, AI, animations and so on. Unity has some measures to minimize the number of draw calls – static & dynamic batching, occlusion culling, LOD and distance-based layer culling; you can also merge nearby objects together and put textures into atlases to reduce the number of materials.

A good question is – why does there have to be a CPU cost to render something? After all, it’s the GPU that is doing the actual work.

Some of the overhead is on “the engine” side – the CPU has to iterate over visible objects, figure out which shader passes need to be rendered, which lights affect which objects, which material parameters to apply and so on. Some of that is cached; some of that is multi-threaded; and generally this is platform-independent code. In each Unity release, we try to optimize this part, and Metal generally does not affect it.

However, another part of the CPU overhead is in the “graphics API & driver” part. Depending on the game, this part can be significant. Metal is an attempt to make this part virtually go away, by being a much better match for modern hardware, somewhat lower level, and doing massively less guesswork than OpenGL ES used to do. Up-front rendering state creation & validation, explicit actions related to render target loads & stores, and no synchronization dances done on the API side – all these things contribute to much lower CPU overhead.

Based on our testing so far, we have seen API+driver overhead shrink to just a few percent of CPU time. That is a tremendous improvement compared to the 15-40% of CPU time that it used to be before! That means the majority of the remaining CPU overhead is in our own code now. I guess we’ll have to continue optimizing that :)
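
To get a feel for what those percentages mean per frame, here is a small back-of-the-envelope calculation. The 60 fps target and the 3% figure standing in for “a few percent” are assumptions, not measurements, and the sketch assumes a CPU-bound game where CPU time per frame equals the whole frame budget.

```python
def api_overhead_ms(fps, overhead_fraction):
    """CPU milliseconds per frame spent in the graphics API and driver."""
    frame_budget_ms = 1000.0 / fps
    return frame_budget_ms * overhead_fraction


FPS = 60  # assumed target frame rate

cases = [
    ("before, low end (15%)", 0.15),
    ("before, high end (40%)", 0.40),
    ("after (assumed 3%)", 0.03),
]
for label, fraction in cases:
    print(f"{label}: {api_overhead_ms(FPS, fraction):.2f} ms per frame")
```

Under these assumptions the old overhead works out to between 2.50 and 6.67 ms of a 16.67 ms frame, and the new one to about 0.50 ms. The difference is CPU time that becomes available for gameplay, physics or simply more objects.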

We’re also looking forward to using Metal’s ability to do rendering submissions from multiple threads; this opens up very interesting optimization opportunities as well.
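
The general shape of multi-threaded submission can be sketched as follows. This is a structural illustration only, written in Python with hypothetical names (`record_commands`, `build_frame`); it does not use Metal and says nothing about how Unity will actually do it. Each worker records commands for its own slice of the visible objects, and the results are then submitted in a fixed order.

```python
from concurrent.futures import ThreadPoolExecutor


def record_commands(objects):
    """Record draw commands for one slice of the scene (runs on a worker)."""
    return [("draw", obj) for obj in objects]


def build_frame(visible_objects, workers=4):
    # Split the visible objects into contiguous slices, one per worker.
    size = max(1, -(-len(visible_objects) // workers))  # ceiling division
    slices = [
        visible_objects[i:i + size]
        for i in range(0, len(visible_objects), size)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in slice order, so the final command
        # stream is deterministic even though recording ran in parallel.
        command_lists = list(pool.map(record_commands, slices))
    # Single "submit" step: concatenate the recorded lists in order.
    return [cmd for commands in command_lists for cmd in commands]


frame = build_frame(list(range(10)), workers=4)
print(len(frame), "commands recorded")
```

The design point is that recording needs no shared mutable state between workers; only the final, cheap submission step is ordered.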

The Compute Opportunity

With Metal, the GPU can be used for computation outside the typical vertex and fragment shader stages, through what are known as “compute shaders”. Basically, this is the ability to run any kind of “parallel computation” on the many little processors inside a GPU. Compute shaders also have a concept of “local storage” – a very fast piece of dedicated on-GPU memory that can be used to share data between these parallel work items. This particular piece of memory enables using the GPU for things that aren’t easily expressible in ye olde vertex and fragment shaders.
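
As a rough illustration of why local storage matters, here is a CPU-side simulation of a classic compute pattern: a per-group tree reduction. It is plain Python, not Metal shading language; `group_sum` and the group size are made up for the example, and the loops that run one after another here would run in parallel on a GPU, with a barrier between strides.

```python
def group_sum(values, group_size=4):
    """Sum values the way a compute shader might: one partial sum per group."""
    if group_size < 1 or group_size & (group_size - 1):
        raise ValueError("group_size must be a power of two")
    partial_sums = []
    for start in range(0, len(values), group_size):
        # "Local storage": a small scratch array shared by the work
        # items of one group. The last group is padded with zeros.
        local = list(values[start:start + group_size])
        local += [0] * (group_size - len(local))
        stride = group_size // 2
        while stride > 0:
            # On a GPU each 'item' is a separate work item, and a
            # barrier separates one stride from the next.
            for item in range(stride):
                local[item] += local[item + stride]
            stride //= 2
        partial_sums.append(local[0])
    # Combining the per-group results would be a second, smaller pass.
    return sum(partial_sums)


print(group_sum(list(range(10))))
```

Work items in a group cooperate through the shared scratch array, which is exactly the kind of data sharing that vertex and fragment shaders cannot express.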

There are tons of interesting areas to use compute shaders for — optimized post-processing effects, particle systems, shadow and light culling and so on.

While we aren’t using compute shaders much in Unity just yet, we’re looking forward to using them for more and more stuff. Exciting times ahead!

FAQ

When can I get this?
We can’t wait to ship this, but would like to avoid promising any actual dates. We have done a lot already, but some work remains before it is “shippable”. Our current plan is to first integrate all of the bits of Metal that provide the huge boosts to CPU-side performance, hopefully in Unity 5.0. Later on, we’ll add compute shader support, which is somewhat more involved on our side.

What would be the platform requirements?
Metal requires iOS 8 and an A7-based device (iPhone 5S, iPad Air, Retina iPad Mini).
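
Expressed as a check, the requirement might look like the sketch below. `choose_graphics_api` is a hypothetical helper, not a Unity or iOS function. The fallback to OpenGL ES and the “Metal only” build option follow the reply in the comments; treating chips newer than the A7 as supported is an assumption.

```python
def choose_graphics_api(ios_major, chip_generation, metal_only_build=False):
    """Pick a rendering backend from the OS version and chip generation.

    chip_generation is the number in the chip name (7 for the A7).
    """
    # Assumption: anything from the A7 onwards qualifies.
    supports_metal = ios_major >= 8 and chip_generation >= 7
    if supports_metal:
        return "metal"
    if metal_only_build:
        raise RuntimeError("this build requires iOS 8 and an A7-class chip")
    return "gles"


print(choose_graphics_api(8, 7))  # iOS 8 on an A7 device
print(choose_graphics_api(7, 7))  # A7 device still on iOS 7
print(choose_graphics_api(8, 6))  # iOS 8 on a pre-A7 device
```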

What would I have to do to take advantage of Metal’s lower CPU overhead?
Generally, nothing. Once we add support for Metal in Unity, using it should be very transparent. All your existing projects, all your shaders and graphics effects should just work. Just enjoy your lower CPU usage!

But what about shaders, since Metal has a different shading language?
We’ll take care of that. Right now you generally write shaders in Cg/HLSL, which we convert into GLSL for OpenGL ES behind the scenes. For Metal, we’ll convert them in a very similar way.

What can I do with lower CPU overhead, again?
Have better physics, AI or more complex gameplay logic. Put more objects on the screen. Or just enjoy lower battery usage. It’s all up to you!

Comments (22)

Martin
3 Jul 2014, 5:16 pm

Will any of this be restricted to Unity Pro?

3 Jul 2014, 8:12 pm

It’s good to know that Unity is on top of this major development.

Breakmachine
3 Jul 2014, 10:35 pm

How will Android keep up?

Cian Mc Sweeney
3 Jul 2014, 10:52 pm

@BREAKMACHINE Google has already announced that Android L (the next Android version) will eradicate the same problems that Metal does.

4 Jul 2014, 4:49 am

Funny how this reminds me of something called ‘mantle’.. :-)

Sinister Mephisto
4 Jul 2014, 4:53 am

Metal mantle piece. I just want to know about the new Physics API

Rainsing
4 Jul 2014, 5:26 am

What if I want to support both older iOS devices and A7 based devices alike? Do I have to make separate builds?

4 Jul 2014, 6:52 am

@Rainsing: if you’re running on a pre-A7 device (or pre-iOS8 OS), Unity will fall back to GLES2.0 or 3.0. That is, unless you choose to make a “Metal only” build in player settings. It’s up to you what to do when you detect you’re running on Metal – most likely you can afford more objects on screen, or boost up other CPU related things.

Peter Dwyer
4 Jul 2014, 10:32 am

Sounds interesting. I look forward to using it, though it doesn’t exactly sound close to being available. I guess I’ll put in an early request for beta access.

Peter Dwyer
4 Jul 2014, 10:34 am

Totally Unrelated. Check your spam protection as it’s asked me exactly the same sum for the last four posts I’ve made. I think it’s stopped randomly generating or something….

4 Jul 2014, 2:05 pm

Excellent news. Seems it will be implemented exactly the way I hoped.

gaon
5 Jul 2014, 5:09 am

you didn’t mention simd.h

You can share same datatypes as the Shading Language
but run on the CPU that is all auto vectorized for you.
That means all the vector and matrix processing running on the cpu
is going to be really fast now as well.

PS. Android L is not even same as Metal. It is just ES 3.1 with some
extra extensions. Only reason the demo is impressive is that K1 is using
192 cores. Don’t worry next Apple Chip should have the same number of cores
in their GPU too. So all the Android Bravado will disappear as soon as October.

Lief
5 Jul 2014, 8:19 am

I only wish that I can use it on Unity3d free version.

Pkid
6 Jul 2014, 2:01 am

Can you give us any idea as to how much metal might increase frame rates? It would be nice to hear an example of a scene rendered with and without metal and what the frame rate difference was.

7 Jul 2014, 4:40 am

Wow nice. When is this coming?

7 Jul 2014, 3:41 pm

I was at WWDC and a few of us were wondering how Unity would implement Metal. You guys prove again and again why you are the best, because you think like your customers and try to get the latest developments to us. Thanks.

groan
7 Jul 2014, 10:19 pm

@PKID

Metal is about the CPU sending more draw calls within the frame,
the driver doing less work, and the GPU being less idle.

So for CPU bound game you get 40% increase in ability to do more.
GPU bound game is probably less than 15% increase.

Another caveat is that, because there is no compiling of shaders,
more stuff can be shown.

There is also multithreading on the CPU side of thing
which probably means more design changes of how you prep
your frame to be sent to the GPU.

So things will be smoother, less battery usage,
more memory, more stuff
but not necessarily more frames
as most games are either designed for 30 fps or 60 fps.

The thing about optimization is that programmers
generally only go for the low hanging fruit at first.
so Game Engines might take all the wins but
with more visual details.

GPU hasn’t gotten faster. It is just used more efficiently
and CPU can do more to help GPU by taking away
20 years of CRUFT (OpenGL). that is Metal

True change will come when games are designed with multithreading
in mind. Less prebake scenes.

przemo_li
8 Jul 2014, 10:42 am

@GAON

1) Indirect rendering (this actually makes the GPU faster)
2) Tessellation
3) Geometry
4) Etc.

Yes, AEP is MORE than Metal feature-wise.

Metal is targeting OpenGL ES 3.1 feature set. Nothing more.
AEP is more.

* Though, ES 3.1 AEP or no AEP, AZDO is not possible. So Apple still does it quicker on the CPU (and more reliably – apps know when stalls may be introduced)

8 Jul 2014, 1:30 pm

Wonderful news! Can’t wait for it to come out. It was implemented even better than I hoped, and if it really has lower CPU overhead it will open up new possibilities for mobile game developers.

Quazi
9 Jul 2014, 9:04 am

Thanks for putting this all in one place.

Andy
12 Jul 2014, 2:00 am

OK, finally a valid reason to upgrade my mid-2007 MacMini and my iPod touch 4th Gen.

20 Jul 2014, 1:52 am

How do I play?
