Metal, a new graphics API for iOS 8

July 3, 2014 in Technology

Exciting times for graphics on iOS 8!

At its recent Worldwide Developers Conference, Apple introduced Metal, a new graphics API that’s low-overhead, highly efficient, and designed specifically for the A7 chip. It gives game makers a way to take full advantage of iOS hardware and achieve far greater realism, detail, and interactivity in their games than before.

We’ll be adding support for Metal soon, but in the meantime we wanted to take you through the new technology and explain why it is such a big deal.

Metal at a glance

Metal has several key ideas in it that enable lower overhead, more predictable performance and better programmability:

  • Create and validate as much state up-front as possible. Shaders can be compiled and partially optimized offline. Everything related to rendering pipeline state (shaders, vertex layout, blending modes, render target formats, etc.) can be created and validated before rendering even starts. This means no more state checks on every draw call, which frees up a lot of CPU processing power.
  • Enable much more versatile multi-threading. Resources can be created from any thread and there are several ways to prepare draw call submission from multiple threads in parallel.
  • All iOS devices have shared memory for CPU & GPU. There’s no need to pretend that data from the CPU has to be “copied” into some video memory anymore. When you create a buffer, you just get a pointer to it, and that’s the same memory that the GPU sees.
  • Let the user (engine) handle synchronization. OpenGL ES has to jump through lots of hoops and do lots of guesswork in order to behave correctly in every imaginable scenario. In Metal, synchronization of data between CPU & GPU is the user’s responsibility. The engine has much better knowledge of what it is trying to do, after all!
  • All GPUs in iOS devices use a Tile-Based Deferred Rendering architecture. This is explicitly reflected in the Metal API, particularly when it comes to render targets. The API does not try to guess anything anymore – all framebuffer-related actions, such as tile loads & stores and anti-aliasing resolves, are done explicitly.
  • All the points above translate to much lower CPU overhead and much more predictable performance.
  • A new C/C++11-based language is introduced for both graphics & compute shaders. This also means that iOS can do compute shaders, atomics, arbitrary buffer writes and similar fancy sounding tricks on the GPU now.
  • No legacy baggage: the API is very simple & streamlined. Oh, and it also has a super-helpful optional “debug layer” that does extra validation and notifies you of any errors or mistakes you make.
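To make the “validate up-front” idea from the list above a bit more concrete, here is a minimal C++ sketch. It is not Metal’s API: `PipelineDesc`, `PipelineState`, `makePipeline` and `Encoder` are names made up for illustration. The point is only the shape of the design: all checks happen once when the state object is created, and the per-draw path does nothing but record the call.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical description of the state a draw call depends on.
struct PipelineDesc {
    std::string vertexShader;     // name of a precompiled vertex function
    std::string fragmentShader;   // name of a precompiled fragment function
    int colorAttachmentCount = 1; // number of render targets written
    bool blendingEnabled = false;
};

// Validated state. In this sketch it is just an opaque id.
struct PipelineState {
    std::uint32_t id;
};

// All validation happens here, once, before rendering starts.
std::optional<PipelineState> makePipeline(const PipelineDesc& desc) {
    static std::uint32_t nextId = 1;
    if (desc.vertexShader.empty() || desc.fragmentShader.empty()) {
        return std::nullopt; // missing shader stage
    }
    if (desc.colorAttachmentCount < 1) {
        return std::nullopt; // nothing to render into
    }
    return PipelineState{nextId++};
}

// Per-draw work: no state checks, just record the call.
struct Encoder {
    std::uint32_t boundPipeline = 0;
    int drawCount = 0;

    void setPipeline(const PipelineState& state) { boundPipeline = state.id; }

    void draw(int vertexCount) {
        // Stand-in for an optional debug layer; compiled out with NDEBUG.
        assert(boundPipeline != 0 && vertexCount > 0);
        ++drawCount;
    }
};
```

An engine built this way would call `makePipeline` while loading a level and only `setPipeline` and `draw` inside the frame loop.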

Now let’s go into more detail!

The Draw Call Problem

If you’re making games, particularly mobile games, you’re probably aware of The Draw Call Problem. Each and every object that is rendered in the game has some CPU cost, and realistically on mobile right now you cannot afford more than a few hundred objects being rendered. In a real game, you also very much want to use the CPU for other things – gameplay logic, physics, AI, animations and so on. Unity has some measures to minimize the number of draw calls being made – static & dynamic batching, occlusion culling, LOD and distance-based layer culling; you can also merge nearby objects together and put textures into atlases to reduce the number of materials.
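As a rough illustration of why fewer materials means fewer draw calls, here is a small C++ sketch. It is not Unity’s batching code; `Renderable`, `countBatches` and `sortedByMaterial` are hypothetical names, and the sketch assumes that consecutive objects sharing a material can always be merged into one draw call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical renderable: only the material matters for this sketch.
struct Renderable {
    int materialId;
};

// Counts draw calls if consecutive objects with the same material are
// merged into one batch (assumes they are otherwise batchable).
int countBatches(const std::vector<Renderable>& objects) {
    int batches = 0;
    for (std::size_t i = 0; i < objects.size(); ++i) {
        if (i == 0 || objects[i].materialId != objects[i - 1].materialId) {
            ++batches;
        }
    }
    return batches;
}

// Sorting by material puts batchable objects next to each other.
std::vector<Renderable> sortedByMaterial(std::vector<Renderable> objects) {
    std::stable_sort(objects.begin(), objects.end(),
                     [](const Renderable& a, const Renderable& b) {
                         return a.materialId < b.materialId;
                     });
    return objects;
}
```

Five objects that alternate between two materials cost five draw calls in submission order, but only two once they are sorted by material.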

A good question is – why does there have to be a CPU cost to render something? After all, it’s the GPU that is doing the actual work.

Some of the overhead is on “the engine” side – the CPU has to iterate over visible objects, figure out which shader passes need to be rendered, which lights affect which objects, which material parameters to apply and so on. Some of that is cached; some of that is multi-threaded; and generally this is platform-independent code. In each Unity release, we try to optimize this part, and Metal generally does not affect it.

However, the other part of the CPU overhead is in the “graphics API & driver” part. Depending on the game, this part can be significant. Metal is an attempt to make it virtually go away by being a much better match for modern hardware, somewhat lower level, and doing massively less guesswork than OpenGL ES has to do. Up-front rendering state creation & validation, explicit actions related to render target loads & stores, no synchronization dances done on the API side – all these things contribute to much lower CPU overhead.

Based on our testing so far, we have seen API+driver overhead shrink to just a few percent of CPU time. That is a tremendous improvement compared to the 15-40% of CPU time it used to take! It means the majority of the remaining CPU overhead is now in our own code. I guess we’ll have to continue optimizing that :)
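To put those percentages into milliseconds, here is a back-of-the-envelope C++ sketch. The function names are made up, and two things are assumptions rather than measurements: that the game targets a fixed frame rate, and that “share of CPU time” can be read as a share of the per-frame time budget.

```cpp
#include <cassert>

// Milliseconds available per frame at a given target frame rate.
double frameBudgetMs(double framesPerSecond) {
    assert(framesPerSecond > 0.0);
    return 1000.0 / framesPerSecond;
}

// CPU time per frame spent in API + driver, given its share of the frame.
double overheadMs(double framesPerSecond, double share) {
    assert(share >= 0.0 && share <= 1.0);
    return frameBudgetMs(framesPerSecond) * share;
}

// CPU time per frame handed back to the game when the share drops.
double reclaimedMs(double framesPerSecond, double oldShare, double newShare) {
    return overheadMs(framesPerSecond, oldShare) -
           overheadMs(framesPerSecond, newShare);
}
```

At 60 frames per second the budget is about 16.7 ms, so 15% is 2.5 ms and 40% is roughly 6.7 ms per frame. If “a few percent” is taken to be 3% (an assumed figure), that is 0.5 ms, which leaves somewhere between 2 and 6 ms per frame for gameplay, physics or more objects.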

We’re also looking forward to using Metal’s ability to do rendering submissions from multiple threads; this opens up very interesting optimization opportunities as well.
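The general pattern behind multi-threaded submission can be sketched in plain C++. This is not Metal’s API and not how Unity does it; `DrawCommand`, `recordInParallel` and `submit` are hypothetical names. Each worker thread records draw commands into its own list, so no locking is needed, and the lists are then handed over in a fixed order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical recorded command: "draw object N".
struct DrawCommand {
    int objectId;
};

// Each worker thread records into its own list, so no locking is needed.
using CommandList = std::vector<DrawCommand>;

// Splits [0, objectCount) into contiguous chunks, records each chunk on its
// own thread, and returns the lists in submission order.
std::vector<CommandList> recordInParallel(int objectCount, int threadCount) {
    assert(objectCount >= 0 && threadCount > 0);
    std::vector<CommandList> lists(static_cast<std::size_t>(threadCount));
    std::vector<std::thread> workers;
    const int chunk = (objectCount + threadCount - 1) / threadCount;

    for (int t = 0; t < threadCount; ++t) {
        workers.emplace_back([&lists, t, chunk, objectCount] {
            const int begin = std::min(t * chunk, objectCount);
            const int end = std::min(begin + chunk, objectCount);
            CommandList& mine = lists[static_cast<std::size_t>(t)];
            for (int id = begin; id < end; ++id) {
                mine.push_back(DrawCommand{id});
            }
        });
    }
    for (std::thread& worker : workers) {
        worker.join();
    }
    return lists;
}

// "Submission": concatenates the lists in a fixed order on one thread.
std::vector<DrawCommand> submit(const std::vector<CommandList>& lists) {
    std::vector<DrawCommand> queue;
    for (const CommandList& list : lists) {
        queue.insert(queue.end(), list.begin(), list.end());
    }
    return queue;
}
```

Because every thread writes only to its own list, the expensive recording step runs in parallel while the final draw order stays deterministic.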

The Compute Opportunity

With Metal, the GPU can be used for computation outside of the typical vertex+fragment shader area, in what are known as “compute shaders”. Basically, this is the ability to run any kind of parallel computation on the many little processors inside a GPU. Compute shaders also have a concept of “local storage” – a very fast piece of dedicated on-GPU memory that can be used to share data between these parallel work items. This memory enables using the GPU for things that aren’t easily expressible in ye olde vertex and fragment shaders.
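A classic use of local storage is a parallel sum. The C++ sketch below only models the data flow on the CPU, one step at a time; it is not a Metal kernel, and `kGroupSize`, `reduceGroup` and `sumAll` are names chosen for illustration. Each group of work items copies its slice into a small shared array and then repeatedly adds the upper half onto the lower half.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Work items per group (an assumed value; must be a power of two).
constexpr std::size_t kGroupSize = 8;

// Reduces one group's slice of the input using a group-local scratch array.
// On a GPU, each loop over `item` would be parallel work items.
float reduceGroup(const std::vector<float>& input, std::size_t groupIndex) {
    assert(groupIndex * kGroupSize < input.size());
    std::array<float, kGroupSize> local{}; // stand-in for on-GPU local storage

    // Step 1: every work item loads one element (zero past the end).
    for (std::size_t item = 0; item < kGroupSize; ++item) {
        const std::size_t global = groupIndex * kGroupSize + item;
        local[item] = global < input.size() ? input[global] : 0.0f;
    }

    // Step 2: tree reduction; each pass halves the number of active items.
    for (std::size_t stride = kGroupSize / 2; stride > 0; stride /= 2) {
        for (std::size_t item = 0; item < stride; ++item) {
            local[item] += local[item + stride];
        }
        // A real kernel would need a group-wide barrier here.
    }
    return local[0];
}

// Sums the whole input by reducing each group, then adding the partial sums.
float sumAll(const std::vector<float>& input) {
    const std::size_t groups = (input.size() + kGroupSize - 1) / kGroupSize;
    float total = 0.0f;
    for (std::size_t g = 0; g < groups; ++g) {
        total += reduceGroup(input, g);
    }
    return total;
}
```

The sharing step is what a vertex or fragment shader cannot easily express: work items in the same group read values that their neighbours wrote a moment earlier.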

There are tons of interesting areas to use compute shaders for: optimized post-processing effects, particle systems, shadow and light culling, and so on.

While we aren’t using compute shaders much in Unity just yet, we’re looking forward to using them for more and more stuff. Exciting times ahead!

FAQ

When can I get this?
We can’t wait to ship this, but would like to avoid promising any actual dates. We have done a lot already, but some things still remain before it is “shippable”. Our current plan is to first integrate all of the bits of Metal that provide the huge boosts to CPU-side performance, hopefully in Unity 5.0. Later on, we’ll add compute shader support, which is somewhat more involved on our side.

What would be the platform requirements?
Metal requires iOS 8 and an A7-based device (iPhone 5S, iPad Air, Retina iPad Mini).

What would I have to do to take advantage of Metal’s lower CPU overhead?
Generally, nothing. Once we add support for Metal in Unity, using it should be very transparent. All your existing projects, all your shaders and graphics effects should just work. Just enjoy your lower CPU usage!

But what about shaders, since Metal has a different shading language?
We’ll take care of that. Right now you generally write shaders in Cg/HLSL, which we convert into GLSL for OpenGL ES behind the scenes. For Metal, we’ll convert them in a very similar way.

What can I do with lower CPU overhead, again?
Have better physics, AI or more complex gameplay logic. Put more objects on the screen. Or just enjoy lower battery usage. It’s all up to you!

Comments (25)

  1. Alphazenn

    August 13, 2014 at 2:51 pm / 

    I like your blog.

  2. Aras Pranckevičius

    August 7, 2014 at 7:50 am / 

    @Saikyou: in a very similar fashion to how we convert Cg/HLSL to GLSL already. The Metal shader pipeline looks like this right now: Cg/HLSL input -> hlsl2glsl converts into GLSL ES -> glsl-optimizer parses it and does offline optimizations -> prints Metal shader source -> compiled with Apple’s Metal compiler.

  3. SAIKYOU

    August 7, 2014 at 5:18 am / 

    Amazing!!!

    but, how does unity convert cg to metal shader??? some details???

  4. manuel

    July 20, 2014 at 1:52 am / 

    How do I play?

  5. Andy

    July 12, 2014 at 2:00 am / 

    OK, finally a valid reason to upgrade my mid-2007 MacMini and my iPod touch 4th Gen.

  6. Quazi

    July 9, 2014 at 9:04 am / 

    Thanks for putting this all in one place.

  7. Victoria

    July 8, 2014 at 1:30 pm / 

    Wonderful news! Can’t wait for it to come out. It was implemented even better than I hoped, and if it really does have lower CPU overhead it will open new possibilities for mobile game developers.

  8. przemo_li

    July 8, 2014 at 10:42 am / 

    @GAON

    1) Indirect rendering (this actually makes the GPU faster)
    2) Tessellation
    3) Geometry
    4) Etc.

    Yes, AEP is MORE than Metal feature-wise.

    Metal is targeting OpenGL ES 3.1 feature set. Nothing more.
    AEP is more.

    * Though, ES 3.1 AEP or no AEP, AZDO is not possible. So Apple still does it quicker on the CPU (and more reliably – apps know when stalls may be introduced)

  9. groan

    July 7, 2014 at 10:19 pm / 

    @PKID

    Metal is about CPU sending more draw calls within the frame.
    Driver doing less work. and GPU being less idle.

    So for CPU bound game you get 40% increase in ability to do more.
    GPU bound game is probably less than 15% increase.

    Another caveat is that, because there is no compiling of shaders,
    more stuff can be shown.

    There is also multithreading on the CPU side of thing
    which probably means more design changes of how you prep
    your frame to be sent to the GPU.

    So things will be smoother, less battery usage,
    more memory, more stuff
    but not necessarily more frames
    as most games are either designed for 30 fps or 60 fps.

    The thing about optimization is that programmers
    generally only go for the low hanging fruit at first.
    so Game Engines might take all the wins but
    with more visual details.

    GPU hasn’t gotten faster. It is just used more efficiently
    and CPU can do more to help GPU by taking away
    20 years of CRUFT (OpenGL). that is Metal

    True change will come when games are designed with multithreading
    in mind. Less prebake scenes.

  10. James Griggs

    July 7, 2014 at 3:41 pm / 

    I was at WWDC and few of us were wondering how Unity would implement Metal. Again you guys prove again and again why you are the best. Because you guys think like your customers think and try to get the latest developments to us. Thanks.

  11. Tong

    July 7, 2014 at 4:40 am / 

    Wow nice. When is this coming?

  12. Pkid

    July 6, 2014 at 2:01 am / 

    Can you give us any idea as to how much metal might increase frame rates? It would be nice to hear an example of a scene rendered with and without metal and what the frame rate difference was.

  13. Lief

    July 5, 2014 at 8:19 am / 

    I only wish that I can use it on Unity3d free version.

  14. gaon

    July 5, 2014 at 5:09 am / 

    you didn’t mention simd.h

    You can share same datatypes as the Shading Language
    but run on the CPU that is all auto vectorized for you.
    That means all the vector and matrix processing running on the cpu
    is going to be really fast now as well.

    PS. Android L is not even same as Metal. It is just ES 3.1 with some
    extra extensions. Only reason the demo is impressive is that K1 is using
    192 cores. Don’t worry next Apple Chip should have the same number of cores
    in their GPU too. So all the Android Bravado will disappear as soon as October.

  15. Mark Hessburg

    July 4, 2014 at 2:05 pm / 

    Excellent news. Seems it will be implemented exactly the way I hoped.

  16. Peter Dwyer

    July 4, 2014 at 10:34 am / 

    Totally Unrelated. Check your spam protection as it’s asked me exactly the same sum for the last four posts I’ve made. I think it’s stopped randomly generating or something….

  17. Peter Dwyer

    July 4, 2014 at 10:32 am / 

    Sounds interesting. I look forward to using it, though it doesn’t exactly sound close to being available. I guess I’ll put in an early request for beta access.

  18. Aras Pranckevičius

    July 4, 2014 at 6:52 am / 

    @Rainsing: if you’re running on a pre-A7 device (or pre-iOS8 OS), Unity will fall back to GLES2.0 or 3.0. That is, unless you choose to make a “Metal only” build in player settings. It’s up to you what to do when you detect you’re running on Metal – most likely you can afford more objects on screen, or boost up other CPU related things.

  19. Rainsing

    July 4, 2014 at 5:26 am / 

    What if I want to support both older iOS devices and A7 based devices alike? Do I have to make separate builds?

  20. Sinister Mephisto

    July 4, 2014 at 4:53 am / 

    Metal mantle piece. I just want to know about the new Physics API

  21. Stephane

    July 4, 2014 at 4:49 am / 

    Funny how this reminds me of something called ‘mantle’.. :-)

  22. Cian Mc Sweeney

    July 3, 2014 at 10:52 pm / 

    @BREAKMACHINE Google have already announced that android L(The next Android Version) will eradicate the same problems that Metal does.

  23. Breakmachine

    July 3, 2014 at 10:35 pm / 

    How will Android keep up?

  24. schmosef

    July 3, 2014 at 8:12 pm / 

    It’s good to know that Unity is on top of this major development.

  25. Martin

    July 3, 2014 at 5:16 pm / 

    Will any of this be restricted to Unity Pro?