
With the release of ARKit and the iPhone X, paired with Unity, developers have an easy-to-use set of tools to create beautiful, expressive characters. That combination opened the door to exploring the magic of real-time puppeteering for the upcoming “Windup” animated short, directed by Yibing Jiang.

Unity Labs and the team behind “Windup” came together to see how far we could push Unity’s ability to capture facial animation in real time on a cinematic character. We also enlisted Roja Huchez of Beast House FX to model and rig the blend shapes that help bring the character’s expressions to life.

What the team created is Facial AR Remote, a low-overhead way to capture a performance from a connected device directly into the Unity editor. We found the Remote’s workflow useful not just for animation authoring, but also for character and blend shape modeling and rigging, making it a streamlined way to build your own Animoji- or Memoji-style interactions in Unity. Developers can iterate on the model in the editor without building to the device, removing time-consuming steps from the process.

Why build the Facial AR Remote

We saw an opportunity to build new animation tools for film projects, opening up a future of real-time animation in Unity. There was also a “cool factor” in using AR tools for authoring, and an opportunity to keep pushing Unity’s real-time rendering. As soon as we had the basics working, with data flowing from the phone to the editor, our team and everyone around our desks could not stop puppeteering our character. We saw huge potential for this kind of technology: what started as an experiment quickly proved itself both fun and useful, and it soon expanded into the current Facial AR Remote and its feature set.


The team set out to expand the project with Unity’s goal of democratizing development in mind. We wanted tools and workflows for AR blend shape animation that are easier to use and more accessible than traditional methods of motion capture. The Facial AR Remote let us build tooling for iterating on blend shapes within the editor without creating a new build just to check mesh changes on the phone. In practice, this means a user can capture an actor’s face and record it in Unity, then use that capture as a fixed reference to iterate on the character model or re-target the animation to another character without redoing capture sessions with the actor. We found this workflow very useful for dialing in expressions on our character and refining the individual blend shapes.

How the Facial AR Remote works

The Remote is made up of a client phone app and a stream reader acting as the server in the Unity editor. The client is a lightweight app that uses the latest additions to ARKit and sends that data over the network to the Network Stream Source on the Stream Reader GameObject. Using a simple TCP/IP socket and a fixed-size byte stream, we send every frame of blend shape, camera, and head pose data from the device to the editor. The editor then decodes the stream and updates the rigged character in real time. To smooth out jitter caused by network latency, the stream reader keeps a tunable buffer of recent frames for when the editor inevitably lags behind the phone. We found this to be a crucial feature for preserving a smooth look on the preview character while staying as close as possible to the real actor’s current pose. In poor network conditions, the preview will sometimes drop frames to catch up, but all data is still recorded with the original timestamps from the device.
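
To make this concrete, here is a minimal sketch of a fixed-size-frame TCP reader in the spirit of the description above. The frame layout, byte counts, and class names are assumptions for illustration only; the actual Facial AR Remote protocol and classes live in the project’s GitHub repository.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;
using UnityEngine;

// Minimal sketch of a fixed-size-frame TCP reader, loosely modeled on the
// description above. The frame layout and sizes here are illustrative, not
// the actual Facial AR Remote protocol.
public class SimpleStreamReader : MonoBehaviour
{
    const int k_BlendShapeCount = 52; // ARKit blend shape coefficient count (assumed; adjust to the SDK version)
    // blend shape floats + head pose (pos + rot) + camera pose (pos + rot) + timestamp
    const int k_FrameSize = sizeof(float) * (k_BlendShapeCount + 7 + 7) + sizeof(long);

    [SerializeField] int m_Port = 9000;
    [SerializeField] int m_BufferedFrames = 4; // tunable latency buffer

    TcpListener m_Listener;
    TcpClient m_Client;
    readonly Queue<byte[]> m_FrameQueue = new Queue<byte[]>();

    void Start()
    {
        m_Listener = new TcpListener(IPAddress.Any, m_Port);
        m_Listener.Start();
    }

    void Update()
    {
        if (m_Client == null && m_Listener.Pending())
            m_Client = m_Listener.AcceptTcpClient();

        if (m_Client == null)
            return;

        var stream = m_Client.GetStream();

        // Pull every complete frame that has arrived since the last update.
        while (m_Client.Available >= k_FrameSize)
        {
            var frame = new byte[k_FrameSize];
            var read = 0;
            while (read < k_FrameSize)
                read += stream.Read(frame, read, k_FrameSize - read);

            m_FrameQueue.Enqueue(frame);
        }

        // Keep a few frames buffered to smooth out network jitter, and only
        // consume one frame per editor update so playback stays steady.
        if (m_FrameQueue.Count > m_BufferedFrames)
            ApplyFrame(m_FrameQueue.Dequeue());
    }

    void ApplyFrame(byte[] frame)
    {
        // Decode the blend shape weights at the front of the frame and hand
        // them to whatever drives the rigged character.
        for (var i = 0; i < k_BlendShapeCount; i++)
        {
            var weight = System.BitConverter.ToSingle(frame, i * sizeof(float));
            // e.g. blendShapesController.SetWeight(i, weight);
        }
    }

    void OnDestroy()
    {
        if (m_Client != null)
            m_Client.Close();
        if (m_Listener != null)
            m_Listener.Stop();
    }
}
```

Buffering a few frames trades a small amount of latency for a much steadier preview, which is the same trade-off described above.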

On the editor side, we use the stream data to drive the character for preview as well as to bake animation clips. Since we save the raw stream from the phone to disk, we can keep playing this data back on a character as we refine the blend shapes, and because the saved data is just the raw stream from the phone, we can even re-target the motion to different characters. Once you have captured a stream you’re happy with, you can bake it to an animation clip on a character. That clip can then be used like any other animation in Unity to drive a character through Mecanim, Timeline, or any of the other ways animation is used.
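
As an illustration of the baking step, the sketch below writes recorded blend shape samples into an AnimationClip through the “blendShape.<name>” property path that Unity uses to animate blend shapes on a Skinned Mesh Renderer. The sample layout and class name are assumptions; the package ships its own baking utility.

```csharp
using UnityEngine;

// Illustrative sketch of baking recorded blend shape samples into an
// AnimationClip. The sample layout and class name are assumptions, not the
// Facial AR Remote's own baking code.
public static class BlendShapeClipBaker
{
    // samples[frame][shape] = weight (0-100), captured at a fixed sample rate.
    public static AnimationClip Bake(string[] shapeNames, float[][] samples,
        float sampleRate, string rendererPath)
    {
        var clip = new AnimationClip { frameRate = sampleRate };

        for (var shape = 0; shape < shapeNames.Length; shape++)
        {
            var keys = new Keyframe[samples.Length];
            for (var frame = 0; frame < samples.Length; frame++)
                keys[frame] = new Keyframe(frame / sampleRate, samples[frame][shape]);

            // Skinned Mesh Renderer blend shapes are animated through
            // properties named "blendShape.<shape name>".
            clip.SetCurve(rendererPath, typeof(SkinnedMeshRenderer),
                "blendShape." + shapeNames[shape], new AnimationCurve(keys));
        }

        return clip;
    }
}
```

In an editor script, the resulting clip can be saved with AssetDatabase.CreateAsset and then used in Mecanim or Timeline like any other clip.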

The Windup animation demo

With the Windup rendering tech demo already completed, the team was able to use those high-quality assets to start our animation exploration. Since we got a baseline up and running rather quickly, we had plenty of time to iterate on the blend shapes using the tools we were developing. Jitter, smoothing, and shape tuning quickly became the major areas of focus. We reduced the jittering by working out the connection between frame rate and lag in frame processing, and by removing camera movement from the playback. Removing the ability to move the camera kept users focused on capturing the blend shapes and made it easy to mount the phone in a stand.

Understanding the blend shapes and getting the most out of the blend shape anchors in ARKit required the most iteration. It is difficult to grasp the minutiae of the different shapes from the documentation alone, and so much of the final expression comes from the stylization of the character and how the shapes combine with one another. We found that shapes like the eye and cheek squints and the mouth stretch were improved by limiting each blend shape’s influence to a specific area of the face. For example, the cheek squint should have little to no effect on the lower eyelid, and the lower-eyelid squint should have little to no effect on the cheek. It also did not help that we initially missed that the mouthClose shape is a corrective pose, meant to bring the lips closed while the jawOpen shape is at 100%.

Using information from the Skinned Mesh Renderer to look at the values that made up an expression on any given frame, then under- or over-driving those values, really helped us dial in the blend shapes. We could quickly over- or under-drive the current blend shapes and determine whether any of them needed to be modified, and by how much. This helped with one of the hardest things to do: getting the character to a key pose, like the way we wanted the little girl to smile. Being able to see which shapes make up a given pose was a big help here; in this case, it was how much the left and right mouth stretch worked with the smile to give the final shape. We found it helps to think of the shapes the phone provides as little building blocks, not as face poses a human could make in isolation.
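
For reference, here is a rough sketch of that kind of inspection helper, built only on the public SkinnedMeshRenderer blend shape API. The component name and the single-shape multiplier are illustrative, not part of the shipped tool.

```csharp
using UnityEngine;

// Rough sketch of a debugging helper for over- or under-driving a single
// blend shape while a capture plays back. The component and its fields are
// illustrative, not part of the shipped Facial AR Remote tooling.
[ExecuteInEditMode]
public class BlendShapeOverdrive : MonoBehaviour
{
    [SerializeField] SkinnedMeshRenderer m_Renderer;
    [SerializeField] int m_ShapeIndex;                        // shape to inspect
    [SerializeField, Range(0f, 2f)] float m_Multiplier = 1f;  // 1 = as captured
    [SerializeField] bool m_LogWeight;

    void LateUpdate()
    {
        if (m_Renderer == null)
            return;

        var mesh = m_Renderer.sharedMesh;
        if (mesh == null || m_ShapeIndex < 0 || m_ShapeIndex >= mesh.blendShapeCount)
            return;

        // See what the capture is currently driving this shape to...
        var weight = m_Renderer.GetBlendShapeWeight(m_ShapeIndex);
        if (m_LogWeight)
            Debug.Log(mesh.GetBlendShapeName(m_ShapeIndex) + ": " + weight);

        // ...then scale it to judge whether the sculpted shape needs to be
        // stronger or weaker to hit the key pose we want.
        m_Renderer.SetBlendShapeWeight(m_ShapeIndex, weight * m_Multiplier);
    }
}
```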

At the very end of art production on the demo, we tried an experiment to improve some of the animation on the character. Armed with our collective understanding of the ARKit blend shapes, we modified the character’s neutral pose. Because of the little girl’s stylization, we suspected the base pose had the eyes too wide and carried a little too much smile. This left too small a delta between wide-open eyes and the base pose, and too large a delta between the base pose and closed eyes. The effect of the squint blend shapes also needed to be better accounted for: for the people we tested on, the squint turned out to sit at roughly 60-70% whenever they closed their eyes. The change to the neutral pose paid off and, along with all the other work, makes for the expressive, dynamic character you see in the demo.

The future

Combining Facial AR Remote with the rest of the tools in Unity, there is no limit to the amazing animations you can create! Soon anyone will be able to puppeteer digital characters, whether it’s kids acting out and recording their favorite characters to share with friends and family, game streamers adding extra life to their avatars, or professionals and hobbyists finding new avenues to make animated content for broadcast. Get started by downloading Unity 2018 and checking out the setup instructions on Facial AR Remote’s GitHub. The team and the rest of Unity look forward to the artistic and creative uses our users will find for Facial AR Remote.

Comments are no longer accepted.

  1. I hope to see this come to Android phones. I won’t be purchasing an Apple device.

  2. This is a really wonderful feature. I was waiting for such a function. Thank you very much.

    However, I looked at the GitHub quick start and found it too brief to follow.
    If possible, would you please write a gentler quick start?
    I think a richer quick start would bring in more users :)

  3. It’s cool how you choose which comments show up and just delete everything that doesn’t say “thanks, you are the best” – sad.

  4. Guys, this is the first iteration. The iPhone already has the facial kit integrated. It is not even finished and you all want the full version for every platform?

  5. Wow, awesome, but I would never buy an Apple phone. When does it come to Android?

    1. Me too. That’s why I’m trying to use DLib and OpenCV for a facial mask on Android. Of course it is not 3D capture, but it can be used for some applications:
      https://www.youtube.com/watch?v=P9AbHQjS1-A

  6. I’ve seen that the iPhone app needs to be built and uploaded to the device, which means that we also need to be in the Apple Developer Program. But what if we want to create facial animations for desktop games using this? Every penny counts for us indie devs. Do you plan to create a dedicated iPhone app that we can download for this use case?

  7. Great job guys! This is truly futuristic technology, stuff that would cost a ton of money and time before this. This is a game changer, literally and figuratively.

  8. Have to agree – I also want to see this on ARCore

  9. GIFs are the absolute worst way to show off this tech…

    1. Johnathan Newberry

      August 14, 2018 at 7:40 pm

      Thanks for the feedback. We have updated the article with higher-quality webm videos. You should also check out the Unite Berlin Keynote where we demo the Facial AR Remote on stage.
      https://youtu.be/3omw9dLkrR8

      1. This looks cool. I think the video links might be broken though.

        1. Damn, didn’t mean to post it there! The YouTube video is fine (skip to 33 minutes in, by the way) but the blog videos seem broken…

  10. Will we see support for wrinkle maps via this workflow in the future?

    1. Johnathan Newberry

      August 14, 2018 at 7:37 pm

      Currently, there are no plans to support wrinkle maps. I will make sure it gets added to the list of suggested features. In the meantime, if you would like to add them to your project, you can look at creating your own implementation of IUsesStreamReader to drive the wrinkle maps in a renderer’s material. Look at how BlendShapesController and CharacterRigController do this to control other attributes of the character.

  11. Great! Can’t wait to experiment with this.

  12. Any plan to make it support Android phones like the Xiaomi Mi 8?

  13. Can this feature use off-the-shelf depth cameras? Or does it only work with the phone’s depth camera?

    1. Johnathan Newberry

      August 13, 2018 at 6:53 pm

      The Facial AR Remote uses Unity’s ARKit Plugin to take the facial data provided by iOS and ARKit on the iPhone X and drive the blend shapes in the editor.

  14. Could there be ARCore support?

  15. I hope to see this Facial AR feature on Android phones too :]

    I hate iPhone and Apple!!