Looking to the future of mixed reality (Part III)
In this third and final part of the series we analyze how foundational platforms will and must evolve, and outline our vision of how MR applications will be created and executed. Part I summarized the current and future issues facing mixed reality and its adoption in the mainstream. Part II explored the design challenges facing developers with this new mixed reality medium.
Foundational MR services
The tools and services Unity provides for creating apps, and the apps themselves, must live in and interact with a set of foundational services. So to understand the future of MR creation and adoption, and where we should go, we take a step back and think holistically.
Facebook’s purchase of Oculus back in 2014 kicked off the latest wave of consumer virtual reality devices and applications, and because of it, mixed reality has been predicted to become the next computing platform. Goldman Sachs stated in early 2016 that VR and AR have the potential “to emerge from specific use cases to a broader computing platform”. A March 2017 Forbes article titled “The Next Mobile Computing Platform: A Pair Of Sunglasses” projected that MR glasses will replace mobile phones, monitors and TVs, with foundation software and services evolving accordingly.
Articles like this one from Reuters characterize this emerging platform as a combination of augmented reality, artificial intelligence and cloud-based services, and we see major tech companies each offering their own perspective and implementation. Clay Bavor sees the combination of VR and AR with artificial intelligence and machine learning as enabling “the next phase of Google’s mission to organize the world’s information”. Through their Windows 10 Fall Creators Update and VR/AR devices, Microsoft aims to build “the platform that empowers everyone to create.” In June this year Apple introduced ARKit with iOS 11, and their keynote presented the A11 Bionic chip with its neural engine for performing artificial intelligence, initially aimed at facial recognition.
Operating systems and platform services are already evolving to include features for immersive technologies. Android Nougat introduced support and optimizations for virtual reality at the OS level. Windows Mixed Reality Platform APIs are part of the Universal Windows Platform. And as we saw in this year’s keynote, Apple’s new hardware is built for AR and AI, with the operating system and SDKs tightly coupled for optimized use. Now, what else do we need in an MR OS? What does it look like?
A paper from the 14th Workshop on Hot Topics in Operating Systems explored how operating systems should evolve to support AR applications. Since natural user input is needed for AR, the authors reimagine input within the context of it being a continuous sensing of the real world. They analyse user privacy mechanisms that should be put in place due to sensitive info mixed in with raw data. Furthermore, they argue sensor input access should not be restricted to one application at a time; instead it is desirable to let multiple AR apps from different vendors simultaneously read sensor inputs and render virtual overlays in a shared 3D reality. They also point out how the synthetic window abstraction in traditional GUIs is no longer viable so OSs must evolve to “expose 3D objects as the display abstraction and perform display isolation and management in the 3D space.” And functionality that is very common across AR apps, such as computer vision and AI, should be offloaded from applications to dedicated OS modules.
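To make the idea of shared, privacy-filtered sensor access more concrete, here is a minimal sketch in Python. It is not the design proposed in the paper: `SensorHub`, the field names and the permission model are all hypothetical, chosen only to show several apps reading the same continuous input while each one sees only the fields it was granted.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical: one "frame" of recognized real-world input, keyed by field name.
Frame = Dict[str, object]


class SensorHub:
    """Hypothetical OS module that fans sensor frames out to many apps."""

    def __init__(self) -> None:
        self._subscribers: List[Tuple[Set[str], Callable[[Frame], None]]] = []

    def subscribe(self, allowed_fields: Set[str],
                  callback: Callable[[Frame], None]) -> None:
        # Each app declares up front which fields it is allowed to read.
        self._subscribers.append((allowed_fields, callback))

    def publish(self, frame: Frame) -> None:
        # Every subscriber receives the frame, stripped of fields it was
        # not granted, so raw sensitive data never reaches the app.
        for allowed, callback in self._subscribers:
            callback({k: v for k, v in frame.items() if k in allowed})


hub = SensorHub()
navigation_seen: List[Frame] = []
game_seen: List[Frame] = []

# Two apps from different vendors read the same input at the same time.
hub.subscribe({"surfaces"}, navigation_seen.append)
hub.subscribe({"surfaces", "hand_pose"}, game_seen.append)

hub.publish({"surfaces": ["table"], "hand_pose": "open", "faces": ["<raw pixels>"]})
```

Neither app receives the `faces` field here, because neither was granted it. A real platform would also need to arbitrate how the apps' 3D objects share the display, which this sketch leaves out.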
Now, as described in our previous articles, we are re-thinking the principle that “one content or service equals one application icon”. We believe MR should be human and object centric, and use the world as a device (aka “clickable world”). So we are re-examining the notion that applications are presented as separate icons that must be clicked on by the user in order to be executed. In this new computing platform where computer vision and AI are foundational services, we envision MR experiences launching not just when requested by the user but when they are needed (e.g. a battle game automagically starting when the user picks up a physical nerf gun, or microwave cooking instructions appearing when the user picks a dinner meal from the fridge). This would require OS/platform services that allow apps to register their launching conditions, an extension of automating location-based tasks on mobile devices.
In previous articles we also mentioned our projection that sensors outside MR devices will populate physical spaces around us. The IoT will communicate bidirectionally with MR wearables. Clearly this scenario is inspired by Vernor Vinge’s Rainbows End (“Cryptic machines were everywhere nowadays. They lurked in walls, nestled in trees, even littered the lawns. They worked silently, almost invisibly, twenty-four hours a day”), but there are already examples of general purpose sensing, like the Synthetic Sensors initiative at Carnegie Mellon University. Sensors are needed to allow better understanding of the physical world, but as Greg alluded to in part II, treating the world as a playground and keeping users connected to the tangible world around them also requires the ability to manipulate physical objects from within mixed reality. Some current examples that allow this bidirectional communication are the Reality Editor and Open Hybrid. What this means for the emerging MR computing platform is that OS services on MR glasses should extend the concept of input/output devices to include new non-local hardware that they identify, connect to and possibly control at run time (e.g. recognize and connect to environmental sensors when entering a room, or identify and interact at run time with smart outdoor cameras), so that individual applications don’t have to reimplement this every time.
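As a rough illustration of apps registering their launching conditions, the sketch below models a platform-side registry in Python. Everything in it is an assumption made for the example: `LaunchRegistry`, the shape of an event, and the app identifiers do not correspond to any existing OS or Unity API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical: an event is a small dict of observations produced by the
# platform's vision/AI services, e.g. {"picked_up": "nerf_gun"}.
Event = Dict[str, str]
Condition = Callable[[Event], bool]


@dataclass
class LaunchRegistry:
    """Hypothetical platform service mapping launch conditions to apps."""

    _entries: List[Tuple[str, Condition]] = field(default_factory=list)

    def register(self, app_id: str, condition: Condition) -> None:
        # Called once per app, e.g. at install time.
        self._entries.append((app_id, condition))

    def apps_to_launch(self, event: Event) -> List[str]:
        # Called by the platform whenever it observes something new.
        return [app_id for app_id, cond in self._entries if cond(event)]


registry = LaunchRegistry()
registry.register("battle_game", lambda e: e.get("picked_up") == "nerf_gun")
registry.register("cooking_guide", lambda e: e.get("picked_up") == "dinner_meal")

print(registry.apps_to_launch({"picked_up": "nerf_gun"}))  # ['battle_game']
```

The point of the design is that the app never polls the sensors itself. It only declares a condition, and the platform decides when that condition has been met.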
So, based on current trends and how we believe the “next computing platform” should evolve, these are some of the assumptions for our research work:
- Connectivity will not be an issue: Fifth-generation (5G) wireless networks are predicted to compete with fixed broadband. Light Fidelity (LiFi) communication has been deployed and is expected to take off soon in the US and Europe, perhaps also in the rest of the world.
- Operating systems and platforms will include machine learning and computer vision services (some of them running in the cloud), and allow bidirectional communication with the IoT.
- Cloud services will allow storing a persistent matrix, i.e. a digital copy of the physical world, and the synchronized shared mixed realities. This way, devices become windows into persistent realities.
- A variety of sensors and devices connected to the IoT will populate environments and communicate bidirectionally with MR devices.
Of course we don’t know exactly how the services and technology behind these assumptions will shape up, or how profoundly they will affect the features an engine like Unity is able to build. Additionally, data privacy and security implications are large and need to be considered deeply by everyone involved. We hope our experiments bring some answers and shape the field towards mixed reality becoming mainstream the right way.
In the world described in the previous section, we envision MR experiences taking the form of contextual applications like the business card and the cereal box game presented in part II of our blog. These types of applications would require shifting some paradigms around creation and development.
First, we address launching MR experiences when needed. As part of the process of creating an MR app in Unity, creators should be able to specify the conditions that must be met to launch the app. We use the term contextual trigger for this, and define it this way:
Contextual triggers are the criteria around a user’s virtual or physical environment that activate an MR experience. They are a combination of time of day, GPS coordinates, proximity to certain physical object(s), fiducial markers and the serial number of an internet-enabled device. In the future, through brain-computer interfaces and advances in neuroscience and machine learning, contextual triggers will also come from users’ intent, preferences and behaviors. The contextual trigger for a specific experience will be the set of predicates that, when satisfied, cause an MR experience to start.
Second, we address moving away from applications as isolated executables that run in their own memory space. Instead, we define a contextual application as a collection of data that defines an MR experience and has access to a persistent reality. In simple terms, we “augment” the classic definition of interactive app to be composed of:
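One way to read this definition is as a conjunction of predicates over a snapshot of the user’s context. The Python sketch below takes that reading. It is only an illustration under stated assumptions: the `Context` fields, the helper names and the flat x/y position (used in place of real GPS coordinates) are all hypothetical.

```python
from dataclasses import dataclass
from datetime import time
from math import hypot
from typing import Callable, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Context:
    """Hypothetical snapshot of what the platform knows right now."""
    now: time
    position: Tuple[float, float]   # simplified local x, y in metres
    nearby_objects: FrozenSet[str]  # labels from object recognition


Predicate = Callable[[Context], bool]


def between(start: time, end: time) -> Predicate:
    return lambda ctx: start <= ctx.now <= end


def within(center: Tuple[float, float], radius_m: float) -> Predicate:
    return lambda ctx: hypot(ctx.position[0] - center[0],
                             ctx.position[1] - center[1]) <= radius_m


def near_object(label: str) -> Predicate:
    return lambda ctx: label in ctx.nearby_objects


@dataclass
class ContextualTrigger:
    predicates: List[Predicate]

    def is_active(self, ctx: Context) -> bool:
        # The experience starts only when every predicate holds.
        return all(p(ctx) for p in self.predicates)


# Hypothetical trigger for a cereal box game: mornings, in the kitchen area,
# with a cereal box in view.
breakfast = ContextualTrigger([
    between(time(6), time(10)),
    within((0.0, 0.0), 5.0),
    near_object("cereal_box"),
])

morning = Context(time(8, 30), (1.0, 2.0), frozenset({"cereal_box"}))
evening = Context(time(20, 0), (1.0, 2.0), frozenset({"cereal_box"}))
```

With these values, `breakfast.is_active(morning)` is true and `breakfast.is_active(evening)` is false, because the time-of-day predicate fails in the evening. Triggers based on intent or behavior would need a richer context than this snapshot provides.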
- Contextual triggers
- Application logic, aka interaction rules: how it interacts with the user and the world
- Assets: 3D/2D visuals and audio that are rendered on the client.
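The three parts listed above can be pictured as a single bundle of data rather than an executable. The following Python sketch is one possible shape for such a bundle. The class, the `World` dictionary standing in for the persistent reality, and the asset paths are all hypothetical and only serve to show how the pieces relate.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for the shared, persistent reality the app can read.
World = Dict[str, object]


@dataclass
class ContextualApp:
    name: str
    triggers: List[Callable[[World], bool]]  # contextual triggers
    rules: Callable[[World], List[str]]      # logic: which asset ids to show
    assets: Dict[str, str]                   # asset id -> resource path

    def should_start(self, world: World) -> bool:
        return all(trigger(world) for trigger in self.triggers)

    def step(self, world: World) -> List[str]:
        """Return the resources the client should render for this world."""
        if not self.should_start(world):
            return []
        return [self.assets[asset_id] for asset_id in self.rules(world)]


# Hypothetical business card experience: show a contact panel whenever a
# business card is visible.
card_app = ContextualApp(
    name="business_card",
    triggers=[lambda w: "business_card" in w.get("visible_objects", set())],
    rules=lambda w: ["contact_panel"],
    assets={"contact_panel": "assets/contact_panel.glb"},
)

print(card_app.step({"visible_objects": {"business_card"}}))
```

Because the app is plain data plus rules, the platform rather than the app decides when it runs, and several such apps can read the same world state side by side.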
Contextual triggers and apps are just groundwork concepts for upcoming research and experimentation. We are currently elaborating on them and prototyping, but there are many more questions to answer, e.g. when 3D objects (instead of windows) are the display abstraction, what should the Unity canvas look like? How to bring physical objects into the authoring process? What does a debugger or spectator module look like when editing reality? “Editing reality” is the overarching theme. We need to think not only about how applications will have virtual and physical realities interacting, but also about the process of creating these applications.
I leave you with a video of a mock application exploring how MR contextual apps could be created inside MR. By reusing some UI elements with which many are familiar, this is a little step towards pure virtual UIs. A frame around the working area acts as a container, and the UI is anchored outside so that dragging the frame towards a new physical object will bring the UI elements with it. The frame also guides the user and indicates where to look in order to activate/deactivate UI. As mentioned in part II of this series, the idea is that UI should be clean, subtle, efficient and contextual. This first step is an evolution, not a revolution, so that it is useful and usable tomorrow, but we are continuously re-thinking UX and interaction in order to get to that future when physical screens will disappear (by “we” I mean our lead designer Greg. “I” the engineer am continuously figuring out how to implement it! 😄)
A mock application, rendered (or captured) with ARKit, showing how MR contextual apps could be authored in MR.
We are prototyping these and more ideas to understand the space and push for innovation, and we will keep you informed.
Article contributors: Greg Madison, Lead UX/IxD Designer and futurist; Colin Alleyne, Senior Technical Writer; and Sylvio Drouin, VP – Unity Labs.
Thanks to Unity Labs’ Authoring Tools Group, who programmed the mock application shown in the video.