Mixed and Augmented Reality Studio (MARS): Designing a framework for simplified, flexible AR authoring
We set out to build workflows that give creators the ability to make AR applications that work the way they want them to: context-aware, flexible, customizable, and functional anywhere, with any kind of data – all without requiring tons of code.
As we near the release of MARS this year, we wanted to share the backstory of how and why we’re building this work environment and companion apps – which we believe are a major step toward ushering in the next generation of spatial, contextual computing across experience levels.
The question that spawned MARS
How could AR make better sense in Unity? Real world and application coordinate frames rarely align. Developers need a tool that allows them to author precisely for an uncertain world. This tool would need to accommodate a variety of situations with “fuzzy” placement rules.
Our answer is to break one large coordinate system into a series of smaller ones, each representing a known space. These spaces then procedurally adapt to reality. The solution is symbiotic: the aspects of the real world we base our content on also define the bounds of our coordinate systems.
We tested this idea during a Unity Labs hackweek, where we were able to create a multi-plane experience. This was unheard of a few years ago, and it’s still rare today without MARS.
Our solution was so effective that we wanted to make it possible to create even more ambitious AR experiences, which led to new challenges. The development of MARS has followed a two-step cadence of question and answer ever since: each step evolves what MARS can do, then pushes us to develop the platform further.
We identified several major challenges with AR development during that initial hackweek:
- Describing the world digitally is an intense dependency graphing challenge. It requires a real-time database that continuously grows and gains new data types.
- Game code and platform management do not mix well. The further we can keep these elements apart, the better.
- We want users to write scripts that affect the layout of an entire AR scene – without knowing about the entire AR scene.
We also had core values we wanted MARS to embody:
- Feels like Unity – Our team’s motto is “Reality as a build target,” and we mean it quite literally. MARS extends Unity to make it an Editor for reality, but it is still fundamentally Unity.
- Safe – Spatial computing is a new frontier, and users should feel safe exploring and experimenting there. Good development patterns should be frictionless.
- Welcoming to all developers – Every part of MARS should be tailored to the technical level of the developers we expect to use it.
These problems are complex and far-reaching, so their solution needed to be complete enough, and separate enough from the application logic, to feel invisible. The MARS Data Layer was created to meet this challenge.
The MARS Data Layer can be separated into four parts that combine to form a versatile abstraction of reality.
(Layer diagram, top to bottom: Proxies; Query and Data Events; Data Ownership; Data Description and Storage)
Our base is a digital description of the world. Semantics are the foundation of our universal AR language, and traits applied to surfaces like ‘floor’ and ‘wood’ are simple semantics.
Spatial semantics model something generic that can adapt to many variations in the real world. Let’s use a face as an example – faces come in many different shapes and sizes. Authoring AR content for a face would involve knowing which part of a body is the face and where the eyes are. Every face is different, but by targeting these pieces of data the content can adapt. The true power of MARS (and AR) comes when you move from authoring against specific low-level data into this higher-level semantic realm.
Data ownership is layered on top. For digital content to coexist with the world, there needs to be clear boundaries about what data is available. Data ownership allows aspects of the real world to be reserved for content – automatically. Managing the ideal arrangement of real-world data to digital content can be an impossible problem for application developers, but it’s solved by the MARS Data Layer.
Unity has a consistent pattern for data access in AR. Events are raised when data is detected, updated, or lost. The Query and Data Events system takes this concept to a higher level. Users set up queries, which are specified combinations of data and values. The MARS Data Layer then raises acquire, update, and lost events. This means that instead of knowing that just any plane was found, a user can know when a plane of the specified size, orientation, and light level was detected. Remember that MARS data storage is designed to dynamically work with new custom types of data. This means that users can get events for virtually any scenario, no matter how complex it is.
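In code, setting up such a query might look something like the following sketch. The type and member names here (PlaneQuery, MinSize, MinLightLevel, and the event hooks) are illustrative stand-ins, not the shipping MARS API:

```csharp
// Hypothetical sketch – names are illustrative, not the actual MARS API.
var query = new PlaneQuery
{
    Alignment = PlaneAlignment.Horizontal, // only horizontal planes
    MinSize = new Vector2(0.5f, 0.5f),     // at least 0.5m on each side
    MinLightLevel = 0.3f                   // only reasonably lit surfaces
};

// Acquire, update, and lost events fire only for planes matching the query.
query.Acquired += match => PlaceContent(match.Pose);
query.Updated  += match => MoveContent(match.Pose);
query.Lost     += match => HideContent();
```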
At the very top of this structure is the proxy system. The proxy system represents the physical world with Unity objects. It automatically handles the conversion of this native Unity representation into queries. The AR proxy objects are hooked up to data events to give them an object lifecycle that matches purely digital Unity content.
Data for every developer type
MARS Data needs to work for every AR developer. We’ve organized them into three core groups:
- Designers, who have no interest in the minutiae of the data and want to create new AR apps in the semantic and visual context of the real world.
- Providers, who add novel data to the AR ecosystem through cutting-edge hardware and software techniques.
- Engineers, who make clever use of data and interaction to bridge the gap between a designer’s vision and a provider’s data.
Each group has a specialized way to interact with the MARS Database. This way, each developer type can interact with the others while benefiting from an experience tailored to their own needs and workflow.
MARS Data providers
Providers work exclusively with the Data Storage and Description features of the MARS Data Layer. As you can see from the architecture above, MARS is hungry for external data and functionality. Providers are given a simple API to work with, consisting of just a few functions to add, update and remove data of any type. Providers can add multiple types of data, such as orientation, position, rotation, color, light, and roughness, then associate them together.
Here is an example of how AR Foundation planes are piped into MARS:
void AddPlaneData(MRPlane plane)
{
    var id = this.AddOrUpdateData(plane);
    this.AddOrUpdateTrait(id, TraitNames.Plane, true);
    this.AddOrUpdateTrait(id, TraitNames.Pose, plane.pose);
    this.AddOrUpdateTrait(id, TraitNames.Bounds2D, plane.extents);
    this.AddOrUpdateTrait(id, TraitNames.Alignment, (int)plane.alignment);

    // Notify listeners that a new plane is available
    if (planeAdded != null)
        planeAdded(plane);
}
It’s important to note that the provider interface uses a functionality injection pattern. By decoupling the API from implementation, we are able to easily switch between data sources. This is critical for data simulation and playback while authoring.
Providers are required to list which data they add to the system in the class definition. This is the trait list for the MARS plane provider:
static readonly TraitDefinition[] k_ProvidedTraits =
{
    // One TraitDefinition per trait this provider adds:
    // Plane, Pose, Bounds2D, and Alignment (see AddPlaneData above)
};
By having this data available at compile-time, we can see exactly what data is available to our application on each platform. This lets us know whether an application will function, or if it needs additional support scripts in the form of Reasoning APIs.
MARS Reasoning APIs
AR engineers often have to reason about the world with incomplete or unexpected data. Consider this simple example: an application that displays an educational graphic around a statue. Object recognition for the statue will be available on some devices, while image markers can be used on others. Relocalization to a space with the statue’s location pre-authored is available on yet another platform.
This example no longer seems so simple. Let’s add some more complications: what happens if a user views a picture of the statue? Or a small-scale replica? What about users who want the experience but don’t have access to the statue? What about users in VR?
It’s possible to make an application that can handle a subset of all of these scenarios and contingencies. In the past, this might have been the only solution to create this kind of AR experience, but it’s not a good one. The resulting scene would be a confounding web of objects and fragile fallbacks with application logic, platform abstraction, and world analysis all mixed together. Platform-specific problems are common, and debugging is difficult at best.
Reasoning APIs are the solution: these are scripting interfaces that provide engineers with the power and knowledge to handle all of these complex scenarios. MARS handles the logic of when these scripts are needed. The list of traits supplied by available providers is combined with the list of traits required by MARS content to determine which reasoning APIs most efficiently bridge the gap. In a case where there is no suitable reasoning API available, we can alert the developer to this fact.
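The trait-matching step itself can be pictured as plain set arithmetic. Here is a minimal sketch in C# – not MARS code, and the trait names are invented for the example:

```csharp
using System.Collections.Generic;

// Traits the scene's proxies require vs. traits this platform's providers supply.
var required = new HashSet<string> { "Plane", "Pose", "Bounds2D", "LightEstimation" };
var provided = new HashSet<string> { "Plane", "Pose", "Bounds2D" };

// The gap is what reasoning APIs must bridge for the app to function here.
var gap = new HashSet<string>(required);
gap.ExceptWith(provided);

// gap now holds "LightEstimation": pick a reasoning API that outputs this trait,
// or alert the developer that the application can't run on this platform.
```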
The reasoning API interface interacts with the Data Storage and Description feature of MARS. Reasoning APIs can access entire lists of MARS data at once – for example, a vertically sorted list of planes.
Reasoning APIs use the same function calls as data providers to add, update, and remove data. This means MARS can mix and match data from reasoning APIs and hardware providers. The gap in functionality is filled seamlessly without the application having to make any changes.
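As a hedged sketch of that shape – the read accessor and helper below are hypothetical, but the write call mirrors the provider example above:

```csharp
// Illustrative only: derive a semantic "Floor" trait from raw plane data.
void ProcessScene()
{
    // Reasoning APIs can read whole lists of data at once...
    List<MRPlane> planes = this.GetAllPlanes(); // hypothetical read accessor

    foreach (var plane in planes)
    {
        // ...and write derived traits back with the same calls providers use,
        // so MARS can mix reasoning-API output with hardware data seamlessly.
        if (plane.alignment == MarsPlaneAlignment.HorizontalUp && IsLowest(plane, planes))
            this.AddOrUpdateTrait(this.GetDataId(plane), "Floor", true);
    }
}
```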
MARS objects
We want to represent the properties of the real world in Unity as visual objects that users can reference. Visuals enable users to author and validate their digital content quickly. Object references allow Unity scripts and events to work with users’ existing game code without any additional scripting. This point is critical – we want MARS to work with all Asset Store and other user packages without modification. Cross-compatibility is strongest when our workflows follow Unity’s best practices.
Our object system is designed to be simple – complex structures are made by combining these simple pieces together in different ways. It is designed to be self-contained. It works in all the ways regular Unity objects do: directly in the scene view and in all types of Prefabs.
The three components that make up MARS content are the Proxy, the ProxyGroup, and the Spawner. We go into more detail about each below.
A Proxy defines a single object in AR – this is the limit of most AR toolsets. ProxyGroups allow users to describe scenarios where multiple things in the real world must relate in some way. No other AR authoring experience offers this functionality. It is an incredibly complex problem to solve algorithmically, which is why we created the MARS Data Layer to handle it for you. The last component is the Spawner. Spawners are objects that contain a Proxy or ProxyGroup and duplicate them repeatedly, transforming them into a ruleset that reskins your entire reality.
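As a rough illustration of how these three compose, a tabletop game might be authored as a hierarchy like the following. The component names are MARS's own; the specific conditions and content objects are invented for the example:

```
Spawner                          // re-applies the rule for every new match
└── ProxyGroup "Table and chair" // two real things that must relate spatially
    ├── Proxy "Tabletop"         // horizontal plane, at least 0.5m x 0.5m
    │   └── GameBoard            // digital content placed on the match
    └── Proxy "Seat"             // smaller horizontal plane near the tabletop
        └── PlayerHUD            // digital content placed on the match
```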
From the top down
We’ll recap how everything fits together by referring back to the layer diagram:
- All Proxy, ProxyGroup, and Spawner components are authored by designers. They create queries and respond to events.
- Queries search the database for matches and control ownership.
- Reasoning APIs and Providers add and remove data from the database.
These aspects of the data layer likewise come together to ensure the best experience on every platform:
- MARS proxy objects in the scene define the required set of traits an application needs to run.
- Providers define the set of traits available to an application.
- Reasoning APIs act as bridges to navigate from the set of available traits to the full collection an application requires.
We are beyond excited to see the boundary-pushing applications that users will be able to create with MARS. With your help, we will continue to push spatial computing even further. If you’re interested in learning about MARS and receiving the latest news as we move toward a wider release, please sign up for updates and check out our new Project MARS webpage.