Nobody likes loading screens. Did you know that you can quickly adjust Async Upload Pipeline (AUP) parameters to significantly improve your loading times? This article details how meshes and textures are loaded through the AUP. This understanding could help you speed up loading time significantly — some projects have seen over 2x performance improvements!

Read on to learn how the AUP works from a technical standpoint and what APIs you should be using to get the most out of it.

Try it Out

The latest and most optimized implementation of the Async Upload Pipeline is available in the 2018.3 beta.

 

First, let’s take a detailed look at when the AUP is used and how the loading process works.

When is the Async Upload Pipeline used?

Prior to 2018.3, the AUP only handled textures. Starting with the 2018.3 beta, the AUP loads both textures and meshes, with some exceptions: textures that are read/write enabled, and meshes that are read/write enabled or compressed, do not use the AUP. (Note that Texture Mipmap Streaming, which was introduced in 2018.2, also uses the AUP.)

How the loading process works

During the build process, the Texture or Mesh Object is written to a serialized file, and the large binary data (texture or vertex data) is written to an accompanying .resS file. This layout applies to both player data and asset bundles. Separating the object from its binary data allows for faster loading of the serialized file (which will generally contain small objects), and it enables streamlined loading of the large binary data from the .resS file afterwards. When the Texture or Mesh Object is deserialized, it submits a command to the AUP’s command queue. Once that command completes, the Texture or Mesh data has been uploaded to the GPU and the object can be integrated on the main thread.

Figure: Layout of mesh and texture data when serialized for a build.

During the upload process, the large binary data from the .resS file is read into a fixed-size ring buffer. Once in memory, the data is uploaded to the GPU in a time-sliced fashion on the render thread. The size of the ring buffer and the duration of the time slice are the two parameters that you can change to affect the behavior of the system.

The Async Upload Pipeline has the following process for each command:

  1. Wait until the required memory is available in the ring buffer.
  2. Read data from the source .resS file to the allocated memory.
  3. Perform post-processing (texture decompression, mesh collision generation, per-platform fixup, etc.).
  4. Upload in a time-sliced manner on the render thread.
  5. Release ring buffer memory.

Multiple commands can be in progress simultaneously, but all must allocate their required memory out of the same shared ring buffer. When the ring buffer fills up, new commands will wait. This waiting does not block the main thread or affect frame rate; it simply slows the async loading process.
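To make the effect of the shared ring buffer concrete, here is a toy model written in Python. It is not Unity code and does not reflect the engine's actual implementation: the function name, the FIFO admission rule, and the "MB uploaded per frame" budget are all assumptions made for illustration. It only shows how a small buffer can starve the per-frame upload step even though nothing blocks.

```python
from collections import deque


def frames_to_upload(command_sizes_mb, buffer_mb, upload_mb_per_frame):
    """Return how many frames a toy pipeline needs to upload all commands.

    Hypothetical model: commands are admitted in FIFO order while they fit
    in the ring buffer, and each frame the render thread uploads at most
    upload_mb_per_frame from the commands already in the buffer.
    """
    if upload_mb_per_frame <= 0:
        raise ValueError("upload_mb_per_frame must be positive")
    if any(size > buffer_mb for size in command_sizes_mb):
        # Oversized commands are out of scope for this sketch.
        raise ValueError("toy model requires every command to fit the buffer")

    waiting = deque(command_sizes_mb)
    in_buffer = deque()  # [remaining_mb, total_mb] per admitted command
    used_mb = 0
    frames = 0
    while waiting or in_buffer:
        # Steps 1-2: admit commands while ring buffer memory is available.
        while waiting and used_mb + waiting[0] <= buffer_mb:
            size = waiting.popleft()
            in_buffer.append([size, size])
            used_mb += size
        # Step 4: time-sliced upload, limited by the data already buffered.
        budget = upload_mb_per_frame
        while in_buffer and budget > 0:
            chunk = min(budget, in_buffer[0][0])
            in_buffer[0][0] -= chunk
            budget -= chunk
            if in_buffer[0][0] == 0:
                # Step 5: release this command's ring buffer memory.
                used_mb -= in_buffer.popleft()[1]
        frames += 1
    return frames


# Eight 2MB commands, with room to upload 8MB per frame.
small = frames_to_upload([2] * 8, buffer_mb=4, upload_mb_per_frame=8)
large = frames_to_upload([2] * 8, buffer_mb=16, upload_mb_per_frame=8)
print(small, large)  # 4 2
```

In this model the 4MB buffer can only ever offer 4MB to an 8MB-per-frame upload step, so half of each frame's budget goes unused; the 16MB buffer keeps the upload step fully fed.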

A summary of these impacts is as follows:

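Load Pipeline Comparison

|                 | Without AUP                                                             | AUP                                      | Impact on you                 |
|-----------------|-------------------------------------------------------------------------|------------------------------------------|-------------------------------|
| Memory Usage    | Allocate as data is read out of default heap (high memory watermarks)   | Fixed-size ring buffer                   | Reduced high memory watermarks |
| Upload Process  | Upload as data is available                                             | Amortized uploading with fixed time slice | Hitchless uploading           |
| Post Processing | Performed on loading thread (blocks loading thread)                     | Performed on jobs in background          | Faster loading                |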

What public APIs are available to adjust loading parameters

To take full advantage of the AUP in 2018.3, there are three parameters that can be adjusted at runtime for this system:

  • QualitySettings.asyncUploadTimeSlice: The amount of time in milliseconds spent uploading textures and mesh data on the render thread for each frame. When an async load operation is in progress, the system will perform two time slices of this size. The default value is 2ms. If this value is too small, you could become bottlenecked on texture/mesh GPU uploading. A value too large, on the other hand, might result in framerate hitching.
  • QualitySettings.asyncUploadBufferSize: The size of the ring buffer in megabytes. When the upload time slice occurs each frame, we want to be sure that we have enough data in the ring buffer to utilize the entire time slice. If the ring buffer is too small, the upload time slice will be cut short. The default was 4MB in 2018.2 but has increased to 16MB in 2018.3.
  • QualitySettings.asyncUploadPersistentBuffer: Introduced in 2018.3, this flag determines whether the upload ring buffer is kept allocated when all pending reads are complete. Allocating and deallocating this buffer can often cause memory fragmentation, so the flag should generally be left at its default (true). If you really need to reclaim memory when you are not loading, you can set this value to false.

These settings can be adjusted through the scripting API or via the QualitySettings menu.

Example workflow

Let’s examine a workload with lots of textures and meshes being uploaded through the Async Upload Pipeline using the default 2ms time slice and a 4MB ring buffer. Since we’re loading, we get 2 time-slices per render frame, so we should have 4 milliseconds of upload time. Looking at the profiler data, we only use about 1.5 milliseconds. We can also see that immediately after the upload, a new read operation is issued now that memory is available in the ring buffer. This is a sign that a larger ring buffer is needed.
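The budget arithmetic in this example can be written out as a short Python sketch. This is a back-of-the-envelope model, not Unity code: the function names are hypothetical, and the "two slices per frame while loading" rule is taken from the description in this article.

```python
def upload_budget_ms(time_slice_ms, loading=True):
    """Upload time available per render frame in this simplified model."""
    slices_per_frame = 2 if loading else 1
    return slices_per_frame * time_slice_ms


def utilization(used_ms, time_slice_ms, loading=True):
    """Fraction of the per-frame upload budget that was actually used."""
    return used_ms / upload_budget_ms(time_slice_ms, loading)


print(upload_budget_ms(2))   # 4
print(utilization(1.5, 2))   # 0.375
```

Using only about 1.5ms of a 4ms budget means the upload step is running out of buffered data well before its time slices end.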

Let’s try increasing the ring buffer size. Since we’re in a loading screen, it is also a good idea to increase the upload time slice. Here’s what a 16MB ring buffer and a 4-millisecond time slice look like:

Now we can see that we are spending almost all our render thread time uploading, and just a short time between uploads rendering the frame.

Below are the loading times of the sample workload with a variety of upload time slices and ring buffer sizes. Tests were run on a MacBook Pro with a 2.8GHz Intel Core i7 running OS X El Capitan. Upload speeds and I/O speeds will vary on different platforms and devices. The workload is a subset of the Viking Village sample project that we use internally for performance testing. Because there are other objects being loaded, we aren’t able to isolate the precise performance win of the different values. It’s safe to say in this case, however, that the texture and mesh loading is at least twice as fast when switching from the 4MB/2ms settings to the 16MB/4ms settings.

Experimenting with these parameters outputs the following results.

To optimize loading times for this particular sample project, we should, therefore, configure settings like this:

Takeaways and recommendations

General recommendations for optimizing loading speed of textures and meshes:

  • Choose the largest QualitySettings.asyncUploadTimeSlice that doesn’t result in dropping frames.
  • During loading screens, temporarily increase QualitySettings.asyncUploadTimeSlice.
  • Use the profiler to examine the time slice utilization. The time slice will show up as AsyncUploadManager.AsyncResourceUpload in the profiler. Increase QualitySettings.asyncUploadBufferSize if your time slice is not being fully utilized.
  • Things will generally load faster with a larger QualitySettings.asyncUploadBufferSize, so if you can afford the memory, increase it to 16MB or 32MB.
  • Leave QualitySettings.asyncUploadPersistentBuffer set to true unless you have a compelling reason to reduce your runtime memory usage while not loading.

FAQ

Q: How often will time-sliced uploading occur on the render thread?

  • Time-sliced uploading will occur once per render frame, or twice during an async load operation. VSync affects this pipeline: while the render thread is waiting for a VSync, it could be uploading. For example, if you are running at 16ms frames and one frame runs long, say 17ms, you will end up waiting 15ms for the next VSync. In general, the higher the frame rate, the more frequently upload time slices will occur.
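The VSync arithmetic in this answer can be checked with a small Python sketch. It is an illustration only: the function name is hypothetical, and it assumes VSync signals arrive at a fixed interval measured from the start of the frame.

```python
def vsync_wait_ms(frame_work_ms, vsync_interval_ms):
    """Time spent idle until the next VSync after a frame's work finishes."""
    # The next VSync is the first multiple of the interval at or after the
    # moment the frame's work ends.
    return (-frame_work_ms) % vsync_interval_ms


print(vsync_wait_ms(17, 16))  # 15
print(vsync_wait_ms(16, 16))  # 0
```

A frame that overruns the interval by just 1ms leaves 15ms of idle waiting on the render thread.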

Q: What is loaded through the AUP?

  • Textures that are not read/write-enabled are uploaded through the AUP.
  • As of 2018.2, texture mipmaps are streamed through the AUP.
  • As of 2018.3, meshes are also uploaded through the AUP so long as they are uncompressed and not read/write enabled.

Q: What if the ring buffer is not large enough to hold the data being uploaded (for example, a really large texture)?

  • Upload commands that are larger than the ring buffer will wait until the ring buffer is fully consumed, then the ring buffer will be reallocated to fit the large allocation. Once the upload is complete, the ring buffer will be reallocated to its original size.
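As a rough illustration of this answer, the following Python sketch lists the ring buffer size that would be in effect while each command uploads. It is a simplification with a hypothetical function name: it treats commands one at a time and ignores the wait for the buffer to drain.

```python
def buffer_size_per_command(command_sizes_mb, base_buffer_mb):
    """Ring buffer size in effect while each command is uploaded.

    Assumed behavior, per the answer above: the buffer grows only for a
    command that does not fit, then returns to its original size.
    """
    return [max(base_buffer_mb, size) for size in command_sizes_mb]


print(buffer_size_per_command([2, 40, 3], base_buffer_mb=16))  # [16, 40, 16]
```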

Q: How do synchronous load APIs work? For example, Resources.Load, AssetBundle.LoadAsset, etc.

  • Synchronous loading calls use the AUP and will essentially block the main thread until the async upload operation completes. The type of loading API used is not relevant.

Tell us what you think

We’re always looking for feedback. Let us know what you think in the comments or on the Unity 2018.3 beta forum!

Comments

  2. Thank you for good information.
    I want to apply this but I can’t find asyncUpload info in profiler.
    async mesh upload, resource upload etc I can’t see anything.
    I use LoadSceneAsync in coroutine.
    Is there setting to use async upload?
    I use Unity 2018.3.0b9 and 2017.3.1p4
    Thank you

  4. Still confused about the time slice. Is there any useful resource for understanding time slicing in the asynchronous upload process?
    For example, why was the time slice feature added, how does this value affect the asynchronous upload, and what is the relationship between the time slice and the frame rate?

  5. About the ring buffer… Do I properly understand that having a larger ring buffer would mean that stutters are more likely to occur during the upload time?

    1. Hi! Increasing the size of the ring buffer will NOT introduce stuttering. You want a large enough ring buffer so that each render thread time slice can be fully utilized. If your ring buffer is small, you might consume all the pending uploads it contains before the time slice is complete. The idea is to keep the ring buffer primed with data so the uploading step can use the entire time slice.

  6. Can this be used to load big textures from disk without hiccups? It would be wonderful to load a texture in async mode as mentioned here: https://feedback.unity3d.com/suggestions/async-texture2d-dot-loadimage-and-other-texture-operations?page=2

    Would this be possible?

    1. Currently the AUP is only used to load textures that were built through the build process

      1. Does this include AssetBundles, too?

        1. Yes it does

  7. Thanks so much!
    Now the big problem is asynchronous shader loading. When shaders first appear on screen, it takes 100-600ms on the render thread on iPhone 6, which makes the game really jerky…

    You can try our project: MadOut2 BigCityOnline

    Please add the possibility to load and compile shaders asynchronously too!

    1. That is because of shader warming. The first time your shader is used, it takes time to compile/load. To mitigate this, you need to prewarm your shaders manually during startup or loading. You can easily do this through ShaderVariantCollection.WarmUp(), but note that you should not put all of your shaders in there.
      https://docs.unity3d.com/ScriptReference/ShaderVariantCollection.WarmUp.html

  8. Awesome feature. Question to Unity staff: how does this feature affect multi scene async loading, if it does? I would like to load in sections of my level as I move around. Also, what API is there to manage meshes and audio (if any)? Texture mip streaming was a great start but there is not much information about everything else.

    Thanks again!

    1. The AUP is used for textures and meshes whether you load a scene async or sync. With the new async uploading of mesh data, you should be able to realize faster mesh loading times in most cases (especially if your meshes have collision) as well as smoother uploading since the uploads are time sliced over multiple frames. So this should be good for your case of loading/unloading in sub-scenes during gameplay.

      Once the Mesh data is loaded through the AUP, it stays loaded until the Mesh object is unloaded. At the moment you can’t stream the mesh data in and out while keeping the Mesh object loaded. So you might want to load your sections asynchronously as individual scenes.

      1. Thank you for the informative reply, Joseph! I currently have designed the game around the concept of just streaming in a village or other event when triggered by culling groups API. Due to a major lack of information around the whole subject I suppose I need a guide on best practises for managing streaming in cases like this. Mesh data. Collision data, audio, probes and so on!

        Thanks for your hard work on this.

  9. Hi,

    thank you very much for this post, great information!

    You wrote “the higher the frame rate, the more frequently upload time slices will occur”. Does this mean that if I turn off VSync and set the applicationTargetFramerate as high as possible, it affects the loading time in a positive fashion?

    I’m asking, because I did the exact opposite. I reduced framerate to 20fps and turned on VSync during loading screens, thinking it would give Unity more resources to actually load scenes, assets and integrate those faster. I thought I trade faster loading for more hiccups in framerate.

    Thanks for your answer in advance.

    1. Hi Peter,

      That’s correct. You can give more time for mesh/texture uploading by increasing the time slice, increasing the framerate, or both. Reducing your framerate to 20fps would likely make your loading times worse, as you concluded. As an experiment, you might try measuring loading times after turning VSync off and increasing both your buffer size and time slice.

      1. Hi Joseph,

        thanks for the answer.

        I gave it a try. Turned vsync off, increased buffer from 4 to 16 and time slice from 2 to 4 and 8. It didn’t affect loading times, to my surprise. The scene I tested had about 250 unique meshes and 500 unique textures, mostly splat- and alpha maps from terrain.

        I tested this with Unity 2017.4.9f1 on Console.

        1. Sounds like your loading speeds aren’t bottlenecked on the async upload pipeline. It’s difficult to diagnose the performance problem without more info. Prior to 2018.3 however, the terrain system would make all referenced textures readable, and as a result they can’t use the async upload pipeline. In 2018.3 the terrain code was refactored and it no longer marks textures as readable.

  10. What about triggering the upload of resources? Is this still bound to renderers becoming visible on cameras, and does it still require “tricks” like rendering one frame behind a full-screen overlay? Or are there proper APIs for ensuring that resources I know will be needed can be loaded/uploaded completely during a loading screen?

    1. When textures and meshes are loaded with any loading API, their data will be uploaded to the GPU through the AUP. This is independent of Renderer components in your scene. Mipmap streaming, on the other hand, is influenced by Renderer objects in the scene, and you do have to perform an update before they will start streaming in. More information on mipmap streaming can be found here: https://docs.google.com/document/d/1P3OUoQ_y6Iu9vKcI5B3Vs2kWhQYSXe02h6YrkDcEpGM/edit#heading=h.d2nucy9ys0gh

      1. Hey Joseph, when you said “When textures and meshes are loaded with any loading API, their data will be uploaded to the GPU through the AUP. This is independent of Renderer components in your scene.”, is this behavior new in 2018.3? I ask because I am using 2018.2: during a loading screen, I load a GameObject with large textures via an AssetBundle and spawn it in the scene. There is a noticeable hiccup the first time the user looks at the GameObject. Should I expect this hitch to go away by upgrading to 2018.3?

  11. As a console developer who is stuck with Unity 2017, it pisses me off that all the cool stuff is only in Unity 2018, without any support for 2017. It makes me regret that I didn’t switch to Unreal Engine.

    1. The updates in UE 4.21 are not in UE 4.20… just saying….
      Happy coding ;)

      1. They offer workarounds, unlike Unity.

        1. Uh… how does it screw over older versions when you can just… download all the previous versions if you want to?

    2. Soooo… if I understand your post… are you saying: “INJUSTICE!!! I picked and am clinging to an older version of your software and it doesn’t do what the newer version does. You suck, it would never happen in other superior engines, I demand support for everything everywhere”? Dude, either get with the newer versions or go to Unreal and deal with their pros and cons there. Porting your project to a newer version of Unity will probably be easier on your soul than porting it to Unreal… You are always limited to the functionality offered in your version. Maybe sometimes you get support later on, but they are primarily trying to make new stuff, implement it in the newer versions, and move forward so you can get new S*!t in the newer versions FASTER and more robust, instead of using countless resources focusing on compatibility issues for people using outdated versions. Maybe you wanna work on your new VR project and you will choose Unity 4.2, but you want the job system and ECS and VR and basically everything? And I bet if I go find and install several versions of Unreal, I will probably find dozens of examples with similar/same problems. I’m not sure why your post triggered me, but I find your logic a bit faulty. Or just perhaps Unity might be looking for people who are willing to make packages to offer support for new functionality in older versions, and by what I read you are the right man/woman/apache for the job.

      1. Unity is not up to date on console. We are way behind; you can’t use the latest version.

        1. Any source to back this up? First time I hear about it but I haven’t dabbled in consoles with Unity yet.