Profiling with Instruments

February 1, 2016 in Technology

In the Enterprise Support team, we see a lot of iOS projects. At some point in any iOS project, developers end up running their game and sitting there thinking “Why the hell is this running so slowly?”. There are some great tools for analysing performance out there, and one of the best is Instruments. Read on to find out how to use it to find your issues!

To use Instruments, or any of Xcode’s debugging tools, you will need to build a Unity project for the iOS Build Target (with the Development Build and Script Debugging options unchecked). Then compile the resulting Xcode project with Xcode in Release mode and deploy it to an attached iOS device.

After starting Instruments (by either a long press on the play button, or selecting Product > Profile), select the Time Profiler. To begin a profiling run, select the built application from the application selector, then press the red Record button. The application will launch on the iOS device with Instruments connected, and the Time Profiler will begin recording telemetry. The telemetry will appear as a blue graph on the Instruments timeline.


Tip: to clean up the call hierarchy, open the “Settings” submenu (click the gear icon in the Details pane on the right-hand side of the Call Tree) and select both Flatten Recursion and Hide System Libraries.

A list of method calls will appear in the detail section of the Instruments window. Each top-level method call represents a thread within the application.

In general, the main method is the location of all hotspots of interest, as it contains all managed code.

Expanding the main method will yield a deep tree of method calls. The major branch is between two methods:

  • [startUnity] and UnityLoadApplication (These method names sometimes appear in ALL CAPS).
  • PlayerLoop

[startUnity] is of interest as it contains all time spent initializing the Unity engine. A method named UnityLoadApplication will be found beneath it. It is beneath UnityLoadApplication that startup time can be profiled.


Once you have a nice time-slice of your application profiled, pause the Profiler and start expanding the tree. As you work down the tree, you will notice the time in ms in the left-hand column getting smaller. What you are looking for are items where the time drops significantly from one level to the next: a big drop means the parent method itself is eating the time, and that is your performance hotspot. Once you have found one, you can go back to your code-base and find out WTF is going on that is taking so much time. It could be that it is a totally necessary operation, or it could be that some time in the distant past you hacked in some pre-production code that has made it over to your production project, or… well… it could be for a million reasons really. How (or if) you decide to fix this hotspot is largely up to you, as you know your codebase better than anyone :D.
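If it helps to see that "follow the big numbers down the tree" routine written out, here is a minimal Python sketch of it. Everything in it is an assumption for illustration: the Frame class, the method names and the timings are made up, and Instruments does not hand you data in this shape. It only models what you do by eye in the call tree.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Frame:
    """One row of a call tree: a symbol, its total time, and its callees."""
    name: str
    total_ms: float
    children: list[Frame] = field(default_factory=list)

    @property
    def self_ms(self) -> float:
        # Time spent in this method itself, not in anything it calls.
        return self.total_ms - sum(c.total_ms for c in self.children)


def heaviest_path(root: Frame) -> list[Frame]:
    """Follow the most expensive child at each level, as you would by hand."""
    path = [root]
    while path[-1].children:
        path.append(max(path[-1].children, key=lambda c: c.total_ms))
    return path


def hotspots(root: Frame, min_self_ms: float) -> list[Frame]:
    """Frames on the heaviest path whose own time is at least the threshold."""
    return [f for f in heaviest_path(root) if f.self_ms >= min_self_ms]


# Hypothetical call tree; names and timings are invented for this example.
tree = Frame("PlayerLoop", 900.0, [
    Frame("Update", 700.0, [
        Frame("EnemyManager.Update", 650.0, [
            Frame("Pathfinder.Search", 40.0),
        ]),
        Frame("UI.Update", 50.0),
    ]),
    Frame("Render", 200.0),
])

print([f.name for f in hotspots(tree, min_self_ms=100.0)])
```

In this invented tree the time falls from 650 ms to 40 ms between EnemyManager.Update and its only child, so roughly 610 ms is spent in EnemyManager.Update itself. That is the kind of drop to go and investigate.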

Instruments can also be used to look for performance sinks that are distributed broadly — ones that lack a single large hotspot, but instead show up as a few milliseconds of lost time in many different places in a codebase.  To do this, type either a partial or full function name into Instruments’ symbol search box, located above and to the right of the call tree. If profiling a slice of gameplay, expand PlayerLoop and collapse all the methods beneath it. If profiling startup time, expand UnityLoadApplication and collapse the methods beneath it.  The total number of milliseconds wasted on a specific operation can be roughly estimated by looking at the total time spent in PlayerLoop or UnityLoadApplication and subtracting the number of milliseconds located in the self column.
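As a rough worked example of that subtraction, here is a small Python sketch. The three readings are made up, and comparing the result against the unfiltered PlayerLoop time is an extra step assumed here for context, not something Instruments does for you.

```python
def estimated_cost_ms(total_ms: float, self_ms: float) -> float:
    """Rough time attributable to the searched symbol beneath one root method."""
    return total_ms - self_ms


# Made-up readings, taken with a symbol typed into the search box and
# PlayerLoop expanded with everything beneath it collapsed.
filtered_total_ms = 184.0   # total time shown for PlayerLoop
filtered_self_ms = 4.0      # value in the self column

# Made-up reading from the same run with the search box cleared (assumption).
unfiltered_total_ms = 3600.0

cost_ms = estimated_cost_ms(filtered_total_ms, filtered_self_ms)
share = cost_ms / unfiltered_total_ms
print(f"{cost_ms:.0f} ms, about {share:.1%} of PlayerLoop")
```

A few percent spread across hundreds of call sites never shows up as a single hotspot, which is why it is worth adding it up like this.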

Common methods to look for:
– “Box(” and “box”: these indicate that C# value boxing is occurring; most instances of boxing are trivially fixed.
– “Concat”: string concatenation is often easily optimized away.
– “CreateScriptingArray”: all Unity APIs that return arrays allocate new copies of those arrays. Minimize calls to these methods.
– “Reflection”: reflection is slow. Use this to estimate the time lost to reflection and eliminate it where possible.
– “FindObjectOfType”: use this to locate repeated or unnecessary calls to FindObjectOfType, or other known-slow Unity APIs.
– “Linq”: examine the time lost to creating and discarding Linq queries; consider replacing hotspots with manually optimized methods.
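If you have noted down a handful of rows from the call tree, a few lines of Python can add up the time per search term. This is only a sketch under assumptions: the rows are typed in by hand, the symbol names and timings are invented, and the matching is a plain case-insensitive substring test rather than anything Instruments provides.

```python
from collections import defaultdict

# Search terms taken from the list above.
SEARCH_TERMS = (
    "Box",
    "Concat",
    "CreateScriptingArray",
    "Reflection",
    "FindObjectOfType",
    "Linq",
)


def tally_by_term(rows, terms=SEARCH_TERMS):
    """Sum the milliseconds of every row whose symbol contains a search term."""
    totals = defaultdict(float)
    for symbol, ms in rows:
        lowered = symbol.lower()
        for term in terms:
            if term.lower() in lowered:
                totals[term] += ms
    return dict(totals)


# Hypothetical (symbol, milliseconds) rows; none of these come from a real trace.
rows = [
    ("Box(Int32)", 12.5),
    ("String_Concat", 8.0),
    ("String_Concat", 4.0),
    ("System_Linq_Enumerable_Where", 5.5),
    ("Object_FindObjectOfType", 20.0),
    ("Enemy_Update", 30.0),
]

for term, ms in sorted(tally_by_term(rows).items(), key=lambda kv: -kv[1]):
    print(f"{term}: {ms} ms")
```

Substring matching is deliberately crude: a term like “Box” will also match a method called something like HitboxUpdate, so treat the totals as a starting point and check the matching rows by eye.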

As well as profiling CPU time, Instruments also allows you to profile memory usage. Instruments’ Allocations profiler provides two probes that offer detailed views into the memory usage of an application. The Allocations probe permits inspection of the objects resident within memory during a specific time-span. The VM Tracker probe permits monitoring of the dirty memory heap size, which is the primary metric used by iOS to determine when an application must be forcibly closed.

Both probes will run simultaneously when selecting the Allocations profiler in Instruments. As usual, begin a profiling run by pressing the red Record button.

To set up the Allocations probe correctly, ensure the following settings are correct in the Detail tab on the right-hand side of Instruments.   Under Display Settings (middle option), ensure Allocation Lifespan is set to Created & Persistent.  Under Record Settings (left option), ensure Discard events for freed memory is checked.

The most useful display for examining memory behavior is the Statistics display, which is the default display when using the Allocations probe. This display shows a timeline. When used with the recommended settings, the graph displays blue lines indicating the time and magnitude of memory allocations that are still live. You can use this graph to spot long-lived or leaked memory by simply repeating the scenario under test and ensuring that no blue lines remain alive between runs.
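The same "repeat the scenario and see what sticks around" check can be sketched in Python if you jot down the persistent bytes per category after each run. The snapshots below are invented, as are the category names, and the rule used (flag anything that grows on every single run) is an assumption for illustration rather than a feature of Instruments.

```python
def growing_categories(snapshots: list[dict[str, int]]) -> list[str]:
    """Categories whose persistent bytes increase in every successive snapshot."""
    if len(snapshots) < 2:
        return []
    # Only compare categories that appear in every snapshot.
    names = set(snapshots[0]).intersection(*snapshots[1:])
    suspects = []
    for name in sorted(names):
        sizes = [snap[name] for snap in snapshots]
        if all(later > earlier for earlier, later in zip(sizes, sizes[1:])):
            suspects.append(name)
    return suspects


# Hypothetical persistent bytes noted after runs 1, 2 and 3 of the same scenario.
snapshots = [
    {"Texture2D": 4_000_000, "Malloc 16 Bytes": 120_000, "AudioClip": 900_000},
    {"Texture2D": 4_000_000, "Malloc 16 Bytes": 180_000, "AudioClip": 900_000},
    {"Texture2D": 4_000_000, "Malloc 16 Bytes": 240_000, "AudioClip": 850_000},
]

print(growing_categories(snapshots))
```

A category that is large but flat between runs is probably just resident data; one that climbs every time you repeat the scenario is the one to chase.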

Another useful display is the Call Trees display. It displays the line of code at which allocations are performed, along with the amount of memory consumption the line of code is responsible for.

Below you can see that about 25% of the total memory usage of the application under test is solely due to shaders. Given the shaders’ location in the loading thread, these must be the standard shaders that are bundled with default Unity projects and loaded at application startup time.

[Screenshot: Call Trees display in the Allocations probe, showing the memory attributed to shaders]

As before, once you have identified a hotspot, what you do with it is totally dependent on your project.

So there you go.  A brief guide to Instruments. 1000(ish) words and no A-Team references. We don’t want to get into trouble like last time. Copyright violations are officially Not Funny™.

The Enterprise Support team is creating more of these guides, and we will be posting the full versions of our Best Practice guides in the coming months!

We love it when a plan comes together.

Comments (7)

  1. کریو

    February 20, 2016 at 11:43 am / 

    Tnx dear admin very good ;) کريو

  2. Luis Correa

    February 4, 2016 at 3:11 am / 

    All my method names appear as memory addresses, am I doing something wrong?

    1. r618

      February 4, 2016 at 4:52 am / 

      build in Debug build configuration – you’re missing symbols

    2. Ian D.

      February 4, 2016 at 10:59 am / 

      Make sure you’re building a dSYM file. In Xcode, click on the top-level project icon in the Project Navigator, then go to “Build Settings”. Under Build Options, look for the Debug Information Format setting, and make sure the Release setting is set to DWARF with dSYM.

      If that doesn’t work, make sure your Xcode project is set up to use the Release configuration when profiling. Go to Product > Scheme > Edit Scheme…. In the menu that pops up, click the Profile entry and ensure that Build Configuration is set to Release.

      Building in Debug mode can alter timings, as compile-time optimizations won’t be performed. It’s still a good way to look for big performance hotspots, but profiling in Release is definitely more accurate at getting timings that correspond to your final product.

      1. Luis Correa

        February 4, 2016 at 4:12 pm / 

        That’s probably it! I usually skip DSYM for internal testing release builds.

      2. r618

        February 4, 2016 at 6:55 pm / 

        yes, you’re right, i completely forgot about/omitted profiling context

  3. r618

    February 2, 2016 at 7:25 pm / 

    a valuable post, much needed for iOS profiling
    ( much better than zerolight ninja bragging – sorry couldn’t resist ^^)
