Showing posts with label Profiling. Show all posts
Showing posts with label Profiling. Show all posts

12 December 2010

StrictMode API for Built-In Performance Monitoring

[This post is by Brad Fitzpatrick, an Android Software Engineer who worries unreasonably about responsiveness. —Tim Bray]

Back Story

One great thing about Google is “20% time”: spending 20% of your time working on projects outside your main focus area. When I joined Google, I bounced all over the place, often joking that I had seven 20% projects. One project I kept coming back to was Android. I loved its open nature, giving me access to do whatever I wanted, including opening my garage door when I approached my house on my motorcycle. I really wanted it to succeed but I worried about one thing: It wasn’t always super smooth. Animations would sometimes stutter and UI elements weren’t always immediately responsive to input. It was pretty obvious that things were sometimes happening on the wrong thread.

As a heavy SMS user, one of my 20% projects during the Cupcake (Android 1.5) release was speeding up the Messaging app and making it feel smoother. I got the app to a happy state and then continued bouncing between other 20% projects. When the Donut (Android 1.6) release came out, I noticed that a few of my Messaging optimizations had been accidentally broken. I was sad for a bit but then I realized what Android really needed was always-on, built-in, pervasive performance monitoring.

I joined the Android team full-time just over a year ago and spent a lot of time investigating Froyo performance issues, in particular debugging ANRs (those annoying dialogs you get when an application stalls its main thread’s Looper). Debugging ANRs with the tools at hand was painful and boring. There wasn’t enough instrumentation to find the causes, especially when multiple processes were involved (doing Binder or ContentResolver operations to Services or ContentProviders in other processes). There had to be a better way to track down latency hiccups and ANRs...

Enter StrictMode

“I see you were doing 120 ms in a 16 ms zone...”


StrictMode is a new API in Gingerbread which primarily lets you set a policy on a thread declaring what you’re not allowed to do on that thread, and what the penalty is if you violate the policy. Implementation-wise, this policy is simply a thread-local integer bitmask.

By default everything is allowed and it won’t get in your way unless you want it to. The flags you can enable in the thread policy include:

  • detect disk writes

  • detect disk reads

  • detect network usage

  • on a violation: log

  • on a violation: crash

  • on a violation: dropbox

  • on a violation: show an annoying dialog

In addition, StrictMode has about a dozen hooks around most of the places that hit the disk (in java.io.*, android.database.sqlite.*, etc) and network (java.net.*) which check the current thread’s policy, reacting as you’ve asked.

StrictMode’s powerful part is that the per-thread policies are propagated whenever Binder IPC calls are made to other Services or Providers, and stack traces are stitched together across any number of processes.

Nobody wants to be slow

You might know all the places where your app does disk I/O, but do you know all the places where the system services and providers do? I don’t. I’m learning, but it’s a lot of code. We’re continually working to clarify performance implications in the SDK docs, but I usually rely on StrictMode to help catch calls that inadvertently hit the disk.

Background on disks on phones

Wait, what’s wrong with hitting the disk? Android devices are all running flash memory, right? That’s like a super-fast SSD with no moving parts? I shouldn’t have to care? Unfortunately, you do.

You can’t depend on the flash components or filesystems used in most Android devices to be consistently fast. The YAFFS filesystem used on many Android devices, for instance, has a global lock around all its operations. Only one disk operation can be in-flight across the entire device. Even a simple “stat” operation can take quite a while if you are unlucky. Other devices with more traditional block device-based filesystems still occasionally suffer when the block rotation layer decides to garbage collect and do some slow internal flash erase operations. (For some good geeky background reading, see lwn.net/Articles/353411)

The take-away is that the “disk” (or filesystem) on mobile devices is usually fast, but the 90th percentile latencies are often quite poor. Also, most filesystems slow down quite a bit as they get more full. (See slides from Google I/O Zippy Android apps talk, linked off code.google.com/p/zippy-android)

The “main” Thread

Android callbacks and lifecycle events all typically happen on the main thread (aka “UI thread”). This makes life easier most of the time, but it’s also something you need to be careful of because all animations, scrolls, and flings process their animations by callbacks on the main thread.

If you want to run an animation at 60 fps and an input event comes in (also on the main thread), you have 16 ms to run your code reacting to that input event. If you take longer than 16 ms, perhaps by writing to disk, you’ve now stuttered your animation. Disk reads are often better, but they can also take longer than 16 ms, especially on YAFFS if you’re waiting for the filesystem lock that’s held by a process in the middle of a write.

The network is especially slow and inconsistent, so you should never do network requests on your main thread. In fact, in the upcoming Honeycomb release we’ve made network requests on the main thread a fatal error, unless your app is targeting an API version before Honeycomb. So if you want to get ready for the Honeycomb SDK, make sure you’re never doing network requests on your UI thread. (see “Tips on being smooth” below.)

Enabling StrictMode

The recommended way to use StrictMode is to turn it on during development, learn from it, and turn it off before you ship your app.

For example, in your application or component’s onCreate():

 public void onCreate() {
     if (DEVELOPER_MODE) {
         StrictMode.setThreadPolicy(new StrictMode.ThreadPolicy.Builder()
                 .detectDiskReads()
                 .detectDiskWrites()
                 .detectNetwork()
                 .penaltyLog()
                 .build());
     }
     super.onCreate();
 }

Or, simply:

    public void onCreate() {
     if (DEVELOPER_MODE) {
         StrictMode.enableDefaults();
     }
     super.onCreate();
 }

That latter form was specifically added so you can target pre-Gingerbread API versions but still easily enable StrictMode using reflection or other techniques. For instance, you could be targeting Donut (Android 1.6) but still use StrictMode if you’re testing on a Gingerbread device or emulator, as long as you use enough Reflection to call StrictMode.enableDefaults().

Watching StrictMode

If you’re using penaltyLog(), the default, just run adb logcat and watch the terminal output. Any violations will be logged to your console, slightly rate-limited for duplicate elimination.

If you want to get fancier, turn on penaltyDropbox() and they’ll be written to the DropBoxManager, where you can extract them later with
adb shell dumpsys dropbox data_app_strictmode --print

Tips on being smooth

In addition to Thread and java.util.concurrent.*, check out some of the Android APIs such as Handler, AsyncTask, AsyncQueryHandler, and IntentService.

Our Experience

During Android development we have a new “dogfood” build each day that the whole team uses. Throughout the development of Gingerbread we set up our daily dogfood builds to enable StrictMode logging and upload all found violations for analysis. Every hour a MapReduce job runs and produces an interactive report of all the event loop stalls, their stack traces (including cross-process ones), their latency percentiles, which processes/packages they appear in, etc.

Using the data from StrictMode we fixed hundreds of responsiveness bugs and animation glitches all across the board. We made performance optimizations in the Android core (e.g. system services and providers) so all apps on the system will benefit, as well as fixing up tons of app-specific issues (in both AOSP apps and Google apps). Even if you’re using Froyo today, the recent updates to GMail, Google Maps, and YouTube all benefited from StrictMode data collection gathered on Gingerbread devices.

Where we couldn’t automatically speed up the system, we instead added APIs to make certain patterns easier to do efficiently. For example, there is a new method SharedPreferences.Editor.apply(), which you should be using instead of commit() if you don’t need commit()’s return value. (It turns out almost nobody ever checks it.) You can even use reflection to conditionally use apply() vs. commit() depending on the user’s platform version.

Googlers who switched from Froyo to Gingerbread without seeing all the baby steps between were shocked at how much more responsive the system became. Our friends on the Chrome team then recently added something similar. Of course, StrictMode can’t take all the credit. The new concurrent garbage collector in Gingerbread also greatly reduces latency hiccups.

The Future

The StrictMode API and its capabilities will continue to expand. We have some good stuff lined up for StrictMode in Honeycomb but let us know what else you’d like to see! I’ll be answering questions on stackoverflow.com for questions tagged “strictmode”. Thanks!

14 October 2010

Traceview War Story

I recently took my first serious look at Traceview, and it occurred to me, first, that there are probably a few other Android developers who haven’t used it and, second, that this is an opportunity to lecture sternly on one of my favorite subjects: performance improvement and profiling. This is perhaps a little bit Android-101; If you already know all about Traceview, you can stop here and go back to coding.

Making Apps Fast

Here’s a belief that I think I share with most experienced developers: For any app that is even moderately complex, you’re not smart enough to predict what the slow parts are going to be, because nobody is smart enough to predict where software bottlenecks will turn up.

So the smart way to write a fast app is to build it in the simplest way that could possibly work, avoiding egregiously-stupid thing like order-N-squared algorithms and doing I/O on the Android UI thread. Who knows, it might be fast enough, and then you’re done!

If it isn’t fast enough, don’t guess why. Measure it and find out, using a profiler. Actually I’ve been known to do this, when backed into a corner, using things like System.err.println("Entered at" + System.currentTimeMillis()); Fortunately, Android comes with a reasonably decent profiler, so you don’t have to get ugly like that.

Case Study: LifeSaver 2

I have this little utility in Android Market called LifeSaver 2, the details are on my personal blog. At one point, it reads the SMS and phone-call logs out of the system and persists them in a JSON text file on the SD card. Since this is kind of slow, it shows a nice dynamic progress bar. It occurred to me to wonder why it was kind of slow to write a few hundred records into a text file on a device that, after all, has a gigahertz processor.

Somebody who foolishly disregarded my advice above might assume that the slowdown had to be due to the ContentProvider Cursor machinery reading the system logs, or failing that, the overhead of writing to the SD card. A wiser person would instrument the code and find out. Let’s do that.

Turning On Tracing

I went into Saver.java and bracketed the code in its run() method like so:

       public void run() {

            android.os.Debug.startMethodTracing("lsd");

            // ... method body elided

            android.os.Debug.stopMethodTracing();
        }

The first call turns tracing on, the argument "lsd" (stands for Life Saver Debug, of course) tells the system to put the trace log in /sdcard/lsd.trace. Remember that doing this means you have to add the WRITE_EXTERNAL_STORAGE permission so you can save the trace info; don‘t forget to remove that before you ship.

[Update:] Android engineer Xavier Ducrohet writes to remind me: “DDMS has a start/stop profiling button in the ‘device view’. Upon clicking stop it launches TraceView with the trace file. This is not as fine grained as putting start/stopMethodTracing in your code but can be quite useful. For VMs earlier than froyo, the permission is required as well (DDMS basically automate getting the trace from the sd card and saving it locally before calling traceview). For Froyo+ VMs, the VM is able to send the trace file through the JDWP connection and the permission is not needed anymore (which is really useful).” Thanks, Xav!

Then you run your app, then you copy the output over to your computer, and fire up Traceview.

540> adb pull /sdcard/lsd.trace
541> traceview lsd

At this point, you will have noticed three things. First, turning tracing on really slows down your app. Second, the tracefile is big; in this case, 8.6M for a run that took like four seconds. Third, that traceview looks pretty cool.

The bars across the top show the app’s threads and how they dealt out the time; since the Nexus One is single-threaded CPU, they have to take turns. Let’s zero in on one 100-msec segment.

The top line is where my app code is running (the red segment is GC happening), the middle line is the UI thread and the bursts of activity are the ProgressBar updating, and I have no idea what the third thread, named HeapWorker, does, but it doesn’t seem a major contributor to the app’s runtime, so let’s ignore it.

The bottom of the screen is where the really interesting data is; it shows which of your methods burned the time, and can be sorted in a bunch of different ways. Let’s zero in on the first two lines.

Translated into English, this tells us that the top-level routine consumed 100% of the time if you include everything it called (well, yeah), but only 0.9% of the time itself. The next line suddenly starts to get real interesting: java.io.PrintStream.println(Object) and whatever it calls are using 65.2% of the app’s time. This is the code that writes the JSON out to the SD card. Right away, we know that apparently the task of pulling the data out of the phone’s ContentProviders doesn’t seem to be very expensive; it’s the output that’s hurting.

Can we conclude that the app is limited by the sluggish write performance of the SD card? Let’s drill down, which is done in the most obvious point-and-click way imaginable.

Ooh, there’s a nasty surprise. Of course, println calls (in effect) toString() on all its arguments. It looks like turning the arguments to strings is taking over half the time, before it even dispatches from println(Object) to println(String).

I’ll skip the step of drilling down into println(String) but it does suggest that yes, there is some slow I/O happening there, to the SD card. But let’s look inside that String.valueOf() call.

There’s your smoking pistol. It turns out that org.json.JSONObject.toString() is what we professional programmers call a, well, this is a family-friendly operation so I won’t go there. You can poke around inside it, but it’s just depressing.

What you can do, however, is sort all the routines by their “Exclusive” times, as in the number of CPU circles burned right there in the routine. Here are all of them that use 1% or more of the total execution time.

There’s a little bit of GC and Android framework View-wrangling stuff in there, but the display is dominated by org.jason and java.lang.StringBuilder code.

The Conclusion

The real conclusion is that in the case of this app, I actually don’t care about the performance. A user runs it a grand total of two times, once on the old phone and once on the new phone, and it’s got lots of eye candy, so I just don’t think there’s a problem.

If I did want to speed this up, it’s obvious what to do. First, either stop using JSON, or find a cheaper way to serialize it. Second, do fewer println() calls; glom the data together in one big buffer and just blast it out with a single I/O call. But, and here’s the key point, if I’d guessed where the bottlenecks were, I’d have been wrong, mostly.

Traceview is a nice tool, and if you don’t already know it, you owe it to yourself to learn it.