19 December 2011

Watch out for XmlPullParser.nextText()

Jesse Wilson

[This post is by Jesse Wilson from the Dalvik team. —Tim Bray]

Using XmlPullParser is an efficient and maintainable way to parse XML on Android. Historically Android has had two implementations of this interface:

The implementation from Xml.newPullParser() had a bug where calls to nextText() didn’t always advance to the END_TAG as the documentation promised it would. As a consequence, some apps may be working around the bug with extra calls to next() or nextTag():

    public void parseXml(Reader reader)
            throws XmlPullParserException, IOException {
        XmlPullParser parser = Xml.newPullParser();
        parser.setInput(reader);

        parser.nextTag();
        parser.require(XmlPullParser.START_TAG, null, "menu");
        while (parser.nextTag() == XmlPullParser.START_TAG) {
            parser.require(XmlPullParser.START_TAG, null, "item");
            String itemText = parser.nextText();
            parser.nextTag(); // this call shouldn’t be necessary!
            parser.require(XmlPullParser.END_TAG, null, "item");
            System.out.println("menu option: " + itemText);
        }
        parser.require(XmlPullParser.END_TAG, null, "menu");
    }

    public static void main(String[] args) throws Exception {
        new Menu().parseXml(new StringReader("<?xml version='1.0'?>"
                + "<menu>"
                + "  <item>Waffles</item>"
                + "  <item>Coffee</item>"
                + "</menu>"));
    }

In Ice Cream Sandwich we changed Xml.newPullParser() to return a KxmlParser and deleted our ExpatPullParser class. This fixes the nextTag() bug. Unfortunately, apps that currently work around the bug may crash under Ice Cream Sandwich:

org.xmlpull.v1.XmlPullParserException: expected: END_TAG {null}item (position:START_TAG <item>@1:37 in java.io.StringReader@40442fa8) 
     at org.kxml2.io.KXmlParser.require(KXmlParser.java:2046)
     at com.publicobject.waffles.Menu.parseXml(Menu.java:25)
 at com.publicobject.waffles.Menu.main(Menu.java:32)

The fix is to call nextTag() after a call to nextText() only if the current position is not an END_TAG:

  while (parser.nextTag() == XmlPullParser.START_TAG) {
      parser.require(XmlPullParser.START_TAG, null, "item");
      String itemText = parser.nextText();
      if (parser.getEventType() != XmlPullParser.END_TAG) {
          parser.nextTag();
      }
      parser.require(XmlPullParser.END_TAG, null, "item");
      System.out.println("menu option: " + itemText);
  }

The code above will parse XML correctly on all releases. If your application uses nextText() extensively, use this helper method in place of calls to nextText():

  private String safeNextText(XmlPullParser parser)
          throws XmlPullParserException, IOException {
      String result = parser.nextText();
      if (parser.getEventType() != XmlPullParser.END_TAG) {
          parser.nextTag();
      }
      return result;
  }

Moving to a single XmlPullParser simplifies maintenance and allows us to spend more energy on improving system performance.