Parsing JSON Data Efficiently on Android: JsonReader

March 2, 2012 Robert Szumlakowski

I recently worked on an Android app that parsed large amounts of JSON-formatted data. We would routinely allocate 20 or 30 kilobytes of heap space for Strings of JSON-formatted text retrieved from a server. We’d pass the strings to the Android framework’s JSONObject class in order to construct our data models. After downloading many objects in a short amount of time we discovered that we were frequently running out of memory and decided that we needed a different strategy to parse the content we downloaded from our server: all those Strings and JSONObjects on the heap were killing us.

When parsing XML-formatted text, it’s well known that there are two general strategies: DOM and SAX. DOM (“Document Object Model”) loads the entire content into memory and permits the developer to query the data as they wish. SAX (“Simple API for XML”) presents the data as a stream: the developer waits for their desired pieces of data to appear and saves only the parts they need. DOM is considered easier to use but SAX uses much less memory.
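To make the streaming idea concrete, here is a minimal SAX sketch using only the JDK's built-in parser. The class name SaxTitleCollector, the collect method, and the <title> element it looks for are all made up for illustration; the point is only that the handler sees the document piece by piece and keeps just the text it cares about.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: collects the text of every <title> element
// while the parser streams through the document.
final class SaxTitleCollector extends DefaultHandler {
    private final List<String> titles = new ArrayList<String>();
    private StringBuilder current; // non-null only while inside a <title>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("title".equals(qName)) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element, so append rather than assign.
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("title".equals(qName) && current != null) {
            titles.add(current.toString());
            current = null;
        }
    }

    // Parses the given XML text and returns the titles in document order.
    static List<String> collect(String xml) throws Exception {
        SaxTitleCollector handler = new SaxTitleCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }
}
```

Everything outside the <title> elements is seen once by the callbacks and then discarded, which is where the memory savings come from.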

But what about JSON?

The JSONObject class has been in Android since the very beginning (API level 1), is easy to use, and is probably the developer’s default choice for their JSON parsing needs. However, the JSONObject class is like DOM; it reads the whole object into memory. On mobile devices with limited resources that’s not always the best idea.

Google decided to provide an alternative, the JsonReader class, which presents data as a stream, similar to SAX parsers. This class, however, is only available in API level 11 and up (Honeycomb and Ice Cream Sandwich). At the time of this writing, those devices make up less than 5% of the existing user base, so the framework class won’t help you if you want to target Froyo and Gingerbread devices.

Thankfully, Google has made this class available in an open source library: google-gson. You can download this JAR file and include it in any Java project, not just Android applications. We used version 2.1 of this library and succeeded in our goal of greatly reducing our memory usage.

I’ll demonstrate how to use this class with an example: parsing some JSON data from the Wikipedia API. In particular, I’ll show how to parse all the links on the Wikipedia page for mangos:

API Call: http://en.wikipedia.org/w/api.php?action=parse&page=Mango&format=json&prop=links

Result:

{
  "parse": {
    "title": "Mango",
    "links": [
      {
        "ns": 14,
        "*": "Category:Articles containing Tamil language text",
        "exists": ""
      },
      {
        "ns": 14,
        "*": "Category:Articles containing Malayalam language text",
        "exists": ""
      } ...
}

The links are contained in an array of smaller objects. The link text itself is contained in fields named “*”.

With a stream-based JSON parser, we no longer needed to allocate Strings holding the entire JSON content. We could read from InputStream objects directly (after wrapping them in InputStreamReader and BufferedReader objects). Our code looked much like this:

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;

import com.google.gson.stream.JsonReader;

protected T generateData(InputStream inputStream) throws Exception {
	T t = null;
	InputStreamReader inputStreamReader = null;
	BufferedReader bufferedReader = null;
	JsonReader jsonReader = null;
	try {
		inputStreamReader = new InputStreamReader(inputStream);
		bufferedReader = new BufferedReader(inputStreamReader);
		// Hand the buffered reader (not the raw InputStreamReader) to the parser.
		jsonReader = new JsonReader(bufferedReader);
		t = generateModel(jsonReader);
		// ...

We made heavy use of generic classes in our model and caching code. In this case, T is the type of model object we’re attempting to construct. Our code can handle data coming off the network or off the SD-card, since we’re just treating the data as simple InputStream objects. After wrapping them in InputStreamReader and BufferedReader objects, we can attempt to parse the JSON text and build our model objects.
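The wrapping and clean-up pattern can be sketched without any library dependency. In the version below, the StreamModels class and the StreamParser callback are hypothetical names standing in for our model code; in the real app the callback would hand the reader to a JsonReader. The explicit "UTF-8" charset is also an assumption about the server's encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

final class StreamModels {
    /** Hypothetical callback that builds a model of type T from character data. */
    interface StreamParser<T> {
        T parse(Reader reader) throws IOException;
    }

    /** Wraps the stream, runs the parser, and always closes the reader afterwards. */
    static <T> T generateData(InputStream inputStream, StreamParser<T> parser) throws IOException {
        // Name the charset explicitly instead of relying on the platform default.
        Reader reader = new BufferedReader(new InputStreamReader(inputStream, "UTF-8"));
        try {
            return parser.parse(reader);
        } finally {
            // Closing the outermost wrapper also closes the wrapped streams.
            reader.close();
        }
    }
}
```

Because the method only sees an InputStream, the same code path works whether the bytes come from a socket or from a cached file.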

Here’s where it gets interesting:

import java.io.IOException;
import com.google.gson.stream.JsonToken;

// "links" is assumed to be a List<String> field of the enclosing model class.
public void createFromJSON( JsonReader jsonReader ) throws IOException {
	jsonReader.beginObject(); // top-level object
	while( jsonReader.hasNext() ) {
		final String name = jsonReader.nextName();
		final boolean isNull = jsonReader.peek() == JsonToken.NULL;
		if( name.equals( "parse" ) && !isNull ) {
			jsonReader.beginObject(); // the "parse" object
			while( jsonReader.hasNext() ) {
				final String innerName = jsonReader.nextName();
				final boolean isInnerNull = jsonReader.peek() == JsonToken.NULL;
				if( innerName.equals( "links" ) && !isInnerNull ) {
					jsonReader.beginArray(); // the "links" array
					while( jsonReader.hasNext() ) {
						jsonReader.beginObject(); // one link entry
						while( jsonReader.hasNext() ) {
							final String innerInnerName = jsonReader.nextName();
							final boolean isInnerInnerNull = jsonReader.peek() == JsonToken.NULL;
							if( innerInnerName.equals( "*" ) && !isInnerInnerNull ) {
								links.add( jsonReader.nextString() );
							}
							else {
								// Not a field we want (or a null): skip its value.
								jsonReader.skipValue();
							}
						}
						jsonReader.endObject();
					}
					jsonReader.endArray();
				}
				else {
					jsonReader.skipValue();
				}
			}
			jsonReader.endObject();
		}
		else {
			jsonReader.skipValue();
		}
	}
	jsonReader.endObject();
}

It looks big and ugly, but this code does its job well: it skips through the entire stream of JSON text and picks out only the parts it wants to keep. The ugly part is navigating through the parts of the stream you’re less interested in. Here are some general strategies we discovered:

  • Each time that the JSON text contains an object (delimited by curly brackets: “{ … }”) you need to call the beginObject method at the start and endObject at the end. In between, you need to loop through the object contents (i.e.: each field) and look for the ones you want.
  • Each time that the JSON text contains an array (delimited by square brackets: “[ … ]”) you need to call the beginArray method at the start and endArray at the end. In between, you need to loop through all the array elements.
  • You loop through chunks of data at the same “level” in the JSON text by repeatedly calling the hasNext method and then handling the next data item.
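  • For each field you examine, you’ll need to call the nextName method to read its name. If you want its value, you can call a method like nextString to read it. If you don’t need it, you must call skipValue to get past it.
  • You should check whether a value is null by calling peek before you read it: nextString throws an IllegalStateException if the next token is a JSON null. A null you don’t want can be consumed with skipValue or nextNull.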
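The strategies above can be condensed into a small, self-contained sketch that runs against an in-memory string. The class LinkTitles and the helpers nextStringOrNull and readLinks are hypothetical names, and the sample JSON is made up in the shape of the "links" array from the Wikipedia response; the only requirement is the google-gson JAR on the classpath.

```java
import com.google.gson.stream.JsonReader;
import com.google.gson.stream.JsonToken;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

final class LinkTitles {
    /** Returns the next string value, or null after consuming a JSON null. */
    static String nextStringOrNull(JsonReader reader) throws IOException {
        if (reader.peek() == JsonToken.NULL) {
            reader.nextNull();
            return null;
        }
        return reader.nextString();
    }

    /** Reads an array of objects and keeps each non-null "*" field. */
    static List<String> readLinks(JsonReader reader) throws IOException {
        List<String> links = new ArrayList<String>();
        reader.beginArray();
        while (reader.hasNext()) {
            reader.beginObject();
            while (reader.hasNext()) {
                if (reader.nextName().equals("*")) {
                    String title = nextStringOrNull(reader);
                    if (title != null) {
                        links.add(title);
                    }
                } else {
                    reader.skipValue(); // a field we do not need
                }
            }
            reader.endObject();
        }
        reader.endArray();
        return links;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data: one usable link and one null link.
        String json = "[{\"ns\":14,\"*\":\"Category:A\",\"exists\":\"\"},{\"ns\":0,\"*\":null}]";
        JsonReader reader = new JsonReader(new StringReader(json));
        try {
            System.out.println(readLinks(reader)); // prints [Category:A]
        } finally {
            reader.close();
        }
    }
}
```

Pulling the null check into one helper and the array loop into another keeps the nesting shallower than a single method that handles every level inline.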

Once we figured things out, we found it much more efficient to parse JSON using these techniques. For anything other than the simplest JSON files, I don’t think I would recommend doing it any other way now.

 
