Optimize data access for mobile phone apps

Justin James shares tricks for how to load data-driven mobile apps with a lot more speed than parsing XML.

I've been experimenting with Windows Phone 7 development, and I have not been 100% happy with the process. (For details, read "My first Windows Phone 7 app development project" and "The Windows Phone 7 App Hub experience.") However, an interesting aspect of my experiment is that the limitations of mobile devices (and Windows Phone 7 specifically) are forcing me to dust off some old-school performance techniques.

The major limitation I encountered with Windows Phone 7 is that it does not support WCF Data Services natively. There is a library to remedy this problem, but unfortunately, it does not support authentication, and many data services will need authentication. You can manually make the request and parse the XML yourself if you really want to, but it is clumsy.

The other issue is that publishing data via Web services carries ongoing costs for the publisher that are directly linked to usage rates, but the App Hub publishing model does not allow for subscription fees at this time. If your application is popular, the last thing you need is to be selling an app for 99 cents that costs you 20 cents per user per month to operate.

Another concern with using Web services is that the Windows Phone 7 application guidelines are very strict about delays related to networking; you cannot make these requests synchronously, and you must have provisions for cancellation for "long running" operations. In my experience, an application was rejected even though the service it called never took more than a second or two to return results; I still needed to provide a Cancel button for that scenario.

Because trying to access Web services is so clumsy right now, and you need to be mindful of the need to support cancellation, an attractive alternative is to put the data locally and work with it there.

Unfortunately, Windows Phone 7 also lacks a local database system. At best, your local data options are XML files and text files.

If you do this kind of work on a desktop PC, it's not a big deal; by default, people will just throw the data into an XML file and have a great day. The problem is that XML is a very inefficient file format, particularly when it comes to parsing and loading, and mobile devices lack CPU power. Depending on how much data you have, your application could be very unresponsive while loading the data, which will get your application rejected or force you to support cancellation for loading the data. And honestly, what is the point of a data-driven app where the users cancel the data that is being loaded?

So I've been digging into my bag of tricks (I'm glad I remember Perl), and here are some ways you can load data with a lot more speed than parsing XML.

  • CSV: CSV is a tried and true data format, and it loads very fast. There are a zillion samples on the Internet of how to work with CSV data.
  • Fixed width records: If you need even more speed than CSV offers, and you are willing to give up some storage space in the process, fixed width records are even faster than CSV. You can find lots of examples online of how to implement a data system using fixed width records.
  • Indexing: You can create a simple indexing system to help locate your records in a jiffy. If your application only needs to read data, this is downright easy. Indexing provides awesome speed boosts with fixed width records, since you can look up the row number in the index, multiply it by the record size to get the byte offset, and move directly there. It can provide an advantage for delimited files too, but usually only when you would otherwise have to parse the records themselves to find the data. Load the index into RAM for additional benefits.
  • Data file partitioning: Sometimes data can logically be split amongst smaller files, which can help your performance with delimited data files. For example, if you have data that can be grouped by country, put each country's data into a separate file; this way you reduce the number of reads needed to find data, even if you know what line it is on. Fixed width records with an index usually will not benefit from data partitioning, since they can directly access the data.
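
To make the fixed width and indexing ideas concrete, here is a minimal sketch. It is written in Python purely for illustration and is not Windows Phone 7 code. The record layout (a 2-byte country code plus a 30-byte name), the helper names, and the sample rows are all assumptions made up for this example, and an in-memory `io.BytesIO` buffer stands in for a data file on the device. The index is a plain dictionary, held in RAM, that maps a key to a row number; a lookup multiplies that row number by the record size and seeks straight to the byte offset.

```python
import io

# Hypothetical layout: 2-byte country code + 30-byte name, no line endings.
CODE_WIDTH = 2
NAME_WIDTH = 30
RECORD_SIZE = CODE_WIDTH + NAME_WIDTH


def pack_record(code, name):
    """Pad each field with spaces so every record has the same byte length."""
    code_b = code.encode("ascii")
    name_b = name.encode("ascii")
    if len(code_b) > CODE_WIDTH or len(name_b) > NAME_WIDTH:
        raise ValueError("field too long for fixed width record")
    return code_b.ljust(CODE_WIDTH) + name_b.ljust(NAME_WIDTH)


def build_file(rows):
    """Write all records and build an in-memory index of key -> row number."""
    data = io.BytesIO()  # stand-in for a real file opened in binary mode
    index = {}
    for row_number, (code, name) in enumerate(rows):
        data.write(pack_record(code, name))
        index[code] = row_number
    data.seek(0)
    return data, index


def read_record(data, row_number):
    """Jump directly to a record: byte offset = row number * record size."""
    data.seek(row_number * RECORD_SIZE)
    raw = data.read(RECORD_SIZE)
    code = raw[:CODE_WIDTH].decode("ascii").rstrip()
    name = raw[CODE_WIDTH:].decode("ascii").rstrip()
    return code, name


def lookup(data, index, code):
    """Find a record by key without scanning or parsing any other record."""
    row_number = index.get(code)
    if row_number is None:
        return None
    return read_record(data, row_number)


rows = [("CA", "Canada"), ("FR", "France"), ("JP", "Japan"), ("US", "United States")]
data, index = build_file(rows)
print(lookup(data, index, "JP"))  # row 2, byte offset 64 -> ('JP', 'Japan')
```

The trade-off shows up in `pack_record`: every name occupies 30 bytes no matter how short it is, which is the storage space you give up in exchange for never having to parse a delimiter.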

These are just some of the optimization techniques that I've found for working with data in text files. Do you have any suggestions that you'd like to share with the TechRepublic community?


Disclosure of Justin's industry affiliations: Justin James has a contract with Spiceworks to write product buying guides; he has a contract with OpenAmplify, which is owned by Hapax, to write a series of blogs, tutorials, and articles; and he has a contract with OutSystems to write articles, sample code, etc.


Justin James is the Lead Architect for Conigent.


Time to open a post and play "Where's Waldo?" Must view the single post. Not in Print/View All mode.


I guess it depends on your circumstances. I have been loving the slickness of YAML as a data storage format when working in Ruby. A hash of other hashes and arrays, to (theoretically) arbitrary depth of nesting, can be stored and loaded with trivial ease using Ruby's YAML library. CSV is probably still faster, of course, but considerably less flexible in terms of data structure nesting. As an added bonus, the way YAML is represented is a lot easier on the eyes if you want to be able to read it directly by opening the data file in a text editor.

If you don't care at all about reading it by eye or making it portable between programming languages (or even programs, maybe), there's also the simple option of directly dumping a complex data structure to file. For instance, Ruby's inspect method produces the contents of a variable in the same format you'd use to represent the data structure if assigning it to the variable in code, as a string you could write to file. I'm not sure, of course, in part because I have not actually run any benchmarks, but I suspect that would be even faster than CSV -- because you would not have to translate the data format at all. Of course, that last example might involve an injudicious use of eval when loading your data, so I'm not sure I'm so keen on that one for purposes of safety:

    ~> irb
    irb(main):001:0> foo = {:one => [0,1], :two => 2}
    => {:one=>[0, 1], :two=>2}
    irb(main):002:0> File.open('tempfile.txt', 'w') do |f|
    irb(main):003:1*   f.write foo.inspect
    irb(main):004:1> end
    => 23
    irb(main):005:0> bar = {}
    => {}
    irb(main):006:0> File.open('tempfile.txt', 'r') do |f|
    irb(main):007:1*   bar = eval(f.read)
    irb(main):008:1> end
    => {:one=>[0, 1], :two=>2}
    irb(main):009:0> bar[:one][1]
    => 1
    irb(main):010:0> bar[:two]
    => 2

edit: redid code to fit within TR comment column width

Justin James

... Ruby is not a really great option for mobile phones in terms of support, as far as I can tell. There are implementations out there (rumor has it you can even shoehorn IronRuby onto WP7), but I get the impression that there just isn't the tooling needed to really make things easy on a developer. I hope I'm wrong, though. Maybe I'll look into using IronRuby on WP7? Even just a simple REPL? That would be a potentially interesting project! J.Ja
