Justin James shares tricks for how to load data-driven mobile apps with a lot more speed than parsing XML.
I've been experimenting with Windows Phone 7 development, and I have not been 100% happy with the process. (For details, read "My first Windows Phone 7 app development project" and "The Windows Phone 7 App Hub experience.") However, an interesting aspect of my experiment is that the limitations of mobile devices (and Windows Phone 7 specifically) are forcing me to dust off some old-school performance techniques.
The major limitation I encountered with Windows Phone 7 is that it does not support WCF Data Services natively. There is a library to remedy this problem, but unfortunately, it does not support authentication, and many data services will need authentication. You can manually make the request and parse the XML yourself if you really want to, but it is clumsy.
The other issue is that, for the publisher, serving data via Web services has ongoing costs directly linked to usage rates, but the App Hub publishing model does not allow for subscription fees at this time. If your application is popular, the last thing you need is to be selling an app for 99 cents that costs you 20 cents per user per month to operate.
Another concern with using Web services is that the Windows Phone 7 application guidelines are very strict about delays related to networking; you cannot make these requests synchronously, and you must have provisions for cancellation for "long running" operations. In my experience, an application was rejected because it called a service that never took more than a second or two to return with results, and I needed to provide a Cancel button for that scenario.
Because trying to access Web services is so clumsy right now, and you need to be mindful of the need to support cancellation, an attractive alternative is to put the data locally and work with it there.
Unfortunately, Windows Phone 7 also lacks a local database system. At best, your local data options are XML files and text files.
If you do this kind of work on a desktop PC, it's not a big deal; by default, people will just throw the data into an XML file and have a great day. The problem is that XML is a very inefficient file format, particularly for parsing and loading, and mobile devices lack CPU power. Depending on how much data you have, your application could be very unresponsive while loading the data, which will get your application rejected or force you to support cancellation for loading the data. And honestly, what is the point of a data-driven app where the users cancel the data that is being loaded?
So I've been digging into my bag of tricks (I'm glad I remember Perl), and here are some ways you can load data with a lot more speed than parsing XML.
- CSV: CSV is a tried and true data format, and it loads very fast. There are a zillion samples on the Internet of how to work with CSV data.
- Fixed width records: If you need even more speed than CSV offers, and you are willing to give up some storage space in the process, fixed width records are even faster than CSV. You can find lots of examples online of how to implement a data system using fixed width records.
- Indexing: You can create a simple indexing system to help locate your records in a jiffy. If your application only needs to read data, this is downright easy. Indexing provides awesome speed boosts with fixed width records, since you can look up the row number in the index, multiply it by the record size to get the byte offset, and move directly there. An index can help with delimited files too, but usually only when you would otherwise have to parse each record to find the one you want. Load the index into RAM for additional benefits.
- Data file partitioning: Sometimes data can logically be split amongst smaller files, which can help your performance with delimited data files. For example, if you have data that can be grouped by country, put each country's data into a separate file; this way you reduce the number of reads needed to find data, because even if you know what line a record is on, you still have to read through the lines before it to get there. Fixed width records with an index usually will not benefit from data partitioning, since they can directly access the data.
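To make the fixed width and indexing ideas concrete, here is a minimal sketch. It is written in Python only for brevity; the same arithmetic applies in whatever language the app is written in. The record layout (an 8-byte key followed by a 24-byte name), the function names, and the sample rows are all assumptions made up for illustration. The index is rebuilt from the data here to keep the example self-contained, whereas in practice you would probably generate it ahead of time and ship it with the data file.

```python
import io

# Assumed layout for illustration: 8-byte key + 24-byte name = 32 bytes per record.
KEY_SIZE = 8
RECORD_SIZE = 32


def pack_record(key, name):
    """Pad each field with spaces so every record is exactly RECORD_SIZE bytes."""
    key_bytes = key.encode("ascii")
    name_bytes = name.encode("ascii")
    if len(key_bytes) > KEY_SIZE or len(name_bytes) > RECORD_SIZE - KEY_SIZE:
        raise ValueError("field too long for the fixed width layout")
    return key_bytes.ljust(KEY_SIZE) + name_bytes.ljust(RECORD_SIZE - KEY_SIZE)


def build_data_file(rows):
    """Concatenate packed records; there are no delimiters or newlines to parse."""
    return b"".join(pack_record(key, name) for key, name in rows)


def build_index(data):
    """Map each key to its row number; small enough to keep in RAM."""
    index = {}
    for row in range(len(data) // RECORD_SIZE):
        start = row * RECORD_SIZE
        key = data[start:start + KEY_SIZE].decode("ascii").rstrip()
        index[key] = row
    return index


def read_record(stream, index, key):
    """Seek straight to row * RECORD_SIZE instead of scanning the file."""
    row = index.get(key)
    if row is None:
        return None
    stream.seek(row * RECORD_SIZE)
    raw = stream.read(RECORD_SIZE)
    return raw[KEY_SIZE:].decode("ascii").rstrip()


# Hypothetical sample data; BytesIO stands in for a file opened in binary mode.
data = build_data_file([("US", "United States"), ("CA", "Canada"), ("MX", "Mexico")])
index = build_index(data)
stream = io.BytesIO(data)
print(read_record(stream, index, "CA"))  # prints "Canada"
```

The trade-off mentioned above shows up directly: each of these three records takes 32 bytes no matter how short the name is, but finding a record costs one dictionary lookup, one seek, and one read.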
These are just some of the optimization techniques that I've found for working with data in text files. Do you have any suggestions that you'd like to share with the TechRepublic community?
J.Ja
Disclosure of Justin's industry affiliations: Justin James has a contract with Spiceworks to write product buying guides; he has a contract with OpenAmplify, which is owned by Hapax, to write a series of blogs, tutorials, and articles; and he has a contract with OutSystems to write articles, sample code, etc.