Microsoft's .NET technology rests on XML and employs it for Internet communication (including Web services), data storage, and many other purposes. .NET contains many functions that translate various kinds of objects—including database tables—to and from XML formats.
This article explores how you can use SOAP, an XML-based format, to store almost any kind of data, including hashtables, collections, structures, and objects. The example code is written in Visual Basic .NET.
SOAP, like XML, offers two primary advantages over previous data-storage schemes. It stores data as plain, human-readable text, and it includes information about the data it stores (metadata).
Not only does a SOAP file contain metadata explaining the file's purpose, schemas, and so on, it also contains descriptions of the information it contains. For example, when you serialize an object using SOAP, that object's structure is stored along with its data. (The SoapFormatter stores every field in an object, private as well as public, unless a field is marked <NonSerialized()>. Binary serialization stores the same fields but is faster and more compact than SOAP.)
Serialization is similar to traditional file-saving but implies deconstructing a block of data into its component parts, then labeling those parts so the structure of the block is preserved. The names of an object's fields are stored, along with the values of those fields. Serialization also implies that the target of the data might not be a classic disk file; instead it could be a temporary cache, a stream that modifies the data, an Internet address, and so on.
Persisting with SOAP
To store objects, collections, or whatever other data structures you want to save, you can use the SoapFormatter. It translates your data (and its structure) into an XML file.
An Imports statement alone is not enough for SOAP serialization, because the SoapFormatter lives in an assembly that a new project does not reference by default. First, create a new Windows-style Visual Basic .NET project. Then choose Project|Add Reference, scroll down the list of components, and add System.Runtime.Serialization.Formatters.Soap to your project. Finally, type in the code from Listing A.
As you can see when you run this sample code, the first ArrayList is stored in a disk file in SOAP format, then a new ArrayList named ReadBack is declared, and the data is read back into this new ArrayList.
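The round trip just described can be sketched roughly as follows. This is a minimal sketch, not the original Listing A: it assumes a console module rather than a Windows form, a project reference to System.Runtime.Serialization.Formatters.Soap, a version of Visual Basic that supports the Using statement, and a hypothetical file name and sample values chosen only for illustration.

```vb
Imports System.Collections
Imports System.IO
Imports System.Runtime.Serialization.Formatters.Soap

Module SoapArrayListSketch
    Sub Main()
        ' Hypothetical file name; any writable path will do.
        Dim fileName As String = "ArrayListTest.xml"

        ' Illustrative sample data.
        Dim original As New ArrayList()
        original.Add("Monday")
        original.Add("Tuesday")
        original.Add("Wednesday")

        Dim formatter As New SoapFormatter()

        ' Serialize the ArrayList to a disk file in SOAP format.
        Using fs As New FileStream(fileName, FileMode.Create)
            formatter.Serialize(fs, original)
        End Using

        ' Deserialize the file into a second ArrayList.
        Dim readBack As ArrayList
        Using fs As New FileStream(fileName, FileMode.Open)
            readBack = CType(formatter.Deserialize(fs), ArrayList)
        End Using

        ' List what came back.
        For Each item As Object In readBack
            Console.WriteLine(item)
        Next
    End Sub
End Module
```

Deserialize returns an Object, so the result has to be cast back to ArrayList before you can use it as one.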
Now, take a look at the SOAP file on your hard drive. It's even more verbose than what you might expect from the usual XML file—it contains extra SOAP metadata. The actual ArrayList in the SOAP file is in Listing B.
The ArrayList is a proprietary Microsoft structure, so it requires extra notations and references, such as xsd:anyType. Each element in an ArrayList can be a different data type, so the data type of each item in the serialization must be specified (xsi:type="SOAP-ENC:string"). By contrast, a traditional string array can contain only strings, so the resulting serialization is simpler. Here's how a classic string array is stored in a SOAP file:
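To generate that output on your own machine and compare it with the ArrayList version, something along these lines should work. It is only a sketch under the same assumptions as before (console module, Soap reference added, Using support); the file name StringArrayTest.xml and the sample strings are hypothetical.

```vb
Imports System.IO
Imports System.Runtime.Serialization.Formatters.Soap

Module SoapStringArraySketch
    Sub Main()
        ' Hypothetical file name and sample data.
        Dim fileName As String = "StringArrayTest.xml"
        Dim days() As String = {"Monday", "Tuesday", "Wednesday"}

        Dim formatter As New SoapFormatter()

        ' Serialize the string array in SOAP format.
        Using fs As New FileStream(fileName, FileMode.Create)
            formatter.Serialize(fs, days)
        End Using

        ' Read it back, casting the result to a string array.
        Dim readBack() As String
        Using fs As New FileStream(fileName, FileMode.Open)
            readBack = CType(formatter.Deserialize(fs), String())
        End Using
        Console.WriteLine("Elements read back: " & readBack.Length.ToString())

        ' Dump the SOAP text so it can be inspected.
        Console.WriteLine(File.ReadAllText(fileName))
    End Sub
End Module
```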
Serializing Multiple Disparate Structures
Not only can you serialize any kind of object or collection into an XML SOAP file and then restore it later via deserialization, you can also store mixed kinds of structures in a single SOAP file. Think of a stream as a pipe: you send almost anything you want down it, and when you want the data restored, it comes back intact and poured into the same structures it was originally in.
This process—being able to easily convert varied data structures into a stream—moves us a step closer to the elusive goal of application independence. Even legacy data structures can be easily saved, restored, and, with a little effort, transformed into newer structures.
There are two requirements when you want to mix and match data formats in an XML file. First, you must retrieve the structures in the same order in which you stored them: the stream behaves as a first-in, first-out queue. So if you store an object, followed by an ArrayList, followed by a collection, you must later deserialize this file in that same object, ArrayList, collection order.
Second, the Serializable attribute is not inheritable. Therefore, if you derive a new class from a serializable class, you must also mark the new class as <Serializable()>. A derived class can add members of its own, so only the derived class can really know whether or not it can be serialized, which is why serializability is not inherited.
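The second requirement might look like this in practice. The Customer and PreferredCustomer classes and their fields are hypothetical names invented for this sketch, and a MemoryStream stands in for a disk file to keep the example short. Removing the attribute from PreferredCustomer should make the Serialize call fail with a SerializationException.

```vb
Imports System.IO
Imports System.Runtime.Serialization.Formatters.Soap

' Hypothetical base class, marked as serializable.
<Serializable()> _
Public Class Customer
    Public Name As String
End Class

' The attribute must be repeated here; it is not inherited from Customer.
<Serializable()> _
Public Class PreferredCustomer
    Inherits Customer
    Public DiscountPercent As Integer
End Class

Module DerivedClassSketch
    Sub Main()
        Dim original As New PreferredCustomer()
        original.Name = "Ada"
        original.DiscountPercent = 10

        Dim formatter As New SoapFormatter()

        Using ms As New MemoryStream()
            ' Serialize the derived object into memory.
            formatter.Serialize(ms, original)

            ' Rewind the stream, then restore the object from it.
            ms.Position = 0
            Dim copy As PreferredCustomer = _
                CType(formatter.Deserialize(ms), PreferredCustomer)

            Console.WriteLine(copy.Name & " " & copy.DiscountPercent.ToString())
        End Using
    End Sub
End Module
```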
Mixing and matching types
You can open the serialization pipe and send individual variables or data structures down into that pipe willy-nilly. Send anything in any order, as long as you pull them back out in the same order. Store a string, an object, an integer, a Char array, or any other combination you want.
To see how this works, first ensure that the reference to System.Runtime.Serialization.Formatters.Soap has been added to your project, as described in the previous example. Then type (or paste in) the code from Listing C to see how to store a Hashtable, followed by an object and then a floating point variable, all in the same SOAP file.
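In outline, code of this kind could look like the following. Treat it as a sketch under stated assumptions rather than as Listing C itself: the Book class, its fields, and the sample values are hypothetical, and a console module is used instead of a Windows form. The same SoapFormatter instance and the same stream are used for all three Deserialize calls, which read the items back in the order they were written.

```vb
Imports System.Collections
Imports System.IO
Imports System.Runtime.Serialization.Formatters.Soap

' Hypothetical class used only for this illustration.
<Serializable()> _
Public Class Book
    Public Title As String
    Public Pages As Integer
End Class

Module MixedSerializationSketch
    Sub Main()
        Dim table As New Hashtable()
        table.Add("first", "alpha")
        table.Add("second", "beta")

        Dim novel As New Book()
        novel.Title = "Example Title"
        novel.Pages = 320

        Dim ratio As Single = 1.5F

        Dim formatter As New SoapFormatter()

        ' Store the three items, one after another, in one file.
        Using fs As New FileStream("Test.txt", FileMode.Create)
            formatter.Serialize(fs, table)
            formatter.Serialize(fs, novel)
            formatter.Serialize(fs, ratio)
        End Using

        ' Read them back from a single stream, in the same order.
        Using fs As New FileStream("Test.txt", FileMode.Open)
            Dim tableBack As Hashtable = CType(formatter.Deserialize(fs), Hashtable)
            Dim novelBack As Book = CType(formatter.Deserialize(fs), Book)
            Dim ratioBack As Single = CType(formatter.Deserialize(fs), Single)

            Console.WriteLine(tableBack.Count)
            Console.WriteLine(novelBack.Title)
            Console.WriteLine(ratioBack)
        End Using
    End Sub
End Module
```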
One stream can handle multiple serializations or deserializations. Here, I used a single stream (fs) to deserialize the Hashtable, the object, and the variable. Press F5 to store the file; then look at Test.txt. The metadata surrounding the object is interesting, as Listing D shows:
However, had you stored a double-precision floating point number, it would have been identified as a double, like this: