XML is among the most popular standards for displaying data on the Web and sharing data between applications. In fact, it’s possible XML will replace fixed-length and delimited text as the “vanilla” data format of choice. Why? It’s human-friendly onscreen, and it’s database-friendly when it comes to spreadsheet, database, and Web applications.
To help you take advantage of XML, Microsoft has added new features in Excel 2002 that can save you time by allowing you to create and query XML data with just a few mouse clicks.
Brief explanation of XML
So what is XML and why should Excel 2002 users care? At its core, an XML file is a text file that stores data, so XML provides a format for sharing data. If you have a ton of data in Excel spreadsheets, but your customer or business partner requests that data in XML format, it’s easy to convert that data.
To contrast XML with traditional data formats, consider the two lines of text shown in Figure A. This data format is called comma-delimited text. The first row of text is a list of four field or column names, and the second row contains data.
|When data is presented in delimited format, it’s easy to tell where field records begin and end.|
Figure B shows what the same data looks like in fixed-length format. If you’re going to import fixed-length text into a spreadsheet or any other application, you need to have information (typically called the table layout) so that others will know which columns define the fields in your database. In this case, the customer ID is stored in columns 1 through 5, the last name is stored in columns 6 through 13, the first name in columns 14 through 21, and the address field is in columns 22 through 29.
|When data is presented in fixed-length format, both people and software must be told where fields begin and end.|
XML can store the same data. However, instead of columns and rows, the data in an XML document is stored in a tagged format. The XML tags resemble HTML tags. While HTML tags toggle attributes such as bold or italics, XML tags identify the beginning and endpoints of fields within records.
Excel 2002’s XML options
Does someone in your company maintain an Excel spreadsheet that contains data that ought to be published in XML format on the intranet, published on the Internet, or accessed by some database application? If so, Excel 2002 makes it easy to perform a manual extraction of the data. Assume the sample customer spreadsheet shown in Figure C is what you want to make available in XML format.
|I'll use the records in this sample worksheet to illustrate what XML means to Excel users.|
In Excel 2002 you have a new option for saving your work. Go to File | Save As and click on the Save As Type drop-down list. When you do, you’ll see a new option: XML Spreadsheet (*.xml), as shown in Figure D.
|Excel 2002 gives you the option of saving your worksheet in XML format.|
When you select XML Spreadsheet (*.xml) and click OK, Excel 2002 saves the current records in XML format. Onscreen, nothing much will probably change. You can go to File | Close to close that XML version of your spreadsheet.
So what’s in that XML file that Excel 2002 generates? In Excel 2002, if you go to File | Open, the XML file you created will show up because XML is listed under the default list of file types. If you open the .xml file, you’ll see pretty much the same thing as you see when you open the .xls version of the same file.
If you’re curious about what’s going on behind the scenes in that XML file, however, launch Microsoft Word. Then go to Insert | File and navigate to the .xml file you created with Excel. When you do, you’ll be able to peruse the XML code that Excel 2002 generated.
I won’t show you the contents of the entire XML file, but Listing A shows the table section of the XML file that was rendered based on the five records from the original sample Excel spreadsheet.
You don’t have to be a code jockey to figure out that in the first line, the assignment statement ExpandedColumnCount=”7” refers to the seven columns labeled in the source spreadsheet, namely CustID, Lname, Fname, Addr1, City, State, and Zip. The fact that I had five rows (one row of column headers and four rows of data) in the original spreadsheet accounts for the value of the ExpanddRowCount property.
The term schema refers to information about the table structure and other information wrapped around the table data. Depending on which parser is used to import the XML data, you may want to alter the XML file and delete the schema information before you submit the file.
Bringing XML data into Excel 2002
If ever you receive an XML file and you wonder whether Excel 2002 can open the file and display the data, just go to File | Open, select XML Files (*.xml) under the Files of Type drop-down list, and then navigate to the XML file. You’ll find out right away whether Excel 2002 can parse the XML data into columns and rows.
If opening the XML file doesn’t work, try going to Data | Import External Data | Import Data. Then navigate to the XML file you want to import and click Open. If Excel 2002 is able to import the data, you’ll be asked if you want to place the data in the current worksheet or load the data into a new worksheet.
Querying the Web
Excel 2002 offers yet another powerful tool for getting data off of a Web site and into your worksheet. Go to Data | Import External Data | New Web Query. When you do, a New Web Query dialog box will appear, like the one shown in Figure E. Assuming you’re connected to the Web, Excel 2002 will load the default home page. You can, of course, navigate to any Web page you like.
|Excel 2002 lets you import XML data from about any Web page.|
After the page loads, Excel 2002 will scan for information that might or might not be XML data, and arrows like the ones shown in Figure E will “point” to where Excel 2002 thinks the data is located. Just click on any of those arrows, and Excel 2002 will import whatever it finds into your worksheet. Of course, results may vary, depending on the way the data is presented on the Web page. However, overall this tool provides a much better way to get data into your Excel 2002 worksheet than simply copying and pasting.