Software

Convert an XML file into an Excel spreadsheet with this technique

One way to present XML information to a user is by converting it to a file format the user actually knows. Edmond Woychowsky walks you through his process, including all the necessary sample code, to convert an XML file to an Excel spreadsheet.

This article is also available as a TechRepublic download, which includes all of the sample code a more manageable text file format.

George Bernard Shaw once described the British and the Americans as two people separated by a common language. Interestingly the same can also be said of application developers and the people that use applications. Worse is that not only does one group not understand the other, but that both groups are usually blissfully unaware that the other group has no idea of what the other is trying to say. If you don't believe me try explaining what you do during an average day to some non-technical friend or relative and watch their eyes glaze-over as they try to follow your description.

As developers we have a tendency to deal with information that is in no way user-friendly without really thinking about how our relatives would handle it. For example, I'd rather not think of how my eldest half-brother would deal with the XML document shown in Listing A. If I was given the assignment of giving him information that maybe he might have to edit I'd make sure that it would be in a format that didn't require ButterflyXML or XMLSpy. Face it, as developers we have toys that non-developers don’t have, what’s needed is some kind of common ground, say something that we’d both have installed on our respective machines.

Listing A --- An XML document for my half brother


<?xml version="1.0" ?>
<root>
  <row>
    <column>Column 1 Row 1</column>
    <column>Column 2 Row 1</column>
    <column>Column 3 Row 1</column>
  </row>
  <row>
    <column>Column 1 Row 2</column>
    <column>Column 2 Row 2</column>
    <column>Column 3 Row 2</column>
  </row>
  <row>
    <column>Column 1 Row 3</column>
    <column>Column 2 Row 3</column>
    <column>Column 3 Row 3</column>
  </row>
  <row>
    <column>Column 1 Row 4</column>
    <column>Column 2 Row 4</column>
    <column>Column 3 Row 4</column>
  </row>
</root>

Because my eldest half-brother works for a brokerage I’d be willing to put money on the fact that he has a relatively recent vintage Microsoft Office Suite. This means that he’d have Excel and since I have Excel we’ve found a common ground between a developer and a non-developer. Alas, it also means that I’m the one that needs to figure-out how to jam an XML document into Excel. Maybe using a shoehorn might work?

There is, however, a little known trick with Excel 2002 that isn’t quite a shoe horn, but definitely fits the bill. Figure A offers a little hint as to the trick that I intend to use.

Figure A

Saving an Excel document

Now that we know that Excel can deal with XML there is still the minor problem of how to get from point X to point E. There is, however, a road map that we can follow, namely the creation of a simple spread sheet and saving it as an XML document. What we get is a document that looks like the one shown in Figure B and Listing B, one of those bizarre, not written by humans, document.

Figure B

An Excel spread sheet

Listing B --- An XML spread sheet


<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">
<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
 <Author>ed woychowsky</Author>
 <LastAuthor>Edmond Woychowsky</LastAuthor>
 <Created>2007-01-26T16:54:15Z</Created>
 <LastSaved>2007-01-27T05:18:54Z</LastSaved>
 <Company>None</Company>
 <Version>10.3501</Version>
</DocumentProperties>
<OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
 <DownloadComponents/>
 <LocationOfComponents HRef="file:///D:\"/>
</OfficeDocumentSettings>
<ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
 <WindowHeight>8700</WindowHeight>
 <WindowWidth>11355</WindowWidth>
 <WindowTopX>480</WindowTopX>
 <WindowTopY>120</WindowTopY>
 <ProtectStructure>False</ProtectStructure>
 <ProtectWindows>False</ProtectWindows>
</ExcelWorkbook>
<Styles>
 <Style ss:ID="Default" ss:Name="Normal">
  <Alignment ss:Vertical="Bottom"/>
  <Borders/>
  <Font/>
  <Interior/>
  <NumberFormat/>
  <Protection/>
 </Style>
</Styles>
<Worksheet ss:Name="Sheet1">
 <Table ss:ExpandedColumnCount="2" ss:ExpandedRowCount="2" x:FullColumns="1"
  x:FullRows="1">
  <Row>
   <Cell><Data ss:Type="String">cell a1</Data></Cell>
   <Cell><Data ss:Type="String">cell b2</Data></Cell>
  </Row>
  <Row>
   <Cell><Data ss:Type="String">cell a2</Data></Cell>
   <Cell><Data ss:Type="String">cell b3</Data></Cell>
  </Row>
 </Table>
 <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
  <Print>
   <ValidPrinterInfo/>
   <HorizontalResolution>600</HorizontalResolution>
   <VerticalResolution>0</VerticalResolution>
  </Print>
  <Selected/>
  <Panes>
   <Pane>
    <Number>3</Number>
    <ActiveRow>2</ActiveRow>
   </Pane>
  </Panes>
  <ProtectObjects>False</ProtectObjects>
  <ProtectScenarios>False</ProtectScenarios>
 </WorksheetOptions>
</Worksheet>
<Worksheet ss:Name="Sheet2">
 <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
  <ProtectObjects>False</ProtectObjects>
  <ProtectScenarios>False</ProtectScenarios>
 </WorksheetOptions>
</Worksheet>
<Worksheet ss:Name="Sheet3">
 <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
  <ProtectObjects>False</ProtectObjects>
  <ProtectScenarios>False</ProtectScenarios>
 </WorksheetOptions>
</Worksheet>
</Workbook>

Dissecting the XML spreadsheet

As odd as the document is, there is actually a weird kind of structure to it. For example, it can be broken down into the following XML element hierarchy:

Workbook

  DocumentProperties

  ExcelWorkbook

  Styles

    Style

  Worksheet

    Tables

      Row

        Cell

          Data

    WorksheetOptions

      Print

        ValidPrinterInfo

        HorizontalResolution

        VerticalResolution

    Selected

    Panes

      Pane

        Number

        ActiveRow

    ProtectObjects

    ProtectScenarios

Not nearly as formidable broken down like this, is it? Actually looking at it from this perspective it becomes rather easy to create an XSL 1.0 style sheet to transform the XML document from Listing A into something that my half brother will feel comfortable with. In fact, the annotated style sheet can be found in Listing C and the results shown in Figure C and Listing D.

Listing C --- Annotated XSL style sheet to create an XML spreadsheet


<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
<!-- : /: Edmond Woychowsky: July 25, 2005: The purpose of this template is to create an Excel/XML spreadsheet from a
simple xml document.
-->
<xsl:template match="/">
<Workbook>
<xsl:call-template name="DocumentProperties"/>
<xsl:call-template name="OfficeDocumentSettings"/>
<xsl:call-template name="ExcelWorkbook"/>
<xsl:call-template name="Styles"/>
<xsl:apply-templates select="/*" mode="worksheet"/>
</Workbook>
</xsl:template>
<!-- : * worksheet: This template builds the spreadsheets individual worksheets, commonly know as
tabs.
-->
<xsl:template match="*" mode="worksheet">
<xsl:variable name="position" select="position()"/>
<Worksheet ss:Name="{concat('Sheet', $position)}">
<Table ss:ExpandedColumnCount="{count(./*[1]/*)}" ss:ExpandedRowCount="{count(./*) + 2}" x:FullColumns="1" x:FullRows="1">
<xsl:apply-templates select="*" mode="row"/>
</Table>
<xsl:call-template name="WorksheetOptions"/>
</Worksheet>
</xsl:template>
<!-- : * row: This template builds the worksheet's rows.
-->
<xsl:template match="*" mode="row">
<Row>
<xsl:apply-templates select="*" mode="cell"/>
</Row>
</xsl:template>
<!-- : * cells: This template builds the row's cells.
-->
<xsl:template match="*" mode="cell">
<xsl:variable name="type">
<xsl:choose>
<xsl:when test="number(.) = .">Number</xsl:when>
<xsl:otherwise>String</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<Cell>
<Data ss:Type="{$type}">
<xsl:value-of select="."/>
</Data>
</Cell>
</xsl:template>
<!-- : * column: This template describes a worksheet's individual columns.
-->
<xsl:template match="*" mode="column">
<xsl:variable name="name" select="name(.)"/>
<xsl:variable name="length">
<xsl:call-template name="length">
<xsl:with-param name="nodeset" select="//parent::*/parent::*/*/*[name(.) = $name]"/>
</xsl:call-template>
</xsl:variable>
<xsl:variable name="width">
<xsl:choose>
<xsl:when test="($length * 5.75) &lt; 56.25">56.25</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$length * 5.75"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<xsl:variable name="style">
<xsl:choose>
<xsl:when test="parent::*/parent::*/*/*[name(.) = $name] = number(parent::*/parent::*/*[1]/*[name(.) = $name])">
<xsl:choose>
<xsl:when test="string-length(parent::*/parent::*/*/*[name(.) = $name][contains(.,'.')]) = 0">s23</xsl:when>
<xsl:otherwise>s24</xsl:otherwise>
</xsl:choose>
</xsl:when>
<xsl:otherwise>s22</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<Column ss:StyleID="{$style}" ss:AutoFitWidth="0" ss:Width="{$width}"/>
</xsl:template>
<!-- : DocumentProperties: This template describes the document to Excel.
-->
<xsl:template name="DocumentProperties">
<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
<Author>ewoychowsky</Author>
<Company>EAW</Company>
<Version>10.4219</Version>
</DocumentProperties>
</xsl:template>
<!-- : OfficeDocumentSettings: This template describes the Office document to Excel.
-->
<xsl:template name="OfficeDocumentSettings">
<OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
<DownloadComponents/>
<LocationOfComponents HRef="file:///\\phlfsnt01\DOWNLOAD\OfficeXPSrc\"/>
</OfficeDocumentSettings>
</xsl:template>
<!-- : ExcelWorkbook: This template describes the characteristics of the wookbook to Excel.
-->
<xsl:template name="ExcelWorkbook">
<ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
<WindowHeight>9210</WindowHeight>
<WindowWidth>15195</WindowWidth>
<WindowTopX>0</WindowTopX>
<WindowTopY>60</WindowTopY>
<ProtectStructure>False</ProtectStructure>
<ProtectWindows>False</ProtectWindows>
</ExcelWorkbook>
</xsl:template>
<!-- : Styles: This template describes the display styles to Excel.
-->
<xsl:template name="Styles">
<Styles>
<Style ss:ID="Default" ss:Name="Normal">
<Alignment ss:Vertical="Bottom"/>
<Borders/>
<Font/>
<Interior/>
<NumberFormat/>
<Protection/>
</Style>
</Styles>
</xsl:template>
<!-- : WorksheetOptions: This template describes the worksheet options to Excel.
-->
<xsl:template name="WorksheetOptions">
<WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
<Print>
<ValidPrinterInfo/>
<HorizontalResolution>1200</HorizontalResolution>
<VerticalResolution>1200</VerticalResolution>
</Print>
<ProtectObjects>False</ProtectObjects>
<ProtectScenarios>False</ProtectScenarios>
</WorksheetOptions>
</xsl:template>
<!-- : length: Determine either the length of the node name or the longest node(s), which ever is longer.
-->
<xsl:template name="length">
<xsl:param name="nodeset"/>
<xsl:variable name="longest">
<xsl:call-template name="longest">
<xsl:with-param name="nodeset" select="$nodeset"/>
</xsl:call-template>
</xsl:variable>
<xsl:choose>
<xsl:when test="string-length(name($nodeset[1])) &gt; string-length($longest)">
<xsl:value-of select="string-length(name($nodeset[1]))"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="string-length($longest)"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<!-- : longest: This recursive template transverses a nodeset to find the nodes with the longest
string-length. Please note that the result of this template may itself be a nodeset.
-->
<xsl:template name="longest">
<xsl:param name="nodeset"/>
<xsl:param name="length" select="0"/>
<xsl:choose>
<xsl:when test="count($nodeset[string-length(.) &gt; $length]) &gt; 0">
<xsl:call-template name="longest">
<xsl:with-param name="nodeset" select="$nodeset[string-length(.) &gt; $length]"/>
<xsl:with-param name="length" select="string-length($nodeset[string-length(.) &gt; $length][1])"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$nodeset"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>

Figure C

The result in Excel

Listing D --- The result as XML


<?xml version="1.0" encoding="UTF-8"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:x="urn:schemas-microsoft-com:office:excel">
<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
<Author xmlns="urn:schemas-microsoft-com:office:office">ewoychowsky</Author>
<Company xmlns="urn:schemas-microsoft-com:office:office">EAW</Company>
<Version xmlns="urn:schemas-microsoft-com:office:office">10.4219</Version>
</DocumentProperties>
<OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
<DownloadComponents xmlns="urn:schemas-microsoft-com:office:office" />
<LocationOfComponents xmlns="urn:schemas-microsoft-com:office:office" HRef="file:///\\phlfsnt01\DOWNLOAD\OfficeXPSrc\" />
</OfficeDocumentSettings>
<ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
<WindowHeight xmlns="urn:schemas-microsoft-com:office:excel">9210</WindowHeight>
<WindowWidth xmlns="urn:schemas-microsoft-com:office:excel">15195</WindowWidth>
<WindowTopX xmlns="urn:schemas-microsoft-com:office:excel">0</WindowTopX>
<WindowTopY xmlns="urn:schemas-microsoft-com:office:excel">60</WindowTopY>
<ProtectStructure xmlns="urn:schemas-microsoft-com:office:excel">False</ProtectStructure>
<ProtectWindows xmlns="urn:schemas-microsoft-com:office:excel">False</ProtectWindows>
</ExcelWorkbook>
<Styles>
<Style ss:ID="Default" ss:Name="Normal">
<Alignment ss:Vertical="Bottom" />
<Borders />
<Font />
<Interior />
<NumberFormat />
<Protection />
</Style>
</Styles>
<Worksheet ss:Name="Sheet1">
<Table ss:ExpandedColumnCount="3" ss:ExpandedRowCount="6" x:FullColumns="1" x:FullRows="1">
<Row>
<Cell>
<Data ss:Type="String">Column 1 Row 1</Data>
</Cell>
<Cell>
<Data ss:Type="String">Column 2 Row 1</Data>
</Cell>
<Cell>
<Data ss:Type="String">Column 3 Row 1</Data>
</Cell>
</Row>
<Row>
<Cell>
<Data ss:Type="String">Column 1 Row 2</Data>
</Cell>
<Cell>
<Data ss:Type="String">Column 2 Row 2</Data>
</Cell>
<Cell>
<Data ss:Type="String">Column 3 Row 2</Data>
</Cell>
</Row>
<Row>
<Cell>
<Data ss:Type="String">Column 1 Row 3</Data>
</Cell>
<Cell>
<Data ss:Type="String">Column 2 Row 3</Data>
</Cell>
<Cell>
<Data ss:Type="String">Column 3 Row 3</Data>
</Cell>
</Row>
<Row>
<Cell>
<Data ss:Type="String">Column 1 Row 4</Data>
</Cell>
<Cell>
<Data ss:Type="String">Column 2 Row 4</Data>
</Cell>
<Cell>
<Data ss:Type="String">Column 3 Row 4</Data>
</Cell>
</Row>
</Table>
<WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
<Print xmlns="urn:schemas-microsoft-com:office:excel">
<ValidPrinterInfo xmlns="urn:schemas-microsoft-com:office:excel" />
<HorizontalResolution xmlns="urn:schemas-microsoft-com:office:excel">1200</HorizontalResolution>
<VerticalResolution xmlns="urn:schemas-microsoft-com:office:excel">1200</VerticalResolution>
</Print>
<ProtectObjects xmlns="urn:schemas-microsoft-com:office:excel">False</ProtectObjects>
<ProtectScenarios xmlns="urn:schemas-microsoft-com:office:excel">False</ProtectScenarios>
</WorksheetOptions>
</Worksheet>
</Workbook>

Finding comfort

As odd as these examples might be and as strange as the scenario of presenting my eldest half brother with an XML document might seem, this endeavor actually has a real-world application. Imagine getting an assignment where the user community in you company needs to be able to review the contents of a database table and they'd like to see it in Excel. After all, they're comfortable with Excel and hopefully now you will be too.

18 comments
jpmanching
jpmanching

the latest office 2010 does it very easily. tire speed rating chart

scrybbler
scrybbler

Edmond, are there cases where such a conversion shouldn't be attempted? I'm no XML wizard, but I'm fairly certain that some XML files can't make it to Excel, or would lose data in the transition. DC Wedding Favor Team

andreaburton
andreaburton

thanks for posting this! It's been very useful :) Andrea B. Excel Spreadsheets

PRASSK
PRASSK

Very very goooood! it helps me very much & suggestions ALSO VERY GOOD

ssinghjrd
ssinghjrd

Confusing, I am having trouble importing when Excel thinks there is no scheme defined for the xml and it defines one itself, then only sees on record. Dubai apartments

lchene
lchene

It works nicely, bur what about attributes : John Doe 2007-04-04

crichman1109
crichman1109

I agrre with th econtent of the article, however, we are displaying in an XML spreadsheet data exported from an SQL server DB and if th euser clicks save without changing th eformat the file size is over 6 time the size of the same data in native .xls Does anyone know a way to force the save as .xls?

Gis Bun
Gis Bun

Hi, I have a simple XML file like: - Not sure Brador Game Studios 1.0 Some guy It doesn't run. But if I import into Excel, they get imported in heading order GameName, GameVersion, Publisher, SpecialAdjustments and Tester - which I don't want. Any way to correct this?

bmagurn
bmagurn

I read this article before, and was hoping it would show me how to read an *existing* XML file into excel. All this is telling you how to do is how to create an XML file that can be opened in excel. I might as well create a csv if I was trying to do that. It's a neat little trick, but isn't real useful (to me).

john.light
john.light

!-- : * row: This template builds the worksheet's rows. --> match="*" and select="*" says match and process all elements replace with match="*|@*" and select="*|@*" which means match and process all elements and all attributes

john.light
john.light

Use the XSL Stylesheet in this article to transform your original XML page to a more excel friendly XML page and then open in excel. one way to do this is to reference the stylesheet in the XML. then open the XML using excel and tell it to apply the stylesheet.

john.light
john.light

To read an existing XML file into excel just do a file/open and pick the XML file. But that can have issues in it's display that this article attempts to correct. I had the same problem with understanding until I re-read the article in more detail. May be a better title of "how to use XSL to transform an existing XML file to a more excel friendly XML file"

Systems Guy
Systems Guy

Yeah, I'm a little confused to about the point your trying to make. Not that it doesn't have merit. Regards.

ct_taylor
ct_taylor

I have been able to create a Excel friendly xml file using sample code from this article. Now I can create a minimal table. So far I am limited to that because the author does not tell us where to get a complete reference to the available elements so that we can do things like build headers etc. The author did not invent this stuff. He should state where he got his information. It would be a big help.

giorgio_rovelli
giorgio_rovelli

This is all well and good but how does one create an XSL Stylesheet?

doug
doug

The XML is the data, and most database type systems will generate it. The xsl is static, so you can produce a formated spreadsheet by just running a xsl transform on the XML sheet. Of course, Excel will load HTML files as well, so I've been converting my XML output to html with an XSL file. But you miss a lot of formating

ct_taylor
ct_taylor

The Microsoft Office SDK has the information for SpreadsheetML that is used in this article. I have posted this question on a few different sites and I could never get an answer.

Editor's Picks