XML data validation with XPath and XSL

If your application collects information into an XML document, you can run that document through an XSLT stylesheet to centralize your validation. In this article, Phillip Perkins explains how you can accomplish that.

If you're designing Web applications, XML is a strong choice for your data transport mechanism because most languages provide a robust set of tools for handling it. You also get the benefit of XSLT transformations and XPath queries. Beyond that, you can use XPath to validate input: once your application has collected its information into an XML document, a single XSLT stylesheet can serve as the central place where that document is validated.

For example, say you have an application that collects certain logical units of information. One of these logical units is user information, which consists of a first name, a middle initial, a last name, a phone number, and an address. A representation of this user information in XML format might look something like this:

        <Line1>123 Some Street</Line1>
        <Line2 />
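
The fragment above shows only the two address lines. A complete document consistent with the description might look like the following sketch. The User, Address, Postal, Line1, and Line2 names appear elsewhere in this article; FirstName, MiddleInitial, LastName, and Phone are hypothetical names chosen for illustration, as are all of the values except the street address.

```xml
<?xml version="1.0"?>
<!-- Hypothetical sample; only User, Address, Line1, Line2, and Postal
     are names taken from the article. -->
<User>
    <FirstName>Jane</FirstName>
    <MiddleInitial>Q</MiddleInitial>
    <LastName>Doe</LastName>
    <Phone>555-555-0100</Phone>
    <Address>
        <Line1>123 Some Street</Line1>
        <Line2 />
        <Postal>12345</Postal>
    </Address>
</User>
```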

You collect this information from an HTTP form request. (You undoubtedly have some robust mechanism for parsing through the form data and populating your XML.) Now run this XML through a transformation to validate your user responses.

If the user forgets to supply some data, you'd like to alert the user about the oversight. One method is to send your XML data through a centralized XSLT transformation that tests the data and raises an error if things aren't up to snuff. By using a combination of the XSLT if, choose, and when elements, you can test the data and catch the resulting validation errors to display to the user.
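
As a rough illustration of the choose and when elements, the sketch below tests two conditions in order and stops at the first one that fails. It assumes that an Address element and a non-empty Line1 are required, which the article doesn't actually state, and the message text is made up for the example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Validation only, so emit plain text instead of XML. -->
    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:apply-templates select="//User"/>
    </xsl:template>

    <xsl:template match="User">
        <xsl:choose>
            <!-- The Address element is missing entirely. -->
            <xsl:when test="not(Address)">
                <xsl:message terminate="yes">You must supply an address</xsl:message>
            </xsl:when>
            <!-- Line1 is missing, empty, or whitespace only. -->
            <xsl:when test="normalize-space(Address/Line1)=''">
                <xsl:message terminate="yes">You must supply a value for Line1</xsl:message>
            </xsl:when>
            <!-- Neither rule failed. -->
            <xsl:otherwise>
                <xsl:text>OK</xsl:text>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>
</xsl:stylesheet>
```

Because choose evaluates its when branches in document order and runs only the first match, put the most fundamental checks first.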

For instance, if you want the Postal node's text node to contain exactly five digits, you can check this value and supply a message element when this isn't true:

. . .
<xsl:template match="/">
    <xsl:apply-templates select="//User"/>
</xsl:template>

<xsl:template match="User">
    <xsl:if test="string-length(Address/Postal)!=5 or string(number(Address/Postal))='NaN'">
        <xsl:message terminate="yes">You must supply a 5 digit value for Postal</xsl:message>
    </xsl:if>
</xsl:template>
. . .

When you run this transformation, each User node is sent through the User template. First, the string-length() XPath function checks whether the Postal value is anything other than five characters long. Second, the number() XPath function returns NaN for any value that isn't numeric. (I use the string() XPath function to convert that result back to a string for comparison with 'NaN'.) If either condition is true, the message XSLT element terminates the transformation through its terminate attribute, generating a system error. You can capture this error in your code and display the error message. Or, you can use the error message in an exception handling routine to return the errors to the user in the form of HTML content.
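
One caveat with the number() test: XPath 1.0 also treats values such as 12.45 and -1234 as numeric, and both are five characters long, so they would pass. If you want digits only, a stricter sketch is to strip the digits with translate() and check that nothing is left. The variable name postal is hypothetical; the element names and the message come from the example above.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="text"/>

    <xsl:template match="/">
        <xsl:apply-templates select="//User"/>
    </xsl:template>

    <xsl:template match="User">
        <!-- String value of Postal; empty if the element is missing. -->
        <xsl:variable name="postal" select="string(Address/Postal)"/>
        <!-- translate() removes every digit; anything left over is a
             non-digit character such as '.', '-', or a space. -->
        <xsl:if test="string-length($postal) != 5 or translate($postal, '0123456789', '') != ''">
            <xsl:message terminate="yes">You must supply a 5 digit value for Postal</xsl:message>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>
```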

Here's some ASP example code to show you how to validate your data:

. . .
Dim myErr
Set myErr = New CError
If Not validateData(dom, "xslfile.xsl", myErr) Then
    'Do something with the error.
    'The myErr.description contains the error message.
End If
. . .
Class CError
    Public number, description
    Private Sub Class_Initialize
        number = 0
        description = ""
    End Sub
    Private Sub Class_Terminate
    End Sub
End Class
Function validateData(poDOM, psXSLFileName, poErr)
    'poDOM is the DOMDocument object that is passed as the first parameter.
    'psXSLFileName is the relative path to the validation XSLT stylesheet.
    Dim bReturn, m_xsl
    bReturn = True
    Set m_xsl = Server.CreateObject("MSXML2.DOMDocument")
    m_xsl.async = False
    m_xsl.load Server.MapPath(psXSLFileName)
    'A little routine to make sure our XSL parsed okay.
    If m_xsl.parseError.errorCode <> 0 Then _
        Err.Raise vbObjectError, "validateData(. . .)", m_xsl.parseError.reason
    On Error Resume Next
    poDOM.transformNode m_xsl
    If Err.number <> 0 Then
        poErr.number = Err.number
        poErr.description = Err.Description
        bReturn = False
    End If
    On Error Goto 0
    validateData = bReturn
    Set m_xsl = Nothing
End Function

This script defines a CError class, a simple structure that holds the error information as CError.number and CError.description. You call the validateData() function with three parameters: a DOM Document object, the filename of the validation XSLT stylesheet, and a CError object. The validateData() function creates a temporary DOM Document object for the XSLT stylesheet, loads it from the file path, and checks it for parsing errors. The On Error Resume Next statement then suppresses VBScript's default error handling so that an error raised during the transformation doesn't halt the script. After the transformation, the Err object is checked. If an error occurred, the error information is stored in the poErr object and the return value is set to False. Finally, default error handling is restored and the function returns.
