Run XPath queries in PHP

PHP provides XPath, a language for addressing parts of an XML document, through functions that operate on contexts. Phillip Perkins offers sample XML data and PHP code to grab different parts of the XML document.

XPath is a language that allows you to address parts of an XML document, which makes it practically indispensable for XSLT transformations. It is also an invaluable tool for managing XML data in software such as Web applications.

Microsoft's MSXML provides XPath functionality through the selectSingleNode() and selectNodes() methods on DOM nodes and documents. PHP, by contrast, provides it through functions that operate on an XPath context. In the following example, I'll show sample XML data and PHP code to grab different parts of the XML document. I'll also explain how the PHP code works.

In the example code, I use the following XML data as input. (Note: This code was developed and run successfully using PHP 4.3.4, Windows XP Home Edition, and IIS 5.1.)

<?xml version="1.0"?>
    <x:root xmlns:x="http://www.someplace.com">
        <x:row>
            <x:dog color="yellow">Marmaduke</x:dog>
            <x:cat>Garfield</x:cat>
        </x:row>
        <x:row>
            <x:dog color="white">Snoopy</x:dog>
            <x:cat>Heathcliff</x:cat>
        </x:row>
        <x:row>
            <x:dog color="gray">Spike</x:dog>
            <x:cat>Sylvester</x:cat>
        </x:row>
    </x:root>

This is basic XML: a few elements, some attributes, and a namespace declaration. That gives me enough variety to test several kinds of queries. Here's the PHP code:

<?php
$sxml = '<?xml version="1.0"?>
    <x:root xmlns:x="http://www.someplace.com">
        <x:row>
            <x:dog color="yellow">Marmaduke</x:dog>
            <x:cat>Garfield</x:cat>
        </x:row>
        <x:row>
            <x:dog color="white">Snoopy</x:dog>
            <x:cat>Heathcliff</x:cat>
        </x:row>
        <x:row>
            <x:dog color="gray">Spike</x:dog>
            <x:cat>Sylvester</x:cat>
        </x:row>
    </x:root>';

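// Build a DOM Document in memory from the XML string.
$xml = domxml_open_mem($sxml);

// Create an XPath context and bind the "x" prefix to the document's namespace.
$xpc = xpath_new_context($xml);
xpath_register_ns($xpc, "x", "http://www.someplace.com");

// Query 1: the text node of every yellow dog.
$nodes = xpath_eval($xpc, "//x:row/x:dog[@color='yellow']/text()");
foreach ($nodes->nodeset as $node) {
    print $node->content . "\n";
}

// Query 2: every x:dog element, dumped as complete XML.
$nodes = xpath_eval($xpc, "//x:row/x:dog");
foreach ($nodes->nodeset as $node) {
    print $xml->dump_node($node) . "\n";
}

// Query 3: the text of every cat, plus the text of the white and gray dogs.
$nodes = xpath_eval($xpc, "//x:cat/child::text()|//x:dog[@color='white' or @color='gray']/text()");
foreach ($nodes->nodeset as $node) {
    print $node->content . "\n";
}

// Release the DOM Document.
$xml->free();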
?>

First, I create a local variable to hold the XML string. This data could have arrived as part of an HTTP POST request, but for this example I include it in the code. The next step is to create a DOM Document with the domxml_open_mem() function, which builds a DOM Document object in memory from a valid XML string. It requires one parameter: the XML string. Alternatively, you can store the XML in a separate file and load it with the domxml_open_file() function, which requires one parameter: the name of the XML file.
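
The two loading routes can be compared side by side. The domxml functions described here belong to PHP 4, so this minimal sketch assumes the DOM extension bundled with PHP 5 and later, using DOMDocument::loadXML() and DOMDocument::load() as stand-ins for domxml_open_mem() and domxml_open_file(). The temporary file is hypothetical scaffolding so the example runs on its own.

```php
<?php
// Assumption: PHP 5+ DOM extension (DOMDocument), not the PHP 4 domxml API.
$sxml = '<?xml version="1.0"?>
<x:root xmlns:x="http://www.someplace.com">
    <x:row>
        <x:dog color="yellow">Marmaduke</x:dog>
        <x:cat>Garfield</x:cat>
    </x:row>
</x:root>';

// Route 1: parse the XML from a string held in memory.
$fromString = new DOMDocument();
if (!$fromString->loadXML($sxml)) {
    throw new RuntimeException('Could not parse the XML string');
}

// Route 2: parse the same XML from a file (a throwaway temp file here).
$path = tempnam(sys_get_temp_dir(), 'pets');
file_put_contents($path, $sxml);
$fromFile = new DOMDocument();
if (!$fromFile->load($path)) {
    throw new RuntimeException('Could not parse the XML file');
}
unlink($path);

// Both routes produce a document with the same root element.
$rootFromString = $fromString->documentElement->nodeName;
$rootFromFile = $fromFile->documentElement->nodeName;
print $rootFromString . "\n";
print $rootFromFile . "\n";
```

Checking the return value of each load call matters in practice, since XML that arrives over HTTP may not be well formed.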

Once I create the DOM Document object, I can create an XPath context with this object through the xpath_new_context() function, which takes one parameter: the current DOM Document object. This context is used to evaluate the XPath expression and is also used to register namespaces, if needed. Since my XML includes a namespace, I register the namespace with the xpath_register_ns() function. This makes it possible to create XPath queries using prefixes. The xpath_register_ns() function takes three parameters: the XPath context, the prefix, and the namespace, respectively.
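
One point worth illustrating is that a registered prefix is bound to the namespace URI, not to the prefix spelled in the document. The sketch below is an assumption-laden equivalent for PHP 5 and later, where DOMXPath and its registerNamespace() method take the place of xpath_new_context() and xpath_register_ns(); the "pets" prefix is a hypothetical name chosen only to differ from the document's "x".

```php
<?php
// Assumption: PHP 5+ DOM extension (DOMDocument/DOMXPath), not the PHP 4 domxml API.
$doc = new DOMDocument();
$doc->loadXML(
    '<x:root xmlns:x="http://www.someplace.com">'
    . '<x:row><x:dog color="yellow">Marmaduke</x:dog></x:row>'
    . '</x:root>'
);

// The DOMXPath object plays the role of the XPath context.
$xpath = new DOMXPath($doc);

// Bind a prefix of our own choosing to the document's namespace URI.
$xpath->registerNamespace('pets', 'http://www.someplace.com');

// The query matches on the URI, so "pets:dog" finds the document's x:dog.
$qualified = $xpath->query('//pets:dog');

// An unprefixed name only matches elements in no namespace, so this finds nothing.
$unqualified = $xpath->query('//dog');

print $qualified->length . "\n";
print $unqualified->length . "\n";
```

Leaving out the prefix is a common reason a query against namespaced XML silently returns an empty result.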

Now I can run XPath queries. This is done with the xpath_eval() function, whose first parameter is the XPath context and whose second parameter is the XPath expression. The function returns an XPath result object whose nodeset property holds an array of DOM Nodes. In my code, I step through the nodeset and produce some form of output.

In the first XPath example, I grab the text nodes of all the x:dog elements under the x:row nodes where the color attribute equals 'yellow'. This is where the XPath expression in PHP differs slightly from an XPath expression using MSXML: I include the '/text()' part of the expression to return the text nodes only, whereas with MSXML you read an element's text through its 'text' property. I then get the content of each returned text node through its 'content' property.
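
The difference between selecting the text node and selecting its parent element can be seen in a short sketch. It assumes the PHP 5+ DOM extension rather than the domxml functions used in this article, so the nodeValue and textContent properties stand in for the 'content' property.

```php
<?php
// Assumption: PHP 5+ DOM extension (DOMDocument/DOMXPath), not the PHP 4 domxml API.
$doc = new DOMDocument();
$doc->loadXML(
    '<x:root xmlns:x="http://www.someplace.com">'
    . '<x:row><x:dog color="yellow">Marmaduke</x:dog><x:cat>Garfield</x:cat></x:row>'
    . '<x:row><x:dog color="white">Snoopy</x:dog><x:cat>Heathcliff</x:cat></x:row>'
    . '</x:root>'
);
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('x', 'http://www.someplace.com');

// With /text(), the query returns the text node itself.
$textNode = $xpath->query("//x:row/x:dog[@color='yellow']/text()")->item(0);

// Without /text(), the query returns the enclosing element.
$element = $xpath->query("//x:row/x:dog[@color='yellow']")->item(0);

print get_class($textNode) . ': ' . $textNode->nodeValue . "\n";
print get_class($element) . ': ' . $element->textContent . "\n";
```

Both lines end with the same name, but only the element gives access to the color attribute.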

In the second example, I grab all the x:dog elements under the x:row nodes. This time, I use the dump_node() method on the DOM Document object to print out the complete XML of each matching node. The dump_node() method accepts one parameter: the DOM Node whose contents you wish to dump.
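
For readers on PHP 5 or later, where the domxml extension is not available, a rough counterpart to dump_node() is DOMDocument::saveXML() called with a node argument. The following is a sketch under that assumption, not the code this article was tested with.

```php
<?php
// Assumption: PHP 5+ DOM extension (DOMDocument/DOMXPath), not the PHP 4 domxml API.
$doc = new DOMDocument();
$doc->loadXML(
    '<x:root xmlns:x="http://www.someplace.com">'
    . '<x:row><x:dog color="yellow">Marmaduke</x:dog><x:cat>Garfield</x:cat></x:row>'
    . '<x:row><x:dog color="white">Snoopy</x:dog><x:cat>Heathcliff</x:cat></x:row>'
    . '</x:root>'
);
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('x', 'http://www.someplace.com');

// Serialize each matching element, tags and attributes included.
$dumps = array();
foreach ($xpath->query('//x:row/x:dog') as $node) {
    $dumps[] = $doc->saveXML($node);
}

foreach ($dumps as $dump) {
    print $dump . "\n";
}
```

Passing the node to saveXML() serializes only that subtree; calling it with no argument would serialize the whole document.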

In the last example, I grab all the x:cat text nodes and all the x:dog text nodes where the color attribute is 'white' or 'gray'. Once again, I step through the nodeset and print out the content of each text node. Finally, I free up the DOM Document object.
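
The union operator (|) and the 'or' test inside the predicate can be tried out with the same data. This is a minimal sketch that assumes the PHP 5+ DOM extension in place of the domxml functions; the $names array is a hypothetical helper for collecting the results.

```php
<?php
// Assumption: PHP 5+ DOM extension (DOMDocument/DOMXPath), not the PHP 4 domxml API.
$doc = new DOMDocument();
$doc->loadXML(
    '<x:root xmlns:x="http://www.someplace.com">'
    . '<x:row><x:dog color="yellow">Marmaduke</x:dog><x:cat>Garfield</x:cat></x:row>'
    . '<x:row><x:dog color="white">Snoopy</x:dog><x:cat>Heathcliff</x:cat></x:row>'
    . '<x:row><x:dog color="gray">Spike</x:dog><x:cat>Sylvester</x:cat></x:row>'
    . '</x:root>'
);
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('x', 'http://www.someplace.com');

// Left of "|": the text of every cat.
// Right of "|": the text of dogs whose color is white or gray.
$expression = "//x:cat/child::text()"
    . "|//x:dog[@color='white' or @color='gray']/text()";

// The union comes back as one node list, not two separate ones.
$names = array();
foreach ($xpath->query($expression) as $node) {
    $names[] = $node->nodeValue;
}

print implode("\n", $names) . "\n";
```

The yellow dog, Marmaduke, is the only animal left out, because it matches neither side of the union.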

If you want to find information on these technologies in PHP, visit The PHP Group. For XPath standards, visit the W3C site. For information on MSXML, point your browser to MSDN.
