XPath is a language for addressing parts of an XML document,
which makes it essential to XSLT transformations. It is also an
invaluable tool for managing XML data in applications such as Web
applications.
Microsoft’s MSXML provides XPath functionality through
the selectSingleNode() and selectNodes() methods on DOM nodes and
documents. PHP, by contrast, provides XPath through functions that
operate on an XPath context. In the following example, I’ll show
sample XML data and PHP code to grab different parts of the XML
document. I’ll also explain how the PHP code works.
The example code works on the following XML
data. (Note: This code was developed
and run successfully using PHP 4.3.4, Windows XP Home Edition, and
IIS 5.1.)
<?xml version="1.0"?>
<x:root xmlns:x="http://www.someplace.com">
  <x:row>
    <x:dog color="yellow">Marmaduke</x:dog>
    <x:cat>Garfield</x:cat>
  </x:row>
  <x:row>
    <x:dog color="white">Snoopy</x:dog>
    <x:cat>Heathcliff</x:cat>
  </x:row>
  <x:row>
    <x:dog color="gray">Spike</x:dog>
    <x:cat>Sylvester</x:cat>
  </x:row>
</x:root>
This XML data contains a few elements, some
attributes, and a namespace declaration. It is basic XML, but it
gives me varied queries to test. Here’s the PHP code:
<?php
// Sample XML, held in a string for this example.
$sxml = '<?xml version="1.0"?>
<x:root xmlns:x="http://www.someplace.com">
  <x:row>
    <x:dog color="yellow">Marmaduke</x:dog>
    <x:cat>Garfield</x:cat>
  </x:row>
  <x:row>
    <x:dog color="white">Snoopy</x:dog>
    <x:cat>Heathcliff</x:cat>
  </x:row>
  <x:row>
    <x:dog color="gray">Spike</x:dog>
    <x:cat>Sylvester</x:cat>
  </x:row>
</x:root>';

// Build a DOM Document from the string, then an XPath context for it.
$xml = domxml_open_mem($sxml);
$xpc = xpath_new_context($xml);

// Map the prefix "x" to the namespace so the queries can use it.
xpath_register_ns($xpc, "x", "http://www.someplace.com");

// Query 1: text of every x:dog whose color attribute is 'yellow'.
$nodes = xpath_eval($xpc, "//x:row/x:dog[@color='yellow']/text()");
foreach ($nodes->nodeset as $node) {
    print $node->content . "\n";
}

// Query 2: every x:dog element, printed as complete XML.
$nodes = xpath_eval($xpc, "//x:row/x:dog");
foreach ($nodes->nodeset as $node) {
    print $xml->dump_node($node) . "\n";
}

// Query 3: text of every x:cat, plus text of the white and gray x:dog elements.
$nodes = xpath_eval($xpc,
    "//x:cat/child::text()|//x:dog[@color='white' or @color='gray']/text()");
foreach ($nodes->nodeset as $node) {
    print $node->content . "\n";
}

// Release the DOM Document.
$xml->free();
?>
First, I create a local variable to hold the
XML string. This information could have been passed in as part of a
POST HTTP request. However, for this example, I’m going to include
it in the code. The next step is to create a DOM Document by using
the domxml_open_mem() function. This function creates a DOM
Document object in memory from a valid XML string. It accepts one
parameter: the XML string. Another way to accomplish this is to
store the XML in a separate file and use the domxml_open_file()
function to load the XML from a file. This function takes one
parameter: the filename of the XML file.
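The code in this article targets the PHP 4 domxml extension. For readers on PHP 5 or later, where the DOM extension (DOMDocument and DOMXPath) took over from domxml, here is a rough sketch of the same two loading options. It assumes the DOM extension is enabled; the shortened XML and the temporary file are stand-ins I made up for illustration.

```php
<?php
// Sketch only: uses the PHP 5+ DOM extension, not the PHP 4 domxml
// functions that the article's code calls.
$sxml = '<?xml version="1.0"?>
<x:root xmlns:x="http://www.someplace.com">
  <x:row><x:cat>Garfield</x:cat></x:row>
</x:root>';

// Load from a string (the counterpart of domxml_open_mem()).
$fromString = new DOMDocument();
$fromString->loadXML($sxml);

// Load from a file (the counterpart of domxml_open_file()).
// The temporary file stands in for a real XML file on disk.
$path = tempnam(sys_get_temp_dir(), 'xp');
file_put_contents($path, $sxml);
$fromFile = new DOMDocument();
$fromFile->load($path);
unlink($path);

// Both documents expose the same root element.
print $fromString->documentElement->nodeName . "\n";
print $fromFile->documentElement->nodeName . "\n";
```

Either way, you end up with a document object that the later steps can query.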
Once I create the DOM Document object, I can
create an XPath context with this object through the
xpath_new_context() function, which takes one parameter: the
current DOM Document object. This context is used to evaluate the
XPath expression and is also used to register namespaces, if
needed. Since my XML includes a namespace, I register the namespace
with the xpath_register_ns() function. This makes it possible to
create XPath queries using prefixes. The xpath_register_ns()
function takes three parameters: the XPath context, the prefix, and
the namespace, respectively.
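To show what registering a namespace actually buys you, here is a minimal sketch using DOMXPath from the PHP 5+ DOM extension (an assumption on my part; the article's own code uses the PHP 4 xpath_register_ns() function). The prefix "pets" is an arbitrary name chosen for this sketch. XPath matches namespaces by URI, so the prefix in the query does not have to be the one used in the document.

```php
<?php
// Sketch only: PHP 5+ DOM extension, shortened sample data.
$doc = new DOMDocument();
$doc->loadXML('<x:root xmlns:x="http://www.someplace.com">
  <x:row><x:dog color="yellow">Marmaduke</x:dog></x:row>
  <x:row><x:dog color="white">Snoopy</x:dog></x:row>
</x:root>');

$xpath = new DOMXPath($doc);

// "pets" is a made-up prefix; only the namespace URI has to match.
$xpath->registerNamespace('pets', 'http://www.someplace.com');

// The registered prefix finds both x:dog elements.
$dogs = $xpath->query('//pets:row/pets:dog');
print $dogs->length . "\n";

// An unprefixed name selects elements in no namespace, so this finds nothing.
$plain = $xpath->query('//row/dog');
print $plain->length . "\n";
```

The second query is the usual trap: leaving out the prefix does not mean "any namespace" in XPath 1.0.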
Now I can run XPath queries. This is done with
the xpath_eval() function, whose first parameter is the XPath
context and second parameter is the XPath expression. The function
returns an XPath result object whose nodeset property holds an array
of DOM Nodes. In my code, I step through the nodeset and produce
some form of output.
In the first XPath example, I grab the text
nodes of all the x:dog elements under the x:row nodes, where the
color attribute equals ‘yellow’. This is where the XPath expression
in PHP differs slightly from an XPath expression using MSXML. I
include the ‘/text()’ part of the expression to return the text
nodes only. With MSXML, you read the text through the ‘text’
property of the selected node. Using the ‘content’ property on the
returned text node, I can get the content of the text node.
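The difference between selecting the text node and selecting the element can be sketched as follows. This assumes the PHP 5+ DOM extension rather than the PHP 4 functions used above, so the property names (nodeValue, textContent) differ from the ‘content’ property in my code.

```php
<?php
// Sketch only: PHP 5+ DOM extension, shortened sample data.
$doc = new DOMDocument();
$doc->loadXML('<x:root xmlns:x="http://www.someplace.com">
  <x:row><x:dog color="yellow">Marmaduke</x:dog></x:row>
  <x:row><x:dog color="white">Snoopy</x:dog></x:row>
</x:root>');

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('x', 'http://www.someplace.com');

// Select the text node directly, as the first query in the article does.
$texts = $xpath->query("//x:row/x:dog[@color='yellow']/text()");
$viaTextNode = $texts->item(0)->nodeValue;

// Select the element instead and read its text content.
$elements = $xpath->query("//x:row/x:dog[@color='yellow']");
$viaElement = $elements->item(0)->textContent;

print $viaTextNode . "\n";
print $viaElement . "\n";
```

Both routes yield the same string here; the first hands back a text node, the second an element.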
In the second example, I grab all the x:dog
elements under the x:row nodes. This time, I use the dump_node()
method on the DOM Document object to print out the complete XML of
each matching node. The dump_node() method accepts one parameter:
the DOM Node whose contents you wish to dump.
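For comparison, here is a hedged sketch of dumping matched nodes with the PHP 5+ DOM extension, where passing a node to DOMDocument::saveXML() plays the role that dump_node() plays in my PHP 4 code. The $dumped array is just a name I picked to collect the output.

```php
<?php
// Sketch only: PHP 5+ DOM extension, not the PHP 4 dump_node() method.
$doc = new DOMDocument();
$doc->loadXML('<x:root xmlns:x="http://www.someplace.com">
  <x:row><x:dog color="yellow">Marmaduke</x:dog></x:row>
  <x:row><x:dog color="white">Snoopy</x:dog></x:row>
  <x:row><x:dog color="gray">Spike</x:dog></x:row>
</x:root>');

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('x', 'http://www.someplace.com');

// Serialize each matched element, tags and attributes included.
$dumped = array();
foreach ($xpath->query('//x:row/x:dog') as $node) {
    $dumped[] = $doc->saveXML($node);
}

print implode("\n", $dumped) . "\n";
```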
In the last example, I grab all the x:cat text
nodes and all the x:dog text nodes where the color attribute is
‘white’ or ‘gray’. Once again, I step through the nodeset and print
out the content of each text node. Finally, I free up the DOM
Document object.
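The union query can be sketched the same way with the PHP 5+ DOM extension (again an assumption; my code above uses xpath_eval()). The ‘|’ operator combines two node sets, and the ‘or’ inside the predicate accepts either color. I sort the values before printing so that the sketch does not depend on the order in which the nodes come back.

```php
<?php
// Sketch only: PHP 5+ DOM extension with the article's sample data.
$doc = new DOMDocument();
$doc->loadXML('<x:root xmlns:x="http://www.someplace.com">
  <x:row><x:dog color="yellow">Marmaduke</x:dog><x:cat>Garfield</x:cat></x:row>
  <x:row><x:dog color="white">Snoopy</x:dog><x:cat>Heathcliff</x:cat></x:row>
  <x:row><x:dog color="gray">Spike</x:dog><x:cat>Sylvester</x:cat></x:row>
</x:root>');

$xpath = new DOMXPath($doc);
$xpath->registerNamespace('x', 'http://www.someplace.com');

// Every cat's text, plus the text of the white and gray dogs.
$query = "//x:cat/text() | //x:dog[@color='white' or @color='gray']/text()";

$values = array();
foreach ($xpath->query($query) as $node) {
    $values[] = $node->nodeValue;
}

// Sorted so the output does not rely on the order of the result set.
sort($values);
print implode(', ', $values) . "\n";
```

Marmaduke is the only name left out, because the yellow dog matches neither side of the union.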
If you want to find information on these
technologies in PHP, visit The PHP Group. For XPath standards,
visit the W3C site. For information on MSXML, point your browser
to MSDN.