Parsing XML documents with Perl's XML::Simple

By RexWorld

Avoiding conflicts with attributes named "content"

by RexWorld In reply to Parsing XML documents wit ...

If your XML already has an attribute named "content" and you want to avoid a conflict between that attribute and the text content of an element, tell XML::Simple to store the text under a different key by passing the "ContentKey" option to the object constructor.
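As a minimal sketch of that option: the element, its attribute value, and the key name "body_text" below are made up for illustration, and only the ContentKey option itself comes from XML::Simple.

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical element: its "content" attribute would otherwise share a key
# with the element's text, which XML::Simple stores under "content" by default.
my $xml = '<meta content="text/html">page body</meta>';

# "body_text" is an arbitrary name chosen for this sketch.
my $parser = XML::Simple->new(ContentKey => 'body_text');
my $meta   = $parser->XMLin($xml);

print "attribute: $meta->{content}\n";    # the XML attribute
print "text:      $meta->{body_text}\n";  # the element's text
```

Any key name that does not clash with an attribute or child element name in your document should work the same way.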

help with content - specific example

by bioinfo_perl In reply to Avoiding conflicts with a ...

Thank you for this excellent article. I am having a specific problem accessing the content of the reference fields in an XML file parsed with XML::Simple, and I hope someone can help.
Specifically, some records in the XML document have no reference, some have one, and some have more than one. How do I capture the values of the references (i.e., the content) in an array when they exist, and avoid terminating the program when they do not? Thank you in advance for the help.

Here is the Perl code fragment. It retrieves the category and text values easily, but I am having difficulty with the @annotationreferencekey variable:


#!C:\perl\bin
use XML::Simple;

@annotationkeyarray = qw(AN000036741 AN000036745 AN000036743 AN000036744);
#AN000036744 no item value, AN000036743 has two items

my $treeparser = new XML::Simple(ForceArray => 1);
my $annotationtree = $treeparser->XMLin("annotate_example.xml");

foreach $annotationkey (@annotationkeyarray){
    $annotationcategory = $annotationtree->{Annotate}->{"$annotationkey"}->{category}->[0];
    $annotationtext = $annotationtree->{Annotate}->{"$annotationkey"}->{text}->[0];
    @annotationreferencekey = $annotationtree->{Annotate}->{"$annotationkey"}->{source}->{item}->{content}->[0];
    print "$annotationkey:$annotationcategory:$annotationtext:@annotationreferencekey\n";
}

The actual XML file (annotate_example.xml) is:

<?xml version="1.0" encoding="UTF-8"?>
<network extent="annotate" source="bbdb82" version="5.2" xmlns:xlink="http://www.w3.org/1999/xlink">
  <Annotate id="AN000036741">
    <category>category 1</category>
    <text>one reference instance 1</text>
    <source>
      <item type="Reference" xlink:type="simple" xlink:href="reference.xml#ID (TFPA23058)" xlink:show="new" xlink:actuate="onRequest">23058</item>
    </source>
  </Annotate>
  <Annotate id="AN000036743">
    <category>category 2</category>
    <text>two references</text>
    <source>
      <item type="Reference" xlink:type="simple" xlink:href="reference.xml#ID (TFPA23058)" xlink:show="new" xlink:actuate="onRequest">23058</item>
      <item type="Reference" xlink:type="simple" xlink:href="reference.xml#ID (TFPA23099)" xlink:show="new" xlink:actuate="onRequest">23099</item>
    </source>
  </Annotate>
  <Annotate id="AN000036744">
    <category>category 3</category>
    <text>no references</text>
    <source>
    </source>
  </Annotate>
  <Annotate id="AN000036745">
    <category>category 4</category>
    <text>one reference instance 2</text>
    <source>
      <item type="Reference" xlink:type="simple" xlink:href="reference.xml#ID (TFPA23055)" xlink:show="new" xlink:actuate="onRequest">23055</item>
    </source>
  </Annotate>
</network>

The output from Data::Dumper is:

$VAR1 = {
  'source' => 'bbdb82',
  'Annotate' => {
    'AN000036743' => {
      'source' => [
        {
          'item' => [
            {
              'xlink:show' => 'new',
              'xlink:actuate' => 'onRequest',
              'xlink:href' => 'reference.xml#ID (TFPA23058)',
              'content' => '23058',
              'type' => 'Reference',
              'xlink:type' => 'simple'
            },
            {
              'xlink:show' => 'new',
              'xlink:actuate' => 'onRequest',
              'xlink:href' => 'reference.xml#ID (TFPA23099)',
              'content' => '23099',
              'type' => 'Reference',
              'xlink:type' => 'simple'
            }
          ]
        }
      ],
      'text' => [
        'two references'
      ],
      'category' => [
        'category 2'
      ]
    },
    'AN000036744' => {
      'source' => [
        {}
      ],
      'text' => [
        'no references'
      ],
      'category' => [
        'category 3'
      ]
    },
    'AN000036741' => {
      'source' => [
        {
          'item' => [
            {
              'xlink:show' => 'new',
              'xlink:actuate' => 'onRequest',
              'xlink:href' => 'reference.xml#ID (TFPA23058)',
              'content' => '23058',
              'type' => 'Reference',
              'xlink:type' => 'simple'
            }
          ]
        }
      ],
      'text' => [
        'one reference instance 1'
      ],
      'category' => [
        'category 1'
      ]
    },
    'AN000036745' => {
      'source' => [
        {
          'item' => [
            {
              'xlink:show' => 'new',
              'xlink:actuate' => 'onRequest',
              'xlink:href' => 'reference.xml#ID (TFPA23055)',
              'content' => '23055',
              'type' => 'Reference',
              'xlink:type' => 'simple'
            }
          ]
        }
      ],
      'text' => [
        'one reference instance 2'
      ],
      'category' => [
        'category 4'
      ]
    }
  },
  'version' => '5.2',
  'xmlns:xlink' => 'http://www.w3.org/1999/xlink',
  'extent' => 'annotate'
};
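
One possible way to approach the question, sketched against the dump above rather than offered as a definitive answer: with ForceArray => 1, both 'source' and 'item' are array references, so treating 'source' as a hash (->{source}->{item}) is the likely cause of the termination. The sketch below hard-codes a trimmed copy of the dumped structure (only the 'content' key of each item is kept) so that it runs without the XML file; the helper name reference_ids is made up for illustration.

```perl
use strict;
use warnings;

# Trimmed copy of the structure from the dump; in real use this would be
# the hash reference returned by XMLin with ForceArray => 1.
my $annotationtree = {
    Annotate => {
        AN000036741 => {
            category => ['category 1'],
            text     => ['one reference instance 1'],
            source   => [ { item => [ { content => '23058' } ] } ],
        },
        AN000036743 => {
            category => ['category 2'],
            text     => ['two references'],
            source   => [ { item => [ { content => '23058' }, { content => '23099' } ] } ],
        },
        AN000036744 => {
            category => ['category 3'],
            text     => ['no references'],
            source   => [ {} ],    # <source> with no <item> children
        },
    },
};

# Hypothetical helper: returns the content of every <item> under the first
# <source>, or an empty list when there are none.
sub reference_ids {
    my ($annotation) = @_;
    my $source = $annotation->{source} || [];
    my $items  = @$source ? $source->[0]{item} : undef;
    return map { $_->{content} } @{ $items || [] };
}

for my $key (qw(AN000036741 AN000036743 AN000036744)) {
    my $annotation = $annotationtree->{Annotate}{$key} or next;
    my @refs = reference_ids($annotation);
    print join(':', $key, $annotation->{category}[0], $annotation->{text}[0], "@refs"), "\n";
}
```

The "|| []" fallbacks are what keep a record without references from stopping the program: a missing 'item' key yields an empty list instead of an attempt to dereference an undefined value.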
