General discussion


A trio of XML pitfalls

By debate
Tell us what you think about the three XML pitfalls: misplaced attributes, missing structure, and casing inconsistencies, featured in this week's XML e-newsletter.

flat hierarchy

by wendy.lewis In reply to A trio of XML pitfalls

I have a question about this line from the above article "A simple migration of this file to XML could easily result in a flat
I thought that all markup languages, including XML, were hierarchical. What is a flat hierarchy? How does that differ from just a plain hierarchy?


by wierzbicki.allen In reply to A trio of XML pitfalls

I think that you are very correct in your explanations of the pitfalls. Good job.

Clear explanation of XML issues, Thanks

by scronson In reply to

I studied 2 programming languages but am not a programmer. This XML explanation was simple and great! Thank you.

Excellent lecture

by tzihlmann In reply to A trio of XML pitfalls

Even though I don't have much exposure to XML, I find it an excellent lecture to digest just after lunch.

by adrianr In reply to A trio of XML pitfalls

I agree with the points made in this article. I have made similar statements myself when running XML "primer" courses. This attribute thing is a real headache for some. Like the article, I tell people to use attributes to qualify the "payload" data - as in your Address example.

Well done.

A differing opinion

by jim_janko In reply to

The article contains an opinion of what constitutes a clear encoding of data. I agree about consistent casing; however, the use of content vs. attributes can go the other way as well. From a development perspective, I have seen different DOM implementations handle the content between open/close tags differently. The Xerces C++ DOM parser (granted, I last used it 2 years ago) requires the developer to retrieve a child TextElement and retrieve the contents of the #text attribute to get the text between the open/close tags. And since this apparently varies with the DOM implementation, it can be quite a pain for the software developer who is trying to process these XML documents.
The alternative that I adopted 2 years ago was to have an attribute named "content" in elements that would have content. The meaning was still clear, still consistent across elements, and much easier to code the processing.
I must admit that I do still feel that there is more conceptual clarity to having the attributes contain information about the content, and the content be the information described by the tag names. But I don't feel that it is poor form to do as I described above.
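
For comparison, here is a minimal sketch of the two access patterns described above. It uses Python's standard `xml.dom.minidom` rather than Xerces C++, so it illustrates the general DOM shape, not that parser's behavior. The sample documents, the `street` element, and the helper names are hypothetical; only the "content" attribute name comes from the post.

```python
from xml.dom import minidom

# Hypothetical sample documents; element names are illustrative only.
AS_CONTENT = "<address><street>12 Main St</street></address>"
AS_ATTRIBUTE = '<address><street content="12 Main St"/></address>'


def text_from_content(xml_text, tag):
    """Return the text between the open/close tags of the first <tag> element."""
    element = minidom.parseString(xml_text).getElementsByTagName(tag)[0]
    # The text is not stored on the element itself: it sits in child
    # nodes whose nodeName is "#text", so we walk the children.
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    )


def text_from_attribute(xml_text, tag, attribute="content"):
    """Return the value of an attribute on the first <tag> element."""
    element = minidom.parseString(xml_text).getElementsByTagName(tag)[0]
    # An attribute value is read in a single call on the element.
    return element.getAttribute(attribute)


print(text_from_content(AS_CONTENT, "street"))
print(text_from_attribute(AS_ATTRIBUTE, "street"))
```

Both helpers recover the same string; the difference is the extra step through child text nodes in the first one, which is the inconvenience the post describes.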

Attributes vs. Content

by PTPage In reply to A trio of XML pitfalls

Here's an actual situation we had. I'm interested in anyone's ideas on whether it's better to use attributes or content.

We use XML to mark up form-letter templates. We use a grammar that resembles HTML form tags. The XML document includes text to be printed in the letter and elements describing fields that can be input by the user. Our current design uses primarily empty tags with all of the information about a field given in attributes. For example:
<text size='50' label='Street Addr' name='addr_strt' />

Does anyone have comments on whether some or all of this should be sub-elements or text content?

As another example, I have always been frustrated by radio buttons in HTML. In most GUI systems, you can click on the whole text label of the radio button, but in HTML forms, you can only click on the circle graphic.
I believe radio buttons should be specified like this:
<input type=radio name=choice>Choose Me</input>
Then the browser knows how much text to include as part of the radio button. Some browsers accept this syntax, but ignore the intent. Again, comments anyone?

Little-known HTML element

by Adrian Edwards In reply to Attributes vs. Content

The best way to address your post is to answer your second question first. Please bear with me.

To enable clickable field labels in HTML forms, you just need to use the (little-known) LABEL element. Never heard of it? You're not alone!

Try marking up your radio buttons like this:

<label for="choice">Choose Me</label><input id="choice" type="radio" name="choice" />

The key is to link each LABEL element to its corresponding field element using the same value for the FOR and ID attributes as shown. The value of the ID attribute must be unique within the document, but can (basically ;-) be any value you choose. A common practice is to use the same value as you have already used for the name field as in the example above.
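
The FOR/ID pairing can also be checked mechanically. The following is a rough sketch, assuming the form markup is well-formed XML (XHTML-style, as in the example above) so that Python's standard `xml.dom.minidom` can parse it. The `FORM` sample, its second unlinked label, and the `unlinked_labels` helper are hypothetical and exist only for illustration.

```python
from xml.dom import minidom

# Hypothetical form fragment: the first label is linked to its field,
# the second one points at an ID that no element carries.
FORM = (
    "<form>"
    '<label for="choice">Choose Me</label>'
    '<input id="choice" type="radio" name="choice" />'
    '<label for="other">Or Me</label>'
    '<input type="radio" name="choice" />'
    "</form>"
)


def unlinked_labels(markup):
    """Return the FOR values of labels that match no element's ID."""
    document = minidom.parseString(markup)
    # Collect every ID that appears anywhere in the document.
    ids = {
        node.getAttribute("id")
        for node in document.getElementsByTagName("*")
        if node.hasAttribute("id")
    }
    # Keep the labels whose FOR value is not among those IDs.
    return [
        label.getAttribute("for")
        for label in document.getElementsByTagName("label")
        if label.getAttribute("for") not in ids
    ]


print(unlinked_labels(FORM))
```

An empty result means every label in the fragment is tied to some field; it does not check that the IDs themselves are unique.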

Although it may be heavy reading for some, the HTML 4.01 Spec does a good job of explaining the use of the LABEL element at:

It's worth noting that not only do recent versions of most major browsers render the contents of LABEL elements so that clicking on them sets focus to the associated form field, but this technique is also a big help to alternative browsing software designed for web users with disabilities (particularly voice browsers). A nice bonus.

Once you start to think about it, it's hard to understand why you would mark up HTML forms without explicitly associating labels with fields!

I'll answer your first question in my next post.

Follow the W3C's lead

by Adrian Edwards In reply to Attributes vs. Content

So now to your first question. Should some or all of an XML form-letter field be sub-elements or text content? My opinion is that labels for form fields should be marked up as element content rather than as an attribute value, as demonstrated by the HTML LABEL element we have just discussed.

This would suggest that you should try something like:

<label for="addr_strt">Street Addr</label><text size="50" name="addr_strt" />

or maybe even:

<text size="50" name="addr_strt"><label for="addr_strt">Street Addr</label></text>

although I must say I prefer the former approach. Note: It's a good idea to use ID and IDREF attributes to associate elements if you can.

Now for the justification. My reasoning is that labels for form fields are 'displayable information'. This doesn't mean that _you_ necessarily display them (although I imagine you do), just that they are _intended_ for human consumption. Even if your application only displays them as pop-up 'tool-tips' or status bar text when the mouse pointer is over the field, the point of using XML is to separate content from presentation. So it's reasonable to say that some other presentation of the form may well choose to 'display' the form label in some other way.

Some readers may say that attribute values can be 'displayed' just as well as element content, but this is only partly true. Read my next post to see why.

Attributes and content: the difference

by Adrian Edwards In reply to Attributes vs. Content

The crucial difference between element content and attribute values is that element content may itself contain sub-elements, whereas attribute values are simply flat text strings. To illustrate this, let's take another look at the HTML LABEL element. Its declaration in the HTML 4 DTD is:

<!ELEMENT LABEL - - (%inline;)* -(LABEL)>

The %inline; parameter entity expands to allow TT, I, B, BIG, SMALL, EM, STRONG, DFN, CODE, SAMP, KBD, VAR, CITE, ABBR, ACRONYM, A, IMG, OBJECT, BR, SCRIPT, MAP, Q, SUB, SUP, SPAN and BDO, as well as embedded form fields and, of course, text content.

If you have trouble understanding why anyone would need any of these elements in a form label, imagine the difference that EM, STRONG, ABBR and ACRONYM can make to the readability of some labels when rendered by a text-to-speech engine. For an example that might hit a bit closer to home if you are a sighted user, allowing SPAN lets you use CSS to underline a single character in a form label to highlight a keyboard shortcut character that you have enabled with an ACCESSKEY attribute.

Now _your_ XML DTD may or may not allow inline markup within form labels, but the point is that if you mark them up as attribute values, you don't even have the choice!
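
To make the difference concrete, here is a small sketch using Python's standard `xml.dom.minidom`. The two sample fragments reuse the `addr_strt` field from earlier in the thread; the `em` markup inside the label, the `label` attribute variant, and the helper names are assumptions added for illustration.

```python
from xml.dom import minidom

# Label as element content: inline markup is part of the document tree.
AS_CONTENT = '<label for="addr_strt"><em>Street</em> Addr</label>'

# Label as attribute value: a raw "<" is not allowed inside an attribute,
# so any "markup" has to be escaped and comes back as plain characters.
AS_ATTRIBUTE = (
    '<text size="50" name="addr_strt" '
    'label="&lt;em&gt;Street&lt;/em&gt; Addr" />'
)


def child_node_names(markup):
    """Return the node names of the root element's direct children."""
    root = minidom.parseString(markup).documentElement
    return [child.nodeName for child in root.childNodes]


def attribute_value(markup, name):
    """Return the named attribute of the root element as a string."""
    return minidom.parseString(markup).documentElement.getAttribute(name)


print(child_node_names(AS_CONTENT))
print(attribute_value(AS_ATTRIBUTE, "label"))
```

In the first fragment the parser exposes an `em` element node followed by a text node, so a stylesheet or speech engine can act on the emphasis. In the second, the application receives one flat string and would have to parse it again itself to recover any structure.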

Some of the points I have raised here may seem a little esoteric for most small XML apps, but we are searching for a principle here, and my advice is: use element content for 'displayable information'.

Adrian Edwards
Principal Developer
Netimpact Online Publishing
