XHTML: Extensibility, portability, and no more messy markup practices

HTML allows great flexibility in how you code, but XHTML imposes a stricter syntax. Learn about the various differences you'll find in XHTML.

HTML is the language of the Web. But a newer reformulation of the language, Extensible Hypertext Markup Language (XHTML), is growing in popularity. This article looks at some of the characteristics that distinguish XHTML from HTML.

Why extend HTML?
Nothing is wrong with HTML. But many aspects could be improved, and a stricter, more consistent set of rules would help resolve issues in some areas. HTML also lacks certain key qualities, such as extensibility and accessibility. So while HTML isn't broken, XHTML aims to enhance the available features.

The extended family
The World Wide Web Consortium (W3C), the organization responsible for creating Web standards such as HTML and XML, has defined XHTML as a family rather than a single XML application. XHTML refers to a collection of XML grammars that define document types based on the HTML version 4 standard. This family currently includes XML specifications for the three HTML 4 document types: Strict, Transitional, and Frameset.
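
As a rough illustration, each of the three document types is selected by the DOCTYPE declaration at the top of the file. The sketch below lists the XHTML 1.0 declarations and parses a tiny Strict document with Python's standard xml.etree.ElementTree module. The names XHTML_DOCTYPES and minimal_document are hypothetical, and the parser checks only XML well-formedness, not validity against the DTD.

```python
import xml.etree.ElementTree as ET

# DOCTYPE declarations for the three XHTML 1.0 document types.
XHTML_DOCTYPES = {
    "strict": '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
              '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">',
    "transitional": '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
                    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">',
    "frameset": '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" '
                '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">',
}

def minimal_document(flavor: str) -> str:
    """Build a tiny XHTML page for the given flavor (hypothetical helper).

    Note: a real Frameset document uses <frameset> in place of <body>;
    this sketch is only meant for the Strict and Transitional flavors.
    """
    return (
        XHTML_DOCTYPES[flavor] + "\n"
        '<html xmlns="http://www.w3.org/1999/xhtml">'
        "<head><title>Example</title></head>"
        "<body><p>Hello</p></body>"
        "</html>"
    )

# The parser reads the DOCTYPE but does not fetch or apply the DTD.
root = ET.fromstring(minimal_document("strict"))
print(root.tag)
```

The root element is reported with its namespace, which is how XML tools tell an XHTML html element apart from an element of the same name in another vocabulary.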

The relationships between HTML and XHTML are easy to understand, as are the documents themselves. XHTML provides a stricter, but also cleaner, implementation of most HTML tags. For example, XHTML is not as lenient as HTML when it comes to case sensitivity. With HTML, you can use any casing you want for your tag names. Mixed casing can be used just as effectively as all uppercase or all lowercase. But XHTML coders must write all element and attribute names in lowercase.
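
One way to see why casing matters is to hand a fragment to an XML parser. The sketch below uses Python's standard xml.etree.ElementTree module. The helper name is_well_formed is hypothetical, and the check covers XML well-formedness only; the lowercase requirement itself comes from the XHTML DTDs, which this parser does not apply.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

# XML names are case-sensitive, so an opening <P> is not closed by </p>.
mixed_case = "<P>Hello</p>"
lower_case = "<p>Hello</p>"

print(is_well_formed(mixed_case))
print(is_well_formed(lower_case))

# To an XML parser, <P> and <p> are two different element names.
same_name = ET.fromstring("<P>Hello</P>").tag == ET.fromstring(lower_case).tag
print(same_name)
```

The mixed-case fragment is rejected outright, and even a consistently uppercase element is treated as a different element from its lowercase counterpart.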

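Because XHTML is an XML grammar, normal XML rules also apply to the use of tags. Tags cannot be left unterminated in an XHTML document. This includes commonly unterminated HTML tags, such as the break tag, <br>, the horizontal rule, <hr>, and the paragraph tag, <p>. XHTML users must either supply an explicit closing tag, as in <p>...</p>, or, for elements that never have content, use the empty-element form, as in <br /> and <hr />. The pair <br></br> is also well-formed XML, but the <br /> form, with a space before the slash, is the one that older HTML browsers handle most reliably.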
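
The termination rule can be checked with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed is hypothetical, and it tests well-formedness only, not validity against an XHTML DTD.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

# HTML-style break: <br> is opened but never closed.
unterminated = "<p>line one<br>line two</p>"
# Empty-element form.
empty_element = "<p>line one<br />line two</p>"
# Explicit open/close pair: well-formed, though less friendly to old HTML browsers.
explicit_pair = "<p>line one<br></br>line two</p>"

for fragment in (unterminated, empty_element, explicit_pair):
    print(fragment, "->", is_well_formed(fragment))
```

Only the first fragment fails: the parser reaches </p> while <br> is still open and reports a mismatched tag.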

Quotable arguments
Another difference between HTML and XHTML lies in how they handle attributes, or arguments to elements. With HTML, it has become common practice to use any of three syntaxes for specifying attribute values. Sometimes, the values are double-quoted, sometimes they're single-quoted, and sometimes they aren't quoted at all:
<body bgcolor="#FF0000">
<script language='JavaScript'>
<table width=640>

This kind of flexibility is not permitted with XHTML. Every attribute value must be quoted, even a numeric one. XML accepts either double or single quotes, so only the unquoted form is an error; double quotes are the usual convention.
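
As a hedged illustration, the three quoting styles can be run through an XML parser. The sketch uses Python's standard xml.etree.ElementTree module; the helper name is_well_formed is hypothetical, and closing tags have been added to the fragments so that quoting is the only variable being tested.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

double_quoted = '<table width="640"></table>'
single_quoted = "<table width='640'></table>"
unquoted = "<table width=640></table>"

for fragment in (double_quoted, single_quoted, unquoted):
    print(fragment, "->", is_well_formed(fragment))

# Both quoting styles produce the same attribute value.
print(ET.fromstring(double_quoted).get("width") == ET.fromstring(single_quoted).get("width"))
```

The parser accepts both quoted forms and rejects only the unquoted one.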

XHTML users also can't have rogue, or orphaned, attributes, that is, attribute names that appear without a value. This shorthand is called attribute minimization, and XML does not support it. A common HTML example is when you use a form to display check boxes and want to indicate which boxes should be checked:
<input type="checkbox" checked>

With XHTML, the checked attribute should be specified like this:
<input type="checkbox" checked="checked">
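
The same kind of check shows the difference between the two forms. This is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed is hypothetical, and the input elements are written in empty-element form so that the attribute is the only thing under test.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

# Minimized attribute: a name with no value.
minimized = '<input type="checkbox" checked />'
# Expanded attribute: the value repeats the attribute name.
expanded = '<input type="checkbox" checked="checked" />'

print(is_well_formed(minimized))
print(is_well_formed(expanded))

# The expanded form is an ordinary name/value pair.
print(ET.fromstring(expanded).attrib)
```

The minimized form is rejected because an XML attribute must always be a name followed by a quoted value.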

A change for the better
XHTML is the wave of the future for describing Web content. It provides a more robust and standardized HTML via a user-friendly XML grammar. Despite the differences that exist between HTML and XHTML, most users will fall easily into using XHTML.
