HTML 5: A change in course... straight for the iceberg

Justin James explains why the recently released working draft of HTML 5 is the worst specification he's ever read.

The W3C recently released a working draft specification for HTML 5. In its current iteration, this is the worst specification I have ever read. In comparison to HTML 5, I take back everything that I said about SMTP -- at least SMTP knows its role and sticks to it. HTML 5 is a 15-year leap backwards; it's also a transparent attempt to pander to the current tool vendors, in a format that is so incredibly "right here and now" that it is not a durable standard to carry us through to the future.

HTML 4 had an obvious and clear goal. The HTML standard started out extremely screen-specific and lacking in styling ability. Tags such as <i> (italic) and <b> (bold) were specifically presentation items that carried no context. In other words, "Is that phrase italicized for emphasis, or because it is the title of a magazine?" The HTML standard had not kept up with the pace of browser development, and Netscape and Internet Explorer had introduced a variety of new tags and features that were helpful to Web developers and designers. HTML 4 pushed hard to make the HTML specification a standard that provided context for text, with the assistance of a dash of non-text resources such as images. It was up to CSS to transform the context into presentation. As a result, Web designers suddenly had much more granular control over design; Web developers had a way of extracting real meaning from HTML; and users could view HTML on a wide variety of devices and systems, and those viewers could render the pages (with or without CSS) as appropriate for that scenario. (The W3C has provided a helpful document that compares HTML 5 to HTML 4 and XHTML 1.)
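
As a rough illustration of what "extracting real meaning from HTML" can look like, here is a minimal Python sketch using only the standard library's html.parser module. The class name, the helper function, and the sample sentences are all made up for this example; it is one possible way a non-browser consumer (a search engine, say) might read the markup, not a description of how any real tool works.

```python
from html.parser import HTMLParser


class SemanticCollector(HTMLParser):
    """Group text fragments by the innermost enclosing tag of interest."""

    # Hypothetical choice of tags for this illustration only.
    TRACKED = {"i", "em", "cite"}

    def __init__(self):
        super().__init__()
        self._stack = []  # tracked tags that are currently open
        self.found = {}   # tag name -> list of text fragments

    def handle_starttag(self, tag, attrs):
        if tag in self.TRACKED:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        # Only record text that sits inside one of the tracked tags.
        if self._stack and data.strip():
            self.found.setdefault(self._stack[-1], []).append(data.strip())


def collect(markup):
    """Parse a fragment and return the text grouped by tracked tag."""
    parser = SemanticCollector()
    parser.feed(markup)
    parser.close()
    return parser.found


# Same sentence, marked up two ways (sample text invented for the sketch).
presentational = "<p>I <i>really</i> enjoyed <i>Wired</i> this month.</p>"
semantic = "<p>I <em>really</em> enjoyed <cite>Wired</cite> this month.</p>"

print(collect(presentational))
print(collect(semantic))
```

With the presentational markup, both phrases land in the same "i" bucket and a program cannot tell the emphasis from the magazine title. With the contextual markup, they separate into "em" and "cite" with no guessing required.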

HTML 5 takes this smart direction, locks it in a warehouse full of gasoline and ball bearings, and throws a match inside. Instead of taking a bold stand against the tool vendors, framework providers, and other vested interests and furthering the cause of device and software independence, the W3C decides to wed the HTML standard to the current trends in Web development.

The biggest example of this is in section 1.1.3, "Relationship to XUL, Flash, Silverlight, and other proprietary UI languages": "This specification is independent of the various proprietary UI languages that various vendors provide. As an open, vendor-neutral language, HTML provides for a solution to the same problems without the risk of vendor lock-in." Since when is HTML supposed to be a "solution" for the issues that those technologies try to solve? Do we really want HTML to become difficult for the disabled, hard for search engines to parse, and impossible to print? HTML 4 finally provided a good mechanism for dealing with issues like printing, and how to display content on devices that do not fit standard desktop monitor sizes. HTML 5 works hard to undo that progress.

Another good example of the sudden browser-intensive focus of HTML 5 is in the dependency disclaimer: "This specification does not require support of any particular network transport protocols, style sheet language, scripting language, or any of the DOM and WebAPI specifications beyond those described above. However, the language described by this specification is biased towards CSS as the styling language, ECMAScript as the scripting language, and HTTP as the network protocol, and several features assume that those languages and protocols are in use." So, what happens when I view HTML 5 that is stored as a file on a network server? What happens to systems that do not use ECMAScript (e.g., my printer) or a locked down Web browser or a search engine? Should the page fail to render properly, or should it just stop working?

To say that HTML 5 is trying to displace things like Flash and Silverlight and make AJAX the dogmatically pure method of client-side interactivity is putting it mildly. HTML 5 introduces a slew of new elements that play right into the hands of vendors looking to give us even more kludgy tools. Instead of writing hacked-up ECMAScript to perform client-side validation (backed by server-side validation), we now have methods to "enforce" validation on the client side. I'm sure that hackers will ensure their scripts follow the input restrictions; meanwhile, less savvy developers will think their code is protected from bad input. I bet Sun and Microsoft are champing at the bit to introduce new versions of J2EE and ASP.NET that specifically generate the new HTML 5 "datagrid" element, or that emit the new attributes for the "input" element that let the developer specify if a text box is for an e-mail address, date/time, and so on. We don't need a version of HTML that is tightly bound to both client-side scripting and server-side programming. And can anyone explain to me why HTML 5 has APIs at all, and specifically why it has APIs for persistent storage, a "draggable" attribute for use with a drag-and-drop API, and APIs that address something as specific as the "Back" button?
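
The validation point is worth making concrete. Below is a minimal Python sketch of a server-side check that runs regardless of what the browser claims to have enforced. Everything in it is an assumption made for illustration: the field names "email" and "when", the deliberately loose e-mail pattern, and the date/time format are hypothetical and are not taken from the HTML 5 draft or from any framework.

```python
import re
from datetime import datetime

# Deliberately simple pattern for illustration; not a full address grammar.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate_submission(form):
    """Return a dict of field -> error message; an empty dict means it passed."""
    errors = {}

    # Re-check the e-mail field even if the client says it already did.
    email = form.get("email", "")
    if not EMAIL_RE.fullmatch(email):
        errors["email"] = "not a plausible e-mail address"

    # Re-check the date/time field (format assumed for this sketch).
    try:
        datetime.strptime(form.get("when", ""), "%Y-%m-%dT%H:%M")
    except ValueError:
        errors["when"] = "expected YYYY-MM-DDTHH:MM"

    return errors


# What a well-behaved browser would send after honoring the input types.
honest = {"email": "reader@example.com", "when": "2008-02-01T09:30"}

# What a script can send directly, skipping the browser entirely.
forged = {"email": "'; DROP TABLE users; --", "when": "yesterday"}

print(validate_submission(honest))
print(validate_submission(forged))
```

The forged submission never passes through a browser, so any restriction declared on the "input" element is simply absent from its path. Only the server-side check stands between it and the application, which is why client-side "enforcement" cannot replace it.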

I know why the writers felt these things were needed, but they are not necessary in the HTML specification. The writers made efforts to address the weaknesses of the currently in-vogue AJAX methodology (which exists to address the weaknesses of the HTML-as-an-application-framework concept). The HTML standard is compensating for a problem with the things used to compensate for HTML's perceived "problems." The "problem" with HTML is not that it is lousy as a method for defining GUIs -- it's that people are using HTML to do what Flash, Silverlight, XUL, etc. are for. This is sheer idiocy on the part of the W3C. The HTML standard is a document standard -- not an application framework standard.

There are some good items in HTML 5. The draft continues to reverse the frameset mess (iframes still exist, which is fine with me). There is a new tag for "audio" and one for "video", which makes more sense than having to embed a reference to a media handler object or a Flash player; this is something that I have supported for a while. But these few bright spots are simply not enough to redeem the draft.

The only people and institutions I see benefiting from this draft of HTML 5 are tool vendors. For years, the tool market has been fairly stagnant when it comes to HTML editors. The real focus has been on a multitude of frameworks to do things like compile static, server-side code into client-side JavaScript, work with the client with AJAX frameworks, and so on. These have been nearly impossible to bake into the HTML editors because all of their functionality was outside of the spec. But with the HTML 5 draft, I am sure that tool vendors are drooling at the opportunity to make "new and improved" HTML 5-compliant editors that combine their proprietary backend of choice and their proprietary client-side framework of choice and use the new HTML 5 features as a bridge between the two.

Am I being paranoid about the motivations behind the HTML 5 draft? Probably. The stated reason for the direction of the changes is to bring HTML 5 in line with the current state of the art. This is the entire problem. I think the current state of the art is an unwieldy kludge, and bringing the HTML standard into collusion with it is simply aiding and abetting.

The final irony in all of this is that the HTML 5 draft makes the vision of Web applications displacing desktop applications all the more likely. After all, the inclusion of Flash/Silverlight/XUL-like UI functionality in the HTML spec itself can do nothing but help spread Web applications. It's a shame that the true goals and principles of HTML have fallen by the wayside. This UI-centric draft is ripe for vendor-specific add-ons, extensions, and hacks -- this will send us back to the bad old days of "this site best viewed with..." logos. HTML 4 has been great for the industry because it gave HTML writers time to learn it, let the tools catch up, and gave the browsers a chance to become extremely compliant (some better than others).

The current HTML 5 is the wrong draft at the wrong time. I hope the W3C wakes up before it's too late and redrafts HTML 5 to continue the good work of HTML 4 instead of trying to become a "me too" attempt at UI abstraction.