HTTP and HTML: The paradox of dominance

Justin James says that the paradox of the dominance of HTML and HTTP is that nothing better suited to the Web development task is likely to emerge, even as HTML and HTTP struggle to adapt to that task.

The saying, "When all you have is a hammer, every problem looks like a nail," makes me think of the mess that we're in when it comes to the dominance of HTML and HTTP.

I tend to be down on the concept of Web applications -- especially those that make use of AJAX -- but they exist for a number of valid reasons. Nonetheless, the paradox of the dominance of HTML and HTTP is that nothing better suited to the Web development task is likely to emerge, even as HTML and HTTP struggle to adapt to that task.

To get a better understanding of the problem, think back to the era between 1990 and about 1997/1998. By 1990, the client/server revolution was in full swing. Novell NetWare dominated the non-UNIX server world, and UNIX was in the process of pushing mainframes out of the way. In the middle of all of this, some developers wanted a better way to publish documents, and HTML and its companion protocol, HTTP, were born. While the two are not inseparably linked (you can view HTML that is not transmitted over HTTP), HTTP was custom-tailored to meet the needs of HTML at the time, where FTP (and other existing protocols) would have been a bit too heavy-handed or sometimes inadequate. HTML had some basic GUI widgets, and the CGI system was cobbled together to handle those widgets. Most Web sites were static HTML with the occasional CGI program mixed in.

At the time, there were some needs that the client/server architecture did not handle very well. For instance, the architecture required a level of connection quality and reliability that was not available to most users who were not on a LAN. It also had high administrative overhead, since the applications were installed on the clients and then configured to communicate with a central server. It was very painful to troubleshoot those connections; it involved trapping and analyzing packets.

Sometime around 1997 or 1998, developers all over the world realized that a lot more could be done with HTML's widgets than the shopping carts and BBS replacements that had passed for Web applications until then. Netscape added JavaScript to the browser, which allowed developers to add enough responsiveness to the user's actions that HTML no longer felt like a pure document format. Even more importantly, Netscape added cookies to the browser, allowing developers to compensate for HTTP's connectionless, stateless nature.
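To make the cookie point concrete, here is a minimal sketch of how a cookie lets a server recognize a returning browser across otherwise unrelated requests. It is only an illustration under stated assumptions: the names `SESSIONS`, `handle_request`, and `session_id` are hypothetical, the session store is a plain in-memory dictionary, and no real Web server is involved.

```python
from http.cookies import SimpleCookie
import secrets

# Hypothetical in-memory session store: session id -> per-user state.
SESSIONS = {}

def handle_request(cookie_header):
    """Handle one stateless request; return (response headers, visit count)."""
    cookie = SimpleCookie(cookie_header or "")
    morsel = cookie.get("session_id")
    session_id = morsel.value if morsel else None
    headers = {}
    if session_id not in SESSIONS:
        # First visit (or unknown id): create a session and ask the
        # browser to send the id back on every later request.
        session_id = secrets.token_hex(16)
        SESSIONS[session_id] = {"visits": 0}
        out = SimpleCookie()
        out["session_id"] = session_id
        headers["Set-Cookie"] = out["session_id"].OutputString()
    SESSIONS[session_id]["visits"] += 1
    return headers, SESSIONS[session_id]["visits"]

# Two independent requests; only the cookie ties them together.
first_headers, first_visits = handle_request(None)
_, second_visits = handle_request(first_headers["Set-Cookie"])
```

Nothing in HTTP itself links the second request to the first. The server only "remembers" the user because the browser volunteers the identifier it was handed earlier.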

Let's fast forward 10 years to when Web applications dominate mindshare -- even though they don't penetrate the market. Look at the buzz about Google Docs and Office Live compared to the sales numbers of Microsoft Office. Office gets zero buzz, yet for every user of a Web office suite, there are probably 100,000 Office users (I'm just throwing out a number). The assumption that everything will be a Web application comes from the fact that no one sees any other way to fill the needs that HTML and HTTP address -- even when they do it poorly.

At this point, coming up with a better alternative would be a pretty tough sell. I've described my ideal alternative, and every programmer I have talked to (as well as many networking and systems engineers) envisions a similar alternative. This is hardly a scientific sample, but even in discussions in this Programming and Development blog, it is pretty rare to find someone who thinks that Web applications are really what we need. We all agree that the needs that Web applications try to address are real and that Web applications actually meet them rather poorly.

I feel that something like the X Window System would be a much better way to fill the gap that Web applications currently fill: distributed access to applications with a single point of storage and installation. Other people I have talked to have suggested something like Windows Terminal Services (Citrix), VNC, or other remote display technologies.

Network engineers now design their entire networks around HTTP carrying (primarily) HTML traffic. After all, HTTP and e-mail are the main attack vectors for viruses, spyware, etc. In addition, the bulk of Internet bytes are HTML (and resources referred to by HTML documents) being fetched over HTTP. Even for remote application and data uses, HTTP carrying XML (which is close enough to HTML to be handled the same way by many programmers) is now the favored vehicle. As much as systems engineers and network engineers hate having to stay on top of the HTTP traffic, they are even less likely to let a new (or old) protocol through for remote applications. For one thing, as dense as some of that HTML (and JSON and XML) can be, it is fairly easy to detect viruses and inappropriate usage in it. With a protocol that carries a true remote terminal session, the IT team loses its ability to monitor and log what users are doing. It's pretty easy to keep a list of URLs visited; it is a lot harder to store every mouse click and screen update that a remote terminal system would put through.
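As a rough illustration of why URL logging is so easy, the sketch below rebuilds the requested URL from the plain-text head of an HTTP/1.x request. It assumes unencrypted traffic that a proxy or monitor can read; the function name `url_from_request`, the `visited` list, and the sample request are all made up for the example.

```python
def url_from_request(raw_request: str) -> str:
    """Reconstruct the requested URL from a plain-text HTTP/1.x request head."""
    lines = raw_request.split("\r\n")
    # Request line looks like: GET /path HTTP/1.1
    _method, target, _version = lines[0].split(" ", 2)
    host = ""
    for line in lines[1:]:
        if not line:
            break  # a blank line ends the headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip()
    return f"http://{host}{target}"

# Hypothetical log of visited URLs built from one captured request.
visited = []
raw = "GET /docs/index.html HTTP/1.1\r\nHost: example.com\r\nUser-Agent: demo\r\n\r\n"
visited.append(url_from_request(raw))
```

A remote display protocol offers no comparable hook: there is no single readable line per user action, only a stream of input events and screen updates.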

Now we have an IT environment that has become a victim of its own success. It is highly unlikely that IT departments will be willing to let anything more complex than HTTP through now that it looks like "anything can be done with HTML and HTTP." Meanwhile, HTTP has gone far too long without changing, and HTML is struggling to adapt to how people are using it.

It is a pretty sad state of affairs that no one ever intended to happen. Back when HTML 4 was standardized, the talk was all about the Semantic Web and not the "Programmatic Web." It was obviously a push to move HTML closer to being a document standard and to allow applications other than Web browsers to consume it. HTML 5 is a push to incorporate the best efforts of Web developers into the standards, so that at least the problems with Web browser incompatibility can be resolved.

While I disagree with this as a direction, I completely understand the motivation behind it. Programmers, not just of Web applications, but also of tools, browsers, and so on, all need these techniques to be standard somewhere. We all agree on that -- but we disagree on where. A lot of programmers choose HTTP and HTML as this place, simply because they are protocols that get through the firewall. Other programmers look to HTTP and HTML simply by default due to inertia in the industry.

It seems the success of the HTTP/HTML combination will continue, to the point of nearly pushing traditional client/server techniques out of the way. This replacement will occur regardless of the technical merits of client/server methods, the shortcomings of HTTP/HTML, or the existence or future creation of better alternatives. In other words, the stunning success of HTTP and HTML is guaranteeing a mediocre future for application development.