What To Do About The Sorry State Of Web Development
A commenter on my previous article (The Sorry State Of Web Development) made a good point: I put out a lot of negativity without offering anything constructive in return. Well, I'm going to rectify that mistake.
Here is what I think needs to be done to improve the Web, as far as programming goes. I admit, much of it is rather unrealistic considering how much inertia the current way of doing things already has. But just as Microsoft (eventually) threw off the anchor of the 640 KB barrier for legacy code, we need to throw off the albatrosses around the neck of Web development.
HTTP is fine, but there needs to be a helper (or replacement) protocol. When HTTP was designed, no one had in mind that anything but a connectionless, stateless protocol would be needed. Too many people are layering stateful systems that need to maintain concurrency or two-way conversations on top of HTTP. This is madness. These applications (AJAX applications in particular) would be much better served by something along the lines of telnet, which is designed to maintain a single, authenticated connection over the course of a two-way conversation.
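To make the idea of a single two-way conversation concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the message vocabulary (LOGIN, OK, NOTIFY, GET) is made up, and a local socket pair stands in for a real network connection. The point is that both ends keep one connection open, and the server side can speak without being asked first.

```python
import socket

# One connected pair of sockets stands in for a browser and a server
# sharing a single long-lived connection (an assumption for the demo).
client_sock, server_sock = socket.socketpair()
client = client_sock.makefile("rw", encoding="utf-8", newline="\n")
server = server_sock.makefile("rw", encoding="utf-8", newline="\n")


def send(stream, message):
    """Write one newline-terminated message and push it onto the wire."""
    stream.write(message + "\n")
    stream.flush()


def receive(stream):
    """Read one newline-terminated message."""
    return stream.readline().rstrip("\n")


transcript = []

# The client authenticates once; the connection itself carries the state.
send(client, "LOGIN alice")
transcript.append(receive(server))
send(server, "OK session open")
transcript.append(receive(client))

# The server pushes an update without waiting for a request.
send(server, "NOTIFY record 42 changed")
transcript.append(receive(client))

# The client follows up on the same connection, with no new login.
send(client, "GET record 42")
transcript.append(receive(server))

for stream in (client, server):
    stream.close()
client_sock.close()
server_sock.close()
```

Nothing here has to re-establish who the user is on each message, which is exactly the bookkeeping that stateful systems end up bolting onto HTTP.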
HTML is a decent standard, but unfortunately, its implementation is rarely standard. Yeah, I know Firefox is great at it, but its penetration still "isn't there" yet. More importantly, while being extremely standards compliant, it is still just as tolerant of non-standard code as Internet Explorer is. If Internet Explorer and Firefox simply started rejecting non-standard HTML code, there is no way that a web developer could put out this junk code, because their customer or boss would not even be able to look at it. Why am I so big on HTML compliance? Because the less compliant HTML code is, the more difficult it is to write systems that consume it. Innovation is difficult when, instead of being able to rely upon a standard, you need to take into account a thousand potential permutations of that standard. This is my major beef with RSS; it allows all sorts of shenanigans on the content producer's end of things, to make it "easy" for the code writers, which makes it extraordinarily difficult to consume it in a reliable way.
When developers are allowed to write code that adheres to no standard, or a very loose one, the content loses all meaning. An RSS feed (or HTML feed) that is poorly formed has no context, and therefore no meaning. All the client software can do is parse it like HTML and hope for the best.
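The difference between a strict consumer and a tolerant one can be sketched in a few lines of Python. This is only an illustration, not how any browser works: the strict side uses the standard library XML parser, which refuses markup that is not well-formed, and the tolerant side uses the standard library HTML parser, which accepts whatever it is given. The function names and the sample markup are made up for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


def parse_strict(markup):
    """Return the root element, or None if the markup is not well-formed."""
    try:
        return ET.fromstring(markup)
    except ET.ParseError:
        return None


class TagCollector(HTMLParser):
    """Tolerant parser that just records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_tolerant(markup):
    """Return the start tags found, never complaining about bad markup."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


well_formed = "<p>Hello, <b>world</b></p>"
sloppy = "<p>Hello, <b>world</p>"  # the <b> element is never closed

strict_ok = parse_strict(well_formed) is not None
strict_rejects_sloppy = parse_strict(sloppy) is None
tolerant_tags = parse_tolerant(sloppy)
```

The strict parser gives the author of the sloppy markup an immediate failure to fix. The tolerant one hands every downstream consumer the job of guessing what was meant.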
The client-side scripting also needs the ability to open a direct data connection to the server. Why does an AJAX application need to format a request in HTTP POST format, send it to an application server which does a ton of work to interpret the request, pass it to an interpreter or compiled code, which then opens a database connection, transforms the results into XML, and then passes it back over the sloppy HTTP protocol? Wouldn't it be infinitely better for the client to simply get a direct read-only connection to the database via ODBC, named pipes, TCP/IP, or something similar? If we're going to use the web as a form of distributed processing, with the code managed centrally on the server, this makes a lot more sense than the way we're doing things now.
XML needs to be dropped, except in appropriate situations (for example, tree data structures, or where two systems from different sources that were not designed to work together need to work together). Build native data transfer methods into our client-side scripting that make use of compression and of delimited and fixed-width formats for "rectangular" data sets (XML is good for tree structures, and wasteful for rectangular data), preferably with the format automatically negotiated between the client and the server, and we're talking massive increases in client-side speed and server-side scalability. This would only add a few hours of development time on the server side, and would pay dividends for everyone involved.
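To get a feel for the overhead, here is a rough Python sketch that serializes the same rectangular data set as XML and as a delimited (CSV) file, then compresses both. The sample rows, field names, and element names are all made up for illustration, and the exact numbers will depend on the data, so treat it as a way to measure rather than as a benchmark.

```python
import csv
import gzip
import io
import xml.etree.ElementTree as ET

# Made-up rectangular data: every row has the same three columns.
FIELDS = ["id", "name", "city"]
rows = [{"id": i, "name": f"user{i}", "city": "Springfield"} for i in range(1000)]


def to_xml(rows):
    """Serialize rows as one <row> element per record, one child per column."""
    root = ET.Element("rows")
    for row in rows:
        element = ET.SubElement(root, "row")
        for field in FIELDS:
            ET.SubElement(element, field).text = str(row[field])
    return ET.tostring(root, encoding="utf-8")


def to_csv(rows):
    """Serialize rows as a header line followed by one delimited line each."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


xml_bytes = to_xml(rows)
csv_bytes = to_csv(rows)

sizes = {
    "xml": len(xml_bytes),
    "csv": len(csv_bytes),
    "xml_gzip": len(gzip.compress(xml_bytes)),
    "csv_gzip": len(gzip.compress(csv_bytes)),
}

for label, size in sizes.items():
    print(f"{label:>8}: {size} bytes")
```

In XML the column names are repeated as tags around every single value, while the delimited format states them once in the header. Compression narrows the gap on the wire, but the client still has to parse all of that markup.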
The current crop of application servers stink, plain and simple. CGI/Perl is downright painful to program in. Any of the "pre-processing" languages like ASP/ASP.Net, JSP, PHP, etc. mix code and presentation in difficult-to-write and difficult-to-debug ways. Java and .Net (as well as Perl, and the Perl-esque PHP) are perfectly acceptable languages on the backend, but the way they incorporate themselves into the client-to-server-to-client roundtrip is currently unacceptable. There is way too much overhead. Event-driven programming is nearly impossible. Ideally, software would be written with as much of the processing as possible done on the client, with the server only being accessed for data retrieval and updates.
The application server would also be able to record extremely granular information about the user's session, for usability purposes (what path did the user follow through the site? Are users using the drop-down menu or the static links to navigate? Are users doing a lot of paging through long data sets? And so on). Furthermore, the application server needs to have SNMP communications built right into it. You can throw off all the errors you want to a log, but it would be a lot better if someone was notified immediately when, for example, a particular function kept failing. Any exception that occurs more than, say, 10% of the time needs to be immediately flagged, and maybe even cause an automatic rollback (see below) to a previous version so that the users can keep working while the development team fixes the problem.
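The "more than 10% of the time" rule is simple enough to sketch. The Python below is a hypothetical illustration, not a feature of any real application server: the class name, the function names, and the minimum-call guard are my own assumptions. It only counts calls and failures per function and reports which ones have crossed the threshold. Sending the SNMP notification or triggering the rollback would hang off that report.

```python
from collections import defaultdict


class ErrorRateMonitor:
    """Track per-function failure rates and flag those above a threshold."""

    def __init__(self, threshold=0.10, min_calls=20):
        self.threshold = threshold  # flag when failures / calls exceeds this
        self.min_calls = min_calls  # assumed guard against tiny sample sizes
        self.calls = defaultdict(int)
        self.failures = defaultdict(int)

    def record(self, function_name, failed):
        """Record one call to function_name and whether it raised."""
        self.calls[function_name] += 1
        if failed:
            self.failures[function_name] += 1

    def failure_rate(self, function_name):
        """Return failures / calls, or 0.0 if the function was never called."""
        calls = self.calls.get(function_name, 0)
        if calls == 0:
            return 0.0
        return self.failures.get(function_name, 0) / calls

    def flagged(self):
        """Return the names of functions failing more often than the threshold."""
        return sorted(
            name
            for name, calls in self.calls.items()
            if calls >= self.min_calls and self.failure_rate(name) > self.threshold
        )


monitor = ErrorRateMonitor()
for i in range(100):
    monitor.record("save_order", failed=(i % 5 == 0))  # fails 20 times in 100
    monitor.record("load_menu", failed=(i == 0))       # fails once in 100

needs_attention = monitor.flagged()
```

With these made-up numbers only save_order is flagged, since load_menu fails far less than one time in ten.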
The presentation layer needs to do a lot of what Flash does, and make it native. Vector graphics processing, for example. It also needs a sandboxed, local storage mechanism where data can be cached (for example, the values of drop down boxes, or "quick saves" of works in progress). This sandbox has to be understood by the OS to never have anything executable or trusted within it, for security, and only the web browser (and a few select system utilities) should be allowed to read/write to it.
Tableless CSS design (or something similar) needs to become the norm. This way, client-side code can determine which layout system to use based upon the intended display system (standard computer, mobile device, printer, file, etc.). In other words, the client should be getting two different items: the content itself, and a template or guide for displaying it based upon how it is intended to be used. Heck, this could wipe out RSS as a separate standard; the consuming software could just display the content however it feels like, based upon the application's needs. This will also greatly assist search engines in being able to accurately understand your website. The difference (to a search engine) between static and dynamic content needs to be eradicated.
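Here is a small Python sketch of "content plus a template chosen by the display system". Everything in it is hypothetical: the medium names, the templates, and the sample content are made up, and a real client would do this with stylesheets rather than string templates. It just shows that the content is sent once and the presentation is picked at the last moment.

```python
from string import Template

# The content travels on its own, with no layout baked in (made-up sample).
content = {
    "title": "Quarterly Report",
    "body": "Sales rose in every region.",
}

# One template per intended display system (names are assumptions).
TEMPLATES = {
    "screen": Template("<h1>$title</h1>\n<p>$body</p>"),
    "mobile": Template("<h2>$title</h2>\n<p>$body</p>"),
    "print": Template("$title\n\n$body"),
}


def render(content, medium):
    """Fill the template for the given medium, defaulting to the screen layout."""
    template = TEMPLATES.get(medium, TEMPLATES["screen"])
    return template.substitute(content)


screen_view = render(content, "screen")
print_view = render(content, "print")
fallback_view = render(content, "smart-fridge")  # unknown medium
```

A feed reader or a search engine could take the same content dictionary and ignore the templates entirely, which is the sense in which a separate syndication format becomes unnecessary.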
URLs need to be cleaned up so that bookmarks and search results return the same thing to everyone. It is way too frustrating to get a link from someone that gives you a "session timeout" error or a "you need to login first" message, and it significantly hurts the website's usability. I actually like the way Ruby on Rails handles this end of things. It works well, from what I can see.
The development tools need to work better with the application servers and design tools. The graphics designers need to see how feasible their vision will be to implement in code. The graphics designers will also be able to see how their ideas and designs hold up on different displays; if they can see, up front, how the banner they want at the top may look great on their monitor but not look good on a wider or narrower display, things will get better. All too often, I see a design that simply does not work well at a different resolution than the one it was aimed at (particularly when you see a fixed-width page that wastes half the screen when your resolution is higher than 800x600).
Hopefully, these tools will also be able to make design recommendations based upon usability engineering. It would be even sweeter if you could pick a "school" of design thought (for example, the "Jakob Nielsen engine" would always get on your case for small fonts or grey-on-black text).
These design tools would be completely integrated with the development process, so as the designer updates the layout, the coder sees the updates. Right now, the way things are being done, with a graphic designer doing things in Illustrator or Photoshop, slicing it up, and passing it to a developer who attempts to transform it into HTML that resembles what the designer did, is just ridiculous. The tools need to come together, and be at one with each other. Even the current "integrated tools" like Dreamweaver are total junk. It is sad that after ten years of "progress", most web development is still being done in Notepad, vi, emacs, and so forth. That is a gross indictment of the quality of the tools out there.
The development tools need a better connection to the application server. FTP, NFS, SMB, etc. just do not cut it. The application server needs things like version control baked in. Currently, when a system seems to work well in the test lab but problems crop up once it is pushed to production, rolling back is a nightmare. It does not have to be this way. Windows lets me roll back with a system restore, or uninstall a hot-fix/patch. The Web deployment process needs to work the same way. It can even use FTP or whatever as the way you connect to it, if the server invisibly re-interprets that upload and puts it into the system. Heck, it can display "files" (actually the output of the dynamic system) and let you upload and download them, all invisibly, the same way a document management system does. This system would, of course, automatically add the updated content to the search index, site map, etc. In an ideal world, the publishing system could examine existing code and recode it to the new system. For example, it would see that 90% of the HTML code is the same for every static page (the layout) with only the text in a certain part changing, and take those text portions, put them in the database as content, and strip away the layout. This would rock my world.
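The layout-stripping idea can be sketched naively in Python. This is a toy under loud assumptions: it treats whatever text every page shares at the start and at the end as "the layout" and whatever is left in the middle as "the content", it needs at least two pages to compare, and the sample pages are made up. A real publishing system would have to understand the markup rather than compare raw characters.

```python
import os


def split_layout(pages):
    """Split pages into (header, footer, contents).

    header is the text every page starts with, footer is the text every
    page ends with, and contents holds the differing middle of each page.
    """
    # os.path.commonprefix compares strings character by character.
    header = os.path.commonprefix(pages)
    footer = os.path.commonprefix([page[::-1] for page in pages])[::-1]

    # Keep header and footer from overlapping on the shortest page.
    shortest = min(len(page) for page in pages)
    overlap = len(header) + len(footer) - shortest
    if overlap > 0:
        footer = footer[overlap:]

    contents = [page[len(header):len(page) - len(footer)] for page in pages]
    return header, footer, contents


pages = [
    "<html><body><div id='nav'>Menu</div><p>About us</p>"
    "<div id='foot'>(c)</div></body></html>",
    "<html><body><div id='nav'>Menu</div><p>Contact</p>"
    "<div id='foot'>(c)</div></body></html>",
]

header, footer, contents = split_layout(pages)
```

The contents list is what would go into the database, while the header and footer become the one shared template. Note the limitation: if two pages' text happened to start or end with the same characters, those characters would be swallowed into the layout.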
What does all of this add up to? It adds up to a complete revolution on the Web in terms of how we do things. It takes the best ideas from AJAX, Ruby on Rails, the .Net Framework, content management systems, WebDAV, version control, document management, groupware, and IDEs and adds them all into one glorious package. A lot of the groundwork is almost there, and can be laid on top of the existing technology, albeit in a hackish and kludged way. There is no reason, for example, for SNMP monitoring, version control, or document management not to be built into the application server. The system that I describe would almost entirely eliminate CMSs as a piece of add-on functionality. The design/develop/test/deploy/evaluate cycle would be shortened significantly. And the users would suffer much less punishment.
So why can't we do this, aside from entrenched ideas and existing investment in existing systems? I have no idea.
Justin James is the Lead Architect for Conigent.