What do the numbers in status code 404 signify under the formal HTTP spec?
The time has come to celebrate yet another of those all-too-unrecognized geek-centric holidays (which I may have just made up): 404 Day! Every April 4th, Web surfers of every persuasion should take time out to celebrate that one universal experience of all Internet consumers and professionals -- the 404 Page Not Found error. No matter which sites you frequent, which ISP you use, or which end of the browser zealot spectrum you fall on, we've all had our share of 404s.
So, where did the 404 come from (besides the server, of course)? Like pretty much everything World Wide Web-related, the 404 is an official component of the Hypertext Transfer Protocol (HTTP) specification ratified by the World Wide Web Consortium (W3C).
It first appeared in the version 0.9 HTTP spec, adopted in 1992. If you track down that document, you'll notice a rather telling signature: TimBL. That's the byline of one Tim Berners-Lee, he of the "I invented the World Wide Web and the first Web browser" fame. The same guy who made the modern Web page possible also invented the Page Not Found.
Genius though he was, Berners-Lee didn't spin the HTTP status codes out of whole cloth but based them on the preexisting File Transfer Protocol (FTP) status codes. If you compare the two code listings, you'll find only 10 overlapping codes: 100, 200, 202, 425, 426, 500, 501, 502, 503, and 504.
Only 100 and 200 have similar meanings under both standards -- Continue and OK, respectively -- so it's clear Berners-Lee didn't copy FTP into HTTP. For the record, there is no code 404 in FTP, so that infamous error message is original to the Hypertext Transfer Protocol by way of TimBL.
Rumor has it that, whether or not Berners-Lee suspected that code 404 would become famous by virtue of link rot and lazy sysadmins, he intended that particular numeric to include a sly inside joke. You see, the HTTP status code system bears a striking resemblance to the CERN laboratory building numbering system. CERN, the Swiss techno-mecca, is the birthplace of the World Wide Web, leading some to infer that code 404 is a subtle reference to room 404 at CERN.
The only problem with that theory -- or, rather, that urban legend -- is that there is no room 404 at CERN, and there never has been. The real meaning and origin of the 404 code is far more mundane, with each digit having a specific significance.
What do the numbers signify in the famous code 404 Page Not Found error, according to the HTTP status code specification from the World Wide Web Consortium?
In simplest terms, HTTP code 404 means Page Not Found (duh). But specifically, HTTP 404 is actually two phrases: Client Error and Not Found. The first 4 is the error class (Client Error), and 04 is the specific error (Not Found).
By design, the HTTP code spec is extensible to as many as 100 codes per class (00 through 99). There are five recognized classes, quoted below from the current HTTP spec:
- 1xx: Informational: Request received, continuing process.
- 2xx: Success: The action was successfully received, understood, and accepted.
- 3xx: Redirection: Further action must be taken in order to complete the request.
- 4xx: Client Error: The request contains bad syntax or cannot be fulfilled.
- 5xx: Server Error: The server failed to fulfill an apparently valid request.
Within each class, the last two digits identify a specific status (an error, in the case of the 4xx and 5xx classes). There are 53 specific codes recognized by the W3C, and most software recognizes these codes as written.
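For the hands-on crowd, here is a minimal Python sketch of that digit-splitting logic. The class names come straight from the list above; the function name `split_status` and the 100-599 range check are my own assumptions for illustration, not anything the spec dictates.

```python
from typing import Tuple

# Class names as quoted in the five-item list above.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def split_status(code: int) -> Tuple[int, int, str]:
    """Split a three-digit status code into (class digit, specific code, class name).

    Hypothetical helper: the name and the range check are illustrative only.
    """
    if not 100 <= code <= 599:
        raise ValueError(f"not a three-digit HTTP status code: {code}")
    # The first digit is the class; the last two digits are the specific code.
    class_digit, specific = divmod(code, 100)
    return class_digit, specific, STATUS_CLASSES[class_digit]


print(split_status(404))  # (4, 4, 'Client Error')
```

Feed it 404 and you get class 4 (Client Error) plus specific code 04, which is the "two phrases" reading described above.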
That said, no browser has to recognize the specific three-digit code; it's just a recommendation. All a W3C-compliant browser is required to do is recognize the class -- the first digit.
So, I could institute a fictional HTTP 499 code, "Your browser is a scruffy-looking nerf herder" -- there is no 499 code in the W3C spec -- and a compliant browser would have no idea what that specific code meant but would recognize it as a general client error. It would be up to a human to interpret the lame Empire Strikes Back in-joke.
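To see the fall-back-to-the-class idea in action, here is a rough Python sketch using the standard library's `http.HTTPStatus` table. The function name `interpret` is hypothetical. Mapping an unknown code onto the x00 code of its class follows the guidance in the IETF's HTTP semantics RFC rather than anything quoted in this article, so treat that part as my assumption about how a well-behaved client might act.

```python
from http import HTTPStatus


def interpret(code: int) -> str:
    """Describe a status code, falling back to its class when it is unknown.

    Hypothetical helper for illustration; real clients differ in the details.
    """
    if not 100 <= code <= 599:
        raise ValueError(f"not a three-digit HTTP status code: {code}")
    try:
        # Known codes carry a standard reason phrase in the stdlib table.
        return HTTPStatus(code).phrase
    except ValueError:
        # Unknown code: keep only the first digit and treat it like x00.
        fallback = HTTPStatus((code // 100) * 100)
        return f"unrecognized {code}, treated like {fallback.value} {fallback.phrase}"


print(interpret(404))
print(interpret(499))
```

In this sketch the made-up 499 lands on the generic 400 for its class, and the nerf herder gag is lost on the machine entirely.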
Incidentally, this does mean that on some level, the 404 error is famous simply because no one found a good reason to stray from the W3C code suggestions. That's also why the exact wording of a 404 return is up to the individual server and sysadmin, busting out anything from a Not Found to a Page Not Found to an elaborately worded Hitchhiker's Guide to the Galaxy reference. That's not just techno-humor -- that's erroneously outrageous Geek Trivia.
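A footnote for the code-inclined: because the wording is free text, software generally keys off the number and passes the phrase along untouched. Below is a small Python sketch of pulling apart a response's first line. The function name `parse_status_line` and both sample lines are invented for illustration, and the parsing is deliberately simplistic.

```python
from typing import Tuple


def parse_status_line(line: str) -> Tuple[str, int, str]:
    """Split a status line into (HTTP version, numeric code, reason phrase).

    Hypothetical helper: assumes single spaces between the three fields.
    """
    # Split on the first two spaces only, so the phrase keeps its own spaces.
    version, code, *rest = line.strip().split(" ", 2)
    reason = rest[0] if rest else ""
    return version, int(code), reason


# Two made-up servers, same code, different wording.
print(parse_status_line("HTTP/1.1 404 Not Found"))
print(parse_status_line("HTTP/1.1 404 Don't Panic"))
```

Both sample lines yield the same 404; only the human-readable text differs.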
Get ready for the Geekend
The Trivia Geek's blog has been reborn as the Geekend, an online archive of all things obscure, obtuse, and irrelevant -- unless you're a hardcore geek with a penchant for science fiction, technology, and snark. Get a daily dose of subcultural illumination by joining the seven-day Geekend.
The Quibble of the Week
If you uncover a questionable fact or debatable aspect of this week's Geek Trivia, just post it in the discussion area of the article. Each week, yours truly will choose the best post from the assembled masses and discuss it in a future edition of Geek Trivia.
This week's quibble comes from the March 21 edition of Geek Trivia, "The map in the moon." TechRepublic member joethejet felt I omitted a key detail about Leonardo da Vinci's lunar cartography.
"One thing you neglected to mention was that da Vinci was also wrong about the oceans reflecting light to the moon. While you didn't explicitly state that this was the case, you didn't tell us that he was wrong either. I mistakenly believed that the oceans also reflected the light until I read this: [article about the Da Vinci Glow]."
You're quite right, dear reader -- I had to omit that factoid because of space constraints. But I'm glad you could slip it in through the quibble.
It's true that da Vinci thought Earth's oceans bounced light to the moon, when the bulk of the reflected sunlight comes from Earth's cloud cover. On the whole, however, Leonardo's theories were sound. Thanks for the extra tidbit, and keep those quibbles coming.
The Trivia Geek, also known as Jay Garmon, is a former advertising copywriter and Web developer who's duped TechRepublic into underwriting his affinity for movies, sci-fi, comic books, technology, and all things geekish or subcultural.