Staff Writer, CNET News.com
BURLINGAME, Calif.—The technical wizardry behind Google's successful search engine may come down to a blindingly obvious insight: PCs crash.
On Wednesday, Urs Hoelzle, a vice president of engineering and of operations at the search giant, shed some light on how Google's data centers operate. Many people consider the company's operations expertise more valuable than the actual search algorithms that launched the enterprise.
Hoelzle spoke at EclipseCon, a conference for application programmers that runs through Thursday here.
Google has been able to build out its computing infrastructure for millions, rather than tens of millions, of dollars by buying relatively cheap machines. Looking at hardware costs, company engineers saw that purchasing a few high-end servers, each with eight or more powerful processors, costs significantly more than buying dozens of simpler "commodity" servers.
The trick is to make these racks of hardware operate in tandem and to ensure that the failure of one machine does not derail an operation, such as returning a search query or serving up an ad.
Consider a home PC, Hoelzle said. Optimistically, a consumer PC might crash once in three years from a software glitch or hardware problem.
"At Google scale...if you have thousands of PCs, you can expect one (failure) a day," he said. "So you better deal with that in an automated way, or you will have service outages."
Google, known for its rigorous hiring practices aimed at attracting the brightest minds in computer science, has created a number of software tools to handle its computing installation.
The company wrote its own file system, called Google File System, which is optimized for handling large, 64-megabyte blocks of data. Significantly, the file system was designed to assume that a failure, such as a failed disk or unplugged network cable, can happen at any time.
Data is replicated in three places, and there is a "master" machine that can locate copies of a piece of data, such as a keyword index, if the original is out of commission.
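The replication scheme can be pictured with a toy model. This is not Google File System code: the `Master` class, its method names, and the round-robin placement policy are all assumptions made for illustration. Only the three-way replication and the idea of a master that locates copies come from Hoelzle's description.

```python
REPLICAS = 3  # the talk says data is replicated in three places


class Master:
    """Toy master that remembers which servers hold each block of data."""

    def __init__(self, servers):
        if len(servers) < REPLICAS:
            raise ValueError("need at least as many servers as replicas")
        self.servers = list(servers)
        self.down = set()      # servers currently believed to be dead
        self.locations = {}    # block id -> list of servers holding a copy
        self._next = 0         # round-robin cursor (hypothetical policy)

    def place(self, block_id):
        """Choose REPLICAS distinct servers for a new block."""
        count = len(self.servers)
        picks = [self.servers[(self._next + i) % count] for i in range(REPLICAS)]
        self._next = (self._next + 1) % count
        self.locations[block_id] = picks
        return picks

    def mark_down(self, server):
        self.down.add(server)

    def locate(self, block_id):
        """Return the first live server that holds a copy of the block."""
        for server in self.locations[block_id]:
            if server not in self.down:
                return server
        raise LookupError(f"all replicas of {block_id!r} are unavailable")


master = Master(["s1", "s2", "s3", "s4"])
master.place("keyword-index-0")
master.mark_down("s1")                    # the original copy goes away
print(master.locate("keyword-index-0"))   # a surviving replica is returned
```

A read fails in this model only when all three servers holding a block are down at once, which is why losing a single cheap machine does not interrupt service.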
"You make the software tolerate failures. If you can expect failures, then this is what makes cheap commodity PCs viable for Internet services," Hoelzle said.
Google's PC servers, which number in the thousands, run a stripped-down version of Linux. It is based on the Red Hat distribution but is really just the operating system kernel modified for Google, he added.
The company has also devised a system for handling massive amounts of data and returning rapid responses to queries. Google splits the Web into millions of pieces, or "shards" in Google tech speak, which are replicated in case of failure.
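Hoelzle did not say how pages are assigned to shards. One common approach, shown here purely as an assumption, is to hash each page's address and take the result modulo the number of shards. The function names and sample URLs are hypothetical.

```python
import zlib


def shard_for(url, num_shards):
    """Map a URL to a shard number using a stable hash (CRC-32)."""
    return zlib.crc32(url.encode("utf-8")) % num_shards


def partition(urls, num_shards):
    """Split a collection of URLs into num_shards separate lists."""
    shards = [[] for _ in range(num_shards)]
    for url in urls:
        shards[shard_for(url, num_shards)].append(url)
    return shards


pages = [
    "http://example.com/",
    "http://example.com/about",
    "http://example.org/news",
    "http://example.net/faq",
]
for number, shard in enumerate(partition(pages, 3)):
    print(number, shard)
```

Because the hash is stable, any machine can work out which shard owns a page without consulting a central directory, and each shard can then be copied to several servers in case of failure.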
Not surprisingly, the company creates an index of words that appear on the Web, which it stores as an array of large files. But it also has document servers, which hold copies of Web pages that Google crawls and downloads.
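The split between an index of words and a store of page copies can be sketched in a few lines. This is a generic inverted index, not Google's implementation; the sample documents and function names are invented for the example.

```python
from collections import defaultdict


def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return matches


# Plays the role of a "document server": stored copies of crawled pages.
document_store = {
    1: "cheap commodity servers fail often",
    2: "software must tolerate failed servers",
    3: "cheap power keeps costs down",
}

index = build_index(document_store)
for doc_id in sorted(search(index, "cheap servers")):
    print(doc_id, document_store[doc_id])
```

The index answers which pages match a query. The document store is consulted only afterward, to fetch the text of the matching pages.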
Another important Google engineering feat is making it straightforward to write programs that run across thousands of servers, according to Hoelzle. Normally, building applications to run in a "parallel" configuration of servers requires specialized tools and skills.
Google's programming tool, called MapReduce, which automates the task of recovering a program in case of a failure, is critical to keeping the company's costs down.
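The programming model behind a tool like this can be shown with a single-process toy. It is a sketch of the general map-and-reduce idea with a simple retry loop standing in for failure recovery, not Google's actual interface; every function name here is hypothetical.

```python
from collections import defaultdict


def map_words(document):
    """Map step: emit a (word, 1) pair for each word in a document."""
    for word in document.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce step: add up all the counts emitted for one word."""
    return sum(counts)


def run_with_retry(task, argument, max_attempts=3):
    """Re-run a task that raises RuntimeError, as a stand-in for a crash."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task(argument)
        except RuntimeError:
            if attempt == max_attempts:
                raise


def map_reduce(documents, mapper, reducer):
    """Run mapper over every document, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for document in documents:
        pairs = run_with_retry(lambda d: list(mapper(d)), document)
        for key, value in pairs:
            grouped[key].append(value)
    return {key: reducer(key, values) for key, values in grouped.items()}


print(map_reduce(["PCs crash", "cheap PCs"], map_words, reduce_counts))
```

The programmer writes only the two small functions at the top. Grouping, scheduling and retrying belong to the framework, which is where the savings in programming time would come from.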
"Cost is really the sum of what the equipment you need to do the work costs and how much programming time you need to put into getting something useful," Hoelzle said, adding that Google has started using MapReduce more widely over the past year.
Finally, Google has created "batch" job scheduling software that acts as a sort of taskmaster for millions of operations. Called the Global Work Queue, it breaks up computing jobs into many smaller tasks and distributes them across machines.
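Hoelzle gave no detail on how the Global Work Queue hands out work. As a rough illustration of the idea only, the sketch below cuts a job into fixed-size tasks and deals them out to machines in turn. The task size, the machine names and the round-robin policy are all assumptions.

```python
from collections import deque


def split_job(items, task_size):
    """Break one large job into tasks of at most task_size items each."""
    return [items[i:i + task_size] for i in range(0, len(items), task_size)]


def distribute(tasks, machines):
    """Deal tasks out to machines in turn until the queue is empty."""
    queue = deque(tasks)
    assignment = {machine: [] for machine in machines}
    turn = 0
    while queue:
        machine = machines[turn % len(machines)]
        assignment[machine].append(queue.popleft())
        turn += 1
    return assignment


job = list(range(10))                # ten units of work
tasks = split_job(job, task_size=4)  # three tasks: 4 + 4 + 2 items
for machine, assigned in distribute(tasks, ["m1", "m2"]).items():
    print(machine, assigned)
```

A real scheduler would also have to track which tasks finished and reassign the ones stranded on a crashed machine. That bookkeeping is left out here.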
For all its built-in redundancy in case of failure, the system doesn't address all problems, Hoelzle revealed. During the presentation, he showed a photo of six fire trucks responding to an emergency at a Google data center in an undisclosed location. He would not reveal any specific details on the mishap except to say that "it wasn't about one machine going down."
In a follow-up interview with CNET News.com, Hoelzle said the cost of power is another important factor in Google's data center designs.
"The physical cost of operations, excluding people, is directly proportional to power costs," he said. "(Power) becomes a factor in running cheaper operations in a data center. It's not just buying cheaper components but you also have to have an operating expense that makes sense."