COMMENTARY — Grid computing aficionados gathered at GlobusWorld in Boston this week to deliberate on some of the driving forces behind their beloved technology. Having lived through multiple generations of disruptive technologies, proponents of grids appear acutely aware that their technology will remain relegated to academic and scientific circles unless it embraces the two fundamental concepts on which 21st-century mass adoption has proven to flourish — interoperable standards and open source.
Leading the way on the open source front is the Globus Alliance — an organization whose most notable contribution to the cause has been a grid toolkit (GT4) that tightly weaves as many enabling standards as it can — many of which come from the Web services ecosystem — into an open source-based solution for building grids. GT4 is to grids what Apache is to Web servers or what GNU Linux is to operating systems (even to the extent that both include implementations of well-known standards).
The role of the Globus Alliance, as the primary chaperone of GT4, most closely resembles the roles played by the Apache Software Foundation (for Apache) and the Free Software Foundation (for GNU Linux). In fact, the grid ecosystem so closely resembles the Linux ecosystem that it has already given birth to a company (Univa) that services and supports its own distribution of the toolkit (much the same way Red Hat services and supports its own distribution of GNU Linux).
On the interoperable standards front, GT4 already has many standards baked into it. As more standards come to bear, particularly in the area of Web services, it's likely they'll be absorbed into the standard grid playbook. As it turns out, Web services and grid computing are virtually synonymous. Both compute paradigms subscribe to the idea that computing tasks can be serviced in utility-like fashion by distributed compute nodes that are discoverable and accessible through standard APIs — the benefits of which are dynamically scalable and resilient systems that do away with over-provisioning as a method of dealing with peak loads.
Theoretically, the end result can be better performance (but only when you need it) at a significantly reduced cost — two phrases that are music to the ears of enterprise technologists and chief financial officers. But, even though an open source implementation exists — one that embraces certain standards the way enterprises should insist they be embraced — grid technology has a long row to hoe before enterprises can envision it as the cornerstone of their heterogeneous computing fabrics. I say "the cornerstone" as opposed to "a cornerstone" because a grid architecture is what I believe enterprises should be aspiring to. So far, I haven't heard a good explanation (or even a bad one) for why information technology shouldn't be built around the idea that we ought to find, call upon, and pay for compute resources on an as-needed basis — whether the call is for something task-specific or simply for more compute horsepower — and that, in doing so, we should invoke standards-based approaches regardless of whether we run those resource pools ourselves or rely on someone else who does.
Aided in part by vendors' attempts to garner a nascent market's attention for some very non-standard offerings, it's easy to get lost in the confusing morass of terminology — grid, Web services, utility, on-demand, virtualization, et al. — all of which are slightly different takes on the same aforementioned principle. In fact, at GlobusWorld, leading grid researcher Ian Foster cautioned that each of those approaches is solving the same problem and that they should be solving it in the same, standards-based way. Call it what you want. Ultimately, the flow chart looks pretty much the same whenever distributed systems are enlisted to help complete a task.
Unfortunately for enterprises, distilling the various marketing messages and grid technobabble into a set of simple, easily understood fundamentals from which a strategic infrastructure plan can emerge is an arduous task. Worse, it's really not in the charters of the key celestial bodies in grid computing — the Global Grid Forum, the Globus Alliance and the Globus Consortium — to make grid technologies more palatable to enterprises. So hard are companies like IBM and Sun pushing the notions of on-demand and utility computing (respectively) that it's probably easier simply to subscribe to one particular vendor's approach (the fastest path to the advertised benefits) and run the risk of some lock-in than it is to become so well-versed in the fundamentals of distributed computing that your right to choose remains well-preserved for the foreseeable future.
But, in recognition of the enterprise mindset that prefers a methodical and calculated approach to extracting value from any new computing paradigm, and in seeing how enterprise decision makers' information needs were being underserved by the existing grid regime, a group of key technology players banded together in 2004 to represent the interests of enterprises to the larger grid community. Of course, forming the Enterprise Grid Alliance was not a completely unselfish act on the part of members such as Sun, Oracle, HP and Intel. Unlike other industry collaboratives that include IT customer participation (for example, the Liberty Alliance), the EGA is a vendor-driven outfit that has a vested interest in driving enterprise acceptance and adoption of grids. (End users of IT can join, though.) While each new paradigm shift in computing has ushered in new benchmarks in scalability and total cost of ownership, each has also helped vendors rescue tired and crusty revenue streams from the ashes. It's a symbiotic relationship, but one that's not devoid of honesty. Said EGA marketing steering committee chairperson Peter ffoulkes, who is also the group manager of high performance and technical computing (HPTC) marketing at Sun, "We are not in the EGA for the sole benefit of humankind. We're in it to make money. But the same goes for users. They need to make money off their technology."
So, in the interests of that relationship (and its members), what has so far been the charter of the EGA? According to ffoulkes, enterprises whose systems run in the outrageously wasteful range of 5- to 20-percent utilization are peering over the fence at the academic and scientific communities and watching them attain at least 50- if not 90-percent utilization rates. "For-profit business wants a piece of that action, but a lot of grid technology is designed by the science and academic communities for the science and academic communities," said ffoulkes. "That work is not aligned with commercial needs. We saw good work getting done, but no one was representing the enterprise community, a community that needed an advocacy group." Thus, the EGA, an outgrowth of discussions that began about a year ago between Sun, Oracle, and HP, was born.
Initially, the EGA is focused on five areas of paramount concern to enterprises: utility accounting, grid security, data provisioning, component provisioning, and common terminology. After speaking with ffoulkes and his counterpart Tony DiCenzo, who is also Oracle's director of standards strategy and architecture, one gets the sense that, when it comes to discussing grid technologies, non-interoperable terminology is almost as bad as non-interoperable technology. Apparently, along the way to embracing grids, the academic and scientific communities have arrived at a nomenclature that's foreign to most enterprise IT types. Even among so-called grid vendors, the terminology used to refer to the same things varies widely — and doesn't map well onto the existing lexicon. The EGA has recognized the resulting language barrier as a big impediment to grid adoption and is working to establish some terminology standards for vendors to use when speaking with customers.
In the context of impediments to grid adoption, enterprises evaluating their grid or distributed computing options find it difficult if not impossible to compare offerings based on financial metrics such as cost or expected savings. Most vendors account for their offerings in different ways — a problem that's somewhat reflected in my colleague Dan Farber's recent attempt to compare utility offerings from Sun and IBM. Said ffoulkes, "Sun has set a price of $1 per CPU, but that's a specific type of CPU that may be different from someone else's." Glancing at DiCenzo, ffoulkes rhetorically asked, "What is Oracle's unit of database access? Then there's HP's way of accounting. There's no way to compare." How the differences will get resolved is unclear. Perhaps it will be a standard unit of measure for each type of compute resource (processor, storage, etc.), or maybe it will be some sort of published rate of exchange, like euros to dollars.
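To make the rate-of-exchange idea concrete, here's a minimal sketch of how such a comparison could work once a standard unit existed. Every rate, unit name, and conversion factor below is invented for illustration — these are not real vendor figures, and no such published exchange table exists today.

```python
# Hypothetical illustration: comparing vendor utility pricing by
# converting each vendor's proprietary unit into a shared standard
# unit of compute. All numbers here are made up for the sketch.

# Each vendor publishes a price in its own unit of measure.
vendor_rates = {
    "vendor_a": {"price": 1.00, "unit": "cpu_hour"},        # e.g. $1 per CPU-hour
    "vendor_b": {"price": 0.15, "unit": "db_access_block"}, # vendor-specific unit
}

# A published "rate of exchange" mapping each vendor unit to a
# standard compute unit — the euros-to-dollars analogy in the text.
exchange_to_standard_unit = {
    "cpu_hour": 1.0,           # baseline: one CPU-hour = one standard unit
    "db_access_block": 0.12,   # assumed: one access block ~ 0.12 standard units
}

def normalized_price(vendor: str) -> float:
    """Dollars per standard compute unit for a vendor's offering."""
    rate = vendor_rates[vendor]
    return rate["price"] / exchange_to_standard_unit[rate["unit"]]

for v in vendor_rates:
    print(f"{v}: ${normalized_price(v):.2f} per standard unit")
```

With a table like this, the two offerings become directly comparable — vendor_b's seemingly cheaper $0.15 rate works out to $1.25 per standard unit, more expensive than vendor_a's $1.00 — which is exactly the apples-to-apples view ffoulkes says buyers lack today.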
Grid security is another area that makes businesses nervous and one that the EGA has established as a priority. As complicated as it is, walling in dedicated transactional systems to keep sensitive data secure is child's play compared to attempting the same exercise in an environment where transactional work can be parceled out to a grid — particularly a public grid or one that's hosted by a grid service provider. If vendors of grid technologies are to have any hope of penetrating the enterprise space, their various offerings will have to interoperate with each other as well as with existing enterprise security frameworks (a good example of the sort of concern that might not be shared by the academic or scientific communities).
Unfortunately, when it came time to discuss what data provisioning was, neither ffoulkes nor DiCenzo could answer. Based on what I know, data provisioning has to do with making sure the data that some process needs to act on is in the right place at the right time, particularly when you take into account the fault tolerance that's indigenous to grids. According to the EGA's Web site, the EGA's Data Provisioning Working Group "is chartered with identifying the requirements of data provisioning in Enterprise Grids resulting in the development of usage scenarios and reference implementations. The initial focus of the working group will be on bulk operations and simple paradigms, such as deployment/redeployment. Subsequent focus will be on incremental and fine-grained data." OK, now I get it. Well, maybe not.
In contrast to data provisioning, the fifth and last of the EGA's primary areas of focus is component provisioning. Standards for system provisioning in a heterogeneous hardware environment are virtually non-existent. It has taken companies like Altiris and Opsware, who've baked some secret sauce into their cross-platform provisioning solutions, to overcome the management nightmare that the lack of provisioning standards creates when you try to centrally manage an environment that includes more than one vendor's hardware. Now imagine that that hardware is being used to string together a grid — suddenly, the challenges of provisioning take on an entirely new dimension. For example, if you issue a provisioning directive from your grid management console — "turn those 50 systems into a grid that's configured this way" — that grid simply won't work the way you want it to if those 50 systems aren't identical to each other. It's good to know that an industry-based group like the EGA — despite its commercial interests — is working on these sorts of interoperability issues. Though ffoulkes and DiCenzo like to focus on interoperability as the key message, they also agree that such interoperability greases the wheels of substitution, which ultimately gives buyers a lot of leverage over vendors — a benefit of standards that's always worth remembering.
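The "50 identical systems" problem above can be sketched in a few lines: before issuing a uniform provisioning directive, a management console would first have to verify that the target nodes really are interchangeable. The field names and the notion of a hardware "profile" here are invented for illustration — no real grid management product's API is being shown.

```python
# Hypothetical sketch: guard a uniform provisioning directive by
# checking that the target fleet is homogeneous. All names are
# illustrative, not drawn from any real provisioning tool.

from dataclasses import dataclass

@dataclass(frozen=True)  # frozen => hashable, so profiles can go in a set
class HardwareProfile:
    cpu_arch: str
    cpu_count: int
    ram_gb: int

def can_provision_uniformly(nodes: list) -> bool:
    """A one-size-fits-all directive only makes sense if every node
    in the fleet presents the same hardware profile."""
    return len(set(nodes)) <= 1

fleet = [
    HardwareProfile("x86", 2, 4),
    HardwareProfile("x86", 2, 4),
    HardwareProfile("sparc", 4, 8),  # one odd node breaks uniformity
]

print(can_provision_uniformly(fleet))      # mixed fleet: directive should be refused
print(can_provision_uniformly(fleet[:2]))  # identical pair: safe to proceed
```

In a real cross-vendor environment the check is far harder than comparing three fields — which is precisely why tools like those from Altiris and Opsware need their "secret sauce," and why standards in this area matter.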
On the downside, absent from the EGA's roster are two key players in enterprise computing — IBM and Microsoft. In fairness, I haven't had a chance to ping them as to why they're not involved with the group. That said, while IBM was present at GlobusWorld, Microsoft was nowhere to be seen.