The Wisdom of Crowds was a very popular book this year. In a nutshell it espouses the idea that there is such a thing as "collective wisdom," and that in a crowd of well-informed people, the collective decision-making power of the group is actually greater than that of the smartest individual in it. Basically, this is the idea of emergence: that sometimes a group can be greater than the sum of its parts.
Slowly but surely, firms are starting to realize that "crowd wisdom" is not just for internal brainstorming sessions, but can be used to push their ideas into the outside world and let the best and brightest minds who don't work at their companies make them more effective. What on earth am I referring to? The sudden transformation occurring in the technology sector around what were once proprietary technologies. 2006 has seen some remarkable about-faces from some serious players, like Sun, which Open Sourced Java a few weeks ago.
In my next few postings I am going to take a look at the kinds of products and projects that were previously proprietary and became Open Source this past year, and try to figure out why their host companies made the move, what it means for them, and what it means for the advancement of Open Source as a movement toward better software.
The latest company to see the wisdom potential in the IT crowd is Terracotta, which makes an amazing Java distributed computing platform. Terracotta's technology implements what has long been a holy grail of sorts in the distributed computing space: a distributed shared-memory cluster. What this means, put simply, is a cluster of machines where programs can be written as though they were going to run on a single machine, yet share memory (global variables, in-memory data structures, and so on) across the whole cluster. Such beasts have existed, but they usually require very specific hardware (for example, a shared-memory system mixing Intel and AMD cluster nodes is really hard to pull off due to the differences in the underlying architectures of the two chip families), expensive high-speed switching systems to move data around as quickly as possible to lessen the impact on the processors, and a lot of very fancy, specialized programming against equally fancy shared-memory frameworks.
Cluster computing has been around for quite a while, having been pioneered by Donald Becker, a co-founder of the original Beowulf project. In traditional clusters, you take a bunch of machines (it's best if they're all of the same hardware type and speed), connect them together using high-speed networks, and use what are called "message passing" libraries such as the Message Passing Interface (MPI) or Parallel Virtual Machine (PVM). With these libraries you can break your application up into small cooperating pieces, run them on a cluster, and get dramatically increased performance by sending bits of your data around to all the copies of your application running on all those cluster machines. However, there is a (huge) catch: breaking up your application is a very hard task. And, once you accomplish this feat, the resulting application can only run on a cluster…
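To make that catch concrete, here's a very rough sketch, using plain Java sockets rather than MPI, with hypothetical worker hostnames and a made-up port, of what "breaking your application up" ends up looking like: the coordinator has to carve up the data by hand, ship a chunk to a worker process on each node, and gather the partial results itself.

import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.Socket;
import java.util.Arrays;

public class Coordinator {
    public static void main(String[] args) throws Exception {
        // Hypothetical worker nodes; in a real cluster these would come
        // from a hosts file or a job scheduler.
        String[] workers = { "node01", "node02", "node03" };
        int[] data = new int[9000];
        Arrays.fill(data, 1);

        int chunk = data.length / workers.length;
        long total = 0;
        for (int i = 0; i < workers.length; i++) {
            // Carve out this worker's slice of the data.
            int[] slice = new int[chunk];
            System.arraycopy(data, i * chunk, slice, 0, chunk);

            // "Message passing" by hand: serialize the slice, send it to a
            // (hypothetical) worker listening on port 9999, read back the
            // partial sum it computes.
            Socket s = new Socket(workers[i], 9999);
            try {
                ObjectOutputStream out = new ObjectOutputStream(s.getOutputStream());
                out.writeObject(slice);
                out.flush();
                ObjectInputStream in = new ObjectInputStream(s.getInputStream());
                total += in.readLong();
            } finally {
                s.close();
            }
        }
        System.out.println("Cluster-wide sum: " + total);
    }
}

And that's only the easy half: you still have to write and deploy the worker side, handle nodes failing mid-job, and so on. None of that code has anything to do with what the application is actually for.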
What Terracotta developed was a software layer for Java Virtual Machines (JVMs) that allows Java programs to run across multiple systems (and since Java runs on a virtual machine, it doesn't matter how many different CPU types you have) and presents the program with what is called a "single memory model." A single memory model means pretty much what it says: from the perspective of the Java application running on a cluster of machines using the Terracotta system, it's all one big computer. The Java application has no idea that it's running on a cluster of machines.
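To give a feel for what that means in practice, here's a minimal sketch of my own (the class and field names are hypothetical): the code below is ordinary single-JVM Java, and as I understand it, Terracotta's clustering is declared in an external configuration file that marks objects like this map as shared "roots," so every JVM in the cluster sees the same instance and the synchronized block behaves as a cluster-wide lock.

import java.util.HashMap;
import java.util.Map;

public class SharedCounter {
    // An ordinary static field. With Terracotta this object can be declared
    // a shared root in the external configuration; every JVM in the cluster
    // then sees the same map. Without Terracotta the same code still runs,
    // just confined to one JVM.
    private static final Map<String, Integer> hits = new HashMap<String, Integer>();

    public static void record(String page) {
        // An ordinary synchronized block; under Terracotta the lock would be
        // honored across the cluster, not just within this JVM.
        synchronized (hits) {
            Integer n = hits.get(page);
            hits.put(page, n == null ? 1 : n + 1);
        }
    }

    public static void main(String[] args) {
        record("/index.html");
        synchronized (hits) {
            System.out.println(hits);
        }
    }
}

The point is what's missing: no sockets, no serialization, no remote interfaces. The clustering concern lives outside the application code entirely.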
This is a Very Big Deal. For many years, if you wanted to run your Java application on a cluster of machines, you had to use something Sun developed called Remote Method Invocation (RMI). It's sort of like the message passing interfaces for clusters I described above, but in this case you make calls to remote Java methods and objects which, by definition, bring their data with them. The problem with RMI is that it's expensive: it takes a lot of resources to make RMI work, and you don't want it in your code if your application will only ever run on a single server, while if you want to run your application on multiple machines it's a win. But, like developing custom cluster applications, it's a Hobson's choice. The Terracotta system makes all of these problems go away for Java applications, allowing developers (and businesses) to have their cake and eat it too in terms of application development and deployment scalability.
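For the curious, here's a bare-bones RMI sketch (the interface and names are my own invention) showing the kind of plumbing I mean: the Remote interface, the RemoteException on every method, the stub export, and the registry all end up baked into your code, whether or not they have anything to do with what the application actually does.

import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// The remote interface: every method must declare RemoteException, and
// arguments and results get serialized and shipped across the wire.
interface Greeter extends Remote {
    String greet(String name) throws RemoteException;
}

public class GreeterServer implements Greeter {
    public String greet(String name) throws RemoteException {
        return "Hello, " + name;
    }

    public static void main(String[] args) throws Exception {
        // Export the object so it can receive remote calls, then register
        // the stub under a well-known name for clients to look up.
        Greeter stub = (Greeter) UnicastRemoteObject.exportObject(new GreeterServer(), 0);
        Registry registry = LocateRegistry.createRegistry(1099);
        registry.rebind("greeter", stub);
        System.out.println("Greeter bound; waiting for remote calls...");
    }
}

Once your classes are written against Remote interfaces like this, they're committed to the distributed design; that's the all-or-nothing trade-off Terracotta lets you skip.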
So… what does this mean for the Open Source community? And why would Terracotta open up what is clearly a very valuable system? For the Open Source community the implications are huge: there's now a freely available mechanism for making applications scalable without specialized programming. This means that distributed applications could now be the norm rather than one-off custom efforts. (I see MMOs taking advantage of this immediately.)
For Terracotta, they still get to sell their product (JBoss, Red Hat and others have proven that there's a pay-for-support model that's distinct, and profitable, apart from the hobbyist and derivative application markets), since most companies will want a fully supported version of the product, and at the same time Terracotta gets the input, ideas, and energy of a huge Open Source Java community.
Of course, this is a very new world when it comes to proprietary companies opening up their software, but I'm willing to bet we'll see a whole new round of innovations come out of it.
Resources:
Terracotta source: Open Terracotta