Andre Pang is one of the few developers in Australia with experience working in Erlang, the functional programming language that is gaining fans for its handling of parallel processing and its suitability for building distributed systems. We sat down with him to see what all the fuss is about.
If you're not familiar with Erlang, you can check out the open source Erlang Web site or the Erlang Web site at Ericsson, where the language was invented. You might also like to read Builder AU's own Quickstart to Erlang.
Why do you use Erlang?
I should start by saying that I'm not using Erlang at the moment; I'm working at Realmac Software on RapidWeaver. I was at Rising Sun Research recently, however, and we were using Erlang in the business there.
We had a program called cineSync, which was a collaborative movie player. You load it up on a whole bunch of computers, hit play on one and it plays on all of them. When you hit stop on one, it stops at the same frame on all the others. We wanted a client-server model for this because it's targeted at the visual effects market. The first thing that comes to mind is probably that it suits peer-to-peer better, but the client-server model worked very well because the visual effects industry has very stringent security protocols, and it's easier for them to secure the one server than the many clients. We needed something that could coordinate all these different clients.
Our very first server was written in C++, but we realised that we had a problem if the server crashed. There might be 20 VFX houses connected to this server, and when it goes down, they all go down with it. So we looked at building an Erlang server by writing custom plug-ins for ejabberd, a popular XMPP (Extensible Messaging and Presence Protocol) server, and layering all of our code on top of that.
In the end we found that to be a good solution. In fact, the only time I can remember it crashing is when we forgot to allocate swap space for the server and it ran out of memory. The way you can set up ejabberd, and Erlang in general, really suits a distributed system without a single point of failure. It makes it easy to set up different nodes that all replicate each other. For instance, there were plenty of times when the primary server went down, but users were automatically connected to the secondary server instead. With a bit of smarts in the client it was possible to reconnect to a secondary server in the middle of the session and only lose a small amount of information in the process.
We had three or four servers around the world, and it was incredibly cheap to deploy, because with the Erlang runtime you don't need any special hardware or anything. It was very reliable and turned out to be a very good decision.
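The client-side reconnection described above can be sketched in a few lines. This is a minimal illustration in Python rather than Erlang, and it is not cineSync's actual code: the function names (`connect_with_failover`, `fake_connect`) and the server addresses are hypothetical, and the network call is replaced by a stand-in so the example runs on its own.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def connect_with_failover(servers: Sequence[str], connect: Callable[[str], T]) -> T:
    """Try each server in order and return the first successful connection."""
    last_error = None
    for address in servers:
        try:
            return connect(address)
        except ConnectionError as error:
            # Remember why this server failed, then fall through to the next one.
            last_error = error
    raise ConnectionError(f"all servers unreachable: {list(servers)}") from last_error


def fake_connect(address: str) -> str:
    # Hypothetical stand-in for a real network call: the primary is "down" here.
    if address == "primary.example.com":
        raise ConnectionError(f"{address} is down")
    return f"session on {address}"


if __name__ == "__main__":
    servers = ["primary.example.com", "secondary.example.com"]
    print(connect_with_failover(servers, fake_connect))
```

A real client would also need to resynchronise its session state after reconnecting, which is where the small loss of information mentioned above would come in.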
Reliability is a major advantage. What other advantages does Erlang have over other programming languages?
There are lots of different ways to view Erlang. I personally think its best domain is building a truly distributed system where reliability is a big concern and you can't afford failure.
The Erlang attitude towards concurrency, using message passing rather than shared state, is part of how this works, but it's a very small component compared to the major advantage, which is the Open Telecom Platform (OTP). Instead of writing an application from the ground up, OTP gives you a framework to write the application in, kind of like the model-view-controller structure that you get with Ruby on Rails. When you write Erlang with the OTP framework, you automatically get a lot of distribution and fault tolerance built in for you.
A lot of people make a big deal of Erlang's support for threads [which Erlang calls processes], and rightly so; Erlang's threading gives you a different way to program. A good example of this is Web servers: take Apache and the Erlang Web server, [called] yaws. With Erlang it's trivial to make a new thread; they're very cheap and easy to manage, so you can have 80,000 threads running, serving 80,000 different clients, and it's not a problem. With Apache, or any other Web server really, your server would fall over with anything near 80,000 threads.
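To give a feel for what "one cheap unit of concurrency per client" looks like, here is a rough sketch in Python using `asyncio` tasks. This is only an analogy: asyncio tasks are cooperatively scheduled on a single thread and are not equivalent to Erlang processes, and the names `serve_client`, `serve_all` and `run_demo` are made up for this illustration.

```python
import asyncio


async def serve_client(client_id: int) -> int:
    # Stand-in for handling one connection: yield control once, as a handler
    # waiting on network I/O would, then report which client was served.
    await asyncio.sleep(0)
    return client_id


async def serve_all(client_count: int) -> int:
    # Start one lightweight task per client and wait for all of them to finish.
    tasks = [asyncio.create_task(serve_client(i)) for i in range(client_count)]
    results = await asyncio.gather(*tasks)
    return len(results)


def run_demo(client_count: int = 10_000) -> int:
    """Run the demo and return how many clients were served."""
    return asyncio.run(serve_all(client_count))


if __name__ == "__main__":
    print(run_demo())
```

The point of the sketch is only that the per-client cost is small enough to create tens of thousands of handlers, which would be impractical with one operating system thread per client.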
Does OTP help with rapid prototyping?
That's a bit of a hard question. Once you understand OTP, sure. It's the same with Ruby on Rails: if you've never used it before, it's not going to be that easy to get going, but if you know Rails you can whip up a new Web site in a couple of hours.
OTP is a much more comprehensive framework though; it's a big thing with a lot of development time in it. I wouldn't say that OTP is particularly good for rapid prototyping, but Erlang as a language is fantastic for rapid prototyping if you ignore the OTP stuff. It's dynamic, and you can do stuff like hot code loading. This lets you get a very basic server up and running and then swap more complicated stuff in and out as you go. Then, when you want something that's really solid and robust, you rewrite it using OTP.
Erlang was designed for telecoms. How well does it adapt to other areas, such as Web or application programming?
I'd say pretty well, generally. Desktop applications, maybe not so much because there is a lack of GUI libraries available to it. There are some, but it would be a hard job. So I wouldn't recommend it for that.
For anything network server related, however, it's fantastic. Because Erlang was built and supported by Ericsson, it comes with support for all the common protocols. If you use a language like Haskell you've got to write libraries to support things like Web protocols, SMTP, all your basic protocols. Erlang comes with all of that, so it's the best choice for network server applications, especially since server applications tend to be concurrency oriented: you can just start a new thread for each connection, which is how ejabberd works.
Do you think a message passing approach has advantages over a traditional shared state approach?
In a nutshell, yes, if only because the traditional approach is so bad and so error prone that the best programmers in the world will still have issues. I think we've created the problem. The idea of having shared resources that you have to lock and unlock each time you use them sounds so simple in theory, but when you actually try it you run into so many subtle, complicated bugs, which just go away when you use a message passing approach. For some reason it just seems to go better with the way our brains work. Joe Armstrong, the inventor of Erlang, is very adamant about message passing being easier for us to understand. Even though there is overhead in copying objects around that you can avoid with shared state, you still get better performance with message passing, just because you have a lot less locking overhead. It's just simpler for people.
I think using message passing will help alleviate a lot of the problems people have in writing concurrent code. Something is going to happen very soon: programmers are going to start realising in the next couple of years that the way we've been doing things concurrently is just insane. It doesn't scale very well to multiple processors or systems, and it's just too hard to do. Message passing is one way to fix that. It's been proven as a model; Erlang has been shown to be good for extremely parallel applications and still scales well to multi-core processors. There are some other alternatives like software transactional memory, but I think what message passing has going for it is that it's a tried and tested model, so there's no doubt that it's going to work. Whether it will work well in your language of choice is a more difficult question, because you often have to build a message passing layer on top of the language, so it's a little more work, but not too much. I think if you write the APIs for it though, people will start using it a lot more.
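The contrast between locking shared state and passing messages can be illustrated with a small Python sketch. It is an assumption-laden analogy, not Erlang: one thread owns a counter outright and other threads can only change it by sending messages to its mailbox, so the application code never takes a lock. (Python's `queue.Queue` does use locks internally; the point is that they are hidden inside the message passing layer.) The names `counter_process` and `run_counter` are hypothetical.

```python
import queue
import threading


def counter_process(mailbox: queue.Queue, replies: queue.Queue) -> None:
    total = 0  # private state: only this thread ever reads or writes it
    while True:
        message = mailbox.get()
        if message == "stop":
            replies.put(total)  # report the final value back as a message
            return
        total += message


def run_counter(senders: int = 4, per_sender: int = 1000) -> int:
    """Have several threads send increments to one owner and return its total."""
    mailbox: queue.Queue = queue.Queue()
    replies: queue.Queue = queue.Queue()
    owner = threading.Thread(target=counter_process, args=(mailbox, replies))
    owner.start()

    def send_increments() -> None:
        for _ in range(per_sender):
            mailbox.put(1)

    workers = [threading.Thread(target=send_increments) for _ in range(senders)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()

    mailbox.put("stop")  # sent only after every increment is already queued
    result = replies.get()
    owner.join()
    return result


if __name__ == "__main__":
    print(run_counter())
```

Because the counter has a single owner, there is no way for two threads to update it at the same moment, which removes the class of lock-ordering and forgotten-lock bugs described above.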
What do you think the killer app will be if Erlang does take off?
Interesting question. I'd guess network servers and distributed programs. When you're building a new system that's significantly different from what's available, so you can't use an existing product, and you really need it to be robust from the ground up, then Erlang is a great choice, especially with ejabberd out there. The ejabberd developers have just announced they're working on their XMPP library, which, when you take away the Jabber part, is just a messaging layer. So instead of using it simply for instant messaging, you can use XMPP as a way to talk between different servers. Those kinds of apps are ideal for Erlang.
Twitter is using Erlang, through ejabberd, and there is an MMORPG [Vendetta Online] which is using Erlang for its servers now.
Erlang has a database, called mnesia, but it's not really a relational database. How do you think that will affect the take-up of Erlang?
It's not your traditional relational database, that's for sure. I think there is a commercial RDBMS ODBC layer available though. The big thing with relational databases is that usually you're programming in a language that has no inherent type for relationships, so you have this really annoying mismatch between how your database stores information and how ideally your programming language uses the information. With mnesia and Erlang, because mnesia is written in Erlang, you can store native Erlang terms in your databases, and all that really goes away. You have a really powerful querying language because the query language is just list operations, which SQL is really a subset of. So you've got native support for querying, you've got native support for storing all your Erlang data, and because mnesia is self replicating it's distributed and fault tolerant. If three or four nodes go down you can resync the data when they come back up, and you get all of that for free with mnesia. It's really a pretty remarkable database. If you're coming from a relational database background it'll take a bit of work to get used to it, but it'll be well worth the effort.
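The idea that a query is "just list operations" over native data can be shown with a rough Python analogy. This is not mnesia's API (in Erlang, mnesia queries are typically written as list comprehensions); the `Employee` record, the sample rows and the `names_in_department` function are all invented for this illustration.

```python
from typing import List, NamedTuple


class Employee(NamedTuple):
    # A native record type: stored and queried without any mapping layer.
    name: str
    department: str
    salary: int


# Hypothetical sample data standing in for a database table.
EMPLOYEES = [
    Employee("alice", "effects", 90000),
    Employee("bob", "effects", 70000),
    Employee("carol", "editing", 85000),
    Employee("dave", "effects", 82000),
]


def names_in_department(rows: List[Employee], department: str, min_salary: int) -> List[str]:
    # Roughly: SELECT name FROM employees WHERE department = ? AND salary >= ?
    return [
        row.name
        for row in rows
        if row.department == department and row.salary >= min_salary
    ]


if __name__ == "__main__":
    print(names_in_department(EMPLOYEES, "effects", 80000))  # ['alice', 'dave']
```

The filter and the projection are ordinary language constructs operating on ordinary values, so there is no separate query language and no conversion between database rows and program objects.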
One thing that Java still seems to have over up-and-coming languages is that it's still faster. Do you see that as a problem?
That depends a lot, actually. Some people have been criticising Erlang because they see it as a standard mainstream language and it's not as fast as Java or C++, say. Well, yeah, that's true. There is a native code compiler for it, but you'll never match the performance of C++. But the apps that Erlang is suited for really aren't CPU bound that often. If you look at ejabberd, it serves some absolutely crazy number of concurrent connections, well over 100,000, and they're running it on, I'm not sure, but it's something like a quad-core Xeon machine. CPU usage is not the problem; it's network-bound applications that Erlang is targeted at, so you don't really need the performance you get with C++. The other benefits you get with Erlang, such as hot code reloading, outweigh the speed advantages.
Also, it's designed to scale with extra machines, so the performance advantage is offset somewhat by being able to just add additional servers and load balance whenever you need to. So I don't think it's that vital a concern.
You shouldn't use Erlang for scientific number crunching by any means; use FORTRAN or C for that if you have to do it. What you can do is write all your number crunching code in C++ or Java or whatever, and then use Erlang to coordinate all the different nodes and distribute the work across multiple servers. That might be a really good approach, depending on the problem.
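The split between a coordinator and the number crunching code can be sketched as follows. This is a simplified Python illustration under stated assumptions: a thread pool stands in for the separate nodes, `crunch` stands in for compiled numeric code, and the names `crunch`, `split` and `coordinate` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def crunch(chunk: range) -> int:
    # Stand-in for the heavy numeric work that would live in C, C++ or FORTRAN.
    return sum(n * n for n in chunk)


def split(total: int, parts: int) -> List[range]:
    # Divide range(total) into at most `parts` contiguous chunks.
    if total <= 0 or parts <= 0:
        return []
    step = -(-total // parts)  # ceiling division
    return [range(start, min(start + step, total)) for start in range(0, total, step)]


def coordinate(total: int, workers: int = 4) -> int:
    """Hand one chunk to each worker and combine the partial results."""
    chunks = split(total, workers)
    if not chunks:
        return 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(crunch, chunks))


if __name__ == "__main__":
    print(coordinate(1_000_000))
```

The coordinator does no arithmetic of its own beyond combining results; its job is deciding who works on what, which is the role suggested for Erlang above.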