Bryan Cantrill is an engineer at Sun Microsystems responsible for the invention
of DTrace, a dynamic tracing facility in Solaris 10 that can identify bottlenecks and increase system performance.
How did the idea for DTrace emerge?
Quite simply, it came about because when we were trying to understand what our own systems were doing, we didn’t have the tools to do it.
We started working on the DTrace problem in 2001, but we had the idea before that. In 1997 we were working on breaking some Solaris benchmarks and struggled with a problem for two weeks, together with
some of the best minds in the community. The answer was simple: one of the computers thought it was a router when it was supposed to be benchmarking.
However, after that experience I was a bit shaken. We had some of the best Solaris expertise in the world on this problem. What would happen if this was a customer? A lot of customers either live with
the problem or escalate it to Sun and see whether we can answer it or not.
We realised we had to come up with a way to dynamically instrument production systems and answer questions about the systems. The problems we had before can now be solved by one person in one
afternoon, instead of two weeks with 16 or so people.
Many readers might have heard about DTrace for Solaris but how did DTrace for Java happen?
A developer at Enron was using DTrace and kept running into problems with his Java applications. So he connected the Java Virtual Machine (JVM) to DTrace and came up with a pretty
good stopgap solution.
It isn’t perfect, as it means you have to restart your JVM, and DTrace for Java does make your applications run slower. However, we are working on that for Mustang, the next version of Java. We are looking to integrate a probe so that there is zero effect. While it is not perfect, when I show this to hardened Java developers their eyes bug out because of some of the things you can do.
You seem to have opened up a floodgate outside Sun with DTrace for PHP, Perl, Ruby, and FreeBSD. What do you think of some of these implementations?
The PHP and Ruby implementations have been a great charge. It isn’t a whole lot of work and it adds a lot of scope for those developers. In fact, DTrace for PHP was mostly written at the Open Source
Developer Conference in the United States by a developer named Wez Furlong as I was giving a talk about DTrace for OpenSolaris.
At the end of the talk he said he was almost there and just needed my help joining the dots. We finished it in the hotel room and all of a sudden you could see all this observability in PHP with DTrace. The FreeBSD port idea came out of that same conference. The FreeBSD port, however, is going
to be a lot harder. They are going to have to pour a lot of foundation into the operating system to make it work. Still, we’re happy to help the community get it working, and it will be a great thing for FreeBSD.
Can you see DTrace in other Unix-type operating systems?
Well, it’s open source, so yes. Anyone can port it if they want, even to an
operating system that is not open source. It could be ported to AIX or other operating systems. If IBM were to do that we’d be excited for them.
Do you ever wish you had taken the DTrace idea, patented it, and started your own company?
No, I don’t wish that. The reason is that I like to solve big, hard problems. Big
companies solve big, hard problems. Small companies don’t solve these problems; they solve small problems.
While there were only three of us on the team, the project went on for a few
years before it was put into Solaris.
It is not clear that it would have been that financially lucrative, and my only exit strategy would have been to sell it to Sun, and they’d have known that. I needed the horsepower of a multibillion-dollar
company to allow me to solve these types of problems.
When DTrace started hitting the market were there any interesting moments? For example, vendors pointing fingers at each other when problems arise and not owning up to bugs.
There have been lots of stories like that. Interestingly enough, I’ve had software vendors ask and even demand that we disable DTrace for their application, which really caught me off guard at first.
Of course, the answer is “no” and we try to help the ISV see the value it has for them. The fact that it has happened means that not everyone is comfortable with the customer being empowered. We need to be comfortable with it because customers have the right to know what is going on
in the infrastructure they have paid for.
Everybody needs to be comfortable with that. Sun need to be comfortable with that and our partners need to be comfortable with that. One of the reasons I love working for Sun is that they believe in the power of our customers. If people are finding things wrong with our code then that is a good thing. We win and the customer wins.
So what does the future hold now that DTrace is out there? What is the next big problem you’d like to solve?
The immediate future is that DTrace isn’t fully solved yet. For example, we have some work to do on DTrace for Java where it is still a stopgap solution.
We’ve also got some more early prototypes for DTrace but they have to be productised. So, we still have 2-3 more years of solving some problems using DTrace. Beyond that I don’t know, but I can tell you that we’ll be looking to solve big, hard, and commercially relevant problems. I know that we will continue to innovate in the operating system.
What other technology outside Sun are you interested in at the moment?
Ruby on Rails is very interesting to me at the moment. I think that J2EE has enjoyed a period of being a de facto standard for deploying enterprise Web applications. I think it is going to be good for everybody to start rethinking some of that. I’m not a J2EE developer but holy God is it complicated.
I think if you look at Ruby on Rails, its real value is its simplicity, and it will force Sun to respond with some simpler models.
Other technologies that are interesting to me [include] Parrot. It was originally a virtual machine for Perl 6, but somewhere on the way to doing that they worked out that they could make a virtual machine
for arbitrary interpreted environments, for example Ruby or Basic and others.
It’s not clear to me to what degree that is going to become real. Right now it handles 80 percent of languages, but the 20 percent left is the hard part. For example, I’d like to see Java compile to Parrot.
I love software; not enough people say this. Software is so unique and things can happen so quickly. I was at MIT recently where I received a humbling award and one of the guys there who also won an
award was working on the pulse detonation engine that will change the way jet engines work.
In short, instead of continuous burning, there is a series of explosions to propel the jet. It will take them ten years just to test those ideas. He is hoping that he will see this idea on a commercial airliner before he is dead.
In our industry you can blow people away with something you cooked up over the weekend, and that excites me. You simply can’t do that with jet engines.