
How does Ubuntu's Upstart system initialization compare with runit?

Vincent Danen takes a look at Ubuntu's Upstart system, which is an event-based replacement for SysV init that handles the starting of tasks and services during boot. He looks at how it supervises services and compares it to runit.

When I was working on the Annvix project, I was fascinated with how the system booted and how services were started. This eventually led to a rewrite of the initialization scripts and the use of runit instead of the traditional SysV init. The results were fantastic: a fast-booting system with services starting in parallel, and a small set of scripts that started and stopped the system with amazing speed.

When I recently installed Fedora Core 10, I found that one of the new features was the use of Ubuntu's Upstart, a SysV init replacement system. I've taken the opportunity to poke at it a little bit and see how it compares to runit and SysV init, especially since a number of the features it has, or plans to have, mimic the things that made runit and supervised services so appealing.

According to Canonical, Upstart is an event-based replacement for SysV init that handles the starting of tasks and services during boot, stops them during shutdown, and supervises them while the system is running. From an end-user perspective, the system boots no differently.

The venerable /etc/inittab still exists, but all it does is indicate which runlevel is the default. Everything else is handled via configuration files in /etc/event.d/ (on Fedora) and /etc/init/jobs.d/ (on Ubuntu). These files replace certain lines in /etc/inittab; for instance, the /etc/event.d/tty1 file contains:

start on stopped rc2
start on stopped rc3
start on stopped rc4
stop on runlevel 0
stop on runlevel 1
stop on runlevel 6
respawn
exec /sbin/mingetty tty1

Whereas on a SysV init system, an equivalent line in /etc/inittab would read:

1:2345:respawn:/sbin/mingetty tty1
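
To make the comparison easier to follow, here is a minimal sketch that splits an inittab entry into its four colon-separated fields (id, runlevels, action, process). The function name describe_inittab_entry is made up for this illustration and is not part of any init system.

```shell
#!/bin/sh
# Split a SysV inittab entry into its four colon-separated fields:
#   id:runlevels:action:process
# describe_inittab_entry is a hypothetical helper used only for illustration.
describe_inittab_entry() {
    entry=$1
    # The process field may itself contain colons, so peel off only the
    # first three fields and keep the remainder intact.
    id=${entry%%:*};        rest=${entry#*:}
    runlevels=${rest%%:*};  rest=${rest#*:}
    action=${rest%%:*};     process=${rest#*:}
    printf 'id=%s runlevels=%s action=%s process=%s\n' \
        "$id" "$runlevels" "$action" "$process"
}

describe_inittab_entry '1:2345:respawn:/sbin/mingetty tty1'
# prints: id=1 runlevels=2345 action=respawn process=/sbin/mingetty tty1
```

The action and process fields correspond to the respawn and exec stanzas in the Upstart job file, while the runlevels field is expressed there as separate start on and stop on lines.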

Both of these indicate that if the mingetty process on tty1 dies, it should be restarted. This works with SysV init, as the following session on tty6 shows:

# ps ax | grep tty6 | grep -v grep
11656 tty6     Ss+    0:00 /sbin/mingetty tty6
# kill -9 11656
# ps ax | grep tty6 | grep -v grep
11676 tty6     Ss+    0:00 /sbin/mingetty tty6

You can tell, due to the differing PID, that init has restarted the service.
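
The same check can be scripted. The following is only a sketch: respawn_status is a hypothetical helper that compares the PID seen before the kill with the PID seen afterwards (an empty string if nothing is running) and reports the outcome.

```shell
#!/bin/sh
# respawn_status is a hypothetical helper used only for illustration.
#   $1 = PID observed before the kill
#   $2 = PID observed afterwards (empty if no process was found)
respawn_status() {
    before=$1
    after=$2
    if [ -z "$after" ]; then
        echo "not respawned"
    elif [ "$after" != "$before" ]; then
        echo "respawned as PID $after"
    else
        echo "still running as PID $before"
    fi
}

# The PIDs from the SysV init session above.
respawn_status 11656 11676
# prints: respawned as PID 11676
```

On a live system, the two PIDs could be collected with something like pgrep -t tty6 mingetty before and a few seconds after the kill; an empty second argument yields "not respawned".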

Unfortunately, Upstart in FC10 is not quite so successful:

# ps ax | grep tty2 | grep -v grep
 2502 tty2     Ss+    0:00 /sbin/mingetty tty2
# kill -9 2502
# ps ax | grep tty2 | grep -v grep
#

Killing the service on FC10 results in the service staying down, which is a serious regression. Granted, the version of Upstart in FC10 is 0.3.9, and the current release is 0.5.0. I don't know why FC10 has 0.3.9 when it was released in November and 0.5.0 has been out since August. Regardless, if you are running FC10 and expect respawned services to work as they have in the past, be prepared for a nasty surprise.

One of the nice features of a supervising system is that services are restarted after crashes or unintended shutdowns. If you have ever killed the parent sshd process on a remote system and then logged out, you will know what I mean. FC10 does not take advantage of the supervisory features of Upstart beyond mingetty, which is a shame (assuming, of course, that they worked properly). With runit on Annvix, every system service that was daemonized was supervised, meaning a service could be killed or crash and within seconds it would be up again. This is a valuable feature to anyone working with remote systems or servers.
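
The idea behind supervision can be illustrated with a few lines of shell. This is a deliberately simplified sketch, not how runit or Upstart are implemented: supervise is a hypothetical function that runs a command in the foreground and starts it again whenever it exits. The restart cap and delay arguments are assumptions added so the example terminates; a real supervisor also handles logging, control commands, and signals.

```shell
#!/bin/sh
# supervise is a hypothetical helper, not part of runit or Upstart.
#   $1 = maximum number of restarts (keeps the sketch from looping forever)
#   $2 = seconds to wait before each restart
#   remaining arguments = the command to keep running
supervise() {
    max_restarts=$1
    delay=$2
    shift 2
    restarts=0
    while :; do
        "$@"                      # blocks until the service exits
        status=$?
        if [ "$restarts" -ge "$max_restarts" ]; then
            echo "giving up after $restarts restarts"
            return "$status"
        fi
        restarts=$((restarts + 1))
        echo "service exited with status $status; restart $restarts"
        sleep "$delay"
    done
}

# Stand-in "service" that exits immediately with a failure.
supervise 2 0 false || echo "supervise returned $?"
```

The key point is that the supervisor is the parent of the service, so it learns about the exit immediately rather than having to poll for a missing process.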

Down the road, it looks like Upstart will also handle time-based events, removing the need for cron, at, and similar scheduling systems. This would put it on par with OS X's launchd system, which can handle services and events. It looks as though future versions of Upstart could be quite useful, but as it stands now, in Fedora Core 10 at least, I would rather rely on runit for supervisory services.


About

Vincent Danen works on the Red Hat Security Response Team and lives in Canada. He has been writing about and developing on Linux for over 10 years and is a veteran Mac user.

9 comments
Photogenic Memory

I've never used Upstart. It seems interesting that processes can be killed and then respawned some time later. How much extra load does it put on the system/CPU? Kinda scary, huh? It's like a runaway process in the background waiting for a trigger and poof, it's chugging away. Overall, the software seems really cool. I hope this can be ported to CentOS.

aaronjsmith21

Your facts about the operation of Upstart are incorrect. I USE Ubuntu and Upstart, and if I kill a process here that is managed by Upstart, it will resume that process almost immediately!!! Therefore, you give a false representation of this! I believe this is due to Fedora's use of Upstart and their ability to properly manage running tasks within their OS and then report this to Upstart. So that I do not have my facts wrong, I am going to install Fedora and get back to you on that; I will even install your version of it. So next time, I ask that you expose all the facts, not just the problems you faced. If you write an article that is meant to expose this issue, then you should have named it "Issues with Upstart and Fedora Core 10" and specified at the beginning of the article that you have not tested this on Ubuntu yet and therefore cannot comment on it! So please, get the facts next time before you act!

bjswm

Um - yes, I have inadvertently killed a remote sshd. It did cause me some problems, and something like this would definitely have helped.

Tony K

How does it compare to runit? Does the machine boot much faster?

vdanen

It depends on the process and how quickly Upstart respawns it. With runit, a downed service that is supposed to be up will restart within 5s. So, yeah, if you have a bad config or something naughty is happening where the service is being killed (maybe because it's using too much memory), this could be a bad thing. On the other hand, one would hope you have a process in place to notify you if things are getting out of control so that this cycle wouldn't happen so often as to DoS your system. With runit at least, you can mark a service as down and it won't be restarted. I imagine you could do the same with Upstart.

vdanen

Facts are correct. Maybe you need to read it again? I specifically indicated this was upstart on Fedora. I also specifically indicated this may be due to an older version *on Fedora*. I have not tried Ubuntu (quite frankly, I couldn't be bothered). I'm glad to hear it works properly on Ubuntu, and likely with the newer version. I'm sure we'll see that then in Fedora 11.

seanferd

He specifically said it was Fedora's implementation and its use of an old version. So - ugh, forget it.

vdanen

Well, it depends. You can't just "plop" runit in like you can upstart. If you do things properly and utilize a fully rewritten init system using runit (i.e. all services are supervised, use the 1/2/3 script method, etc.) then runit smokes upstart (my Annvix servers would do a full boot in under 14s). Of course, that is a lot more work than just switching from SysV-init to upstart. I suspect runit would be faster anyways, but I couldn't tell you for sure as I haven't tried.

Photogenic Memory

I definitely know having ssh set to respawn is a good one. It won't help you if the OS is hung up, though. Perhaps applying it to apachectl for a web server is another good one? What else do you recommend applying it to? I guess it can be anything you want. How about some type of security check? But you can automate that with cron too. I like Linux. There are so many choices. Ultimately too many can drive you nuts, LOL!