Most computer users, like most motorists, never give a thought to what goes on under the hood. Even among programmers, very few actually understand the underlying processes and methodologies behind an operating system. That’s a shame, because understanding these concepts leads to more efficient use of a system's resources. In this Daily Drill Down, I’ll explain some of the UNIX design goals and how the kernel achieves them. I’ll describe the function of the kernel, the services it provides, and the methods that it uses to accomplish these tasks. Then, I’ll tell you how you can benefit from this knowledge.
What is the kernel, and what does it do?
Let’s assume that we have a computer that consists of a processor unit, a random access memory array, and such peripherals as hard drives, floppy drives, a video card, a sound card, serial ports, and parallel ports. If we were to design an operating system for this computer, we would have to know who will be using it and what they will want it to do. If only one user runs one process at a time, there’s no reason to support multi-user operation. DOS was designed for this target audience, and DOS is still perfect for many real-time, embedded/dedicated applications that require the computer to perform only one dedicated function. DOS is great for games, too.
UNIX, however, was designed for multi-user, multi-tasking applications. UNIX takes the same resources and governs them, making sure that they’re shared among all of the various authorized users. Thus, security and reliability rank high among the goals that drive designers of UNIX and UNIX-like operating systems. The UNIX kernel achieves these goals by ruling over all of the physical aspects of the machine. The kernel is a thread of execution—just like any other process. However, the kernel runs in a privileged mode. It can see the physical memory of the machine, and it can see all of the physical devices and ports.
The virtual machine concept
All other processes run in an artificial world that the kernel constructs for them. (Have you seen The Matrix?) Call these processes user processes. They run under a virtual machine illusion, which leads them to believe that they have an almost infinite amount of memory. The kernel prohibits user processes from seeing the physical memory, other processes, and hardware.
These are very important concepts; they’re fundamental to the UNIX design philosophy. For purposes of reliability and security, no user process should be able to tamper with any other user process (unless both processes specifically want to communicate with one another). You certainly wouldn't want another user's error to crash your process, and you wouldn't want other users to obtain your sensitive personal information or intellectual property without your approval.
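Here is a small sketch of that isolation, written in Python for brevity. It assumes a UNIX-like system where os.fork() is available, and the function name is my own invention for illustration. The child overwrites a variable, but it only changes its private copy; the parent's memory is untouched. The pipe is the "both processes specifically want to communicate" case: a channel that the kernel manages on their behalf.

```python
import os


def child_cannot_touch_parent_memory():
    """Fork a child that overwrites a variable; return (parent_view, child_view)."""
    secret = ["parent data"]         # lives in this process's address space
    read_end, write_end = os.pipe()  # kernel-managed channel both sides opt into

    pid = os.fork()
    if pid == 0:
        # Child: it works on its own copy of the parent's memory.
        try:
            os.close(read_end)
            secret[0] = "overwritten by child"
            os.write(write_end, secret[0].encode())
        finally:
            os._exit(0)              # exit immediately, skipping parent cleanup code

    # Parent: collect the child's report, then look at our own copy.
    os.close(write_end)
    child_view = os.read(read_end, 1024).decode()
    os.close(read_end)
    os.waitpid(pid, 0)
    return secret[0], child_view


if __name__ == "__main__":
    parent_view, child_view = child_cannot_touch_parent_memory()
    print("parent still sees:", parent_view)
    print("child saw:", child_view)
```

The only thing that crosses the process boundary is what the child deliberately writes into the pipe.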
The virtual machine concept helps you achieve these goals. It also promotes platform independence because the user process never sees the physical hardware. As long as all UNIX machines implement the same virtual machine illusion for their user processes, it should be possible to port the same process to almost any other UNIX platform.
The virtual machine also facilitates multitasking. Certainly, multitasking is possible without a virtual machine implementation. After all, DOS programs with Terminate and Stay Resident (TSR) tasks are technically multitasking implementations. However, when the two concepts are combined with swapping and a scheduler, you produce a pretty fancy operating system. System performance depends on the robustness of these algorithms. Since years of research have gone into these algorithms, many UNIX implementations are quite good at performing the functions. The UNIX community often criticizes less mature operating systems, such as Windows NT or Windows 95, for their immature scheduling and memory management algorithms.
Good scheduling and swapping algorithms behave very well under heavy load. Almost all OSs perform well when they are lightly loaded. However, if you want to test the swapping algorithm, run something that uses lots of memory. A good swapping algorithm makes wise decisions about what to retain in core versus what to swap out. Basically, an OS will want to make sure that the most frequently needed pieces of data are in core when they’re needed. When a process is scheduled to run but the data it needs has been swapped out, a page fault occurs and the kernel has to fetch the data from disk before the task (process) can continue. Performance drops as the page fault rate rises, because fetching data from a disk is phenomenally slower than accessing memory.
A good system administrator will monitor page faults to determine how efficiently a system is operating. On BSD systems, the xperfmon++ application provides a time-varying graphical monitor for page faults and other operating characteristics. Programmers also use this tool to optimize their programs, especially programs that allocate huge amounts of memory. Remember, if memory becomes scarce, the OS will start making decisions about what to swap and what to keep in core. Thus, anything the programmer can do to help the OS should result in improved performance, including de-allocating memory that isn’t in use. Anything you can do to concentrate memory usage into a smaller area is beneficial, too.
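A program can also ask the kernel for its own page fault counts. The sketch below is one way to do it, assuming a UNIX-like system where Python's resource module is available; the helper names are mine. The getrusage() call reports minor faults (satisfied without disk I/O) and major faults (which required disk I/O), and it is the major ones that hurt.

```python
import resource


def page_fault_counts():
    """Return (minor, major) page fault counts for this process so far."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_minflt, usage.ru_majflt


def faults_caused_by(action):
    """Run action() and return how many (minor, major) faults it added."""
    minor_before, major_before = page_fault_counts()
    action()
    minor_after, major_after = page_fault_counts()
    return minor_after - minor_before, major_after - major_before


if __name__ == "__main__":
    def touch_memory():
        # Allocate 64 MB and write one byte per page so every page gets used.
        buf = bytearray(64 * 1024 * 1024)
        for offset in range(0, len(buf), resource.getpagesize()):
            buf[offset] = 1

    minor, major = faults_caused_by(touch_memory)
    print("minor faults:", minor, "major faults:", major)
```

Wrapping a suspect routine in faults_caused_by() is a cheap way to see whether it is the part of your program that sends the system to the disk.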
In order to divide time among the user processes, the UNIX scheduler counts clock ticks. It keeps track of which tasks want to run and which ones can run. Frequently, tasks must wait to read data from, or write data to, a peripheral. For example, if a task is waiting for the user to hit a key, the kernel shouldn’t give that process CPU time until the user hits the key. Processes like this one are blocked, pending some event. Likewise, the print spooler can send only so many characters to the printer before the printer's buffer becomes full. Then the spooler, too, becomes blocked. Thus, only tasks that can do something get scheduled.
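You can ask the kernel the same question it asks itself: would a read on this device return data right now, or would the caller block? The sketch below uses a pipe as a stand-in for the keyboard (an assumption made so that the example runs without a terminal), and the function names are hypothetical.

```python
import os
import select


def is_readable(fd):
    """Ask the kernel whether a read on fd would return data right now."""
    ready, _, _ = select.select([fd], [], [], 0)  # timeout 0: check, do not wait
    return fd in ready


def keypress_demo():
    """Return (readable_before, readable_after) around a simulated key press."""
    read_end, write_end = os.pipe()
    try:
        before = is_readable(read_end)  # nothing written yet: a read would block
        os.write(write_end, b"k")       # stands in for the user hitting a key
        after = is_readable(read_end)   # data is waiting: a read would succeed
        return before, after
    finally:
        os.close(read_end)
        os.close(write_end)


if __name__ == "__main__":
    print(keypress_demo())
```

A process that simply calls read() on the empty pipe would sit in the blocked state, using no CPU time, until the byte arrives.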
The UNIX scheduler uses a round-robin scheduling scheme. Each process gets to run for a fixed amount of time before the next one takes its turn. The priority scheme will bias the scheduler to prefer certain processes. This scheme becomes very important when huge background tasks are competing with the X server for CPU time. Since I’m an arrogant user, I believe that my immediate wishes are far more important than any background task that I might have launched to run overnight. For people like me, UNIX provides the nice command, which allows you to drop the priority of a task so that it lets other tasks run first.
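To make the round-robin idea concrete, here is a toy model. It is only a sketch: it is not how any real kernel implements its scheduler, it ignores priorities and blocked tasks entirely, and the task names and tick counts are made up. Each task runs for at most one quantum and then goes to the back of the line.

```python
from collections import deque


def round_robin(tasks, quantum):
    """Simulate round-robin scheduling.

    tasks maps a task name to the number of clock ticks it needs.
    Returns the list of (name, ticks_run) slices in execution order.
    """
    queue = deque(tasks.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(quantum, remaining)   # run one quantum, or less if nearly done
        timeline.append((name, ran))
        if remaining > ran:
            queue.append((name, remaining - ran))  # unfinished: back of the line
    return timeline


if __name__ == "__main__":
    for name, ticks in round_robin({"editor": 3, "backup": 5}, quantum=2):
        print(name, "ran for", ticks, "ticks")
```

With a quantum of 2, the short task finishes on its second turn while the long one keeps cycling, which is why a big background job slows interactive work down but doesn't starve it.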
Unfortunately, UNIX isn’t a real-time operating system. The scheduler doesn’t guarantee an immediate response to interrupts. For most multi-user, multitasking applications, however, nobody cares about real-time responsiveness. Still, many embedded/dedicated systems require real-time processing. For these applications, UNIX isn’t the best operating system.
You can see the scheduler in action by using the ps command. It will show you the various processes that are running on your system. It also displays status information, which is helpful in solving problems. Using the ps command, you can determine if a program is running, suspended, or waiting on an event. Each system displays this data slightly differently, so you’ll need to read the manual pages (by typing man ps) to determine the required command line options and formatting of output information.
The nice and renice commands allow the system administrator and users to adjust task priorities. Only the superuser can bump a priority higher, but anybody can lower the priority of a task that they own. The nice command is used for launching a command with a specified priority. The renice command changes the priority of a task that’s already running. Usually, renice is used after you notice that a task is causing problems and you have to drop its priority so that it will stop annoying other users.
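A process can also lower its own priority from inside the program, which is handy for a long batch job that should stay out of everyone's way. This minimal sketch assumes a UNIX-like system and uses Python's os.nice() wrapper around the underlying system call; the function name is my own.

```python
import os


def lower_own_priority(increment):
    """Raise this process's niceness by increment; return (before, after)."""
    before = os.nice(0)         # an increment of 0 just reports the current value
    after = os.nice(increment)  # positive increments lower the priority
    return before, after


if __name__ == "__main__":
    before, after = lower_own_priority(5)
    print("niceness went from", before, "to", after)
```

A larger niceness value means a lower priority. Passing a negative increment asks for a higher priority, and for an ordinary user the kernel will normally refuse with a permission error.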
When should a task run at a lower priority? Well, I always run huge, computationally intensive tasks and most background jobs at lower priorities. If I’m sitting at the machine, I want it to pay attention to me! Most users feel the same way. Oddly, most Microsoft operating systems care more about tasks that are already running, and they tend to ignore users—regardless of how much they click the unresponsive GUI. Prioritization will help you keep big, computationally intensive tasks from bullying users who are just trying to get their work done.
Tasks that take huge memory footprints take a long time to swap in and out, too. If you force one of these tasks to share time with smaller tasks, you’ll just waste most of the machine's time moving process information back and forth. You want the CPU to perform actual task work—not just move tasks in and out of memory! Sometimes, it’s better to drop the priority of the big task so that it will wait until all of the little tasks are done. The same rule applies whenever you have two tasks that are fighting over common resources. Allowing one task to run at higher priority might get both of them to finish faster.
The top command is very useful for watching the task scheduler information in real time. The ps command provides a single snapshot view of the tasks that are running, but the top command displays the information continuously. This command is also great for changing priorities or killing tasks. While top is running, you can press [r] to renice a process. You can even kill a process by pressing [k].
In addition to ruling over system memory, the kernel rules over all of the peripherals. These resources are too precious for you to allow a user process to touch them directly. Thus, the kernel provides various services that grant user processes access to these devices. The file system is a perfect example of a resource that user processes access frequently. The kernel enforces security restrictions so that users can’t gain unauthorized access to another user's files. The kernel grants device access through system calls. For example, a user process might call the open(), read(), or write() system calls to manipulate a file within the file system. A process could call the fork() system call to create a new child process. Virtually every process that runs on the computer will use most of these functions. For example, the date command relies on the kernel to provide the current date and time. The cat command accesses both the file system and the terminal device. The kernel provides all of these services!
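The sketch below shows those file system calls from a user process's point of view. Python's os.open(), os.write(), and os.read() are thin wrappers around the system calls of the same names; the temporary file and the function name are just scaffolding for the example.

```python
import os
import tempfile


def write_then_read(data):
    """Write data to a scratch file, then read it back through a new descriptor."""
    fd, path = tempfile.mkstemp()  # creates and opens the file, returns a descriptor
    try:
        try:
            os.write(fd, data)     # ask the kernel to store the bytes
        finally:
            os.close(fd)
        fd = os.open(path, os.O_RDONLY)  # ask the kernel for read access
        try:
            return os.read(fd, len(data))
        finally:
            os.close(fd)
    finally:
        os.unlink(path)            # remove the scratch file


if __name__ == "__main__":
    print(write_then_read(b"hello, kernel"))
```

At no point does the process touch the disk controller. It only ever holds a small integer, the file descriptor, and the kernel does the rest.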
The kernel's structural organization
Think of the kernel as being divided into two separate functional blocks. The lower functional block would consist of the device drivers, the virtual memory manager, the scheduler, the swapper (or pager), and the initialization code. The upper functional block would consist of the system call processing functions. User processes view this part of the kernel as a library of service calls.
Service calls must communicate asynchronously with the lower level, but user processes don’t need to worry about how this communication occurs. A user process assumes that the system call is synchronous. For example, if a user process writes a large block of data to a file, the system call returns almost immediately and the process carries on, believing that the data have been written. In fact, the operating system may hold these transactions in its cache for some time (traditionally on the order of tens of seconds) before actually writing the data to disk. This caching allows the system to operate more efficiently as a whole. If it didn't work this way, the user process would have to wait for the write operation to complete or it would have to poll the operating system in order to make sure that the action actually happened.
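When a program really does need to know that its data has left the cache, it can ask the kernel to flush it. The following is a minimal sketch, assuming a UNIX-like system; the function name and the file name in the demo are hypothetical. The write() call returns once the kernel has accepted the data, while fsync() waits until the kernel reports that the data has been handed to the storage device.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data to path and wait for the kernel to flush it; return bytes written."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = os.write(fd, data)  # returns once the kernel has taken the data
        os.fsync(fd)                  # blocks until the cached data is flushed
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        target = os.path.join(scratch, "journal.dat")
        print(write_durably(target, b"important record"), "bytes flushed")
```

The trade-off is exactly the one described above: the fsync() call gives up the speed of the cache in exchange for certainty, so it is worth using only for data you can't afford to lose.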
Programmers and system administrators can apply their knowledge of the kernel to improve overall system performance, and they should look for ways to help the kernel out whenever possible. Using the nice and renice commands, they can adjust the scheduling priorities so that the system runs more efficiently. Strategic use of memory can make applications run faster and more efficiently. Tools like xperfmon++ can help determine if a system has adequate system resources.
Ed Gold grew up in Louisville, KY, and he received his master’s degree in electrical engineering at the University of Louisville. Ed owns a small engineering consulting firm in Orlando, FL, and he is working on the electro-optic subsystem for Lockheed Martin's Joint Strike Fighter proposal. Although his primary computing interests are in image processing and artificial intelligence, Ed is a dedicated FreeBSD/Linux enthusiast. He is currently working to improve the FreeBSD system install utility.