..."Shared memory is the fastest form of IPC..."
—W. Richard Stevens
In this Daily Drill Down, I would like to pay some attention to one of the most important parts of the software development process: programming. I’m going to discuss using shared memory, its forms and applications. Shared memory is a potent tool in the hands of the knowledgeable programmer. It can be accessed randomly, as needed, by any process active on the system. If you’re wondering whether you can use shared memory in your applications, you’ll want to read this Daily Drill Down. I’ll attempt to help you answer these questions: Do I need it or not? If I do need it, then how and where can I use it correctly? (Before starting, though, I should make you aware that this topic is very wide and can't be fully covered in just one article.)
Shared memory can be used by programmers in several ways: mmap-ing, POSIX (Portable Operating System Interface for UNIX) shared memory, and shared memory that is a part of the System V IPC package along with message-passing and semaphores. I'll give you a brief description of each and will spend some extra time on the most popular one, which is shared memory from the System V IPC package. The System V IPC facilities, originally added by AT&T, have since been ported to every other UN*X variant, including FreeBSD and Linux.
What is shared memory?
Shared memory technology is a part of a powerful IPC (interprocess communication) toolbox in UNIX-derived systems, which allows arbitrary processes to exchange data and synchronize execution. There are many forms of IPC on a UNIX-derived system. (Several forms of IPC in the base UNIX toolbox are serial communication mechanisms. These linear forms have many uses, but I’ll be focusing on shared memory in this Daily Drill Down.)
BSD, sockets, System V, and ioctl()
For both interprocess and intersystem communication requirements, Berkeley's Computer Systems Research Group (CSRG) developed sockets. Some people believe sockets are a part of the System V IPC. That's wrong! Although the BSD system introduced sockets to provide common methods for interprocess communication and to allow the use of sophisticated network protocols, sockets are now just another part of the networking API of any UNIX OS, along with IPC and XTI. All current flavors of UNIX implement all of these mechanisms.
One should not mix these definitions, though there are no solid borders between them. System V has its own definition of networking environment for the IPC and network communications, and this has posed a problem for UNIX integration. Traditional methods for implementing network communications rely heavily on the ioctl() system call to specify control information, but usage isn't uniform across network types. For this reason, programs designed for one network may not work for other networks. In the future, System V will use the streams mechanism to handle network configurations uniformly. Again, I'll address these mechanisms and their uses in another article.
Where should we use shared memory?
The first question every programmer will have is: Where should I use shared memory? Based on my own nine years of experience in programming on different platforms, I can say that programmers and developers will combine almost anything in their projects. I have seen combinations I wouldn't have imagined possible. That's why I don't want to and, truthfully speaking, could not draw a line between what and where something should be implemented. But I will say, again, in my experience, that the use of shared memory is very helpful and productive for Web applications, as well as for security-critical applications and authorization/authentication modules for Web servers. In one way or another, almost all commercial software uses it. It provides a very low-level tool for defining processes as separate cooperating entities, and it's enormously useful because of its speed.
Why is shared memory the fastest form of IPC? Shared memory lets two or more processes share a region of memory. Once that region is mapped into the address space of each process, passing data between the processes requires no system calls into the kernel, which other forms of IPC would need. The processes must, of course, coordinate and synchronize their use of the shared memory between themselves to prevent data loss.
One way of sharing memory, mmap-ing, is so named because the function mmap() is used to work with memory. This function maps either a file or a POSIX shared memory object into the address space of a process. Why should we use it? I'll refer to the small code fragment listed in the mmap() man page.
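To make the point about system calls concrete, here is a minimal sketch rather than production code. It assumes a system that supports anonymous shared mappings (MAP_ANONYMOUS, available on Linux and the BSDs), and the helper name share_int_with_child() is my own invention. A child process stores a value with an ordinary assignment, and the parent reads it with an ordinary load. The only coordination is that the parent waits for the child to exit before reading.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child stores `value` into a shared int and
 * exits; the parent waits for it and then reads the int directly.
 * Returns the value seen by the parent, or -1 on error. */
int share_int_with_child(int value)
{
    assert(value >= 0);   /* -1 is reserved for errors in this sketch */

    /* One int of memory that stays shared across fork(). */
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {
        *shared = value;  /* plain store, no system call */
        _exit(0);
    }

    /* Synchronization: do not read until the child has finished. */
    int status;
    int seen = -1;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        seen = *shared;   /* plain load, no system call */

    munmap(shared, sizeof *shared);
    return seen;
}
```

Calling share_int_with_child(42) should hand 42 back to the parent. Waiting for the child to exit is the crudest possible synchronization; cooperating long-lived processes would normally use a semaphore or a similar primitive instead.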
Consider the following pseudo-code:
fildes = open (...)
lseek (fildes, offset, whence)
read (fildes, buf, len)
/* use data in buf */
The following is a rewrite using mmap():
fildes = open (...)
address = mmap ((caddr_t) 0, len, (PROT_READ | PROT_WRITE),
MAP_PRIVATE, fildes, offset)
/* use data at address */
The special benefit of using a memory-mapped file is that all the input/output (I/O) operations are done under the covers by the kernel, and we just write code that fetches and stores values in the memory-mapped region. We never have to call read(), write(), or lseek(). This can simplify the code quite a bit; however, not all files can be memory mapped. If we try to map a descriptor that refers to a terminal or a socket, mmap() returns an error. These types of descriptors must be accessed using read() and write(). Another use of mmap() is to provide shared memory between unrelated processes. In this case, the mapping must be created with MAP_SHARED rather than the MAP_PRIVATE used in the fragment above, and the actual contents of the file become the initial contents of the shared memory. Replacing explicit file I/O with fetches and stores of memory can often simplify programs and improve performance. mmap-ing is a very useful technique. mmap() is also used for POSIX shared memory.
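As an illustration of trading explicit file I/O for plain memory accesses, the following sketch maps a regular file with MAP_SHARED and converts it to upper case in place. The function name uppercase_file_in_place() is hypothetical, and the code assumes the file is small enough to map in one piece.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#include <assert.h>
#include <ctype.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: upper-case every byte of a regular file in place.
 * Returns the number of bytes processed, or -1 on error. */
long uppercase_file_in_place(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects a zero length */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return -1;

    /* No read(), write(), or lseek(): just loads and stores. */
    for (size_t i = 0; i < len; i++)
        p[i] = (unsigned char)toupper(p[i]);

    int rc = msync(p, len, MS_SYNC);  /* push the changes to the file */
    munmap(p, len);
    return rc == 0 ? (long)len : -1;
}
```

Because the mapping is MAP_SHARED, the stores reach the file itself, and any other process that maps the same file sees them. With MAP_PRIVATE the changes would stay in a private copy.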
POSIX shared memory
POSIX shared memory extends the concept of shared memory to include memory that is shared between unrelated processes. The POSIX.1 standard provides two ways to share memory between unrelated processes:
- Memory-mapped files. A file is opened by open(), and the resulting descriptor is mapped into the address space of the process by mmap().
- Shared memory objects. The function shm_open() opens a POSIX.1 IPC name (perhaps a pathname in the filesystem), returning a descriptor that is then mapped into the address space of the process by mmap().
Both techniques call mmap(). The only difference is in how the descriptor being mapped is obtained. Using POSIX shared memory is a two-step process: first, we call shm_open(), specifying a POSIX IPC name, to either create a new shared memory object or open an existing one; then we call mmap() to map the resulting descriptor into the address space of the calling process. The reason for this two-step process (instead of a single step that would take a name and return an address within the memory of the calling process) is that mmap() already existed when POSIX invented its form of shared memory. The result is similar to memory-mapping a file, but the shared memory object need not have begun life as a file.
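The two-step sequence might look like the sketch below. The helper names, the object name passed in by the caller, and the 4096-byte size are all assumptions made for illustration. Note one extra step the description glosses over: a newly created object has length zero, so it is sized with ftruncate() before being mapped. On some systems the program must also be linked with -lrt.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define DEMO_SHM_SIZE 4096  /* assumed size for this sketch */

/* Create (or open) the named object, size it, and store a message. */
int posix_shm_write(const char *name, const char *msg)
{
    assert(name != NULL && msg != NULL);

    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);   /* step 1 */
    if (fd < 0)
        return -1;
    if (ftruncate(fd, DEMO_SHM_SIZE) < 0) {            /* give it a size */
        close(fd);
        return -1;
    }
    char *p = mmap(NULL, DEMO_SHM_SIZE, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);                 /* step 2 */
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    strncpy(p, msg, DEMO_SHM_SIZE - 1);
    p[DEMO_SHM_SIZE - 1] = '\0';
    munmap(p, DEMO_SHM_SIZE);
    return 0;
}

/* Open an existing object by name and copy its message into `out`. */
int posix_shm_read(const char *name, char *out, size_t outlen)
{
    assert(name != NULL && out != NULL && outlen > 0);

    int fd = shm_open(name, O_RDONLY, 0);
    if (fd < 0)
        return -1;
    const char *p = mmap(NULL, DEMO_SHM_SIZE, PROT_READ,
                         MAP_SHARED, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    strncpy(out, p, outlen - 1);
    out[outlen - 1] = '\0';
    munmap((void *)p, DEMO_SHM_SIZE);
    return 0;
}
```

One process could call posix_shm_write("/demo", "hello") and an unrelated process could later call posix_shm_read("/demo", buf, sizeof buf). The object persists until someone calls shm_unlink() on the name.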
System V IPC
Well, now we've gotten to the System V IPC and the most popular implementation of shared memory. The UNIX System V IPC package consists of three mechanisms: messages, shared memory, and semaphores. Implemented as a unit, they share common properties:
- Each mechanism contains a table with entries that describe all instances of the mechanism.
- Each table entry contains a numeric key, which is its user-chosen name.
- Each mechanism contains a "get" system call to create a new entry or to retrieve an existing one. The parameters of the calls include a key and flags.
- The kernel searches the proper table for an entry named by the key. The "get" system call returns a kernel-chosen descriptor for use in the other system calls, analogous to the descriptor returned by the file system call open().
- Each IPC entry has a permissions structure that includes the UID (user ID) and GID (group ID) of the process that created the entry, a UID and GID set by the "control" system call, and a set of read-write-execute permissions for user, group, and others (the same as with the file permissions modes).
- Each entry contains other status information, such as the PID (process ID) of the last process to update an entry and the time of last access or modification.
- Each mechanism contains a so-called "control" system call to query the status of an entry, to set status information, or to remove the entry from the system.
When a process queries the status of an entry, the kernel verifies that the process has read permissions and then copies data from the table entry to the user-space address block.
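For shared memory, the "control" call is shmctl(), and the status query described above is its IPC_STAT command. The sketch below uses hypothetical wrapper names, and a private 0600 segment is assumed purely for demonstration. It creates a segment, copies its table entry into a user-space struct shmid_ds, and removes the segment again.

```c
#define _XOPEN_SOURCE 700   /* request POSIX/XSI declarations */
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a private segment readable and writable by its owner only.
 * Returns the kernel-chosen identifier, or -1 on error. */
int shm_create_private(size_t size)
{
    return shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
}

/* Copy the kernel's table entry for segment `id` into *out.
 * The kernel checks read permission before copying. */
int shm_query(int id, struct shmid_ds *out)
{
    assert(out != NULL);
    return shmctl(id, IPC_STAT, out);
}

/* Nonzero if the entry's owner UID matches the caller's effective UID. */
int shm_owned_by_caller(const struct shmid_ds *ds)
{
    assert(ds != NULL);
    return ds->shm_perm.uid == geteuid();
}

/* Remove the entry from the system. */
int shm_remove(int id)
{
    return shmctl(id, IPC_RMID, NULL);
}
```

After a successful shm_query(), fields such as shm_perm.mode, shm_segsz, shm_cpid, and shm_nattch hold the permissions, size, creator PID, and current attach count that the list above describes.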
System V shared memory is similar to POSIX shared memory. Instead of calling shm_open() followed by mmap(), it uses the shmget() function followed by shmat(). shmget() is used to obtain an identifier for a memory block, and shmat() attaches the shared memory segment to the address space of the process. To remove a shared memory object, one should call shmctl() with the IPC_RMID command.
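Here is a small sketch of that life cycle: shmget(), shmat(), shmdt(), and shmctl() with IPC_RMID. To keep it self-contained, it uses IPC_PRIVATE instead of a user-chosen key and attaches the same segment twice within one process; both choices are simplifications of mine. In real use, the two attachments would live in different processes.

```c
#define _XOPEN_SOURCE 700   /* request POSIX/XSI declarations */
#include <assert.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Write `msg` through one attachment of a segment and read it back
 * through a second, read-only attachment. Returns 0 on success. */
int sysv_shm_roundtrip(const char *msg, char *out, size_t outlen)
{
    assert(msg != NULL && out != NULL && outlen > 0);

    const size_t size = 4096;   /* assumed segment size */
    int rc = -1;

    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (id < 0)
        return -1;

    /* shmat() signals failure with (void *)-1, not NULL. */
    char *writer = shmat(id, NULL, 0);
    char *reader = shmat(id, NULL, SHM_RDONLY);
    if (writer != (void *)-1 && reader != (void *)-1) {
        strncpy(writer, msg, size - 1);
        writer[size - 1] = '\0';
        strncpy(out, reader, outlen - 1);
        out[outlen - 1] = '\0';
        rc = 0;
    }

    if (writer != (void *)-1)
        shmdt(writer);
    if (reader != (void *)-1)
        shmdt(reader);
    shmctl(id, IPC_RMID, NULL);   /* remove the segment from the system */
    return rc;
}
```

The two pointers have different addresses but refer to the same memory, which is exactly what two cooperating processes would see after each calls shmat() on the same identifier.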
To understand the practical uses of the System V shared memory functions in our everyday programming, I want to review an example of a type of program where we need to use shared memory. This will help us to understand the real reasons for using this technique and will uncover most of its pluses and minuses.
A while ago, I was involved in the development of a WebChat module. There are a great number of such modules, and they're extremely popular on the Net. Online chat doesn't require any special software to be downloaded, and in most cases, it's very easy to use.
In my experience, most such chat programs are written in Perl or some other interpreted "scripting" language. From the perspective of the developer, this is smart, because (for example) Perl was created especially for managing text operations. When it comes to speed and efficiency, though, this isn't a very good choice at all. The WebChat module I developed was written in the C language, and shared memory appeared to be an extremely good solution for handling the messaging base. Let me explain why.
Usually, WebChat divides the browser's window into three or more parts (frames). The biggest frame is used for viewing messages sent by users. The next largest frame is used for sending messages. An extra frame can be used for informational messages, showing banner ads, etc.
The biggest frame should be refreshed as often as possible to create the illusion of real-time conversation. As a rule, a 10-second refresh rate is used. (Actually, in the module I've designed, this varies from 5 to 15 seconds because during the authorization phase the server selects an optimal refresh rate depending on channel bandwidth and other parameters.)
Now, just imagine 10, 12, or even more users trying to refresh the main frame every 10 seconds! Consider, too, that every refresh request should perform a total database lookup for new messages and send them back! File operations will kill the server storage device quickly, and the site hosts will have to flush their storage arrays almost every month or even more often. That's not good. That's why the only answer here is memory-based operations, and I think the shared memory technology should be used here. Let me show you an example by explaining various functions used for shared memory. (Be aware, this is just pseudo-code and should be used only as a core structure to show the ways of working with the System V shared memory!)
These are the functions for working with text:
int text_init (void);
int text_clear (void);
TTextAr *text_getpointer (void);
int text_insert (TTextAr *text, char* user, char* typo);
int text_output (TTextAr *text, char *user);
TTextAr is the structure used for handling messages and some system information (date, time, user's info, etc.). SHM_SEGN is the key that names the shared memory segment. A programmer may choose it arbitrarily or with a system that works for him, as long as every cooperating process uses the same value. text_init() is the initialization function: it sets up our shared memory block using shmget() and shmat(), and it returns an error code if the shared memory allocation fails:
id = shmget (SHM_SEGN, sizeof (TTextAr), (SHM_R | SHM_W) | IPC_CREAT);
text = shmat (id, NULL, 0);
text_clear() is used for de-initialization and freeing up the memory used by our structure. Like text_init(), the text_clear() function returns an error code in case of shared memory de-allocation errors:
id = shmget (SHM_SEGN, 0, (SHM_R | SHM_W));
shmctl (id, IPC_RMID, NULL);
text_getpointer() is used for setting a pointer to our TTextAr structure after it has been allocated in shared memory. It returns this pointer to the caller:
id = shmget (SHM_SEGN, 0, (SHM_R | SHM_W));
text = shmat (id, NULL, 0);
text_insert() is used for inserting some user's message (typo variable) into our structure. We're getting a pointer to our structure as a text variable. Here we can handle our structure without performing any memory allocation operations. For example:
strcpy (text->item[i].u_name, user);
strcpy (text->item[i].u_text, typo);
text_output() displays the last couple of messages from our structure. In this function, as in text_insert(), we can handle our structure as a usual variable without performing any memory allocation procedures.
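Putting the fragments together, a compilable sketch of the core functions might look like this. The layout of TTextAr, the array sizes, the count field, and the value of SHM_SEGN are all assumptions of mine, since the original structure isn't shown. I've also used const char * parameters and snprintf() instead of strcpy() so that an over-long name or message can't overrun its slot.

```c
#define _XOPEN_SOURCE 700   /* request POSIX/XSI declarations */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#define SHM_SEGN  0x57434854  /* assumed key; any agreed-upon value works */
#define MAX_ITEMS 32          /* assumed capacity of the message base */
#define NAME_LEN  32          /* assumed field sizes */
#define TEXT_LEN  256

typedef struct {
    char u_name[NAME_LEN];
    char u_text[TEXT_LEN];
} TTextItem;

typedef struct {
    unsigned long count;          /* total messages inserted so far */
    TTextItem item[MAX_ITEMS];    /* used as a ring buffer */
} TTextAr;

/* Create the segment (if needed) and zero it. Returns 0 or -1. */
int text_init(void)
{
    int id = shmget(SHM_SEGN, sizeof(TTextAr), IPC_CREAT | 0600);
    if (id < 0)
        return -1;
    TTextAr *text = shmat(id, NULL, 0);
    if (text == (void *)-1)
        return -1;
    memset(text, 0, sizeof *text);
    return shmdt(text);
}

/* Attach the existing segment; returns NULL on error. */
TTextAr *text_getpointer(void)
{
    int id = shmget(SHM_SEGN, 0, 0600);
    if (id < 0)
        return NULL;
    TTextAr *text = shmat(id, NULL, 0);
    return text == (void *)-1 ? NULL : text;
}

/* Store one message; the oldest slot is overwritten when full. */
int text_insert(TTextAr *text, const char *user, const char *typo)
{
    assert(text != NULL && user != NULL && typo != NULL);
    TTextItem *slot = &text->item[text->count % MAX_ITEMS];
    snprintf(slot->u_name, sizeof slot->u_name, "%s", user);
    snprintf(slot->u_text, sizeof slot->u_text, "%s", typo);
    text->count++;
    return 0;
}

/* Remove the segment from the system. Returns 0 or -1. */
int text_clear(void)
{
    int id = shmget(SHM_SEGN, 0, 0600);
    if (id < 0)
        return -1;
    return shmctl(id, IPC_RMID, NULL);
}
```

The mode 0600 grants the owner read and write access, which is what (SHM_R | SHM_W) expresses in the fragments above. This sketch does no locking at all; with several CGI processes inserting at once, text_insert() would need to be protected, for example by a System V semaphore.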
After studying this example, I’m sure you’ll agree that shared memory technology is easy enough to handle.
Administering System V
Let’s look at the administration of System V shared memory functions for a moment. As with most everything in this world, System V shared memory has certain design limitations. These limits may vary from one operating system to another. For example, in the FreeBSD (FreeBSD 3.3-RELEASE i386) operating system, such limits can be easily configured during the kernel compilation process.
The maximum number of bytes for a shared memory segment is calculated by the formula SHMMAX=(SHMMAXPGS*PAGE_SIZE+1), where SHMMAXPGS is a predefined constant set to 1025, PAGE_SIZE is a constant defined as 1<<PAGE_SHIFT, and PAGE_SHIFT is set to 12, which makes PAGE_SIZE equal to 4,096. Using this relatively simple formula, one can easily compute the maximum size in bytes of a shared memory segment, which is 4,198,401.
The PAGE_SIZE and PAGE_SHIFT constants also can be changed in the header file /sys/i386/include/param.h, which holds most of the system-wide definitions.
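The arithmetic can be checked with a few lines of C. The DEMO_ prefixes are my own, chosen so the names don't collide with the real kernel macros. The values are the ones quoted above for FreeBSD 3.3 on i386.

```c
#include <assert.h>

/* Values quoted in the text for FreeBSD 3.3-RELEASE on i386. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (1L << DEMO_PAGE_SHIFT)   /* 4096 bytes */
#define DEMO_SHMMAXPGS  1025L

/* SHMMAX = SHMMAXPGS * PAGE_SIZE + 1 */
long demo_shmmax(void)
{
    long bytes = DEMO_SHMMAXPGS * DEMO_PAGE_SIZE + 1;
    assert(bytes > 0);
    return bytes;   /* 1025 * 4096 + 1 = 4198401 */
}
```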
As far as the Solaris (SunOS 5.7 i386) operating system is concerned, the settings of these limits can be checked by the sysdef command:
white@onyx:~>sysdef | grep SHM
1048576 max shared memory segment size (SHMMAX)
1 min shared memory segment size (SHMMIN)
100 shared memory identifiers (SHMMNI)
6 max attached shm segments per process (SHMSEG)
The settings can be modified by editing the /etc/system file, which is read when the kernel bootstraps. There are four statements that are responsible for these limits:
set shmsys:shminfo_shmmin = 1
set shmsys:shminfo_shmseg = 6
set shmsys:shminfo_shmmax = 1048576
set shmsys:shminfo_shmmni = 100
It goes without saying that every operating system has certain limits and almost every system can be changed per user needs, at least on a system-wide basis. This is one of the blessings of UN*X and BSD-derived systems: Everything can be tuned for the application. As we move towards embedding complete systems in application fabrics, whether on the Internet or within factory machinery, this tuning capability will become much more important.
Much of the functionality of shared memory can be reproduced using other system features and facilities, but shared memory tools provide much better performance for closely cooperating application packages than anything else. The speed is worth spending the time to do the job right, as IBM found out when their Olympics web statistics system choked in front of 80 million people. Learn from their debacle, and add shared memory to your toolbox!
Alexander Prohorenko is a student of computer sciences. For the last three years, he worked as a leading system administrator and coder for one of the largest ISPs in Ukraine, and he installed and integrated much of the Internet infrastructure for that country. Now he’s engaged in quality systems programming and high performance web coding for Network Lynx, an American company that’s based in Rio Rancho, NM.
Donald Wilde owns Silver Lynx, a control electronics development specialty company. He’s also a technical partner in Network Lynx, a Web-based virtual company that uses the best of open-source tools for Internet business. Don has been interested in computing since 1980, when he learned to program his first computer, a 16-bit Intel SDK-86, in raw machine code.
Don especially likes to build code that "makes the machines dance." He would be willing to compare quality code to any other art form, and he believes that software design is a marriage of creative artistic flow and discipline.