
Introduction to developing Apache modules using C in the UNIX environment

Extending an Apache server's function with modules (using C, CGI, and PHP) allows you to create some very powerful modules for the fastest and most reliable server available. Alexy Prohorenko is here to show you how.

The Apache server supports an application programming interface (API), which allows programmers to extend the server with their own extension modules. The Apache module API also gives programmers access to nearly all of the server's internal processing.

Modules simply look like additional parts of the original server—they can behave like Common Gateway Interface (CGI) scripts, creating pages on the fly, or they can make specific and fundamental changes in the operation of the server, like a single sign-on security system or a Web interface for relational databases.

In this Daily Drill Down, I will introduce you to the art of creating your very own Apache modules. With the help of CGI, PHP, and C, you can create very simple and very powerful modules for one of the fastest, most reliable servers available.

Introducing Apache
The Apache server is a freely distributed, full-featured Web server that runs on UNIX and Windows NT systems. The goal of this project (according to the Apache Software Foundation team) is to provide a secure, efficient, and extensible server that provides HTTP services in sync with the current HTTP standards.

Apache has been the most popular Web server on the Internet since April 1996. The December 2000 Netcraft Web Server Survey found that over 60 percent of the Web sites on the Internet are using Apache (over 62 percent if Apache derivatives are included), making it more widely used than all other Web servers combined.

Better than CGI?
If you are comparing the execution speed of Apache modules and CGI scripts, Apache modules are the fastest Web development solution, even faster than the FastCGI protocol. Modules, however, suffer from low portability, as they are more system-dependent than CGI scripts.

If your goal is portability, you should choose pure CGI. With correctly designed and written CGI scripts, programmers can be sure that their applications will work without problems or modifications on any server, any machine, and any OS. You need to remember that performance will be poor, however.

FastCGI provides a real performance improvement, but it requires some minor modifications to the source code of the CGI scripts before you will be able to use them.

PHP is an intelligent solution for everybody who needs easy programming. You do not create source code; you just create HTML pages (which include bits of PHP code). Unfortunately, performance is rather poor.

If you want real power and the best performance, your choice should be the server API. Modules that you create with the API work perfectly well with any type of browser, and they work very quickly. Keep in mind, however, that API modules only work with the server for which they were designed; transferring and installing your modules onto another server with another OS will be a nonstop headache. Some modules, in the worst situations, will need to be totally rewritten.

Module specifics
With modules, you can arrange for the server to take specific actions when the server starts or stops; you can extend standard Apache configuration files with your own directives; you can create your own much-more-powerful-than-standard authentication and authorization systems; and more.

Modules in Apache can be compiled in two ways: They can be linked directly into the server executable, or they can be built as dynamic shared objects (DSOs), which are loaded on demand. If you want to link your modules into the server, you should compile your code into object files and then rebuild your server, linking in your objects. We will not examine this case: it makes adding or debugging modules very hard, because you have to rebuild your Apache server each time.

We will, however, go into more detail with DSO modules. With them, you will need to have the standard mod_so module linked with your Apache server, which will allow you to add DSO modules. Each time you add a new DSO module, you will just add a few lines to your server configuration file, and that's all. Your module will exist as a separate file in the libexec directory. To make everything clear, let's take a look at an example (based on demo modules from the Apache distribution).
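1:#include "httpd.h"
2:#include "http_config.h"
3:#include "http_core.h"
4:#include "http_log.h"
5:#include "http_protocol.h"
6:#define myname  "Dummy Example of Apache Module"

static int
7:dummyexample_handler (request_rec *r)
{
8:r->content_type = "text/html";
9:ap_send_http_header (r);
10:ap_rprintf (r, "<HTML>\n");
11:ap_rprintf (r, "<HEAD><TITLE>%s</TITLE></HEAD>\n", myname);
12:ap_rprintf (r, "<BODY>\n");
13:ap_rprintf (r, "Hello, world!<BR>\n");
14:ap_rprintf (r, "</BODY>\n");
15:ap_rprintf (r, "</HTML>\n");
16:return OK;
}

17:handler_rec dummyexample_handlers[] = {
       {"dummyexample-handler", dummyexample_handler},
       {NULL}                       /* end-of-list marker */
};

18:module MODULE_VAR_EXPORT dummyexample_module = {
       STANDARD_MODULE_STUFF,
19:NULL,                            /* module initializer */
20:NULL,                            /* per-directory config creator */
21:NULL,                            /* directory config merger */
22:NULL,                            /* server config creator */
23:NULL,                            /* server config merger */
24:NULL,                            /* command table */
25:dummyexample_handlers,           /* list of handlers */
26:NULL,                            /* name translator */
27:NULL,                            /* check/validate user ID */
28:NULL,                            /* check user ID is valid here */
29:NULL,                            /* check access by host address */
30:NULL,                            /* MIME type checker/setter */
31:NULL,                            /* fixups */
32:NULL,                            /* logger */
33:NULL,                            /* header parser */
34:NULL,                            /* process initializer */
35:NULL,                            /* process exit/cleanup */
36:NULL                             /* post read_request handling */
};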

In the above example, the numbers and the colon (:) are used for clarity only.
Let's find out what each code line of our module means:
  • 1-5: Seems to be pretty clear—we are including necessary Apache module header files.
  • 6: We define a string constant (a preprocessor macro) that holds the page title used for testing.
  • 7: We declare our content handlers. Content handlers are invoked when the server encounters a document that our module must see. In our case, the content handler will just generate HTML code.
  • 8-9: We are preparing and sending headers before sending HTML content.
  • 10-15: We are just creating our page and sending ("printing") it to the client.
  • 16: This is the finish line. Possible variants are: OK (everything is okay), DECLINED (this is not something with which our module should get involved), or AUTH_REQUIRED (this is used when we need to authorize a user).
  • 17: This is our starting block, where we keep a list of content handlers available in this module.
  • 18: Here is where we will list the callback routines and data structures that provide hooks into our module from other parts of the server. In case a particular callback is not needed, we use the keyword NULL.
  • 19: Module initializer
  • 20: Per-directory config creator
  • 21: Directory config merger
  • 22: Server config creator
  • 23: Server config merger
  • 24: Command table
  • 25: List of content handlers; these are called ninth during request processing, after the fixups and before the logger.
  • 26: URI-to-filename translator; this routine will be called second during request processing.
  • 27: Checking and validating the user ID (authentication); will be called fifth.
  • 28: Checking that the user ID is valid for the requested resource (authorization); sixth during request processing.
  • 29: Checking access by host address; this routine will be called fourth.
  • 30: MIME type checker and setter; will be called seventh.
  • 31: Fixups, called eighth during processing.
  • 32: Logger, 10th in processing.
  • 33: Header parsing, third in processing.
  • 34: Process initializer
  • 35: Process exit or cleanup
  • 36: Post read_request handling, first during request processing.
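To make the role of the handler list and the return codes more concrete, here is a minimal, self-contained sketch of how a server could walk such a list. It deliberately does not use the Apache headers: every name starting with sketch_ is invented for this illustration, and the two status values only mimic OK and DECLINED. The real dispatch logic inside Apache is more involved, so treat this as a model of the idea rather than a description of the server's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the status codes a handler returns. */
enum { SKETCH_OK = 0, SKETCH_DECLINED = -1 };

/* Hypothetical, stripped-down request: only the resolved handler name. */
typedef struct {
    const char *handler_name;
    int         times_handled;
} sketch_request;

typedef int (*sketch_handler_fn)(sketch_request *r);

/* Same shape as a handler list: name/function pairs ending in a NULL entry. */
typedef struct {
    const char        *name;
    sketch_handler_fn  fn;
} sketch_handler_entry;

static int declining_handler(sketch_request *r)
{
    (void)r;
    return SKETCH_DECLINED;        /* "not for me", let the next entry try */
}

static int hello_handler(sketch_request *r)
{
    r->times_handled++;            /* a real handler would produce the page */
    return SKETCH_OK;
}

static const sketch_handler_entry sketch_handlers[] = {
    {"dummyexample-handler", declining_handler},
    {"dummyexample-handler", hello_handler},
    {NULL, NULL}
};

/* Call each entry whose name matches until one of them does not decline. */
static int sketch_dispatch(const sketch_handler_entry *table, sketch_request *r)
{
    assert(table != NULL && r != NULL);
    for (; table->name != NULL; table++) {
        if (strcmp(table->name, r->handler_name) != 0)
            continue;
        int rc = table->fn(r);
        if (rc != SKETCH_DECLINED)
            return rc;
    }
    return SKETCH_DECLINED;        /* nobody wanted the request */
}
```

Calling sketch_dispatch(sketch_handlers, &r) with r.handler_name set to "dummyexample-handler" skips the declining entry and ends in hello_handler; an unknown handler name falls through the whole table and yields the declined status.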

request_rec and functions
Let me add a few words about the request_rec structure, which we used in the above code. The request_rec request record is the main part of the Apache API. It contains almost everything that programmers could ever want to know about current requests. However, since the full definition of request_rec is extremely long, here are just a few of the fields of this structure:
  • conn_rec *connection is a pointer to the record for the current connection, which holds information about the local and remote host addresses as well as the username used during authentication, etc. Detailed information on this record can be found in the Apache API documentation.
  • server_rec *server is a pointer to a server record structure, from which all info about the current server can be gathered.
  • table *headers_out contains outgoing HTTP headers.
  • table *err_headers_out contains outgoing HTTP headers, which will be used if an error occurs or if a subrequest is called.
  • const char *content_type contains MIME content type.
  • const char *content_encoding contains content encoding.
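  • request_rec *next points to the request we were redirected to, if an internal redirect occurred (NULL, if none).
  • request_rec *prev points to the request we were redirected from, if this request is the result of an internal redirect.
  • request_rec *main points back to the main (top-level) request, if this request is a subrequest.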
  • int header_only is true if the remote client made only a header request. This value should not be changed by the programmer.
  • char *protocol contains the name and version number of the protocol (e.g., HTTP/1.0).
  • const char *hostname contains the name of the host requested by the client, although it's better to use ap_get_server_name() API function.
  • char *status_line contains the status info returned from the Apache server to the browser (e.g., 200 OK).
  • int status is the numeric value of the transaction status code. Apache will set it; the programmer does not take care of it.
  • char *method contains the request method, such as POST or GET.
  • int method_number contains the integer value of the request method, e.g., M_GET; these constants are defined in the header file httpd.h.
  • char *args contains the query string for CGI GET requests (part of the URI string after the ? sign). Instead of creating your own query parsing code, I suggest you use the libapreq library created by The Apache Group, which provides routines for manipulating client request data. I am using a modified version of this library; however, it looks really great in its original form.
  • char *filename contains the translated filename of the requested document.
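For a feeling of what is inside the args field, the following sketch looks up a single key in a query string of the form key=value&key=value. It is a plain C illustration that works on any string, not a replacement for libapreq: the function name sketch_query_value is made up here, and the value is returned raw, without URL decoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up `key` in a query string such as "name=alice&lang=c".
 * Copies the raw (still URL-encoded) value into `out` and returns its
 * length, or -1 if the key is absent or the value does not fit.
 */
static int sketch_query_value(const char *args, const char *key,
                              char *out, size_t outlen)
{
    assert(key != NULL && out != NULL && outlen > 0);
    if (args == NULL)                  /* request had no query string */
        return -1;

    size_t keylen = strlen(key);
    const char *p = args;
    while (*p != '\0') {
        /* Each pair runs up to the next '&' or the end of the string. */
        size_t pairlen = strcspn(p, "&");
        if (pairlen > keylen && strncmp(p, key, keylen) == 0
            && p[keylen] == '=') {
            size_t vallen = pairlen - keylen - 1;
            if (vallen >= outlen)
                return -1;             /* caller's buffer is too small */
            memcpy(out, p + keylen + 1, vallen);
            out[vallen] = '\0';
            return (int)vallen;
        }
        p += pairlen;
        if (*p == '&')
            p++;                       /* step over the separator */
    }
    return -1;
}
```

Inside a real handler, the first argument would be r->args, which may be NULL when the URI has no ? part; that is why the sketch checks for NULL before scanning.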

Line 9 of the above code uses the ap_send_http_header() function. It takes a pointer to the request_rec structure; in our case, this is the variable named “r”. So we first fill in the relevant fields of “r” (here, content_type), and then we call ap_send_http_header(), which sends the HTTP response headers to the client.

Lines 10-15 use the ap_rprintf() function, which performs formatted output in the same way as the standard printf() function. The return value is the number of characters sent to the client.
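The printf-like behavior and the character count can be tried out without a running server. The sketch below is only a model: sketch_client and sketch_rprintf are hypothetical names, and the output is collected in a memory buffer instead of being sent over a connection. It shows the general technique (a variadic wrapper around vsnprintf), not how Apache implements ap_rprintf().

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the client: output is collected in memory. */
typedef struct {
    char   body[512];
    size_t used;
} sketch_client;

/*
 * printf-style output to the sketch client. Returns the number of
 * characters appended, or -1 if the formatted text does not fit.
 */
static int sketch_rprintf(sketch_client *c, const char *fmt, ...)
{
    assert(c != NULL && fmt != NULL);
    size_t room = sizeof c->body - c->used;

    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(c->body + c->used, room, fmt, ap);
    va_end(ap);

    if (n < 0 || (size_t)n >= room) {
        c->body[c->used] = '\0';       /* discard the partial write */
        return -1;
    }
    c->used += (size_t)n;
    return n;
}
```

Starting from a zeroed sketch_client, each call appends to the body and reports how many characters it added, which mirrors the way our handler builds the page line by line.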

The next step will be compiling this source code into a binary Apache module, which the server will be able to execute. The easiest way to do that is with apxs (the APache eXtenSion tool), which is designed for compiling and installing modules and adding them into Apache's configuration file. Nevertheless, using this tool does not show the details of the process, so we'll use a simple Makefile to compile and link our module.
5:INC=-I/usr/local/$(SERVERNAME)/include -DEAPI
10:all:         $(TITLE).so
12:             cp $(TITLE).so $(LIBEXEC)/$(TITLE).so
13:             chmod a+x $(LIBEXEC)/$(TITLE).so
15:             -$(APCTL) graceful
17:             -rm -f $(TITLE).o $(TITLE).so
18:$(TITLE).so:      $(TITLE).c
19:         $(CC) -funsigned-char -DUSE_EXPAT $(SRCINC) -fpic \
20:         -DSHARED_MODULE \
21:         $(INC) $(LIB) -c $(TITLE).c
22:         $(LD) -Bshareable -o $(TITLE).so $(TITLE).o $(LIBS)

Our Makefile will be rather universal. Depending on your OS, the only thing you will need to change will be lines 1-9. Our target system is FreeBSD. For Linux, Solaris, BSDI, AIX, and IRIX systems, the Makefile will have to be changed according to the Apache documentation. If you need to use this Makefile for other modules (within the same system), only line 1 should be changed (to reflect the new module name).

Installing the module
We are now ready to install our module, but first we have to install and configure the Apache server properly.

To make it able to support DSO modules, we'll run the following command:
./configure --enable-module=so

which will give us the following output:
Configuring for Apache, Version 1.3.12
 + using installation path layout: Apache (config.layout)
Creating Makefile
Creating Configuration.apaci in src
Creating Makefile in src
 + configured for FreeBSD 3.4 platform
 + setting C compiler to gcc
 + setting C pre-processor to gcc -E
 + checking for system header files
 + adding selected modules
 + checking sizeof various data types
 + doing sanity check on compiler and options
Creating Makefile in src/support
Creating Makefile in src/os/unix
Creating Makefile in src/ap
Creating Makefile in src/main
Creating Makefile in src/lib/expat-lite
Creating Makefile in src/modules/standard

Next we run the make and the make install commands:
make ; make install

Testing our module
To test our module, we will have to associate it with some URI. We can do it by adding the following lines to the Apache configuration file (httpd.conf):

LoadModule dummyexample_module    libexec/

<Location /test>
SetHandler dummyexample-handler
</Location>

Okay, everything is ready. Let's compile the module using the make install command from within the same directory (where the source code and the Makefile are located). And, when we fetch (via browser) http://localhost/test, if everything was done correctly, our module will execute.
When using the above Makefile with modules that interact with MySQL, you'll have to link libmysqlclient.a along with your module. You can do it by changing line 9 from:
LIBS=-lm
to:
LIBS=/usr/local/lib/mysql/libmysqlclient.a -lm
(using the correct path to libmysqlclient.a). And to line 8, add the full path where the MySQL header files are kept (for example, -I/usr/local/include/mysql).

The authors and editors have taken care in preparation of the content contained herein but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for any damages. Always have a verified backup before making any changes.
