Apache can do a lot more than just serve up Web pages, the typical role that has earned it a place as the leading site-server software package. As a proxy server, Apache offers the potential for both faster Web browsing for internal clients and significant cost savings. In this article, I’ll show you how to set up Apache as a proxy server in two modes: a caching proxy server and a caching proxy server with authentication.
How you can get in on this
To enjoy the benefits of running a proxy server, you don’t need to spend thousands of dollars on specialized hardware and software. Using the freely available open source Apache Web server, you need only a little time and an inkling of how to set up your proxy server. On an older P350, I was able to set up Apache to work as a proxy, from start to finish, in roughly a half hour.
Apache proxy servers offer these fundamental benefits:
- Because of its ability to act as a proxy server, Apache can increase the efficiency of a wide area network connection while lowering the cost of ownership by making it possible to postpone an upgrade to the connection.
- Using the proxy module’s authentication capabilities, an organization can ensure that network users are approved and not just someone with unauthorized physical access.
- Apache is free and has proven itself. According to the latest Netcraft survey results, it is the leader in serving Web pages, and it compares favorably with Microsoft’s IIS server in reliability, ease of setup, and security.
The basics of my setup
First, a word about the Apache proxy module’s capabilities, which are dependent on the version of Apache you are running. Version 1.2 of Apache introduced a stable, nonexperimental version of the module; HTTP/1.1 proxy support was added in Apache version 1.3.23. The proxy module also supports FTP, SSL CONNECT, and HTTP/0.9.
For this discussion, I’ll run Apache as a proxy server on a Red Hat Linux 7.2 machine with an IP address of 192.168.1.110. The client is a Windows XP workstation running Internet Explorer 6.
I plan to use Apache 1.3.24. I have already downloaded Apache, placed it into my home directory, and unpacked it into /home/slowe/apache_1.3.24. To set this up, I run the following commands from that directory to install into /usr/local/apache:
./configure --prefix=/usr/local/apache --enable-module=proxy
make && make install
The --enable-module=proxy option is required to build Apache’s proxy components. Once this install is complete, the Apache proxy server software is ready to be configured.
Configuring the proxy server
Starting Apache immediately after the install launches a normal Web server instance. That is expected, since I have not yet configured the proxy server beyond installing the proxy module. The proxy module is configured in the httpd.conf file, which is in /usr/local/apache/conf for my installation.
Preinstalled configuration file location
If you are working with a preinstalled Red Hat Linux distribution, httpd.conf is located in /etc/httpd/conf/. For older distributions, check /home/httpd/conf/.
For my first simple example, I’ll set up an Apache proxy server that provides proxy services only to my local network, 192.168.1.0, and caches content in /usr/local/apache/proxy. In my httpd.conf file, I’ll uncomment the configuration parameters related to the proxy services and add other directives to meet these specifications. Here’s the final result:
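A minimal sketch of such a configuration, assuming the stock Apache 1.3 mod_proxy directive names, might look like the following. The /24 netmask, the CacheGcInterval value of 4, and the CacheLastModifiedFactor value of 0.1 are assumptions for illustration. The "line N" notes in the comments only map each directive to the walkthrough that follows.

```apache
# Listen for proxy requests on port 4545 (walkthrough line 1)
Listen 4545

<IfModule mod_proxy.c>
    # Turn on forward-proxy handling
    ProxyRequests On

    # Limit proxy service to the local network (the /24 mask is an assumption)
    <Directory proxy:*>
        Order deny,allow
        Deny from all
        Allow from 192.168.1.0/24
    </Directory>

    # Add a Via header to proxied requests (line 2)
    ProxyVia On

    # Cache location and size limit in KB (line 3)
    CacheRoot "/usr/local/apache/proxy"
    CacheSize 500

    # Garbage-collection interval in hours (line 4; the value 4 is assumed)
    CacheGcInterval 4

    # Maximum cache lifetime in hours (line 5)
    CacheMaxExpire 24

    # Factor used to estimate expiry when none is supplied (line 6; 0.1 is assumed)
    CacheLastModifiedFactor 0.1
</IfModule>
```

The cache directory has to exist and be writable by the user Apache runs as, or nothing will be cached.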
What do the above directives do?
Line 1: Tells Apache to listen for proxy requests on port 4545.
Line 2: Turns on ProxyVia, which makes the proxy server add an HTTP/1.1 Via header recording that the request passed through this proxy.
Line 3: Tells Apache to cache content in /usr/local/apache/proxy and to limit the cache to 500 KB.
Line 4: Sets how often garbage collection runs to remove items from the cache. The interval is given in hours using the CacheGcInterval directive.
Line 5: Sets the maximum time, in hours, that documents can stay cached, via the CacheMaxExpire directive. The example uses 24.
Line 6: Supplies an estimated expiration time via CacheLastModifiedFactor when the originating server does not send one. The estimate is based on how long ago the document was last modified.
Testing the proxy server
I want to keep DNS issues out of the equation for my examples, so I’ll browse to sites using their IP addresses. When I browse from my proxy-enabled workstation to 22.214.171.124 (apple.com), the page comes up just as it should. But how can I be sure that my client machine actually used the proxy server rather than simply getting the content directly from the Web site?
Configuring for a proxy server
If you need information on configuring a Web browser for use with a proxy server, see Dr. Thomas Shinder’s article “Configuring Internet Explorer on proxy networks.”
Before I actually browsed to Apple, I looked in the /usr/local/apache/proxy directory on my Apache proxy server and saw that it was empty. After I browsed to that location, the directory included a few new files and, upon inspection, one of them had this content.
As you can see from the X-URL statement in the first line, this is a page from Apple’s Web site that did not exist on my proxy server a few minutes ago.
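To repeat this check without opening cache files by hand, a small shell helper can pull the X-URL line out of every file under the cache root. This is only a sketch: it assumes each cached file carries an X-URL: header line near the top, as described above, and list_cached_urls is a hypothetical helper name, not something that ships with Apache.

```shell
# List the URL recorded in each cached file under the proxy cache root.
# The default path assumes the cache location used in this article.
list_cached_urls() {
    cache_root="${1:-/usr/local/apache/proxy}"

    # Print the first X-URL line from each file, then stop reading that file
    find "$cache_root" -type f -exec sed -n '/^X-URL:/{p;q;}' {} \;
}
```

Calling list_cached_urls before and after browsing through the proxy shows which pages were newly cached.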
What’s that you’re running—Apple?
Notice the Server line in the file from my proxy directory. Apple’s site is running Apache 1.3.9 as its Web server platform.
In addition to providing caching and proxy services, Apache can authenticate users at the proxy server level, a capability that lets an organization control who is allowed to access resources and when.
Enforcing this service depends heavily on the placement of the proxy server on a network and on how its routing is configured. Consider this scenario: The administrator of a small network wants to implement a proxy server. She configures an authenticating proxy server on the network like any other workstation and then goes to each user’s workstation and adds the appropriate proxy settings to each one. (Configuring the client’s proxy settings is different for every browser.)
Everything works just as the admin expects, until she notices that certain users are no longer using the proxy server. They have removed the proxy settings from their machines.
To rectify this situation, the administrator could put the proxy server between the network and the outside world so that outbound network traffic has to pass through it, regardless of local browser settings. Another possible solution would be to configure network equipment to accept only certain types of traffic from certain hosts. For example, the gateway server (the only server pointing to the outside world) might accept HTTP traffic only from the proxy server and reject it from everything else.
Setting up authentication is just a matter of creating users and adding a few lines to the httpd.conf configuration file. For a small number of users, the simplest way to go about this is to manually create the users at the command line using the htpasswd utility, which comes bundled with Apache.
For my example, I have added my user ID and password to a text file using htpasswd with this command:
htpasswd -c /usr/local/apache/conf/htaccess.conf slowe
The -c parameter tells htpasswd that this is a new file it needs to create. The name of the file is /usr/local/apache/conf/htaccess.conf, and slowe is the name of the user.
Once I execute this command and enter a password (I’m prompted for it after I enter the command), the user slowe is added. Viewing the contents of the htaccess.conf file yields:
slowe:R5UsPKcGF6EqA
My password, depicted above as R5UsPKcGF6EqA, is stored in encrypted form. I repeat this process (minus the -c parameter) for each additional user I need to create.
Configuring for authentication
Once I have users set up, it’s time to modify httpd.conf to set up the authentication scheme. Consider this configuration snippet from my httpd.conf file.
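A sketch of what such a block can look like under Apache 1.3 follows. The realm string "Apache Proxy" and the /24 netmask are placeholders chosen for illustration. The password file path is the one created with htpasswd earlier.

```apache
<Directory proxy:*>
    # Still limit proxy use to the local network (the /24 mask is an assumption)
    Order deny,allow
    Deny from all
    Allow from 192.168.1.0/24

    # Require a login for every proxied request
    AuthType Basic
    # Placeholder realm name shown in the browser's login prompt
    AuthName "Apache Proxy"
    AuthUserFile /usr/local/apache/conf/htaccess.conf
    Require valid-user
</Directory>
```

With both an Allow rule and a Require line present, Apache's default behavior is to demand that a request satisfy both.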
The AuthName parameter defines the realm to be served by this authentication scheme. Setting AuthType to Basic provides a clear text mechanism to pass the password between the client and the proxy server. Finally, the require valid-user directive demands that the authenticating user exist in the AuthUserFile, which I also have defined.
The AuthUserFile entry
The AuthUserFile entry sets the name of the file that I created earlier using the htpasswd command.
Once I set these parameters, Apache needs to be stopped and restarted to reload httpd.conf. One way to stop and start Apache is to act as root and issue these commands:
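One possible form, sketched on the assumption that apachectl sits in /usr/local/apache/bin (where the --prefix used earlier would normally put it), wraps the stop and start in a small function. The name restart_apache is hypothetical, and the configtest step is an addition of mine so that a typo in httpd.conf is caught before the running server is stopped.

```shell
# Restart Apache only if the edited httpd.conf parses cleanly.
# The default path assumes the --prefix=/usr/local/apache install used here.
restart_apache() {
    apachectl="${1:-/usr/local/apache/bin/apachectl}"

    # configtest checks the configuration syntax without touching the running server
    "$apachectl" configtest || return 1

    "$apachectl" stop
    # Give the parent process a moment to exit before starting again
    sleep 2
    "$apachectl" start
}
```

Run as root, a plain restart_apache call uses the default path. Pass a different apachectl path as the first argument if Apache lives elsewhere.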
When I start Internet Explorer on my test Windows XP workstation, I’m greeted with a Windows dialog asking for my username and password. For the browser to authenticate properly, the username and password must match an entry in the /usr/local/apache/conf/htaccess.conf file (created by the htpasswd command) on my Apache proxy server.
I get a username and password prompt immediately upon starting Internet Explorer because my start page is http://www.msn.com, an external address that requires me to route through the proxy server. Had the start page been set to about:blank, the server would not have requested a password until I tried to browse somewhere outside the network.
If I do not have an account on the proxy server, I get an error noting that I am not authorized to access the document I’ve requested.
In getting my proxy server up and running, I ran into a couple of problems that affected its performance. My first hang-up involved a major performance issue when pages were requested via the proxy server from my Windows XP client. In tracking down the problem, I found that I had configured my Apache proxy service to listen on port 8080, which conflicted with another service. When I changed it to 4545, the proxy server performed extremely well.
My second problem was associated with the user authentication component. My /usr/local/apache/conf/htaccess.conf file did not have any entries when I first started and so could not authenticate anyone. Once I added an entry, all was well.
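A quick check for this second problem can be scripted. The sketch below assumes the one-entry-per-line user:hash layout that htpasswd writes, and proxy_user_exists is a hypothetical helper name rather than an Apache tool.

```shell
# Succeeds if the password file is non-empty and has an entry for the given user.
# The default path is the password file used in this article.
proxy_user_exists() {
    user="$1"
    passwd_file="${2:-/usr/local/apache/conf/htaccess.conf}"

    # -s: the file exists and has a size greater than zero
    [ -s "$passwd_file" ] || return 1

    # Take the name before the first colon on each line and look for an exact match
    cut -d: -f1 "$passwd_file" | grep -qxF -- "$user"
}
```

For example, proxy_user_exists slowe && echo "slowe has an entry" confirms the file is populated before Apache is restarted.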
More than meets the eye
Apache has proven itself as a superstar of the open-source movement, and its benefits aren’t limited to just serving up Web pages to external clients. Apache’s low cost of ownership and ability to run on older equipment put proxy server benefits within the reach of almost any IT organization.