
How to set up web filtering solution on Squid Proxy

Guest contributor Rafael Akchurin of QuintoLabs presents this how-to article on setting up web content filtering on Squid Proxy.

This HOWTO describes how to protect the users of your home or small enterprise network from objectionable Internet content with the help of an HTTP proxy. Our goal is to set up a free, Linux-based server running Squid and deploy a web filtering application on it in order to save bandwidth, speed up web access, and block offensive and potentially illegal or malicious web files.

In this tutorial, I will assume that the network environment consists of a SOHO-level router that provides Wi-Fi, several desktop and laptop computers, iPads, and some smartphones, as shown in the following network diagram.

Set up CentOS Linux on proxy server

Our proxy server will be built using the free CentOS Linux 6.2 distribution. It is also possible to use Red Hat Enterprise Linux 6.2 with a paid subscription if you need a guaranteed level of support for your servers.

In order to install CentOS Linux, go to http://mirror.centos.org/centos/6/isos/i386/ and download the CentOS-6.2-i386-minimal.iso image file. Burn it to a spare CD, insert it into your server's CD drive and power the server on.

Follow the installation steps accepting the defaults or customizing the required parts of the install according to your needs. Configure machine hostname as "proxy" and root password as "P@ssw0rd" (without quotation marks). Wait until the installation is complete and then reboot the system.

The installed version of CentOS usually does not have network connectivity enabled by default. In order to enable network access, we need to perform the following.

  1. Assign a static IP address of 192.168.1.2 with network mask 255.255.255.0 to our proxy server by modifying the startup script /etc/sysconfig/network-scripts/ifcfg-eth0. Open it and add these lines (replacing any existing entries with the same names):
     BOOTPROTO=static
     NETMASK=255.255.255.0
     IPADDR=192.168.1.2
     ONBOOT=yes
  2. Set the default gateway in the /etc/sysconfig/network configuration file by adding this line:
     GATEWAY=192.168.1.1
  3. Adjust the DNS resolver settings in /etc/resolv.conf by adding the IP address of the DNS server that runs on the router:
     nameserver 192.168.1.1
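
For those who prefer to script the three edits above, here is a minimal sketch. It is not part of the original procedure: the helper name set_kv and the ROOT variable are made up for this example, and by default everything is written under a scratch directory so that nothing on the live system changes.

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in FILE, replacing an existing KEY= line
# or appending a new one, so the script can be re-run safely.
set_kv() {
    file=$1; key=$2; value=$3
    touch "$file"
    if grep -q "^${key}=" "$file"; then
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# ROOT is an assumption of this sketch: a scratch directory by default.
# Run with ROOT=/ (as root) to touch the real files.
ROOT=${ROOT:-$(mktemp -d)}
mkdir -p "$ROOT/etc/sysconfig/network-scripts"

ifcfg="$ROOT/etc/sysconfig/network-scripts/ifcfg-eth0"
set_kv "$ifcfg" BOOTPROTO static
set_kv "$ifcfg" NETMASK 255.255.255.0
set_kv "$ifcfg" IPADDR 192.168.1.2
set_kv "$ifcfg" ONBOOT yes

set_kv "$ROOT/etc/sysconfig/network" GATEWAY 192.168.1.1

# resolv.conf uses "nameserver ADDRESS" lines rather than KEY=VALUE.
resolv="$ROOT/etc/resolv.conf"
touch "$resolv"
grep -q '^nameserver 192\.168\.1\.1$' "$resolv" || echo 'nameserver 192.168.1.1' >> "$resolv"

echo "wrote network settings under $ROOT"
```

Inspect the generated files in the scratch directory first; only when they look right does it make sense to repeat the run with ROOT=/.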

Restart the network subsystem by typing /etc/init.d/network restart in the root terminal, or simply reboot the server. After the restart, confirm that the network functions correctly by typing the following in the terminal (neither command should report an error):

$ping -c 3 192.168.1.1
$nslookup google.com

Before installing anything else, it is recommended to update the freshly installed system with the latest security patches that may have come out since the ISO was released. Type yum update in the root terminal and reboot the server after the update completes.

Set up Squid on proxy server

We will use Squid as the caching and filtering proxy that runs on our Proxy Server. To install the version of Squid that comes with the CentOS 6.2 distribution, type yum install squid in the root terminal. Squid and all related packages and dependencies are downloaded from the Internet and installed automatically.

Make Squid proxy service start on system boot automatically by typing chkconfig squid on. Reboot your server or just start Squid for the first time manually with service squid start.

The only thing left to do is to let users on our home network access Squid. Open the configuration file /etc/squid/squid.conf and add the line visible_hostname proxy. Also check that the lines acl localnet src 192.168.0.0/16 and http_access allow localnet are present in the configuration file.
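
The check described above is easy to automate. The following sketch is only an illustration: check_squid_conf is a made-up helper name, and the demo runs against a scratch file rather than the real /etc/squid/squid.conf.

```shell
#!/bin/sh
# Hypothetical helper: report which of the required directives are missing
# from a Squid configuration file. Returns non-zero if any is absent.
check_squid_conf() {
    conf=$1
    missing=0
    for directive in \
        'visible_hostname proxy' \
        'acl localnet src 192.168.0.0/16' \
        'http_access allow localnet'
    do
        # Comment lines are dropped first so that a commented-out directive
        # does not count as present; -F matches the directive literally.
        if grep -v '^[[:space:]]*#' "$conf" | grep -qF "$directive"; then
            echo "ok:      $directive"
        else
            echo "missing: $directive"
            missing=1
        fi
    done
    return $missing
}

# Demo against a scratch file; on the proxy itself you would pass
# /etc/squid/squid.conf instead.
sample=$(mktemp)
cat > "$sample" <<'EOF'
acl localnet src 192.168.0.0/16   # RFC1918 possible internal network
#visible_hostname proxy
http_access allow localnet
EOF
check_squid_conf "$sample" || echo "squid.conf needs attention"
```

This only confirms that the lines exist. Squid's own squid -k parse command can additionally be used to check the whole file for syntax errors before restarting.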

Restart Squid by typing service squid restart. Verify that Squid runs correctly by configuring a browser on one of the network computers to use the Proxy Server (192.168.1.2, port 3128) as its HTTP proxy and surfing to some of your favorite websites.

NOTE: You may need to adjust firewall settings in CentOS in order to let proxy users connect to port 3128 on the Proxy Server. Use system-config-firewall-tui or iptables commands to do that. A good idea would be to allow access also to port 80, as we will use this port for managing QuintoLabs Content Security through Web UI as described later.
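
As a rough illustration of the iptables route, the sketch below opens both ports for the local network only. LAN, PORTS and DRY_RUN are names chosen for this example, the 192.168.1.0/24 range is assumed from the addresses used earlier, and by default the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch only: LAN, PORTS and DRY_RUN are names chosen for this example.
LAN=192.168.1.0/24      # assumed home network, matching the addresses above
PORTS="3128 80"         # Squid proxy port and the Web UI port
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = run them (needs root)

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

open_ports() {
    for port in $PORTS; do
        # Insert at the top of INPUT so the rule is evaluated before the
        # REJECT rule that the default CentOS ruleset keeps at the end.
        run iptables -I INPUT -p tcp -s "$LAN" --dport "$port" -j ACCEPT
    done
    # Persist the rules across reboots (CentOS 6 iptables init script).
    run service iptables save
}

open_ports
```

Review the printed commands, then run the script again as root with DRY_RUN=0 to apply them.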

Set up QuintoLabs Content Security

The next step is to install Content Security for Squid from QuintoLabs (I will refer to it as qlproxy in the rest of the text). For those who do not know, QuintoLabs Content Security is an ICAP daemon/URL rewriter that integrates with an existing Squid proxy server and provides rich content filtering functionality to sanitize web traffic passing into internal home / enterprise networks. It may be used to block illegal or potentially malicious file downloads, remove annoying advertisements, prevent access to various categories of websites and block resources with explicit or adult content.

NOTE: There are other tools besides qlproxy that have almost the same functionality. Some of the well-known ones are SquidGuard (SG) and DansGuardian (DG). While these tools are fine in theory, you need to install both of them to get the same functionality as qlproxy. SG runs as a URL rewriter, and DG is a separate proxy in its own right. DG also does not support SMP processing and relies on a resource-inefficient process-per-connection server model, which greatly increases the requirements for, for example, the URL block database. It is also a problem to tie SG and DG together, as they have different configuration directives and are largely independent of each other, forcing the admin to look in two different places to adjust a single filtering policy.

We will use version 2.0 of qlproxy, released this month. The most prominent feature of that release is policy-based web filtering, in which users of the proxy are organized into several groups with different levels of strictness.

By default, qlproxy comes with three policies pre-installed. The Strict policy has its web filter settings at the maximum level and is intended to protect minors and K-12 students from inappropriate content on the Internet. The Relaxed policy blocks only excessive advertisements and is appropriate for network administrators, teachers and all those who do not need filtered access to the web but would like to avoid most ads. The last policy, Default, contains less restrictive web filtering settings suitable for normal web browsing but without explicitly adult content shown.

The good thing about this is that you are free to design the policies yourself if you find the predefined policies not suitable for your network environment.

In order to install Content Security 2.0, we have to get the CentOS / RedHat RPM package manually from the QuintoLabs website and upload the package to the Proxy Server using scp. Another way is to type the following command directly in the root terminal of the Proxy Server (as one line):

#curl http://quintolabs.com/qlproxy/binaries/2.0.0/qlproxy-2.0.0-bb01d.i386.rpm>qlproxy-2.0.0-bb01d.i386.rpm

After download completes (approx. 21MB) run the following command to install the downloaded package and all its dependencies (note that the package comes in i386 flavor but yum takes care of correct installation on x86_64 architectures):

#yum localinstall qlproxy-2.0.0-bb01d.i386.rpm

The yum installation manager will run for a while and the program will be installed into /opt/quintolabs/qlproxy (binaries), /var/opt/quintolabs/qlproxy (various logs and content filtering databases), and /etc/opt/quintolabs/qlproxy (configuration).

NOTE: This HOWTO assumes you have SELinux disabled on your machine. For specific notes concerning installation of qlproxy on SELinux-enabled systems, see the website and the sample SELinux policy installed in /opt/quintolabs/qlproxy/usr/share/selinux. In order to disable SELinux, set SELINUX=disabled in /etc/selinux/config and reboot.

Integrate Squid and Content Security

QuintoLabs Content Security may be integrated with Squid in two different ways - as ICAP server and as URL rewriter. It is recommended to use ICAP integration as it gives access to all HTTP traffic passing through Squid and allows qlproxy to perform full request and response filtering (ICAP is supported in Squid version 3 and up).

The README file in /etc/opt/quintolabs/qlproxy folder contains detailed instructions on how to perform integration with Squid on different platforms (Debian, Ubuntu, RedHat and even Windows). To integrate it with Squid running on CentOS, we need to add the following lines to /etc/squid/squid.conf configuration file:

icap_enable on
icap_preview_enable on
icap_preview_size 4096
icap_persistent_connections on
icap_send_client_ip on
icap_send_client_username on
icap_service qlproxy1 reqmod_precache bypass=0 icap://127.0.0.1:1344/reqmod
icap_service qlproxy2 respmod_precache bypass=0 icap://127.0.0.1:1344/respmod
adaptation_access qlproxy1 allow all
adaptation_access qlproxy2 allow all
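
The configuration above pairs every icap_service with an adaptation_access line of the same name. The following sketch checks that this pairing has survived editing. It is an illustration only: check_icap_conf is a made-up helper name, and the demo uses a deliberately incomplete scratch file instead of the real /etc/squid/squid.conf.

```shell
#!/bin/sh
# Hypothetical helper: for each icap_service name found in the file, look for
# an adaptation_access line that refers to the same name.
check_icap_conf() {
    conf=$1
    status=0
    for svc in $(awk '$1 == "icap_service" { print $2 }' "$conf"); do
        if awk -v s="$svc" \
            '$1 == "adaptation_access" && $2 == s { found = 1 } END { exit !found }' \
            "$conf"
        then
            echo "ok:      $svc"
        else
            echo "missing: adaptation_access for $svc"
            status=1
        fi
    done
    return $status
}

# Demo: the respmod service has no adaptation_access line here.
icap_sample=$(mktemp)
cat > "$icap_sample" <<'EOF'
icap_enable on
icap_service qlproxy1 reqmod_precache bypass=0 icap://127.0.0.1:1344/reqmod
icap_service qlproxy2 respmod_precache bypass=0 icap://127.0.0.1:1344/respmod
adaptation_access qlproxy1 allow all
EOF
check_icap_conf "$icap_sample" || echo "ICAP section is incomplete"
```

On the proxy itself, the same function can be pointed at /etc/squid/squid.conf after the lines have been added.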

Restart Squid by typing service squid restart, then try surfing to your favorite websites to see how many ads are blocked. Another useful test is to go to the eicar.com website and try to download the artificial eicar.com sample virus to see that *.com files are blocked by the download filter.

Default installation of Content Security is quite usable out of the box, but in order to adjust it for our network requirements described earlier, we will have to perform some configuration changes as described below (all paths are relative to /etc/opt/quintolabs/qlproxy/policies):

  1. Put all normal users into Strict filtering policy by adding their IP addresses (or user names if your Squid performs authentication) to the strict/members.conf file.
  2. Put all power users into Relaxed filtering policy by adding their IP addresses or user names to the relaxed/members.conf file.
  3. Enable extended AdBlock subscriptions for blocking English, German and Russian ads in blocks_ads.conf configuration file for both policies. Also block common web tracking engines by uncommenting EasyPrivacy subscription in the same files.
  4. Increase the level of adult blocking heuristics to "high" in the strict/block_adult_sites.conf file. Although it may result in excessive false blocking there is always the possibility to add incorrectly blocked sites to an exception list.
  5. The UrlBlock module, which uses a community-developed database of categorized domains, incorrectly puts blogspot.com into an adult category, so we will add it to the exception list of the Relaxed policy in relaxed/exceptions.conf to be able to read the blogs.
  6. Knowing that worms, trojans and other malware related software often connect to the world by numeric IP addresses instead of normal hostnames, we will put a magic regexp url = http://\d+\.\d+\.\d+\.\d+/.* into strict/block_sites_by_name.conf file to block access to web sites by IP.
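
To get a feeling for what the regular expression from step 6 does and does not match, here is a small sketch using grep. It illustrates the pattern only and does not reproduce how qlproxy itself evaluates the rule: \d is rewritten as [0-9] because POSIX grep does not understand it, the leading ^ anchor is an assumption of this example, and is_ip_url is a made-up helper name.

```shell
#!/bin/sh
# The pattern from the text, with \d spelled as [0-9] and an assumed ^ anchor.
IP_URL_RE='^http://[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/.*'

# Hypothetical helper: succeeds when the URL matches the pattern.
is_ip_url() {
    printf '%s\n' "$1" | grep -Eq "$IP_URL_RE"
}

for url in \
    'http://93.184.216.34/index.html' \
    'http://192.168.1.2/' \
    'http://www.example.com/' \
    'http://93.184.216.34' \
    'https://93.184.216.34/'
do
    if is_ip_url "$url"; then
        echo "blocked: $url"
    else
        echo "allowed: $url"
    fi
done
```

Two details stand out: a URL needs a slash after the address to match, and https:// URLs are not covered by the pattern as written. Addresses on the local network, such as that of the Proxy Server itself, match as well.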

Now restart the qlproxyd daemon so that it reloads the configuration by typing /etc/init.d/qlproxy restart.

Set up Web UI of Content Security with Apache

QuintoLabs Content Security contains a minimal Web UI that lets you see the current program configuration and view usage reports and program logs from a remote host using your favorite browser. The Web UI is written using the Django Python framework and integrates with Apache using mod_wsgi deployed in a virtualized Python environment (to minimize package dependencies).

To install Apache, type yum install httpd in the root terminal. Make the Apache service start automatically on system boot by typing chkconfig httpd on. Reboot your machine or just start Apache for the first time manually by typing service httpd start. Then install additional Apache and Python modules by typing in the root terminal:

#yum install mod_wsgi python-setuptools
#easy_install virtualenv
#cd /var/opt/quintolabs/qlproxy/www
#virtualenv --no-site-packages qlproxy_django
#./qlproxy_django/bin/easy_install django

Integrate the Web UI with Apache by adding the following lines to the configuration file /etc/httpd/conf/httpd.conf:

<VirtualHost *:80>
ServerName proxy.lan
ServerAdmin webmaster@proxy.lan
LogLevel info
ErrorLog /var/log/httpd/proxy.lan-error.log
CustomLog /var/log/httpd/proxy.lan-access.log combined
# aliases to static files (must come before the mod_wsgi settings)
Alias /static/ /var/opt/quintolabs/qlproxy/www/qlproxy/static/
Alias /redirect/ /var/opt/quintolabs/qlproxy/www/qlproxy/redirect/
# mod_wsgi settings
WSGIDaemonProcess proxy.lan display-name=%{GROUP}
WSGIProcessGroup proxy.lan
WSGIScriptAlias / /var/opt/quintolabs/qlproxy/www/qlproxy/qlproxy.wsgi
<Directory /var/opt/quintolabs/qlproxy/www/qlproxy>
Order deny,allow
Allow from all
</Directory>
</VirtualHost>

Add the following line to /etc/httpd/conf.d/wsgi.conf to let mod_wsgi run in daemon mode:

WSGISocketPrefix /var/run/wsgi

NOTE: If you get an "Access denied" error page when trying to access http://localhost, check whether SELinux permissions are preventing the httpd process from accessing the /var/opt/quintolabs/qlproxy/www/qlproxy/ directory.

After restarting Apache, navigate to http://192.168.1.2/qlproxy to see the program configuration, logs and generated reports.

Final step

The only thing left is to point network users to the Proxy Server. There are several ways to do that automatically (think WPAD), but for testing purposes manual proxy configuration should be more than enough. So point the browser to the proxy at 192.168.1.2, port 3128, surf to some favorite websites and see the difference: IP addresses in URLs are blocked and explicitly adult sites are forbidden. RAM and CPU usage on the server is minimal, and the surfing experience is acceptable. The system automatically updates itself once a day with the latest URL block list and AdBlock subscriptions and requires minimal additional maintenance.


Author Rafael Akchurin is a co-founder, support engineer, and evangelist for QuintoLabs.
