Testing Web links with LinkChecker for Linux

Testing links on your Web site can be a time-consuming task if you don't have a tool to help shoulder the load. This Linux tip will show you how to get LinkChecker and how to use it.

Manually checking that every link on a Web site resolves correctly is tedious, particularly if the site is large or links to many external sites. External links are the hardest to keep current because you have no control over pages on other sites, which may be moved or deleted at any time.

LinkChecker automates this job by checking a site to determine that all of its links resolve properly. It is a Python tool available from http://linkchecker.sourceforge.net/ in either rpm or tar.gz format. If you download the tar.gz package, you must build it yourself, but that is very easy. The only real requirement is that you have Python 2.4 installed.

<code>
$ tar xvzf linkchecker-2.9.tar.gz
$ cd linkchecker-2.9
$ python setup.py build
$ python setup.py install --home=/home/joe
</code>

This installs LinkChecker into joe's home directory: the linkchecker executable goes in ~/bin/ and the required libraries in ~/lib/python/. To install it system-wide instead, run the install command as root and omit the --home argument.

To run LinkChecker, use:

<code>
$ cd ~/bin
$ PYTHONPATH=/home/joe/lib/python ./linkchecker http://somesite.com
</code>

This command checks the links on every page of the somesite.com Web site. You can also check links in local HTML pages, which is useful for pre-production testing.

To make things easier, you can add the following to your ~/.bash_profile, or equivalent file, so you can execute LinkChecker from any directory:

<code>
export PYTHONPATH=/home/joe/lib/python
export PATH=$PATH:/home/joe/bin
</code>
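If your profile is sourced more than once, plain export lines like these append the same directory to PATH each time. A minimal sketch of a guarded variant follows, assuming the /home/joe prefix used in this tip. The LC_HOME variable name is illustrative only and is not something LinkChecker itself reads.

```shell
# Hypothetical variable: the prefix that was passed to --home at install time.
LC_HOME=/home/joe

# Append the bin directory to PATH only if it is not already listed.
case ":$PATH:" in
    *":$LC_HOME/bin:"*) ;;
    *) PATH="$PATH:$LC_HOME/bin" ;;
esac
export PATH

# Do the same for PYTHONPATH, keeping any value that was already set.
case ":${PYTHONPATH:-}:" in
    *":$LC_HOME/lib/python:"*) ;;
    *) PYTHONPATH="$LC_HOME/lib/python${PYTHONPATH:+:$PYTHONPATH}" ;;
esac
export PYTHONPATH
```

Sourcing the file a second time leaves both variables unchanged, because each case statement finds the directory already present.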

Executing LinkChecker with the -h option displays the program's help. A number of options are available to fine-tune how LinkChecker behaves, such as how many levels it will recurse.
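As an illustration of limiting recursion, the sketch below wraps the command in a small shell function. It assumes that linkchecker is on your PATH and that your version accepts -r to set the recursion level; confirm the exact option name in the -h output before relying on it. The function name lc_shallow is hypothetical.

```shell
# Hypothetical helper: check a site to a limited depth.
# Usage: lc_shallow DEPTH URL
lc_shallow() {
    # Stop early with a clear message if linkchecker is not installed.
    if ! command -v linkchecker >/dev/null 2>&1; then
        echo "linkchecker not found on PATH" >&2
        return 127
    fi
    # -r is assumed to set the recursion level; verify with: linkchecker -h
    linkchecker -r "$1" "$2"
}

# Example (not run here): follow links at most two levels from the start page.
# lc_shallow 2 http://somesite.com
```

A shallow check like this may suit a quick pass over a large site, with a full run left for later.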

If you're so inclined, you can even use the included CGI script to set up a Web interface to LinkChecker that can be accessed via a Web browser.

About Vincent Danen

Vincent Danen works on the Red Hat Security Response Team and lives in Canada. He has been writing about and developing on Linux for over 10 years and is a veteran Mac user.
