Manually checking a Web site to see if all links resolve
correctly can be a time-consuming task, particularly if the site is
large or links heavily to external sites. External links are
especially fragile because you have no control over those sites,
where pages may be moved or deleted.
An automated tool can check a site and verify
that all links resolve properly. LinkChecker is a Python tool available from http://linkchecker.sourceforge.net/.
Download the LinkChecker package in either rpm or tar.gz format. If
you download the tar.gz package, you must build it yourself, but that’s very easy. The
only real requirement is that you have Python 2.4 installed.
$ tar xvzf linkchecker-2.9.tar.gz
$ cd linkchecker-2.9
$ python setup.py build
$ python setup.py install --home /home/joe
This installs LinkChecker into joe’s home directory; the LinkChecker
executable will be in the ~/bin/ directory and the required libraries in
~/lib/python/. You can also install it system-wide by running the
install command as root and omitting the --home option.
To run LinkChecker, use:
$ cd ~/bin
$ PYTHONPATH=/home/joe/lib/python ./linkchecker http://somesite.com
This command checks the links on every page of the
somesite.com website. You can also check links on local HTML pages, which is
useful for pre-production testing.
To make things easier, you can add LinkChecker’s directories to PATH and
PYTHONPATH in your ~/.bash_profile, or equivalent file, so you can execute
LinkChecker from any directory.
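The profile lines in question might look like the following. This is a minimal sketch, assuming the --home /home/joe install shown earlier (executable in ~/bin, libraries in ~/lib/python) and a Bourne-style shell; adjust the paths if you installed elsewhere.

```shell
# Assumed layout from the --home install: executable in ~/bin,
# libraries in ~/lib/python.

# Put ~/bin first on PATH so "linkchecker" is found from any directory.
export PATH="$HOME/bin:$PATH"

# Add the library directory to PYTHONPATH, keeping any value already set.
export PYTHONPATH="$HOME/lib/python${PYTHONPATH:+:$PYTHONPATH}"
```

After logging in again (or sourcing the file), the PYTHONPATH= prefix and the cd ~/bin step from the earlier command should no longer be needed.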
Running LinkChecker with the -h option displays the program’s help. A number of options
are available to fine-tune how LinkChecker behaves, set how many
levels it will recurse, and more.
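As an illustration of limiting recursion, the sketch below wraps LinkChecker in a small shell function. The function name check_site and its default depth of 2 are hypothetical, and it assumes linkchecker is on your PATH and that your version accepts -r as the recursion-level option; confirm this against the -h output before relying on it.

```shell
# Hypothetical helper (not part of LinkChecker): check a site to a limited depth.
#   $1 = URL to check
#   $2 = recursion depth (optional; defaults to 2, an arbitrary choice)
check_site() {
    # -r is assumed to be the recursion-level option; verify with "linkchecker -h".
    linkchecker -r "${2:-2}" "$1"
}
```

For example, check_site http://somesite.com 1 would check the start page and the pages it links to directly, rather than crawling the whole site.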
If you’re so inclined, you can even use the included CGI
script to set up a Web interface to LinkChecker that can be accessed via a Web
browser.