Broken links are one of the top reasons users submit comments and complaints about website functionality, so it makes sense to review links on a regular basis. A systematic link-checking policy helps ensure that external links keep working.

External links are harder to track than internal links because a third-party website may have changed its content management system (CMS) or updated its URL naming convention; in either case, if an automatic redirect for the old or changed URLs was never implemented, it's up to you to find and fix the resulting dead and broken links.

I previously wrote about using Dreamweaver to find and repair broken internal links in intranet or local files. Now I'll provide an overview of one tool that can find broken external links in web page documents, and share links and short descriptions for several more link-checking applications.

LinkChecker

LinkChecker is a free, GPL-licensed website validator maintained by Bastian Kleineidam; the project lives in the wummel/linkchecker GitHub repository. The latest version, LinkChecker 8.5, was released on December 24, 2013 and is available for download from the website as an exe, deb, or tar.xz file.

LinkChecker's features include recursive, multithreaded link checking and site crawling; command line, GUI client, and CGI web interfaces; cookie and HTML5 support; and HTML and CSS syntax checking. The exe file downloads as LinkChecker-8.5.exe, is just over 11 MB, and the straightforward installation takes under one minute to complete.
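
If you'd rather script checks than click through the GUI, the same tool can be driven through its command-line interface. Here is a minimal sketch that runs the linkchecker executable from Python; it assumes the installer put linkchecker on your PATH, and the flags come from the 8.5 manual, so confirm them for your installed version with linkchecker --help.

    # Run LinkChecker's CLI from a script and capture its text report.
    # Assumes the "linkchecker" executable is on PATH; flags are from the
    # 8.5 manual -- confirm with "linkchecker --help" for your version.
    import subprocess

    result = subprocess.run(
        ["linkchecker",
         "--recursion-level=2",       # how deep to crawl from the parent URL
         "--output=text",             # report format
         "http://www.example.com/"],  # placeholder: replace with your site
        capture_output=True, text=True)
    print(result.stdout)

LinkChecker also signals problems through its exit status (1 when invalid links were found), so result.returncode can gate an automated build or publishing step.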

Find the program in your list of installed applications and open it. The GUI is displayed in Figure A. (Note: All screenshots are from the application running on a Windows OS.)

Figure A


You can test a web page by entering a fully qualified URL (e.g., http://www.domainname.com) into the GUI client or web interface and then pressing the Start button at the top right. The link check recursively validates all pages starting with the parent URL; external links pointing outside the parent URL are checked, but third-party pages are not crawled recursively. For more information and details on options, configuration, output types, proxy support, and other topics, check out the online manual.
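
To make that crawl policy concrete, here is a minimal Python sketch of the same idea: pages on the parent site are fetched and parsed for further links, while external URLs are requested once and never crawled. This only illustrates the behavior described above, not LinkChecker's actual implementation, and the starting URL is a placeholder.

    # Illustrative sketch: recursively check links under a parent URL,
    # and check -- but do not crawl -- external links. A real tool would
    # bound the depth and use a work queue instead of recursion.
    from html.parser import HTMLParser
    from urllib.error import HTTPError, URLError
    from urllib.parse import urljoin, urlparse
    from urllib.request import urlopen

    class LinkExtractor(HTMLParser):
        """Collect href values from anchor tags."""
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def check(url):
        """Fetch a URL, report its status, and return the page body."""
        try:
            with urlopen(url, timeout=10) as resp:
                print("OK    ", url)
                return resp.read().decode("utf-8", errors="replace")
        except (HTTPError, URLError, ValueError) as err:
            print("BROKEN", url, "--", err)
            return ""

    def crawl(parent, url, seen):
        if url in seen:
            return
        seen.add(url)
        body = check(url)
        # Only pages on the parent site are parsed for more links;
        # external pages are checked once, then skipped.
        if urlparse(url).netloc != urlparse(parent).netloc:
            return
        extractor = LinkExtractor()
        extractor.feed(body)
        for href in extractor.links:
            target = urljoin(url, href)
            if urlparse(target).scheme in ("http", "https"):
                crawl(parent, target, seen)

    start = "http://www.example.com/"  # placeholder parent URL
    crawl(start, start, set())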

Figure B shows LinkChecker scanning a sample URL.

Figure B


The link check on the example page http://wummel.github.io/linkchecker/index.html resulted in 48 URLs found, 12 warnings, and 0 invalid URLs. The URL properties for the first, highlighted entry show a redirect warning: line 74 of the parent URL http://wummel.github.io/linkchecker/faq.html links to http://seleniumhq.org, which ultimately redirects to http://docs.seleniumhq.org/. While the redirect works in this case, it's probably a good idea for Bastian to update the link on the page rather than rely on Selenium's due diligence in keeping the redirect active.
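
You can confirm this kind of redirect yourself with a few lines of Python: urlopen follows redirects automatically, and geturl() reports the final address. The snippet below reuses the Selenium URL from the example; the result naturally depends on the live site.

    # Report where a link ultimately lands after redirects, so outdated
    # links can be updated to point at the final address directly.
    from urllib.request import urlopen

    link = "http://seleniumhq.org"
    with urlopen(link, timeout=10) as resp:
        final = resp.geturl()  # URL after any redirects were followed
        if final != link:
            print(link, "redirects to", final, "-- consider updating the link")
        else:
            print(link, "does not redirect")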

Additional link-checking applications

Xenu

Xenu is a free download by Tilman Hausherr; Xenu, Xenu's Link Sleuth, and Link Sleuth are trademarked for software products and services. The latest working download is version 1.3.8 from September 4, 2010. For more information, check out the official Description page.

W3C Link Checker

With the W3C's free link checker, you enter a URL into the form field and choose from a few options: a summary-only report, hiding redirects (for all redirects or for directories only), and checking linked documents recursively to an assigned depth. You can also save your link-checking options in a cookie.
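
Because the form submits those options as ordinary query parameters, the checker can also be called from a script. The sketch below is a rough, unofficial example: the parameter names are my assumptions read from the checklink form and may change, so confirm them against the live form before relying on this.

    # Query the W3C link checker over HTTP. The parameter names below
    # ("uri", "summary", "recursive", "depth") are assumptions based on
    # the checklink form fields -- verify against the live form.
    from urllib.parse import urlencode
    from urllib.request import urlopen

    params = urlencode({
        "uri": "http://www.example.com/",  # page to check (placeholder)
        "summary": "on",                   # summary-only report (assumed)
        "recursive": "on",                 # check linked documents (assumed)
        "depth": "2",                      # recursion depth (assumed)
    })
    with urlopen("https://validator.w3.org/checklink?" + params, timeout=60) as resp:
        print(resp.read().decode("utf-8", errors="replace"))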

What do you use to check external links?

What link-checking tool do you use for your websites? Let us know in the discussion.