Targeting the cause of a VPN problem requires a systematic troubleshooting process. Brien Posey explains steps you can follow to zero in on the culprit.
VPNs can involve several systems working together to provide functionality, which makes pinpointing problems a little tricky. The best approach to troubleshooting VPN problems is to use the process of elimination. In this article, I will show you 10 things to look for when you're trying to determine the cause of VPN errors. This isn't intended to be a comprehensive guide to VPN troubleshooting, but it should help you get started with the process.
1: Find out who is affected
The first step in troubleshooting any VPN problem is to determine who is affected by it. That information can go a long way toward helping you figure out where to start looking for the problem. For example, if everyone in the company is having problems, you might look for a hardware failure on your VPN server, an incorrect firewall rule, or perhaps a configuration problem on your VPN server.
On the other hand, if the only person who is having a problem is that guy from Marketing who can never seem to remember his password or the woman from Accounting who insists on connecting from her home computer, that too can tell you a lot about what may be going on.
2: Check to see whether users can establish VPN connectivity
When you begin the actual troubleshooting process, I recommend you start by determining whether the affected users can establish VPN connectivity. Remember, not all VPN problems involve connection failures. Sometimes, users can connect, but they can't access network resources. Determining whether the user can establish VPN connectivity will help you narrow down the areas in which you should be looking for problems.
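As a rough illustration of this first check, here is a minimal Python sketch that tests whether a TCP connection to the VPN server can be opened at all. It only applies to VPN types that listen on a TCP port (for example, SSTP on 443 or the PPTP control channel on 1723). UDP-based VPNs cannot be tested this way. The function name and the default timeout are my own assumptions, not part of any VPN product.

```python
import socket


def tcp_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling tcp_port_reachable("vpn.example.com", 443) from the affected user's machine (the host name here is a placeholder) tells you whether the problem occurs before authentication even starts. A True result only shows that the port answers, not that the VPN service behind it is healthy.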
3: Look for policies preventing connectivity
If you find that certain users are having trouble establishing connectivity, have them try to log in from a known good machine. If that doesn't work, there may be a policy in place preventing them from logging in. For example, if you are operating in a Windows Server environment, you should check the Active Directory Users And Computers console to verify that the user has been given permission to log in remotely. Likewise, some VPNs are designed so that users are allowed to log in only during certain times of the day.
4: Don't rule out the client
If only a single user is affected by the problem and has no trouble logging in from another computer, the problem is most likely related to the computer that he or she was trying to connect from.
Several years ago, one of my users was having trouble connecting to a VPN from a home computer. When I tried talking him through the problem, he kept telling me that what he was seeing didn't match what I was asking him to do. It turned out that the user had installed a freeware VPN client because a friend had told him it was much better than what he'd been using. On another occasion, I had someone who was unable to establish VPN connectivity because a virus had destroyed the computer's TCP/IP stack.
If users are attempting to connect from their own computer, you can't assume anything about the system they're using.
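The reasoning in the first four checks can be written down as a small decision helper. This is only a sketch of the process of elimination described above. The function name, its two inputs, and the wording of the results are hypothetical, and real cases will not always sort this cleanly.

```python
def first_suspect(many_users_affected: bool,
                  works_from_known_good_machine: bool) -> str:
    """Suggest where to look first, based on who is affected and where it fails."""
    if many_users_affected:
        # Everyone failing at once points at shared infrastructure.
        return "server side: VPN server hardware, firewall rules, or server configuration"
    if works_from_known_good_machine:
        # Same user, different machine, no problem: the usual computer is suspect.
        return "client side: the computer the user normally connects from"
    # The user fails even on a known good machine: look at the account.
    return "account side: remote access permission, logon-hour limits, or other policy"
```

For example, first_suspect(False, True) points you at the user's own computer, which matches the freeware VPN client story above.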
5: Can users log in locally?
This probably sounds silly, but when users tell me that they are having trouble logging into the VPN, one of the first things I do is verify that they can log in locally.
I once had a user complain of VPN problems. I spent a lot of time trying to troubleshoot the issue. When nothing I tried seemed to make any difference, I decided to double-check the user's account to see whether there were any restrictions on it. When I did, I noticed that the account was locked out. I unlocked the account and tried again, but it wasn't long before the account was locked again.
I reset the user's password and was able to log in without any problems. When I told the user about it, he told me that he'd never been able to log in with that account. When I asked how he got his work done each day, he told me that he always logged in as one of his co-workers. (You can't make this stuff up.) Ever since that incident, I always like to verify that the user's account is working properly.
6: Are affected users behind NAT firewalls?
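Another thing I like to check is whether affected users are connecting from computers that are behind a NAT firewall. Normally, NAT firewalls aren't a problem. However, some older firewalls don't work properly with VPN connections, typically because they can't pass the GRE traffic that PPTP relies on or because they don't support IPsec NAT traversal (NAT-T).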
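One quick way to tell whether a client is behind NAT is to look at the IP address assigned to its network adapter. If that address falls in a private range, some device between the client and the Internet is translating it. The sketch below uses Python's standard ipaddress module. The function name is my own, and the list of ranges (RFC 1918 plus the carrier-grade NAT range from RFC 6598) is an assumption about what you want to count as "behind NAT."

```python
import ipaddress

# RFC 1918 private ranges plus the RFC 6598 carrier-grade NAT range.
NAT_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "100.64.0.0/10")
]


def probably_behind_nat(client_local_ip: str) -> bool:
    """Return True if the client's own adapter address is in a NAT-style range."""
    addr = ipaddress.ip_address(client_local_ip)
    return any(addr in net for net in NAT_RANGES)
```

A home user whose laptop reports 192.168.1.20 is behind NAT, while a machine holding a public address directly is not. This says nothing about whether the NAT device handles VPN traffic correctly. It only tells you whether the question is worth asking.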
7: Check for Network Access Protection issues
Microsoft created the Network Access Protection feature as a way for administrators to protect network resources against remote users whose computers are not configured in a secure manner. Although Network Access Protection (NAP) works well, it has been known to cause problems for end users.
One problem I have seen a few times is that Network Access Protection is based on group policy settings. Therefore, if a user attempts to connect from a computer that is not a domain member, NAP will not work properly. Depending on how the VPN is configured, the health of the user's computer will either be ignored or the user will be denied access to the network.
It is also common to configure NAP so that if a user's computer fails the various health checks, a VPN connection is established to an isolated network segment containing only the resources necessary to address the health problem (sometimes through automatic remediation). When this happens, some users may not understand what is going on and may assume that there is a problem with the VPN.
8: Try accessing various network resources
If users can log in to the VPN, but they can't do anything once they're connected, the next step is to systematically attempt to connect to various resources on the network. This is important because you may find that some network segments are accessible while others are not.
For example, when a user connects to a VPN server, the computer is typically assigned an IP address by a DHCP server. However, I once saw a situation in which the DHCP server had been configured incorrectly, and users who were assigned addresses from one specific scope couldn't access remote network segments.
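To make this kind of systematic test easier to read, you can group the results by network segment. The following Python sketch takes a list of targets and a probe function and reports which subnets had no reachable hosts. Everything here is hypothetical scaffolding: the function names, the /24 grouping, and the idea of passing in your own probe (for instance, a TCP connect test) are assumptions, not features of any VPN product.

```python
import ipaddress
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def reachability_by_subnet(
    targets: Iterable[Tuple[str, int]],
    probe: Callable[[str, int], bool],
    prefix_len: int = 24,
) -> Dict[str, List[Tuple[str, bool]]]:
    """Probe each (ip, port) target and group the results by subnet."""
    results: Dict[str, List[Tuple[str, bool]]] = defaultdict(list)
    for ip, port in targets:
        # strict=False lets a host address stand in for its containing network.
        subnet = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        results[str(subnet)].append((ip, probe(ip, port)))
    return dict(results)


def unreachable_subnets(results: Dict[str, List[Tuple[str, bool]]]) -> List[str]:
    """Return the subnets in which no probed host answered."""
    return sorted(
        subnet
        for subnet, checks in results.items()
        if not any(ok for _, ok in checks)
    )
```

If every host in one subnet fails while hosts in the others respond, you are probably looking at a routing or addressing problem for that segment, much like the misconfigured DHCP scope above, rather than at a problem with the individual servers.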
9: Try accessing resources by IP address rather than server name
You can also try connecting to network resources by their IP address instead of by their name. If you can access previously inaccessible resources by using IP addresses, you can bet that a DNS problem is to blame. If that happens, you should check to see which DNS server VPN clients are configured to use.
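The name-versus-address comparison is easy to script. Here is a minimal Python sketch that checks a resource by its known IP address first and then checks what its name resolves to. The function names and result strings are my own. It assumes you already know the correct IP address for the server, and that a TCP connect to some port on it is a fair test of reachability.

```python
import socket
from typing import Callable


def default_probe(address: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to address:port can be opened."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def name_or_network(
    hostname: str,
    known_ip: str,
    port: int,
    resolve: Callable[[str], str] = socket.gethostbyname,
    probe: Callable[[str, int], bool] = default_probe,
) -> str:
    """Decide whether a failure looks like DNS or like general connectivity."""
    if not probe(known_ip, port):
        return "unreachable by IP: look at routing or firewall rules, not DNS"
    try:
        resolved = resolve(hostname)
    except OSError:
        # socket.gaierror (failed lookup) is a subclass of OSError.
        return "DNS: name does not resolve"
    if resolved != known_ip:
        return f"DNS: name resolves to {resolved}, expected {known_ip}"
    return "name and IP both work"
```

Run from a connected VPN client, a "DNS" result suggests checking which DNS server the client was handed when it connected. The resolve and probe parameters exist only so the logic can be exercised without a live network.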
10: Are users having performance problems?
Sometimes, users may find that although a VPN connection is functional, it is painfully slow. When this happens, you will have no choice but to do some performance monitoring on your infrastructure servers to ensure that they are not experiencing performance bottlenecks.
I have found that if the infrastructure servers are the source of performance problems, you will usually have multiple users complaining about poor performance. If only a single user is complaining, the problem is likely to be related to that user's Internet connection. I recently stayed at a hotel whose Internet service was so slow that I had difficulty even checking my email. If that happened to an end user, he or she might assume that the hotel's Internet service was running at a normal speed, but that the VPN server was having problems.
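A simple way to separate a slow Internet connection from a slow VPN is to time a connection to an ordinary Internet host and compare it with a connection to the VPN server. The Python sketch below does that with TCP connect times. Treat it as a rough illustration only: the function names and the 0.5-second threshold are arbitrary assumptions, and connect time measures latency, not throughput.

```python
import socket
import time
from typing import Optional


def connect_time(host: str, port: int, timeout: float = 5.0) -> Optional[float]:
    """Return the seconds taken to open a TCP connection, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        return None


def slow_link_suspect(
    internet_s: Optional[float],
    vpn_s: Optional[float],
    slow_threshold_s: float = 0.5,  # arbitrary cutoff; tune for your environment
) -> str:
    """Compare a baseline Internet timing with a VPN server timing."""
    if internet_s is None:
        return "no general Internet connectivity from the client"
    if internet_s >= slow_threshold_s:
        return "client's Internet connection"
    if vpn_s is None or vpn_s >= slow_threshold_s:
        return "VPN path or infrastructure servers"
    return "no obvious latency problem"
```

If the baseline host is already slow, as it would have been from that hotel, there is little point in monitoring the VPN server until the user tries again from a better connection.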