I recently got a call from the company I do some consulting work for (on the side). “The Exchange Server is acting up” was the official assessment for this client’s small network of about 30 computers. I learned that the server was running out of disk space at an alarming rate, users were unable to maintain a connection with the server, and the Internet seemed slow.
My first instincts told me that there was a virus loose on the server. Small companies often don't have the best security and antivirus practices in place, and this particular client was notorious for its lax security policies. They also didn't have a full-time network administrator—they had a few members of the local secretarial staff aiding with IT tasks.
Still on the phone with my contact (who was on site), I suggested he try a couple of quick things to see if we could get to the root of the problem. Norton reported nothing, the event logs reported errors that Exchange wasn’t working (no surprise there), and thoughts of a Denial of Service (DoS) attack roared through my head as I headed over, but there was no quick resolution.
Let's take a look at what we did to discover what the actual problem was, and how we went about resolving the issues.
Before I get too far into explaining the ordeal, let me tell you about the environment we were operating in. As I mentioned, this was a small company with fewer than 30 users. The company was using a single server running Microsoft Small Business Server 2000 SP3 with Exchange 2000 SP3.
After arriving on site, a quick look at the SMTP queues was all it took to discover the problem. My suspicions of a DoS were correct. The entire server, which hosted the e-mail server (Exchange) and the database (SQL Server), was effectively out of commission.
The network had encountered a DoS attack, but I soon discovered that the root cause behind this loss of functionality was an open relay. Whatis.com defines an open relay:
"An open relay (sometimes called an insecure relay or a third-party relay) is an SMTP e-mail server that allows third-party relay of e-mail messages. By processing mail that is neither for nor from a local user, an open relay makes it possible for an unscrupulous sender to route large volumes of spam. In effect, the owner of the server—who is typically unaware of the problem—donates network and computer resources to the sender's purpose. In addition to the financial costs incurred when a spammer hijacks a server, an organization may also suffer system crashes, equipment damage, and loss of business."
Clearly, what had caused our DoS was the open relay. The bigger question here was not whether we were open, but rather how and why? Exchange 2000 comes with relaying prohibited (Exchange 5.5 did not come this way). In fact, the only way you can relay is if you either authenticate or specifically allow a host to do so.
Again, this was a small company without a full-time network administrator, and the personnel responsible for the day-to-day IT tasks were not fully qualified for the job. The open relay therefore had to have come from a misconfiguration, one that essentially allowed relay for all hosts regardless of authentication.
This is an excellent example of the importance of having a qualified IT staff member to monitor a network. This company was easily big enough to need a full-time network administrator but balked at it for various reasons. A full-time staff member probably would have been able to catch this sort of thing in a short time and have it fixed quickly.
After determining the root cause of the Exchange Server’s problem, I began working out a strategy to get this server back online. Because of all the excess mail in the queues, the server responded painfully slowly to nearly anything we tried. We were very close to losing the server completely, and we needed to act fast.
My first step was to turn off the relay, so that the server would stop accepting any more relay attempts. My second task was to try to clean out some of the queues, but it quickly became obvious that this was no easy task. With literally thousands of domain queues open, it would have taken me hours to clean out all of them. I would have had to go one by one, since Exchange Server 2000 doesn’t allow you to clear more than one queue at a time. After messing with them for a while, I decided there had to be a better method.
A quick search on Google turned up Microsoft TechNet article 324958. Essentially this article perfectly described the symptoms we were experiencing, and effortlessly gave us the specific steps we needed to quickly clear up this issue. The general synopsis of this TechNet article is as follows:
- Determine whether your Exchange server is serving as an open relay. At this point we had already figured out what was wrong, but the article lists some steps to further confirm this. I have provided these in detail below.
- Turn off the open relay. This obviously makes sense. Once you have figured out you have an open relay mail server, you need to shut it off.
- Clean up your queues. This is the fun part. Depending on how long the server was open, and how many relay attempts may have come through, your SMTP queues will be enormous. In our case, we had literally thousands of different domain queues in our SMTP Virtual Server. To manually go through these and delete the messages would have taken hours upon hours. This TechNet article has a nice little trick to help you out. In short (and please consult the article for more details), you essentially “trick” your Exchange Server into moving all the messages from their individual queues into one single queue, which you can then whack all at once. This is a truly brilliant idea, but beware, because it will take a long time (hours) to get all the messages into the new queue.
- Clean up your file system. After you have deleted all your queues, you now have to address your file system. As your Exchange server tries to transfer all these messages, it can’t keep up, and so when the retry interval passes, it drops them into a folder on your hard drive. On a default installation the path is C:\Program Files\Exchsrvr\Mailroot\Vsi 1\BadMail (substitute the drive and folder where Exchange is installed). Here you will find all your messages. Don’t open this folder (it may cause your server to freeze up because there are so many messages). Instead, rename the folder, and create a new folder called BadMail. Then delete the old BadMail folder.
- Defrag your hard drives. With all the disk activity that your poor server experienced, a defrag is definitely in order.
- See if you've been blacklisted. If you were open for quite a while, there is a good chance that you were blacklisted (i.e., your e-mail is not being allowed by other e-mail servers, because you are listed as a known source of spam). Consequently, you may notice that messages will start to bounce back, even messages sent to known and good e-mail addresses. This link describes in more detail what to do if you are blacklisted.
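The rename-then-recreate trick from the file system step can also be scripted, which avoids opening the folder in Explorer at all. What follows is only a minimal sketch in Python, not something taken from the TechNet article: the function names and the ".old" suffix are my own choices, and I am assuming the SMTP service is stopped, or at least idle, while the folder is swapped. The demo at the bottom runs against a throwaway temporary directory rather than a real BadMail path.

```python
import shutil
import tempfile
from pathlib import Path


def rotate_badmail(badmail: Path) -> Path:
    """Rename the full BadMail folder and create an empty replacement.

    Returns the path of the renamed (old) folder. The ".old" suffix is an
    arbitrary choice for this sketch.
    """
    old = badmail.with_name(badmail.name + ".old")
    if old.exists():
        raise FileExistsError(f"{old} already exists; remove it first")
    badmail.rename(old)  # renaming does not have to list the folder contents
    badmail.mkdir()      # fresh, empty BadMail for the SMTP service
    return old


def purge(old: Path) -> None:
    """Delete the renamed folder and everything in it (can take a while)."""
    shutil.rmtree(old)


if __name__ == "__main__":
    # Demo on a temporary directory, not on a live server.
    root = Path(tempfile.mkdtemp())
    badmail = root / "BadMail"
    badmail.mkdir()
    (badmail / "sample.bad").write_text("undeliverable message")
    old_folder = rotate_badmail(badmail)
    purge(old_folder)
    print("BadMail is now empty:", not any(badmail.iterdir()))
```

On a real server you would pass the actual BadMail path to rotate_badmail and call purge afterwards, once you are sure nothing in the old folder is needed.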
How to tell if your Exchange Server is an open relay
Do you know if your mail server is an open relay? We learned the hard way, but you can protect yourself now. No matter if you have Small Business Server, or the full-blown Exchange 2000 Server, do yourself a favor and go check out your configuration. Follow these instructions to check:
- On a machine on your network, or on the actual server running Exchange itself, fire up a Telnet session by going to Start | Run and typing Telnet in the Run dialog box.
- From the Telnet prompt, type the command: set LOCAL_ECHO and press [Enter]. (This is so you can see your Telnet interaction occur on the screen.)
- Then, type open servername 25 and press [Enter].
- Just to make sure you actually connected, type helo yourdomainname.com (the SMTP greeting command is spelled HELO, with one "l") and press [Enter] again.
- If you got a response, type the following command: mail from: email@example.com and press [Enter].
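- Then, type rcpt to:firstname.lastname@example.org and press [Enter]. The recipient must be an address in a domain that your server does not host, otherwise the test proves nothing. You should receive a message to the effect of 550 5.7.1 Unable To Relay For Recipient@someotherdomain.com.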
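If you would rather not type the session by hand, the same conversation can be sketched with Python's standard smtplib module. This is an illustration under assumptions, not a definitive test tool: the function names are hypothetical, and an accepted RCPT TO is only a strong hint, since some servers accept the recipient and discard the message later. Only run it against a server you are responsible for.

```python
import smtplib


def classify_rcpt_reply(code: int) -> str:
    """Interpret the SMTP reply code returned for RCPT TO."""
    if 200 <= code < 300:
        return "accepted"      # server agreed to handle an outside recipient
    if 500 <= code < 600:
        return "refused"       # e.g. 550 5.7.1 Unable to relay
    return "inconclusive"      # 4xx temporary failures and anything else


def probe_relay(host, helo_name, sender, outside_rcpt, port=25, timeout=15):
    """Replay the manual Telnet steps: HELO, MAIL FROM, RCPT TO.

    No message body is ever sent; the transaction is reset after RCPT TO.
    """
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.helo(helo_name)
        code, _ = smtp.mail(sender)
        if not 200 <= code < 300:
            return "inconclusive"   # server rejected the sender outright
        code, _ = smtp.rcpt(outside_rcpt)
        smtp.rset()                 # abandon the transaction
    return classify_rcpt_reply(code)
```

A call such as probe_relay("servername", "yourdomainname.com", "email@example.com", "firstname.lastname@example.org") mirrors the steps above; a result of "refused" corresponds to the Unable To Relay message.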
If you received the error message, you are okay, at least for now. If you didn’t receive the message, you definitely have a problem. Exchange 2000 comes with Relay turned off by default, but someone could have turned it on. Open your Exchange System Manager (i.e., the main Exchange administration utility) and follow these instructions:
- From within Exchange System Manager, expand the organization_name object, and then expand the Servers node. Expand the server_name object of the server on which you want to prevent mail relay, and then expand the Protocols node.
- Expand the SMTP node, right-click the virtual SMTP server on which you want to prevent mail relay, and then click Properties.
- From the Properties page, click the Access tab, and then click the Relay button.
- In the Relay Restrictions dialog box, you will see several options. By default, the Only The List Below option is selected and the list is empty, so no computer is granted relay by address; add only the specific servers that genuinely need to relay. The Allow All Computers Which Successfully Authenticate To Relay, Regardless Of The List Above check box is also selected by default. This permits users and computers that can authenticate with the server to relay through it, which is how the Exchange 2000 server relays mail from your internal network clients. Note that if you allow only anonymous access, the server will not authenticate users or computers.
From the relay restrictions page, you can find out if you have allowed any relaying. This was where we found our bogus address that caused the problem. Additionally, you should review any SMTP connectors your Exchange organization has configured to ensure that relay is not allowed.
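To make the effect of these settings concrete, here is a rough model of the relay decision in Python. It is a simplification for illustration only and not how Exchange is implemented internally: the function name, its parameters, and the sample addresses are all hypothetical, and local_domains is assumed to hold lowercase domain names.

```python
from ipaddress import ip_address, ip_network


def may_relay(client_ip, rcpt_domain, *, local_domains, allowed_networks,
              authenticated, allow_authenticated=True):
    """Decide whether a message for rcpt_domain would be accepted.

    local_domains:     lowercase domains the server hosts mailboxes for
    allowed_networks:  entries in the relay list, e.g. ["192.168.1.0/24"]
    authenticated:     whether the client logged on successfully
    """
    if rcpt_domain.lower() in local_domains:
        return True                      # local delivery is not relaying
    if allow_authenticated and authenticated:
        return True                      # authenticated clients may relay
    addr = ip_address(client_ip)
    return any(addr in ip_network(net) for net in allowed_networks)


if __name__ == "__main__":
    common = dict(local_domains={"example.com"}, authenticated=False)
    # Default-style setup: empty list, stranger is refused.
    print(may_relay("203.0.113.9", "example.org", allowed_networks=[], **common))
    # One overly broad entry in the list turns the server into an open relay.
    print(may_relay("203.0.113.9", "example.org",
                    allowed_networks=["0.0.0.0/0"], **common))
```

The second call shows why a single careless entry matters: once the list covers every address, an anonymous outsider is treated exactly like a trusted internal host.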
Testing your configuration with Telnet, and then verifying to whom you are allowing relay is very important. If you are allowing any relay, be sure that you have configured it correctly, and that you know that your source is a good source.
Final thoughts and future planning
The company involved here learned a couple of valuable lessons: Taking appropriate measures to protect yourself is important and allowing seemingly simple changes to be made on a production server can have a drastic effect. The company only had a few hours of downtime, which really didn’t affect its overall business, but it could have been much worse. If the outage had been longer, it could have seriously impacted the company's e-mail performance. In a larger organization, the outage’s impact could have been a lot greater.
After this event, the company's viewpoint on network security has changed a bit. Moving forward, I don’t think that they are going to allow any types of relay at all, regardless of who says they need it, and they are exploring a more focused approach to network administration.