
Tracking your Exchange server's performance

Is your Exchange server running at peak performance? Do you even know what's going on inside of it? In this Daily Drill Down, Brien Posey shows you how to track your Exchange server's performance.


In previous Daily Drill Downs, I’ve explained many different ways to enhance Exchange Server’s performance. So far, however, you haven’t had a way to actually measure any performance gains achieved through the use of these techniques. In this Daily Drill Down, I’ll explain how you can obtain quantifiable measurements that will help you to see just how well Exchange is performing both before and after the optimization process.

Monitoring Exchange Server
You’re probably already familiar with Performance Monitor, Windows NT’s tool for measuring the performance of various system components. Performance Monitor is also the tool of choice for keeping an eye on Exchange Server.

Remember that Exchange Server piggybacks on top of Windows NT Server. Therefore, Exchange Server requires most of the same resources as Windows NT does. For example, if Windows NT is low on memory, then Exchange Server will also be low on memory. In addition to the usual Performance Monitor counters, the Exchange Server installation program installs several counters that are Exchange Server-specific. By watching these Exchange-specific counters, you can get a feel for what’s really going on in your system.

Before I get into all of the specific details about which counters to monitor, I’d like to take a minute to discuss the general technique that you’ll use to optimize Exchange Server. The procedure begins by establishing a baseline. A baseline is nothing more than a measurement of how your server is performing under normal circumstances. After all, measuring performance gains is pointless if you have nothing to which you can compare those measurements.

Once you’ve established a baseline, it’s time to begin the optimization process. The optimization process consists of repeatedly searching the baseline for bottlenecks, removing the bottlenecks, and remeasuring. A bottleneck is basically the slowest part of the system. For example, suppose that your system contains a really fast processor, but the processor is often idle because it’s waiting for information to be loaded from the hard disk. In such a situation, the hard disk is the bottleneck because it’s the slowest component in the system and other components have to spend precious time waiting for it to do its job.

In a situation in which the baseline revealed that the hard disk was the bottleneck, you’d probably want to upgrade to a faster hard disk or to a RAID array. Once you’d done so, you could remeasure your system’s performance and compare the new measurements against the original baseline to judge the performance gain. If the new measurements reveal that the hard disk is no longer the bottleneck, you’d search for the new bottleneck and fix it.
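
The compare-against-the-baseline step can be sketched in a few lines of Python. This is only an illustration: it assumes you have copied counter samples out of Performance Monitor into plain lists, and the sample values and the helper names summarize and compare_to_baseline are hypothetical.

```python
from statistics import mean

def summarize(samples):
    """Collapse each counter's list of samples into a single average."""
    return {counter: mean(values) for counter, values in samples.items()}

def compare_to_baseline(baseline, current):
    """Return the percentage change of each counter relative to the baseline."""
    changes = {}
    for counter, base_value in baseline.items():
        # Skip counters missing from the new run or with a zero baseline.
        if counter in current and base_value != 0:
            changes[counter] = (current[counter] - base_value) / base_value * 100.0
    return changes

# Hypothetical sample values, for illustration only.
baseline = summarize({
    "LogicalDisk | Avg. Disk Queue Length": [3.0, 4.0, 5.0],
    "MSExchangeISPrivate | Send Queue Size": [10, 20, 30],
})
after_upgrade = summarize({
    "LogicalDisk | Avg. Disk Queue Length": [1.0, 1.0, 1.0],
    "MSExchangeISPrivate | Send Queue Size": [5, 10, 15],
})

changes = compare_to_baseline(baseline, after_upgrade)
for counter, pct in changes.items():
    print(f"{counter}: {pct:+.1f}% versus baseline")
```

A negative percentage on a queue counter means the queue is shorter than it was in the baseline, which is what you would hope to see after removing a disk bottleneck.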

Before you go crazy with the process of tracking down and removing bottlenecks, I should point out that there will always be a bottleneck in any system. Even the fastest, most well-tuned computer in the world has a bottleneck because there’s one component that’s just a little bit (or a lot) slower than the other components in the system. My point is that it’s easy to take the process of removing bottlenecks to an extreme level. Just use common sense when tweaking your servers. With that said, let’s get started with the optimization process.

Observing the server’s workload
When it comes to optimizing servers, there’s one extremely important thing to remember: The only constant is change. Basically, what that means is that over time, users’ habits and server workloads change.

I was once asked to look at someone’s Exchange server to see how its performance could be improved. After looking at the baseline, I made a couple of recommendations to the company. But, by the time the company decided to use my recommendations, eight months had gone by. When the new hardware was installed, the performance gains were minimal and the people who had asked me to look at the server were upset. A closer analysis revealed that during the eight months, the user count had doubled—the resources that I had recommended were based on a much lower user count. The lesson here is that if you collect baselines over time, you need to also take into account increases in the number of users on the server and the increased workload of existing users.

Fortunately, Exchange Server comes with several Performance Monitor counters that, when used over time, can help you to spot trends in increased server usage. Below, I’ve listed some of these counters and what they mean. Remember that Performance Monitor is divided into objects and counters. The objects are basically categories, while each counter measures a specific attribute of its object. Each entry in the list uses the format Object | Counter.
  • MSExchangeIS | User Count: This counter represents the number of clients connected to the Exchange server. If you watch this counter over time, you can see how quickly the user count is increasing.
  • MSExchangeIS | Active User Count: Sometimes the number of connected users can be deceptive when measuring the server’s workload. After all, how many users do you know who log in every morning and don’t actually do anything for most of the day? Any managers come to mind? If you want to look at the real load on the server, check out this counter. It counts only the users who’ve actually done something Exchange-related in the past 10 minutes.
  • MSExchangeISPrivate | Messages Submitted / min: This counter measures the number of messages that are sent to the private information store each minute.
  • MSExchangeISPrivate | Message Recipients Delivered / min: This counter tracks the number of messages that are delivered by the private information store each minute. It may seem like common sense that the number of messages submitted should be the same as the number of messages delivered, but this isn’t the case. The number of messages delivered will usually be a lot higher because often a single message is sent to many different users.
  • MSExchangeISPublic | Messages Submitted / min: This counter measures the number of messages being sent to the public information store each minute.
  • MSExchangeISPublic | Message Recipients Delivered / min: This counter measures the number of messages delivered by the public information store each minute. Like its private information store counterpart, this counter will be higher than the number of messages submitted because many messages are sent to multiple users.
  • MSExchangeMTA | Messages / sec: This counter tallies the number of messages that flow through the MTA each second. This is a great counter for measuring the server’s overall workload.
  • MSExchangeMTA | Message Bytes / sec: If you notice that Exchange is bogging down over time, but the same number of users are sending about the same number of messages over that time period, then the slowdown could be caused by the messages getting bigger. You can use this counter to measure the number of bytes flowing through the MTA each second. If you want to know the average message size, simply divide this byte count by the number of messages flowing through the MTA each second.
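
The average message size calculation from the last item is a simple division. The sketch below shows it in Python. The function name and the two sets of readings are hypothetical, and it assumes both MTA values were sampled over the same interval.

```python
def average_message_size(bytes_per_sec, messages_per_sec):
    """Average message size in bytes: MTA bytes per second divided by
    MTA messages per second. Returns None when no messages are flowing."""
    if messages_per_sec <= 0:
        return None
    return bytes_per_sec / messages_per_sec

# Hypothetical readings taken a few months apart, for illustration only.
size_then = average_message_size(bytes_per_sec=51_200, messages_per_sec=4)
size_now = average_message_size(bytes_per_sec=102_400, messages_per_sec=4)

print(f"Average message size then: {size_then:.0f} bytes")
print(f"Average message size now:  {size_now:.0f} bytes")
```

In this made-up example the message rate is unchanged but the average message has doubled in size, which is the pattern to look for when the server slows down without any growth in users or message counts.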

Observing service times
So far I’ve shown you a few Performance Monitor counters to track, and I’ve mentioned a number of optimization techniques in previous articles that you can use, but what is the optimization process really all about? Is it simply a matter of making a few numbers change on a Performance Monitor chart? No, it’s about making sure that the server is responsive to end-user requests.

Unfortunately, there’s no way to directly measure how responsive the Exchange server is to user requests. If you really want to find out how responsive the server is, simply ask the users who use it. But, if you really want some measurable statistics, there are some Performance Monitor counters that you can track over time to get a feel for how the server is performing. I’ve listed these counters below in the same format as I previously used.
  • MSExchangeISPrivate | Send Queue Size: This counter measures the number of messages that are waiting to be sent. This number can be greater than zero during periods of heavy activity, but it should return to zero quickly after the period of heavy use has passed.
  • MSExchangeISPrivate | Average Time for Delivery: This counter indicates the average amount of time that it takes the private information store to deliver messages.
  • MSExchangeISPublic | Send Queue Size: This counter measures how many messages are waiting to be sent to the public information store. Just like its private information store counterpart, this counter can be above zero during periods of heavy use, but it should return to zero quickly thereafter.
  • MSExchangeISPublic | Average Time for Delivery: This counter provides an indication of how long it usually takes the public information store to deliver messages.
  • MSExchangeMTA | Work Queue Length: You’ve seen that both the private and public information stores can measure the number of messages waiting for delivery. If you want to measure the number of messages that are waiting for delivery collectively, then you can use this counter. It looks at the number of messages in the MTA queue rather than basing its results on the private or the public information store queue.
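
One way to check whether a queue "returns to zero quickly" is to look for the longest stretch of consecutive non-zero samples. The sketch below assumes the queue counter was sampled at a fixed interval. The article does not define "quickly," so the max_backlog_samples limit is an assumption you would choose yourself, and the sample data is hypothetical.

```python
def longest_nonzero_run(queue_samples):
    """Length of the longest run of consecutive samples above zero."""
    longest = current = 0
    for size in queue_samples:
        if size > 0:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest

def queue_drains_quickly(queue_samples, max_backlog_samples):
    """True if the queue never stays above zero for more than
    max_backlog_samples consecutive samples (an assumed limit)."""
    return longest_nonzero_run(queue_samples) <= max_backlog_samples

# Hypothetical Send Queue Size samples, for illustration only.
send_queue = [0, 3, 5, 0, 0, 1, 2, 4, 6, 0]
run_length = longest_nonzero_run(send_queue)
print(f"Longest backlog: {run_length} consecutive samples")
print("Drains quickly:", queue_drains_quickly(send_queue, max_backlog_samples=5))
```

The same check applies to the private store, the public store, and the MTA work queue, since the guidance for all three is that a backlog should clear soon after the busy period ends.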

Observing the physical hardware
In one of my previous Daily Drill Downs entitled “Optimizing Exchange Server, part 2,” I discussed how various hardware components could cause Exchange to run slowly. In this section, I’ll explain how to measure the performance of some of those components and tell you what to do if you do find a bottleneck.

Processor
You can judge just how well the processor is performing by looking at the System | % Total Processor Time and the Process | % Processor Time counters. These counters measure what percentage of your machine’s total processing power is being used at any given time. Although it’s normal for these values to spike to 100 percent, they should remain at an average of 80 percent or less.
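
As a rough illustration of the difference between a spike and a sustained load, the Python sketch below averages a set of processor samples and compares the result with the 80 percent guideline. The function name and the sample values are hypothetical.

```python
from statistics import mean

CPU_AVERAGE_LIMIT = 80.0  # percent, the guideline given in the text

def cpu_is_overloaded(samples, limit=CPU_AVERAGE_LIMIT):
    """True when the average processor utilization exceeds the limit.
    Individual spikes to 100 percent do not matter on their own."""
    return mean(samples) > limit

# Hypothetical % Processor Time samples, for illustration only.
spiky_but_healthy = [35, 100, 40, 60, 100, 45]
sustained_load = [90, 95, 100, 85, 80, 92]

print("Spiky server overloaded:", cpu_is_overloaded(spiky_but_healthy))
print("Busy server overloaded: ", cpu_is_overloaded(sustained_load))
```

The first series touches 100 percent twice but averages well under the limit, so only the second series would count as a processor problem.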

If you find that the processor is inadequate, first look for unnecessary services that might be running on the server. Disable these services and retest the processor. If the processor is still running too slowly, you might install a faster processor, a larger L2 cache, or, when appropriate, a second processor.

Hard disk
The disk counters are divided into physical disk counters and logical disk counters. I recommend using the logical disk counters for gauging hard disk performance. Before you do though, there are a couple of things that you need to know. First, the disk counters are disabled by default because they negatively impact the server’s performance. To enable the disk counters, enter the command DISKPERF -Y at the command prompt and reboot the server. The other thing that you need to know is that the disk counters are inaccurate. It’s impossible to measure the hard disk’s performance accurately because the counters rely on the hard disk to store collected data. You can, however, use the counters to get a general idea of what the hard disk is doing.

When measuring hard disk performance, I recommend looking at the following counters and comparing the results with the maximum load specifications set by your hard disk’s manufacturer. By doing so, you’ll be able to see how close your hard drive is to reaching its speed limit. Here are the counters that I recommend testing:
  • LogicalDisk | Disk Bytes Written / sec
  • LogicalDisk | Disk Bytes Read / sec
  • LogicalDisk | Disk Reads / sec
  • LogicalDisk | Disk Writes / sec
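
Comparing a measured value with the manufacturer's rating comes down to a percentage. The Python sketch below is a simplified illustration. The rated throughput and observed readings are made-up numbers, and adding the read and write byte rates together to get total throughput is an assumption of this example.

```python
def percent_of_rated(observed_per_sec, rated_per_sec):
    """Express an observed disk rate as a percentage of the rated maximum."""
    if rated_per_sec <= 0:
        raise ValueError("rated_per_sec must be positive")
    return observed_per_sec * 100.0 / rated_per_sec

# Hypothetical manufacturer rating and counter readings, for illustration only.
rated_bytes_per_sec = 10_000_000
observed = {
    "LogicalDisk | Disk Bytes Read / sec": 4_500_000,
    "LogicalDisk | Disk Bytes Written / sec": 2_500_000,
}

total_bytes_per_sec = sum(observed.values())
utilization = percent_of_rated(total_bytes_per_sec, rated_bytes_per_sec)
print(f"Disk is running at {utilization:.0f}% of its rated throughput")
```

The same function works for the Disk Reads / sec and Disk Writes / sec counters if the manufacturer publishes a rating in operations per second.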

You can also look at the LogicalDisk | Avg. Disk Queue Length counter. If this counter is consistently above 2, it may indicate a hard disk bottleneck. I recommend ignoring the LogicalDisk | % Disk Time counter. Things like smart disk controllers and elevator algorithms make this counter totally unreliable.
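
The queue length rule can be checked against a series of samples as sketched below in Python. The text does not say what "consistently" means, so the required_fraction of 0.9 (at least 90 percent of samples above the threshold) is an assumption of this example, and the sample data is hypothetical.

```python
def fraction_above(samples, threshold):
    """Fraction of samples that are strictly above the threshold."""
    if not samples:
        return 0.0
    return sum(1 for value in samples if value > threshold) / len(samples)

def disk_queue_suggests_bottleneck(queue_lengths, threshold=2.0,
                                   required_fraction=0.9):
    """True when the queue length is above the threshold in at least
    required_fraction of the samples (an assumed reading of 'consistently')."""
    return fraction_above(queue_lengths, threshold) >= required_fraction

# Hypothetical Avg. Disk Queue Length samples, for illustration only.
busy_disk = [2.5, 3.1, 4.0, 2.2, 3.7, 2.9, 3.3, 2.8, 2.6, 3.0]
one_spike = [0.4, 0.2, 5.0, 0.3, 0.1, 0.5, 0.2, 0.3, 0.4, 0.2]

print("Busy disk bottleneck:", disk_queue_suggests_bottleneck(busy_disk))
print("One spike bottleneck:", disk_queue_suggests_bottleneck(one_spike))
```

A single spike above 2 is ignored, while a queue that stays above 2 for nearly the whole sampling period is flagged.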

Memory
When it comes to memory, there’s no such thing as too much. The most important memory-related counters are Memory | Pages / sec and Memory | Available Bytes. The number of pages per second relates to how often your server has to rely on memory pages that are stored in virtual memory on the hard disk. If the Pages / sec value is consistently above 20, then it’s time to add more memory.

The number of available bytes refers to the amount of free memory in the server. This value should never drop below 4 MB (higher for larger servers). If it does, it’s time to add more memory.
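
Both memory rules can be checked together, as in the Python sketch below. The two limits come from the text. The 90 percent cutoff used to decide that paging is "consistently" high is an assumption, and the function name and sample values are hypothetical.

```python
PAGES_PER_SEC_LIMIT = 20                 # guideline from the text
MIN_AVAILABLE_BYTES = 4 * 1024 * 1024    # 4 MB, guideline from the text

def memory_findings(pages_per_sec, available_bytes, required_fraction=0.9):
    """Return a list of memory warnings for the given counter samples.
    required_fraction is an assumed reading of 'consistently'."""
    findings = []
    if pages_per_sec:
        over = sum(1 for p in pages_per_sec if p > PAGES_PER_SEC_LIMIT)
        if over / len(pages_per_sec) >= required_fraction:
            findings.append("Pages / sec is consistently above 20")
    # A single dip below the floor is enough, because the value
    # should never drop below it.
    if available_bytes and min(available_bytes) < MIN_AVAILABLE_BYTES:
        findings.append("Available Bytes dropped below 4 MB")
    return findings

MB = 1024 * 1024
# Hypothetical samples, for illustration only.
starved = memory_findings([25, 30, 40, 22], [8 * MB, 3 * MB])
healthy = memory_findings([5, 3, 50, 4], [16 * MB, 12 * MB])

print("Starved server:", starved)
print("Healthy server:", healthy)
```

On a larger server you would raise MIN_AVAILABLE_BYTES, since the text notes that the 4 MB floor should be higher there.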

Conclusion
In this Daily Drill Down, I explained how you can use Performance Monitor to see just how well Exchange Server is performing. I also discussed how to use Performance Monitor data to track down system bottlenecks.
The authors and editors have taken care in preparation of the content contained herein but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for any damages. Always have a verified backup before making any changes.
