Mark Pimperton explains how an MTU mismatch between the ISP's router and the main firewall left customer emails languishing in cyberspace.
This tale relates to cable broadband provided in the UK (by Virgin Media Business), but the problem lies with the ISP's router, so it may well happen with other providers too. After ordering a broadband upgrade (from a 10Mb/1Mb service to 50Mb/5Mb), my skeptical colleague searched relevant discussion forums looking for potential problems. He found reports of delays to incoming email and warned me this could affect us.
By the time we checked this out we'd already had reports of expected emails not being received. To see what was going on we enabled SMTP Receive Logging in Exchange 2007 as follows:
- Navigate to Server Configuration / Hub Transport.
- In the Receive Connectors pane right-click the connector name and select Properties.
- On the General tab, set Protocol logging level from None to Verbose and click Apply or OK.
(You can also enable this from the Exchange Management Shell; see this blog for details.)
The resulting logs were written to C:\Program Files\Microsoft\Exchange Server\TransportRoles\Logs\ProtocolLog\SmtpReceive. They're not particularly easy to read but the key phrase is "Timeout waiting for client input". We were getting this every few minutes in the log and we could tell that some emails were delayed by several hours before finally being delivered.
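If you'd rather not scroll through the logs by eye, a few lines of Python can count the timeouts for you. This is only a rough sketch: the function name is my own invention, and it assumes each data line starts with an ISO-style timestamp (date, then `T`, then the time) and that header lines start with `#`. Check a sample of your own log files before trusting the numbers.

```python
from collections import Counter
import glob
import os

PHRASE = "Timeout waiting for client input"


def timeouts_per_hour(lines):
    """Count lines containing PHRASE, grouped by the date and hour they start with."""
    counts = Counter()
    for line in lines:
        # Skip header/comment lines and anything that isn't a timeout.
        if line.startswith("#") or PHRASE not in line:
            continue
        # Assumed layout: a leading timestamp such as 2020-01-31T10:15:30.123Z,
        # so the first 13 characters are the date plus the hour.
        counts[line[:13]] += 1
    return dict(counts)


if __name__ == "__main__":
    # Log folder from the article; the *.log pattern is an assumption.
    folder = r"C:\Program Files\Microsoft\Exchange Server\TransportRoles\Logs\ProtocolLog\SmtpReceive"
    for path in sorted(glob.glob(os.path.join(folder, "*.log"))):
        with open(path, encoding="utf-8", errors="replace") as handle:
            for hour, count in sorted(timeouts_per_hour(handle).items()):
                print(path, hour, count)
```

Running it before and after a change gives a quick like-for-like comparison of how often the timeouts occur.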
The MTU, or Maximum Transmission Unit, is the largest packet size an interface will send without fragmenting it. It's a networking property which is generally out of sight, out of mind. On this occasion, though, it was the cause of the repeated email delivery failures. Following the fix suggested by our ISP, we changed the MTU on our firewall/router WAN interface from 1500 to 1460 to match the ISP router's standard setting (which can't be changed).
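To put some numbers on those two MTU values, here's a small piece of arithmetic in Python. It isn't a diagnosis of what went wrong for us, just the standard sums, assuming IPv4 and TCP headers of 20 bytes each with no options and an 8-byte ICMP header. The function names are mine.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options
ICMP_HEADER = 8   # bytes, for an echo request (ping)


def tcp_mss(mtu):
    """Largest TCP payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


def max_ping_payload(mtu):
    """Largest ping payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


for mtu in (1500, 1460):
    print(f"MTU {mtu}: TCP MSS {tcp_mss(mtu)}, max ping payload {max_ping_payload(mtu)}")
```

A full-size packet built for an MTU of 1500 is 40 bytes too big for a link limited to 1460, so it has to be fragmented or dropped somewhere along the way. On Windows you can probe this with `ping -f -l <size> <host>`, where `-f` sets the don't-fragment flag and `-l` sets the payload size: with an MTU of 1460, a payload of 1432 should get through and 1433 should not.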
The next day, the timeouts in the SMTP log fell from every few minutes to 2 or 3 per day - which is normal. (It can happen when our email filtering provider starts to deliver a message but then marks it as spam and stops the transmission, for example.) All was well again.
I don't claim to understand how this difference of 40 bytes in the MTU between the ISP router and our main router/firewall actually caused all the fragmentation and retries. And, to be honest, all I cared about was the fix. In any case, we're told that a future firmware upgrade to the ISP router will remove this requirement for all other components to match the MTU setting.
Incidentally, when you no longer need the SMTP logging be sure to turn it off again - or at least set yourself a reminder to purge the log files regularly. Ours were between 1 and 4MB per day and we're a relatively light email user.
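If you do leave logging on, the purge can be scripted rather than left to a reminder. The sketch below deletes `.log` files older than a given number of days from a folder. The function name, the `.log` extension filter and the age threshold are all my assumptions, so try it on a copy of the folder first.

```python
import os
import time


def purge_old_logs(folder, max_age_days, now=None):
    """Delete .log files in folder last modified more than max_age_days ago.

    Returns the names of the files that were removed, sorted.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    removed = []
    # Take a snapshot of the directory first so we aren't deleting while iterating.
    for entry in list(os.scandir(folder)):
        if not entry.is_file() or not entry.name.lower().endswith(".log"):
            continue
        if entry.stat().st_mtime < cutoff:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

For example, `purge_old_logs(folder, 14)` would keep roughly the last two weeks of logs, and it could be run from a scheduled task.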
We have two ISPs and this email problem happened when we upgraded the first one back in March. We recently upgraded the second (switching from standard ADSL to FTTC). No email hassles this time (or none that I've been told of), but we did have a repeat of the dropped ARP packets issue I wrote about previously. Because the setup was different with the upgraded connection, the dropped packets were coming from the internal IP address of the modem (in the 192.168.x subnet) rather than an IP address somewhere in the ISP's network like before.
The symptoms were also slightly different from before. The first time, we lost all connectivity and DNS. This time, we just saw a glitch in connectivity (with a bunch of alert emails from the firewall telling us it was down and then up again). This was happening at the same 15-minute interval, however, so we were pretty sure it was the same thing.
We identified the problem using the SonicWALL's Packet Monitor as before and applied the same fix - adding a static route to make packets from that address acceptable. No more 15-minute glitches.
Our email delivery delays were fixed by making our firewall MTU the same as the ISP's router. After two different strange problems with two broadband upgrades we're looking forward to just enjoying our faster connections for a while!