Who among us has not encountered MTU issues, generally in conjunction with a VPN or GRE tunnel? Okay, all two of you can put your hands down now. My point here is, most of us are used to dealing with MTU, and for TCP traffic, the one big tool is of course the Cisco interface command “ip tcp adjust-mss size”. This blog talks about some variant situations where the solution is not so clear.
A prior article at https://netcraftsmen.com/packetization-layer-path-mtu-discovery/ goes into the issues with Path MTU Discovery, P-MTU-D. If you don’t like reading my favorite author, see also Wikipedia, at http://en.wikipedia.org/wiki/Path_MTU_Discovery.
The problem: some security / firewall staff are blocking all ICMP as a DDOS defense. The IP protocol requires certain ICMP messages to operate correctly, so it is unwise to block all ICMP, and more so with IPv6, which relies more heavily on ICMPv6. Advice on this topic seems mixed: a Google search turned up some articles that seemed to be saying blocking all ICMP might be necessary to avoid DDOS. I’m not very comfortable with that recommendation. One can narrow the window of exposure by blocking everything except the ICMP messages that P-MTU-D depends on (“fragmentation needed” for IPv4, “Packet Too Big” for IPv6). What do you think of blocking all ICMP?
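As a rough sketch of the “block all but” idea, the filter below permits only the ICMP messages that P-MTU-D needs. The type and code numbers are the standard ones (ICMPv4 type 3 code 4, ICMPv6 type 2); the function name and the deny-everything-else policy are my own illustrative assumptions, not a recommended production rule set.

```python
# Hypothetical filter policy: permit only the ICMP messages that
# Path MTU Discovery relies on, and drop everything else.

ICMPV4_DEST_UNREACHABLE = 3   # ICMPv4 type
ICMPV4_FRAG_NEEDED = 4        # ICMPv4 code: fragmentation needed and DF set
ICMPV6_PACKET_TOO_BIG = 2     # ICMPv6 type


def permit_for_pmtud(ip_version: int, icmp_type: int, icmp_code: int = 0) -> bool:
    """Return True if this ICMP message should be let through for P-MTU-D."""
    if ip_version == 4:
        return (icmp_type == ICMPV4_DEST_UNREACHABLE
                and icmp_code == ICMPV4_FRAG_NEEDED)
    if ip_version == 6:
        return icmp_type == ICMPV6_PACKET_TOO_BIG
    raise ValueError("ip_version must be 4 or 6")
```

A real IPv6 policy would have to permit more than this, since neighbor discovery also runs over ICMPv6.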
I recently read somewhere that Microsoft modified P-MTU-D in newer OS releases, so that if TCP acknowledgements aren’t received and ICMP MTU exceeded messages are also not received, the protocol stack assumes ICMP interception and decreases the MSS. That seems like a good idea, especially in light of the fact that some percentage (5-10% perhaps) of sites do seem to block all inbound ICMP.
It is well-known that the adjust-mss command does not (cannot) work with UDP packets.
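To make the arithmetic behind adjust-mss concrete, here is a small sketch. It assumes IPv4 and TCP headers of 20 bytes each with no options, and a plain GRE tunnel adding 24 bytes (a 20-byte outer IPv4 header plus a 4-byte GRE header). The function name is mine, and other encapsulations add different amounts of overhead.

```python
# Assumed header sizes (no IP or TCP options).
IPV4_HEADER = 20
TCP_HEADER = 20
GRE_OVERHEAD = 24  # 20-byte outer IPv4 header + 4-byte basic GRE header


def mss_for_path(link_mtu: int, tunnel_overhead: int = 0) -> int:
    """Largest TCP payload that fits in one unfragmented packet."""
    return link_mtu - tunnel_overhead - IPV4_HEADER - TCP_HEADER


print(mss_for_path(1500))                # 1460, plain Ethernet
print(mss_for_path(1500, GRE_OVERHEAD))  # 1436, through a GRE tunnel
```

Under these assumptions, 1436 is the largest value you would give to adjust-mss for TCP crossing a GRE tunnel on a 1500-byte path.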
MTU and IP VideoConferencing
In talking about an IP video-conferencing situation, there was mention of setting the MTU down to say 1420 on video endpoints, since some people were doing IPVC from home over VPN. And if you think about it, RTP/RTCP being UDP, yes, adjust-mss cannot help.
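For a feel of what an endpoint MTU of 1420 means for the media stream, here is a sketch of the packetization arithmetic. It assumes IPv4 without options, the 12-byte fixed RTP header with no CSRC entries or extensions, and no SRTP overhead; the function names are hypothetical.

```python
from typing import List

# Assumed header sizes.
IPV4_HEADER = 20
UDP_HEADER = 8
RTP_FIXED_HEADER = 12


def max_rtp_payload(mtu: int) -> int:
    """Largest RTP payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - UDP_HEADER - RTP_FIXED_HEADER


def split_frame(frame_len: int, mtu: int) -> List[int]:
    """Payload sizes needed to carry one video frame of frame_len bytes."""
    limit = max_rtp_payload(mtu)
    sizes = []
    while frame_len > 0:
        chunk = min(limit, frame_len)
        sizes.append(chunk)
        frame_len -= chunk
    return sizes


print(max_rtp_payload(1420))    # 1380
print(split_frame(3000, 1420))  # [1380, 1380, 240]
```

The point is that the sender has to do this sizing itself: with UDP there is no handshake in which a router along the path could rewrite the size for it.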
Some quick Google searching showed that IPVC endpoints are supposed to do P-MTU-D. But that doesn’t help when there are random e.g. business partners who block all ICMP.
If you consider that Cisco Jabber and other IPVC softphone clients tend to spread like wildfire in sites doing IPVC, do you really want to have to set the MTU on each one? Are they smart enough to do UDP-centric P-MTU-D? Or can the MTU they should use be set on a central control point? If they don’t get RTCP replies or some form of feedback, how can they work around a mis-configured firewall that blocks all ICMP? (Hmm, perhaps they could use TCP to determine MTU, then use that MTU for UDP? Does anybody actually do that?)
MTU and Tunneled VPN Traffic
Another situation involving MTU that came up recently is tunneled IPsec traffic.
The DDOS service from Prolexic (www.prolexic.com) uses BGP to steer your inbound Internet traffic to their site, then routes the acceptable traffic via GRE tunnels to your site’s Internet routers. See the following diagram.
Inbound traffic (red) goes to Prolexic, gets DDOS-filtering applied, and is then sent via the blue GRE tunnel to the customer site. Customer’s outbound traffic (green) goes directly to the Internet. This is asymmetric, so might run afoul of uRPF checking, I imagine. So perhaps sites sometimes route their outbound Internet traffic via Prolexic?
They recommend setting adjust-mss on your LAN and GRE tunnel interfaces of your Internet router(s). That works for TCP traffic.
Unfortunately, if you have a VPN termination point on say a firewall behind the Internet router(s), that could be a problem. Inbound VPN client traffic has to pass through the GRE tunnel. IPsec is either ESP or AH (IP protocols 50 and 51), which are different IP protocols than TCP, so adjust-mss on the Internet router has nothing to act on.
In particular, IPsec does not generally do P-MTU-D or anything like it. Some Cisco devices do support it for VPNs, others do not. Even if the implementation does attempt P-MTU-D, MTU black holing stymies that.
The end result is that for MTU black hole sites, there is no way for the Internet router to communicate a smaller MTU to the head end VPN endpoint. Having some way to set the desired MTU directly seems like one possible answer.
The site in question terminated VPNs at the head end on a CheckPoint firewall. It appears (correct me via a comment if I’m wrong) that CheckPoint takes a server-like approach to MTU, so interface MTU can only be set globally on the outside interface of the firewall, rather than for VPN clients only, which would be my preferred approach. (I did find web articles talking about setting client-side MTU, which is a business I’d rather not be in.)
One could try setting the outside MTU on the firewall(s), and hope for the desired effect on traffic coming from inside servers. In other words, put your effort into shrinking the TCP payload for IPsec, where you control things. This would require some lab testing time, to see if that suffices. If the firewall sends ICMP MTU exceeded messages and your servers do TCP only with P-MTU-D, this might work.
Since most of the encrypted traffic is partners running web-based applications, one interesting idea is to put a router on the inside of the firewall, and use “adjust-mss” on its LAN side interface. If the firewall(s) in question only service VPN traffic, one can plumb things so all decrypted VPN traffic passes through this extra inside router. The decrypted VPN traffic would be lower in aggregate speed than the Internet connection for sure, so scaling / cost are unlikely to be big problems. This was named the “MTU-crusher” router in discussion. (And thanks to the person who coined that term.)
Internet Denial of Reality
This whole topic has struck me for years as having some denial associated with it. The “official” answer to MTU is P-MTU-D. But there are lots of sites that “didn’t get the memo” and still block all inbound ICMP. So as far as I’m concerned, we can’t just ignore those sites, which means P-MTU-D just doesn’t work adequately in the real world. Is anybody at the standards level recognizing that and dealing with the real world here? The answer is yes, and it is old history.
RFC2923 discusses the problem, getting it on the radar. And RFC4821 provides one way to be efficient about P-MTU-D and not rely on ICMP feedback. It was published in 2007 as a standards-track RFC (Proposed Standard). Google search also reveals that Cisco TelePresence implements RFC4821 behavior. Implementing it appears to be optional, for use by clueful vendors. Unfortunately, there is high latency between draft or actual standards and their implementation and deployment in the real world.
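The core idea of probing without ICMP can be sketched as a search over packet sizes, driven only by whether a probe of a given size gets acknowledged. This is a simplified illustration in the spirit of RFC4821, not the RFC’s actual state machine: the `probe` callback, the 576 and 1500 search bounds, and the use of a plain binary search are all my assumptions.

```python
from typing import Callable


def search_path_mtu(probe: Callable[[int], bool],
                    low: int = 576, high: int = 1500) -> int:
    """Return the largest size in [low, high] for which probe(size) succeeds.

    Assumes a packet of size `low` gets through, and that any size which
    works implies all smaller sizes work too.
    """
    while low < high:
        mid = (low + high + 1) // 2   # round up so the range always shrinks
        if probe(mid):
            low = mid                 # acknowledged: path carries at least mid
        else:
            high = mid - 1            # lost: treat mid as too big
    return low


# Simulated path that silently drops anything over 1436 bytes
# (no ICMP ever comes back, as with a black-holing firewall).
print(search_path_mtu(lambda size: size <= 1436))  # 1436
```

A real implementation also has to distinguish a lost probe from ordinary congestion loss, which this sketch ignores.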
One perspective is that the sites creating the problem by blocking all ICMP are depriving themselves of connectivity, e.g. for video or file downloads. To the extent that you’re in business and they’re a partner or customer, you may not be able to leave it as their problem, nor do you have the time to educate the offending party.
I have read about modified P-MTU-D where you detect such sites (“MTU black hole detection”) by sending large frames with DF bit set and then with DF not set. That might help. How many implementations currently do that? Or do vendors implement the Microsoft approach, lowering MTU when no response is detected with large frames with DF set? I imagine vendors are all over the place, with no consistency on this.
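Here is how I picture the DF-set-then-DF-clear check, as a sketch against a simulated path. The `send` callback (returning True when the packet is acknowledged), the `make_path` simulator, and the result labels are all hypothetical, and the sketch assumes no ICMP came back (otherwise ordinary P-MTU-D would already have handled it).

```python
from typing import Callable


def classify_path(send: Callable[[int, bool], bool], size: int) -> str:
    """Try a large packet with DF set, then the same size with DF clear."""
    if send(size, True):
        return "ok"               # large DF packet got through
    if send(size, False):
        return "mtu-black-hole"   # only works when routers may fragment
    return "unreachable"          # fails either way: not an MTU problem


def make_path(path_mtu: int, reachable: bool = True) -> Callable[[int, bool], bool]:
    """Simulate a path that drops oversized DF packets without sending ICMP."""
    def send(size: int, df: bool) -> bool:
        if not reachable:
            return False
        if size <= path_mtu:
            return True
        return not df   # oversized: fragmented if DF is clear, dropped if set
    return send


print(classify_path(make_path(1400), 1500))  # mtu-black-hole
print(classify_path(make_path(1500), 1500))  # ok
```

Once a black hole is detected, the sender can fall back to a smaller packet size or stop setting DF toward that destination.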
Is any of this learned experience being ported to IPv6? I’m tracking IPv6, but not that closely. IPv6 requires fragmentation by the endpoint; routers are not allowed to fragment. So sites that block all ICMPv6 will learn painfully from their MTU-related connectivity problems. That does seem like it puts the problem where it belongs, on the offending firewall admins.
I’m glad to see Cisco has recognized some of the security implications of the increased reliance on ICMP in IPv6 by putting features into the switch code, e.g. NDP security for router discovery and SLAAC. (NDP uses ICMP for messaging.)