I’ve run into some interesting situations while working with customer VPNs. I hope a couple of them are of general interest, so I’ll discuss them in this article.
Prior MPLS VPN blogs
I’ve written a fair number of words on the topic of things a customer of MPLS VPN WAN should know. If you feel the need for more words from me, you can certainly find them at the links below.
Configuring the Customer Side of an MPLS VPN WAN, Part 2
Configuring the Customer Side of an MPLS VPN WAN, Part 1
MPLS VPN Cutovers
If you’re converting some WAN service (Frame Relay, ATM, T1, etc.) to MPLS VPN, that’s not necessarily complex. You have the new and the old WAN clouds, and you turn up a site on the new, turn off the old, test, and done. The new talks to the old via one or more sites that are connected to both. If more than one site is connected to both, then the routing considerations in the previous MPLS VPN blogs apply.
Where this gets a little bit more interesting is if you plan to change one of the connections, say from multi-link PPP access over several T1 links to a rate-capped T3 link. This involves a change of physical media. So you get a new router ready to be hooked up, and schedule a cutover time with the carrier. You turn off the old router, turn up the new (duplicating the prior router’s WAN addressing), the carrier does similarly on their end of things, and life is good. Or not. If you have problems, you work with the carrier to back out the changes.
If you have problems, or want to do things gradually, you might coordinate with the carrier to bring up the new link with your new / second router attached, at a different IP address, and no LAN connection. You can then stress-test the new router link (after hours), inject some non-overlapping routing prefixes (to verify the carrier propagates routes cleanly), etc. And, as I’m presently in the midst of doing, try to determine the cause of intermittent lost packets — is it GETVPN, the site Packeteer, something else? (The link counters and route longevity timers look very clean.)
What you cannot do is hook the second router up to the LAN and gradually cut sites over to it. It’s a great idea; the problem is that your routers aren’t directly connected, and the carrier’s MBGP is going to send traffic to one or the other of the two routers (unless configured for equal cost routes). BGP doesn’t care about link metrics. If you’re doing EIGRP to the carrier, you can’t do AS prepending or such to “steer routing”, and doing so would affect all traffic in from all sites anyway. So having most remote sites go to the first router, and swinging one site to the new router, just isn’t really feasible without some contortions. (Did someone just say “GRE tunnels”?)
MPLS and Load Balancing
A prior consultant set a customer up with dual T1 connections from each remote site to the MPLS VPN provider. They go to different provider POPs (Points of Presence), for some degree of local access diversity. (I always worry about the last mile and sharing a bundle / manhole going into the building, though.) They’re currently configured as primary and backup serial links.
If the T1s went to the same POP, we (and the carrier) could set up multilink PPP and use 2 x T1 of bandwidth. That is not a possibility with different POPs.
If the carrier enables its MBGP to support two equal cost routes, then load balancing might be a possibility. The carrier in question in fact said they had done so and support dual routes for customers with EBGP handoff. Unfortunately, my customer has EIGRP, and the carrier won’t support load balancing for that. (The SE said to set the EIGRP metrics differently, set a variance, and somehow load balancing would occur — I don’t believe that at all.)
The reason the carrier has to support this? MBGP, like normal BGP, is focused on selecting one best route. Storing two routes per prefix instead of one roughly doubles the RAM those routes consume, and probably increases other resource consumption in the router.
If the carrier doesn’t support multiple routes but both your links are active, you may still be able to load share outbound (equal cost next hops at the MPLS POPs from EIGRP’s or OSPF’s perspective). See the top arrows in the following diagram.
Within the carrier, the MPLS VPN MBGP will effectively route traffic to one or the other of the MPLS links. (Strictly speaking, MBGP will pick one exit PE for each destination, and use the MPLS label path built by routing to reach that egress PE.) The bottom arrows in the diagram show this.
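The one-egress-PE behavior can be sketched in a few lines of Python. This is a heavily simplified model, not the real BGP decision process: it compares only local preference and IGP cost to the egress PE, and the names `VpnRoute`, `best_paths`, `PE-A`, `PE-B`, and the `max_paths` knob are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VpnRoute:
    prefix: str      # customer prefix, e.g. a site LAN
    egress_pe: str   # PE that learned the prefix from the CE (hypothetical names)
    local_pref: int  # higher wins
    igp_cost: int    # ingress PE's IGP cost to the egress PE; lower wins


def best_paths(routes, max_paths=1):
    """Return the route(s) an ingress PE would install for one prefix.

    Simplified: rank by local preference, then IGP cost, then PE name as a
    stand-in for the remaining tie-breakers. Only routes that tie with the
    best one on the compared attributes are eligible for multipath.
    """
    ranked = sorted(routes, key=lambda r: (-r.local_pref, r.igp_cost, r.egress_pe))
    best = ranked[0]
    equal = [r for r in ranked
             if (r.local_pref, r.igp_cost) == (best.local_pref, best.igp_cost)]
    return equal[:max_paths]


routes = [
    VpnRoute("10.1.0.0/16", "PE-A", 100, 10),
    VpnRoute("10.1.0.0/16", "PE-B", 100, 10),
]

print([r.egress_pe for r in best_paths(routes)])               # ['PE-A']
print([r.egress_pe for r in best_paths(routes, max_paths=2)])  # ['PE-A', 'PE-B']
```

With the default of one path, all inbound traffic for the prefix lands on one of the two links, even though both are up. Only when the carrier allows more than one path does the second link carry inbound traffic.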
Are there options? Sure, you can do point-to-point GRE tunnels, mGRE tunnels, or forms of IPsec between the CE routers connecting to the MPLS provider. They add overhead of various kinds but allow your IGP (OSPF or EIGRP) to operate more intuitively. You would only be using the MPLS provider’s routing to interconnect your CE (WAN) routers in that case.
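To put a number on the simplest of those overheads, here is a small sketch of the payload MTU left inside a plain GRE tunnel over IPv4. It assumes a 1500-byte path MTU through the carrier (your actual value may differ) and does not attempt to model IPsec, whose overhead depends on the mode and ciphers chosen. The function name `gre_payload_mtu` is made up for illustration.

```python
OUTER_IPV4_HEADER = 20  # bytes, outer IPv4 header without options
GRE_BASE_HEADER = 4     # bytes, GRE header with no optional fields
GRE_KEY_FIELD = 4       # bytes, added only when a tunnel key is used


def gre_payload_mtu(link_mtu=1500, use_key=False):
    """Largest inner IP packet that fits in one GRE-encapsulated packet."""
    overhead = OUTER_IPV4_HEADER + GRE_BASE_HEADER
    if use_key:
        overhead += GRE_KEY_FIELD
    return link_mtu - overhead


print(gre_payload_mtu())              # 1476
print(gre_payload_mtu(use_key=True))  # 1472
```

Inner packets larger than this get fragmented or dropped, depending on the DF bit and how the tunnel is configured, so the tunnel MTU is worth checking before blaming the carrier for lost packets.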
MPLS Goodies
There are some related topics I should bring up, yet do not feel I have much to add to what’s already available on the topics. So I’ll comment briefly and provide you with some links.
Backdoor links. These are non-MPLS links connecting two MPLS VPN sites, or two routers at one logical MPLS site. Some of your MPLS “sites” may be extended, e.g. two buildings with a fiber or MAN link, each with an MPLS VPN router connecting to the same carrier and MPLS VPN. These situations are also known as “backdoor” links from the MPLS perspective. Various quirky routing situations can arise when they are present. The following figure shows a backdoor link between two MPLS sites; alternatively, imagine the rectangles are buildings with a LAN connection, for the one site version of a backdoor link.
See, for example, the Pepelnjak book for more words and another picture. The following link may work for you:
Site Of Origin (SOO). This is an MBGP attribute that the carrier can provide to indicate which site / CE connection a prefix came from. That way, when there is some backdoor link, the carrier can filter to make sure that prefixes received from a site do not get re-advertised to the second router at the site, create a routing loop, etc.
http://www.cisco.com/en/US/docs/ios/12_0s/feature/guide/s_mvesoo.html
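The SOO filtering idea can be sketched as follows. This is a toy model rather than any vendor's implementation: the function name `advertise_to_ce` and the SOO values such as `65000:1` are hypothetical, and the only rule modeled is "do not send a prefix to a CE link tagged with the same SOO the prefix already carries".

```python
def advertise_to_ce(routes, ce_link_soo):
    """Return the prefixes a PE would send to a CE.

    routes is a list of (prefix, soo) pairs; a prefix is skipped when its
    SOO matches the SOO configured on the link to this CE.
    """
    return [prefix for prefix, soo in routes if soo != ce_link_soo]


vpn_routes = [
    ("10.1.0.0/16", "65000:1"),  # originated at site 1 (hypothetical SOO value)
    ("10.2.0.0/16", "65000:2"),  # originated at site 2
]

# The second router at site 1 shares the site's SOO, so it never hears
# site 1's own prefix back from the carrier.
print(advertise_to_ce(vpn_routes, "65000:1"))  # ['10.2.0.0/16']
```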
AS Override. This is a feature the carrier can implement to allow all your sites to use one AS number. The carrier’s PE router replaces each occurrence of the neighbor’s AS number in the AS PATH with its own AS number, as many times as needed. This preserves the AS PATH length but sidesteps the standard EBGP behavior of rejecting inbound prefixes where the AS PATH contains the local AS number. The above SOO feature can be used with AS Override to prevent loops.
http://www.cisco.com/en/US/docs/ios/12_0t/12_0t7/feature/guide/VPN_EN.html
http://www.juniper.net/techpubs/software/junos/junos73/swconfig73-routing/html/bgp-summary6.html
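A small sketch may make the AS Override mechanics more concrete. It models only two things: the EBGP loop check on the receiving CE, and the substitution of the provider's AS number for the customer's. The AS numbers (65001 for the customer, 64500 for the provider) and the function names are assumptions for illustration.

```python
CUSTOMER_AS = 65001  # same AS number at every customer site (assumed value)
PROVIDER_AS = 64500  # carrier AS number (assumed value)


def as_override(as_path, customer_as, provider_as):
    """Replace every occurrence of the customer AS with the provider AS."""
    return [provider_as if asn == customer_as else asn for asn in as_path]


def ce_accepts(as_path, local_as):
    """Standard EBGP loop check: reject a path that contains our own AS."""
    return local_as not in as_path


path_from_site_a = [CUSTOMER_AS]

# The PE prepends its own AS when advertising to the CE at site B.
without_override = [PROVIDER_AS] + path_from_site_a
with_override = [PROVIDER_AS] + as_override(path_from_site_a, CUSTOMER_AS, PROVIDER_AS)

print(without_override, ce_accepts(without_override, CUSTOMER_AS))  # [64500, 65001] False
print(with_override, ce_accepts(with_override, CUSTOMER_AS))        # [64500, 64500] True
```

Both paths have the same length, which is the point: the path still looks just as long to the receiving CE, but it no longer contains the CE's own AS number, so the loop check that would have caught a real loop is no longer in play.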
Allow AS In. Used to build a hub-and-spoke MPLS VPN. The provider configures this when its own AS number might be received in routes passing through two routers (or one router and two logical links) at a customer site. It caps the number of times the AS number can appear, to prevent routing information loops.
http://www.ciscosystems.com.ro/en/US/docs/ios/12_3t/mpls/command/reference/mp_n5gt.html#wp1007547
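The capping behavior can be sketched as a relaxed version of the usual loop check. This is an approximation of the idea, not the exact vendor semantics: the function name `accepts_with_allowas_in`, the `max_occurrences` parameter, and the AS numbers are all assumptions for illustration.

```python
def accepts_with_allowas_in(as_path, local_as, max_occurrences=0):
    """Accept a path if the local AS appears at most max_occurrences times.

    max_occurrences=0 models the standard behavior (any occurrence is
    treated as a loop and the route is rejected).
    """
    return as_path.count(local_as) <= max_occurrences


PROVIDER_AS = 64500  # assumed carrier AS number

# A spoke route that went through the hub site and back to the provider.
via_hub_once = [65001, PROVIDER_AS, 65002]
via_hub_twice = [65001, PROVIDER_AS, 65001, PROVIDER_AS, 65002]

print(accepts_with_allowas_in(via_hub_once, PROVIDER_AS))                      # False
print(accepts_with_allowas_in(via_hub_once, PROVIDER_AS, max_occurrences=1))   # True
print(accepts_with_allowas_in(via_hub_twice, PROVIDER_AS, max_occurrences=1))  # False
```

The last case shows why the cap matters: a route that keeps cycling through the hub accumulates more copies of the AS number and is eventually rejected.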
OSPF sham links. These are a (painful) way the carrier can work around backdoor link problems for a customer. Personally, I think carriers aren’t going to want to do such things at any scale. In which case, avoiding backdoor links becomes your problem. See also the prior articles this blog started out with.
http://www.cisco.com/en/US/docs/ios/12_2t/12_2t8/feature/guide/ospfshmk.html
Final comment … if you have a meshy WAN or MAN topology and want to add L3 MPLS VPN, then be aware you may end up with a lot of backdoor links. You will need to discuss that (and any associated costs) with your potential MPLS VPN providers. They might constitute a case for L2 MPLS VPN instead, or the tunneling alternatives mentioned above.