Practice Safe BGP

This blog covers a “fun BGP story” (call it Case Study #1), a recent design situation involving local-as (Case Study #2), and some lessons learned.

One theme is that local-as can be fairly handy, so handy there might be a danger of over-using it!

Case Study #1: BGP Story

In working a while ago with a large company I’m going to refer to as “Company X”, I found some interesting paths in their network. This is my recollection of what I saw (probably altered by time).

Company X’s Call Centers were generally connected via two routers and two MPLS providers, the well-known A and B. A large number of the business sites used point-to-point IPsec VPN. The datacenter hubs for VPN used separate dedicated Internet links. All sites were assigned private site ASNs.

There were also Internet connections via carriers A and B, but that doesn’t really impact this tale.

Here is what the topology looked like:

I noticed some Call Center router BGP table entries with long AS paths. On closer inspection, they contained multiple occurrences of Carrier B’s ASN. When I asked about it, it turned out that Carrier B insisted on using its public ASN for both of its MPLS instances. Normally, EBGP loop prevention would then drop routes crossing from one instance to the other, but the Call Center sites had to reach the other sites (check availability, customer support, etc.). To address this, the carrier had reportedly configured allowas-in.

The primary BGP paths were as I expected. Some paths, like the one shown in green in the diagram above from Site B to the bottom site, did have the carrier ASN in them twice.

What was striking were some of the alternative paths in the BGP table, such as the ones shown in red and blue.

The allowas-in was allowing paths that hit MPLS B twice!

The red path indicates a related design problem. Site A had one connection down, and the best path to it (I don’t recall the details why) was transiting another site. That’s probably not what was intended.

One of my personal BGP rules is to be very controlling about prefixes, to reduce the number of surprises you can get. In particular, I prefer to filter BGP on the principle that a site should only advertise its own prefixes, with good address design, and preferably just one summary prefix. When you have a site with dual BGP connections, that keeps it from being a transit site for traffic going elsewhere.
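To make the “advertise only your own prefixes” rule concrete, here is a minimal Python sketch of the check an outbound filter performs. The site summary 10.20.0.0/16 and the candidate prefixes are made-up values for illustration, and the function name is hypothetical. On a real router this logic would live in a prefix list or route map, not in a script.

```python
import ipaddress

# Assumption: this site owns one summary block (hypothetical addressing).
SITE_SUMMARY = "10.20.0.0/16"

def allowed_to_advertise(prefix, site_summary=SITE_SUMMARY):
    """Return True only if the prefix falls inside the site's own summary."""
    net = ipaddress.ip_network(prefix)
    summary = ipaddress.ip_network(site_summary)
    # subnet_of() raises TypeError on mixed IPv4/IPv6, so compare versions first.
    return net.version == summary.version and net.subnet_of(summary)

# Routes the site router knows about, including ones learned from elsewhere.
candidates = ["10.20.0.0/16", "10.20.5.0/24", "10.30.0.0/16", "0.0.0.0/0"]
advertised = [p for p in candidates if allowed_to_advertise(p)]
print(advertised)  # ['10.20.0.0/16', '10.20.5.0/24']
```

Anything learned from the other provider (10.30.0.0/16, the default route) is never re-advertised, so the site cannot become transit by accident.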

Datacenter or hub sites are a special case. I prefer to have them advertise a corporate summary, where possible. That way, they can be transit between two sites, one of which has lost connectivity on the A side, the other on the B side.

If you have the allowas-in situation above, you could use an AS-path regex to filter any prefix whose AS path contains three or more occurrences of Carrier B’s ASN.
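As a rough illustration of that filter, the Python sketch below flags AS paths that contain the carrier ASN three or more times. The ASN 64500 is a placeholder from the documentation range, not Carrier B’s real ASN, and the function name is hypothetical. Router regex syntax differs from Python’s, so treat this as a way to test the idea, not as a pattern to paste into an AS-path access list.

```python
import re

CARRIER_B_ASN = 64500  # assumption: placeholder ASN for illustration only

def hits_carrier_too_often(as_path, asn=CARRIER_B_ASN, limit=3):
    """Return True if `asn` appears `limit` or more times in the AS path string."""
    # \b keeps 64500 from matching inside longer numbers such as 164500.
    pattern = rf"(?:\b{asn}\b.*){{{limit}}}"
    return re.search(pattern, as_path) is not None

paths = [
    "64500 65010",                          # normal: one pass through carrier B
    "64500 65020 64500 65010",              # two passes: the allowas-in case
    "64500 65020 64500 65030 64500 65010",  # three passes: filter this one
]
for path in paths:
    print(path, "->", "deny" if hits_carrier_too_often(path) else "permit")
```

Lowering `limit` to 2 would also reject the paths that allowas-in was meant to permit, so the threshold has to match how many times a legitimate path crosses the carrier.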

A more recent technique: you could use local-AS and replace-AS to substitute a private ASN for carrier B’s. That would allow you to filter on the local carrier instance.

And by the way, don’t use “allowas-in” unless you absolutely have to. You’re going against the very useful capability of EBGP to prevent recirculating routes. If you do use “allowas-in”, you should probably be thinking about doing your own recirculation filtering.
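To show what allowas-in gives up, here is a simplified Python model of the EBGP loop check. The function name, the placeholder ASN 64500, and the `allowas_in` parameter are illustrative assumptions. Real implementations have more options and vendor-specific defaults, so check your platform’s documentation.

```python
def accepts_route(as_path, own_asn, allowas_in=0):
    """Simplified EBGP loop check.

    By default a router rejects any route whose AS path already contains
    its own ASN. allowas_in relaxes that to "at most N occurrences".
    """
    return as_path.count(own_asn) <= allowas_in

CARRIER_ASN = 64500  # assumption: placeholder ASN

# A route that has already crossed the carrier once comes back to it.
looped_once = [65001, CARRIER_ASN, 65010]
print(accepts_route(looped_once, CARRIER_ASN))                # False: loop prevented
print(accepts_route(looped_once, CARRIER_ASN, allowas_in=1))  # True: check relaxed

# With the check relaxed, nothing but your own filtering stops further laps.
looped_twice = [CARRIER_ASN, 65001, CARRIER_ASN, 65010]
print(accepts_route(looped_twice, CARRIER_ASN, allowas_in=2))  # True
```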

Case Study #2: Local AS

This second case study is from a recent WAN redesign. The following diagram suggests what a site looks like, stripped of non-essentials.

Each WAN site is assigned a private BGP ASN. For example, 65001 in the diagram below. The company also has an assigned ASN, which we’ll pretend is 65002.

The Internet-facing VPN router is configured with the assigned ASN and uses it for EBGP to the Internet peer. Toward the site WAN module router on the left, it uses local-as with the no-prepend and replace-as options, presenting the private DMVPN ASN 65003. That makes the peering EBGP.
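The following Python sketch models how local-as and its no-prepend and replace-as options are commonly described as affecting the AS path. It reflects my understanding of typical Cisco-style behavior, not something verified against the code in this design, so treat it as an assumption and check your platform’s documentation. The function names and the upstream ASN 64510 are made up. The ASNs 65001, 65002, and 65003 are the ones used in the text above.

```python
def path_sent(as_path, real_asn, local_as=None, replace_as=False):
    """AS path advertised to an EBGP neighbor (assumed behavior)."""
    if local_as is None:
        return [real_asn] + as_path        # ordinary EBGP: prepend the real ASN
    if replace_as:
        return [local_as] + as_path        # real ASN hidden behind local-as
    return [local_as, real_asn] + as_path  # plain local-as: both ASNs visible

def path_received(as_path, local_as=None, no_prepend=False):
    """AS path stored for a route learned from that neighbor (assumed behavior)."""
    if local_as is None or no_prepend:
        return list(as_path)               # path kept as received
    return [local_as] + as_path            # plain local-as prepends on input too

REAL_ASN, DMVPN_ASN, SITE_ASN = 65002, 65003, 65001

# VPN router advertising a route with a hypothetical upstream path to the site WAN router.
print(path_sent([64510], REAL_ASN))                                        # [65002, 64510]
print(path_sent([64510], REAL_ASN, local_as=DMVPN_ASN))                    # [65003, 65002, 64510]
print(path_sent([64510], REAL_ASN, local_as=DMVPN_ASN, replace_as=True))   # [65003, 64510]

# VPN router learning the site's own route from the site WAN router.
print(path_received([SITE_ASN], local_as=DMVPN_ASN))                       # [65003, 65001]
print(path_received([SITE_ASN], local_as=DMVPN_ASN, no_prepend=True))      # [65001]
```

Under these assumptions, using both options means the site WAN router sees only 65003 in the path and never the assigned ASN.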

It also uses local-as with the DMVPN ASN to the DMVPN hub, which does likewise. That is, all DMVPN routers appear to be ASN 65003. That allows the hub to operate as an IBGP peer and route reflector for DMVPN sites (Note: This is a fairly recent feature, so check for code supporting it).

The customer and I did discuss the fact that local-as was originally intended for migration scenarios, but it turns out to be very handy for other things.

Is using it that way a good idea? I do suspect my “beer principle” might apply (one beer = good, many beers = headache). Over-using network features can lead to headaches and problems. Using features in ways the vendor likely didn’t anticipate or test can also painfully find bugs (that probably also applies to mis-use of beer, but I don’t think we want to go there as an analogy). In this case, we’re stretching the original use case for local AS.

Lessons to be Learned Here

Sites should generally advertise their own prefixes only in BGP, unless you want them to be transit.

You can also think of this as a form of “manual split horizon” when there are two WAN providers. Sites should learn prefixes at other sites directly from the provider connection, not via some other round-about path.

Failure to do this, particularly with a provider doing allowas-in, clutters up the BGP table with ridiculous paths. That doesn’t help memory and CPU on the router, nor is it likely to help convergence time and troubleshooting time.

Comments

Comments are welcome, whether in agreement or constructive disagreement with the above. I enjoy hearing from readers and carrying on deeper discussion via comments. Thanks in advance!

—————-

Hashtags: #CiscoChampion #TechFieldDay #TheNetCraftsmenWay #BGP

Twitter: @pjwelcher

NetCraftsmen Services

Did you know that NetCraftsmen does network / datacenter / security / collaboration design / design review? Or that we have deep UC&C experts on staff, including @ucguerilla? For more information, contact us at info@ncm2020.ainsleystaging.com.
