That’s “Ethernet over MPLS” to you! My blog from a couple of days ago mentioned that the Catalyst 6500 with Sup720 has great forwarding performance with EoMPLS, without adding any specialized hardware such as the SIP or ES cards. See https://netcraftsmen.com/blogs/entry/extended-vlan-mitigation.html for the context.
In this blog, I intend to write (briefly, I think I can manage briefly…) about the basics, then show the lab tested configuration snippets.
What is EoMPLS?
I think of EoMPLS as a virtual patch cord. Frames go in a port on a switch, get transported via MPLS labels over an IP routed network, and get spit out a port on another switch. The term “pseudowire” (“PWE”) is also used for EoMPLS or similar “virtual patch cord” functionality.
EoMPLS comes in two flavors: port-based (everything goes on a trunk port, preserving 802.1q VLAN info) or VLAN-based (traffic in an access VLAN gets transported). VLAN-based can transport different VLANs to different endpoints, or mix and match L2 and L3 activities on a port (via subinterfaces). VLAN-based EoMPLS is configured using dot1q subinterfaces. I may include configuration samples in a second blog.
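A minimal sketch of the VLAN-based form follows. It is not taken from the lab captures: the VLAN number (100) and VC ID (300) are hypothetical, and the peer address is simply reused from the port-based examples.

```
! VLAN-based EoMPLS sketch: only frames tagged with VLAN 100 are transported
interface GigabitEthernet4/1
 mtu 9216
 no ip address
!
interface GigabitEthernet4/1.100
 ! match the 802.1q tag for the VLAN to be carried (hypothetical VLAN 100)
 encapsulation dot1Q 100
 ! hypothetical VC ID 300; the far end must use the same number
 xconnect 10.159.0.2 300 encapsulation mpls
```

Other subinterfaces on the same physical port could then carry different VLANs to different endpoints, or be given IP addresses and routed normally.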
Why EoMPLS? It is downright handy when you have a VLAN that needs to be HERE and THERE, with a routed network in the middle. For example, cluster heartbeat between servers in two data centers. You do need MPLS capable switches that can also do EoMPLS at wire speed. Unlike MPLS VPN, there is no requirement for MBGP.
Note that the technology will in principle do that for you. I’m not necessarily recommending you do that, because the cluster may behave oddly if the network between HERE and THERE goes flaky, drops packets for a while, etc. But EoMPLS is a great thing to have in your bag of tricks, for when you need it.
The other warning about EoMPLS is it might be a bit dangerous to your network’s health. It is so easy to set up that you can easily dig yourself a nice big hole, full of L2 over L3 “spaghetti”. Which complicates troubleshooting, defeats structured design, ignores the carefully crafted routed core in your network, and causes warts. Well, 3 out of 4 anyway.
EoMPLS versus VPLS
Like QinQ tunneling (802.1q tunneling of switched traffic), EoMPLS just takes the frame that comes in a port (somehow), transports it across the middle, and spits it out the paired port on the other end. There is no examination of MAC address, no learning of source MAC address, in short, no switching logic applied.
VPLS is Virtual Private LAN Service. It is EoMPLS on steroids. In VPLS, the switch can have several EoMPLS tunnels and make a switching decision, as to which one to use. VPLS in the 6500 (7600) requires extra hardware assistance. It does not work in a “vanilla” 6500 (assuming the 6500 can somehow be considered vanilla is kind of a big assumption, I know).
Poor Man’s VPLS
If you want switching logic, you can cheat a bit. If you cable the port doing EoMPLS to another switch, that other switch does normal switching. You can even cable the EoMPLS to a non-EoMPLS port on the same switch. This is called a “loopback cable”. (I do wish folks used a term that couldn’t be confused with a loopback interface. Something like “humdinger” or “frobozz” perhaps?) This is mildly wasteful of ports, but works just fine.
So if you want to transport traffic from any of a bunch of ports on a switch to another switch using EoMPLS, add one more port to the VLAN, cable it physically to the EoMPLS port, and it’ll work.
Lab Layout
Three switches, swA, swB, and swC. Switches swA and swB are connected to swC (“C as in Core”).
I enabled MPLS globally and on the uplink / downlink interfaces between the swA, swB, and swC switches. In practice, one would enable them on all infrastructure uplinks, downlinks, and crosslinks in the distribution and core layers, and to selected closets if you need EoMPLS anywhere, anytime. The technique does not work if there is a routed path along which MPLS is not enabled.
Note: to support the MPLS labels, you’ll want jumbo support consistently configured throughout as well. This requires per-interface configuration on uplinks, downlinks, and crosslinks. (The same interfaces that will be doing MPLS labels.)
Sample configuration for doing this:
mpls ip
!
int Ten1/1
mtu 9216
mpls ip
!
int Ten1/2
mtu 9216
mpls ip
(The interfaces or port channels required vary with the switch connections, of course.)
Do NOT configure MPLS on the port that will be connected at L2 via EoMPLS. Just the paths in between the two endpoints (along all reasonable routes).
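Before adding any xconnects, it is probably worth confirming that label switching is really running on every infrastructure link. Commands along these lines should do it (output not captured in the lab, so none is shown here):

```
! which interfaces have MPLS enabled and operational
show mpls interfaces
! label distribution sessions with the directly connected neighbors
show mpls ldp neighbor
! labels actually installed for forwarding, including the remote loopbacks
show mpls forwarding-table
```

If an uplink is missing from the first command, or a neighbor is missing from the second, that link is a candidate for the "routed path along which MPLS is not enabled" problem mentioned above.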
I created two xconnect pseudowires, one from swA to swC, the other from swB to swC. As noted, the pseudo-wires can either be port-based (all VLANs) or VLAN-based (just one VLAN). I tried it both ways. Due to late hours, I did not test VLAN-based extensively.
Port-Based EoMPLS
The first xconnect went from Gig 4/1 to Gig 2/2, the second from Gig 4/1 to Gig 2/3 (swA and swB to swC). The addresses shown are the loopback address for the switch on the other end. The number is the circuit ID, which allows the two switches to recognize the two ends of one connection.
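Each switch has to be able to reach the other end's loopback, and a label has to be advertised for it. The lab loopback configuration was not captured, so the following is only a sketch of what swC's side might look like, assuming the interface is named Loopback0 and that 10.159.0.2 is configured as a /32 host address:

```
! swC (sketch): the address that swA and swB name in their xconnect commands
interface Loopback0
 ip address 10.159.0.2 255.255.255.255
!
! assumption: pin the LDP router ID to the same loopback
mpls ldp router-id Loopback0
```

The loopback also needs to be advertised by whatever routing protocol runs between the switches.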
Here’s some captured text from switch B:
interface GigabitEthernet4/1
mtu 9216
no ip address
xconnect 10.159.0.2 201 encapsulation mpls
spanning-tree portfast trunk
swB#show mpls l2 vc
Local intf     Local circuit         Dest address    VC ID      Status
-------------  --------------------  --------------- ---------- ----------
Gi4/1          Ethernet              10.159.0.2      201        UP
swB#show mpls l2 bind
  Destination Address: 10.159.0.2,  VC ID: 201
    Local Label:  72
        Cbit: 1,    VC Type: Ethernet,    GroupID: 0
        MTU: 9216,   Interface Desc: w-ecc-01 G06-314 U31 ONBD1
        VCCV: CC Type: RA [2]
              CV Type: LSPV [2]
    Remote Label: 76
        Cbit: 1,    VC Type: Ethernet,    GroupID: 0
        MTU: 9216,   Interface Desc: n/a
        VCCV: CC Type: RA [2]
              CV Type: LSPV [2]
And for switch A:
…
interface GigabitEthernet4/1
mtu 9216
no ip address
xconnect 10.159.0.2 200 encapsulation mpls
spanning-tree portfast trunk
end
swA#show mpls l2 vc
Local intf     Local circuit         Dest address    VC ID      Status
-------------  --------------------  --------------- ---------- ----------
Gi4/1          Ethernet              10.159.0.2      200        UP
And for switch C:
…
interface GigabitEthernet2/2
mtu 9216
no ip address
xconnect 10.112.0.128 200 encapsulation mpls
spanning-tree portfast trunk
…
interface GigabitEthernet2/3
mtu 9216
no ip address
xconnect 10.112.0.130 201 encapsulation mpls
spanning-tree portfast trunk
swC#show mpls l2 vc
Local intf     Local circuit         Dest address    VC ID      Status
-------------  --------------------  --------------- ---------- ----------
Gi2/2          Ethernet              10.112.0.128    200        UP
Gi2/3          Ethernet              10.112.0.130    201        UP
Note the VCID is 200 for one xconnect, 201 for the other. These have to be different for each pseudo-wire, and are used by the endpoints to match up xconnect commands. That is, the two ends of an xconnect must agree on the VCID number.
Troubleshooting EoMPLS
Along the way, I ran into two problems. The first took some time to resolve. The xconnect was not coming up, and debug showed an authorization problem. It turned out the MTU on the physical port (for port-based) has to be set, and to at least 1504, to accommodate VLAN 802.1q tagging. Subsequent experimentation showed that the xconnect verifies that the two end physical ports have the same MTU. (This would be a lovely nasty time-killer for a CCIE lab test, I suspect.)
Caution: not all 6500 blades support jumbos (8000 to 9216 byte MTU). See http://www.cisco.com/warp/public/473/103.pdf. The 6748 line cards do, the 6148A series does, the 6148 and 6548 do not.
The second (minor) problem was that the xconnect does not come up unless the physical port is up (something connected to it, link status). If you’re trying to configure it without two devices plugged into the two ends, it’s going to be difficult to see that word “up”. (I was somewhat expecting it to behave more like a GRE tunnel: configure it and if it is happy, it shows as “up”.)
I mention these as possible gotchas when doing xconnect. They could consume time trouble-shooting if you don’t expect this behavior.
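When an xconnect refuses to come up, comparing what each end is advertising is a reasonable first step. A sketch of the commands I would reach for, using VC ID 201 from the swB example (no output shown, since none was captured for a failing case):

```
! per-VC detail, including signaling state and local/remote labels
show mpls l2transport vc 201 detail
! local and remote bindings side by side; the two MTU values should match
show mpls l2transport binding 201
! confirm the attachment port itself is up/up and check its configured MTU
show interfaces GigabitEthernet4/1
```

A missing remote label, or differing local and remote MTU values in the binding output, points at the first problem described above; a down attachment port points at the second.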
Testing EoMPLS
This was tested by plugging in my two test PCs, addressed with 1.1.1.10 and 1.1.1.11. Those were certainly not in the global routing table in the lab.
When I did so, I could ping between the PCs, despite their being connected to two different switches with only routing of 10.0.0.0 networks in between. I varied the PC connection points to test all combinations (pairs of ports). They worked. (Output not captured.)
I also verified that when one PC was on swA and the other on swB, they could not ping each other (nor even ARP each other). There is no local switching of traffic coming out one pseudowire back into another. (For that, SIP or ES hardware with VPLS functionality is required.)
However, I patched port 2/2 on swC to 2/5, and 2/3 to 2/6, and did “no shutdown” on the latter two ports. They defaulted to both being in VLAN 1. I was then able to ping between the edge-connected PCs. Neat!
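In the lab the two extra ports simply took their defaults. Spelled out explicitly, the swC side of the "loopback cable" arrangement would look something like the sketch below; the switchport commands and descriptions are my assumptions about what the defaults amount to, not captured configuration.

```
! swC (sketch): ordinary access ports, each cabled to one xconnect port
interface GigabitEthernet2/5
 description patched to Gi2/2 (xconnect to swA)
 switchport
 switchport mode access
 switchport access vlan 1
 no shutdown
!
interface GigabitEthernet2/6
 description patched to Gi2/3 (xconnect to swB)
 switchport
 switchport mode access
 switchport access vlan 1
 no shutdown
```

Because Gi2/5 and Gi2/6 share a VLAN, swC switches frames between them normally, which is what joins the two pseudowires together.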
The following capture shows that ports 2/5 and 2/6 were doing normal MAC-based LAN switching, and there was no MAC learning on ports 2/2 and 2/3:
Displaying entries from Line card 2:
Legend: * - primary entry
        age - seconds since last seen
        n/a - not available

  vlan   mac address     type    learn     age              ports
------+----------------+--------+-----+----------+--------------------------
Module 2:
*    1  000b.db99.6dec   dynamic  Yes        220   Gi2/5
swC#show mac-address-table dyn int gig 2/6
Displaying entries from Line card 2:
Legend: * - primary entry
        age - seconds since last seen
        n/a - not available

  vlan   mac address     type    learn     age              ports
------+----------------+--------+-----+----------+--------------------------
Module 2:
*    1  0015.c5b5.f3e4   dynamic  Yes        225   Gi2/6
swC#show mac-address-table dyn int gig 2/6 2
Legend: * - primary entry
        age - seconds since last seen
        n/a - not available

  vlan   mac address     type    learn     age              ports
------+----------------+--------+-----+----------+--------------------------
No entries present.
swC#show mac-address-table dyn int gig 2/2 3
Legend: * - primary entry
        age - seconds since last seen
        n/a - not available

  vlan   mac address     type    learn     age              ports
------+----------------+--------+-----+----------+--------------------------
No entries present.
Once this was working, I was pleased with the extreme simplicity of adding xconnects.
Note also that enabling jumbos and MPLS on the infrastructure only needs to be done once, no matter how many xconnects are to be built for various purposes. For new deployments and upgrades, we now enable jumbos on the infrastructure since consistently doing it supports all sorts of later needs. And doing it in ad hoc fashion is an invitation to all sorts of fun, especially with OSPF. (Adjacencies stay up with MTU mismatches, but won’t come up after the link bounces — even months after you changed the configuration.)
Note: redundant xconnects for High Availability require special handling of Spanning Tree. Note however that re-routing and MPLS will keep an xconnect up if there is any routed path fully supporting MPLS between the endpoints, so xconnects should be rather robust.
References
EoMPLS: http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/HA_Clusters/HA_AppA.pdf
Configuration Guide: http://www.cisco.com/en/US/docs/switches/lan/catalyst6500/ios/12.2SXF/native/configuration/guide/pfc3mpls.html
If you have "switchport" applied on the EoMPLS interface, "xconnect" is not an option. It must be applied to "no switchport" interfaces. That is, in interface configuration mode, configure "no switchport" and then you will be able to complete the configuration shown above for port-based EoMPLS xconnect.
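Put together with the port-based configuration from the post, the order of commands would look roughly like this on swB. It is a sketch: the address and VC ID are the ones from the post's swB capture, and only the "no switchport" line is new.

```
interface GigabitEthernet4/1
 ! must come first; xconnect is not offered on a switchport
 no switchport
 mtu 9216
 no ip address
 xconnect 10.159.0.2 201 encapsulation mpls
```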
...slight exaggeration, but thank you Pete for explaining in a few lines what several hundred pages of documentation do not!
I am still a little confused on MTU settings though. I am using your ‘loopback’ solution to connect ports on 2 6500 switches. So I have a pair of core links, a pair of ‘xconnect’ ports and a pair of ‘loopback’ ports. This works brilliantly up to a certain packet size. What is the correct MTU setting for each if I only want to transport normal IP packets of 1500 bytes? (It seems the proposed solution of enabling jumbo frames will create an issue if you actually want to transport jumbo datagrams.)
The port configured with xconnect needs to have MTU 1504 or greater, to support VLAN tags, if you’re doing port-based EoMPLS. The two end ports have to match each other.
The MPLS side of things needs MTU of at least say 1508 for label stacks, I’d go with at least 1520 to allow for more labels and/or 802.1q tags in the future.
Well something wasn’t clicking Pete, so I did some lab testing and found the source of my confusion.
I am running 12.2(33)SXI on 6500s and it’s clear that the ‘mpls mtu’ command DOES NOT DO ANYTHING. It is still accepted but makes absolutely no difference to what packets are forwarded.
By using just ‘mtu’ however I was able to work out the minimum values – it turns out 1518 is sufficient for EoMPLS. The caveat here is that I am using a loopback configuration but only on an access port – I haven’t tried to forward a trunk port, so there may be additional overhead for tags.
Without ‘mtu 1518’ but with say ‘mpls mtu override 1580’ (the maximum) on ingress and egress interfaces on both ends the largest transportable packet was 18 bytes less than without EoMPLS.
This makes sense in retrospect.. my ‘transported’ packets are 1514 bytes on the wire – a standard ethernet packet + an additional 4-byte label, so the MTU (which is L3 payload in this context) is 1514+4=1518.
Hope this helps somebody as your article helped me – I couldn’t find this in Cisco’s documentation, in fact it seems like a bug.
Ah, I see. Yes, you need to set the MTU, the interface MTU.
Last time I looked, MPLS MTU sets the MPLS payload MTU downwards from the default, which isn’t exactly what you usually want to do. Ok, just checked: http://www.cisco.com/en/US/docs/ios/12_2sb/feature/guide/newmtu.html says you can set the MPLS MTU higher than the interface MTU on certain interfaces, but all sorts of bad things might happen. It also describes labeled packets as getting fragmented, which strikes me as unlikely (I wouldn’t expect label forwarding to be capable of that). I would more expect IP packets that were about to have a label stack added to be subject to such fragmentation, and that to have to be handled by the CPU.