Need for Speed
Networking professionals need to become aware of networking within the cloud(s), and from cloud to users and customers.
Face it, we’re going to be involved in designing networking support for cloud-based applications. That may be optimistic — “should be involved in designing” may be more like it — but we all know who is going to get called when there’s a performance problem.
So, we’d best have some idea about how various forms of cloud go about networking, if only to be able to discuss, understand, and troubleshoot such applications.
We now have lots of choices concerning compute resources and overall approaches for delivering applications.
That’s a lot of technology. Clearly, I am not going to get very detailed about any of the above in this blog.
The initial driver for cloud was perhaps agility, scale-out, automated scale-up, and high availability. The more recent drivers perhaps add things like matching micro-services coding styles, leveraging platforms or pre-packaged services for faster Time-to-Market, and lowering costs.
I think of this as waves of change, coming faster and faster, solving new problems, or decreasing costs.
Which wave should a technology surfer try to catch? Trying to catch the latest may not make sense — there will always be yet another wave, and it can be easier to pick up new technology if you follow its historical path, since complexity gets added over time.
It’s becoming clearer that there are factors limiting adoption of these cloud variations.
Start-ups are one thing: no history, can move quickly. Existing companies, not so much.
Technology adoption rate / learning is one factor limiting change for existing companies and code. Another is installed code base and budget. Existing companies can only fund and staff a limited amount of change at one time and are consequently forced to prioritize. That does put them at risk of disruption if they don’t prioritize and manage that well.
My sense is that right now, many if not most existing code bases are moving to basic cloud. A VPC is not much of a paradigm shift, whereas leveraging containers well may require significant code rewriting. Serverless requires an even bigger shift, one more likely in a new micro-services-based application (or mobile application) than in a more classic application.
This blog started with designing or troubleshooting a cloud app. Let’s elaborate on that a bit.
The first reason: developers operate at different skill levels. Some (many?) can be somewhat vague about the impact of physical networks, e.g. packet loss, link failure, latency. Cloud brings increased risk around exactly those factors, which can cost a development project a lot of money if it charges off in an ill-conceived direction.
The second reason is that there is currently a lot of early cloud work, “lift and shift”. Most of it doesn’t worry about scaling up, so duplicate addresses, NAT, static routes, and other “quick” networking elements may creep in. Do you know what the alternatives are? The quirks of the various cloud providers?
Yet another reason: the IT world is evolving different ways of doing things, methods not on traditional networking staff’s “radar”.
For example, app developers can do a lot of what I’d call “server-based networking”. As in routing via Vyatta router code or other server-based code, or security via iptables or other access lists. We need to be able to discuss pros and cons.
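To make the iptables point concrete, here is a toy sketch of first-match packet filtering, the evaluation model behind iptables-style access lists. The rule set, field names, and ports are purely illustrative, not any particular production policy.

```python
# Toy sketch: first-match packet filtering, the model behind
# iptables-style access lists. Rules and fields are illustrative.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rule:
    action: str              # "ACCEPT" or "DROP"
    proto: Optional[str]     # None matches any protocol
    dport: Optional[int]     # None matches any destination port

def evaluate(rules, packet, default="DROP"):
    """Return the action of the first matching rule (first match wins)."""
    for r in rules:
        if r.proto is not None and r.proto != packet["proto"]:
            continue
        if r.dport is not None and r.dport != packet["dport"]:
            continue
        return r.action
    return default  # policy when nothing matches

rules = [
    Rule("ACCEPT", "tcp", 443),   # allow HTTPS in
    Rule("ACCEPT", "tcp", 22),    # allow SSH in
    Rule("DROP", None, None),     # explicit deny-all
]

print(evaluate(rules, {"proto": "tcp", "dport": 443}))  # ACCEPT
print(evaluate(rules, {"proto": "udp", "dport": 53}))   # DROP
```

The order sensitivity is the key pro/con to discuss with developers: a rule added in the wrong place silently shadows everything after it.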
Another example: I’ve been reading up and dabbling with containers on my Mac. Reading about service mesh frameworks introduced me to container sidecars. Wow!
All that has an architectural level impact, possibly changing how networking + dev teams build things.
We’ve heard that “networking people should learn to code”.
Coding to automate the provisioning and operational aspects of cloud compute and networking is one good reason to know how to code. But more than that, being able to communicate with coders and application architects / developers and have some idea what they’re talking about is likely a fairly crucial skillset as well. It’s not just network tool coders we’ll need to talk to! We need to broaden that to app dev teams!
Specifically, which of the cloud technologies should a networking person become more familiar with? What should one’s learning / hands-on / experience priority be?
I can’t really answer that, other than with the classic consultant response: “it depends”. The obvious answer is: whatever your employer or customer needs, or is already doing. Beyond that, instead of analysis paralysis, do something. Starting with classic VPCs may be the simplest, and provides a good foundation for learning about the others.
I couldn’t decide where to put Edge Computing in the above list, so I put it at the end. For the other items, the trend seems to be smaller, faster, cheaper. Edge is more about low latency and proximity to IoT devices doing real-time calculations. That’s qualitatively different.
I currently view Edge as “some of the same stuff, different place”. And evolving. Probably different tools and needs.
I’ve heard the comment that networking for the cloud technologies is pretty much the same. That’s partially true, but an oversimplification. The network has to provide connectivity to the cloud, and within the cloud there is virtual networking that acts somewhat like what we’re used to.
Cloud compute and containers do have routing in some form, access list capabilities, load balancing, and usually some NAT, for outbound Internet access and perhaps other uses, plus public IPs for inbound Internet access to applications.
However, there can be hidden surprises. One of them is the AWS limit on VPC (Virtual Private Cloud) to VPC routing: you can write rules to route from VPC A to B, and from B to C, but your traffic will not be able to transit from A to C via B. AWS Transit Gateway reportedly now addresses this issue, albeit with some constraints.
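The non-transitive behavior can be sketched in a few lines: traffic is only delivered across a peering that directly connects the source and destination VPCs, so a shared neighbor does not imply reachability. VPC names here are made up for illustration.

```python
# Sketch of non-transitive VPC peering: a packet crosses at most one
# peering, so a middle VPC will not forward traffic between its peers.
# VPC names are illustrative.
peerings = {("A", "B"), ("B", "C")}   # A<->B and B<->C peered; A<->C is not

def can_reach(src, dst):
    """Reachable only over a direct peering; no transit via a middle VPC."""
    return (src, dst) in peerings or (dst, src) in peerings

print(can_reach("A", "B"))  # True
print(can_reach("A", "C"))  # False: B will not transit A's traffic to C
```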
Microsoft Azure has per-interface routing tables. Put differently, its virtual router behaves a bit differently than we might expect.
Another difference is that broadcast and multicast may well not be emulated, meaning EIGRP or OSPF hellos don’t work with virtual routers. And since you won’t get link-down conditions, you may need BFD to quickly detect loss of a neighbor.
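Why BFD matters here can be shown with a hold-timer sketch: without link-down events, the loss of a neighbor is only noticed when its hold timer expires, so detection speed is entirely a function of timer values. The timer numbers below are illustrative; BFD applies the same logic at millisecond rather than second intervals.

```python
# Sketch of hello / hold-timer neighbor detection. With no link-down
# signal, a dead neighbor is only detected when the hold timer expires.
# Timer values are illustrative.
def neighbor_alive(last_hello, now, hold_time):
    """Neighbor is considered up until hold_time passes with no hello."""
    return (now - last_hello) < hold_time

# OSPF-style timers (dead interval ~40s): slow detection
print(neighbor_alive(last_hello=100, now=130, hold_time=40))   # True
print(neighbor_alive(last_hello=100, now=145, hold_time=40))   # False

# BFD-style timers (e.g., 3 x 300ms): sub-second detection
print(neighbor_alive(last_hello=100.0, now=100.5, hold_time=0.9))  # True
```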
Update: While this blog was sitting in our publication queue, Ivan Pepelnjak and Daniel Dib have clearly been digging into what happens “under the hood” in various clouds, judging by their tweets and blogs. Ivan has webinars posted on AWS and Azure networking, and Daniel has started a Slack channel on the topic. I’m glad they were able to explore in depth and share their findings, especially regarding things we’re used to doing that may not work as expected in the cloud.
A different thought is that we networking people have to know enough about how things work to ask the right questions. To me, application flows are key. Who talks to who? Where are the resources in question located? Does the service or micro-service auto-scale up and down? How is it monitored? Etc.
Here’s another thing you might find somewhat unexpected: in VXLAN designs we might run BGP to the Top of Rack (ToR) switches. Cumulus and Dinesh Dutt’s book discuss that, along with some interesting enhancements to BGP, and large data centers running BGP to the servers themselves. In a recent Arista webinar, I became aware of Tigera Calico, where each compute node acts as a router for the prefixes / endpoints on that server. Similar idea, different context.
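The per-node routing model above can be sketched simply: each compute node advertises host routes (/32s) for the workloads it hosts, and the fabric forwards to whichever node advertised the destination. Node names and addresses here are made up for illustration.

```python
# Sketch of the Calico-style model: each compute node acts as a router,
# advertising /32 host routes for the workloads it hosts. The fabric
# forwards to the advertising node. Names and addresses are illustrative.
advertised = {
    "node1": ["10.0.1.10/32", "10.0.1.11/32"],
    "node2": ["10.0.2.20/32"],
}

def next_hop_node(dest_ip):
    """Return the node that advertised the host route for this workload."""
    for node, prefixes in advertised.items():
        if dest_ip + "/32" in prefixes:
            return node
    return None  # no node advertises this endpoint

print(next_hop_node("10.0.2.20"))  # node2
```

Note the trade-off this implies: lots of host routes in the fabric, in exchange for no overlay encapsulation between nodes.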
Challenge: Pros, cons, what’s your reaction upon reading this? Quick, your manager just called you in and asked for your opinion!
Another item that might be novel to a networking person is the idea of ephemeral services, e.g. containers that spin up and down. IP addresses are automatically assigned, and instances come and go. That is usually front-ended by a services manager, which tracks which containers are running and what services they can provide, so that one service can find another. Upon spinning up, a service registers with the manager; consumers of a service then query the manager (service discovery) to learn how to contact a currently active provider.
I’ve been thinking of the services manager as a sort of automated service load balancer, based on service names rather than virtual IPs. It tracks the containers that can provide a given service and answers DNS queries to steer consumers to the services they need. There may well be NAT involved, depending on the features in use.
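The register / discover flow described above can be sketched as a minimal registry with round-robin answers, roughly what rotating DNS responses give you. Real systems (Consul, etcd, Kubernetes DNS) add health checks and TTLs; the class, service name, and addresses below are all illustrative.

```python
# Minimal sketch of service registration and discovery, assuming a
# round-robin answer per query (like rotating DNS responses).
# Service names and addresses are illustrative.
class ServiceRegistry:
    def __init__(self):
        self.instances = {}   # service name -> list of addresses
        self.counters = {}    # per-service round-robin position

    def register(self, name, addr):
        """Called by a service instance when it spins up."""
        self.instances.setdefault(name, []).append(addr)

    def deregister(self, name, addr):
        """Called (or triggered by health checks) when an instance goes away."""
        self.instances.get(name, []).remove(addr)

    def discover(self, name):
        """Consumers ask for a currently active provider; round-robin."""
        addrs = self.instances.get(name)
        if not addrs:
            return None
        i = self.counters.get(name, 0)
        self.counters[name] = i + 1
        return addrs[i % len(addrs)]

reg = ServiceRegistry()
reg.register("orders", "10.0.0.5:8080")
reg.register("orders", "10.0.0.6:8080")
print(reg.discover("orders"))  # 10.0.0.5:8080
print(reg.discover("orders"))  # 10.0.0.6:8080
```

From a networking standpoint, the point is that the “load balancer” here is a name service plus a database, not a middlebox in the data path.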
Networking people also need to be prepared to use virtual network devices. We might want fancier routing, VPN, or firewall functions than the cloud provider supplies, for instance.
Change seems to be happening fast in terms of connecting to cloud. Some sites have been putting in circuits to connect into various cloud networks. That takes time. It may have cost / bandwidth advantages, especially at higher speeds. Leveraging services like Equinix Cloud Exchange can be more agile.
Note: As of April / May 2019, Equinix has some very interesting and much more agile virtual networking offerings in progress. See my recent blog about Network Edge.
The CSPs (Cloud Service Providers) provide basic VPN access (and may well generate sample configurations for your Cisco or Juniper router). Putting a virtual router or SD-WAN device into the cloud lets you do more sophisticated routing and VPN access, supporting both agility and hybrid cloud. For what it’s worth, Cisco Live had several sessions with slide decks including sample CSRv configurations.
Fallacies of Distributed Computing — I’ve seen this before, but thanks to Ivan Pepelnjak for the reminder. Since I saw it in his blog, it seems to have popped up all over the container reading I’ve since been doing.
Comments are welcome, both in agreement or constructive disagreement about the above. I enjoy hearing from readers and carrying on deeper discussion via comments. Thanks in advance!
Hashtags: #CiscoChampion #TechFieldDay #TheNetCraftsmenWay #Cloud #SaaS #VPC
Did you know that NetCraftsmen does network / datacenter / security / collaboration design / design review? Or that we have deep UC&C experts on staff, including @ucguerilla? For more information, contact us at firstname.lastname@example.org.