If you think about it, one problem in networking is that there are too many applications, and too little time. I greatly enjoyed a recent blog by @RealLisaC, seasonally titled The Ghost of Application-Aware Networking. Excellent reading! It can be found at http://theborgqueen.wordpress.com/2013/10/29/the-ghost-of-application-aware-networking/. The following blog represents various thoughts that it and my other reading triggered — which qualifies it as thought-provoking!
The overall theme is Cisco’s experiences with application focus in the network, and what that might mean in light of the big Cisco announcements November 6 in New York City. (Which they’re being unusually secretive about, by the way.)
For what it's worth, I'll be attending The Big Insieme / Cisco Event as part of a small Tech Field Day group. I expect we'll all be feverishly tweeting and blogging our impressions as time permits. We'll also be speaking our minds in a roundtable discussion that will be made available.
My perspective is that Cisco's prior application efforts were always a bit too little, too late. And painful for most people to use.
The above blog confirms that to some extent from an insider perspective.
My personal take, for example, is that ACE suffered from confusing MQC-like CLI (good try but fail!). The underlying problem is / was complexity, however. That’s bad enough when doing SLB functions, which might be more web-centric with HTML rewrite, with perhaps some TCP in the mix. When you consider WAAS, accelerating generic applications is a bit different but also a pretty big challenge. What is it that makes WAAS more viable? Perhaps the fact that a large group of application optimizations was developed over time, starting with key business applications. And that the complexity is hidden from the user.
Cisco's latest iteration of this is perhaps NBAR2, where we're finally getting closer to recognizing and using human-friendly names for many apps. (Which used to be one thing people liked about PacketShapers, for example: you could see flows with real application names, including older Microsoft apps with dynamic port choices. Not that I ever liked PacketShapers all that much, by the way. I'll spare you why.)
This comes back to what we want to virtualize. OpenFlow is focused on flows, which I see as a means to an end, service chaining being one such end. Flows are more like virtualizing or controlling the plumbing: the links, or the use of links. For what it's worth, controlling the actual links via optics and optical switching is part of what Plexxi does. Whether that is better than spine-and-leaf, I can't tell. Different use cases, and I'm not sure I understand one that really fits what Plexxi does. But that's a topic for another blog, still stuck in the nether parts of my brain.
Virtualization is supposed to take the complexity out of networking. What better to take the complexity out of than applications? Virtual applications. The talk about NFV and service chaining is focused on easier deployment of applications. That’s an important piece of the puzzle. But is there more?
From doing QoS and security ACLs over the years, I've seen that documentation from software / application vendors as to what ports they use, and what they use them for, is thin and unreliable, verging on non-existent. It's apparently considered part of the documentation, which is always an afterthought, rather than part of the development process. (And some vendors I've seen failed to be lucid about which ports were source ports and which were destination ports.)
What if vendors that claim to be “enterprise vendors” actually had to document the ports their app listens on or transmits on? With some sort of validation testing? Maybe the “security fingerprint” could even be provided in some electronic form that we or a controller could use for security rules? With some sanity checking, so that a sloppy coder doesn’t just specify “all TCP and UDP ports above 1024” or something like that.
Maybe that could even be taken a step further. Some ports get used talking to “the outside”, others get used talking internally. Some are inbound (traffic going to a web app on port 80), some are outbound (e.g. application legitimately going to vendor site for an upgrade). Why should we have to RTFM and then type that info into ACL rules? Perhaps it could be part of some “deployment package” that we could use along with our NFV or SDN software to do the necessary VM, SLB, Firewall, and ruleset instantiation? If software vendors can’t be trusted to do this sort of thing, maybe a big company like Cisco could figure out how to make money off of providing such value?
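Just to make that concrete: here's a minimal sketch, in Python, of what consuming such a vendor-supplied "security fingerprint" might look like. To be clear, the manifest format, the app name, and the 100-port "too broad" cutoff are all invented for illustration; nothing like this exists as a standard today.

```python
# Sketch only: this "security fingerprint" manifest format is hypothetical,
# invented for illustration. No such vendor standard exists (yet).
import json

MAX_PORTS_PER_RULE = 100  # sanity limit: reject "all ports above 1024" declarations

# A hypothetical vendor-supplied declaration for one application.
manifest = json.loads("""
{
  "app": "ExampleERP",
  "rules": [
    {"direction": "inbound",  "protocol": "tcp", "ports": [443],        "scope": "external"},
    {"direction": "inbound",  "protocol": "tcp", "ports": [8000, 8001], "scope": "internal"},
    {"direction": "outbound", "protocol": "tcp", "ports": [5432],       "scope": "internal"}
  ]
}
""")

def sanity_check(rule):
    """Refuse sloppy declarations, e.g. 'all TCP and UDP ports above 1024'."""
    if len(rule["ports"]) > MAX_PORTS_PER_RULE:
        raise ValueError(f"rule too broad: {rule}")
    if rule["protocol"] not in ("tcp", "udp"):
        raise ValueError(f"unknown protocol: {rule}")

def to_acl(app, rule):
    """Render one declared flow as a generic permit statement (pseudo-ACL)."""
    direction = "any -> app" if rule["direction"] == "inbound" else "app -> any"
    ports = ",".join(str(p) for p in rule["ports"])
    return f"permit {rule['protocol']} {direction} ports {ports} ({rule['scope']})  ! {app}"

for rule in manifest["rules"]:
    sanity_check(rule)
    print(to_acl(manifest["app"], rule))
```

The exact format doesn't matter. What matters is that the declaration is machine-readable, so the NFV / SDN tooling can instantiate the firewall ruleset, instead of someone RTFMing and hand-typing ACLs.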
It seems there ought to be room here for value-add, by making applications easier for the customer to manage. The intelligence behind this is what I think any reasonable controller may eventually need, in whatever SDN flavor.
The bottom line here is ease of use. And maybe it's similar in a way to the theme of my prior blog, SDN for Bandwidth Control, at https://netcraftsmen.com/blogs/entry/sdn-for-bandwidth-control.html. I loved the blog by Martin Casado and the usual VMware / NSX suspects <grin>, Of Mice and Elephants, at http://networkheresy.com/2013/11/01/of-mice-and-elephants/.
The similarity? Elephants are big long-lived flows. Apparently my blogs agreed in advance with the brain trust at NSX. Their positive take on elephant flows is that they’re what we might be able to do something useful about (as long as there isn’t, so to speak, a tightly packed herd of elephants).
When we talk applications, think about doing a Wireshark capture or NetFlow capture in your datacenter. There are probably a gazillion flows. Life's too short to look at all of them. BUT some are pretty important ones: VoIP, various types of video, key business applications that are "fragile" or latency-sensitive. If we had a central, easy way to tweak how the network handles such applications, maybe one that tells us more about what's going on in an aggregate sense, wouldn't that help?
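To illustrate that mice-versus-elephants asymmetry, here's a toy sketch in Python. The flow records and the 10 MB elephant cutoff are made up, and real NetFlow / IPFIX collection is far more involved; the point is just how few flows are individually worth acting on.

```python
# Toy sketch: separating "elephant" flows from the "mice" in flow records.
# The records and the 10 MB threshold are invented for illustration;
# real NetFlow/IPFIX parsing and collection is far more work.
from collections import namedtuple

Flow = namedtuple("Flow", "src dst dport bytes")

ELEPHANT_BYTES = 10 * 1024 * 1024  # arbitrary cutoff: 10 MB

flows = [
    Flow("10.1.1.5", "10.2.2.9", 443,  2_000),        # a mouse
    Flow("10.1.1.7", "10.3.3.3", 5060, 48_000),       # VoIP signaling
    Flow("10.4.4.2", "10.5.5.8", 445,  900_000_000),  # backup job: elephant
    Flow("10.1.1.9", "10.2.2.9", 80,   1_200),        # another mouse
]

elephants = [f for f in flows if f.bytes >= ELEPHANT_BYTES]
mice_bytes = sum(f.bytes for f in flows if f.bytes < ELEPHANT_BYTES)

print(f"{len(elephants)} elephant(s) out of {len(flows)} flows")
for f in elephants:
    print(f"  {f.src} -> {f.dst}:{f.dport}  {f.bytes / 1e6:.0f} MB")
print(f"everything else combined: {mice_bytes / 1e3:.1f} KB")
```

A real implementation would also look at flow duration and rate, not just bytes. But the asymmetry is the point: a handful of flows are worth handling individually, and the rest only matter in aggregate.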
Again referring to some recent QoS experiences, the world is changing. Call Admission Control (CAC) and tight control used to be important (to some, anyway). It still is, but with applications like Microsoft Lync (which has nothing I'd call enterprise-grade CAC, that I know of), surveillance video, and streaming Video on Demand, it's starting to look like CAC is doomed. There are just too many potential big bandwidth consumers that aren't under CAC-like control.
What's the next best thing? What do you do if you can't have total control? Declare MS Lync traffic to be Best Effort? Some do. If you get backpressure from the business, then what? Perhaps what might help is knowing when you're about to hit the wall: if you have Verizon Gold CAR MPLS service, knowing when your 30 or 40 percent (or 90 percent) reserved VoIP bandwidth isn't enough. Knowing how much video (in each form) is out there, so you can do better capacity planning. Yes, NetFlow sort of gets you there. But how easy is it to interpret? How many shops are actually doing it well?
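Here's what that "about to hit the wall" check amounts to, as a quick Python sketch. The circuit size, reservation percentage, sample values, and 80 percent warning threshold are all invented numbers:

```python
# Sketch of an "about to hit the wall" check for a reserved QoS class.
# Circuit size, reservation percentage, samples, and the 80% warning
# threshold are all invented numbers for illustration.

CIRCUIT_MBPS = 100.0
VOIP_RESERVED_PCT = 0.40   # e.g. 40% of the circuit reserved for VoIP
WARN_AT = 0.80             # warn when 80% of the reservation is in use

voip_reserved_mbps = CIRCUIT_MBPS * VOIP_RESERVED_PCT

# Pretend these are 5-minute average VoIP-class loads from NetFlow/SNMP.
voip_samples_mbps = [18.2, 22.5, 29.7, 33.1, 35.4]

for mbps in voip_samples_mbps:
    used = mbps / voip_reserved_mbps
    if used >= 1.0:
        print(f"{mbps:5.1f} Mbps: OVER the {voip_reserved_mbps:.0f} Mbps reservation")
    elif used >= WARN_AT:
        print(f"{mbps:5.1f} Mbps: {used:.0%} of reservation, time to re-plan capacity")
    else:
        print(f"{mbps:5.1f} Mbps: OK ({used:.0%} of reservation)")
```

Trivial arithmetic, obviously. But it's exactly the kind of thing it would be nice to get from a controller per class and per circuit, instead of assembling it by hand from NetFlow exports.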
It’s crystal ball time. What is Insieme going to announce?
I agree with @RealLisaC (Lisa Caywood) and SDN Central (http://www.sdncentral.com/news/insieme-know-think-know/2013/10/) that, from what little data is available, the announcement appears likely to be a fast / big switch or other network hardware. Could it be an optical super-Plexxi type box? I do expect some form of controller or orchestration tool to be part of the announcement, especially since, with UCS Director (formerly Cloupia) and other products, Cisco's got some solid experience in that space. And, per the above discourse (fancy word for "rant"?), maybe Insieme might even do some reporting on the application mix, including which packets are dropping where, latencies, good stuff like that? The number of developers working for Insieme is fairly large, according to one report I saw. Ties to OpenDaylight (perhaps) or onePK (more likely), maybe?
Bob McCouch had a very interesting recent blog, at http://herdingpackets.net/2013/10/24/big-switch-networks-and-the-possible-future-of-networking-hardware/. I like his casting of SDN as lately falling into two camps: overlays vs. flow-based approaches. His context was Big Switch Networks, but as I read it (with the "no overlays" attitude Insieme has already exhibited), I found myself wondering if Insieme will take an OpenFlow or some other flow-based approach. That would leverage the suspected Cisco (hence Insieme) bias towards ASICs. It would stick to fast native hardware. It could likely handle sparse numbers of elephant flows. It doesn't clearly mesh well with VMware and the virtual world, but maybe there will be an answer to that, e.g. a VXLAN gateway function. (But what about VM orchestration?)
I just watched the Arista announcement webinar, primarily announcing some new "SPLINE" switches. They talked about both their OpenStack integration and their VXLAN and VMware / NSX integration. Playing it both ways? I think I'm getting jaded as far as 40 / 100 G port counts go. I'm sold on 10 G ports, and lots of them, as blade servers drive bandwidth. I'm not seeing that much demand or rationale for 40 and 100 G yet, not in traditional enterprises anyway.
Back to Cisco / Insieme: I'm kind of hoping it's not all about ASICs. For high-end performance, yes. But if it's about applications, I see that as less about high-end performance and more about control and large-scale awareness of conditions in the network.
I also hope any GUI is fairly well done (i.e., relatively fast and bug-free). First impressions matter with GUI management software.
Your comments are welcome! It’s crystal ball time. What do YOU think Cisco and Insieme are going to announce?
Twitter: @pjwelcher