This blog represents some speculative thoughts that popped into my head recently. I hope you, the reader, will find it interesting food for thought, and who knows, maybe even useful!
Apologies if this blog rambles a little. I thought it useful to treat some of the material as “let’s think through this together.” I’ve tried to tighten it up, but it has been resisting my editorial efforts!
Security at the endpoint keys off having a centrally controlled agent on the user or server endpoint, plus perhaps hardware in some places as well. Products that do this include Cisco Secure Workload (formerly Tetration) and Illumio. Cisco Stealthwatch fits in on the observe-and-report (but not enforce) side of things.
Well, the new idea or theme for this blog is: what else can be shifted from the network to the endpoint? Does network security change dramatically at some future point?
I claim this change is already underway at some sites and in some new products.
From one perspective, firewalls and VPN termination points are becoming costly bottlenecks, constraining factors in network design. Their costs are not coming down the way they are for L3 switches. Distributing the workload might solve that.
Does shifting workload to endpoints solve that? What problems does it NOT solve?
Design shifts due to SD-WAN and DIA / SaaS connectivity are what got me started thinking about this. Well, that plus a few other recent inputs. So let’s look at the factors driving change.
Factor: SD-WAN and DIA
One of the appealing selling points for SD-WAN is that its DIA (Direct Internet Access) and SaaS access capabilities provide several benefits. One is faster access to SaaS sites for users, compared to backhauling traffic to a regional or central data center to run it through the “security stack” before putting the traffic out on the Internet.
Another benefit is that doing this shifts traffic away from the central Internet link(s), which can reduce costs. That assumes you’re already paying for Internet at the remote SD-WAN sites and at any regional hubs. Or that you shifted from WAN to Internet access circuits.
Not usually explicitly mentioned: by reducing the central Internet traffic flows, you may also need a smaller firewall at the central (or regional) Internet sites. Or you might be able to extend the capacity lifetime of the firewalls there. That could amount to substantial cost savings.
In short, SD-WAN can provide a form of disaggregation of traffic, distributing the workload.
That does assume you don’t require an SD-WAN “router” plus a firewall. Scattering little firewalls at every remote site probably costs more than a big central firewall pair.
Factor: Network As A Service
The second thing to note is Alkira / Cisco / Cloudflare / Equinix / PacketFabric – essentially WAN variants of Network as a Service (NaaS), with some degree of immediacy, perhaps. OK, that’s basically SD-WAN and cloud access as a managed service, plus likely some automation and some VNFs (Virtual Network Functions). From one perspective, that’s just shuffling existing components around a bit, with some virtualization thrown into the mix. Except putting it that way understates the value.
When you look into it, NaaS seems to be mostly about CoLo and cloud links and virtual devices. So far. (Or leasing gear in advance of need, which is more about financials.)
Note that your sites still need some sort of circuit connection(s), likely Internet. The last mile might still be awkward, but the rest might be agile via NaaS. (More about NaaS in an upcoming blog.)
Factor: Endpoint-Based SD-WAN / VPN
The third thing to note is Ananda.net, hardware-less SD-WAN, and similar products (if there are any). I understand their “basic product” (my term) as being VPN software for end systems (users or servers). That software builds a VPN between “consenting” endpoints (my term). The admin controls groups of endpoints and servers and virtual VPN endpoints that can mutually communicate.
The key here is that the focus is on endpoints in groups, with mutual access allowed within each group. My perception is that early adoption might be driven by DevOps or other teams, or be centrally administered. Less infrastructure for a startup, etc.
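To make the “groups with mutual access” idea concrete, here is a minimal sketch of how I picture the central policy check. This is not Ananda’s actual data model or API. The class, group names, and endpoint names are all hypothetical, and the only rule assumed is the one described above: two endpoints may build a tunnel if they share at least one group.

```python
from collections import defaultdict


class EnclavePolicy:
    """Hypothetical central policy: endpoints may talk only if they share a group."""

    def __init__(self):
        self._groups = defaultdict(set)  # group name -> set of endpoint IDs

    def add(self, group, *endpoints):
        self._groups[group].update(endpoints)

    def groups_of(self, endpoint):
        return {g for g, members in self._groups.items() if endpoint in members}

    def may_connect(self, a, b):
        # Default deny: a tunnel is allowed only when both ends share a group.
        return bool(self.groups_of(a) & self.groups_of(b))


policy = EnclavePolicy()
policy.add("dev-team", "alice-laptop", "bob-laptop", "build-server")
policy.add("finance", "carol-laptop", "ledger-db")
# A virtual VPN endpoint (gateway) is just another group member in this sketch.
policy.add("finance", "dc-gateway")

print(policy.may_connect("alice-laptop", "build-server"))  # True
print(policy.may_connect("alice-laptop", "ledger-db"))     # False
```

Note that the gateway needs no special handling here, which is roughly what I mean by “endpoint-first.”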
And since it is software/drivers, you can put Ananda’s software on an appropriately sized server as a virtual VPN termination point (aka VNF “router”). For example, you might do this in an appropriately sized HQ or data center, or cloud network. I say “appropriately sized” since I’d expect there to be scaling/performance trade-offs.
Part of Ananda’s marketing emphasizes post-COVID WFH (Work From Home), of course. Another key sales factor might be simple VPN management, e.g., suitable for smaller companies/startups. However, the product is not limited to that. Early days yet.
If you step back and consider this, another way of looking at what Ananda is selling is that it is endpoint-first, with VNF router code as a way to provide secure VPN access to resources at a data center or cloud, or for devices that you can’t put an agent on, and perhaps to support migration to Ananda. An interesting shift in emphasis as to how we do VPN?
A more recent perspective is to regard each VPN grouping in Ananda as, in effect, a Zero Trust Architecture (ZTA) “enclave,” one that might have a gateway but does not need to have one. Factoring context and other user information into access decisions does not clearly appear to be available yet, however. Who knows what other ZTA attributes it may develop as the product gets features added to it.
Factor: Illumio, Tetration
The fourth item (mentioned at the start) is Illumio and Cisco Tetration (Secure Workload) endpoint-based security policy, segmentation, and traffic monitoring. I’ve blogged about such products previously.
My provocative question: “So who needs firewalls?”
Realistically, the answer is similar to what I wrote about Ananda. Everything that doesn’t participate in the endpoint-based approach. At least. Assuming you trust the endpoint product.
If you buy into this, maybe this changes your network design a bit. You perhaps segment off or VRF off the stuff that doesn’t support endpoint-based security and put it behind a firewall?
Or maybe you do a “belt + suspenders” design (double protection), i.e., use firewalls to screen out a lot of traffic, then do specifics / more detailed ACLs at the endpoint. Maybe use the firewalls for monitoring traffic as well.
Two points I recently read that I wish I’d thought of myself (and sooner):
- The endpoint solutions also allow the access list rules on each server to potentially be just those relevant to that server. Smart NICs, ditto. That assumes the central management software is smart enough to figure out which rules are applicable. So you might need a lot less TCAM / fancy memory than a firewall does. Across multiple servers, OK, maybe it adds up to more memory. I’m not sure how that plays out cost-wise. It distributes the cost per-server, rather than having it come in big, costly hardware lumps?
- If you’re doing service chaining “hairpinning” to force traffic through a physical or virtual firewall, endpoint / NIC-based security potentially removes that need. (Potentially, since I suspect the ACL rules might get rather ugly. For data center segmentation, probably fine.)
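The “only the rules relevant to that server” point from the first item above can be sketched in a few lines. This is a toy, not how Illumio or Secure Workload actually compute policy. The rule format and addresses are made up, and I’m assuming inbound rules only, so a rule is “relevant” when its destination prefix covers the server.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    src: str     # source CIDR
    dst: str     # destination CIDR
    port: int
    action: str  # "allow" or "deny"


def rules_for_server(global_rules, server_ip):
    """Keep only the rules whose destination prefix covers this server."""
    addr = ip_address(server_ip)
    return [r for r in global_rules if addr in ip_network(r.dst)]


# Hypothetical central rule set, the kind a big firewall would hold in full.
GLOBAL_RULES = [
    Rule("10.1.0.0/16", "10.20.1.10/32", 443, "allow"),
    Rule("10.2.0.0/16", "10.20.1.0/24", 22, "allow"),
    Rule("10.1.0.0/16", "10.20.2.0/24", 5432, "allow"),
    Rule("0.0.0.0/0", "10.20.0.0/16", 23, "deny"),
]

web = rules_for_server(GLOBAL_RULES, "10.20.1.10")
db = rules_for_server(GLOBAL_RULES, "10.20.2.5")
print(len(GLOBAL_RULES), len(web), len(db))  # 4 3 2
```

With four rules the savings are trivial. The interesting case is thousands of central rules where each server only ever needs a few dozen.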
Editorial comment: I really dislike service chaining from a design perspective since I feel it could be very hard to audit. Also, it might entail potential hairpinning, although if you’re filtering inter-VRF traffic, there may be no alternative. Having firewalls in blocking positions along the traffic path is a lot more straightforward.
Factor: Cloud-Based Security
See my recent Internet Edge blog on Alternative Security Solutions.
Adding It All Up
So… is SD-WAN an interim step to be replaced (mostly) with endpoint-based VPN + security controls?
The marketing might be something like “WAN-less networking.”
(Two of my daughters do marketing. They clearly did not get that interest/skill from their dad!)
Thinking about Why: SD-WAN
Let’s come at this from a different angle. So why do we do SD-WAN and site-to-site VPN in various forms? To secure traffic, control DIA / SaaS access, and add services (Cisco Umbrella, AMP, etc.), with smart per-app re-routing and failover, QoS, etc.
That’s really several separate functions:
- DIA for HTTP / HTTPS traffic to selected sites (and direct SaaS access as well)
- Site to site VPN for internal traffic and Internet-bound non-DIA traffic
- Smart application routing with failover
- Firewall traffic controls (+ IPS + malware detection, etc., such as Cisco AMP for Endpoints)
Of these, site-to-site VPN is the only one that doesn’t immediately seem like something software on the endpoint with central control might be capable of matching.
- If you’re doing SD-WAN with a leased circuit of some kind plus an Internet link, then you believe that leased connections can provide a better quality of service than the Internet.
- If your SD-WAN is totally Internet-based, then perhaps endpoint agent software might be an alternative to site-to-site VPN? Let’s examine that a bit more closely…
Why do we do Site to Site VPN or SD-WAN? As opposed to just having a remote office (or WFH) with remote access VPN?
Traffic rarely goes directly between users, as far as I can see. Except for malware?
So Site to Site VPN does what? It likely covers:
- User to server traffic that is not HTTPS, e.g., older-style legacy app front-end to back-end traffic.
- User to Internet host traffic where, for some reason, we need to run it via a central site (probably the security stack). DIA shifts that function to the SD-WAN router. I don’t see a reason an endpoint agent couldn’t force traffic through a corporate firewall stack with DIA / SaaS exceptions. Or just do enforcement locally?
- Central access to manage servers onsite at the remote office. Is that still something people do? My impression is remote site servers are rare these days.
- Printers, phones, etc.
The first of these, yeah, definitely needs VPN to do that securely. How much of that traffic is there? Would VPN directly from the user to the app server(s) suffice? How much additional server capacity would that require? Administratively, how would that scale? (I’m thinking of adding a given user to a mess of per-server / per-app groups, but maybe the groups are coarser so as to have fewer of them?)
The second of the above items, I’m not so sure about. If we’re going to do DNS-based blacklisting, etc., to what extent does that need to be done in an onsite box? Consider the Internet-based Cisco Umbrella Secure Internet Gateway, Zscaler, etc. (the topic of a prior blog).
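Here is a rough sketch of the steering decision I have in mind for an endpoint agent: internal apps go over VPN, a short exception list goes direct (the DIA / SaaS case), and everything else is forced through a security stack, whether that is a corporate firewall or a cloud gateway. The domain names, the three path labels, and the suffix-matching approach are all assumptions for illustration, not any product’s behavior.

```python
# Hypothetical lists that the central controller would push to the agent.
INTERNAL_SUFFIXES = ("corp.example",)
DIRECT_SUFFIXES = ("office365.example", "crm.example")


def _matches(host, suffixes):
    """True if host equals a suffix or is a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == s or host.endswith("." + s) for s in suffixes)


def next_hop(host):
    """Pick a path for a destination hostname."""
    if _matches(host, INTERNAL_SUFFIXES):
        return "vpn-to-app"      # non-public apps: VPN straight to the server
    if _matches(host, DIRECT_SUFFIXES):
        return "direct"          # DIA / SaaS exception
    return "security-stack"      # default: inspect before it reaches the Internet


for h in ("wiki.corp.example", "mail.office365.example", "random-site.example"):
    print(h, "->", next_hop(h))
```

The default matters: anything not explicitly listed gets inspected, so a lookalike name such as evilcrm.example does not slip through the SaaS exception.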
So what does SD-WAN buy you that endpoint-based might not? Is there an endpoint equivalent? Like your cell phone doing data over 4G/5G when your WLAN and LAN are out?
I can imagine some issues around who pays for data usage on personal cell phones, perhaps, but it seems like we already have alternatives. Maybe not quite as admin and user-friendly as SD-WAN / routing. Yet. It seems that eliminating the need to put a device of some sort for WFH or a small office could be a financial win. Licensing/purchasing and support costs versus endpoint-based? Plus smaller impact of failure!
For endpoint-based VPN solutions, there’s a gap concerning networking for other (non-laptop) devices. Could the endpoint agent also supply site-to-site VPN services to other devices on the same LAN as the user endpoint? Do any endpoint products currently do that (act like a site-to-site, or S2S, VPN router)?
Other Factors to Consider
Pensando – super-smart NICs under central control. Startup chaired by former Cisco CEO John Chambers. They seem more likely to be used in servers than in user endpoints, i.e., where there is more to offload and more to gain performance-wise.
Also: VMware and other smart NIC efforts, which basically shift server-side routing and security functionality onto the NIC hardware to significantly offload the CPU (for a price).
I see these as mostly server-side unless the smart NIC is sold in a form that can leverage USB or whatever to connect to a laptop. I see them as logically equivalent to a software-based endpoint agent but running in hardware to significantly boost performance.
Segmentation: You could conceivably use centrally programmed endpoint NICs to enforce segmentation. Perhaps.
Immediate question: how do you scale manageable ACLs enforcing segmentation at endpoints? Do you do that with something like Cisco’s pxGrid mapping endpoint IPs to a group, to enable group-based ACLs? Or is it based on MS AD groups? And do that at the scale of all endpoints rather than selected network devices/firewalls? Maybe lookup and cache info for each new endpoint? Alternatively, just prevent all user-to-user traffic, except, say, softphone traffic??? Whatever the answer is, this seems solvable.
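One way the “lookup and cache” idea could work, as a sketch only: each enforcement point resolves an IP to a group on first sight, caches the answer, and evaluates a small group-to-group table instead of per-IP ACLs. The dictionary below stands in for whatever directory or identity service you use. It is not the pxGrid or AD API, and all addresses and group names are invented.

```python
from functools import lru_cache

# Hypothetical stand-in for a directory / identity service.
_DIRECTORY = {
    "10.1.1.20": "users",
    "10.1.1.21": "users",
    "10.1.5.7": "softphones",
    "10.20.1.10": "app-servers",
}
lookups = 0  # counts how often we actually hit the "directory"


@lru_cache(maxsize=4096)
def group_of(ip):
    """Resolve an IP to its group, caching the result per endpoint."""
    global lookups
    lookups += 1
    return _DIRECTORY.get(ip, "unknown")


# Group-to-group rules: a handful of entries no matter how many endpoints exist.
ALLOWED = {
    ("users", "app-servers"),
    ("softphones", "softphones"),
}


def permit(src_ip, dst_ip):
    # Default deny, which also blocks user-to-user traffic.
    return (group_of(src_ip), group_of(dst_ip)) in ALLOWED


print(permit("10.1.1.20", "10.20.1.10"))  # True
print(permit("10.1.1.20", "10.1.1.21"))   # False
print(lookups)                            # 3
```

The third number is the point: four addresses were checked but only three lookups happened, because the repeat address came from cache. A real version would also need cache expiry for when an IP changes hands.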
For what it’s worth, I should note that Zscaler / Zscaler Client Connector, Cisco Umbrella / Secure Internet Gateway, and Cisco Secure Endpoint (aka AMP for Endpoints) are some other alternatives, which I’ve mentioned before.
The key point about them for this blog is that many of the security-related tasks don’t have to be located on the endpoint. With anycast or other distributed services in points of presence near most users, cloud-based services can provide a very viable alternative without severe latency penalties. Other advantages: smaller endpoint footprint, easier / faster updates by the provider, low support costs. And the ability for a small to medium-sized organization to offload security to a major service provider, rather than struggling to adequately staff it in-house.
When are endpoints (desk workstations and laptops) and their operating system likely to support containers?
I’ll let you do the speculation about why that might be useful, what that might open up in terms of VPN access, security tools, or other VNFs. Cleaner management than desktop agents, perhaps?
We’ve ended up with speculation about doing security enforcement / firewall-lite and VPN functions on endpoints, and how that might spare us from needing network hardware, especially for WFH / mobile user settings. Also discussed: does it at some point “flip” to where user networking is mostly endpoint-based, and where might some hardware-based or VNF-based networking remain? (Office buildings/sites, data centers, IoT, and immutable devices?)
I don’t have answers at this point, just questions. I suspect the correct answer to most of my questions will be “it depends.” And perhaps “all of the above”. That is, the network world never does anything 100%, so there may be a future mix of network-provided functionality and endpoint-centric functionality.
I’m envisioning this as if there is a “control slider” controlling the mix of network device-based versus endpoint-based functionality, and it is starting to move more towards endpoint-based.
It will depend on the choice of vendors and products mix, as well as administrative and security preferences.
For those who mostly WFH, a combined VPN and endpoint security driver under centralized control potentially makes a lot of sense. For offices above a certain size, it may make sense to do SD-WAN to have fewer entities to manage. Yet if staff are 40% or 60% WFH, then that would be duplicative. Should WFO (Work From Office) be different than WFH as far as networking? Or would it be simpler to treat it as higher-speed Internet / WAN but otherwise like WFH?
That’s something my thoughts keep coming back to: WFH and endpoint software can be game-changers! Culture changers?
Another wild card here is “other devices.” When I WFH, I personally am pretty much OK when doing remote access VPN. I don’t need other devices connected back to the office, in part due to secure Internet access to things like email, Webex, Jabber, etc. If you do have such a requirement (work phone handset at home?), then you might need an onsite SD-WAN or other site-to-site VPN device. The main pain point I see with remote access right now is blocked split tunneling, which prevents me from printing. That may be more of an odd gap in current product features. But who does (much) printing and paper these days?
Unless, as mentioned previously, maybe your laptop could provide such services? I.e., extend security enforcement and VPN support to local devices.
What do you think? Which approach do you prefer?