The blogging part of my brain seems to be stuck on security lately, apparently because similar topics keep coming up in discussions with customers and my NetCraftsmen peers.
This blog shares some security thoughts.
Some links to set the context or about related issues:
- NetCraftsmen blog by Paul Mauritz: What Is Stopping Your Organization from Reaching Zero Trust?
- LinkedIn blog by me: It Is Opportunity Time
- LinkedIn blog by me: Securing AWS Cloud Deployments
- Other relevant NetCraftsmen blogs by me:
TL;DR This blog looks at tools for controlling user-to-application security and Zero Trust, and the bigger picture of what controls we might want. It probably also applies to various forms of ZTNA (Zero Trust Network Access, which I understand as VPN/encrypted traffic plus ID-based application access controls).
In other words, I’m trying to raise the discussion from the nitty-gritty of flows and ACLs, how we get them right and who does them, to how we can USE that information at a high level for enforcement purposes: where there might be gaps and issues, and where the tools might fit in relation to the end goal of Zero Trust.
Types of Enforcement Tools
There are (at least) two major types of products contending for how to control user to application security going forward. Obviously, I can only talk about the ones I’m aware of.
Here they are:
- Network-based approaches.
- Endpoint/server-based approaches. Two sub-variants:
- Traffic is sent normally across the network
- Traffic is tunneled (and probably encrypted) directly between endpoints
- ZTNA appears to be a mix of the two: network-based, with per-user filters as to what applications (IP addresses? URLs?) they can access.
This blog will examine how they stack up from the access control and Zero Trust perspectives.
Concerning network-based approaches, I’m lumping all the various forms of access list (“ACL”) enforcement in there: stateless (e.g., DNAC enforcement or basic ACI), stateful (firewalls, etc.), and so on. If there is traffic on the wire, network-based approaches can control it. Well, unless it is encrypted.
- Advantages:
- ACLs can intercept and control traffic across the network, if deployed on devices in a position to examine and intercept said traffic. This topology dependency is both a strength and a weakness. Strength because a chokepoint in the network means no traffic can bypass the controls. Weakness because the network topology dependency can get awkward.
- Corollary: To intercept user-to-user or device traffic, ultimately either the local switch must be able to do enforcement, or the traffic must be tunneled or otherwise forced through some more central Policy Enforcement Point (PEP). If encrypted, it has to be decrypted and probably re-encrypted. Which can get painful.
- Limitations:
- ACLs don’t work when traffic is tunneled or tunneled in encrypted form.
- ACLs do not control user trust levels – something else (e.g., Cisco ISE) is needed for that. ISE and the like can indirectly leverage ACLs, however, by forcing users/endpoints to re-DHCP into a different address block.
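To make the stateless-ACL idea concrete, here is a minimal Python sketch of how a network device evaluates a packet’s 5-tuple against an ordered rule list. The addresses, ports, and rules are hypothetical, chosen only to illustrate first-match-wins evaluation and the implicit deny.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional

@dataclass
class Rule:
    action: str            # "permit" or "deny"
    src: str               # source prefix, e.g. "10.1.0.0/16"
    dst: str               # destination prefix
    dport: Optional[int]   # destination port; None means "any"

def evaluate(rules, src_ip, dst_ip, dport):
    """First match wins, as with a classic stateless ACL."""
    for r in rules:
        if (ip_address(src_ip) in ip_network(r.src)
                and ip_address(dst_ip) in ip_network(r.dst)
                and (r.dport is None or r.dport == dport)):
            return r.action
    return "deny"  # the implicit deny at the end of every ACL

# Hypothetical policy: this user subnet may reach the app server on 443 only.
acl = [
    Rule("permit", "10.1.0.0/16", "10.9.9.10/32", 443),
]
```

Note what the sketch makes obvious: the rules are all IP/port based. Nothing here knows who the user is, which is exactly the limitation above.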
Agent-based approaches can also control traffic in terms of ACL-like policies.
- Advantages:
- ACLs may be simpler since for outbound traffic from a given user you do not need to specify the source(s). (Which is potentially also an advantage of Cisco TrustSec/SGT based ACLs.)
- In the central controller, though, you may still have source IPs in policies (ACLs). I’d hope not. For logging, yes.
- Enforcement is probably in the agent itself, i.e., local to one or the other endpoint.
- Limitations:
- Doesn’t work if you have sources or destinations that you cannot put an agent on. (Printers, OT/IOT devices, mainframes, app servers where the support contract forbids modifications, etc.)
- The workaround for that may be to run such traffic through some sort of middle box, dare I call it a user access firewall?
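The point about agent-side ACLs being simpler (no source specification needed, since the enforcing agent *is* the source) can be sketched as follows. The group names and destinations are invented for illustration; real products will differ.

```python
# Hypothetical per-user-group outbound policy pushed to the agent by a
# central controller. Because the agent enforces at the source endpoint,
# rules need only destination and port -- the "source" is implicitly
# whoever is logged in on this device.
OUTBOUND_POLICY = {
    "engineering": {("git.corp.example", 443), ("ci.corp.example", 443)},
    "finance":     {("erp.corp.example", 443)},
}

def agent_permits(user_group, dest_host, dest_port):
    """Return True if this user's group may open the connection."""
    allowed = OUTBOUND_POLICY.get(user_group, set())
    return (dest_host, dest_port) in allowed
```

The central controller would still log source information per flow, but the policy itself carries none, which is the simplification claimed above.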
I’ll note in passing that in principle, any suspect or malicious behavior detection software that is integrated into the control system for either approach should be able to trigger limited remediation-only access for a user or device. In practice, that will probably be driven by the agent sending flow info to the controller or other software, and the controller adjusting the policy applied.
Encrypting traffic on the wire makes traffic and behavior monitoring harder but means you may not have to trust the network, at least not as much.
Networking: it’s always trade-offs!
For both network- and agent-based approaches, malicious behavior detection flows could be somewhat complicated: flow data goes to a central device, from there to cloud-based behavior/malware software, and an alert comes back to the security policy controller to deploy the “limited access” policy.
As far as Zero Trust, it appears there are several increasing levels of user-centric control possible.
My short list, some tiers of control:
- ACLs, typically based on device IP – no user awareness
- User-aware
- Network-based: 802.1x/NAC plus dynamic VLAN assignment or dynamic ACL assignment based on user (realistically, user group). Or tunneling to an enforcement point, for a couple of the non-Cisco vendors.
- Agent-based: I’m assuming the agent can glean the user ID, so potentially there might be user-based policy enforcement. I have no idea which, if any, products do anything like that, perhaps tied to MS AD groups.
- In particular, either approach can in principle control which apps a user can get to. To avoid the nightmare of per-user per-app configuration settings, there will likely be use of user and app groups.
- User and application aware
- This seems to require user groups (managed where?) that tie into application privileges. That seems likely to take quite a while to mature and achieve any semblance of standardization. I’ll be keeping my eyes open for anything that addresses this.
- There are products that control access to data, with different privilege levels applied there. But is that all that we need?
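To illustrate the third tier, here is a hypothetical sketch of user groups (perhaps sourced from MS AD) tied to application groups and privilege levels – the kind of mapping some central policy store would have to maintain. All group names and levels are made up.

```python
# Hypothetical mapping: which application groups a user group may reach,
# and with what privilege inside the app. A real system might pull user
# groups from AD and app groups from a CMDB or tagging system.
APP_ACCESS = {
    "clinicians": {"ehr-apps": "read-write", "billing-apps": "read-only"},
    "billing":    {"billing-apps": "read-write"},
}

_ORDER = {"none": 0, "read-only": 1, "read-write": 2}

def access_level(user_groups, app_group):
    """Return the highest access any of the user's groups grants to the app group."""
    best = "none"
    for g in user_groups:
        level = APP_ACCESS.get(g, {}).get(app_group, "none")
        if _ORDER[level] > _ORDER[best]:
            best = level
    return best
```

Even this toy version hints at the governance question in the bullet above: somebody has to own both halves of that mapping, and keep them in sync as apps and groups change.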
Other Factors
So: who is going to be your “enforcer”?
All this can lead to tension as to which group “owns” the solution – tension between wanting to own application security and wanting to NOT own it. It can also lead to double coverage (both own it) – which isn’t necessarily a bad thing. “Belt and suspenders.” Or no owner, which is worse.
Generally, server admins don’t want to deal with security, ACLs, etc. And can be downright unhelpful when someone else is trying to step up and create tight security policy. Yet they’re the ones I’d hope would know the needs of their application/application layers. Maybe that’s overly optimistic of me.
In the real world, if they didn’t write the code, they probably don’t know the function or API calls used nor the ports. So, for the many purchased apps that a company uses internally, they may have had a consultant or contractor deploy them, or followed installation instructions, and there’s likely little local knowledge of those apps.
Lately, security people have a lot of compliance and audit type tasks to deal with, so (as I’ve noted in other blogs) network staff can end up being the owners of ACLs. Unless they’ve developed major skills in dodging such assignments.
I end up with maybe the user administration group plus the security group owning this, with security’s role being defining different classes of users based on what they’re allowed to access. See also Microsoft Active Directory, below.
Drilling Down: TrustSec/NAC
I’m going to use the terms TrustSec/NAC loosely, in order to include non-Cisco vendor solutions.
For our present purposes then, NAC or 802.1x provides user and/or device authentication and authorization. Authorization to get onto the network.
To me, TrustSec or a generic form of it means something along the lines of assignment of VLAN or other segmentation to the user or device. I’m trying here to accommodate the fact that some vendors may be using tunnels back to a policy enforcement device to segment traffic. Which might or might not be performance-limiting – but that’s outside the present focus.
TrustSec/NAC network tools can typically apply various access lists or security policy to the user or machine’s traffic, on the access switch or on some other policy enforcement device. So, they can (to some degree) control which servers, ports, and applications the user or device can send traffic to.
Actually, I suspect that for the foreseeable future, control over use of the application will remain with the application itself, in many cases using Microsoft Active Directory groups to control user activities within the application.
Having groupings that are unique to each application and administered separately for each application seems like a very complex (if not nightmare) scenario. As in unsustainable. I have little data on what organizations do with that, so I’ll change the subject now!
If NAC-centric dynamic VLAN assignment is being used, or tunnels, policy enforcement may be on the switch port or wireless AP, or may be done at some upstream enforcement point, i.e., a firewall or other device.
The challenge for this approach is of course devices that cannot do the 802.1x/NAC authentication, etc. Namely, devices such as printers and IOT sensors, and other networked devices (coffee makers, refrigerators, whatever). This group of devices seems likely to also be the ones you cannot put a security or a Zero Trust agent on.
The answer I’m aware of for this is the one most people know from 802.1x/NAC tools: put such devices into one or more VLANs (etc.) based on device type, obtained via the vendor MAC address OUI or similar (some form of “profiling”).
That’s where having a tool that is good at recognizing OT/IOT devices is important. Cisco ISE’s large canned suite of known device profiles (or its add-on packages, e.g., the medical one) can be helpful for that. I *like* the idea of the switch talking to ISE and ISE in effect saying “that’s a whatchamacallit, put it into the office-devices group and apply the relevant VLAN and ACL to the port”.
I have the impression some of the other NAC solutions can do at least some of that, but I lack detailed knowledge about them. I’ve looked for a couple of non-Cisco vendors’ documentation on the subject and found nothing but very minimal documentation. The problem, of course, being that the software vendor differs from the hardware vendor.
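The “that’s a whatchamacallit” profiling step amounts to a lookup roughly like the following. The OUIs, group names, and VLAN/ACL assignments below are invented for illustration; real profiling (ISE included) also uses richer signals such as DHCP options and CDP/LLDP.

```python
# Hypothetical OUI-to-profile table. The first three bytes of a MAC
# address (the OUI) identify the hardware vendor, which is often enough
# to guess the device type for printers, cameras, and the like.
OUI_PROFILES = {
    "00:1b:a9": ("printer",   {"vlan": 110, "acl": "printers-only"}),
    "3c:2e:f9": ("ip-camera", {"vlan": 120, "acl": "camera-to-nvr"}),
}
DEFAULT = ("unknown", {"vlan": 999, "acl": "quarantine"})

def profile(mac):
    """Map a MAC address to (device type, port policy) via its OUI."""
    oui = mac.lower()[:8]   # "aa:bb:cc" -- the vendor prefix
    return OUI_PROFILES.get(oui, DEFAULT)
```

Note the default: an unrecognized device lands in a quarantine segment rather than getting a permissive policy, which is the safe failure mode for this approach.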
Drilling Down: Zero Trust
On the other hand, we have Zero Trust, which might well have an endpoint-based solution, i.e., an agent on each user’s device and/or servers, possibly doing periodic re-authorization as to what the user is allowed to do.
One potential challenge with Zero Trust agents is actually deploying the agents. Most sites do that as part of a laptop/desktop build or refresh. Something similar is common for corporate cell phones, possibly via the MDM. And this can be a challenge with 802.1x/NAC, especially for obtaining deeper context information. I note in passing Cisco helped a bit by integrating various security functions into their AnyConnect agent.
I’m not expecting much tie-in to within-application authorization. I’d think the situation would be much as with 802.1x: any privilege controls in the application would depend on internal mechanisms tied to internal or MS AD or some grouping mechanism.
For devices with agents, device profiling may be more straightforward, assuming the agent has access to key device attributes.
In the case of BYOD, cell phones, etc. an agent might be available for the user to install and required as a condition for access. That leaves devices that cannot be modified by adding an agent.
In all such cases, the key will be the ease of identifying the device type and then tying device type or profile to security policies.
Zero Trust Implementation
There are several obvious ways a ZT solution might work. One is to impose a policy at the end-user agent. Another would be server-side, perhaps based on the current IP of the user device. However, server-side could well have a gap around any server lacking an agent.
Another would be to use a per-user encrypted or other tunnel between user and server. Overhead and performance might be a concern with this latter approach, especially at the server end. (Encryption on servers consumes valuable CPU cycles.) In either case, central control would be needed to deploy policy. Having the central control point in the actual packet flows would not scale well.
The Gaps
The fun part for agent-based solutions is dealing with the OT/IOT device exceptions that do not support an agent.
If the network is not participating in some way, then the server/application-side agent would have to deal with the exceptions. Except it might have very little information to do so with. At that point, any solution might become very specific to the device and the application.
There’s another possible gap: servers (e.g., mainframes) and devices that you cannot install an agent on. E.g., applications where modifying the VM or build is forbidden (breaches support contract, etc.).
So, for such “problem” devices, either user or server side, it seems like the network-based solutions may come out a bit ahead in our “scoring”!
While on the topic of gaps, how do we know that either approach does not miss some endpoint or endpoint pair?
In the network-based approach, every switch port would be under 802.1x/NAC control. So detecting “leaks” might be more of a matter of vetting ACL rules, perhaps logging permitted traffic. Or flow monitoring and detecting unexpected flow to sensitive servers.
With network “service-chaining,” auditing the ACL rules and what hits them looks to be more complex. That’s where I like physical cabling and knowing in a simple way that the only way traffic gets from A to B is via the firewall. This applies in the cloud, only more so. (Per-virtual function or device routing means in effect more bypass plumbing?)
If a site uses a pure agent-based approach, the network security policy doesn’t provide fallback coverage. So in such a case, care might need to be taken to detect any “agentless” flows, especially when neither endpoint can do enforcement (agentless at both ends, or where the agent enforces only at the other endpoint, i.e. source-only or destination-only).
If the agent-based approach uses VPNs or HTTPS, then that might help you prevent any “agentless” flows. For better or for worse.
Snooping/Flows and Behavioral Analysis
Both approaches seem to provide the potential ability to capture traffic flow data, report it centrally, and do behavioral analysis, including cutting off user/device access – or limiting it to Internet and remediation resources. This is where having agent software that also provides flow data could be helpful.
From the flow perspective, getting device/user flow information depends on something like NetFlow at large scale on the network side. On the agent side, the counterpart is massive flow data reported by the agents.
Either way, you’d need to set up NetFlow (IPFIX, etc.) for the network approach, or get suitable agents on devices on the agent approach. Or both.
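Detecting “unexpected flows to sensitive servers,” whether the records come from NetFlow/IPFIX or from agents, reduces to checking each flow against an expected-flow baseline. A minimal sketch, with hypothetical addresses and baseline:

```python
from ipaddress import ip_address, ip_network

# Hypothetical baseline: which source prefixes are expected to reach each
# sensitive server. Flow records could come from NetFlow/IPFIX exporters
# or from endpoint agents -- the check is the same either way.
SENSITIVE = {
    "10.9.9.10": ["10.1.0.0/16"],   # e.g., payroll server: HR subnet only
}

def unexpected_flows(flows):
    """Return (src, dst) pairs hitting a sensitive server from outside its baseline."""
    alerts = []
    for src, dst in flows:
        expected = SENSITIVE.get(dst)
        if expected is not None and not any(
                ip_address(src) in ip_network(p) for p in expected):
            alerts.append((src, dst))
    return alerts
```

In practice the hard part is not this check but building and maintaining the baseline – which loops back to the ownership question raised earlier.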
Wrapping Up
Well, that was a lot of discussion with some “it depends” scattered throughout.
One conclusion is that you probably want to have monitoring, to detect “leaks.”
Another is that assigning user/device and server groups driving segmentation (and addressing, if needed) and passing traffic through a firewall with group-aware rules gives you hard security as a safety measure.
Whether stateless enforcement suffices for device-to-device traffic is another decision point. Putting risky devices into different segments on the network is one way to force traffic from them to go through a firewall or hard PEP. Doing that with agent-based feels weaker to me, but then if your 802.1x/NAC fails to segment, you’d have comparable exposure.
This is hard stuff, whether a vendor is coming at it from the network / network device side or the application side.
Links
For the networking side of things, the vendor list should be fairly clear: Cisco (and ISE in particular), Juniper, Arista, HP/Aruba, plus the usual firewall vendors.
Here are links to some of the companies I’m aware of in the agent-centric or similar security spaces.