Need for Speed
I’ve written in this blog about various reasons for using network automation, but it is time to put them together. Counting down…
7. Performance and SLAs
The first thing that network management does is performance monitoring. It’s conceptually easy but surprisingly challenging, primarily due to differences between vendors, changes in standards (e.g., 32-bit vs 64-bit counters and different SNMP versions), and bugs in vendor implementations. Once those hurdles are cleared, the thousands of interfaces need to be sorted by a variety of criteria (e.g., percent utilization, error rates, broadcasts, etc.). Alerting thresholds on the performance data then need to be defined. The result is a system that alerts you when utilization is high or errors suddenly appear on a link. Doing this task without automation is impossible in any network consisting of more than about 50 routers and switches.
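To make the counter and threshold steps concrete, here is a minimal sketch in Python. It assumes two octet-counter samples have already been collected for each interface; the function names, the sample values, and the 80% threshold are illustrative assumptions, not taken from any particular NMS.

```python
def counter_delta(previous: int, current: int, bits: int = 64) -> int:
    """Increase between two counter samples, tolerating a single wrap.

    A 32-bit counter wraps at 2**32 and a 64-bit counter at 2**64, so the
    difference is taken modulo the counter size.
    """
    return (current - previous) % (1 << bits)


def percent_utilization(prev_octets: int, curr_octets: int,
                        interval_s: float, speed_bps: float,
                        bits: int = 64) -> float:
    """Percent utilization of a link over one polling interval."""
    delta_bits = counter_delta(prev_octets, curr_octets, bits) * 8
    return 100.0 * delta_bits / (interval_s * speed_bps)


def over_threshold(utilization_by_interface: dict, threshold_pct: float) -> list:
    """Interfaces at or above the threshold, busiest first."""
    hits = [(name, pct) for name, pct in utilization_by_interface.items()
            if pct >= threshold_pct]
    return sorted(hits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical samples: interface name -> percent utilization.
    samples = {"Gi0/1": 95.0, "Gi0/2": 12.0, "Gi0/3": 81.5}
    for name, pct in over_threshold(samples, threshold_pct=80.0):
        print(f"ALERT {name}: {pct:.1f}% utilized")
```

The modulo arithmetic only recovers the true delta if the counter wrapped at most once between polls, which is one reason 64-bit counters are preferred on fast links.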
SLAs are another area where automation is required. How else would you monitor the delay, jitter, and packet loss across a network (to pick three common SLA factors)? An automated system is required for performing SLA tests, processing the results, and presenting the reports.
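As a rough illustration of the "processing the results" step, the sketch below summarizes one batch of probe results. It assumes each probe yields a round-trip time in milliseconds, or `None` if the probe was lost. Jitter is computed here as the mean absolute difference between consecutive received samples, which is a simplification chosen for the example rather than the definition any specific SLA tool uses.

```python
from statistics import mean
from typing import Optional, Sequence


def sla_summary(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize delay, jitter, and loss for one batch of probes."""
    received = [rtt for rtt in rtts_ms if rtt is not None]
    sent = len(rtts_ms)

    # Packet loss as a percentage of probes sent.
    loss_pct = 100.0 * (sent - len(received)) / sent if sent else 0.0

    # Average delay over the probes that came back.
    delay_ms = mean(received) if received else None

    # Simplified jitter: mean absolute change between consecutive replies.
    diffs = [abs(b - a) for a, b in zip(received, received[1:])]
    jitter_ms = mean(diffs) if diffs else 0.0

    return {"delay_ms": delay_ms, "jitter_ms": jitter_ms, "loss_pct": loss_pct}


if __name__ == "__main__":
    # Hypothetical probe results; the third probe was lost.
    print(sla_summary([10.0, 12.0, None, 11.0]))
```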
6. Scaling of processes
Many network management processes should be performed regularly to keep a network running smoothly with minimum downtime. But because these processes take a lot of time to perform manually, they are seldom done. With network automation, they can be run regularly, reducing the risk of an unexpected network failure. Of course, the results should be sent to a network administrator, particularly any alerts or exceptions. These processes include:
Reduce operating costs by tracking the inventory of your network devices and paying maintenance only on those devices that are in your network. Know which devices you want to upgrade next in a network refresh by tracking the age of all your devices and the OS loaded on them.
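The two inventory checks described above can be sketched in a few lines of Python. The record layout (serial numbers, an `installed` date, an `os_version` field) and all sample values are assumptions made for illustration; a real inventory would come from the NMS and the maintenance contract.

```python
from datetime import date


def paid_but_not_in_network(contract_serials, discovered_serials) -> list:
    """Serial numbers under maintenance that were not found in the network."""
    return sorted(set(contract_serials) - set(discovered_serials))


def refresh_candidates(devices, limit: int = 5) -> list:
    """Oldest devices first, as candidates for the next network refresh."""
    return sorted(devices, key=lambda device: device["installed"])[:limit]


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    contract = ["SN-A1", "SN-B2", "SN-C3"]
    discovered = ["SN-A1", "SN-C3", "SN-D4"]
    print(paid_but_not_in_network(contract, discovered))

    devices = [
        {"name": "core-1", "installed": date(2015, 3, 1), "os_version": "15.2"},
        {"name": "edge-7", "installed": date(2009, 6, 15), "os_version": "12.4"},
    ]
    for device in refresh_candidates(devices):
        print(device["name"], device["installed"], device["os_version"])
```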
When troubleshooting, an accurate network topology drawing is valuable. Keeping network drawings up to date is a tedious and often neglected task, so when a problem occurs, I typically see people sketching the network topology before they can proceed with the problem diagnosis. The network management system (NMS) collects connectivity information, which can be displayed within the tool or exported to drawing tools (Microsoft has published the Visio XML format).
Topology information is also very valuable for network planning and preventing outages. It allows you to answer questions about uplink oversubscription ratios, verify redundant connections (or the lack thereof), and identify strange topologies that tend to appear in most networks (and that can cause strange behavior or failure modes).
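Two of the planning questions above, uplink oversubscription and missing redundancy, reduce to simple calculations once the topology data is available. The following Python sketch assumes link speeds in Mbps and a list of (device, upstream neighbor) pairs; both data shapes and the device names are hypothetical.

```python
from collections import defaultdict


def oversubscription_ratio(access_speeds_mbps, uplink_speeds_mbps) -> float:
    """Total access bandwidth divided by total uplink bandwidth."""
    total_uplink = sum(uplink_speeds_mbps)
    if total_uplink == 0:
        raise ValueError("device has no uplink bandwidth")
    return sum(access_speeds_mbps) / total_uplink


def single_homed(uplinks) -> list:
    """Devices whose uplinks reach fewer than two distinct upstream neighbors."""
    upstream = defaultdict(set)
    for device, neighbor in uplinks:
        upstream[device].add(neighbor)
    return sorted(dev for dev, neighbors in upstream.items() if len(neighbors) < 2)


if __name__ == "__main__":
    # A 48-port gigabit access switch with two 10G uplinks.
    print(oversubscription_ratio([1000] * 48, [10000, 10000]))

    # Hypothetical uplink list: access-2 has both links to the same neighbor.
    links = [("access-1", "dist-1"), ("access-1", "dist-2"),
             ("access-2", "dist-1"), ("access-2", "dist-1")]
    print(single_homed(links))
```

Counting distinct neighbors rather than links is a deliberate choice: two uplinks to the same upstream device still leave that device as a single point of failure.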
3. Network Analysis
Network analysis is the process of taking all the collected data about a network and performing analysis on that data to identify current and potential problems. The simplest analysis is identifying interfaces running at high utilization. More complex analysis incorporates data from multiple devices, such as determining that a VRRP group only contains one router (the operational data from all routers shows that there is no peer router). The most complex analysis uses multiple sources of data, such as from both configuration files and operational data, exemplified by a duplex mismatch where an interface configuration shows a setting of ‘auto’, the interface’s state is ‘half’, and operational data shows late collisions.
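The VRRP and duplex-mismatch checks just described could be sketched as follows. This assumes the configuration and operational data have already been parsed into simple Python structures; the `InterfaceFacts` fields and the group keys are invented for the example and do not correspond to any vendor's data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class InterfaceFacts:
    name: str
    configured_duplex: str   # from the configuration file: "auto", "full", "half"
    operational_duplex: str  # from the interface's operational state
    late_collisions: int     # from the interface counters


def suspect_duplex_mismatch(facts: InterfaceFacts) -> bool:
    """Flag the pattern: configured 'auto', running 'half', late collisions seen."""
    return (facts.configured_duplex == "auto"
            and facts.operational_duplex == "half"
            and facts.late_collisions > 0)


def single_router_vrrp_groups(memberships) -> list:
    """VRRP groups in which only one router participates.

    memberships is an iterable of (group_key, router_name) pairs gathered
    from the operational data of all routers.
    """
    routers = defaultdict(set)
    for group_key, router in memberships:
        routers[group_key].add(router)
    return sorted(key for key, members in routers.items() if len(members) == 1)


if __name__ == "__main__":
    print(suspect_duplex_mismatch(InterfaceFacts("Fa0/3", "auto", "half", 42)))
    print(single_router_vrrp_groups([
        ("vlan10/vrid1", "r1"), ("vlan10/vrid1", "r2"), ("vlan20/vrid1", "r3"),
    ]))
```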
Other network analysis incorporates data sources like events (syslog or SNMP traps). Most network management systems collect the data but then rely on the network engineer to perform the analysis. Because the network engineer is already busy, this limits what he or she can do, so the analysis often defaults to looking at alerts generated by the interface utilization thresholds. Automating the analysis makes it easy to identify many problems by running the checks that network engineers know should be done but never have the time to perform.
2. Correlation of the above items
The next step in automation is to correlate several of the above items. A good example is to use the topology information to perform higher frequency interface performance polling on any interface where the neighboring device is another infrastructure device. Edge ports can be polled at a much lower frequency.
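A minimal Python sketch of that polling decision is shown below. It assumes the topology data gives, for each interface, the name of the neighboring device (or `None` when nothing was discovered). The 60-second and 900-second intervals are arbitrary illustrative values, not recommendations.

```python
from typing import Dict, Optional, Set


def build_polling_plan(neighbor_by_interface: Dict[str, Optional[str]],
                       infrastructure_devices: Set[str],
                       fast_s: int = 60,
                       slow_s: int = 900) -> Dict[str, int]:
    """Map each interface to a polling interval in seconds.

    Interfaces facing another infrastructure device get the fast interval;
    edge ports and ports with no known neighbor get the slow one.
    """
    plan = {}
    for interface, neighbor in neighbor_by_interface.items():
        faces_infrastructure = neighbor in infrastructure_devices
        plan[interface] = fast_s if faces_infrastructure else slow_s
    return plan


if __name__ == "__main__":
    # Hypothetical topology data for one switch.
    neighbors = {"Te1/1": "dist-1", "Gi0/1": "printer-12", "Gi0/2": None}
    print(build_polling_plan(neighbors, {"dist-1", "dist-2", "core-1"}))
```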
Another example is using the topology information to determine whether a subnet has been allocated multiple times. Similarly, when two subnets overlap but have different masks, topology can tell you whether they are on the same segment (most likely a typo in the configuration) or are two different subnets in different parts of the network.
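The subnet checks above can be sketched with Python's standard `ipaddress` module. The example assumes the topology data has already been reduced to (segment, CIDR) pairs; the segment names and the three finding labels are made up for illustration.

```python
import ipaddress
from itertools import combinations


def classify_overlaps(assignments) -> list:
    """Compare every pair of subnets and classify the ones that overlap.

    assignments is an iterable of (segment, cidr) pairs. Identical subnets
    on the same segment are normal (several routers on one segment) and
    are not reported.
    """
    parsed = [(segment, ipaddress.ip_network(cidr, strict=False))
              for segment, cidr in assignments]
    findings = []
    for (seg_a, net_a), (seg_b, net_b) in combinations(parsed, 2):
        if not net_a.overlaps(net_b):
            continue
        same_segment = seg_a == seg_b
        if net_a == net_b:
            if same_segment:
                continue
            kind = "duplicate allocation"
        elif same_segment:
            kind = "possible mask typo"
        else:
            kind = "overlapping allocation"
        findings.append((f"{seg_a} {net_a}", f"{seg_b} {net_b}", kind))
    return findings


if __name__ == "__main__":
    # Hypothetical data: vlan10 carries two different masks for the same prefix.
    data = [("vlan10", "10.1.1.0/24"), ("vlan10", "10.1.1.0/25"),
            ("vlan20", "10.1.1.0/24")]
    for finding in classify_overlaps(data):
        print(finding)
```

Comparing every pair is quadratic in the number of subnets, which is fine for a sketch; a large network would likely want the subnets sorted or indexed first.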
1. Human error
The biggest and most important reason for network automation is human error. It accounts for at least 40% of network failures (some estimates are as high as 80%). Automation has been shown to help reduce those errors. Updating the configurations of hundreds of routers and switches is not something that should be done manually. Automated mechanisms that verify a proposed change, together with a change control process in which other network engineers validate it, are important for reducing or eliminating silly mistakes.
That’s the list. Networks are big. Networks are complex and are increasing in complexity. Automation is the only hope we have of managing the size and complexity while providing high availability.
NetCraftsmen would like to acknowledge Infoblox for their permission to re-post this article which originally appeared in the Applied Infrastructure blog under http://www.infoblox.com/en/communities/blogs.html