Peter Welcher

Exposing Hidden Costs in Network Operations

Many IT shops are facing budgetary challenges: Support more technology with the same (or smaller) budget. What happens? This is where you enter the Twilight Zone (cue eerie theme music) of hidden costs in network operations. Let’s take a look at some common things we at NetCraftsmen see in many networks.

Problem: When funding is tight, your staff can get spread thinner and thinner. One consequence is that the workload leads to stress, and people quit.

Hidden costs: Replacing lost employees costs more. Salaries are going up, and the number of skilled network engineers with experience is not growing as quickly. You may well end up with someone less skilled, after burning time you don’t have on many interviews. And then you have to bring the new employee up to speed on your network …

Solution: Don’t burn out your staff. And do say “thank you” and provide other forms of positive feedback — “well done” certificates of appreciation, cash cards, desk toys, or whatever works.

Problem: If you are under-staffed, the result can be rushed planning, sloppy rollback plans, too little time for testing, and deployment of untested changes.

Hidden costs: Mistakes, rework, and worse: outages.

Solution: Don’t ask for too much, and do create an environment in which staff can safely report that they didn’t have time for adequate prep. Forcing staff to hit deadlines, without providing the necessary preparation time and resources, will result in cut corners and outages.

Problem: Person A is working on a project but has to wait for Person B to do something before A can continue. Meanwhile, Person B is waiting for Person C to do something, but C is waiting for A, perhaps on some other project. If everyone is too busy, then every task gets queued for processing by every person involved. This is a deadlock: a circular wait in which nobody can proceed.
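The circular wait described above can be spotted mechanically if you write down who is blocked on whom. The following is only a minimal sketch: the `find_wait_cycle` helper and the names in `waits_on` are hypothetical, and it assumes each person is blocked on at most one other person.

```python
def find_wait_cycle(waits_on):
    """Return one circular wait as a list of names, or None if there is none.

    waits_on maps each person to the single person they are blocked on
    (a simplifying assumption for this sketch).
    """
    for start in waits_on:
        path = []
        seen = set()
        current = start
        # Follow the chain of "is waiting on" until it ends or repeats.
        while current in waits_on and current not in seen:
            seen.add(current)
            path.append(current)
            current = waits_on[current]
        if current in seen:
            # The chain looped back on itself: report the loop portion.
            return path[path.index(current):]
    return None


# Hypothetical data matching the A/B/C scenario above.
waits_on = {"A": "B", "B": "C", "C": "A"}
print(find_wait_cycle(waits_on))
```

Breaking any one link in the reported cycle, for example by making C's task for B the top priority, lets the rest of the chain move again.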

Hidden cost: Projects take forever to be completed.

Solution: Establish clear priorities, with everyone on the same page as to the highest priority projects.

Problem: A slight variant of this occurs when multi-person projects with rushed planning get bottlenecked on missed steps. Similarly, rushed equipment orders may need follow-up orders, as components or optics inadvertently got left out of the bill of materials. If your procurement group is slow, this just made their queuing delay greater – and it sure held up your project.

Hidden cost: Rework, delay.

Bonus hidden cost: You just doubled the project delay due to the procurement cycle. If your procurement department is slow (are there any that are not?), that’s particularly painful.

Solution: Make sure that effective planning is a clear priority (without overdoing it). It is better to take a little time up front to make sure everything is accounted for than to suffer big delays later. Your staff needs to understand that.

Problem: An overly busy staff is the equivalent of a computer that is thrashing. It gets so busy switching tasks, and swapping virtual memory off slow disk drives, that little useful work gets done. About all you can do in response is to reduce the CPU and memory workload. The same goes for network or server staff.
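To make the thrashing analogy concrete, here is a rough sketch of how task switching eats into useful work. The model and its numbers are assumptions for illustration only: every block of focused work is followed by one context switch of fixed cost, and the 15-minute switch cost is a made-up figure, not a measurement.

```python
def useful_fraction(work_block_minutes, switch_cost_minutes):
    """Fraction of time spent on real work when every block of work
    is followed by one context switch of fixed cost."""
    return work_block_minutes / (work_block_minutes + switch_cost_minutes)


# Hypothetical switch cost: 15 minutes to get back up to speed on a task.
SWITCH_COST = 15

for block in (120, 30, 10):
    share = useful_fraction(block, SWITCH_COST)
    print(f"{block:>3}-minute work blocks: {share:.0%} useful work")
```

Under these assumptions, the shorter the uninterrupted blocks, the larger the share of the day lost to switching, which is the human version of a thrashing computer.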

Managerial hidden cost: You get ulcers trying to make it all work.

Solution: Prioritize. And don’t take on more than can realistically get accomplished.

Problem: Have you seen documentation actually getting updated before or as changes are made? Good documentation, diagrams, and a set of core tables of information represent cached information that took time to develop. If staff keeps putting the documentation/diagram updates off due to being “too busy,” the consequences can be serious.

Hidden cost: New staff and consultants will require verbal documentation, which ties up existing and new staff as they must continually explain the basics of how various aspects of the network work.

Bonus hidden cost: Someone quits or gets hit by a truck, and nobody else knows how that part of the network works, or how to configure some devices.

Solution: Documentation and diagrams that are up to date can at least provide solid clues as to how something is supposed to work. This is particularly important when the “something” is either complicated or not obvious. Even better, document changes before they happen. Doing so can help staff spot hidden complexities or problems in proposed changes.

Problem: Diagrams that have become obsolete (see also above).

Hidden cost: When networking professionals troubleshoot, they connect to devices, use CDP, and develop a pencil sketch to diagram the part of the network they’re troubleshooting. The next time something happens, they do it all over again. This can become a vicious cycle.

Very hidden cost: Your Mean Time to Repair (MTTR) is far longer due to the rework. More downtime!

Solution: Selected documentation and diagrams are pure gold, and must be maintained. Help staff understand priorities regarding what must be kept up to date — as opposed to lower priority, “nice to have” items. If those pencil sketches are still happening, have staff take an extra 30-60 minutes to re-sketch cleanly, and take a picture of the sketch. It might just save time, next time around – or help someone later develop a solid Visio diagram.

Problem: Network management platforms don’t get maintained as carefully as they should.

Hidden cost: Murphy’s Law says the router that just died is the one that the NM tool was unable to capture a configuration for, because nobody noticed and fixed it. Or the interface you need capacity or other data on, or that is acting up, was never “managed,” so you have no historical baseline.

Solution 1: Implement scheduled verification that network management tool maintenance is getting done, and that new devices and interfaces are managed by the tools, etc. (Be sure to avoid “tool-itis”— having too many tools, which consequently are poorly maintained, so nobody uses them.) It is useful to analyze the set of network management tools. We recommend tools based on critical functions and number of staff. Stick to the base set of tools, and make sure everyone is trained in using and maintaining them. Spreading the maintenance load around helps, particularly when the designated “owner” is too busy.
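One way to approach the scheduled verification mentioned above is a small script that compares your device inventory against what the tools actually manage. This is a minimal sketch under stated assumptions: the `coverage_gaps` function, the device names, the dates, and the 7-day freshness threshold are all hypothetical, and in practice the three inputs would come from exports of your own inventory and network management tools.

```python
from datetime import date, timedelta


def coverage_gaps(inventory, managed, last_backup, today, max_age_days=7):
    """Return (unmanaged, stale) device lists.

    unmanaged: devices in the inventory that the NM tool does not manage.
    stale: devices with no config backup, or one older than max_age_days.
    """
    unmanaged = sorted(set(inventory) - set(managed))
    cutoff = today - timedelta(days=max_age_days)
    stale = sorted(
        device
        for device in inventory
        if last_backup.get(device) is None or last_backup[device] < cutoff
    )
    return unmanaged, stale


# Hypothetical data for illustration.
inventory = ["core-rtr-1", "dist-sw-1", "edge-rtr-2"]
managed = ["core-rtr-1", "dist-sw-1"]
last_backup = {
    "core-rtr-1": date(2024, 1, 10),
    "dist-sw-1": date(2023, 11, 1),
}

unmanaged, stale = coverage_gaps(inventory, managed, last_backup, today=date(2024, 1, 12))
print("Not managed:", unmanaged)
print("No recent config backup:", stale)
```

Run on a schedule, a report like this surfaces the router with no captured configuration before Murphy's Law does.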

Solution 2: Outsource maintenance of network management tools.

Problem: Staff doesn’t have time to keep up with new Cisco/vendor features.

Hidden cost: Staff implements something, and then discovers a two-year-old feature that would have made it easier and faster. Ensuing choice: Rework the project, or stick with what’s already deployed.

Solution: Manage skills (new product awareness, new technology awareness, and deeper skill sets). Some time must be put into developing skills, but figuring out just how much time requires a balancing act. (A related management task is ensuring you have double coverage of each technical skill, so that there is no “single staff point of failure.”) If/when Murphy’s Law kicks in, you don’t want to find out your <whatever> specialist is abroad.
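The double-coverage check can be as simple as a skills matrix and a few lines of code. The sketch below is illustrative only: the `single_points_of_failure` helper, the staff names, and the skills are hypothetical, and the minimum of two people per skill simply mirrors the "double coverage" idea above.

```python
from collections import defaultdict


def single_points_of_failure(staff_skills, minimum=2):
    """Return skills held by fewer than `minimum` people, with their holders."""
    holders = defaultdict(set)
    for person, skills in staff_skills.items():
        for skill in skills:
            holders[skill].add(person)
    return {
        skill: sorted(people)
        for skill, people in holders.items()
        if len(people) < minimum
    }


# Hypothetical skills matrix.
staff_skills = {
    "Alice": {"BGP", "firewalls"},
    "Bob": {"BGP", "wireless"},
    "Chen": {"firewalls"},
}

print(single_points_of_failure(staff_skills))
```

Any skill that shows up in the output is a candidate for cross-training before the one person who knows it goes on vacation.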

Problem: Changing gears slightly, there can be problems at play other than understaffing. Staff being insufficiently skilled or experienced can cause similar symptoms, as can inefficient/under-productive staff. Identifying either condition might not be easy.

A related problem: A poor or overly complex design can really exacerbate time demands. The complexity just slows everything down, indirectly making staff less productive. Most training focuses on how to configure devices, not on good design. The Cisco CCDA and CCDP certifications (and related courses and books) focus on design skills. Those skills, plus experience, plus some common sense, are prerequisites for good design.

Vendors come up with new technology. Their marketing emphasizes the benefits and value. But just because the technology is there, cool, etc., does not mean you have to deploy it. First consider the pros and cons, especially any increment in the complexity level.

Solution: A NetCraftsmen network assessment can identify problem areas and suggest design improvements, or configuration/stability improvements. We can also talk to staff and suggest training and skills improvements. Staff efficiency is harder to pin down.

Solution 2: Use NetCraftsmen for strategic planning and periodic design/migration planning review.

Well, now that we’ve commiserated about your woes, how about a positive angle? What else can be done about the problems we’ve identified here?

Possible solutions:

  • Hire more staff.
  • If your team is thrashing, shed/defer some tasks and re-prioritize.
  • Give team members clear priorities to ensure the most important things get done. Some progress goes a long way toward making everyone happier and less stressed.
  • Use NetCraftsmen to help you hire more staff. Staff with stronger skills may cost more but may get the job done more efficiently.
  • Leverage our design or managed services offerings to supplement staff skills (especially in network design) or to offload tasks.
  • Outsource some tasks, services, etc.

The book The Phoenix Project is interesting reading concerning IT operations. We like the term “accrued technical debt.” Generally speaking, every time a project is rushed and corners are cut, it is like going into debt. Hidden cost: You end up paying “interest” on that debt, in the form of unplanned downtime and maintenance. The book also has some interesting observations about queuing and resource bottlenecks, and how they affect time to completion. If you cut enough corners, staff will not have any time available for projects!
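The queuing effect can be illustrated with a common rule of thumb: the wait for a busy resource grows roughly in proportion to the fraction of time it is busy divided by the fraction of time it is idle. The sketch below is a simplification under that assumption, not an exact model of any real team, and the `relative_wait` function name is ours.

```python
def relative_wait(utilization):
    """Relative wait time for a resource, using busy / idle as a rule of thumb.

    utilization is the fraction of time the person or team is busy (0 to <1).
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be at least 0 and less than 1")
    return utilization / (1 - utilization)


for busy in (0.50, 0.80, 0.90, 0.95):
    print(f"{busy:.0%} busy -> relative wait {relative_wait(busy):.1f}")
```

The point of the exercise is the shape of the curve: as a person or team approaches 100% busy, the wait for anything queued behind them climbs steeply rather than gradually.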

Are there costs lurking within your own network operations that you may not be aware of? Contact us to start a conversation about how to find out.


Comments are welcome, whether in agreement or informative disagreement with the above. Thanks in advance!

Hashtags: #HiddenCosts, #NetworkOperations, #NetworkInfrastructure, #NetOps

Twitter: @pjwelcher

Peter Welcher

Architect, Operations Technical Advisor

A principal consultant with broad knowledge and experience in high-end routing and network design, as well as data centers, Pete has provided design advice and done assessments of a wide variety of networks. CCIE #1773, CCDP, CCSI (#94014)
