I’m sure that most of you have some sort of network monitoring system in place and that you’ve looked at utilization of your important links. I was looking at a set of links recently that had some common characteristics as well as one unexpected attribute. The utilization charts below are for Gigabit interfaces in the core of a network. In the first image, you’ll note the common work-day traffic signature that looks like a double peak of 24-32Mbps, with the valley between the peaks occurring around lunch time. No such signature occurs on Saturday and Sunday.
The second thing you should note is the evening automated backups and other back-office workload that occurs every day, starting in the evening and running into the early morning hours of the next day. Depending on the extent of the data being copied, the evening traffic load may be even higher than the daily workload. In fact, at startup, the evening network traffic regularly peaks at around 30Mbps before dropping back to 16Mbps. I’ll bet that this traffic is primarily one or more backups and that both the sending and receiving systems have free memory available to buffer the data, resulting in higher initial network demand. But after the process has run a short time, the physical I/O constraints of the disk systems lower the overall throughput to 16Mbps (about 2MB/s).
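For readers who want to check the arithmetic behind those rates, here is a minimal Python sketch. It assumes 8 bits per byte and ignores protocol overhead, and the function names are mine, not part of any monitoring tool.

```python
def mbps_to_megabytes_per_sec(mbps: float) -> float:
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbps / 8.0


def utilization_percent(mbps: float, link_mbps: float = 1000.0) -> float:
    """Express a traffic rate as a percentage of link capacity (default: Gigabit)."""
    return 100.0 * mbps / link_mbps


# The initial burst and the sustained backup rate read off the first chart.
for rate in (30, 16):
    print(f"{rate} Mbps = {mbps_to_megabytes_per_sec(rate):.2f} MB/s, "
          f"{utilization_percent(rate):.1f}% of a 1 Gbps link")
```

Either way, the backup traffic uses only a few percent of a Gigabit interface.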
The next utilization chart is more interesting because it doesn’t have the same clear signatures during the daytime. You have to look more closely to see the night-time network traffic, which occurs after midnight each day. You also have to be careful in examining this chart because the magnitude of the traffic is higher. What is interesting to me about this chart is the extent of the peaks and the plateau of overall traffic at around 100Mbps on each Sunday. While this link is 1Gbps, the sites which are using it are connected by 100Mbps circuits.
Without flow data to correlate with the traffic utilization graphs, it is difficult to tell what traffic is traversing the link and whether higher peaks should be expected. If there is primarily one site performing backups or some other activity, then I would be satisfied in accepting that the peak traffic would be around 100Mbps. But if more than one site were attempting to use the link, and I expected the overall traffic load to be greater than 100Mbps, then I’d want to start further investigations. Is something else keeping the load at 100Mbps? Why am I not seeing the characteristic initial traffic burst that precedes the longer-term flow, like we observed for the evening transfers in the first graph?
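One rough way to flag this kind of plateau automatically is to ask whether the utilization samples cluster just under a suspected ceiling without ever bursting past it. The Python sketch below is only an illustration: the sample values are made up, and the 5% tolerance and 50% threshold are assumptions you would tune for your own polling data.

```python
def looks_capped(samples_mbps, cap_mbps=100.0, tolerance=0.05, min_fraction=0.5):
    """Return True when traffic sits near cap_mbps and never bursts above it.

    tolerance is the width of the band around the cap, as a fraction of the cap.
    min_fraction is the share of samples that must fall inside that band.
    """
    if not samples_mbps:
        return False
    band = cap_mbps * tolerance
    near_cap = sum(1 for s in samples_mbps if abs(s - cap_mbps) <= band)
    exceeds_cap = any(s > cap_mbps + band for s in samples_mbps)
    return not exceeds_cap and near_cap / len(samples_mbps) >= min_fraction


# Hypothetical samples in Mbps, not data taken from the charts.
sunday = [97, 99, 100, 98, 101, 99, 60, 100]
weekday = [24, 30, 18, 32, 27, 16, 12, 20]

print(looks_capped(sunday))   # plateau at the suspected 100 Mbps ceiling
print(looks_capped(weekday))  # ordinary work-day traffic
```

A result of True does not say why the traffic is flat; it only marks the interval as worth a closer look.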
My concern is that something in the network is limiting the throughput to 100Mbps. If I had flow data available, I’d be able to determine which were the top sources and destinations and conduct further investigations.
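If flow records were available, finding the top talkers would be a simple aggregation. The sketch below assumes each record has already been exported and reduced to a (source, destination, bytes) tuple; that record format and the sample addresses are hypothetical, not the output of any particular flow collector.

```python
from collections import Counter


def top_talkers(flows, n=3):
    """Sum bytes per (source, destination) pair and return the n largest."""
    bytes_by_pair = Counter()
    for src, dst, nbytes in flows:
        bytes_by_pair[(src, dst)] += nbytes
    return bytes_by_pair.most_common(n)


# Hypothetical flow records using documentation address ranges.
flows = [
    ("192.0.2.10", "198.51.100.5", 8_000_000_000),
    ("192.0.2.11", "198.51.100.5", 1_200_000_000),
    ("192.0.2.10", "198.51.100.5", 4_000_000_000),
    ("192.0.2.12", "198.51.100.9", 300_000_000),
]

for (src, dst), nbytes in top_talkers(flows, n=2):
    print(f"{src} -> {dst}: {nbytes / 1e9:.1f} GB")
```

If a single pair accounts for nearly all of the bytes during the plateau, a 100Mbps ceiling is what you would expect from one site on a 100Mbps circuit. Several busy pairs that still add up to only 100Mbps would point toward a bottleneck somewhere in the path.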
Analysis like this is useful when validating a Service Level Agreement (SLA). If a Gigabit path was contracted, but for some other reason is restricted to 100Mbps, then I’d want to know about it. Perhaps the carrier has an internal limitation? Or, more likely, the application just can’t run any faster than 100Mbps throughput and the link has plenty of reserve capacity. You can’t tell from these charts which case applies without correlating other data.
-Terry
_____________________________________________________________________________________________
Re-posted with Permission
NetCraftsmen would like to acknowledge Infoblox for their permission to re-post this article which originally appeared in the Applied Infrastructure blog under http://www.infoblox.com/en/communities/blogs.html