Diagnosing the “ipOutNoRoutes” Counter

Author
Terry Slattery
Principal Architect

Any device that implements the IP MIB has an interesting SNMP counter called ipOutNoRoutes, which has the following definition:

ipOutNoRoutes OBJECT-TYPE
  SYNTAX Counter
  ACCESS read-only
  STATUS mandatory
  DESCRIPTION
    "The number of IP datagrams discarded because no route could
    be found to transmit them to their destination.  Note that
    this counter includes any packets counted in ipForwDatagrams
    which meet this `no-route' criterion.  Note that this
    includes any datagrams which a host cannot route because all
    of its default routers are down."

NetMRI tracks this counter and reports when a router or L3 switch shows a high count over a 24-hour period.
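If you don’t have NetMRI handy, the counter is easy to poll yourself. Below is a minimal sketch using the classic synchronous pysnmp API; the router address, the SNMPv2c community string, and the five-minute sample interval are assumptions for illustration. ipOutNoRoutes is object 12 in the ip group, so the instance OID is 1.3.6.1.2.1.4.12.0.

# Minimal sketch: poll ipOutNoRoutes twice and report the delta.
# Assumptions for illustration: SNMPv2c with community 'public', a router
# at 192.0.2.1, and the classic synchronous pysnmp hlapi.
import time
from pysnmp.hlapi import (
    getCmd, SnmpEngine, CommunityData, UdpTransportTarget,
    ContextData, ObjectType, ObjectIdentity,
)

IP_OUT_NO_ROUTES = '1.3.6.1.2.1.4.12.0'   # IP-MIB::ipOutNoRoutes.0

def poll(host, community='public'):
    """Return the current ipOutNoRoutes value as an int."""
    error_indication, error_status, _, var_binds = next(getCmd(
        SnmpEngine(),
        CommunityData(community),
        UdpTransportTarget((host, 161)),
        ContextData(),
        ObjectType(ObjectIdentity(IP_OUT_NO_ROUTES)),
    ))
    if error_indication or error_status:
        raise RuntimeError(error_indication or error_status.prettyPrint())
    return int(var_binds[0][1])

if __name__ == '__main__':
    host = '192.0.2.1'            # hypothetical router address
    first = poll(host)
    time.sleep(300)               # 5-minute sample; NetMRI looks at 24 hours
    second = poll(host)
    print(f'ipOutNoRoutes increased by {second - first} in 5 minutes')

A production poller would also need to handle the 32-bit counter wrapping back to zero and walk a list of devices rather than a single router.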

I’ve investigated the actions that drive this counter because it sounds like an interesting object to track. The description implies that the counter increments when the router, or the routing function in an L3 switch, cannot forward a packet because a forwarding lookup failed. Fred Baker at Cisco tells me that it is also incremented when an ARP lookup fails. So the counter increments both when an L3 lookup fails and when an L2 resolution fails. That muddies the purpose of the counter, at least to my mind; I would expect L3 counters not to include L2 event counts and vice versa.

I’ve previously described how to improve SNMP MIBs to make it easy for an NMS to report to the network staff which endpoints are causing the problems that a counter like this reports.

Practically speaking, getting the MIB updated, or creating a new MIB and getting it supported in network gear, is not going to happen anytime soon. It typically takes several years to reach agreement on a new MIB. Add to that the years it takes vendors to build the new functionality into their products and roll it out across their entire product lines. So how do we determine the source of the problem when we find an error counter, like ipOutNoRoutes, icmpOutDestUnreachs, or icmpOutTimeExcds, that is exceeding normal thresholds?

NetCraftsmen co-worker Marty Adkins had a good diagnostic suggestion for Cisco devices: ‘debug ip icmp’. His rationale is that ICMP messages are typically very low volume and that enabling debug for ICMP packets would not cause CPU overload on the router. Of course, it helps to use ‘no logging console’ and ‘logging buffered’ to keep the CPU load to a minimum, as described in Cisco’s Important Information on Debug Commands.
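If you would rather not type these commands by hand on every device, the same safeguards and debug can be applied from a script. The sketch below uses the Netmiko library to push the commands from the paragraph above and then pull the buffered log; the device address, credentials, buffer size, and capture window are assumptions for illustration.

# Sketch: apply the safe-debug procedure and capture the buffered log.
# The device details below are hypothetical; adjust them for your network.
import time
from netmiko import ConnectHandler

device = {
    'device_type': 'cisco_ios',
    'host': '192.0.2.1',        # hypothetical router
    'username': 'admin',
    'password': 'secret',
    'secret': 'secret',
}

conn = ConnectHandler(**device)
conn.enable()
# Keep debug output off the console and in the logging buffer, as recommended above.
conn.send_config_set(['no logging console', 'logging buffered 16384'])
conn.send_command('debug ip icmp')
time.sleep(300)                              # assumed five-minute capture window
log_output = conn.send_command('show logging')
conn.send_command('undebug all')             # always turn the debug back off
conn.disconnect()

with open('show_logging.txt', 'w') as f:     # save the capture for later analysis
    f.write(log_output)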

I used Marty’s procedure at a customer site and it works. A router (it could have been an L3 switch) had a high ipOutNoRoutes count each day. I enabled ‘debug ip icmp’ and found that several hosts on an attached segment were attempting to send data to the 10.10.10.255 address, which is the local broadcast address for a 10.10.10.0/24 subnet. But the router interface was configured with 10.10.10.227/28, which is an address in subnet 10.10.10.224/28. The router was not configured with proxy ARP, which is rarely needed or used these days, so its only option was to drop the packets, increment the ipOutNoRoutes counter, and return an ICMP Host Unreachable message. I spotted the problem because 10.10.10.255 is the 10.10.10.0/24 subnet broadcast address. Checking the router interface configuration quickly showed me that there were no interfaces in that subnet and that the interface reporting the bad packets was in the 10.10.10.224/28 subnet. Instant problem identification!
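When there are many offenders, tallying the debug output beats reading it by eye. The sketch below assumes ‘debug ip icmp’ lines roughly of the form “ICMP: dst (10.10.10.255) host unreachable sent to 10.10.10.57”; the exact text varies by IOS version, so treat the regular expression as a starting point to adjust against your own output. It reads the capture file written by the previous sketch.

# Sketch: tally which hosts are triggering ICMP host-unreachable messages,
# based on buffered 'debug ip icmp' output. The log line format is an
# assumption; adjust the regex to match what your IOS version actually prints.
import re
from collections import Counter

LINE_RE = re.compile(
    r'ICMP: dst \((?P<dst>\d+\.\d+\.\d+\.\d+)\) host unreachable '
    r'sent to (?P<src>\d+\.\d+\.\d+\.\d+)'
)

def tally(log_text):
    """Count (source, destination) pairs seen in the debug output."""
    pairs = Counter()
    for match in LINE_RE.finditer(log_text):
        pairs[(match.group('src'), match.group('dst'))] += 1
    return pairs

if __name__ == '__main__':
    with open('show_logging.txt') as f:      # capture file from the previous sketch
        for (src, dst), count in tally(f.read()).most_common(10):
            print(f'{src} -> {dst}: {count} unreachables')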

The misconfigured hosts were not attempting to communicate with hosts in any of the other subnets contained within 10.10.10.0/24. If they had been, it would have created an interesting troubleshooting scenario. Connectivity to the router and to the other members of the 10.10.10.224/28 subnet would have worked. But communication with any other host in any other subnet within the 10.10.10.0/24 range would have failed, because the ARP request would go unanswered, creating a routing black hole for those destinations from the incorrectly configured hosts. Ping tests would confirm which destinations responded and which did not. The correct diagnosis would require someone to spot the incorrect mask configuration on the hosts, and there would not have been any good clues about the source of the problem, such as the failed packets to 10.10.10.255.
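Python’s ipaddress module makes the mismatch easy to see: a host configured with a /24 mask believes every 10.10.10.x address is on its local segment, while the router’s /28 interface covers only 10.10.10.224 through 10.10.10.239. The router address below is the one from the example; the host and destination addresses are assumptions for illustration.

# Illustrate the subnet mask mismatch from the example above.
import ipaddress

host_if = ipaddress.ip_interface('10.10.10.230/24')    # misconfigured host (hypothetical address, /24 mask)
router_if = ipaddress.ip_interface('10.10.10.227/28')  # router interface from the example

print(host_if.network.broadcast_address)    # 10.10.10.255 -- where the bad broadcasts went
print(router_if.network)                    # 10.10.10.224/28
print(router_if.network.broadcast_address)  # 10.10.10.239 -- the real segment broadcast

# A destination in another subnet of 10.10.10.0/24 looks "on-link" to the
# misconfigured host, so the host ARPs for it instead of sending it to the router.
dst = ipaddress.ip_address('10.10.10.50')   # hypothetical destination behind the router
print(dst in host_if.network)               # True  -> host ARPs, gets no reply: black hole
print(dst in router_if.network)             # False -> the router would simply have routed it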

This type of problem would be difficult for most network engineers to solve without more data. I’ve occasionally heard people describe “bad” network segments that they have never been able to get working correctly. These are most often due to some easily explained problem that was never important enough to justify spending the time to determine the true cause. Perhaps you know of similar problems.

-Terry

_____________________________________________________________________________________________

Re-posted with Permission 

NetCraftsmen would like to acknowledge Infoblox for their permission to re-post this article, which originally appeared in the Applied Infrastructure blog at http://www.infoblox.com/en/communities/blogs.html
