NNSquad - Network Neutrality Squad
[ NNSquad ] Re: How far can we trust SNMP metrics?
- To: Lauren Weinstein <lauren@vortex.com>
- Subject: [ NNSquad ] Re: How far can we trust SNMP metrics?
- From: Warren Kumari <warren@kumari.net>
- Date: Mon, 10 Dec 2007 22:44:05 -0500
- Cc: nnsquad@nnsquad.org
On Dec 10, 2007, at 5:21 PM, Lauren Weinstein wrote:
> In the context of using SNMP for data collection, do we really have
> any idea about the reliability of the stats derived from popular
> routers?
Well, I know for a fact that the SNMP statistics on "carrier class"
routers are fairly accurate; I don't know whether that is true
for all of the consumer broadband devices, though.
Devices whose SNMP accuracy I have tested:
Juniper (M and T series only) -- perfect.
Cisco routers -- basically correct, some models more than others.  
Cisco is somewhat notorious for not counting properly...
Netscreen -- newer code seems to be perfect.
Force10 -- perfect.
Foundry -- basically perfect.
Black Diamond -- perfect.
Many of the distributed architecture boxes suffer from issues where
the data plane sends stats to the control plane (CP) on some sort of
periodic basis -- if you poll before the forwarding engine has pushed
the stats up to the CP, you get funny numbers, but averaged out over
a while you get more sensible answers.
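To illustrate the averaging point, here is a minimal Python sketch. The sample data is invented for illustration (not measured on any real device), the function names are hypothetical, and it assumes a counter that does not wrap during the window. It compares per-interval rates with the rate over the whole window when the counter only advances on every other poll.

```python
# Hypothetical (timestamp_seconds, octet_counter) samples, polled every 10 s.
# The counter is assumed to advance only when the forwarding engine pushes
# stats up to the control plane, here every 20 s.
SAMPLES = [(0, 0), (10, 0), (20, 2_500_000), (30, 2_500_000), (40, 5_000_000)]


def interval_rates_bps(samples):
    """Bits per second between each pair of consecutive samples."""
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        rates.append((c1 - c0) * 8 / (t1 - t0))
    return rates


def window_rate_bps(samples):
    """Bits per second between the first and the last sample."""
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    return (c1 - c0) * 8 / (t1 - t0)


if __name__ == "__main__":
    print("per-interval:", interval_rates_bps(SAMPLES))
    print("whole window:", window_rate_bps(SAMPLES))
```

With this made-up data the per-interval rates alternate between zero and 2 Mbps, while the whole-window rate comes out at the 1 Mbps average.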
I have not really tested the consumer broadband routers, but I have  
access to an Ixia and can do so sometime if needed.
> For example, my servers are under very heavy load right now due to
> interest in the Rogers story.  My own SNMP stats are steadily
> showing both "maximum" and "last" upstream values that significantly
> exceed the provisioned ceilings on the circuit.  This suggests that
> I'm seeing calculation/buffering artifacts, and these could be
> matters of significant concern affecting the use of SNMP metrics for
> the project.
A few questions:
Is this snmpd running locally on the box? Or are you polling a  
device? If it is local snmpd, what distribution, version, etc? If you  
are polling a device, what is it?
What OIDs are you monitoring? (ifHCInOctets?)
SNMP V1 or V2c?
How often are you polling? If you have a 100Mbps Ethernet that is
pegged, you will wrap a 32-bit counter in a little under 6 minutes --
unless you poll more often than that, you will get very odd numbers (a
1Gbps link will wrap in around 34 seconds!)
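The wrap times quoted above follow from simple arithmetic: a 32-bit octet counter holds 2^32 bytes, and a saturated link moves link_bps / 8 bytes per second. A minimal Python sketch of that calculation (the function name is made up for illustration, and it assumes the link runs at exactly line rate with no overhead accounted for):

```python
def counter_wrap_seconds(link_bps, counter_bits=32):
    """Seconds for an octet counter to wrap on a link running at full rate."""
    octets_per_second = link_bps / 8
    return 2 ** counter_bits / octets_per_second


if __name__ == "__main__":
    for name, bps in (("100Mbps", 100e6), ("1Gbps", 1e9)):
        secs = counter_wrap_seconds(bps)
        print(f"{name}: 32-bit counter wraps in {secs:.1f} s")
```

This gives roughly 343.6 seconds (about 5.7 minutes) at 100Mbps and roughly 34.4 seconds at 1Gbps. Passing counter_bits=64 shows why the 64-bit ifHC counters are not affected in practice.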
What software are you using for the polling / traffic calculations?
Are you sure the software is aware of the actual polling interval (if
you have separate poller and stats processes)?
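As a rough sketch of why the interval matters, a rate calculation might look like the following Python. The function name and arguments are hypothetical, not taken from any particular polling package. It divides by the measured time between the two samples rather than a nominal interval, and it corrects for at most one counter wrap; multiple wraps between polls, or a counter reset after a reboot, cannot be detected this way.

```python
def rate_bps(prev_octets, prev_time, cur_octets, cur_time, counter_bits=32):
    """Average bits per second between two polls of an octet counter.

    Times are in seconds. Assumes the counter wrapped at most once
    between the two polls and was not reset.
    """
    elapsed = cur_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    # Modular subtraction undoes a single wrap of the counter.
    delta = (cur_octets - prev_octets) % (2 ** counter_bits)
    return delta * 8 / elapsed


if __name__ == "__main__":
    # Counter wrapped between polls: 296 octets before the wrap, 704 after.
    print(rate_bps(4294967000, 0.0, 704, 300.0))
```

If the stats process assumes, say, a 300-second interval while the poller actually ran 330 seconds apart, the same delta yields an inflated rate, which is one way to end up with values above the provisioned ceiling.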
And last question -- are you certain that the provisioned limit is  
correct? I have seen people who have purchased a rate-limited service  
(eg: 20Mbps on a DS3) but then discovered that the provider  
incorrectly set the rate-limiter / forgot to rate-limit...
I'll see about hooking some consumer devices up to an Ixia or  
SmartBits and validating the counters sometime.
W
> Comments?  Thanks.
> --Lauren--
> NNSquad Moderator