
CPU temperature probe problem

Posted: Fri Apr 01, 2011 1:04 pm
by Cue
I have made a CPU temperature probe that seems to work but is always down (Status: down).
if(oid("1.3.6.1.4.1.232.6.2.6.8.1.4.0.2")<70, "Temp is under 70°", "TEMP is over 70°")
When I change the heat value, the correct message is displayed, so I assume the information is being read correctly. This OID also works in a label.

Type: Function
Agent: default
Available: (oid("1.3.6.1.4.1.232.6.2.6.8.1.4.0.2")0,1,0)
Error: if(oid("1.3.6.1.4.1.232.6.2.6.8.1.4.0.2")<70, "Temp is under 70°", "TEMP is over 70°")
Value: 1.3.6.1.4.1.232.6.2.6.8.1.4.0.2
Unit: C

I'm monitoring the CPU heat value on an HP ProLiant ML350 G4.

Thank you.

Re: CPU temperature probe problem

Posted: Fri Apr 01, 2011 2:08 pm
by gsandul
I have made a CPU temperature probe that seems to work but is always down (Status: down).
The probe is in the UP state only when the "Error:" string is empty.
So in your case the Error field should be:
if(oid("1.3.6.1.4.1.232.6.2.6.8.1.4.0.2")<70, "", "TEMP is over 70°")
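The rule above can be sketched in Python. This is only a model of the behaviour described in this thread, not The Dude's own code; the names `error_field` and `probe_is_up` are made up for illustration.

```python
def error_field(temp_c: float, limit_c: int = 70) -> str:
    """Mimic if(oid(...) < 70, "", "TEMP is over 70°").

    Like the original expression, a reading exactly at the limit
    also produces the "over" message.
    """
    if temp_c < limit_c:
        return ""  # empty string means "no error"
    return f"TEMP is over {limit_c}°"


def probe_is_up(error_text: str) -> bool:
    """The probe counts as up only when the Error result is empty."""
    return error_text == ""


if __name__ == "__main__":
    for reading in (55, 75):
        message = error_field(reading)
        print(reading, "up" if probe_is_up(message) else f"down: {message}")
```

This also shows why the first version stayed down: "Temp is under 70°" is a non-empty string, so it is treated as an error even though the reading is fine.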

Re: CPU temperature probe problem

Posted: Fri Apr 01, 2011 2:54 pm
by Cue
It's still down, but I get "not available" in the problem row.
However, if I lower the heat value, I get the correct message in the problem row ("TEMP is over 70°"), but the probe is still down...

Odd.

Re: CPU temperature probe problem

Posted: Fri Apr 01, 2011 3:34 pm
by gsandul
It's still down, but I get "not available" in the problem row.
"not available" means the "not available" condition was detected.
So you have to change the condition under which the probe counts as available.
In your case the "Available:" condition should be:
if (array_size(oid_column("1.3.6.1.4.1.232.6.2.6.8.1.4.0"))>0,1,0)
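As a rough Python model of how the two fields appear to interact in this thread (an assumption based on the posts, not The Dude's implementation): availability is checked first, and only then is the Error result evaluated. Here `oid_column` is modelled as "every value under that column prefix", `walk` is a hypothetical dictionary of SNMP walk results, and `probe_status` is an illustrative name.

```python
from typing import Dict, List

COLUMN = "1.3.6.1.4.1.232.6.2.6.8.1.4.0"


def oid_column(walk: Dict[str, int], column: str) -> List[int]:
    """Return the values of every instance below the given column OID."""
    prefix = column + "."
    return [value for oid, value in walk.items() if oid.startswith(prefix)]


def probe_status(walk: Dict[str, int], instance: str = "2", limit_c: int = 70) -> str:
    # Available: if(array_size(oid_column(...)) > 0, 1, 0)
    if len(oid_column(walk, COLUMN)) == 0:
        return "not available"
    # Sketch assumption: the requested instance exists when the column does.
    temp_c = walk[f"{COLUMN}.{instance}"]
    # Error: if(oid(...) < 70, "", "TEMP is over 70°")
    error = "" if temp_c < limit_c else f"TEMP is over {limit_c}°"
    return "up" if error == "" else f"down: {error}"


if __name__ == "__main__":
    sample = {f"{COLUMN}.1": 40, f"{COLUMN}.2": 55}  # made-up readings
    print(probe_status(sample))
    print(probe_status({}))
```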

Re: CPU temperature probe problem

Posted: Fri Apr 01, 2011 4:36 pm
by Cue
You are a genius, thank you very much. :D

How you learn all these commands in The Dude is utterly beyond me.

Re: CPU temperature probe problem

Posted: Wed Mar 07, 2012 3:12 pm
by CypherBit
I know I'm reviving an old post, but it directly applies to what I'm trying to do.

I'm very, very new to SNMP and The Dude. I have a couple of HP servers whose temperature I want to monitor, but I am failing to do so.

Whenever I use snmpwalk, be it when the temperature is low (around 50° Celsius) or high (about 70°), I get the same results:
[Image: screenshot of the snmpwalk results]

So the probe suggested here does not work; the temperature always shows up as normal. The MIB is cpqhlth.mib.

Any suggestions?

Re: CPU temperature probe problem

Posted: Wed Mar 07, 2012 10:12 pm
by lebowski
Not sure I understand the trouble... Anyhow, you say the SNMP walk shows you some values. Sure, 1.3.6.1.4.1.232.6.2.6.8.1.4.x has 7 different values in your case. If you want to test the first value, you need to specify it exactly: 1.3.6.1.4.1.232.6.2.6.8.1.4.0.1, if that is the exact OID you are interested in (the picture is a little blurry).

Then, to get better at The Dude, put the value you want to track on a device label before you make a probe out of it. This way you know you are reading the SNMP value from the server correctly. Put the line below in the Appearance, Label setting of your HP device:
Temp1:[oid("1.3.6.1.4.1.232.6.2.6.8.1.4.0.1")]
then add
Temp2:[oid("1.3.6.1.4.1.232.6.2.6.8.1.4.0.2")]

Do both temperatures show up correctly? How would you use that in a probe?
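To illustrate what the label check is doing, here is a small Python sketch. It assumes that a bracketed `[oid("...")]` expression in a label is replaced by the value read for that OID; `render_label`, the `values` dictionary, and the "?" placeholder for a missing value are all illustrative assumptions, not The Dude's behaviour.

```python
import re
from typing import Dict

# Matches a bracketed expression such as [oid("1.3.6.1.4.1.232.6.2.6.8.1.4.0.1")]
OID_CALL = re.compile(r'\[oid\("([0-9.]+)"\)\]')


def render_label(template: str, values: Dict[str, int]) -> str:
    """Replace each [oid("...")] with the reading for that exact OID."""
    def substitute(match: "re.Match[str]") -> str:
        # "?" is this sketch's marker for an OID with no reading.
        return str(values.get(match.group(1), "?"))
    return OID_CALL.sub(substitute, template)


if __name__ == "__main__":
    readings = {"1.3.6.1.4.1.232.6.2.6.8.1.4.0.1": 41}  # made-up reading
    print(render_label('Temp1:[oid("1.3.6.1.4.1.232.6.2.6.8.1.4.0.1")]', readings))
    print(render_label('Temp2:[oid("1.3.6.1.4.1.232.6.2.6.8.1.4.0.2")]', readings))
```

The point of the exercise is that each label names one full instance OID; once the label shows a sensible number, the same OID can go into the probe's Error and Value fields.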

Re: CPU temperature probe problem

Posted: Wed Mar 07, 2012 11:02 pm
by CypherBit
lebowski, thank you for your quick reply.

Everything I've read so far (even for other software) seems to indicate that one has to use 1.3.6.1.4.1.232.6.2.6.8.1.4.x to monitor temperature on HP servers. The thing is the values that I get through SNMP are not correct. They don't indicate the real temperatures and never change even if I stress test the CPU and the temperature goes way up.

I have the newest MIB from HP. What could be causing this, and what steps can I take to fix it?


PS: thank you for the device label value trick (before making a probe).

Re: CPU temperature probe problem

Posted: Thu Mar 08, 2012 12:51 am
by lebowski
OH! OK, the SNMP engine running on your server must be able to monitor the actual hardware. For example, Windows doesn't automatically monitor an ASUS motherboard temperature. Although there has been a lot of progress in standardization, you might need to install an ASUS-specific SNMP engine in Windows to correctly populate the values.

So in your case, see if HP provides an SNMP add-on for your operating system and your hardware; there should be one.

Let us know how it goes...
Lebowski

Re: CPU temperature probe problem

Posted: Fri Mar 09, 2012 10:39 am
by CypherBit
Thank you for your assistance. I managed to solve it. There were two things I was doing wrong:

- not waiting long enough for the temperatures to really change,
- using the wrong OID. Different generations of the DL360 (G5 and G6 in my case) use a different OID for the ambient temperature.

All is good now :)
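For the first mistake above, one way to avoid being fooled is to take several readings spread over time and check whether any of them differ. A minimal Python sketch follows; `read_temp` is a hypothetical callable standing in for whatever actually fetches the SNMP value, and the sample count and interval are arbitrary.

```python
import time
from typing import Callable, List


def collect_samples(read_temp: Callable[[], int], count: int, interval_s: float) -> List[int]:
    """Take `count` readings, sleeping `interval_s` seconds between them."""
    samples = []
    for i in range(count):
        samples.append(read_temp())
        if i < count - 1:
            time.sleep(interval_s)
    return samples


def reading_changed(samples: List[int]) -> bool:
    """True if at least two samples differ."""
    return len(set(samples)) > 1


if __name__ == "__main__":
    fake = iter([50, 50, 52])  # stand-in data, not real measurements
    taken = collect_samples(lambda: next(fake), count=3, interval_s=0)
    print(taken, reading_changed(taken))
```

If the readings never change over a long enough window while the server is under load, the OID is probably not the sensor you think it is.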

Re: CPU temperature probe problem

Posted: Fri Mar 09, 2012 4:35 pm
by lebowski
Good to hear, always curious about things! In hindsight it was obvious that you didn't need any add-on programs, since the values were being read...