Setting Up SNMP on a Linux Server

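Over the past few days I have started adding several machines at our site to the SNMP server, so SNMP first has to be enabled and configured on each of them. At a minimum, the SNMP server must be able to poll each machine, and each machine must be able to send SNMP traps to the SNMP server.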

First, check the status of snmpd; it should not be running yet:
[root@KHXPROVS1 ~]# service snmpd status
snmpd is stopped
Next, check whether the snmpd service is set to start at boot:
[root@KHXPROVS1 ~]# chkconfig --list|grep snmpd
snmpd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
It is off at every runlevel, so we first use chkconfig to make snmpd start automatically at boot:
[root@KHXPROVS1 ~]# chkconfig --level 345 snmpd on
Then start the snmpd service:
[root@KHXPROVS1 ~]# service snmpd start
Starting snmpd: [ OK ]
Now try querying the host with snmpwalk:
[root@KHXPROVS1 ~]# snmpwalk -v 1 localhost -c public system
Timeout: No Response from localhost
[root@KHXPROVS1 ~]# snmpwalk -v 2c -c public localhost system
Timeout: No Response from localhost
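When both queries time out like this, it is handy to repeat the test for each protocol version after every configuration change. Below is a minimal sketch of such a probe. The names probe_agent and SNMPWALK are made up for this example, and the one-second timeout with no retries (-t 1 -r 0) is only a convenient choice for a quick check.

```shell
#!/bin/sh
# Command used for the query; can be overridden, e.g. with a full path.
SNMPWALK=${SNMPWALK:-snmpwalk}

# probe_agent HOST COMMUNITY
# Walk the "system" subtree once with SNMP v1 and once with v2c,
# and print which versions got an answer.
probe_agent() {
    host=$1
    community=$2
    for ver in 1 2c; do
        if "$SNMPWALK" -v "$ver" -c "$community" -t 1 -r 0 "$host" system >/dev/null 2>&1; then
            echo "v$ver: ok"
        else
            echo "v$ver: no response"
        fi
    done
}
```

For example, probe_agent localhost public prints one line per version.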
For snmpwalk usage, run it with no arguments to print its built-in help (or read the man page):
[root@KHXPROVS1 ~]# snmpwalk
No hostname specified.

Version: 5.1.2

-h, --help display this help message
-H display configuration file directives understood
-v 1|2c|3 specifies SNMP version to use
-V, --version display package version number
SNMP Version 1 or 2c specific
-c COMMUNITY set the community string
SNMP Version 3 specific
-a PROTOCOL set authentication protocol (MD5|SHA)
-A PASSPHRASE set authentication protocol pass phrase
-e ENGINE-ID set security engine ID (e.g. 800000020109840301)
-E ENGINE-ID set context engine ID (e.g. 800000020109840301)
-l LEVEL set security level (noAuthNoPriv|authNoPriv|authPriv)
-n CONTEXT set context name (e.g. bridge1)
-u USER-NAME set security name (e.g. bert)
-x PROTOCOL set privacy protocol (DES|AES)
-X PASSPHRASE set privacy protocol pass phrase
-Z BOOTS,TIME set destination engine boots/time
General communication options
-r RETRIES set the number of retries
-t TIMEOUT set the request timeout (in seconds)
-d dump input/output packets in hexadecimal
-D TOKEN[,...] turn on debugging output for the specified TOKENs
(ALL gives extremely verbose debugging output)
General options
-m MIB[:...] load given list of MIBs (ALL loads everything)
-M DIR[:...] look in given list of directories for MIBs
-P MIBOPTS Toggle various defaults controlling MIB parsing:
u: allow the use of underlines in MIB symbols
c: disallow the use of "--" to terminate comments
d: save the DESCRIPTIONs of the MIB objects
e: disable errors when MIB symbols conflict
w: enable warnings when MIB symbols conflict
W: enable detailed warnings when MIB symbols conflict
R: replace MIB symbols from latest module
-O OUTOPTS Toggle various defaults controlling output display:
a: print all strings in ascii format
b: do not break OID indexes down
e: print enums numerically
E: escape quotes in string indices
f: print full OIDs on output
n: print OIDs numerically
q: quick print for easier parsing
Q: quick print with equal-signs
s: print only last symbolic element of OID
S: print MIB module-id plus last element
t: print timeticks unparsed as numeric integers
T: print human-readable text along with hex strings
u: print OIDs using UCD-style prefix suppression
U: don't print units
v: print values only (not OID = value)
x: print all strings in hex format
X: extended index format
-I INOPTS Toggle various defaults controlling input parsing:
b: do best/regex matching to find a MIB node
h: don't apply DISPLAY-HINTs
r: do not check values for range/type legality
R: do random access to OID labels
u: top-level OIDs must have '.' prefix (UCD-style)
s SUFFIX: Append all textual OIDs with SUFFIX before parsing
S PREFIX: Prepend all textual OIDs with PREFIX before parsing
-L LOGOPTS Toggle various defaults controlling logging:
e: log to standard error
o: log to standard output
f file: log to the specified file
s facility: log to syslog (via the specified facility)

[EO] pri: log to standard error/output for level 'pri' and above
[EO] p1-p2: log to standard error/output for levels 'p1' to 'p2'
[FS] pri token: log to file/syslog for level 'pri' and above
[FS] p1-p2 token: log to file/syslog for levels 'p1' to 'p2'
-C APPOPTS Set various application specific behaviours:
p: print the number of variables found
i: include given OID in the search range
c: do not check returned OIDs are increasing
t: Display wall-clock time to complete the request
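Neither the v1 nor the v2c query above got a response, because the snmpd configuration file has not been edited yet. So next we edit /etc/snmp/snmpd.conf, the snmpd configuration file. Several places need to change: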

1. First find the line com2sec notConfigUser default public and change it as shown below (the source address on the mynetwork line is the IP address of the SNMP server):
#com2sec notConfigUser default public
com2sec local localhost public
com2sec mynetwork public
2. Next find the line group notConfigGroup v1 notConfigUser and change it as shown below:
#group notConfigGroup v1 notConfigUser
#group notConfigGroup v2c notConfigUser
group MyRWGroup v1 local
group MyRWGroup v2c local
group MyROGroup v1 mynetwork
group MyROGroup v2c mynetwork
3. Then find the line view all included .1 80 and change it as shown below (remove the leading #):
## incl/excl subtree mask
view all included .1 80
4. Find the line #access MyROGroup "" any noauth 0 all none none and change it as shown below:
#access MyROGroup "" any noauth 0 all none none
#access MyRWGroup "" any noauth 0 all all all
access MyROGroup "" any noauth prefix all none none
access MyRWGroup "" any noauth prefix all all all
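Steps 1 to 4 form a chain: com2sec maps a source and community to a security name, group puts that name into a group, view names an OID subtree, and access grants the group its read, write and notify views. A name mistyped in any one of these lines tends to show up only as a timeout, so a quick consistency check can save time. The following is a rough sketch; check_vacm_chain is a hypothetical helper, not a Net-SNMP tool, and it only understands the four directives used in this article.

```shell
#!/bin/sh
# check_vacm_chain FILE
# Verify that every security name used by a "group" line is defined by a
# "com2sec" line, and that every group and view used by an "access" line
# is defined by a "group" or "view" line. Commented-out lines are ignored
# because their first field starts with "#".
# Exit status: 0 if consistent, 1 if any name is undefined.
check_vacm_chain() {
    awk '
        $1 == "com2sec" { sec[$2] = 1 }
        $1 == "group"   { grp[$2] = 1; gname[NR] = $2; member[NR] = $4 }
        $1 == "view"    { view[$2] = 1 }
        $1 == "access"  { agroup[NR] = $2; aviews[NR] = $7 " " $8 " " $9 }
        END {
            bad = 0
            for (n in member)
                if (!(member[n] in sec)) {
                    print "group " gname[n] ": unknown security name " member[n]
                    bad = 1
                }
            for (n in agroup) {
                if (!(agroup[n] in grp)) {
                    print "access: unknown group " agroup[n]
                    bad = 1
                }
                # Fields 7-9 of an access line are the read, write and notify views.
                cnt = split(aviews[n], v, " ")
                for (i = 1; i <= cnt; i++)
                    if (v[i] != "none" && !(v[i] in view)) {
                        print "access " agroup[n] ": unknown view " v[i]
                        bad = 1
                    }
            }
            exit bad
        }
    ' "$1"
}
```

Running check_vacm_chain /etc/snmp/snmpd.conf before restarting snmpd prints nothing when the names line up.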
5. Then find the line syslocation Unknown (configure /etc/snmp/snmp.conf) and change it as shown below (syslocation records where the machine is physically located):
syslocation GangShan
syscontact Root (configure /etc/snmp/snmp.local.conf)
6. Next comes process monitoring. Find the line #proc mountd. Suppose this machine runs an FTP service and we want to monitor its state; then change the section as shown below:
#proc mountd
proc snmpd
proc vsftpd

procfix vsftpd /sbin/service vsftpd restart
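(The procfix line registers the command used to repair the process when it is not running; the agent invokes it when the corresponding prErrFix instance is set to 1.)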

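7. Next is disk space monitoring. Find the line #disk / 10000. Suppose we want to monitor / and raise an error once usage exceeds 85%, that is, once free space drops below 15%. The setting is then: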
#disk / 10000
disk / 15%
That is enough for now. Save the file and restart the snmpd service:
[root@KHXPROVS1 ~]# service snmpd restart
Stopping snmpd: [ OK ]
Starting snmpd: [ OK ]
8. Now for a simple test. First query the disk monitoring status:
[root@KHXPROVS1 ~]# snmpwalk -v 2c -c public localhost .
UCD-SNMP-MIB::dskIndex.1 = INTEGER: 1
UCD-SNMP-MIB::dskPath.1 = STRING: /
UCD-SNMP-MIB::dskDevice.1 = STRING: /dev/sda2
UCD-SNMP-MIB::dskMinimum.1 = INTEGER: -1
UCD-SNMP-MIB::dskMinPercent.1 = INTEGER: 15
UCD-SNMP-MIB::dskTotal.1 = INTEGER: 41286828
UCD-SNMP-MIB::dskAvail.1 = INTEGER: 31760584
UCD-SNMP-MIB::dskUsed.1 = INTEGER: 7428960
UCD-SNMP-MIB::dskPercent.1 = INTEGER: 19
UCD-SNMP-MIB::dskPercentNode.1 = INTEGER: 6
UCD-SNMP-MIB::dskErrorFlag.1 = INTEGER: 0
UCD-SNMP-MIB::dskErrorMsg.1 = STRING:
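These columns relate directly to the disk / 15% line from step 7: dskMinPercent is the configured minimum free space and dskPercent is the space in use, so the entry is in trouble once dskPercent passes 100 minus dskMinPercent (here 19 against a limit of 85). The agent's own verdict is dskErrorFlag; the sketch below merely recomputes the comparison from snmpwalk output read on standard input. disk_report is a hypothetical name, and entries configured with a kB threshold (reported as dskMinPercent -1) are skipped.

```shell
#!/bin/sh
# disk_report
# Read "snmpwalk" output for the UCD-SNMP-MIB disk table on stdin and print,
# for each monitored path, the used percentage, the allowed limit
# (100 - dskMinPercent) and whether the limit is exceeded.
disk_report() {
    awk '
        {
            # $1 looks like "UCD-SNMP-MIB::dskPath.1": split into name and index.
            split($1, a, "::"); split(a[2], b, ".")
            name = b[1]; idx = b[2]
        }
        name == "dskPath"       { path[idx] = $NF }
        name == "dskMinPercent" { minfree[idx] = $NF + 0 }
        name == "dskPercent"    { used[idx] = $NF + 0 }
        END {
            for (i in path) {
                if (minfree[i] < 0) continue   # kB threshold, not a percentage
                limit = 100 - minfree[i]
                printf "%s used=%d%% limit=%d%% %s\n", path[i], used[i], limit, (used[i] > limit ? "OVER" : "ok")
            }
        }
    '
}
```

Piping the walk above into disk_report prints "/ used=19% limit=85% ok".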
9. Next, query the process monitoring status:
[root@KHXPROVS1 ~]# snmpwalk -v 2c -c public localhost .
UCD-SNMP-MIB::prIndex.1 = INTEGER: 1
UCD-SNMP-MIB::prIndex.2 = INTEGER: 2
UCD-SNMP-MIB::prNames.1 = STRING: snmpd
UCD-SNMP-MIB::prNames.2 = STRING: vsftpd
UCD-SNMP-MIB::prCount.1 = INTEGER: 1
UCD-SNMP-MIB::prCount.2 = INTEGER: 1
UCD-SNMP-MIB::prErrorFlag.1 = INTEGER: 0
UCD-SNMP-MIB::prErrorFlag.2 = INTEGER: 0
UCD-SNMP-MIB::prErrMessage.1 = STRING:
UCD-SNMP-MIB::prErrMessage.2 = STRING:
UCD-SNMP-MIB::prErrFixCmd.2 = STRING: /sbin/service vsftpd restart
If I now stop the vsftpd service and run the SNMP query again, the value of UCD-SNMP-MIB::prErrMessage.2 changes to:
UCD-SNMP-MIB::prErrMessage.2 = STRING: No vsftpd process running.
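A monitoring script usually needs only the names of the processes whose prErrorFlag is non-zero. Here is a small sketch that extracts them from snmpwalk output read on standard input. failed_procs is a made-up name, and the parsing assumes the default "MIB::name.index = TYPE: value" output format shown above.

```shell
#!/bin/sh
# failed_procs
# Read "snmpwalk" output for the UCD-SNMP-MIB process table on stdin and
# print the name of every monitored process whose prErrorFlag is not 0.
failed_procs() {
    awk '
        {
            # $1 looks like "UCD-SNMP-MIB::prNames.2": split into name and index.
            split($1, a, "::"); split(a[2], b, ".")
            name = b[1]; idx = b[2]
        }
        name == "prNames"     { pname[idx] = $NF }
        name == "prErrorFlag" { flag[idx] = $NF + 0 }
        END {
            for (i in pname)
                if (flag[i] != 0) print pname[i]
        }
    '
}
```

With vsftpd stopped, piping the process-table walk into failed_procs would be expected to print vsftpd.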
10. For one more test, use the snmptrap command directly to send the value of UCD-SNMP-MIB::prErrMessage.2 to the SNMP server as the content of a trap:
[root@KHXPROVS1 ~]# /usr/bin/snmptrap -v 2c -c public "" . . s "HOST:KHXPROVS1|EVENT=No vsftpd process running."
Meanwhile, run tcpdump to check whether the trap really goes out:
[root@KHXPROVS1 ~]# tcpdump -vvvXX host -i eth0 and port 162 -s 0
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes

18:23:55.648888 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto 17, length: 165) > [bad udp cksum a09d!] { SNMPv2c { V2Trap(121) R=2145043168 system.sysUpTime.0=37325417 S: E:2021.2.1.101="HOST:KHXPROVS1|EVENT=No vsftpd process running." } }
0x0000: 001d 0926 b7cd 001e c9ad 5464 0800 4500 ...&......Td..E.
0x0010: 00a5 0000 4000 4011 f3ef 0a10 1920 0a10 ....@.@.........
0x0020: 1919 801c 00a2 0091 46fb 3081 8602 0101 ........F.0.....
0x0030: 0406 7075 626c 6963 a779 0204 7fda c2e0 ..public.y......
0x0040: 0201 0002 0100 306b 3010 0608 2b06 0102 ......0k0...+...
0x0050: 0101 0300 4304 0239 8a69 3018 060a 2b06 ....C..9.i0...+.
0x0060: 0106 0301 0104 0100 060a 2b06 0104 018f ..........+.....
0x0070: 6502 0165 303d 060a 2b06 0104 018f 6502 e..e0=..+.....e.
0x0080: 0165 042f 484f 5354 3a4b 4858 5052 4f56 .e./HOST:KHXPROV
0x0090: 5331 7c45 5645 4e54 3d4e 6f20 7673 6674 S1|EVENT=No.vsft
0x00a0: 7064 2070 726f 6365 7373 2072 756e 6e69 pd.process.runni
0x00b0: 6e67 2e ng.

1 packets captured
1 packets received by filter
0 packets dropped by kernel
[root@KHXPROVS1 ~]#
The test is done. If the SNMP server runs a MIB browser or similar software, it will receive the alarm trap we just sent.
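The capture can also be checked by hand. In the hex dump each OID appears as tag 06, a length byte, and then the encoded value: the first byte packs the first two arcs as 40*X+Y, and every later arc is written in base 128, with the high bit set on all but its last byte. The sketch below decodes such a value. decode_oid is a hypothetical helper that expects only the value bytes (no tag or length) and assumes the first arc is 0 or 1.

```shell
#!/bin/sh
# decode_oid BYTE...
# Decode the value bytes of a BER-encoded OBJECT IDENTIFIER, given as
# two-digit hex arguments, and print it in dotted form.
decode_oid() {
    first=1
    acc=0
    out=
    for byte in "$@"; do
        n=$((0x$byte))
        # Each byte contributes its low 7 bits to the current arc.
        acc=$(( (acc << 7) | (n & 127) ))
        # A clear high bit marks the last byte of the arc.
        if [ $((n & 128)) -eq 0 ]; then
            if [ "$first" -eq 1 ]; then
                out=".$((acc / 40)).$((acc % 40))"
                first=0
            else
                out="$out.$acc"
            fi
            acc=0
        fi
    done
    echo "$out"
}
```

For the ten bytes after "060a" at offset 0x006a, decode_oid 2b 06 01 04 01 8f 65 02 01 65 prints .1.3.6.1.4.1.2021.2.1.101, which agrees with the E:2021.2.1.101 that tcpdump shows.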

When I get the chance, I will post the simple scripts I use to run the checks and send the traps.
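Until then, here is a rough sketch of what the sending half of such a script might look like. It is not the script mentioned above, only an illustration: send_proc_traps and the SNMPTRAP override variable are invented names, the manager address is whatever you pass in, and the OID is the prErrMessage OID that tcpdump decoded in step 10, used both as the trap OID and as the variable, as in that capture.

```shell
#!/bin/sh
# Command used to send the trap; can be overridden, e.g. with a full path.
SNMPTRAP=${SNMPTRAP:-snmptrap}
# prErrMessage, as seen in the tcpdump capture (enterprises.2021.2.1.101).
PR_ERR_OID=.1.3.6.1.4.1.2021.2.1.101

# send_proc_traps MANAGER COMMUNITY
# Read process names on stdin, one per line, and send one SNMPv2c trap per
# name, using the same message layout as the manual test in step 10.
send_proc_traps() {
    manager=$1
    community=$2
    while read -r proc; do
        [ -n "$proc" ] || continue
        # "" lets snmptrap fill in the current uptime; </dev/null keeps the
        # command from consuming the list of names on stdin.
        "$SNMPTRAP" -v 2c -c "$community" "$manager" "" \
            "$PR_ERR_OID" "$PR_ERR_OID" s \
            "HOST:$(uname -n)|EVENT=No $proc process running." </dev/null
    done
}
```

Any command that prints the names of failed processes can feed it, and the whole pipeline could be run periodically, for example from cron.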


Set basic snmpd parameters:
snmpconf -i -g basic_setup (interactive helper for creating and modifying the configuration files)
snmpget -v 1 -c public ssCpuRawSystem.0
UCD-SNMP-MIB::ssCpuRawSystem.0 = Counter32: 2377652724
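ssCpuRawSystem is a Counter32, so a single reading like the one above says little on its own. What matters is the difference between two polls, taken modulo 2^32 in case the counter wrapped in between. A minimal sketch, where counter_delta is a hypothetical helper name and the shell is assumed to do 64-bit arithmetic:

```shell
#!/bin/sh
# counter_delta OLD NEW
# Print the increase of a Counter32 between two readings, allowing for
# at most one wrap past 4294967295 between the polls.
counter_delta() {
    echo $(( ($2 - $1 + 4294967296) % 4294967296 ))
}
```

Dividing the result by the polling interval gives a rate; counter_delta 4294967290 5 prints 11, because the counter wrapped once.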

Convert object names between numeric and human-readable forms, and look up MIB information:
snmptranslate -On -Td -IR ssCpuRawSystem
ssCpuRawSystem OBJECT-TYPE
SYNTAX Counter32
MAX-ACCESS read-only
STATUS current
DESCRIPTION "system CPU time."
::= { iso(1) org(3) dod(6) internet(1) private(4) enterprises(1) ucdavis(2021) systemStats(11) 52 }
snmptranslate -Td .
ssCpuRawSystem OBJECT-TYPE
SYNTAX Counter32
MAX-ACCESS read-only
STATUS current
DESCRIPTION "system CPU time."
::= { iso(1) org(3) dod(6) internet(1) private(4) enterprises(1) ucdavis(2021) systemStats(11) 52 }
Query the disk usage of a host on the network over SNMP:
[root@KHXPROVS1 ~]# snmpdf -v 1 -c public -Cu khxprovs1
Description size (kB) Used Available Used%
/ 41286828 7433684 33853144 18%
[root@KHXPROVS1 ~]# snmpstatus -v 1 -c public khxprovs1
[]=>[Linux KHXPROVS1 2.6.9-55.ELsmp #1 SMP Fri Apr 20 17:03:35 EDT 2007 i686] Up: 16:15:02.30
Interfaces: 0, Recv/Trans packets: 1529/1529 | IP: 38894376/58055763

Net-SNMP home page:
A good site for looking up MIB files and OIDs: ipMonitor Support Portal :: Mibs
2 Responses
  1. 阿宅 Says:

    I still cannot get past the "no response" problem.


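    Great article. I have been working on this recently too. Could you write about how to configure SNMP traps to be sent automatically (for example, when a hard disk is pulled)? I have been at it for a long time without success Q_Q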