DevMon - Universal SNMP Orchestration Tool

# DevMon - Devices Monitor with SNMP

SNMP (Simple Network Management Protocol) needs no introduction, so I won't dwell on it. Python and other languages already have plenty of excellent third-party libraries for it, such as pysnmp (the only one I know; search for others if you're interested). With off-the-shelf cars already available, why reinvent the wheel? Because you only know whether the meat tastes good once you take a bite yourself. Or maybe I just had too much time on my hands. Either way, it's written now, so here is a rough walkthrough.

Anyone who has trained a dog knows that before you can teach the furry kid anything, you first have to standardize your own commands. SNMP is the same. Is the hard part fetching the data, or handling traps? No. The hard part (probably; take it as my opinion) is standardizing the SNMP entry definition. Only once a general-purpose template is built on that definition can you legitimately slack off later.

This repository contains the following:

  1. SNMP client and event definitions
  2. SNMP event reading and storage (MongoDB)
  3. Event orchestration and archiving to the log server

To Do

  1. Format the output of SSH remote commands (line and column style)
  2. Support receiving agent messages via SNMP TRAP
  3. Standardize the code, and strive to make it a Python module for easy installation

Background

Nearly every hardware product on the market today (storage, hosts, switches, firewalls, you name it) supports SNMP for retrieving operational status (traps are delivered rather than retrieved, but let's not be pedantic). As an administrator who only does equipment inspections, I naturally want to streamline my work as much as possible, which means putting in some serious effort up front (genuinely exhausting, and arguably futile) to free up time and clear my head.

The common ailment of a beginner coder: the code is rarely commented, and once it's uploaded, even the author can't read it.

Installation

1. Tested Python Versions

Python 3.11
To install Python 3.11 on RHEL 7, search online or see the next article (how to upgrade to Python 3.11 on RHEL 7).

2. Required Python Modules

Try running python3 devmon.py and install whichever module it reports as missing. Typically, the following modules are needed:

python3 -m pip install PyYAML PyMySQL pymongo
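
Instead of running the tool repeatedly and installing modules one error at a time, the check can be done up front. The following is a minimal sketch, not part of DevMon: the helper name `missing_packages` is hypothetical, and the mapping assumes the usual import names for the three packages listed above.

```python
import importlib.util

# Maps each pip package named above to the module name it is imported as.
REQUIRED = {"PyYAML": "yaml", "PyMySQL": "pymysql", "pymongo": "pymongo"}


def missing_packages(required=REQUIRED):
    """Return the pip package names whose import module cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        # Print a ready-to-paste install command for whatever is absent.
        print("python3 -m pip install " + " ".join(missing))
    else:
        print("All listed modules are importable.")
```

Any module that devmon.py imports beyond these three would still surface only at run time.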

3. Linux components to install (using RHEL7 as an example)

rpm -ivh mongodb-org-server-x.y.z-el7.x86_64.rpm

yum install -y net-snmp

Usage Instructions

1. Main Configuration

File: conf/devmon.conf

2. Define Hosts and OID List

Directory: devlist/a-side, devlist/b-side  # valid device lists  
Directory: examples  # verified SNMP definition templates for a category of devices  
Directory: maintaining  # disable event reads for devices under maintenance (just move the device file into this directory)
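
The directory layout above implies that a device is polled only while its file sits under one of the devlist directories. A rough sketch of that selection follows. It is not DevMon's actual loader: the function name, the `.yaml` suffix, and the assumption that `maintaining` is a sibling of `devlist` are all illustrative guesses.

```python
from pathlib import Path


def collect_device_files(root, sides=("a-side", "b-side"), suffix=".yaml"):
    """Return sorted device definition files under root/devlist/<side>/.

    Files moved to root/maintaining/ are never scanned, so those
    devices drop out of the result without any further configuration.
    """
    files = []
    for side in sides:
        side_dir = Path(root) / "devlist" / side
        if side_dir.is_dir():
            files.extend(
                p for p in side_dir.iterdir()
                if p.is_file() and p.suffix == suffix
            )
    return sorted(files)
```

Moving a file from `devlist/a-side` into `maintaining` and calling the function again would then return a list without that device.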

3. Pushing events for defined SNMP hosts and OIDs

python3 devmon.py run  # Read SNMP events from the device list, store them in the DB, and archive them to rsyslog
python3 devmon.py service  # Continuously read at the defined interval

4. Polling the defined SNMP hosts and OIDs

python3 devmon.py pm  # Read valid device OID values and classify them by label

Command output:

$ python3 devmon.py pm
=================================================================================================
----------------------------------------  172.16.10.250  ----------------------------------------
Memory Size............................................................................... PASSED
System Name............................................................................... PASSED
Network Interface......................................................................... FAILED
MemoryFreePercent......................................................................... PASSED
SWAPAvailablePercent...................................................................... PASSED
Disk I/O Load 15(min) Avg Lvl1............................................................ PASSED
Disk I/O Load 15(min) Avg Lvl3............................................................ PASSED
CPU Usage................................................................................. PASSED
Label Network Interface                   Current down(2)    Threshold up(1)     Device br0
Label Network Interface                   Current down(2)    Threshold up(1)     Device virbr0
=================================================================================================
----------------------------------------    localhost    ----------------------------------------
Storage Used Percent...................................................................... FAILED
Network Interface......................................................................... FAILED
Memory Size............................................................................... PASSED
SystemName................................................................................ PASSED
Label Network Interface                   Current down(2)    Threshold up(1)     Device VHC128
Label Network Interface                   Current down(2)    Threshold up(1)     Device XHC0
Label Network Interface                   Current down(2)    Threshold up(1)     Device XHC1
Label Network Interface                   Current down(2)    Threshold up(1)     Device XHC20
Label Network Interface                   Current down(2)    Threshold up(1)     Device ap1
Label Network Interface                   Current down(2)    Threshold up(1)     Device gif0
Label Network Interface                   Current down(2)    Threshold up(1)     Device stf0
Label Storage Used Percent                Current 84.75      Limit 80-100    Device /
Label Storage Used Percent                Current 84.75      Limit 80-100    Device /System/Volumes/Data
Label Storage Used Percent                Current 84.75      Limit 80-100    Device /System/Volumes/Preboot
Label Storage Used Percent                Current 84.75      Limit 80-100    Device /System/Volumes/Update
Label Storage Used Percent                Current 84.75      Limit 80-100    Device /System/Volumes/VM
Label Storage Used Percent                Current 91.60      Limit 80-100    Device /Library/Developer/CoreSimulator/Volumes/watchOS_20T253
Label Storage Used Percent                Current 98.00      Limit 80-100    Device /dev
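
For readers who want to produce a report in the same shape, the dotted summary lines can be built with Python's format mini-language. This is only a sketch: the function names are hypothetical, and the widths (a 97-character rule, a host name centred in 17 characters) are read off the sample output above rather than taken from DevMon's source.

```python
WIDTH = 97  # assumed width of the "=" rule in the sample output


def host_header(host):
    """Rule line, then the host name centred between two runs of dashes."""
    return "=" * WIDTH + "\n" + "-" * 40 + f"{host:^17}" + "-" * 40


def status_line(label, passed):
    """Label padded with dots on the right, followed by PASSED or FAILED."""
    status = "PASSED" if passed else "FAILED"
    # 7 characters are reserved for the space plus the six-letter status.
    return f"{label:.<{WIDTH - 7}} {status}"


if __name__ == "__main__":
    print(host_header("172.16.10.250"))
    print(status_line("Memory Size", True))
    print(status_line("Network Interface", False))
```

Labels longer than 90 characters or host names longer than 17 would overflow the fixed width, which the sketch does not handle.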

SNMP Entry Definition Spec Sample

---
# Device IP, SNMP daemon interface address, required
address: SomeAddress
# Device physical region, e.g. DCA, DataCenterA..., required
region: SomeRegion
# Device business area, e.g. Dev, Prod..., required
area: SomeArea
# Device IP recorded in CMDB, used to correlate the resource ID in CMDB, required
addr_in_cmdb: SomeAddr
# Manually specify the resource ID, optional; if omitted, DevMon tries to look it up in the corresponding MongoDB table (which must be synced first). For a small number of devices, finding and specifying it by hand is recommended.
rid: 'This is resource ID'
# SNMP client configuration
snmp:
  # SNMP version, 2c or 3, required
  version: '2c'
  # SNMP community name; not needed for v3, required for v2c
  community: 'public'
  # SNMP v3 protocol username, required for v3
  username: 'user1'
  # SNMP MIB library, required for some devices
  mib: 'ANY-MIB'
  # Brocade fibre-channel switch context (virtual FC switch) ID, required when virtual FC switches are configured
  context: 128
  # snmpwalk OID read timeout, optional
  timeout: 2
  # Retry count after snmpwalk OID read failures, optional
  retries: 1
  # OID association configuration
  OIDs:
  # Define by a single OID; one of id, id_range, table is required
  - id: 'SNMPv2-MIB::sysName.0'
    # OID label, used to categorize inspection display (recommend a short string), required
    label: System Name
    # OID explanation, used to compose alert content, required
    explanation: 'Hostname'
    # OID reference value; if it equals or contains the value read from the OID, the state is normal, otherwise an alert is triggered; one of reference and watermark is required
    reference: 'monitor'
  # Define by an OID range
  - id_range:
      # First OID in the range; if it has no trailing index (such as .1), the index starts at 1; required when defined by id_range
      start: 'ifOperStatus.1'
      # Number of OIDs in the range; alternatively, use the 'end' keyword to give the last OID; one of count and end is required
      count: 31  # number of OIDs in the range
      # end: 'ifOperStatus.3'
    label: 'NetworkInterface'
    explanation: 'NIC Status'
    # If the OID name or description needs to be read from another OID, define 'related_symbol'; the index automatically references the current OID
    # Note: do NOT end with an index number!!!
    # Configure as needed, optional.
    related_symbol: 'IF-MIB::ifDescr'
    reference: 'up(1)'  # the reference value that is considered the normal state
  # Define by an OID table; one of id, id_range, table is required
  - table: 'hrStorageUsed'
    # OID list entry from which to read the current index number, optional
    table_index: 'hrStorageIndex'
    label: 'Storage Used Percent'
    explanation: 'Storage device usage rate'
    # Used to compose alert content, optional, has a default value.
    alert: 'Critical anomaly, administrator must pay attention!'
    # Index numbers to exclude from reading, optional.
    exclude_index: '35, 36, 44, 73'
    # Same as above, reads the actual name or description of the OID, optional.
    related_symbol: 'hrStorageDescr'
    # Value requires computation, e.g. storage percentage, optional.
    arithmetic: '%'
    # The other operand OID list entry for the arithmetic; 'arithmetic', 'arith_symbol', 'arith_pos' must all be present or all be absent
    arith_symbol: 'hrStorageSize'
    # Position (1 or 2) of the OID for the extra value within the arithmetic; 1 is first, 2 is second
    arith_pos: 2
    # The OID result needs to be compared against a threshold (or limit) range; one of this and reference is required.
    watermark:
      low: -1
      high: 80
      # Limit type switch;
      # True means this watermark defines a restricted range: value between low and high triggers an alert, otherwise normal;
      # False means this watermark defines a threshold range: value between low and high is normal, otherwise triggers an alert;
      # Optional value
      # restricted: False
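
The `arithmetic`, `reference`, and `watermark` keys together decide whether a reading raises an alert. The sketch below shows one way to read the comments in the sample. It is not DevMon's code, and several points are assumptions: the function names, that `'%'` means first / second * 100, that the range bounds are inclusive, and that `restricted` defaults to False.

```python
def apply_arithmetic(value, other, arith_pos=2, arithmetic="%"):
    """Combine the value read from the OID with the extra OID's value.

    arith_pos is the position (1 or 2) of the extra value in the
    expression. Assumption: '%' means first / second * 100.
    """
    if arithmetic != "%":
        raise ValueError(f"unsupported arithmetic: {arithmetic!r}")
    first, second = (other, value) if arith_pos == 1 else (value, other)
    return round(first / second * 100, 2)


def is_normal(value, reference=None, watermark=None):
    """Return True when the reading should NOT trigger an alert."""
    if reference is not None:
        # Literal reading of the spec comment: the reference equals or
        # contains the value that was read (an assumption).
        return str(value) in str(reference)
    if watermark is not None:
        inside = watermark["low"] <= value <= watermark["high"]
        # restricted=True: inside the range is the alert condition.
        # restricted=False (assumed default): inside the range is normal.
        return not inside if watermark.get("restricted", False) else inside
    raise ValueError("one of reference or watermark is required")


if __name__ == "__main__":
    # hrStorageUsed / hrStorageSize, with the extra value in position 2.
    used_percent = apply_arithmetic(8475, 10000, arith_pos=2)
    print(used_percent, is_normal(used_percent, watermark={"low": -1, "high": 80}))
```

With the sample watermark (low -1, high 80, not restricted), a usage of 84.75 falls outside the range and is reported as abnormal, which agrees with the FAILED storage lines in the command output. A size of zero would raise ZeroDivisionError here, so a real implementation would need to guard against it.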

Maintainer

@n0rvyn

How to Contribute

Your contributions are very welcome! File an issue or submit a pull request. (You may know how to open a pull request, but the author still hasn't figured out how to merge one 😂)

License

MI Has No T. No particular license; I haven't figured out the differences between licenses. Copy away, I don't mind.

Packaging Example

I tried packaging it with PyInstaller, but the result was poor; if any expert is willing, pointers are welcome.
pyinstaller devmon.py -F \
    -p src/core/log.py \
    -p src/core/mongo.py \
    -p src/core/pushmsg.py \
    -p src/core/readfile.py \
    -p src/core/rid.py \
    -p src/core/snmp.py \
    -p src/core/ssh.py \
    -p src/type/case.py \
    -p src/type/oid.py \
    -p src/type/snmpagent.py \
    -p src/type/sshagent.py \
    -p src -p src/core/ -p src/type/