Date Added: Apr 2010
The OpenNMS Group carried out a consulting engagement around a new network and system management platform, OpenNMS. The engagement was scheduled for three days, and its main focus was building and executing a continuity of operations plan for a system failure or localized disaster affecting OpenNMS. This paper describes that plan, including the automated and manual procedures required for creating, maintaining, and implementing two OpenNMS instances in primary, standby, and failover modes. The OpenNMS software and its dependencies are installed on two geographically remote Debian Linux 4.0 servers, nms-01 and nms-02. The nms-01 server is the primary system, responsible for monitoring the corporate network. The nms-02 server is the backup system; while in standby mode, it monitors nms-01. The paper is divided into several sections, one of which covers the script responsible for keeping the live OpenNMS instance on nms-01 consistent with the standby OpenNMS failover configuration resident on nms-02. The normal state of the OpenNMS instance on nms-02 is standby mode, in which it is configured to monitor nms-01 through two of its services: the OpenNMS web application (OpenNMSWebApp) and the ICMP service on the nms-01 Ethernet interface. The scripts described in the paper assume a Debian-style file system hierarchy.
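The standby check that nms-02 performs against nms-01 can be illustrated with a minimal Python sketch. It is not the paper's script, and in the paper's setup OpenNMS itself does this monitoring. The function names, the port 8980 (the OpenNMS web interface default, assumed unchanged here), the use of the Linux ping command, and the three-state result are all assumptions made for illustration; the abstract does not state how the two service results are combined into a failover decision.

```python
import socket
import subprocess
from typing import Callable

PRIMARY_HOST = "nms-01"  # hostname taken from the paper
WEBAPP_PORT = 8980       # assumed: OpenNMS default web UI port


def icmp_reachable(host: str, timeout_s: int = 2) -> bool:
    """Send one echo request with the system ping (Linux iputils flags)."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # ping binary missing or not executable
        return False
    return result.returncode == 0


def webapp_reachable(host: str, port: int = WEBAPP_PORT,
                     timeout_s: float = 2.0) -> bool:
    """Check only that a TCP connection to the web application port opens."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # covers refused connections, timeouts, and name resolution errors
        return False


def primary_status(
    host: str,
    icmp_probe: Callable[[str], bool] = icmp_reachable,
    web_probe: Callable[[str], bool] = webapp_reachable,
) -> str:
    """Combine both probes into a hypothetical three-state result."""
    icmp_ok = icmp_probe(host)
    web_ok = web_probe(host)
    if icmp_ok and web_ok:
        return "up"
    if not icmp_ok and not web_ok:
        return "down"       # candidate condition for starting failover
    return "degraded"       # one service failed; likely needs a human look
```

Calling `primary_status(PRIMARY_HOST)` from nms-02 would run both probes against the primary. The probes are passed in as parameters so the decision logic can be exercised without touching the network.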