Nagios VMware plugin XML::LibXML error

I recently ran into a strange issue with the Nagios VMware plugin on CentOS 6.5. I had followed the document http://assets.nagios.com/downloads/nagiosxi/docs/Monitoring-VMware-With-Nagios-XI.pdf as always, but the plugin failed to run with the following error:

[root@nagios ~]# /usr/local/nagiosxi/html/includes/configwizards/vmware/plugins/check_esx3.pl -H 192.168.1.13 -u "username" -p "password" -l cpu
Use of inherited AUTOLOAD for non-method XML::LibXML::Document::documentElement() is deprecated at /usr/share/perl5/VMware/VICommon.pm line 554.
ESX3 CRITICAL - Undefined subroutine &XML::LibXML::Document::documentElement called at /usr/share/perl5/VMware/VICommon.pm line 554

I spent a lot of time installing several Perl modules, but nothing helped. In the end, reinstalling the two libxml Perl packages with yum fixed the issue:

[root@nagios ~]# yum reinstall perl-XML-LibXML perl-libxml-perl
Loaded plugins: fastestmirror
Setting up Reinstall Process
Loading mirror speeds from cached hostfile
* base: centos.fastbull.org
* epel: mirror.nus.edu.sg
* extras: centos.fastbull.org
* rpmforge: apt.sw.be
* updates: centos.fastbull.org
Resolving Dependencies
--> Running transaction check
---> Package perl-XML-LibXML.i686 1:1.70-5.el6 will be reinstalled
---> Package perl-libxml-perl.noarch 0:0.08-10.el6 will be reinstalled
--> Finished Dependency Resolution

Dependencies Resolved

==============================================================================================================================================================================
Package Arch Version Repository Size
==============================================================================================================================================================================
Reinstalling:
perl-XML-LibXML i686 1:1.70-5.el6 base 366 k
perl-libxml-perl noarch 0.08-10.el6 base 89 k

Transaction Summary
==============================================================================================================================================================================
Reinstall 2 Package(s)

Total download size: 455 k
Installed size: 1.1 M
Is this ok [y/N]: y
Downloading Packages:
(1/2): perl-XML-LibXML-1.70-5.el6.i686.rpm | 366 kB 00:01
(2/2): perl-libxml-perl-0.08-10.el6.noarch.rpm | 89 kB 00:00
------------------------------------------------------------------------------
Total 224 kB/s | 455 kB 00:02
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Installing : perl-libxml-perl-0.08-10.el6.noarch 1/2
Installing : 1:perl-XML-LibXML-1.70-5.el6.i686 2/2
Verifying : 1:perl-XML-LibXML-1.70-5.el6.i686 1/2
Verifying : perl-libxml-perl-0.08-10.el6.noarch 2/2

Installed:
perl-XML-LibXML.i686 1:1.70-5.el6 perl-libxml-perl.noarch 0:0.08-10.el6

Complete!

Fix Heartbleed vulnerability on RedHat and Ubuntu

Before you start, check if your server is vulnerable:  http://www.nagios.com/heartbleed-tester.

If your server is relatively up-to-date, it is probably running a vulnerable version of OpenSSL (the bug affects the 1.0.1 series, up to and including 1.0.1f). I had a retired-from-production server still up and running Red Hat Enterprise Linux ES release 4 (Nahant Update 6), and it wasn't vulnerable because it was using openssl-0.9.7.

All the other servers I manage were vulnerable, so I had to update openssl on them. Here is how I did it on Red Hat Enterprise Linux Server release 6.2 (Santiago).

[root@admm2 ~]# yum list installed | grep openssl
openssl.x86_64 1.0.1e-16.el6_5.4 @rhel-x86_64-server-6

[root@admm2 ~]# yum install openssl
Loaded plugins: replace, rhnplugin
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package openssl.x86_64 0:1.0.1e-16.el6_5.4 will be updated
---> Package openssl.x86_64 0:1.0.1e-16.el6_5.7 will be an update
--> Finished Dependency Resolution

Dependencies Resolved

====================================================================================================================
Package Arch Version Repository Size
====================================================================================================================
Updating:
openssl x86_64 1.0.1e-16.el6_5.7 rhel-x86_64-server-6 1.5 M

Transaction Summary
====================================================================================================================
Upgrade 1 Package(s)

Total download size: 1.5 M
Is this ok [y/N]: y
Downloading Packages:
openssl-1.0.1e-16.el6_5.7.x86_64.rpm | 1.5 MB 00:00
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Updating : openssl-1.0.1e-16.el6_5.7.x86_64 1/2
Cleanup : openssl-1.0.1e-16.el6_5.4.x86_64 2/2
Verifying : openssl-1.0.1e-16.el6_5.7.x86_64 1/2
Verifying : openssl-1.0.1e-16.el6_5.4.x86_64 2/2

Updated:
openssl.x86_64 0:1.0.1e-16.el6_5.7

Complete!

Now restart any services that use openssl (Apache, Postfix, Dovecot, etc.) and you are all set. Until they are restarted, running processes keep using the old library. To check which services are using openssl, run the command: lsof -n | grep ssl
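
If the lsof output is long, a small script can reduce it to the list of commands that need a restart. The following is only a sketch: it assumes the usual lsof layout with the command name in the first column, and the sample text is made up for illustration rather than captured from a real server.

```python
def commands_using_ssl(lsof_output: str) -> list:
    """Return sorted, de-duplicated command names from lsof lines that mention 'ssl'.

    Mirrors `lsof -n | grep ssl`: a plain, case-sensitive substring match.
    """
    commands = set()
    for line in lsof_output.splitlines():
        fields = line.split()
        # lsof prints the command name in the first column; skip the header row.
        if not fields or fields[0] == "COMMAND":
            continue
        if "ssl" in line:
            commands.add(fields[0])
    return sorted(commands)


# Made-up sample input, for illustration only.
sample = """\
COMMAND  PID USER   FD  TYPE DEVICE SIZE/OFF NODE NAME
httpd   1201 root   mem REG  253,0  1950976 1234 /usr/lib64/libssl.so.1.0.1e
httpd   1305 apache mem REG  253,0  1950976 1234 /usr/lib64/libssl.so.1.0.1e
master  1410 root   mem REG  253,0  1950976 1234 /usr/lib64/libssl.so.1.0.1e
crond   1500 root   mem REG  253,0   122232 2345 /lib64/libselinux.so.1
"""

print(commands_using_ssl(sample))
```

On a real server you would feed it the captured output of lsof -n instead of the sample string.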

On Ubuntu 12.04.3 LTS 

The process seems to be a bit longer on Ubuntu servers:

Running apt-get update followed by apt-get install openssl did NOT fix the issue, and at this point in time "apt-get dist-upgrade" is the only option. How long it takes depends on the packages installed on the server, and it will most probably require a reboot.

Command line tool for checking Internet bandwidth on Linux

I manage all my Linux servers remotely and 99% of them don't have a GUI installed. It really helps to be able to find out how much Internet bandwidth is available to a remote server.

speedtest-cli (https://github.com/sivel/speedtest-cli) solves this problem:

root@hostname:/tmp# wget -O speedtest-cli https://raw.github.com/sivel/speedtest-cli/master/speedtest_cli.py

--2014-04-04 21:05:54-- https://raw.github.com/sivel/speedtest-cli/master/speedtest_cli.py
Resolving raw.github.com (raw.github.com)... 103.245.222.133
Connecting to raw.github.com (raw.github.com)|103.245.222.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 20043 (20K) [text/plain]
Saving to: `speedtest-cli'

100%[======================================>] 20,043 68.3K/s in 0.3s

2014-04-04 21:05:56 (68.3 KB/s) - `speedtest-cli' saved [20043/20043]

root@hostname:/tmp# chmod +x speedtest-cli

root@hostname:/tmp# ./speedtest-cli -h
usage: speedtest-cli [-h] [--share] [--simple] [--list] [--server SERVER]
[--mini MINI] [--source SOURCE] [--version]

Command line interface for testing internet bandwidth using speedtest.net.
--------------------------------------------------------------------------
https://github.com/sivel/speedtest-cli

optional arguments:
-h, --help show this help message and exit
--share Generate and provide a URL to the speedtest.net share
results image
--simple Suppress verbose output, only show basic information
--list Display a list of speedtest.net servers sorted by distance
--server SERVER Specify a server ID to test against
--mini MINI URL of the Speedtest Mini server
--source SOURCE Source IP address to bind to
--version Show the version number and exit

root@hostname:/tmp# ./speedtest-cli
Retrieving speedtest.net configuration...
Retrieving speedtest.net server list...
Testing from Emirates Telecommunications Corporation (92.98.101.188)...
Selecting best server based on ping...
Hosted by Etisalat (Dubai) [14.56 km]: 25.59 ms
Testing download speed........................................
Download: 10.41 Mbit/s
Testing upload speed..................................................
Upload: 2.14 Mbit/s

I see I am getting the 10 Mbps download speed promised by Etisalat, but that's because the test downloads from an Etisalat server. What happens when I try to download from a server in Texas, for example?

root@hostname:/tmp# ./speedtest-cli --server 1859
Retrieving speedtest.net configuration...
Retrieving speedtest.net server list...
Testing from Emirates Telecommunications Corporation (92.98.101.188)...
Hosted by T-Mobile (Dallas, TX) [12932.27 km]: 26.623 ms
Testing download speed........................................
Download: 7.52 Mbit/s
Testing upload speed..................................................
Upload: 1.58 Mbit/s

Thought so!
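
To put a number on the difference, the two runs can be compared with a few lines of Python. This is just a sketch: the helper names parse_mbits and percent_slower are my own, and the parser only relies on the "Download:" and "Upload:" lines exactly as they appear in the output above.

```python
import re


def parse_mbits(output: str) -> dict:
    """Extract the Download/Upload figures (in Mbit/s) from speedtest-cli output."""
    results = {}
    pattern = r"^(Download|Upload):\s*([\d.]+)\s*Mbit/s"
    for label, value in re.findall(pattern, output, re.MULTILINE):
        results[label.lower()] = float(value)
    return results


def percent_slower(baseline: float, other: float) -> float:
    """How much slower `other` is than `baseline`, as a percentage of `baseline`."""
    return (baseline - other) / baseline * 100.0


# Figures copied from the two runs above.
local = parse_mbits("Download: 10.41 Mbit/s\nUpload: 2.14 Mbit/s\n")
remote = parse_mbits("Download: 7.52 Mbit/s\nUpload: 1.58 Mbit/s\n")

print(round(percent_slower(local["download"], remote["download"]), 1))  # 27.8
```

So the download from Dallas was roughly 28% slower than the download from the local Etisalat server.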

ORA-21561: OID generation failed error while configuring Nagios XI to monitor Oracle database server

I ran into the error "ORA-21561: OID generation failed" while configuring Nagios to monitor an Oracle server using Oracle Instant Client 12.1. After spending a lot of time checking all the obvious things (such as the /etc/hosts file), I figured out that this was happening because the hostname of the Nagios server had been changed recently and it hadn't been updated properly.

If you run into a similar issue, check on the Nagios server that the output of the command 'hostname' matches the hostname in the /etc/hosts file.
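
The same check can be scripted. The sketch below assumes the standard hosts-file layout (an IP address followed by one or more names, with # starting a comment); the function names and the sample entries are made up for illustration.

```python
import socket


def names_in_hosts(hosts_text: str) -> set:
    """Collect every hostname and alias listed in hosts-file text."""
    names = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0]  # drop comments
        fields = line.split()
        # The first field is the IP address; the rest are names for it.
        names.update(fields[1:])
    return names


def hostname_is_listed(hostname: str, hosts_text: str) -> bool:
    return hostname in names_in_hosts(hosts_text)


def check_local_host(path: str = "/etc/hosts") -> bool:
    """Compare this machine's hostname against its own hosts file."""
    with open(path) as handle:
        return hostname_is_listed(socket.gethostname(), handle.read())


# Made-up example: the machine was renamed to "nagios2" but the file still says "nagios".
stale_hosts = """\
127.0.0.1   localhost localhost.localdomain
192.168.1.20 nagios.example.com nagios  # old name
"""
print(hostname_is_listed("nagios2", stale_hosts))  # False
```

Calling check_local_host() on the Nagios server returns False when the current hostname is missing from /etc/hosts, which is the situation described above.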

Etisalat eLife Internet connection issues with Shorewall

I have an eLife 8 Mbps connection at home and have always used a Linksys Wi-Fi router running dd-wrt as the gateway. Yesterday I decided to use my existing Ubuntu server for the job to get more 'control' over my network.

On the Ubuntu server I used pppoeconf to set up the connection to Etisalat and then configured Shorewall as the firewall, Squid as a transparent proxy and dnsmasq as the DHCP/DNS server.

Once everything was set up, my wife started complaining that she couldn't access some of the sites that had been working fine before I made the switch. I was able to access the sites from the firewall itself, so she was right for once: I had screwed up. I spent a lot of time trying to narrow down the issue, checking DNS resolution and fiddling with firewall rules. I couldn't fix it and it didn't make much sense, so I decided to RTFM. Sure enough, the fix was right there:

http://www.shorewall.net/FAQ.html#faq33

(FAQ 33) From clients behind the firewall, connections to some sites fail. Connections to the same sites from the firewall itself work fine. What's wrong?

Answer: Most likely, you need to set CLAMPMSS=Yes in /etc/shorewall/shorewall.conf.

From http://shorewall.net/manpages/shorewall.conf.html:

CLAMPMSS=[Yes|No|value]
This parameter enables the TCP Clamp MSS to PMTU feature of Netfilter and is usually required when your internet connection is through PPPoE or PPTP. If set to Yes or yes, the feature is enabled. If left blank or set to No or no, the feature is not enabled.

Important: This option requires CONFIG_IP_NF_TARGET_TCPMSS in your kernel.

You may also set CLAMPMSS to a numeric value (e.g., CLAMPMSS=1400). This will set the MSS field in TCP SYN packets going through the firewall to the value that you specify.

So long story short – CLAMPMSS=Yes is required in /etc/shorewall/shorewall.conf for Etisalat eLife connections.
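
For anyone wondering why PPPoE needs this: the arithmetic below is a rough sketch of the usual explanation, not something taken from the Shorewall documentation. It assumes a standard 1500-byte Ethernet MTU, the 8 bytes of PPPoE/PPP overhead, and 20-byte IPv4 and TCP headers without options. I have not measured the actual MTU on the eLife link.

```python
def tcp_mss(mtu: int, ip_header: int = 20, tcp_header: int = 20) -> int:
    """Largest TCP payload that fits in one packet for a given MTU (IPv4, no options)."""
    return mtu - ip_header - tcp_header


ETHERNET_MTU = 1500
PPPOE_OVERHEAD = 8  # 6-byte PPPoE header + 2-byte PPP protocol field

pppoe_mtu = ETHERNET_MTU - PPPOE_OVERHEAD

print(tcp_mss(ETHERNET_MTU))  # 1460: what a LAN client advertises
print(tcp_mss(pppoe_mtu))     # 1452: what actually fits through the PPPoE link
```

LAN clients sit on plain Ethernet, so they advertise the larger MSS. If a site then sends full-size packets and path MTU discovery does not work (for example because ICMP is filtered somewhere), those packets never arrive. The firewall itself talks over the ppp interface and already uses the smaller value, which would explain why it was not affected. Clamping rewrites the MSS in SYN packets passing through the firewall so that both ends agree on a size that fits.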

Monitor expiration date of SSL certificate using Nagios XI

If you manage more than a handful of web servers, it is not easy to keep track of SSL certificate expiry dates.

Nagios XI makes your life easier by monitoring all your https sites and alerting you 'x' days before a certificate expires. The default value of 'x' is 30 days, but you can change this easily using the web interface.

To set this up, log into the Nagios XI web interface and navigate to Configure, then Run the Monitoring Wizard, then Website. Enter the https site address and click Next. Then select the SSL Certificate option.

[Screenshot: nagios_ssl]
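
If you want to spot-check a single site by hand, the idea behind the alert can be sketched in Python using only the standard library. This is not how Nagios XI implements its check; the function names and the 30-day default threshold are my own choices for illustration.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Return the server certificate's notAfter string, e.g. 'Jan  5 09:34:43 2018 GMT'.

    The default context verifies the certificate, so an already expired or
    self-signed certificate raises ssl.SSLCertVerificationError instead.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between `now` (timezone-aware) and the certificate's expiry."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).days


def needs_alert(not_after: str, now: datetime, threshold_days: int = 30) -> bool:
    return days_until_expiry(not_after, now) <= threshold_days


# Offline example with a fixed date, so the result does not depend on today's date.
example_now = datetime(2017, 12, 6, 9, 34, 43, tzinfo=timezone.utc)
print(days_until_expiry("Jan  5 09:34:43 2018 GMT", example_now))  # 30
```

For a live check you would call something like needs_alert(fetch_not_after("www.example.com"), datetime.now(timezone.utc)), with www.example.com replaced by your own site.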

Recreating a Zimbra Self-Signed SSL Certificate

One of the Zimbra mail servers I manage (version 7.1.4) stopped accepting user logins over IMAP and webmail.

Going through the logs, I saw some errors in /opt/zimbra/log/mailbox.log:

Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path validation failed: java.security.cert.CertPathValidatorException: timestamp check failed
at com.sun.net.ssl.internal.ssl.Alerts.getSSLException(Alerts.java:174)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1731)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:925)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1170)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1197)
at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1181)

I did a restart of Zimbra services to see what happens:

zimbra@mail:~$ zmcontrol restart
Host mail.kannayath.com
Stopping zmconfigd...Done.
Stopping stats...Done.
Stopping mta...Done.
Stopping spell...Done.
Stopping snmp...Done.
Stopping cbpolicyd...Done.
Stopping archiving...Done.
Stopping antivirus...Done.
Stopping antispam...Done.
Stopping imapproxy...Done.
Stopping memcached...Done.
Stopping mailbox...Done.
Stopping logger...Done.
Stopping ldap...Done.
Host mail.kannayath.com
Starting ldap...Done.
Unable to determine enabled services from ldap.
Enabled services read from cache. Service list may be inaccurate.
Starting zmconfigd...Done.
Starting logger...Failed.
Starting logswatch...ERROR: service.FAILURE (system failure: ZimbraLdapContext) (cause: javax.net.ssl.SSLHandshakeException sun.security.validator.ValidatorException: PKIX path validation failed: java.security.cert.CertPathValidatorException: timestamp check failed)
zimbra logger service is not enabled! failed.
Starting mailbox...Done.
Starting memcached...Done.
Starting imapproxy...Done.
Starting antispam...Done.
Starting antivirus...Done.
Starting snmp...Done.
Starting spell...Done.
Starting mta...Done.
Starting stats...Done.

A quick lookup online pointed to an expired certificate, and that's what it was. The certificate installed when Zimbra was set up was only valid for 365 days. This server was set up as a pilot system and later converted to production use, but the certificate was never regenerated.

I had to log in as root and create a new certificate:

/opt/zimbra/bin/zmcertmgr createcrt -new -days 3650 
/opt/zimbra/bin/zmcertmgr deploycrt self

The certificate is now valid for 10 years, so I won't have to worry about it for a while!

A Zimbra server restart worked fine:

zimbra@mail:~$ zmcontrol restart 
Host mail.kannayath.com
	Stopping zmconfigd...Done.
	Stopping stats...Done.
	Stopping mta...Done.
	Stopping spell...Done.
	Stopping snmp...Done.
	Stopping cbpolicyd...Done.
	Stopping archiving...Done.
	Stopping antivirus...Done.
	Stopping antispam...Done.
	Stopping imapproxy...Done.
	Stopping memcached...Done.
	Stopping mailbox...Done.
	Stopping logger...Done.
	Stopping ldap...Done.
Host mail.kannayath.com
	Starting ldap...Done.
	Starting zmconfigd...Done.
	Starting logger...Done.
	Starting mailbox...Done.
	Starting memcached...Done.
	Starting imapproxy...Done.
	Starting antispam...Done.
	Starting antivirus...Done.
	Starting snmp...Done.
	Starting spell...Done.
	Starting mta...Done.
	Starting stats...Done.

Everything looks alright now:

zimbra@mail:~$ zmcontrol status
Host mail.kannayath.com
	antispam                Running
	antivirus               Running
	imapproxy               Running
	ldap                    Running
	logger                  Running
	mailbox                 Running
	memcached               Running
	mta                     Running
	snmp                    Running
	spell                   Running
	stats                   Running
	zmconfigd               Running

In one of my other posts, I have explained how you can set up Nagios XI to monitor the expiration date of an SSL certificate and alert you before it expires.