Thursday, January 17, 2013

Automated vSphere Daily Health Checks Using Python


The on-call staff member in our team is tasked with performing morning health checks of key infrastructure. Basic stuff like checking that we have enough free space in the data stores and that we don’t have any forgotten snapshots running in the background.

Doing these checks manually is a time consuming process so any kind of automation is a good thing™. Rather than logging onto each VMWare Server and manually inspecting the health of the VMWare host I’d rather have something automated. In this case I’ve opted for a script that does these checks for me. As a result it sends me an email in the morning using a traffic light system. Items that are “bad” or “need attention” are highlighted red. Items that are “good” are highlighted in green.

The first version of this script performs basic health checks of the VMWare host. Specifically it checks which VMWare guests are running and which aren’t. It lists all of the snapshots on the system and which virtual machine they belong to. It also checks the data stores and highlights any data stores with less than 20% free space.

In future versions I want the script to go through the logs and highlight any problems found on the VMWare host.

The script is written in Python. Personally I run it from our Linux based monitoring server. I haven’t tested it on Windows however it should work just fine (in theory!). It uses a module called PySphere which needs to be installed on your system. This can be done by running the following command:

sudo easy_install PySphere

Next upload the script to a location where you place your scripts. The script isn’t path dependant so the location does not matter.

Before running the script you will need to update it to match your environment. There’s a parameter section at the top of the script. You will need to update it such that it points to your vSphere server using the correct login credentials. You will also need to edit the SMTP information to use your email server with the relevant to and from email addresses.

 # Connect to ESX Server and perform checks  
 __author__ = "William Brown"  
 __copyright__ = "Copyright 2013"  
 __credits__ = ["William Brown"]  
 __version__ = "1.0.0"  
 __maintainer__ = "William Brown"  
 __email__ = "william.brownw@gmail.com"  
 __status__ = "Production"  
 from pysphere import *  
 import smtplib  
 from email.mime.text import MIMEText  
 from email.mime.multipart import MIMEMultipart  
 import datetime  
 # Change these parameters to match your environment  
 hostname = "hostname"  
 hostip  = "ip address"  
 user   = "username"  
 password = "password"  
 emailfrom = "fromaddress"  
 emailto = "emailtoaddress"  
 smtpsvr = "serveraddress"  
 vmserver = VIServer()  
 vmserver.connect(hostip, user, password)  
 def print_header():  
      html = "<html><head></head><body>"  
      html += "<h1>VMWare ESX Status Report - " + hostname + "</h1>"  
      html += "<table border=1><tr><td><b>Server Type:</b></td><td>" + vmserver.get_server_type() + "</td></tr>"   
      html += "<tr><td><b>VMWare Version:</b></td><td>" + vmserver.get_api_version() + "</td></tr>"  
      html += "<tr><td><b>IP Address:</b></td><td>" + hostip + "</td></tr>"  
      html += "<tr><td>Runtime</td><td>" + str(datetime.datetime.now()) + "</td></tr>"  
      html += "</table>"   
      return html  
 def print_vm_status():  
      html = "<h1>VMWare Guest Status Report</h1>"  
      html += "<table border=1><tr><th>VM Guest Name</th><th>Guest Status</th></tr>"  
      vmlist = vmserver.get_registered_vms()  
      for vm in vmlist:  
           vm_svr = vmserver.get_vm_by_path(vm)  
           if vm_svr.get_status() == "POWERED OFF":  
                html += "<tr bgcolor=#F78181><td>" + vm + "</td><td>" + vm_svr.get_status() + "</td></tr>"  
           else:  
                html += "<tr bgcolor=#81F781><td>" + vm + "</td><td>" + vm_svr.get_status() + "</td></tr>"  
      html += "</table>"  
      return html  
 def print_snapshot_status():  
      html = "<h1>Snapshot Status</h1>"  
     html += "<p>All snapshots are listed below. If there's no entries in this table, there's no snapshots!</p>"  
      html += "<table border=1><tr><th>Virtual Machine</th><th>Snapshot Name</th><th>Date Created</th><th>State</th></tr>"  
      vmlist = vmserver.get_registered_vms()  
     for vm in vmlist:  
         vm_svr = vmserver.get_vm_by_path(vm)  
         vm_snaps = vm_svr.get_snapshots()  
         for vm_snap in vm_snaps:  
                html += "<tr bgcolor=#F78181><td>" + vm + "</td>"  
             html += "<td>" + vm_snap.get_name() + "</td>"  
             html += "<td>" + vm_snap.get_create_time() + "</td>"  
             html += "<td>" + vm_snap.get_state() + "</td>"  
      html += "</table>"  
      return html  
 def print_datastore_status():  
      html = "<h1>Datastore Status</h1>"  
      html += "<p>Note: Datastores with less than 20% free space are highlighted in red</p>"  
      html += "<table border=1><tr><th>Datastore</th><th>Type</th><th>Capacity (GB)</th><th>Freespace (GB)</th><th>Uncommitted Space (GB)</th></tr>"  
      for ds_mor, name in vmserver.get_datastores().items():  
           props = VIProperty(vmserver, ds_mor)  
           percentfree = float(props.summary.freeSpace)/float(props.summary.capacity)  
           if percentfree < 0.2:  
                bgcolor="#F78181"  
           else:  
                bgcolor="#81F781"  
           if hasattr(props.summary, "uncommitted"):  
                 html += "<tr bgcolor=" + bgcolor + "><td>"+str(name)+"</td><td>"+str(props.summary.type)+"</td><td>"+str(props.summary.capacity/1024/1024/1024)+"</td><td>"+str(props.summary.freeSpace/1024/1024/1024)+"</td><td>"+str(props.summary.uncommitted/1024/1024/1024)+"</td></tr>"  
          else:  
                 html+="<tr bgcolor=" + bgcolor + "><td>"+str(name)+"</td><td>"+str(props.summary.type)+"</td><td>"+str(props.summary.capacity/1024/1024/1024)+"</td><td>"+str(props.summary.freeSpace/1024/1024/1024)+"</td><td>N/A</td></tr>"  
      html += "</table>"  
      return html  
 #print_header()  
 #print print_vm_status()  
 #print_datastore_status()  
 body = print_header()  
 body += print_vm_status()  
 body += print_snapshot_status()  
 body += print_datastore_status()  
 body += "</body></HTML>"  
 msg = MIMEMultipart('alternative')  
 #part1 = MIMEText('text', 'plain')  
 part2 = MIMEText(body,'html')  
 #msg.attach(part1)  
 msg.attach(part2)  
 msg['Subject'] = "VMWare Health Check - " + hostname + " - " + str(datetime.datetime.now())  
 msg['From'] = emailfrom  
 msg['To'] = emailto  
 s = smtplib.SMTP(smtpsvr)  
 s.sendmail( emailfrom, emailto, msg.as_string())  
 s.quit  

Finally you will need to setup a cron job (or windows scheduled task) to run this script as required. Personally I run this job first thing in the morning so that it’s in my inbox when I’m ready to do my morning checks.





2 comments:

  1. What version of Python is required for the script to run? I am using Python 3.3 and get a lot of syntax errors.

    ReplyDelete
    Replies
    1. The script was written against Python 2.7.3. Python 3.3 has a lot of changes in syntax and definitely won't work.

      Delete