Python auto update mechanism
In the latest sd-agent release (the monitoring agent for our server monitoring application, Server Density), we included a new update mechanism to allow you to keep the agent up to date by issuing one command. This post will look at how that has been implemented.
If you are installing sd-agent onto multiple servers then keeping it up to date is important. We wanted to allow this to be done with a single command:
python agent.py update
We implemented a very basic mechanism to do this in a previous version, but that simply downloaded the latest .tar.gz from our servers and extracted it. We wanted the rewrite to have the following features:
- Check if there is in fact a new version available.
- Be able to download new files if they are added to the distribution.
- Verify the checksum of each downloaded file against the expected value provided by our server.
- Only overwrite the existing files if they all downloaded correctly.
We specifically implemented it so that it requires the user to initiate the update. There are possible security implications for automated updates that we would rather not deal with – if our servers were ever compromised, we did not want updates being triggered across all our customer servers.
Check for updates
The first step is to actually check for any updates. We provide a page on our website that gives the latest version number in JSON. It also outputs a list of files and their MD5 checksums:
{"version":"1.0.0b4","files":[{"name":"agent.py","md5":"59ff4d213be978c0a2b0e9f330de78b4"},
{"name":"checks.py","md5":"fce251e8cc3baf8ae7474daf691a945c"},
{"name":"daemon.py","md5":"13dfc732ec625de81cf8f836ee8a75f7"},
{"name":"LICENSE","md5":"1b40d466e9eec1075d38acc38f3943fa"},
{"name":"LICENSE-minjson","md5":"7fbc338309ac38fefcd64b04bb903e34"},
{"name":"minjson.py","md5":"6d6cf7429e759023074787a5f42ea3bc"}]}
A simple HTTP request is made to get this:
import httplib
import sys
import urllib2

try:
    request = urllib2.urlopen('http://www.serverdensity.com/agentupdate/')
    response = request.read()

except urllib2.HTTPError, e:
    # HTTPError always carries the status in .code; .reason is not set on older Python 2 releases
    print 'Unable to get latest version info - HTTPError = ' + str(e.code)
    sys.exit(2)

except urllib2.URLError, e:
    print 'Unable to get latest version info - URLError = ' + str(e.reason)
    sys.exit(2)

except httplib.HTTPException, e:
    print 'Unable to get latest version info - HTTPException'
    sys.exit(2)

except Exception, e:
    import traceback
    print 'Unable to get latest version info - Exception = ' + traceback.format_exc()
    sys.exit(2)
This is followed by parsing the JSON. Unfortunately, the json module has only been part of Python since v2.6, and since we support 2.4 and 2.5, we have to fall back to a third-party JSON module (minjson) on those versions.
if int(pythonVersion[1]) >= 6:
    updateInfo = json.loads(response)
else:
    updateInfo = minjson.safeRead(response)
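On a current Python, where json is always available, the parse-and-compare step might look like the sketch below. This is not the agent's own code: version_key and update_available are hypothetical helpers, and the version format (dotted numbers with an optional a, b or rc suffix, as in 1.0.0b4) is an assumption based only on the sample response shown earlier.

```python
import json
import re

# Assumption: pre-releases sort as alpha < beta < rc < final release.
_STAGE_RANK = {"a": 0, "b": 1, "rc": 2, None: 3}
_VERSION_RE = re.compile(r"^(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?$")


def version_key(version):
    """Turn a string such as '1.0.0b4' into a tuple that sorts correctly."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    numbers, stage, stage_number = match.groups()
    release = tuple(int(part) for part in numbers.split("."))
    return (release, _STAGE_RANK[stage], int(stage_number or 0))


def update_available(installed_version, response_text):
    """Return True if the server response advertises a newer version."""
    update_info = json.loads(response_text)
    return version_key(update_info["version"]) > version_key(installed_version)
```

Comparing parsed tuples rather than raw strings avoids surprises such as "1.0.10" sorting before "1.0.9".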
Downloading files
Assuming that the version on the server is newer, we can proceed to download the files. This is simply a case of looping through the files listed in the server response. I wrote a function that can be called recursively so that if the checksum of a download doesn’t match the first time, the download can be tried a second time. It is attempted at most twice so that an infinite loop can’t occur. Plus, if the file is corrupted twice, there is probably some underlying network issue anyway.
for agentFile in updateInfo['files']:
    agentFile['tempFile'] = downloadFile(agentFile)
The downloadFile function is too large to post here so you can view it in the source browser instead.
urllib.urlretrieve() is used because it downloads each file to a temporary location and returns the path, so you can deal with the files afterwards. This lets us avoid overwriting the existing files before checking that the new ones downloaded correctly.
downloadedFile = urllib.urlretrieve('http://www.serverdensity.com/downloads/sd-agent/' + agentFile['name'])
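To give a feel for the download-and-retry idea without reproducing downloadFile itself, here is a minimal sketch in current Python. It uses a loop rather than recursion, and everything in it is illustrative: download_verified is a hypothetical name, and fetch and checksum are hypothetical callables you would supply (one returns the temporary path of a download, the other returns the hex checksum of a file).

```python
def download_verified(name, expected_md5, fetch, checksum, attempts=2):
    """Return the temp path of a download whose checksum matches.

    fetch(name) is assumed to download the file and return its temp path;
    checksum(path) is assumed to return the hex digest of that file.
    """
    for _ in range(attempts):
        temp_path = fetch(name)
        if checksum(temp_path) == expected_md5:
            return temp_path
    # Every attempt produced a bad file, so give up rather than loop forever.
    raise IOError("%s failed checksum verification %d times" % (name, attempts))
```

In real use, fetch could wrap urllib.request.urlretrieve, which returns a (path, headers) tuple. Passing the two callables in also makes the retry logic easy to test without touching the network.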
MD5 checksum of a file
Calculating the checksum of a file is not as easy as in PHP where you can just call md5_file(). You must first read the entire file and pass it to the MD5 functions. The problem with this is that if it is a big file, you will use up a lot of memory. The files in sd-agent are currently small so this isn’t an issue but I wanted to ensure that any future additions were also covered.
checksum = md5.new()
f = file(downloadedFile[0], 'rb')

# Although the files are small, we can't guarantee the available memory nor that there
# won't be large files in the future, so read the file in small parts (1 KB at a time)
while True:
    part = f.read(1024)
    if not part:
        break # end of file
    checksum.update(part)

f.close()
The checksum is then compared with the expected value and, if it doesn’t match, the function is called again, but only once. If it fails a second time, an error is printed.
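The md5 module was deprecated in Python 2.5 in favour of hashlib and no longer exists in Python 3, so on a current interpreter the same chunked read and comparison might be sketched as follows. The names md5_of_file and file_matches are hypothetical, not part of sd-agent.

```python
import hashlib


def md5_of_file(path, chunk_size=1024):
    """Hash a file in small chunks so memory use stays constant."""
    checksum = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() keeps calling read() until it returns b"" at end of file.
        for part in iter(lambda: handle.read(chunk_size), b""):
            checksum.update(part)
    return checksum.hexdigest()


def file_matches(path, expected_md5):
    """Compare a file's MD5 with the hex digest supplied by the server."""
    return md5_of_file(path) == expected_md5.lower()
```

The with block closes the file even if reading fails part-way through, which the explicit f.close() does not guarantee.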
Updating files
You will notice in the file loop above that the returned path is stored, so that if all the files downloaded correctly, we can loop through them again and this time overwrite the existing ones. If there is an existing file, it is removed before the new one is put in its place.
for agentFile in updateInfo['files']:
    print 'Updating ' + agentFile['name']

    if os.path.exists(agentFile['name']):
        os.remove(agentFile['name'])

    os.rename(agentFile['tempFile'], agentFile['name'])
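One caveat worth knowing: os.rename raises an error when the source and destination are on different filesystems, which can happen when the system temp directory lives on its own partition. A hedged sketch of a more forgiving install step, in current Python, is shown below. install_files and its dest_dir parameter are hypothetical; the dictionaries follow the name/tempFile layout used in the loop above.

```python
import os
import shutil


def install_files(files, dest_dir="."):
    """Move every downloaded temp file into dest_dir, replacing old copies."""
    # Refuse to touch anything unless every download is actually present.
    missing = [f["name"] for f in files if not os.path.isfile(f["tempFile"])]
    if missing:
        raise IOError("missing downloads: " + ", ".join(missing))

    for agent_file in files:
        target = os.path.join(dest_dir, agent_file["name"])
        if os.path.exists(target):
            os.remove(target)
        # shutil.move falls back to copy-and-delete when a plain rename
        # fails, for example across filesystems.
        shutil.move(agent_file["tempFile"], target)
```

Checking for every temp file up front keeps the "only overwrite if everything downloaded" guarantee in one place.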
And that’s it! The user is told to restart the agent and they are fully up to date. All code presented above is released under the Simplified BSD license, which sd-agent itself is also licensed under.
