EC2 EBS Backup Python script
April 25, 2012
This is a simple EC2 backup script that snapshots the listed EBS volumes daily. The script keeps a maximum number of daily, weekly and monthly snapshots per volume. It also checks whether the daily backup has already been made or is in progress, so it does not create duplicates for a single day.
Prerequisites
1. EC2 command line tools.
Check that you can run them from the command line:
$ ec2-describe-snapshots
SNAPSHOT snap-070cba6c vol-123123 completed 2012-04-19T02:06:54+0000 100% 457025778133 my.com root
SNAPSHOT snap-170cba7c vol-455445 completed 2012-04-19T02:07:09+0000 100% 457025778133 10 my.net root
...
2. Fabric administration and deployment scripting tool
Install with easy_install or pip:
$ sudo easy_install fabric
See http://docs.fabfile.org/en/1.4.1/installation.html for more details
3. The script.
Copy the following to ec2-backup.py and replace the BACKUP_VOLS dictionary with your own volumes and their descriptions. The script is also available on GitHub.
# Written for Python 2 and Fabric 1.x (run with the fab command line tool).
import os, sys, time
import dateutil.parser
from datetime import date, timedelta, datetime
from fabric.api import (local, settings, abort, run, lcd, cd, put)
from fabric.contrib.console import confirm
from fabric.api import env

# For each volume, define how many daily, weekly and monthly backups
# you want to keep. For weeks, Monday's backup is kept, and for months
# the one from the 1st day.
BACKUP_VOLS = {
    'vol-abc1234': {'comment': 'my.com root', 'days': 7, 'weeks': 4, 'months': 4},
    'vol-1234565': {'comment': 'my.com database', 'days': 7, 'weeks': 4, 'months': 4},
}

today = date.today()
snapshots = {}  # volume -> list of (snapshot, status, date), newest first
hastoday = {}   # volume -> details of the snapshot already taken today
savedays = {}   # volume -> list of dates whose snapshots are retained

# retained snapshot days for each volume
for (volume, conf) in BACKUP_VOLS.items():
    daylist = savedays[volume] = []
    # last n days
    for c in range(conf['days'] - 1, -1, -1):
        daylist.append(today - timedelta(days=c))
    # last n weeks (get mondays)
    monday = today - timedelta(days=today.isoweekday() - 1)
    daylist.append(monday)
    for c in range(conf['weeks'] - 1, 0, -1):
        daylist.append(monday - timedelta(days=c * 7))
    # last n months (first day of month); months are counted from year 0
    # so that the subtraction wraps correctly across a year boundary
    for c in range(conf['months'] - 1, -1, -1):
        year, month0 = divmod(today.year * 12 + today.month - 1 - c, 12)
        daylist.append(datetime(year, month0 + 1, 1).date())

# read the existing snapshots; the output is tab-separated
SNAPSHOTS = local('ec2-describe-snapshots', capture=True).split('\n')
SNAPSHOTS = [tuple(l.split('\t')) for l in SNAPSHOTS if l.startswith('SNAPSHOT')]

for (_, snapshot, volume, status, datestr, progress, _, _, _) in SNAPSHOTS:
    snapshotdate = dateutil.parser.parse(datestr).date()
    if volume in BACKUP_VOLS:
        if snapshotdate == today:
            hastoday[volume] = {'status': status, 'snapshot': snapshot,
                                'progress': progress.replace('%', '')}
        if volume not in snapshots:
            snapshots[volume] = []
        snapshots[volume].append((snapshot, status, snapshotdate))

# newest snapshot first
for snapshotlist in snapshots.values():
    snapshotlist.sort(key=lambda x: x[2], reverse=True)

# volumes without any snapshot yet still need an entry
for volume in BACKUP_VOLS.keys():
    if volume not in snapshots:
        snapshots[volume] = []

print "VOLUME\tSNAPSHOT\tSTATUS\tDATE\tDESC"
for (volume, snapshotlist) in snapshots.items():
    for (snapshot, status, date) in snapshotlist:
        datestr = date.strftime('%Y-%m-%d')
        print "%s\t%s\t%s\t%s\t%s" % (volume, snapshot, status, datestr,
                                      BACKUP_VOLS[volume]['comment'])


def status():
    # the snapshot listing above is printed whenever the file is loaded
    pass


def backup(dryrun=False):
    print "\nCREATING SNAPSHOTS"
    for (volume, snapshotlist) in snapshots.items():
        if volume in hastoday:
            print '%s has %s%% %s snapshot %s for today "%s"' % (
                volume, hastoday[volume]['progress'], hastoday[volume]['status'],
                hastoday[volume]['snapshot'], BACKUP_VOLS[volume]['comment'])
        else:
            print 'creating snapshot for %s "%s"' % (volume, BACKUP_VOLS[volume]['comment'])
            snapshotlist.insert(0, ('new', 'incomplete', today))
            if not dryrun:
                local('ec2-create-snapshot %s -d "%s"' % (volume, BACKUP_VOLS[volume]['comment']))

    print "\nDELETING OLD SNAPSHOTS"
    for (volume, snapshotlist) in snapshots.items():
        for (snapshot, _, date) in snapshotlist:
            if date not in savedays[volume]:
                datestr = date.strftime('%Y-%m-%d')
                print "deleting %s %s for %s (%s)" % (snapshot, datestr, volume,
                                                      BACKUP_VOLS[volume]['comment'])
                if not dryrun:
                    # keep going even if a single delete fails
                    with settings(warn_only=True):
                        local('ec2-delete-snapshot %s' % snapshot)


def dryrun():
    print """
*** DRY RUN ***
"""
    backup(dryrun=True)
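The retention rule in the script (last n days, Mondays of the last n weeks, first day of the last n months) can also be written as a small pure function, which makes it easy to try with different dates. This is only a sketch: the function name retained_days and its parameters are made up for illustration and are not part of the script. Months are counted from year 0 so that going back past January lands in the previous year.

```python
from datetime import date, timedelta


def retained_days(today, days, weeks, months):
    """Return the set of dates whose snapshots would be kept (hypothetical helper)."""
    keep = set()
    # last `days` days, including today
    for c in range(days):
        keep.add(today - timedelta(days=c))
    # Monday of the current week and of the previous `weeks - 1` weeks
    monday = today - timedelta(days=today.isoweekday() - 1)
    for c in range(weeks):
        keep.add(monday - timedelta(days=7 * c))
    # first day of the current month and of the previous `months - 1` months
    for c in range(months):
        # month index counted from year 0, so the subtraction wraps across years
        year, month0 = divmod(today.year * 12 + today.month - 1 - c, 12)
        keep.add(date(year, month0 + 1, 1))
    return keep


if __name__ == "__main__":
    # a date in February, so that the monthly rule reaches into the previous year
    for day in sorted(retained_days(date(2012, 2, 15), days=7, weeks=4, months=4)):
        print(day.isoformat())
```

With the 7/4/4 settings from BACKUP_VOLS and a date in mid February, the kept dates include the first days of November and December of the previous year.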
You can dry run the script first to see what it would do:
$ fab -f ec2-backup.py dryrun
To make the actual backup:
$ fab -f ec2-backup.py backup
Example output:
$ fab -f ec2-backup.py backup
[localhost] local: ec2-describe-snapshots
VOLUME SNAPSHOT STATUS DATE DESC
vol-abc1234 snap-48fe4023 completed 2012-04-24 my.com database
vol-abc1234 snap-23863a48 completed 2012-04-23 my.com database
vol-abc1234 snap-838131e8 completed 2012-04-20 my.com database
vol-abc1234 snap-1b0cba70 completed 2012-04-19 my.com database
vol-abc1234 snap-0d4ffb66 completed 2012-04-17 my.com database
vol-1234565 snap-42fe4029 completed 2012-04-24 my.com root
vol-1234565 snap-25863a4e completed 2012-04-23 my.com root
vol-1234565 snap-858131ee completed 2012-04-20 my.com root
vol-1234565 snap-1f0cba74 completed 2012-04-19 my.com root
vol-1234565 snap-034ffb68 completed 2012-04-17 my.com root

CREATING SNAPSHOTS
creating snapshot for vol-abc1234 "my.com database"
[localhost] local: ec2-create-snapshot vol-abc1234 -d "my.com database"
SNAPSHOT snap-8ccd74e7 vol-abc1234 pending 2012-04-25T02:18:58+0000 457025778133 50 my.com database
creating snapshot for vol-1234565 "my.com root"
[localhost] local: ec2-create-snapshot vol-1234565 -d "my.com root"
SNAPSHOT snap-86cd74ed vol-1234565 pending 2012-04-25T02:19:03+0000 457025778133 8 my.com root

DELETING OLD SNAPSHOTS
deleting snap-0d4ffb66 2012-04-17 for vol-abc1234 (my.com database)
[localhost] local: ec2-delete-snapshot snap-0d4ffb66
SNAPSHOT snap-0d4ffb66
deleting snap-034ffb68 2012-04-17 for vol-1234565 (my.com root)
[localhost] local: ec2-delete-snapshot snap-034ffb68
SNAPSHOT snap-034ffb68

Done.
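To see why only the 2012-04-17 snapshot is deleted in this run, here is a rough sketch of the selection step in isolation. The helper name snapshots_to_delete is hypothetical, and the set of kept dates is written out by hand for a run on 2012-04-25 with 7 days, 4 weeks and 4 months; the snapshot IDs and dates are taken from the example output.

```python
from datetime import date


def snapshots_to_delete(snapshotlist, keep_days):
    """Return the IDs of snapshots whose date is not retained (hypothetical helper).

    snapshotlist holds (snapshot_id, status, snapshot_date) tuples,
    the same shape the script stores per volume.
    """
    return [snap for (snap, _status, day) in snapshotlist if day not in keep_days]


# dates kept on 2012-04-25: the last 7 days, four Mondays, four first-of-months
keep = set(date(2012, 4, d) for d in range(19, 26))
keep |= set([date(2012, 4, 23), date(2012, 4, 16), date(2012, 4, 9), date(2012, 4, 2)])
keep |= set(date(2012, m, 1) for m in range(1, 5))

# snapshots of vol-abc1234 from the listing in the example output
existing = [
    ('snap-48fe4023', 'completed', date(2012, 4, 24)),
    ('snap-23863a48', 'completed', date(2012, 4, 23)),
    ('snap-838131e8', 'completed', date(2012, 4, 20)),
    ('snap-1b0cba70', 'completed', date(2012, 4, 19)),
    ('snap-0d4ffb66', 'completed', date(2012, 4, 17)),
]

if __name__ == "__main__":
    print(snapshots_to_delete(existing, keep))
```

April 17 is a Tuesday that falls outside the 7-day window, so it matches none of the three rules.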
If you run it again on the same day, it reports the snapshots that already exist or are still in progress instead of creating duplicates:
...
CREATING SNAPSHOTS
vol-abc1234 has 55% pending snapshot snap-8ccd74e7 for today "my.com database"
vol-1234565 has 100% completed snapshot snap-86cd74ed for today "my.com root"
...
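The duplicate check comes down to parsing the tab-separated SNAPSHOT lines and looking for one dated today. A minimal sketch of that idea follows, assuming the column order the script itself relies on. The helper names are made up, the sample lines are assembled from the example output above, and the date is read from the first ten characters of the timestamp with the standard library rather than with dateutil as the script does.

```python
from datetime import date, datetime


def parse_snapshot_line(line):
    """Parse one tab-separated SNAPSHOT line; return None for other lines."""
    fields = line.split('\t')
    if len(fields) < 6 or fields[0] != 'SNAPSHOT':
        return None
    _, snapshot, volume, status, datestr, progress = fields[:6]
    # timestamps look like 2012-04-25T02:18:58+0000; only the date part is used
    day = datetime.strptime(datestr[:10], '%Y-%m-%d').date()
    return {'snapshot': snapshot, 'volume': volume, 'status': status,
            'date': day, 'progress': progress.rstrip('%')}


def todays_snapshots(lines, volumes, today):
    """Map each watched volume to its snapshot dated `today`, if there is one."""
    found = {}
    for line in lines:
        rec = parse_snapshot_line(line)
        if rec and rec['volume'] in volumes and rec['date'] == today:
            found[rec['volume']] = rec
    return found


# illustrative input, put together from the example output in this post
SAMPLE = [
    'SNAPSHOT\tsnap-8ccd74e7\tvol-abc1234\tpending\t2012-04-25T02:18:58+0000\t55%\t457025778133\t50\tmy.com database',
    'SNAPSHOT\tsnap-86cd74ed\tvol-1234565\tcompleted\t2012-04-25T02:19:03+0000\t100%\t457025778133\t8\tmy.com root',
    'SNAPSHOT\tsnap-48fe4023\tvol-abc1234\tcompleted\t2012-04-24T02:18:58+0000\t100%\t457025778133\t50\tmy.com database',
]

if __name__ == "__main__":
    found = todays_snapshots(SAMPLE, set(['vol-abc1234', 'vol-1234565']), date(2012, 4, 25))
    for volume, rec in sorted(found.items()):
        print('%s has %s%% %s snapshot %s for today' % (
            volume, rec['progress'], rec['status'], rec['snapshot']))
```

Any volume that shows up in the result already has a snapshot for the day, so a new one does not need to be created for it.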
This is very useful. Thanks!
Thanks for the script, in case of databases, I found this very useful piece of information:
http://aws.amazon.com/articles/1663?_encoding=UTF8&jiveRedirect=1
Cheers
Fabio