Thingy Ma Jig is the blog of Nicholas Thompson and contains any useful tips, sites and general blog-stuff which are considered interesting or handy!
Posted on 26 November 2008 in
tip
linux
How to
geek
Drupal
code
bash
I was just dumping a database using mysqldump and I noticed that tables such as cache (and its cousins cache_page, cache_menu, etc), sessions and watchdog can be pretty big and are also not often essential for backing up. I mean, when you restore your web site do you really care about restoring people's logged in sessions from when the backup took place? I can understand maybe keeping watchdog; but then again should you lose your site you would probably lose it several hours after the backup so would miss out on any relevant watchdog notices.
Anywho… I did a mysqdump for a large database for a website I maintain and the dump came out at 400Mb. I then spent a few minutes cobling together a small script which would do a mysqldump but had some pre-programmed Regular Expressions to match specific groups of tables which it would only dump the structure for (ie, no data). After running this script, the SQL dump was only 220Mb. Much better! It also runs considerable quicker too and will cause less table locking.
So - the script?
#!/bin/bash
# Define DB Login
USER="username"
PASS="password"
# Get the database from the commandline
DB=$1
# Define the 'structure only' tables
STRUCTURE_ONLY="/^(prefix1_|prefix2_)?(watchdog|sessions|cache(_.+)?)$/"
# Get the tables from the database
TABLES=`mysql -u$USER -p$PASS -B -N -e 'show tables;' $DB`
# Create the SQL file
DBFILE="${DB}.sql"
> $DBFILE
# Status message
echo "Starting dump of ${DB}"
# Loop over the tables
for t in $TABLES; do
# Test if the table matches the 'structur only' regex
RESULT=`echo "$t" | gawk "$STRUCTURE_ONLY"`
# if a match...
if [ $RESULT ]
then
# ... dump structure only onto the end of the SQL file
mysqldump --opt --no-data --user=$USER --password=$PASS $DB $t >> $DBFILE
else
# dump full table onto the end of the SQL file
mysqldump --opt --user=$USER --password=$PASS $DB $t >> $DBFILE
fi
done
# Finish Message
echo "Done"
There are a few lines in here you might want to configure… For example, the USER & PASS variables will need to be the ones for your database. This script also assumes you're running on localhost. The variable STRUCTURE_ONLY hold the Regular Expression to match the 'no data' tables. You might want to change this to match your database structure more accurately - for example you might want to change or remove the prefix option. In my case I have 1 database hosting 3 sites, 2 of which have a prefix but share several tables with no prefix (user, session, sequences, etc). This is how it would look for a database with no prefixes.
STRUCTURE_ONLY="/^(watchdog|sessions|cache(_.+)?)$/"
To run the script, copy the code into a file (eg, 'drupaldump') and save it. Make it executable by running something like this:
$ chmod +x drupaldump
The finally run the script as follows…
$ ./drupaldump [database]
… where [database] is the name of the database you'd like to dump.
Please leave a comment if you can think of any improvements! There are a few I can think of but don't need (ie, its not my itch!) such as option to pass username and password in from the command line, changing the host for the database… There is plenty of space for improvement.