deploymentclouddatabase-managementdev-to-productionscalr

Code & data tracking / deployment


For a long time now, we've held our data within the project's repository. We just held everything under data/sql, and each table had its own create_tablename.sql and data_tablename.sql files.

We have now just deployed our 2nd project onto Scalr and we've realised it's a bit messy.

The way we deploy:

We have a "packageup" collection of scripts which tear apart the project into 3 archives (data, code, static files) which we then store in 3 separate buckets on S3.

Whenever a role starts up, it downloads one of the files (depending on the role: data, nfs or web) and then a "unpackage" script sets up everything for each role, loads the data into mysql, sets up the nfs, etc.

We do it like this because we don't want to save server images, we always start from vanilla instances onto which we install everything from scratch using various in-house built scripts. Startup time isn't an issue (we have a ready to use farm in 9 minutes).

The issue is that it's a pain trying to find the right version of the database whenever we try to setup a new development build (at any point in time, we've got about 4 dev builds for a project). Also, git is starting to choke once we go into production, as the sql files end up totalling around 500mb.

The question is:

How is everyone else managing databases? I've been looking for something that makes it easy to take data out of production into dev, and also migrating data from dev into production, but haven't stumbled upon anything.


Solution

  • You should seriously take a look at dbdeploy (dbdeploy.com). It is ported to many languages, the major ones being Java and PHP. It is integrated in build-tools like Ant and Phing, and allows easy sharing of so called delta files.

    A delta file always consists of a deploy section, but can also contain an undo section. When you commit your delta file and another developer checks it out, he can just run dbdeploy and all new changes are automatically applied to his database.

    I'm using dbdeploy for my open source blog, so you can take a look on how delta files are organized: http://site.svn.dasprids.de/trunk/sql/deltas/