GitHub is critical to our deployment and to our general workflow as a team. We keep all of our projects in private GitHub repos, and open source whatever applications we can abstract away. As far as deployment goes, we essentially use GitHub to manage, track, and store deployment artifacts.
Each of our deployed projects has three distinct, persistent branches: Develop, Alpha, and Production. Develop is our first “terminal” branch: changes that make their way into Develop will eventually make their way into production. For this reason, most of our work starts out in feature, bug, and other branches derived from Develop.
When someone wants to deploy an application, we merge changes from Alpha into Production and then create a new deployable git tag, which is pushed up to GitHub.
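A minimal sketch of that cut-and-tag step, assuming a `v<major>.<minor>.<patch>` tag scheme (the naming convention here is an assumption, not the team's documented one):

```python
import subprocess

def next_tag(latest: str) -> str:
    """Compute the next patch-level tag, e.g. v1.4.2 -> v1.4.3.
    The v<major>.<minor>.<patch> scheme is an assumed convention."""
    major, minor, patch = latest.lstrip("v").split(".")
    return f"v{major}.{minor}.{int(patch) + 1}"

def cut_release(latest_tag: str) -> str:
    """Merge Alpha into Production, then create and push a deployable tag."""
    tag = next_tag(latest_tag)
    for cmd in (
        ["git", "checkout", "Production"],
        ["git", "merge", "--no-ff", "Alpha"],
        ["git", "tag", "-a", tag, "-m", f"Deploy {tag}"],
        ["git", "push", "origin", "Production", tag],
    ):
        subprocess.run(cmd, check=True)
    return tag
```

`--no-ff` keeps an explicit merge commit on Production, so the branch history shows exactly when each release was cut.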
At deploy time we use fabric to remote into each production node and do the following:
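The per-node steps themselves aren't reproduced here; a minimal Fabric 2.x sketch of the pattern, with hypothetical paths, host names, and a restart command that are assumptions on my part, might look like:

```python
def node_commands(tag: str, app_dir: str = "/srv/app") -> list[str]:
    """Shell commands run on each node; the path and service name are assumptions."""
    return [
        f"cd {app_dir} && git fetch --tags origin",
        f"cd {app_dir} && git checkout {tag}",
        "sudo systemctl restart app",  # hypothetical restart step
    ]

def deploy(tag: str, hosts: list[str]) -> None:
    """Remote into each production node and run the per-node steps."""
    from fabric import Connection  # Fabric 2.x
    for host in hosts:
        with Connection(host) as c:
            for cmd in node_commands(tag):
                c.run(cmd)
```

Keeping the command list in a pure function makes the per-node steps easy to review and test without touching a live node.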
Currently, when we make a deployment and need to revert, the process is fairly simple as long as there are no migrations: we roll back by deploying the previous successful build to each production node. If the app has migrations, we first run a migration rollback on the bad code, and then redeploy the old (good) code.
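The ordering matters: the migration rollback has to run while the bad code (which knows the new schema) is still deployed. A small sketch of that decision, with the step wording purely illustrative:

```python
def rollback_plan(bad_tag: str, good_tag: str, has_migrations: bool) -> list[str]:
    """Ordered recovery steps for reverting a bad deploy."""
    steps = []
    if has_migrations:
        # Migrate the database back down first, using the bad release's
        # migration files, since only it knows how to undo the new schema.
        steps.append(f"on {bad_tag}: run the migration rollback")
    # Then put the last known-good build back on every node.
    steps.append(f"redeploy {good_tag} to every production node")
    return steps
```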
Two things could still be better: migrations should be run separately from code deployment, and our disaster recovery plan for migration and code failures should be beefed up. Both are “okay” now, but could benefit from better automation.
We’ve built and tested a disaster recovery plan that uses the GitHub API, the API of an in-house project management tool, and fabric. In the event of a full disaster and system shutdown, we can bring up new production nodes and redeploy our full suite of applications in under 1 hour. To accomplish this, our in-house project management application tracks which apps are currently in production and their GitHub repos. During recovery we call GitHub’s API to get the max (latest) version of each app, and deploy it to each server node.
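The "get the max version" step can be done against GitHub's list-tags endpoint; a sketch, assuming the same `vX.Y.Z` tag naming as above:

```python
import json
import urllib.request

def latest_version(tags: list[str]) -> str:
    """Pick the highest version tag, comparing numerically (v1.10.0 > v1.9.9)."""
    def key(tag: str) -> tuple:
        return tuple(int(part) for part in tag.lstrip("v").split("."))
    return max(tags, key=key)

def repo_tags(owner: str, repo: str) -> list[str]:
    """Fetch tag names from GitHub's REST API list-tags endpoint."""
    url = f"https://api.github.com/repos/{owner}/{repo}/tags"
    with urllib.request.urlopen(url) as resp:
        return [t["name"] for t in json.load(resp)]
```

Comparing tags numerically rather than as strings matters once a component passes 9, since string comparison would rank "v1.9.9" above "v1.10.0".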
I’m not a sysadmin, so the details here are a little beyond my own skill set. What I do know is that we run our entire stack on Amazon Web Services. Nothing on our machines is persistent; we store any persistent data in Amazon’s S3 storage service, and because of this we make heavy use of the boto library for Python. Our database is also managed by AWS.
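A sketch of pushing a build artifact to S3, using boto3 (the modern successor to the boto library mentioned above); the bucket layout and key naming here are assumptions:

```python
def artifact_key(app: str, tag: str) -> str:
    """S3 key layout for persistent build artifacts (naming scheme is an assumption)."""
    return f"{app}/releases/{tag}.tar.gz"

def upload_artifact(bucket: str, app: str, tag: str, local_path: str) -> None:
    """Store a release tarball in S3 so nothing persistent lives on the nodes."""
    import boto3
    boto3.client("s3").upload_file(local_path, bucket, artifact_key(app, tag))
```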
We use Chef to bring up a repeatable environment quickly.