Moving from Subversion to Git
We recently moved our project from Subversion to Git, and so far the move has gone very smoothly. This post details what we did to make the move.
For this project there are 2 pairs working, and we have 6 machines and one hosted service:
- Two Mac OSX workstations with IDEA
- One continuous integration server running Cruise Control
- A staging server running a 2-year old version of Ubuntu
- A subversion server
- A production server hosted on Engine Yard
- A Github account with the ability to create private repositories
The goal was to have one pair continue to work while the migration from svn to git was happening.
Getting the right software
Every machine must have git installed for this to work, and at least one workstation must have git svn installed. On Mac OSX we were able to install git reliably via macports with
sudo port install git-core
For the workstation that needed git svn it was a little more difficult. On a fresh install of OSX with macports, you should just be able to run
sudo port install git-core +svn, but this requires perl bindings for svn and there are a number of dependencies that might be wrong if you’ve installed subversion from source or have conflicting macports.
On one machine, we were able to get past it by uninstalling and reinstalling subversion before running
sudo port install git-core +svn
On a modern Ubuntu machine you should be able to run
sudo apt-get install git-core, but since our Ubuntu machine is a few versions behind we had to install from source from http://git.or.cz/.
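Before going further it can help to confirm what each machine actually has installed. The following is a minimal sketch, not something from our original setup: the helper name have_tool is made up for illustration, and the check relies only on the standard `command -v` builtin and on `git svn --version`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the named command is on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Every machine needs git; the conversion workstation also needs svn.
for tool in git svn; do
    if have_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing"
    fi
done

# git svn is a separate subcommand and can be absent even when git is installed.
if have_tool git && git svn --version >/dev/null 2>&1; then
    echo "git svn: found"
else
    echo "git svn: missing"
fi
```

Running this on each of the machines gives a quick picture of which ones still need a package install or a build from source.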
We planned on converting our svn externals to piston once we moved to git, so we also installed piston. We followed the instructions from http://technicalpickles.com/posts/piston-and-git-for-the-win to install it locally, since it’s not available as a standard gem.
Cloning the subversion repository
To clone the repository we used git svn. We only had a trunk on our project, so it was a bit easier for us – to start, all we had to do was:
git svn clone svn://our/repository/trunk
If you have branches and tags you may want to add the
--trunk, --branches and --tags flags as well – see the docs for more info.
For us the cloning took almost half an hour. When it was done we had a branch named git-svn (visible only with
git branch -a), which we converted to master with:
git checkout -b master git-svn
We noticed when we looked at
git log that we were missing several months from the commit history, even though the git svn command exited without error. To grab the rest of the revisions we ran
git svn rebase and it finished the import. I’m not sure if this is a documented behavior or a bug, but it was surprising to us even though it was easily fixed.
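Since the clone can stop short without reporting an error, a quick sanity check on the imported history seems worthwhile. Here is a rough sketch of one way to do it. The function name history_summary is hypothetical, and the demo builds a throwaway repository with made-up dates rather than touching a real import. On a real clone you would compare the numbers against what svn log reports for the trunk; by default git svn also records a git-svn-id line in each imported commit message, which shows the last svn revision that made it across.

```shell
#!/bin/sh
# Hypothetical helper: print commit count plus oldest and newest commit dates.
history_summary() {
    repo_dir=$1
    echo "commits: $(git -C "$repo_dir" rev-list --count HEAD)"
    echo "oldest:  $(git -C "$repo_dir" log --reverse --format=%ad --date=short | head -n 1)"
    echo "newest:  $(git -C "$repo_dir" log -1 --format=%ad --date=short)"
}

# Demo in a throwaway repository with two commits on made-up dates.
demo_repo=$(mktemp -d)
git init -q "$demo_repo"
git -C "$demo_repo" config user.name "Demo User"
git -C "$demo_repo" config user.email "demo@example.com"
for day in 2008-01-15 2008-09-01; do
    echo "$day" > "$demo_repo/notes.txt"
    git -C "$demo_repo" add notes.txt
    GIT_AUTHOR_DATE="${day}T12:00:00" git -C "$demo_repo" commit -q -m "Work from $day"
done

history_summary "$demo_repo"
```

If the newest date is months older than the last svn checkin, that is the cue to run git svn rebase again.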
Attributing checkins to pairs
We used to add the pair’s initials to the beginning of our svn commit comments, but with git there is a more elegant way to do it by changing the name of the pair.
In its simplest form, you can just change the
git config user.name every time a new pair sits down. Brian Takita suggested listing all the pairs in the .git/config file, and just uncommenting the right one in the morning, which saves some time.
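As a minimal sketch of the simple form, the snippet below wraps the config change in a small function. The function name set_pair, the pair names and the email address are all placeholders, and the demo runs in a throwaway repository so nothing real is modified.

```shell
#!/bin/sh
# Hypothetical helper: set the commit author for the pair at the keyboard.
# Without --global this writes to the repository's own .git/config only.
set_pair() {
    repo_dir=$1
    pair_name=$2
    git -C "$repo_dir" config user.name "$pair_name"
}

# Demo in a throwaway repository.
demo_repo=$(mktemp -d)
git init -q "$demo_repo"
git -C "$demo_repo" config user.email "pair@example.com"   # placeholder address

set_pair "$demo_repo" "Alice & Bob"                        # made-up pair names
echo "hello" > "$demo_repo/README"
git -C "$demo_repo" add README
git -C "$demo_repo" commit -q -m "First commit as a pair"

# Show who the commit is attributed to.
git -C "$demo_repo" log -1 --format='%an'
```

The author name is stored per commit, so switching pairs later does not rewrite anything that was already committed.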
There are much more complex solutions as well, like this one: http://www.brynary.com/2008/9/1/setting-the-git-commit-author-to-pair-programmers-names (well worth the read even if you don’t end up using it).
Any files that were ignored in svn are no longer ignored in git. We added several standard files to our ignore list, like the
- IDEA project files
Ignoring the log entries seemed to remove the log directory entirely (git does not track empty directories). That meant we had to create the log directory manually on each of our servers, but it still seemed like the right thing to do.
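To make the ignore setup concrete, here is a rough sketch in a throwaway repository. The patterns are assumptions for illustration (the three IDEA project file extensions and a log pattern), not a copy of our actual ignore list. The .gitkeep placeholder is only a naming convention, shown as one alternative to recreating the log directory by hand; we did not use it ourselves.

```shell
#!/bin/sh
# Sketch: recreate svn-style ignores in a throwaway Git repository.
demo_repo=$(mktemp -d)
git init -q "$demo_repo"

# Hypothetical patterns; adjust to whatever your svn:ignore properties held.
cat > "$demo_repo/.gitignore" <<'EOF'
*.iml
*.ipr
*.iws
log/*.log
EOF

# An otherwise empty directory is not tracked, so a placeholder file
# keeps log/ present in fresh clones.
mkdir -p "$demo_repo/log"
: > "$demo_repo/log/.gitkeep"
: > "$demo_repo/log/development.log"

# check-ignore exits 0 when the path matches an ignore pattern.
if git -C "$demo_repo" check-ignore -q log/development.log; then
    echo "log/development.log is ignored"
fi
if ! git -C "$demo_repo" check-ignore -q log/.gitkeep; then
    echo "log/.gitkeep is not ignored"
fi
```

If the ignore list in svn was long, git svn show-ignore can print the existing svn:ignore properties in a form suitable for a .gitignore file.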
Our git svn clone did not pull in any of our externals. We added these back using piston, although it only worked for a few of the externals. The latest version of piston has support for adding svn repositories to git projects. The syntax is the same as it normally is:
cd your-project
piston import svn://some/svn/repo vendor/plugins/some_plugin
For us this worked about half the time, and the other half it hung endlessly and never finished the svn checkout. We still don’t know why that happened. For the plugins that we couldn’t piston we just svn exported those to our project, and once we figure out how to fix piston we’ll piston those as well.
Brian Takita mentioned that when there is a big pending changelist, piston becomes slower on git. In our case, we tried when there were no pending changes, and it still hung.
Adding git support to IDEA
Since we use IDEA for development, we decided to install a git plugin for IDEA to make it easier to view. We decided on git4idea because the project has been more active on github recently and seemed to have a decent feature set.
git clone git://github.com/markscott/git4idea.git
cd git4idea
cp Git4Idea.jar /Library/Applications/<your IDEA directory>/plugins
Then restart IDEA to make sure it takes effect. We also added the .git directory to the list of ignored modules so that it wouldn’t appear in search history. A few things to note with this plugin:
- when you commit, it does not push – the git workflow typically involves committing locally and then pushing in two separate steps
- when you push it will ask you for a target – the default is default/origin or something like that – in our case typing “origin” works
- the project explorer seems to get out of date easily. We run Version Control > Refresh File status whenever it looks funky (or type “Alt + C, E” if you’ve got the alt keys working)
- when you commit, there are annoying comments in the commit message textarea – they’re focused by default, so you can easily delete them, but it’s still annoying
We haven’t used the plugin extensively, but so far it looks decent.
When switching IDEA projects, you might find it helpful to copy your old project files into the new project to save some time re-setting your settings.
Pushing to github
Once you are set up, you can push to github. In our case we did the following:
- added our local machine’s public ssh key to the account’s ssh keys (on github’s main account page)
- created a new repository and marked it as private
- followed the github instructions to add the remote repo and push to origin master
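The last step above can be sketched as follows. To keep the example self-contained, a local bare repository stands in for the private github repo; with the real thing, the remote URL would be the SSH address github shows on the new repository page. Directory names, the author name and the email address are placeholders.

```shell
#!/bin/sh
# Sketch: add a remote named origin and push master to it.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"        # stand-in for the hosted repo

git init -q "$work/project"
# Make sure the first branch is called master regardless of local defaults.
git -C "$work/project" symbolic-ref HEAD refs/heads/master
git -C "$work/project" config user.name "Alice & Bob"        # placeholder pair
git -C "$work/project" config user.email "pair@example.com"  # placeholder address
echo "hello" > "$work/project/README"
git -C "$work/project" add README
git -C "$work/project" commit -q -m "Initial import"

# Register the remote, then push only the master branch.
git -C "$work/project" remote add origin "$work/remote.git"
git -C "$work/project" push -q origin master

# List the branches that now exist on the remote.
git -C "$work/project" ls-remote --heads origin
```

Pushing master explicitly, rather than every local branch, matches the choice described below of leaving the git-svn branch off github.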
We decided to only push the master branch to github, since we wouldn’t be using git svn for more than a few hours as we migrated all of the machines. We didn’t test the continued use of git svn after cloning the repo, so if you need to continue with svn you may need to push the original branch as well, or change your .git/config file to make sure that dcommits still work etc…
Once we had it on github, we renamed our original project directory and did a fresh
git clone of the github project, then:
- updated our .git/config again with our pair names (alternately, you can set the config variables globally, like
git config --global user.name 'foo & bar')
- added the log directory (
To update the CI box, we:
- Added a deploy key to github
- Stopped all running cruise processes
- Renamed our current project to -SVN
- Updated to the latest version of CCRB
- Added our new project with the standard cruise command
- Updated our “scratch pad” checkout from the svn repo to the git repo
- Rebooted CI to make sure everything worked (you probably don’t need to reboot)
Github provides the ability to create read-only accounts for specific repositories. These users are identified by public keys and they must be unique across github. For us, we had to:
- Log into our CI box
- Create a public/private key pair with ssh-keygen
- Copy the public key to github’s deploy key area in the Project Name > Admin section
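The key creation step might look something like the sketch below. The file name and the comment string are made up, and the key is written to a temporary directory for the demo; on a real CI box it would normally live in the CI user's ~/.ssh directory.

```shell
#!/bin/sh
# Sketch: generate a dedicated key pair for the CI box.
key_dir=$(mktemp -d)
key_file="$key_dir/ci_deploy_key"      # hypothetical file name

# -N "" sets an empty passphrase so the CI user can use the key unattended.
# -C adds a comment that makes the key easy to recognise later.
ssh-keygen -q -t rsa -b 4096 -N "" -C "ci-box-deploy-key" -f "$key_file"

# The .pub half is what gets pasted into the deploy key form.
cat "$key_file.pub"
```

Because deploy keys must be unique across github, generating a fresh pair per server avoids clashing with a key already attached to a personal account.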
Then we updated to the latest version of Cruise Control by going to CCRB directory and running
git pull. If you’ve added Cruise Control by some other method, you’ll have to update to the latest source from http://github.com/thoughtworks/cruisecontrol.rb/tree/master.
In our setup, our projects are in ~/.cruise/projects. Cruise Control loops through every directory in /projects and loads its cruise_config.rb file, so you can have multiple builds running at the same time. After renaming our original project, we added the new project by going to ~/.cruise and typing:
./cruise add your-projectname --repository git@github.com:projectname.git --source-control git
This checked out the repository and added the necessary config file. We then went in and manually created the log directory.
On our CI box we also have a scratch-pad checkout of our code, useful for debugging, located at ~/workspace/project-name. We blew that away and git cloned our repo, then added the log directory.
Once the CI build is green, you can delete the old project that was based on svn.
Git on capistrano
To deploy our app to our staging server, we:
- Created a deploy key for the staging server and added it to github
- Deleted all files from the remote cache directory
- Updated deploy.rb to use git
- Deployed twice (the second time to ensure that it worked from the remote cache)
We followed the excellent guide here to get our deploy settings correct.
The only stumbling block we found was that we use
deploy_via :remote_cache, which stores a checkout of the repository on the server to make deployments faster. Since the remote cache had a subversion checkout, it was necessary to delete all files in the cache before deploying.
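A cautious way to clear the cache is to delete it only when it is still a Subversion working copy. The sketch below shows the idea. The function name clear_svn_cache is hypothetical, and the cache location (a cached-copy directory under the shared directory) is an assumption about the default layout, so check your own deploy settings before running anything like this on a server. The demo runs against a fake directory.

```shell
#!/bin/sh
# Hypothetical helper: remove a deploy cache only if it is an svn checkout.
clear_svn_cache() {
    cache_dir=$1
    if [ -d "$cache_dir/.svn" ]; then
        rm -rf "$cache_dir"
        echo "removed svn checkout at $cache_dir"
    else
        echo "no svn checkout at $cache_dir, nothing to do"
    fi
}

# Demo against a fake cache; on a real server the argument would be the
# remote cache path (assumed here to be <deploy_to>/shared/cached-copy).
fake_shared=$(mktemp -d)
mkdir -p "$fake_shared/cached-copy/.svn"
clear_svn_cache "$fake_shared/cached-copy"

# A second run finds nothing left to remove.
clear_svn_cache "$fake_shared/cached-copy"
```

The next deploy then recreates the cache as a git clone, which is why the second deploy is the one that exercises the cached path.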
After the cache was cleared and deploy.rb was updated we deployed twice, both times without incident.
Deploying git on Engine Yard
We haven’t had the opportunity to deploy with git to EY yet, but given how easy it was to deploy to our staging server we anticipate that deploying to EY will be easy too.
Josh Susser pointed out that because Github itself is hosted on Engine Yard, git deploys from git repos on Github to slices on Engine Yard servers are blazingly fast.
Managing the transition
While one pair was updating these servers, the other pair was checking into the existing subversion repository. Once all the workstations were set up, CI was up and deployments were working, we ran a final git svn rebase and pushed to github. It looked something like this:
cd svn-project-dir
svn commit -m "made some well tested changes"
cd git-project-dir
git pull # => gets all of the changes that the other pair made, i.e. pistoned directories etc...
git svn rebase # => gets latest changes from svn
git push # => sends changes to github
So by the end of the day, there was almost no interruption for one pair.
All in all it took about a day for a single pair, but if we could do it again knowing what we know now we could probably do it in about half that time.