Subversion tutorial

This tutorial is for members of Alan Rogers's lab at the University of Utah.

Subversion is a program for keeping track of software, text, and so on. Don't use it for large data files, such as vcf or bcf files. Do use it for README files, scripts, software, and small files of all types. The subversion server should contain everything you need to recreate a project, should the large data files get destroyed.

(We back up large files on the archive storage at the Center for High-Performance Computing (CHPC) of the University of Utah. See the tutorial on rclone for details.)

Like other version control systems, subversion lets you reach back into the past and recover a file as it existed long ago. For example,

svn cat -r {2008-03-15} foo.txt

would print the version of foo.txt that was current at the start of 15 March 2008.
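You can also ask for a specific revision number instead of a date. The sketch below is only an illustration: it builds a throwaway repository in a scratch directory (the names repo, wc, and the file contents are made up for the example, and it assumes the svnadmin program that ships with subversion is installed), commits two versions of foo.txt, and then recovers the first.

```shell
# Scratch area; nothing here touches the lab's real server.
work=$(mktemp -d)
cd "$work"
svnadmin create repo                  # throwaway local repository
svn co -q "file://$work/repo" wc      # check out a working copy
cd wc

echo "old text" > foo.txt
svn add -q foo.txt
svn commit -q -m "first version"      # first commit to this repo
echo "newer text" > foo.txt
svn commit -q -m "second version"     # second commit

# Print foo.txt as it was in revision 1, without changing the working copy.
old=$(svn cat -r 1 foo.txt)
echo "$old"
```

To keep the recovered text, redirect the output of svn cat into a new file rather than overwriting the current one.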

When you commit files using subversion, they are stored in a repository, or "repo" for short. We use a subversion server maintained by the College of Social and Behavioral Sciences of the University of Utah.

Setup

Install subversion on every machine from which you wish to use it. On a Mac using Homebrew, the command is "brew install subversion". On Ubuntu Linux, it is "sudo apt-get install subversion". (You need "sudo" access to do this under Linux.)

On the CHPC server, you don't need to install subversion, but you do need to load the module each time you log on. The easy way to do this is to put the following line into your .bash_profile:

module load svn/1.9.5

For convenience, it is also useful to define environment variables that hold the URL of the lab's directory and of your own directory on the subversion server. So I added this to my .bash_profile on all of the machines I use, including the CHPC server:

export ROGLAB=https://svn.csbs.utah.edu/roglab
export MYRL=https://svn.csbs.utah.edu/roglab/rogers

In the second line, use your own name instead of rogers. This will define the variables ROGLAB and MYRL each time you log in. (To define them without logging out and in, type ". ~/.bash_profile".)

Now if you type "echo $ROGLAB", you should see

https://svn.csbs.utah.edu/roglab
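If you prefer not to type the URL twice, the second variable can be built from the first. This is just one possible arrangement; SVNUSER is a name invented here for illustration and is not something subversion itself reads.

```shell
# SVNUSER is a hypothetical helper variable: your last name, all lower case.
SVNUSER=rogers
export ROGLAB=https://svn.csbs.utah.edu/roglab
# Build the personal URL from the lab URL and the user name.
export MYRL="$ROGLAB/$SVNUSER"
echo "$MYRL"
```

With this arrangement, only the SVNUSER line differs from one lab member to the next.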

Your user name on the subversion server is your last name, all lower case. Contact Alan Rogers for your password.

Make directories on the subversion server

svn mkdir --username rogers $ROGLAB/rogers -m ""

Here, you would substitute your own last name for "rogers". The last bit (-m "") specifies an empty log message. When you want to add a message to the subversion log, put something between the quotes.

The subversion documentation will tell you to create 3 subdirectories called "branches", "tags", and "trunk". These are useful for software development, but I don't think we need them in roglab. So I'm not going to say anything further about branches, tags, and trunk.

But you will want to create subdirectories to contain the projects that you want subversion to keep track of. For example,

svn mkdir --username rogers $ROGLAB/rogers/vindija -m ""

Getting an existing directory tree under subversion control

Move into the directory just above the one you want to work on:

cd ~/group/rogers/data

If the directory tree already exists, move it out of the way:

mv altai altai.bak

Make a directory of the same name on the subversion server:

svn mkdir --username rogers $MYRL/altai -m ""

Check it out:

svn co --username rogers $MYRL/altai altai

You should now have two directories, called "altai.bak" and "altai". Copy everything from the old directory into the new one:

rsync -av altai.bak/ altai

Now altai has all the files that were originally in altai.bak. Check that the two directory trees are really the same:

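diff -rq -x .svn altai.bak altai

The -x .svn option tells diff to skip the .svn directory that subversion created in altai when you checked it out. If there are no other differences, diff won't print anything. Check and then double check to make sure the new directory has everything you need. Then delete the old tree: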

rm -rf altai.bak
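The compare and delete steps can be tied together so that the old tree is removed only when diff finds nothing to report. The following is a minimal sketch that runs in a scratch directory with made-up file contents; cp stands in for the rsync step so that the example needs nothing beyond standard tools.

```shell
# Demo setup in a scratch directory (file contents are invented).
work=$(mktemp -d)
cd "$work"
mkdir -p altai.bak/orig altai/.svn    # altai/.svn mimics a checked-out tree
echo "notes" > altai.bak/README.md
echo "model" > altai.bak/orig/a.lgo
cp -R altai.bak/. altai/              # stands in for: rsync -av altai.bak/ altai

# diff exits with status 0 only when it finds no differences.
# -x .svn skips subversion's own bookkeeping directory.
if diff -rq -x .svn altai.bak altai; then
    rm -rf altai.bak
    echo "trees match; removed altai.bak"
else
    echo "trees differ; keeping altai.bak" >&2
fi
```

Because the rm sits inside the if, a mistyped path or an incomplete copy leaves the backup in place.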

Add the files that you want to keep under subversion control. Don't include any large files (vcf, bcf, bed, etc).

cd altai
svn add README.md
svn add *.legofit *.lgo

You can also add subdirectories, but be careful: the default method automatically adds the files within the subdirectory. Some of those may be large files that you don't want to add. Here's the way to add a subdirectory called orig without adding its contents:

svn add --depth=empty orig

Then cd into the subdirectory and continue adding files.
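Before adding files, it can help to list the ones that are small enough to belong in the repo. The sketch below uses find for this, under some assumptions of mine: the 1024 KiB cutoff is arbitrary, and the project tree and file names are invented for the demonstration.

```shell
# Build a small demo tree in a scratch directory.
work=$(mktemp -d)
cd "$work"
mkdir -p proj/.svn proj/orig
echo "readme" > proj/README.md
echo "model" > proj/orig/a.lgo
# A 2 MiB file of zeros stands in for a large data file.
dd if=/dev/zero of=proj/orig/big.vcf bs=1024 count=2048 2>/dev/null
cd proj

# List regular files smaller than 1024 KiB, skipping the .svn directory.
small=$(find . -name .svn -prune -o -type f -size -1024k -print | LC_ALL=C sort)
echo "$small"
```

The resulting list is only a set of candidates; review it by eye and then run svn add on the files you really want.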

Work flow

I keep many of my projects on the subversion server and check them out on all the machines I use. When I begin work for the day, my first step is to update the copy on my current machine, so that it contains any changes I might have made using other machines:

svn up

To add a file to the repo, or to delete one from it:

svn add foo.lgo
svn rm bar.lgo

These commands schedule the file for addition to or removal from the repo (svn rm also deletes your local copy of the file), but nothing happens to the remote repo until you commit:

svn commit -m "add file foo; remove file bar"

If you omit -m and the comment, subversion will open a text editor (the one named by your SVN_EDITOR, VISUAL, or EDITOR environment variable) so that you can type one. The commit command will also push any changes to existing files onto the repo. You can shorten "commit" to "ci" to save typing.

To summarize, begin a work session with svn up. During the session, you can modify files, add them with svn add, or delete them with svn rm. When you're done, or whenever you want to save your work, use svn commit.

Status

To check on the status of your working copy, type

svn stat

In the output, "A" indicates that a file has been added, "M" that it has been modified, "D" that it has been deleted, and "?" that it is not under subversion control. Once you commit your changes (svn ci), the "A", "M", and "D" lines will disappear, but the "?" lines will remain.
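To see these codes without touching a real project, you can experiment in a throwaway repository. The sketch below assumes that svnadmin (which ships with subversion) is installed; the names repo, wc, foo.lgo, and notes.tmp are invented for the example.

```shell
# Scratch area with a local repository and a working copy.
work=$(mktemp -d)
cd "$work"
svnadmin create repo
svn co -q "file://$work/repo" wc
cd wc

echo "first draft" > foo.lgo
echo "scratch" > notes.tmp            # never added, so svn does not track it
svn add -q foo.lgo
svn commit -q -m "add file foo"

echo "second draft" >> foo.lgo        # change a file that is already committed
status_out=$(svn stat)
echo "$status_out"
```

The output should list foo.lgo with "M" and notes.tmp with "?". Committing again would clear the "M" line, but the "?" line would stay.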

Ignoring files

If many of the files in your directory are not under subversion control, svn stat will print many lines that begin with "?", and this can be a nuisance. It is a good idea to tell subversion to ignore these files.

To do so, create a file called .svnignore in the top-level directory of your project. For example, the file might look like this:

*.vcf.gz
*.vcf.bgz
*.tbi
Sample_*

Here, *.vcf.gz would match all files that end with .vcf.gz, and Sample_* would match all files that begin with Sample_. Then cd into the directory that contains your .svnignore file and type:

svn propset svn:global-ignores -F .svnignore .

From then on, svn stat will ignore the files that match the patterns in your .svnignore file. If you add lines to .svnignore, you have to run svn propset again.
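To get a feel for which names a set of patterns will catch, you can try them out with the shell's own pattern matching. This is only an approximation of what subversion does internally, and is_ignored is a hypothetical helper written for this illustration, not a subversion command. The test file names are made up.

```shell
# Write the example patterns to a .svnignore file in a scratch directory.
work=$(mktemp -d)
cd "$work"
printf '%s\n' '*.vcf.gz' '*.vcf.bgz' '*.tbi' 'Sample_*' > .svnignore

# is_ignored NAME: succeed if NAME matches any pattern in .svnignore.
is_ignored() {
    while IFS= read -r pat; do
        [ -n "$pat" ] || continue     # skip blank lines
        # $pat is left unquoted so that case treats it as a pattern.
        case "$1" in
            $pat) return 0 ;;
        esac
    done < "$work/.svnignore"
    return 1
}

for f in chr1.vcf.gz chr1.vcf.gz.tbi Sample_01 README.md; do
    if is_ignored "$f"; then
        echo "ignored: $f"
    else
        echo "kept:    $f"
    fi
done
```

To check what subversion actually stored, "svn propget svn:global-ignores ." prints the current value of the property.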

Learning more

There is a lot of online information about subversion. My favorite book is Practical Subversion.