Thursday, May 01, 2008

Subversion is the most important tool that a CS student can possess...

... or at least it's firmly in the top three.
Subversion is a content-versioning system, meaning that it records all important changes you make to your files. As you work, you tell Subversion to record milestones, such as the completion of a function, the fixing of a bug or the end of a day's work. With this information you can see how a file has changed over time and who made those changes and, if necessary, undo them, restoring the file to an earlier version.
Subversion is certainly not the only versioning system around, but it's modern and popular. CVS is the granddaddy of versioning software, and git is the shiny new kid on the block with an impressive pedigree.
Version-control software is a vital tool in the effective programmer's toolbox, but for the computer science student it is arguably the most important tool to use and master. Regardless of your programming experience, or of which language, compiler, IDE, course or program you use, Subversion or one of its cousins will one day save your bacon. Here's how and why.
(n.b. These comments of course apply to all content-versioning systems, not just Subversion. Subversion happens to be the one that I've used and grown to love. If you prefer another, just replace all instances of Subversion in this text with your choice.)

  1. Subversion will save your time
    I regularly use three different development machines: my laptop, my desktop and the school computers. Trying to maintain the files from all of my classes on all of these machines would be a nightmare. I've seen people with thumb drives, people who email themselves the files from one computer to another and people who are too panicked to try to work on a different computer. None of this is necessary or even reasonable for a proficient coder. Instead, just check out your work onto each machine, commit your changes after each session and, when you need to move to another computer, update the working copy there. No muss, no fuss and no lost time trying to keep your files with you.
  2. Subversion will save your friendship
    Despite the average programmer's antisocial tendencies, group projects are a fact of life in a CS degree. And like any group endeavour, they can lead to quite a bit of tension as each member works on different sections of the codebase. Taken to an extreme, this can be very bad: a group member may disappear with crucial code, it may be difficult to get everyone together to code at once or, worst of all, multiple versions of the code may develop as people work independently. Eventually, the final code may be turned in with bugs that no one will own up to creating. This is a pretty good approach to straining or ending friendships.
    Subversion prevents all of this from happening by allowing each member to work independently, to have access to the latest (committed) code, and to commit their own code when it's tested and ready. And because Subversion lets you track every change made to every line of code, ownership and accountability are built into the system. It can even produce contribution statistics that give a very approximate estimate of who pulled their weight.
    An aside: Subversion is a godsend when working with people running different operating systems or development environments.
  3. Subversion will save your ass
    Part a: Coding when you should have gone to sleep

    It took me several wasted all-nighters to finally learn a valuable lesson: eventually you're doing more harm than good when coding while sleepy. I would finally wander off to bed, crash, come back to my code the next day and wonder what the hell I had been trying to accomplish. Without Subversion, this would lead to hours of reconstructing and repairing the code. With Subversion, all that is necessary is to revert to a version where I was still making sense. Regular, reasonable code check-ins are the key to long-term success. If I know I've got a long night of hacking in front of me, I will usually commit every hour or so, to give myself many possible restore points even if I don't realize I've passed the point of gibberish.
    Another aside: Subversion also allows you to easily explore other ways of doing things and play with your code, knowing that you can always get back to where you were before with no risk.
    Part b: Hard drive crash

    Related to the issue of working on multiple computers is what happens when the single computer you are working on crashes or you accidentally rm your files into oblivion. If you're relying on zips and manual backups, repairing the damage is a hassle at best and impossible at worst. With your own Subversion repository hosted by a reputable, reliable provider, it's a breeze to check out your work onto a new machine and continue on as if nothing had happened.
  4. Subversion will make you look good
    Virtually every professional programmer, software development company and open-source project uses some type of version control. Be very, very wary if they don't. In fact, be terrified. Along with project planning, requirements gathering, modelling and other necessities of software engineering, version control is a required skill that's not usually included in the CS curriculum. If you have this skill and experience as a student, it's a strong plus to add to your resume when it comes time for internship and job applications. It will set you apart from other students and demonstrate that you're committed to turning out quality code.
Hopefully this has convinced you to invest the time in learning version control. If so, go grab a copy for your OS (Subversion is completely multiplatform), learn the basic commands, install an IDE plugin (Eclipse and Visual Studio both have great plugins: Subclipse and AnkhSVN, respectively), create an account with a Subversion host and start checking your work in.
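
To make "the basic commands" a little more concrete, here is a minimal sketch of the cycle described above: check out once per machine, update before a session, commit after it, and undo a bad night's work if needed. It only builds the command lines for the standard svn client. The repository URL, the working-copy path and the helper names are all made up for illustration, and the rollback shown (a reverse merge followed by a commit) is just one common way to return to an earlier revision.

```python
import subprocess

# Hypothetical repository URL and working-copy path, for illustration only.
REPO_URL = "https://svn.example.com/repos/coursework/trunk"
WORKING_COPY = "coursework"


def checkout_cmd(url, path):
    """First time on a new machine: create a working copy from the repository."""
    return ["svn", "checkout", url, path]


def update_cmd(path):
    """Start of a session: pull in whatever was committed from other machines."""
    return ["svn", "update", path]


def commit_cmd(path, message):
    """End of a session (or every hour on a long night): record a milestone."""
    return ["svn", "commit", "-m", message, path]


def rollback_cmd(path, revision):
    """Reverse-merge the changes made since `revision` into the working copy.

    The result is only a local change; it still has to be committed.
    """
    return ["svn", "merge", "-r", f"HEAD:{revision}", path]


def run(cmd):
    """Execute one command, raising if svn reports an error (not called below)."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch works
    # even on a machine without svn installed.
    session = [
        checkout_cmd(REPO_URL, WORKING_COPY),
        update_cmd(WORKING_COPY),
        commit_cmd(WORKING_COPY, "Finish parser for assignment 3"),
        rollback_cmd(WORKING_COPY, 41),
    ]
    for cmd in session:
        print(" ".join(cmd))
```

Swapping the print for a call to run() would actually execute each step, assuming svn is installed and the URL points at a real repository.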

5 snarky replies:

Jakub Narebski said...

I'd rather the title of this blog entry was "Version control is the most important tool..." (or "revision control", etc.), not "Subversion...", as it is not about Subversion but mostly about version control systems in general.

There are two important issues that version control systems help with that you haven't mentioned (well, one was mentioned), perhaps because of the lack of good support for them in Subversion:

* branching, which allows you to test different ideas without affecting main development, and to maintain a stable version while developing new features. Subversion needs third-party support (svnmerge, SVK) to make merging easy, and easy branching without easy merging is not much help.

* bisect (or diff debugging), which allows you to locate a bug by finding which commit introduced it. As far as I know Subversion doesn't have direct support for that, although it is possible that one of the main contributors to Subversion invented the term "diff debugging".
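
For readers who haven't met bisecting before, the idea can be sketched in a few lines. This is not a feature of Subversion or of any particular tool, just a plain binary search over revision numbers. It assumes you know one revision where the bug is absent and a later one where it is present, and that once the bug appears it stays in every later revision. The function name and the is_bad callback (which in practice would check out a revision and run a test) are hypothetical.

```python
def first_bad_revision(good, bad, is_bad):
    """Find the revision that introduced a bug by binary search.

    good   -- a revision number known NOT to contain the bug
    bad    -- a later revision number known to contain the bug
    is_bad -- callable taking a revision number and returning True if the
              bug is present there (assumed: once bad, always bad afterwards)
    """
    if good >= bad:
        raise ValueError("the good revision must come before the bad one")

    # Invariant: `good` is bug-free and `bad` is buggy, so the culprit
    # always lies in the range (good, bad].
    while bad - good > 1:
        middle = (good + bad) // 2
        if is_bad(middle):
            bad = middle    # bug already present: culprit is middle or earlier
        else:
            good = middle   # still clean: culprit is after middle
    return bad


if __name__ == "__main__":
    # Toy stand-in for "check out revision r and run the failing test":
    # here the bug is pretended to have been introduced in revision 157.
    culprit = first_bad_revision(100, 200, lambda r: r >= 157)
    print("Bug introduced in revision", culprit)
```

The payoff is that a range of a hundred revisions needs only about seven checks instead of a hundred, which is why frequent small commits make this kind of debugging so effective.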

Anonymous said...

CVS is hardly the grandaddy. RCS is older, and actually still used in a number of places (think federal government).

Jakub Narebski said...


"CVS is hardly the grandaddy. RCS is older, and actually still used in a number of places (think federal government)."


RCS tracks single files, not the state of the project as a whole, and it uses a lock-edit cycle instead of edit-merge.

Jakob Homan said...

Jakub, you're right about the title. I went back and forth many times on it, and ended up using Subversion as it sounded catchier. Good points about other uses of versioning.

Jakub Narebski said...

By the way, I'd like to add one more comment about branching.

To use branches (so-called topic branches) to develop new features, test ideas etc., you really need private branches. At a minimum, you need to be able to freely create and delete branches without worrying about unique names. Ideally, those private branches with work in progress should be kept private, and not visible or published (this, for example, allows you to rewrite history, e.g. using rebase / transplant / cherry-picking, to arrive at a perfect patch series and a clean history[1]).

And this, in my opinion, requires a distributed version control system such as git, Mercurial or Bazaar.

[1] Like in those science assignments where not only the final result matters, but also the way you arrived at it.