Jun 22, 2012

Shouldn't You be Using Version Control?

By "you" I don't mean just everyone involved with software development; I'm thinking that EVERYONE would see benefits. If you use Word and Excel in a company of one person, you would see benefits from using version control software. If you use Dropbox you're already using version control software, but you may not realize it. Did you know that Dropbox keeps track of past revisions of modified documents?

  • Have you ever opened a document, made changes and hit Save instead of Save As?
  • Have you ever accidentally modified or deleted some text and not noticed until the next time you opened the document?
  • Have you ever changed a document and then decided you had it right the first time?

If so, version control will be your best friend and you can feel free to send me free espresso for life. The ascent of distributed version control software (DVCS) is changing the game for everyone. You'll notice I'm avoiding using terms like Source Code Management here because I think DVCS can be so much more.

The Cost of Entry is Gone

It used to be that using version control software had a high up-front cost in terms of labor and knowledge. You had to set up a server (it could be on the same machine as the client), then you had to set up the client. The server tools were usually command line driven, and you had to learn a set of commands just for set up and maintenance of the server.  Many people found it easier to just make manual backups.

Distributed version control uses a peer-to-peer approach: every copy of a repository is complete in itself, so there is no server to set up. You can get many benefits from version control just by setting it up on your one machine. Oh yeah, the cost in dollars is gone too: all the tools I'm talking about are free.

Getting Started with Mercurial

I'm going to focus on one DVCS tool called Mercurial (aka Hg) because I found it the easiest to work with. I'll also focus on Windows, but Hg is available for Linux and Mac. Let's get to the "hard" part: how do you set it up? It's a two-step process.

  1. Download TortoiseHg and install it.
  2. Right click on a folder you want to put under version control and select TortoiseHg->Create Repository Here. In the dialog that pops up, feel free to read the options but typically all the defaults are exactly what you want so just click Create.
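If you'd rather script step 2 than right-click, Mercurial's command line has an `hg init` command that creates a repository in a folder. Here is a minimal Python sketch of that idea. The helper names `init_command` and `create_repository` are made up for this example, and it assumes the `hg` program is installed and on your PATH; if it isn't, the sketch simply does nothing.

```python
import shutil
import subprocess
from pathlib import Path


def init_command(folder):
    """Build the command line that creates a repository in `folder`."""
    return ["hg", "init", str(folder)]


def create_repository(folder):
    """Run `hg init` on `folder`; return True only if a repository was created.

    Hypothetical helper for illustration. It returns False without doing
    anything when no `hg` executable can be found on PATH.
    """
    hg_path = shutil.which("hg")
    if hg_path is None:
        return False
    # Use the full path that was found, then the same arguments as init_command.
    command = [hg_path] + init_command(folder)[1:]
    result = subprocess.run(command, capture_output=True, check=False)
    return result.returncode == 0 and (Path(folder) / ".hg").is_dir()
```

Calling `create_repository` on your documents folder should leave you in the same place as clicking Create in the dialog, assuming the defaults.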


Geek Tip: A repository is really nothing but a sophisticated folder called .hg that gets created at the root level of the folder you're working with. You don't ever need to look in there and you shouldn't modify anything in there either, unless you read the manuals and really know what you're doing. Just consider it magic. Also, if you want to sound hip at your next dinner party don't call it a repository; all the cool kids say "repo". Of course I use words like hip and groovy, so my cool credentials are pretty suspect.
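For the curious, here is a small Python sketch of what "the repo is a folder called .hg at the root" means in practice. Starting from any subfolder, you can find the repository by walking up the folder tree until you see a .hg folder. The function name `find_repo_root` is my own invention, and this is only an illustration of the idea, not how Mercurial itself is written.

```python
from pathlib import Path


def find_repo_root(start):
    """Return the nearest folder at or above `start` that contains `.hg`.

    Returns None when no enclosing folder holds a `.hg` folder.
    """
    current = Path(start).resolve()
    # Check the starting folder first, then each parent up to the drive root.
    for folder in [current, *current.parents]:
        if (folder / ".hg").is_dir():
            return folder
    return None
```

This is also why it doesn't matter which subfolder you happen to be in: every folder under the root leads back to the same .hg folder.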

Adding Files To Track

That's it, you're done (well, with setting it up anyway)! You are now running version control software. It can keep track of every file in that folder and in every subfolder under that folder. You still have to tell Hg which files you want to track. Actually, it's easier to tell it which files you don't want to track and then let it automatically track all the others.

Tracking Everything

If you want to track everything, just right click on the folder and select TortoiseHg->Add Files… (see the image below and follow the green arrows to Add Files…; ignore the green arrow pointing to Edit Ignore Filter for now).

RED HOT TIP: You can also get to this menu from anywhere inside the tracked folder, you can even just right-click on some white space inside a folder view and the menu will work. The commands apply to the entire repository, that includes all the subfolders.


You'll get a dialog with a list of all the files in all the subfolders on the left, and they will all be checked. Click the Add button and they will all be added. This is a good way to go if this folder (and all its subfolders) only contains files like the Word and Excel files you want to protect. If for some reason the files aren't all selected, you can click the single checkbox at the top to select them all; clearing that checkbox clears all the ones below.

Hg Can Work with Binary Files Too

Hg can keep track of binary files as well as straight text files. Depending on the type of binary file this might require much less storage than you think. I'll talk a little more about that later but in case you deal mostly with binary files I wanted to keep you interested.

Being Selective About What You Track

You may have some files you don't want to track. Let's say you always create PDFs of all your Word documents, and since you can easily re-create those you don't want to track them. Or perhaps you have some gargantuan file that you feel will take too much space to track. In either case you can tell Hg to ignore files, using the Edit Ignore Filter menu entry. The dialog box is very powerful and a little geeky, but I'll keep it simple. In the box at the top you can type simple things like *.pdf and all pdf files in all folders will be ignored. You can type the name of a subfolder and all files in that subfolder will be ignored. You can also click on a file in the right-hand Untracked files area; this copies its path to the top, and you can then click Add. All ignored entries are shown on the left. As soon as you click Add, any untracked files on the right that match what you typed will disappear. The Untracked files area shows you only the files Hg doesn't know what to do with, meaning you haven't added or ignored them yet. Once you have the ignore list set up, close the dialog and then do the Add procedure as described above.
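If you want a feel for how an ignore list carves up your files, here is a rough Python sketch. It uses Python's own `fnmatch` wildcard matching as a stand-in, so treat it as an approximation of the behavior described above rather than Hg's real matching rules. The function names and the sample file names are made up for the example.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def is_ignored(path, patterns):
    """Return True if `path` matches any ignore pattern.

    A pattern matches when it fits the file name (like *.pdf) or when it
    fits the name of any folder that contains the file (like scratch).
    """
    parts = PurePosixPath(path).parts
    file_name, folders = parts[-1], parts[:-1]
    for pattern in patterns:
        if fnmatch(file_name, pattern):
            return True
        if any(fnmatch(folder, pattern) for folder in folders):
            return True
    return False


def split_untracked(paths, patterns):
    """Split paths into (ignored, still untracked), keeping the original order."""
    ignored = [p for p in paths if is_ignored(p, patterns)]
    remaining = [p for p in paths if not is_ignored(p, patterns)]
    return ignored, remaining


# Hypothetical folder contents and ignore list.
files = [
    "proposal.docx",
    "proposal.pdf",
    "clients/acme/quote.pdf",
    "scratch/notes.docx",
    "budget.xlsx",
]
ignored, remaining = split_untracked(files, ["*.pdf", "scratch"])
print("ignored:  ", ignored)
print("untracked:", remaining)
```

With this ignore list only proposal.docx and budget.xlsx are left in the untracked list, which is what you would then add.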


Why not just track everything?

Tracking files does require disk space. Hg is very, very efficient about how it does it, but there is a cost. Generally, tracking compressed binary files (like those ending in .zip) can be expensive. That's not because they are binary but because compression tends to change every byte in a file when even a little of the content changes, so Hg has to store what amounts to a full copy. That said, Hg is smart about tracking already compressed files (like .png or .jpeg files): it doesn't try to compress them again when storing copies to track. If a binary file changes in a logical manner, Hg will track just enough so that it can recreate any version of the binary. If you have gobs of disk space you can track everything with impunity.
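If the bit about compression sounds like hand waving, you can see the effect for yourself. The Python sketch below builds a made-up text document, changes one word, and counts how many bytes differ before and after compressing both versions with the standard `zlib` module. It says nothing about Hg's internal storage format; it only shows why compressed files make a poor starting point for storing "just the changes".

```python
import zlib


def count_changed_bytes(a, b):
    """Count positions where two byte strings differ (extra length counts too)."""
    shared = min(len(a), len(b))
    changed = sum(1 for i in range(shared) if a[i] != b[i])
    return changed + abs(len(a) - len(b))


# Two versions of a made-up document that differ by a single word.
original = b"".join(b"Line %d of the proposal.\n" % i for i in range(500))
edited = original.replace(b"Line 3 of", b"Line 3 in", 1)

raw_changes = count_changed_bytes(original, edited)
zipped_changes = count_changed_bytes(zlib.compress(original), zlib.compress(edited))

print("bytes changed in the plain files:     ", raw_changes)
print("bytes changed in the compressed files:", zipped_changes)
```

The plain versions differ in just the two letters of the changed word, while the compressed versions differ in more places than that, even though the compressed files are much smaller.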

The Daily Routine – Checking in

So now you have a repository set up and it knows which files you want to track. Now you tell it to actually go ahead and track them; software weenies like me call this committing. This is why most software engineers are married: we're not afraid of commitment. Right click anywhere in the folder hierarchy being tracked and select TortoiseHg->Hg Commit…


You'll get a dialog much like the one shown above. Again the techie roots of Hg are showing, but you can ignore most of what is here. The one thing you can't omit is section 3. You must enter something (anything will do) in this area. I would encourage you to enter something meaningful. The messages entered here show up in tools you can use later to see a list of commits, and they can be very helpful in finding a previous version. Once you have a message you can hit Commit. The green arrow is showing a popup that will let you quickly select previous commit messages. This can be helpful for sticking in boilerplate text like "Acme Proposal changes as requested by ". Remember, even though this example only shows one file being checked in, a commit can check in every file that's been modified since the previous commit.

Hot Pink Tip: The pink entry shown above is a file Hg doesn't know what to do with. It hasn't been added for tracking and it hasn't been ignored. I did this on purpose because this is a special directory Word creates to automatically back up your changes. This directory (and the file) goes away once you close Word. You can safely ignore these files. If it bothers you, you can add "~$" to your filter list and they will no longer show up.

If you created a new Word document, it too will show up in the list in hot pink. Put a checkmark next to it and Hg will both add it and then commit it for you in one step. It will show a dialog asking you to confirm that you do want to track the file. Just click Add and your new document is now tracked and committed to the repo. The dialog will stay open (I think it's lonely), so just click Close.

Hot Confession: Remember when I told you, you had to add files first and then commit them? Well I lied. I wanted you to know all the steps. You can really skip the Add Files… step and just jump right to commitment. The commitment dialog does the adding for you.
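For those who like to see what's under the hood, the two routes (add then commit, or straight to commit) have rough command-line counterparts in `hg add`, `hg commit -m` and `hg commit --addremove -m`. The Python sketch below just builds those command lines; the function name `commit_commands` and the sample message are made up, and I'm not claiming this is what the TortoiseHg dialog runs internally. Note one difference: `--addremove` also stops tracking files that have gone missing, which the dialog's checkboxes let you decide case by case.

```python
def commit_commands(message, one_step=False):
    """Build the hg command lines for a single check-in.

    one_step=False: add every untracked file, then commit.
    one_step=True:  let the commit itself pick up new (and missing) files.
    """
    if one_step:
        return [["hg", "commit", "--addremove", "-m", message]]
    return [
        ["hg", "add"],                     # start tracking all untracked files
        ["hg", "commit", "-m", message],   # record the changes with a message
    ]


# Print the commands for a hypothetical commit message.
for command in commit_commands("Acme Proposal changes"):
    print(" ".join(command))
```

To actually run them you would pass each list to `subprocess.run` from somewhere inside the tracked folder, assuming `hg` is on your PATH.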

If you're committing a lot of files at once and feeling overwhelmed you can use section 1 to type a filter to only show you matching files. Then you can type a message for just those files and commit. Now you know why the dialog hangs around. Remove the filter or type a new one and keep committing files in logical groups with logical commit messages.

What's the Diff

One thing software weenies really like about version control is that we can run tools that show us all the changes from version to version. This is incredibly useful: it lets us quickly find where we inserted the bug into the code you bought. It would be useful for you too, but unfortunately Microsoft .doc and .docx formats are binary. That means they look something like this:


If you save your files in .rtf, that is a text format and you can do compares between various versions to see what's changed. Unfortunately changing a few characters in an .rtf file shows up in the difference viewer looking like this:

-\par This is a .doc not a .docx}{\rtlch\fcs1 \af1\afs48 \ltrch\fcs0 \f1\fs48\kerning32\insrsid4134052\charrsid2456785 \line }{\rtlch\fcs1 \af1\afs48 \ltrch\fcs0 \f1\fs48\kerning32\insrsid6251120\charrsid2456785
+\par This is a .}{\rtlch\fcs1 \af1\afs48 \ltrch\fcs0 \f1\fs48\kerning32\insrsid8134959 rtf}{\rtlch\fcs1 \af1\afs48 \ltrch\fcs0 \f1\fs48\kerning32\insrsid14756322  not a .docx}{\rtlch\fcs1 \af1\afs48 \ltrch\fcs0 \f1\fs48\kerning32\insrsid8134959  and not a .doc
+}{\rtlch\fcs1 \af1\afs48 \ltrch\fcs0 \f1\fs48\kerning32\insrsid4134052\charrsid2456785 \line }{\rtlch\fcs1 \af1\afs48 \ltrch\fcs0 \f1\fs48\kerning32\insrsid6251120\charrsid2456785

Hot Meandering Aside you can skip: Sort of makes you long for the simplicity of WordPerfect, doesn't it? Believe it or not, if you read that gibberish carefully you can tell what I deleted (that line starts with -) and what I added (the lines that start with +). Actually I just changed a few words on a single line. This, by the way, is a small subset of hundreds of lines that were different between the two .rtf docs.
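To see the - and + convention without the formatting noise, here is the same one-sentence change as plain text. This sketch uses Python's standard `difflib` module rather than Hg's own difference viewer, and the two sentences are my reading of the words buried in the RTF sample above.

```python
import difflib

# The sentence before and after the edit, one line each, no RTF codes.
old_version = ["This is a .doc not a .docx"]
new_version = ["This is a .rtf not a .docx and not a .doc"]

diff_lines = list(
    difflib.unified_diff(
        old_version,
        new_version,
        fromfile="before",
        tofile="after",
        lineterm="",
    )
)

# Removed lines start with "-", added lines start with "+".
for line in diff_lines:
    print(line)
```

With plain text the diff is a couple of readable lines, which is why programmers love this tool and why it is so much less useful on word processor formats.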

So while you do get significant advantages from using version control, the Hg difference tool is not one of them. Word however can compare two versions of a document for you in a meaningful way so you can still have the advantage. It does require "updating" to an old version, making a copy outside your repo, then updating back to most recent version but that and topics like cloning your repo are best left for another day.

The Workbench

The penultimate topic I want to mention is the Hg Workbench. It's right there in the right-click menu. It can do a lot, but I won't go into details. I just want you to know it's there and that it's one of the easiest ways to see a history of your commits. Here's a look at the one I created while writing this post.



Backups in the cloud made free and easy

The ultimate topic I want to mention is that you can still use a centralized Hg server. In fact BitBucket (the place you downloaded TortoiseHg from) offers free accounts for teams of 5 or fewer for unlimited repos. This means you can push your local repo up to their servers whenever you want. When your computer explodes, you'll be able to get all your files back by doing a simple clone of your repo from their server. Not bad for free. Again I'm not going into the details on this but BitBucket has a nice tutorial for those interested.

About Me

Tod Gentille (@todgentille) is now a Curriculum Director for Pluralsight. He's been programming professionally since well before you were born and was a software consultant for most of his career. He's also a father, husband, drummer, and windsurfer. He wants to be a guitar player but he just hasn't got the chops for it.