Friday, February 22, 2013

Version Control Systems

Hello,

As I will be working with version control systems a lot, I want to post a short introduction:

------------------------------------------------------------------

Version Control Systems (VCS) are tools that help developers manage changes to software. They are used for documenting, sharing and merging code. This is an essential part of software development and of the success of projects. Software is usually developed in teams whose members work in parallel on the same code, so it is very important to have a tool for sharing and merging changes [1].

Because this task is so important, a lot of tools are already available. In my project I want to make use of these systems for handling real-time geo data. Merging geo data streams with each other is much like merging the code of several developers. Documentation is also needed for handling sensor data.
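To make the analogy a bit more concrete, here is a minimal sketch in Python of a three-way merge applied to sensor readings instead of source code. It is only an illustration under my own assumptions: the function name `three_way_merge`, the sensor IDs and the values are all made up, and a real version control system merges line-based text rather than dictionaries.

```python
from typing import Dict, List, Tuple

Readings = Dict[str, float]  # hypothetical format: sensor ID -> latest value


def three_way_merge(
    base: Readings, ours: Readings, theirs: Readings
) -> Tuple[Readings, List[str]]:
    """Merge two edited copies of `base`; return (merged, conflicting keys)."""
    merged: Readings = {}
    conflicts: List[str] = []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        # A missing key is represented as None (assumes values are never None).
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:
            value = o  # both sides agree (or both deleted the key)
        elif o == b:
            value = t  # only "theirs" changed this key
        elif t == b:
            value = o  # only "ours" changed this key
        else:
            conflicts.append(key)  # both changed it differently
            continue
        if value is not None:
            merged[key] = value
    return merged, conflicts


base = {"s1": 10.0, "s2": 20.0}
ours = {"s1": 11.0, "s2": 20.0}                # we updated s1
theirs = {"s1": 10.0, "s2": 20.0, "s3": 5.0}   # they added s3
merged, conflicts = three_way_merge(base, ours, theirs)
print(merged, conflicts)
```

The common ancestor `base` is what lets the merge tell "changed on one side" apart from "changed on both sides". Keys that both sides changed differently are reported as conflicts and left out of the result, much like a VCS asks the developer to resolve a conflict by hand.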

In general, version control systems can be classified into two types: distributed and centralized systems.

Centralized version control systems

Centralized version control systems consist of a single server that contains all the versioned files and a number of clients that are allowed to download files from and upload files to that server. This architecture was the standard for many years [2].

  • Advantages:
    Easier to learn
    Everybody in a project knows what the others are doing at the moment
    Administrators have control over what everyone is doing
    For administrators, a centralized version control system is easier to deal with
    Used in many companies
  • Disadvantages:
    The central server is a single point of failure. If, for example, the server goes down for an hour, nobody can work with the version control system during that time.
    If the hard disk that holds the central repository becomes corrupted and there is no proper backup, the entire history is lost, apart from whatever snapshots the clients happen to have checked out. Making proper backups is therefore a central aspect of using centralized version control systems.
  • Examples:
    Subversion, CVS, Perforce

Distributed version control systems

In a distributed version control system, each client fully mirrors the whole repository, including its history [2].

  • Advantages:
    Every clone on a client is a full backup of the repository. If the server goes down, any of the client repositories can be copied back to the server to fully restore it.
    Hierarchical workflow models are possible: you can work with different people in different groups within the same project.
    Faster, because most operations run locally without network access
  • Disadvantages:
    More complex to learn [3].
  • Examples:
    Git (can also be used as a centralized version control system), Mercurial, Bazaar, Darcs
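A toy model in Python can illustrate why every clone doubles as a backup in the distributed case. The `Repository` class below is purely hypothetical and leaves out everything a real system such as Git does (branches, hashes, networking); it only shows that a clone carries the full history and can be used to restore a lost server.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Repository:
    """Hypothetical stand-in for a repository: just an ordered commit log."""
    history: List[str] = field(default_factory=list)

    def commit(self, message: str) -> None:
        self.history.append(message)

    def clone(self) -> "Repository":
        # A distributed clone copies the complete history, not only the latest state.
        return Repository(history=list(self.history))


server = Repository()
server.commit("initial import")
server.commit("add sensor parser")

laptop = server.clone()
laptop.commit("fix parser bug")  # committing locally needs no server connection

server = None              # simulate losing the central server
restored = laptop.clone()  # any client clone can seed a new server
print(restored.history)
```

In a centralized system the client would hold only a working copy of one snapshot, so the last step would not be possible without a separate backup.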

Bibliography

[1] Matthias Kleine, Robert Hirschfeld, and Gilad Bracha. An Abstraction for Version Control Systems. Universitätsverlag Potsdam, 2012.
[2] Scott Chacon. Pro Git. Apress, 2009.
[3] Chris Kemper and Ian Oxley. Foundation Version Control for Web Developers. Apress, 2012.

------------------------------------------------------------------

My plan is to evaluate different systems and then choose two options for a practical implementation. My two current favorites are Subversion and Git. I think they are the two most common systems in companies. I also want to have one example of a centralized and one of a distributed version control system in my thesis. But we will see ;)

Cheers, Tanja

5 comments:

  1. Great blogging Tanja. I just want to pick up on one point at the start of the blog, where I think perhaps you are quoting from another source. You state that: Version Control Systems (VCS) are tools helping developers to manage changes of software. They are used for documentation, sharing and merging of code. This is an essential part in software development and for the success of projects, because errors can easily occur. Software is usually developed in teams which members work parallel on the same code. So it is very important to have a tool for sharing and merging changes [1].
    These all seemed reasonable statements, but "This is an essential part in software development and for the success of projects, because errors can easily occur." seemed strangely out of place. Clearly all sorts of errors can occur, but I don't know that errors occurring is directly an argument for version control. I mean, one needs version control because changes need to be made to code, and those changes may introduce errors that need to be backed out. Perhaps you mean that errors can easily occur when code is merged?

    Also, I was reflecting that while git can support a decentralized distributed workflow, it also supports a centralized workflow. It is arguably workflow agnostic. I would also argue that git is easier to learn and use than SVN. Trying to create and re-merge code branches in SVN was incredibly complex - the same thing in git is very easy ...

  2. Hi Sam!

    Thank you for your comment!

    Yes, I meant that errors can easily occur when code is merged without the use of a version control system.
    I deleted the part of the sentence "because errors can easily occur". You are right that it seems a bit strange in that context.

    I read in a book that Git is more complex than Subversion. I added the book to the bibliography:
    [3] Chris Kemper and Ian Oxley. Foundation Version Control for Web Developers. Apress, 2012.
    At the top of page 37 it says that Git is harder to set up (e.g. you need SSH keys) and has a longer learning curve. I personally think that this is kind of true. It took me longer to understand the concepts of Git, because there are a lot more things you can do with Git than with SVN. With Subversion you only need to remember checkout and commit :-)

    You are right that branching and re-merging code is easier with Git. I am preparing a more detailed post about Git, Subversion and Mercurial where I will certainly mention that.

    I changed the examples part of the distributed version control systems:
    Git (can also be used as a centralized version control system)

    I know that Git is in some respects much better than Subversion. But there are quite a lot of companies that still use Subversion in their everyday business. So I think it will be useful to look at this system too? For example, I have been working with Subversion at my company for 4 years. The speed of our SVN is quite good and I have never had huge problems with merging, because we are just 4 developers and each of us usually works on different parts of the code.

  3. Hi Tanja, thanks for the update. I would concur that git is more complex than subversion overall, but I don't know that it is necessarily more complex to learn - it is if you try to learn all the features, but you can use git just like subversion, i.e. only use the simple features, and it's not hard at all. Subversion also has a lot more features than just checkout and commit. I'd be careful believing everything you read in books :-)

    It's definitely worth looking at subversion too - it's still widely in use. It may be just fine for your needs. Merging code is not hard in subversion or git, it is "branching" and re-merging branches that is very hard in subversion and much easier in git.

    Here's the bit of the Berkeley course that covers that. This is just one of the reasons why I think that Berkeley course is so essential for what you are doing - although of course you don't particularly need to learn the details of ruby on rails, but the other software engineering stuff in there is golden!

  4. Hi Sam,

    Yes, it's true that I should not believe everything I read in books :)
    Thanks for the link to the Berkeley lecture about version control. I watched it today. I created a new post where I will write about the lectures I have already watched, so that you will have a better overview of how far I am: http://tanjamalitz.blogspot.co.at/2013/02/software-as-service.html

    Thanks for the Git video. I recently installed "Tortoise Git" (http://code.google.com/p/tortoisegit/) and worked with a Git GUI. Do you think it is better to work with the console?
