Thursday, June 6, 2013

version control and git

Version control was originally developed for large software projects with many developers, but it is useful for single users too.  A single user can use version control to keep track of the history of changes to files, including Word files, which makes it possible to revert to previous versions.  It's also a useful way to keep many different versions of code well organized and to keep work in sync across multiple computers.

So how does version control work?

The system keeps track of the differences from one version to the next by recording a changeset, which is the collection of differences introduced by one commit.  The system can then go back and reconstruct any version.  Git, which I will talk about here, is a distributed version control system (DVCS).
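As a rough illustration of the idea above, the sketch below makes two commits and then asks git for the differences between them and for the older version of the file.  The directory name, file name, and user identity are placeholders made up for this example.  (Internally git stores snapshots and computes the differences on demand, but what you see as a user matches the description above.)

```shell
set -e
# Placeholder identity so the commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)            # scratch area so nothing real is touched
cd "$work"
git init --quiet notes
cd notes

printf 'line one\n' > notes.txt
git add notes.txt
git commit --quiet -m "first version"

printf 'line one\nline two\n' > notes.txt
git commit --quiet -am "add a second line"

# The differences between the previous commit and the latest one.
git diff HEAD~1 HEAD

# Reconstruct the older version of the file without touching the working copy.
git show HEAD~1:notes.txt
```

The diff output marks the added line with a leading +, and git show prints the file as it was one commit earlier.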

So what is a distributed version control system?  It's a version control system where every person has the complete history of the entire project.  A DVCS requires no connection to a central repository in order to commit, and because commits are local they are very fast, which encourages committing much more often.  You can commit changes, revert to earlier versions, and examine history, all without being connected to a server and without affecting anyone else's work.  It also isn't a problem if the server dies, since every clone has the full history.  Once a developer has made one or more local commits, they can push those changes to another repository, or other developers can pull them.  Most projects have a central repository, but this is not necessary.  Use this link for branching diagrams.
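Here is a small sketch of that workflow, using only directories on one machine.  The names central.git, alice, and bob are invented for the example, and the bare repository stands in for whatever shared repository a project might use.

```shell
set -e
# Placeholder identity so the commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

# A bare repository plays the role of the shared "central" repository.
git init --quiet --bare central.git

# First developer: clone, commit locally, then push.
git clone --quiet central.git alice
cd alice
echo "print('hello')" > script.py
git add script.py
git commit --quiet -m "first version of script"
git push --quiet -u origin HEAD
cd ..

# Second developer: the clone already contains the full history.
git clone --quiet central.git bob
cd bob
echo "print('goodbye')" >> script.py
git commit --quiet -am "add a second line"   # local commit, no server needed
git log --oneline                            # both commits are visible here
git push --quiet origin HEAD
cd ..

# First developer picks up the new changeset.
cd alice
git pull --quiet
git log --oneline
```

Every commit in this sketch happens locally; only the push and pull steps talk to another repository.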

The changesets are all stored in a single .git subdirectory in the top directory of the project.  It is hidden because its name starts with a dot.  It's best not to mess with this directory, but you can see it in Unix with the ls -a command.
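A quick way to see this for yourself, assuming a throwaway directory (the name project is just a placeholder):

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init --quiet project
cd project

ls       # prints nothing: there are no visible files yet
ls -a    # the listing now includes the hidden .git directory
```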

To make a clone use:

git clone (web address of the repository, for example on bitbucket) (name you would like to give the new directory)

git clone will make a complete copy of the repository, including its full history, in a new directory inside the directory you have navigated to in the terminal.
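The sketch below shows the shape of the command.  To keep it runnable without a network connection it clones from a local directory instead of a web address; the names original and my-copy are made up for the example.

```shell
set -e
# Placeholder identity so the commit works on a machine with no git config.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

# A small repository to clone from (stands in for the web address).
git init --quiet original
cd original
echo "some data" > readme.txt
git add readme.txt
git commit --quiet -m "first version"
cd ..

# The second argument is the name of the directory that will be created.
git clone --quiet original my-copy

ls my-copy                      # the files from the repository
git -C my-copy log --oneline    # the history came along too
git -C my-copy remote -v        # "origin" points back to where you cloned from
```

If you leave off the second argument, git names the new directory after the repository.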

To make a new git repository, make a directory, move into it, and then initialize the git repository:
    mkdir (name of directory)
    cd (name of directory)
    git init
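Putting those steps together with a first commit might look like the following.  The directory name myproject, the file name, and the user identity are placeholders for the example.

```shell
set -e
# Placeholder identity so the commit works on a machine with no git config.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"

mkdir myproject
cd myproject
git init --quiet

# Nothing is tracked until you add and commit it.
echo "print('hello')" > script.py
git add script.py
git commit --quiet -m "first version of script"

git ls-files .       # lists script.py, the only file under version control
git log --oneline    # shows the single commit
```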

Here is a list of useful git commands:

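git add (file name) - adds the file to the set of changes that will go into the next commit
git add -u - adds changes to any files that are already under version control (then do a git commit); in Git 2.0 and later this covers the whole working copy unless you give a path such as .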
git clone - makes a complete copy of the repository at a particular directory you navigated to in terminal
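git checkout (name of commit) - converts everything in the working copy back to that commit
git checkout (name of commit) -- (name of file to revert back) - converts just that file back to the commit
git checkout HEAD - HEAD is the commit you currently have checked out; to get back to the most recent commit after checking out an older one, check out the branch name instead (for example git checkout master)
git commit -m "string describing commit" - commits the added changes to your clone's .git directory
    example: git commit -m "first version of script"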
git fetch - pulls changesets from another clone, by default the one you cloned from, without changing your working copy
git init - initializes a git repository
git ls-files . - tells which files in this directory are currently under version control
git log - shows the commit history, most recent commits first
git merge - applies fetched changesets to your current branch and working copy
git pull - updates your repository by doing a git fetch followed by a git merge (navigate to the repository on your machine and perform git pull)
git push - sends your recent changesets to another clone, by default the one you cloned from
git status - shows the state of the working copy: changes to be committed, modified files, and untracked files
git status | more - shows the same output page by page
git status . - limits the status output to the directory you are in
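As a small worked example of the checkout commands in the list above, the sketch below commits two versions of a file, brings back the first version, and then returns to the latest one.  The file name and identity are placeholders.

```shell
set -e
# Placeholder identity so the commits work on a machine with no git config.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"
git init --quiet demo
cd demo

echo "version 1" > script.py
git add script.py
git commit --quiet -m "first version of script"
first=$(git rev-parse HEAD)          # remember the name (hash) of this commit

echo "version 2" > script.py
git commit --quiet -am "second version of script"

# Bring back just this one file as it was in the first commit.
git checkout --quiet "$first" -- script.py
old_contents=$(cat script.py)
echo "$old_contents"                 # prints: version 1
git status --short                   # script.py now differs from the latest commit

# Change your mind: restore the file from the most recent commit.
git checkout --quiet HEAD -- script.py
cat script.py                        # prints: version 2
```

Note that reverting a single file this way does not rewrite history; both commits are still in git log.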
Protocol to make git ignore untracked files that you never want to put in the repository, such as .pyc files:
   create a special file called .gitignore
    example: $ vi directoryname/.gitignore
   then add a pattern, one per line, for each type of file you would like to ignore
    example: *.pyc (ignore any .pyc file)
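To check that the ignore rule is doing what you expect, you can compare git status before and after creating the file.  This is only a sketch; the file names analysis.py and analysis.pyc are invented for the example.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init --quiet project
cd project

touch analysis.py analysis.pyc
git status --short           # both files show up as untracked (??)

echo "*.pyc" > .gitignore    # ignore any .pyc file
git status --short           # analysis.pyc is no longer listed

# git check-ignore reports which of the given paths are ignored.
git check-ignore analysis.pyc
```

The .gitignore file itself is usually added and committed so that every clone ignores the same files.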

github vs bitbucket

At the time of writing, bitbucket offers free private repositories, while free repositories on github are public.
Git was originally developed for an open source project: the development of the Linux kernel.
Many scientific computing projects keep their repositories on github, for example IPython, NumPy, SciPy, and matplotlib.