Version Control
Following SWC Git Novice 1-6
Version control makes it possible to have ‘unlimited undo’ and a full history of your documents (and some smaller datasets).
Having everything on a website like GitHub (or Bitbucket or GitLab) makes it easy to share with other people and collaborate.
If you keep all of your important programs and files somewhere in the cloud, such as Box, Dropbox, or GitHub, it is easier to work across different computers and operating systems.
Getting Started
Introduce yourself (once you’ve set this on your machine, you don’t have to set it again unless you want to change something):
git config --global user.name "Firstname Lastname"
git config --global user.email you@yourdomain.example.com
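To confirm the settings were stored, you can read them back. This is a minimal sketch; the fallback messages are illustrative and not Git output:

```shell
# Read back the identity Git will attach to your commits.
# `git config --get` exits non-zero when a key is unset, so print a hint instead.
name=$(git config --global --get user.name || echo "user.name is not set")
email=$(git config --global --get user.email || echo "user.email is not set")
echo "name:  $name"
echo "email: $email"
```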
To make a directory into a “git directory” you need to initialize it. In the following command we create a directory and initialize it as a git directory:
mkdir myproject
cd myproject
git init
The above command creates a hidden directory called .git that keeps track of changes in the directory. We now call the directory a “repository”, or “repo” for short. Git maintains a log of changes, but you have to tell it what to track. The command git status tells you, among other things, the status of changes to the directory:
git status
Since we created the empty repo, there shouldn’t be any changes to track yet, so let’s make a file:
echo "my project" > README.md
git status
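The steps so far can be tried safely in a throwaway directory. This sketch assumes `mktemp` is available on your system and uses `git status --short`, a compact form of the status output in which `??` marks an untracked file:

```shell
# Work in a scratch directory so nothing important is touched.
scratch=$(mktemp -d)
cd "$scratch"
mkdir myproject
cd myproject
git init -q                      # -q suppresses the informational message

echo "my project" > README.md
git status --short               # prints: ?? README.md
```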
Committing changes
Changes are tracked via “commits”, but you need to tell git which files to line up for the commit. The following code moves the file README.md to the “staging” area, where it waits to be committed:
git add README.md
git status
You then commit together with a short but descriptive message:
git commit -m "added README as first commit"
git status
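A minimal sketch of how the staging area behaves, run in a scratch repository. The identity set with `git config` here is local to that repository and is only there so the commit succeeds on a machine without global settings:

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "Firstname Lastname"          # repo-local placeholder identity
git config user.email you@yourdomain.example.com

echo "my project" > README.md
git add README.md
git status --short               # prints: A  README.md (staged, not yet committed)

git commit -q -m "added README as first commit"
git status --short               # prints nothing: the working tree is clean
```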
Git log
If at any time you want to know all that has been committed:
git log
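The full log can get long. Two standard options shorten it: `--oneline` prints one line per commit, and `-n` limits the number of commits shown. The sketch below builds a scratch repository with two commits so there is something to display; the file contents and second commit message are illustrative:

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "Firstname Lastname"          # repo-local placeholder identity
git config user.email you@yourdomain.example.com

echo "my project" > README.md
git add README.md
git commit -q -m "added README as first commit"
echo "more detail" >> README.md
git add README.md
git commit -q -m "expanded README"

git log --oneline                # one line per commit: abbreviated hash and message
git log -n 1                     # full entry for only the most recent commit
```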
Branching
Branches are a powerful aspect of git. A branch is like an “alternative history” in which you can make changes without affecting the rest of the project. You can later decide whether to merge these changes into the “main timeline”.
The “main timeline” is itself a branch called “master” (newer Git installations may name it “main” instead).
A branch called “first-script” can be created with:
git branch first-script
You don’t automatically change to this branch. For that:
git checkout first-script
You can return to the master branch by:
git checkout master
The git status command tells you which branch you are on.
A shorter way to create a branch and simultaneously switch to it is:
git checkout -b first-script
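This creates a new branch called “first-script” and switches to it. If you already created the branch with the commands above, switch to it with git checkout first-script instead. Now let’s add something to it.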
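To see which branches exist and which one you are on, `git branch` lists them and marks the current one with an asterisk. The sketch below also uses `git branch --show-current`, which assumes Git 2.22 or newer; the scratch repository is only there to make the example self-contained:

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "Firstname Lastname"          # repo-local placeholder identity
git config user.email you@yourdomain.example.com
echo "my project" > README.md
git add README.md
git commit -q -m "added README as first commit"

git checkout -q -b first-script  # create the branch and switch to it in one step
git branch                       # lists local branches; * marks the current one
current=$(git branch --show-current)
echo "now on: $current"          # prints: now on: first-script
```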
printf '#!/bin/bash\necho "hello world"\n' > myscript.sh
Quick-check question: What is \n?
Now commit the changes:
git add myscript.sh
git commit -m 'added hello world script'
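It is worth checking that the script actually runs. This sketch writes the two-line script with `printf` and runs it with `bash`; the scratch directory is an assumption made to keep the example self-contained:

```shell
scratch=$(mktemp -d)
cd "$scratch"

# Write a two-line script: a shebang line, then the command to run.
printf '#!/bin/bash\necho "hello world"\n' > myscript.sh

cat myscript.sh                  # shows the shebang line and the echo line
output=$(bash myscript.sh)
echo "$output"                   # prints: hello world
```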
Merging
Once you’re satisfied with your changes, you can merge the branch into the master branch. You do this from the branch you want to merge into, in this case from master. It’s good practice to use git status at every step to check which branch you are on and whether you have any uncommitted changes.
git status
git log
git checkout master
git status
git merge first-script
git status
Note this does not delete the branch.
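If you no longer need a merged branch, it can be removed with `git branch -d`, which refuses to delete a branch whose commits have not been merged. A sketch in a scratch repository; the `base` variable is a hypothetical helper that records the default branch name (“master” or “main”), and `--show-current` assumes Git 2.22 or newer:

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "Firstname Lastname"          # repo-local placeholder identity
git config user.email you@yourdomain.example.com
echo "my project" > README.md
git add README.md
git commit -q -m "added README as first commit"
base=$(git branch --show-current)                  # default branch name on this machine

git checkout -q -b first-script
printf '#!/bin/bash\necho "hello world"\n' > myscript.sh
git add myscript.sh
git commit -q -m 'added hello world script'

git checkout -q "$base"
git merge -q first-script        # fast-forward: the base branch had no new commits
git branch --merged              # lists branches already contained in the current one
git branch -d first-script       # safe delete: refuses if the branch is unmerged
```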
Review
You may want to see all commits and how the branches they were made on relate to each other. As mentioned before, git log gives you a history. Adding the --graph flag draws that history as a text graph: an asterisk * marks each commit, and the connecting lines show where branches diverge and merge.
git log --graph
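The graph is easiest to read when combined with `--oneline` and `--all`. The sketch below builds a small scratch history in which both branches gain a commit before merging, so the graph shows a real fork and join. The `base` variable and the extra README edit are illustrative assumptions:

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "Firstname Lastname"          # repo-local placeholder identity
git config user.email you@yourdomain.example.com
echo "my project" > README.md
git add README.md
git commit -q -m "added README as first commit"
base=$(git branch --show-current)                  # needs Git 2.22 or newer

git checkout -q -b first-script
echo "echo 'hello world'" > myscript.sh
git add myscript.sh
git commit -q -m 'added hello world script'

git checkout -q "$base"
echo "more detail" >> README.md
git add README.md
git commit -q -m "expanded README"

git merge -q --no-edit first-script   # histories diverged, so this records a merge commit
git log --graph --oneline --all       # one line per commit, with branch lines drawn
```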
Collaborative Coding
Following SWC Git Novice 7-14
One of the strengths of git is its power to facilitate collaboration.
GitHub
GitHub is one of many hosting sites for collaborative coding via Git. You can create a repo on its servers via the website, and then clone it onto your computer via:
git clone url-of-repo-goes-here
You work on the clone just as you would on any local git repo. After committing changes, you need to send them to the repo on the server using:
git push
Multiple people can do this, so you should periodically “pull” changes:
git pull
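The push and pull cycle can be practiced without a GitHub account. In this sketch a local bare repository (`server.git`) stands in for the server, and two clones named `alice` and `bob` play the collaborators; all of these names are hypothetical:

```shell
scratch=$(mktemp -d)
cd "$scratch"

# A seed repository, then a bare copy that plays the role of the server.
git init -q seed
cd seed
git config user.name "Firstname Lastname"          # repo-local placeholder identity
git config user.email you@yourdomain.example.com
echo "my project" > README.md
git add README.md
git commit -q -m "added README as first commit"
cd ..
git clone -q --bare seed server.git

# Two collaborators clone the same "server".
git clone -q server.git alice
git clone -q server.git bob

cd alice
git config user.name "Alice Example"
git config user.email alice@example.com
echo "a note from alice" >> README.md
git commit -q -a -m "alice adds a note"
git push -q                      # send the commit to the server

cd ../bob
git pull -q                      # fetch and merge what alice pushed
tail -n 1 README.md              # prints: a note from alice
```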
There may be conflicts. For the mechanics of resolving them, see the Software Carpentry notes.
Pull Requests & Code Reviews
It’s best practice to work with branches, and then to submit a “pull request” on the GitHub site of your repo. This opens up a forum to discuss the code.
A collaborator can fetch the branch with git pull and then list all branches, including those on the remote, with:
git branch -a
The branch can then be checked out with git checkout and inspected.
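A sketch of that review flow, again with a local bare repository standing in for GitHub. The directory names `author` and `reviewer` and the identities are hypothetical:

```shell
scratch=$(mktemp -d)
cd "$scratch"

# A seed repository and a bare copy standing in for the server.
git init -q seed
cd seed
git config user.name "Firstname Lastname"          # repo-local placeholder identity
git config user.email you@yourdomain.example.com
echo "my project" > README.md
git add README.md
git commit -q -m "added README as first commit"
cd ..
git clone -q --bare seed server.git
git clone -q server.git author
git clone -q server.git reviewer

# The author publishes a branch for review.
cd author
git config user.name "Author Example"
git config user.email author@example.com
git checkout -q -b first-script
printf '#!/bin/bash\necho "hello world"\n' > myscript.sh
git add myscript.sh
git commit -q -m 'added hello world script'
git push -q -u origin first-script

# The reviewer picks it up.
cd ../reviewer
git pull -q                      # also fetches branches that are new on the server
git branch -a                    # remote branches are listed as remotes/origin/<name>
git checkout -q first-script     # creates a local branch tracking origin/first-script
```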
Your collaborators then review the code. Those with push access may commit and push changes. The discussion happens on the website. Even individual lines of code can be commented on.
When everybody is satisfied with the new code, it can be accepted and merged into the master branch right from the website (or from the command line).
Messaging
- Slack
- IRC
Git in Practice
There are various workflows based around git. The following is a very common approach to using Git. It is described in a blog post by Vincent Driessen: http://nvie.com/posts/a-successful-git-branching-model/.
We use this in the PEcAn Project. Here is a list of branches. Our documentation provides a recommended git workflow that includes a ‘basic’ and ‘advanced’ section. These are worth reading, and it is not necessary to include the content here.
Google is where you will often turn for help, and Google will often lead you to Stack Overflow (stackoverflow.com). If you read through the most popular Stack Overflow questions tagged with ‘git’, you will find solutions to most common problems.
Git in RStudio
At www.pi4-uiuc open the RStudio environment.
To add the git repo, in the top corner choose: Project –> New Project –> Version Control –> https://www.github.com/username/repository
To create a new R script: File –> New File –> R Script
Commits and pushes can be made directly from RStudio.
Neat trick: Let’s show the git status in the shell prompt (https://git-scm.com/book/it/v2/Appendix-A%3A-Git-in-altri-contesti-Git-in-Bash)
export PS1="\\w\$(__git_ps1 '(%s)') \$ "
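`__git_ps1` is a shell function defined in Git’s `git-prompt.sh` script, so the prompt above only works once that script has been loaded into your shell. Where the file lives varies by system, so this sketch does not guess a path. It changes the prompt only when the function is already defined; the `prompt_status` variable and its messages are illustrative:

```shell
# Only change the prompt if the __git_ps1 helper is defined in this shell.
if command -v __git_ps1 >/dev/null 2>&1; then
    export PS1="\\w\$(__git_ps1 '(%s)') \$ "
    prompt_status="git-aware prompt enabled"
else
    prompt_status="__git_ps1 not found: load git-prompt.sh first"
fi
echo "$prompt_status"
```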