5 min read

Git and GitHub

Version Control

Following SWC Git Novice 1-6

Version control makes it possible to have ‘unlimited undo’ and histories of your documents (and some smaller datasets)

Having everything on a website like GitHub (or Bitbucket or GitLab) makes it easy to share with other people and collaborate.

If you keep all of your important programs and files somewhere in the cloud, like on box.com, dropbox, github, it makes it easier to use heterogeneous environments.

Getting Started

Introduce yourself (once you’ve set this on your machine, you don’t have to set it again unless you want to change something):

git config --global user.name "Firstname Lastname"
git config --global user.email you@yourdomain.example.com

To make a directory into a “git directory” you need to initialize it. In the following command we create a directory and initialize it as a git directory:

mkdir myproject
git init

The above command creates a hidden directory called .git that keeps tracks of changes in the directory. We now call it a “repository”" or “repo” for short. One maintains a log of changes with Git, but you have to tell it what to track. The command git status tells you, among other things, the status of changes to the directory:

git status

Since we created the empty repo, there shouldn’t be any changes to track yet, so let’s make a file:

echo "my project" > README.md
git status

Commiting changes

Changes are tracked via “commits”, but you need to tell git which files to line up for the commit. The following code moves the file README.md to the “staging” are waiting to be committed:

git add README.md
git status

You then commit together with a short but descriptive message:

git commit -m "added README as first commit"
git status

Git log

If at any time you want to know all that has been committed:

git log

Branching

Branches are a powerful aspect of git, they are like an “alternative history” in which you can make changes. You can later decide if you want to merge these changes into the “main timeline”.

The “main timeline” is itself a branch called “master”.

A branch called “first-script” can be created with:

git branch first-script

You don’t automatically change to this branch. For that:

git checkout first-script

You can return to the master branch by:

git checkout master

The git status command tells you what branch you are on.

A shorter way to create a branch and simultaenously switch to it is:

git checkout -b first-script

This creates a new branch called your ‘first script’. Now lets add something to it.

echo "!#/bin/bash \n echo 'hello world'" > myscript.sh

Quick-check question: * What is \n?

Now commit the changes:

git add myscript.sh
git commit -m 'added hello world script'

Merging

Once you’re satisfied with changes you can merge the branch with the master branch. You do this from the branch you want to merge into, in this case from master. It’s good practice to use git status at every step to check what branch you are on and any changes you may have not committed.

git status
git log
git checkout master
git status
git merge first-script
git status

Note this does not delete the branch.

Review

You may want to know all commits and the branches in which they were made in. As mentioned before, git log gives you a history. Adding the --graph flag give you a history of commits together with an asterisk * telling you to what branch the commit belongs to.

git log --graph

Collaborative Coding SWC Git Novice 7-14

One of the strenghts of git is its power to facilitate collaboration.

GitHub

GitHub is one of the many hosting sites for collaborative coding via Git. You can create a repo in their servers via their website, and then clone it onto your computer via:

git clone url-of-repo-goes-here

One works on it locally as you would a local git repo. But after committing changes you need to send them to the repo on the server using:

git push

Multiple people can do this, so you should periodically “pull” changes:

git pull

There may be conflicts. For the mechanics of resolving such things we refer to the software carpentry notes.

Pull Requests & Code Reviews

It’s best practice to work with branches, and then to submit a “pull request” on the Github site of your repo. This opens up a forum to discuss the code.

The branch can be pulled via a git pull and then revealed by:

git branch -a

The branch can be checked out by a git checkout and viewed.

Your collaborators then review the code. Those with push access may commit and push changes. The discussion happens on the website. Even individual lines of code can be commented.

When everybody is satisfied with the new code, it can get accepted and merged to the master branch right from the website (or the command line too).

Messaging

  • Slack
  • IRC

Git in Practice

There are various workflows based around git. The following is a very common approach to using Git. It is described in a blog post by Vincent Dressen http://nvie.com/posts/a-successful-git-branching-model/.

image of git workflow

image of git workflow

We use this in the PEcAn Project. Here is a list of branches. Our documentation provides a recommended git workflow that includes a ‘basic’ and ‘advanced’ section. These are worth reading, and it is not necessary to include the content here.

Google is where you will often turn for help, and Google will often lead you to Stackoverflow.com. If you read through the most popular stackoverflow questions tagged with ‘git’ you will find solutions to most common problems.

Git in Rstudio

At www.pi4-uiuc open the Rstudio environment.

To add the git repo: In top corner Project –> New Project –> Version Control –> https://www.github.com/username/repository

To create a new R script: > File –> New File –> R Script

Commits and pushes can be made directly from RStudio.


Neat trick: Lets show the git status (https://git-scm.com/book/it/v2/Appendix-A%3A-Git-in-altri-contesti-Git-in-Bash)

export PS1="\\w\$(__git_ps1 '(%s)') \$ "