
3. Introduction to R

This lesson is based on Getting Started with R and RStudio (SWC 1-3).

Getting started with R

Why R?

  • By design, “a programming language and environment for statistical computing”
  • The most common language among academic statisticians, so many methods are developed as R packages and are not available elsewhere
  • Built around statistics, with lots of handy functions
  • Data analysis by design
  • Formulas for statistical models
  • ggplot’s ‘grammar of graphics’ is among the best graphing tools (plots only get better with graphic design programs like Photoshop or Inkscape)
  • dplyr and tidyr for data munging
  • Integrates with lower-level languages (especially C, C++, and FORTRAN; the Rcpp package for integrating with C++ is excellent)
  • General-purpose: can do almost anything
  • Has a reputation for being slow
      • It can’t be as fast as a compiled language (like C or FORTRAN)
      • Recent changes and new packages make it faster and better at handling big data (e.g. by leveraging back-end databases)
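To make the "formulas for statistical models" point concrete, here is a minimal sketch using only base R. It assumes the built-in `mtcars` dataset as example data; the choice of variables is purely illustrative and not part of the original lesson.

```r
# Assumption: mtcars (shipped with base R) stands in for real data.
# The formula mpg ~ wt reads "model mpg as a function of wt".
fit <- lm(mpg ~ wt, data = mtcars)

# Inspect the fitted coefficients (intercept and slope for wt)
print(coef(fit))

# The same formula syntax extends to more predictors
fit2 <- lm(mpg ~ wt + hp, data = mtcars)
print(names(coef(fit2)))
```

The same formula object style is reused by many modeling functions, which is part of why model code in R tends to be short.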

Why not R?

  • different from many other programming languages
  • if this is your first language, it is

Intro to RStudio

RStudio is arguably the best IDE for programming in R. It is developed by the RStudio company, which plays a leading role in the R community. Many leading R developers work there, including Hadley Wickham (lead author of the tidyverse, dplyr, and ggplot2) and Yihui Xie (R Markdown, Shiny). There are now also lots of ‘plugins’ that facilitate developing in R.

If you love Emacs, try the ESS (Emacs Speaks Statistics) package, especially if you mostly write scripts. It is really quite awesome. However, it doesn’t compare to RStudio once you start developing packages and writing reports or websites.

There are four panes (counterclockwise from top left):

  • text editor
  • interactive environment
  • files/plots/packages/help/viewer
  • environment/history/build/git

Loading and Evaluating Data
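
As a rough sketch of what loading and inspecting data can look like in base R, the example below writes a tiny CSV to a temporary file and reads it back. The file contents and the column names `site` and `count` are hypothetical, invented only so the example is self-contained.

```r
# Hypothetical data: written to a temp file so the sketch runs anywhere
csv_path <- tempfile(fileext = ".csv")
writeLines(c("site,count", "A,10", "B,25", "C,7"), csv_path)

# Load the file into a data frame
dat <- read.csv(csv_path)

# Evaluate what was loaded: structure, first rows, and a numeric summary
str(dat)
print(head(dat))
print(summary(dat$count))

# Clean up the temporary file
unlink(csv_path)
```

With real data, `csv_path` would simply be the path to your own file.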

Control Flow (if, else, for)

Following SWC R intro 7
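
A minimal sketch of `if`, `else`, and `for` working together is shown below. The temperature values and the thresholds are made up for illustration and do not come from the SWC lesson.

```r
# Hypothetical measurements, invented for illustration only
temps <- c(12.5, 18.0, 25.3, 31.1)

# Pre-allocate the result vector, then fill it inside the loop
labels <- character(length(temps))
for (i in seq_along(temps)) {
  if (temps[i] < 15) {
    labels[i] <- "cold"
  } else if (temps[i] < 28) {
    labels[i] <- "mild"
  } else {
    labels[i] <- "hot"
  }
}

print(labels)
```

Using `seq_along()` instead of `1:length(temps)` keeps the loop safe when the vector is empty.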

Visualization in R using ggplot

Following SWC 8
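
To give a flavor of the ‘grammar of graphics’, here is a small sketch that builds a scatterplot layer by layer. It assumes the ggplot2 package is installed and uses the built-in `mtcars` dataset as stand-in data; neither choice comes from the SWC lesson.

```r
# Assumption: ggplot2 is installed (install.packages("ggplot2") if not)
library(ggplot2)

# Map data columns to aesthetics, then add a geometry layer and labels
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

# Printing the object draws the plot
print(p)
```

Because the plot is an ordinary object, further layers can be added to `p` with `+` before printing.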