3 min read

Projects Overview

Class Projects

Two groups:

Each group will have a github repository that contains code developed to answer a specific question with data from the TERRA REF project.

Each group will develop an automated analysis of a real-time data stream that will enable visualization and qa/qc. Here are the project topics / questions and links to their repositories.

Topics

  • Topic 1: Met data
    • Create dynamic live visualizations of weather data streams from Maricopa and UIUC Energy Farm
    • Use reanalysis data to validate data streams
    • Extra Credit: fill in missing data
  • Topic 2: Trait and Geospatial Data
    • Map trait data
    • Identify outliers
    • Generate plot layout and plant traits overlain on full field imagery

Objectives

Students should experience a collaborative development cycle similar to what they might encounter as a research programmer / software engineer / data scientist etc. They will have the opportunity to contribute to open source projects (e.g. TERRA REF and PEcAn Projects). Their GitHub repositories, reports, and code will become part of their portfolio.

Projects will provide the opportunity for students to apply what we have learned in class to new datasets and analyses.

Requirements

Project Components

  • Combine data from more than one source (the TERRA REF dataset counts as a single source)
  • Summarize data in tables and graphs
  • Develop quality control metrics to flag data gaps and outliers
  • Produce a report in Rmarkdown that can be updated daily or as new data arrive
  • Extra credit - create an Rshiny web application

Expectations / Requirements

  • Work collaboratively,
    • Create clear tasks in GitHub - at any given time each person should be able to identify the one open issue that they are working on
    • Team members should have clearly defined roles
  • Use data from > 1 source
  • Exploratory Data Analysis
    • multiple figures and tables that illustrate data
  • At least one report that
    • takes raw data and generates useful output (figures, new derived data sets, results of analyses)
    • runs with sample data in data/ folder
    • can be re-run on new data as it becomes available
    • outputs to results/ folder
  • Documentation (can be wiki or README)
    • Background
    • Data description and illustration
    • Approach (Methods)
    • Analysis (results)
    • Discussion
      • findings
      • next steps

Introduction to the TERRA REF dataset

Daily Tasks

Day 1

Form a team of 4-5 people

  • create a team github repository and slack channel
  • create a README with:
  • team members
  • ideas that you want to work on
  • what types of data you will need
  • what types of analyses / algorithms you might use
  • Spend the next three days thinking about your project

Day 2

Flush out ideas from Friday (Day 1)

  • narrow the scope
  • define data sets that are needed
  • clarify questions
  • write hypotheses / predictions
  • draft of methods that you will use to test hypotheses
  • what figures and tables will you create?
  • divide tasks within the group

Set up project

Download data

  • Download and subset at least a sample of the data that you will use
  • Describe its contents in the documentation
  • Create issues and assign to David if you need data that aren’t available

Day 3

  • Refine questions / methods
  • Create visualizations
  • Import met data

Day 4

By end of day you should have

  • GitHub issues for
    • any data you need
    • any issues that are blocking your work
  • A project document containing
    • steps to access data
    • exploratory data analysis figures
    • time series
    • histograms, boxplots
    • faceted & or color by (trait, site, genotype etc)
    • outlines of
    • background
    • approach