VectorByte Training Materials 2024

Pre-work and set-up

Hardware and Software

We will be using R for all data manipulation and analyses/model fitting. Any operating system (Windows, Mac, Linux) will do, as long as you have R (version 3.6 or higher) installed.
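To confirm that your installation meets the minimum version mentioned above, a quick check along the following lines can be run in the R console. This is a minimal sketch: the variable name min_version is only illustrative, and the "3.6.0" threshold is taken from the requirement stated above.

```r
# Minimum R version taken from the set-up notes (illustrative variable name)
min_version <- "3.6.0"

# getRversion() returns the running R version in a form that can be compared
if (getRversion() >= min_version) {
  message("R ", as.character(getRversion()),
          " meets the minimum requirement (", min_version, ").")
} else {
  message("R ", as.character(getRversion()),
          " is older than ", min_version,
          "; please update before the workshop.")
}
```

If the second message appears, install a newer release of R before the workshop.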

You may use any IDE/GUI for R (VS Code, RStudio, Emacs, etc.). For most people, RStudio is a good option. Whichever one you decide to use, please make sure it is installed and tested before the workshop.

We will also be using Slack for additional support during the training. Please have R, your chosen IDE, and Slack installed in advance. We will have a channel on Slack dedicated to software/hardware issues and troubleshooting.

Pre-requisites

We are assuming familiarity with R basics as well as at least introductory statistics, up through simple linear regression. If you would like materials to review, we recommend that you do the following:

Review of R

  1. Go to The Multilingual Quantitative Biologist, and read and work through the Biological Computing in R Chapter.

  2. In addition to, or instead of, item (1), you can try the resources for brushing up on R listed at the end of the Intro R Chapter. There are many more resources online; pick something that suits your learning style.

Review of statistics

  1. Review background on introductory probability and statistics (solutions to the exercises are available). You can also use the resources on The Multilingual Quantitative Biologist, from Basic Data Analyses and Statistics up through linear models.

  2. Review the conceptual basics of Simple Linear Regression. We’ll refresh fitting linear models in R as well as transformations and diagnostic plots during the workshop.
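As a quick self-check of the prerequisites listed above, you might try fitting a simple linear regression in R and reading the output. The sketch below is not part of the workshop materials; it assumes the built-in cars dataset (stopping distance versus speed), chosen only because it ships with every R installation.

```r
# Built-in example data: stopping distance (dist) versus speed
data(cars)

# Fit a simple linear regression of stopping distance on speed
fit <- lm(dist ~ speed, data = cars)

# Estimated intercept and slope
print(coef(fit))

# Full summary: standard errors, t-tests, R-squared
print(summary(fit))

# One standard diagnostic plot: residuals versus fitted values
plot(fit, which = 1)
```

If you can explain what the slope estimate, its standard error, and the residual plot tell you, you should be well prepared for the regression refresher.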



Live Workshop Materials

Introduction to the Workshop



Regression in R Refresher (Transformations and Diagnostics)



Time dependent regression analysis



Basics of Forecasting Time Series using R



Introduction to the VecDyn database



Introduction to climate and weather data for time dependent analyses



Gaussian Process models (GPs) for Time Dependent Data



Dengue Forecasting Challenge