VectorByte Training Materials 2024
Pre-work and set-up
Hardware and Software
We will be using R for all data manipulation and analyses/model fitting. Any operating system (Windows, Mac, Linux) will do, as long as you have R (version 3.6 or higher) installed.
You may use any IDE/GUI for R (VS Code, RStudio, Emacs, etc.). For most people, RStudio is a good option. Whichever one you decide to use, please make sure it is installed and tested before the workshop.
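One quick way to test the installation is to run a few lines in the R console of your chosen IDE. The snippet below is a minimal sketch, not an official workshop script: it only compares the running R version against the 3.6 minimum stated above, and the variable names (`min_version`, `installed`, `version_ok`) are illustrative.

```r
# Minimum R version stated in the set-up instructions above
min_version <- "3.6.0"

# getRversion() returns the running R version as a comparable object
installed <- getRversion()
version_ok <- installed >= min_version

if (version_ok) {
  message("R ", as.character(installed),
          " meets the minimum requirement (", min_version, ").")
} else {
  message("R ", as.character(installed),
          " is older than ", min_version, "; please update before the workshop.")
}
```

If the message reports an older version, update R first and then re-open your IDE so that it picks up the new installation.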
We will also be using Slack for additional support during the training. Please have all of these tools installed in advance. We will have a channel on Slack dedicated to software/hardware issues and troubleshooting.
Pre-requisites
We assume familiarity with R basics and with at least introductory statistics, up through simple linear regression. If you would like materials to review, we recommend the following:
Review of R
Go to The Multilingual Quantitative Biologist, and read and work through the Biological Computing in R Chapter.
In addition to, or instead of, working through that chapter, you can try the resources for brushing up on R listed at the end of the Intro R Chapter. There are many more resources online (e.g., this and this); pick something that suits your learning style.
Review of statistics
Review background on introductory probability and statistics (solutions to exercises). You can also use the resources on The Multilingual Quantitative Biologist, from Basic Data Analyses and Statistics through to linear models.
Review the conceptual basics of Simple Linear Regression. We’ll refresh fitting linear models in R as well as transformations and diagnostic plots during the workshop.
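As a rough preview of that kind of refresher, the sketch below fits a simple linear regression, refits it after a log transformation, and draws the standard diagnostic plots. It uses R's built-in `cars` dataset purely for illustration; the dataset, the choice of a log-log transformation, and the object names (`fit_raw`, `fit_log`) are assumptions and are not taken from the workshop practical.

```r
# Built-in example data (speed and stopping distance); not the workshop dataset
data(cars)

# Simple linear regression: stopping distance as a function of speed
fit_raw <- lm(dist ~ speed, data = cars)

# The same relationship after log-transforming both variables
fit_log <- lm(log(dist) ~ log(speed), data = cars)

# Coefficient tables (estimates, standard errors, t values, p values)
print(summary(fit_raw)$coefficients)
print(summary(fit_log)$coefficients)

# Standard lm diagnostic plots (residuals vs fitted, normal Q-Q,
# scale-location, residuals vs leverage) arranged in a 2 x 2 grid
op <- par(mfrow = c(2, 2))
plot(fit_log)
par(op)  # restore the previous plotting layout
```

Comparing the diagnostic plots for `fit_raw` and `fit_log` is one way to judge whether a transformation has improved the fit; which transformation is appropriate depends on the data at hand.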
Live Workshop Materials
Introduction to the Workshop
Regression in R Refresher (Transformations and Diagnostics)
- Lecture Slides
- Practical
- Dataset: transforms.csv
Time dependent regression analysis
Basics of Forecasting Time Series using R
Introduction to the VecDyn database
- The VecDyn website.
- About the VecDyn API
- Other Materials Coming Soon!
Introduction to climate and weather data for time dependent analyses
Gaussian Process models (GPs) for Time Dependent Data
Dengue Forecasting Challenge
- Description
- dataset: combined_sanjuan.csv