Chapter 3 Getting Started with R

3.1 Installing R and Rstudio

R is not a computer program like Microsoft Word. Instead, it is a computer language, just like Python, Matlab, Java and many others. You can tell your computer to execute commands, by writing the commands in such a computer language. RStudio is the interface, an application, that functions as a tool to help you write, understand and organize your R code. In order to get R up and running on your system we therefore need to install both R and Rstudio. Follow the steps on the Rstudio-website specifically for your system (Linux, Mac, Windows).

After installation, you can open the application RStudio and run R code in it. You’ll notice your screen in Rstudio to be divided into 4 sections:

  • script editor (top left); this is where we will do most work. We will write and edit code in this place
  • console (bottom left); this is where we execute code. while you can run R commands here directly, we will execute code here written in the script editor.
  • environment pane (top right); shows all the current variables, datasets and anything else in the working memory
  • file pane (bottom right); allows you to browse over files

The R language is very large and diverse. Any user of R can extend the language with new functions and new tools, which come together in the shape of a package. There are about 20 thousand (!) R packages available, each of which has their own specialty. When some clever person creates a new epidemiological model and wants it to become available for the community, they can upload it on R and make it available for us. While the volume of packages make it sound like you can easily lose track of everything that’s out there, there’s only a handful of packages that really everyone tends to use. For example, the Tidyverse is a collection of packages that many people working with R have extensive experience with. We will also be using it a lot in this course, specifically the packages ggplot2 (for visualization) dplyr (for data manipulation), readr (for reading datafiles) and tidyr (for reshaping dataframes).

We will go through installing and loading packages, and writing functions later in this course. For now we can get started on some very simple things in R, to get you familiar with doing some coding!

3.2 Starting to code

The print() function is a very basic and useful one, which allows you to often check some things quickly. On the first line in the script editor (left top pane), go and write the following line:

print('hello world!')
## [1] "hello world!"

with your cursor on the right side of that line, press the “run”-button. You will see the output in the console where it now shows in yellow what has been executed, and in white the output is shown.

R is great for doing some easy mathematics. Let’s do some calculations and print the output. Here we have multiple lines of code, that we need to execute in that order for it to work. We can’t first run the print, and then do the calculations. We could do that line by line, so press the run button each time. Alternatively, you can select all these lines, and then press the run button. Another option is to use the shortcut, which for Windows users is: CTRL + Enter. This is the equivalent of the Run button. For the execution of an entire code chunk, you can also run CTRL + SHIFT + ENTER

x = 280
y = 10
z = 0.4
product = x/y*z
print(product)
## [1] 11.2

We will discuss more R syntax, and more things to code, at a later point. For now let’s leave it at this and talk a bit about how we organize our code.

3.3 R Projects and Organization

We can save our code in an R script, which is the most basic R file. To do that, we go CTRL + S and we can navigate to the location to save our file. Besides this R file (extension: .R) we have a couple of other options, including an R Markdown file. I am a huge fan of these. If you’re familiar with Python; R Markdown - files are the R equivalent of a Jupyter Notebook file. They allow you to write code in the same file as nicely organized text. You can put in graphs that look really nice, add citations and a bibliography, add chapters, and so much more. This whole textbook is actually running in a handful of R Markdown files. Eventually, I want to write a small project with you in an R Markdown as well, though for now we can get started with R scripts.

To organize our R code properly, we shouldn’t work with individual R scripts. It is much better to have a specific environment for your R code, in which you store your analysis (scripts), your data and your results and figures.

Something I struggle a lot with myself is with directories. A directory is a certain folder’s location, a way of pinpointing where R is working (the working directory). This is important to know, because when we tell R where to find a certain dataset on our computer, the computer needs to know where to look, and it starts looking relative to the working directory.

To have one nicely organized environment, and have set working directories, we have something called an *R project**. This is exactly an environment to bundle different R files in a separate folder, where the working directory is automatically set to the root of the project; the location of the project file. We create an R project through the following steps:

  • File > New Project
  • A window will pop up asking to Save the current workspace: - Don’t save
  • New Directory > New Project
  • In Directory name: write “epiproject_00”
  • In Create project as a subdirectory of: - press browse, and navigate to a place on hyour computer you would like to store your R codes. I suggest this NOT TO BE your Downloads or Desktop folder.
  • Create Project

Now that we’re in a project, open a new R script. Save it in the project under the name “00_syntax” and let’s set started in the next chapter.