Chapter 2 Project set up

If you have already set up your R Project, please proceed to the next chapter. This package works best inside a Project in the RStudio IDE. A Project keeps all related files in one directory, and that directory becomes the ‘working directory’, which makes relative file paths easy to use.

See https://support.rstudio.com/hc/en-us/articles/200526207-Using-RStudio-Projects.

2.1 Use R Notebooks

Using R Notebooks (file extension .Rmd) is also encouraged, and these should be saved in the top level of the project folder. When you save a Notebook, a .html file (with the extension .nb.html) is generated in the same folder as the .Rmd file, showing the output of each ‘chunk’ of code directly below it. This helps you keep track of your work, and the .html reports make it quick and easy to view the results outside of RStudio. In fact, this book is written in a very similar way to R Notebooks, and you will see why this is so handy in the next chapters.

The output appears differently in this book than in RStudio; in particular, the messages that the functions produce lose their colour here. Within RStudio the output will be clearer. When a function takes a long time to run, it is a good idea to keep its output messages and create a new code chunk below it for further tasks, because running an additional line of code in the existing chunk will wipe the output shown on screen.

If you are a beginner, try opening a new R Notebook by clicking File -> New File -> R Notebook. The template that opens includes some introductory details.

R Notebooks are a special type of R Markdown file, and the two shouldn’t be confused.

2.2 Setup packages

The first ‘code chunk’ of your R Notebook should be labelled setup, by using {r setup} at the start of the code chunk. Everything between the two sets of three backticks (```) is executed as R code, and everything outside of these is rendered as text. When the first code chunk is labelled setup, RStudio will execute it before running any other code if it has not already done so in the current session. This is particularly useful for loading your R packages, which need to be loaded once per session.

At a minimum it is suggested that you load library(tidyverse), as well as the DE4Rumi package:

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.6     v dplyr   1.0.7
## v tidyr   1.1.4     v stringr 1.4.0
## v readr   2.1.0     v forcats 0.5.1
## Warning: package 'tibble' was built under R version 4.1.2
## Warning: package 'tidyr' was built under R version 4.1.2
## Warning: package 'readr' was built under R version 4.1.2
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(DE4Rumi)
## Loading required package: S4Vectors
## Loading required package: stats4
## Loading required package: BiocGenerics
## Loading required package: parallel
## 
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:parallel':
## 
##     clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
##     clusterExport, clusterMap, parApply, parCapply, parLapply,
##     parLapplyLB, parRapply, parSapply, parSapplyLB
## The following objects are masked from 'package:dplyr':
## 
##     combine, intersect, setdiff, union
## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
## 
##     anyDuplicated, append, as.data.frame, basename, cbind, colnames,
##     dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
##     grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
##     order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
##     rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
##     union, unique, unsplit, which.max, which.min
## 
## Attaching package: 'S4Vectors'
## The following objects are masked from 'package:dplyr':
## 
##     first, rename
## The following object is masked from 'package:tidyr':
## 
##     expand
## The following objects are masked from 'package:base':
## 
##     expand.grid, I, unname
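
Before the library() calls, it can help to check that the packages are actually installed, so that a missing package is reported with a clear message rather than an error partway through the Notebook. The following is only a minimal sketch using base R; the variable names are illustrative and not part of DE4Rumi.

```r
# Packages this book suggests loading in the setup chunk
required_pkgs <- c("tidyverse", "DE4Rumi")

# requireNamespace() returns TRUE if a package is installed and can be loaded,
# and FALSE otherwise, without attaching it to the search path
installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)

# Report anything that still needs to be installed
missing_pkgs <- required_pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Please install before continuing: ",
          paste(missing_pkgs, collapse = ", "))
}
```

If nothing is reported, the library() calls shown above should run without a "there is no package called" error.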

2.3 A note on file paths

Because the project folder is the working directory, file paths can be written relative to it. For example, DE4Rumi has the ability to export plots and tables, and by default it uses the path file = "./outputs/". Although the "./" is not necessary, it is used to remind the user that the outputs folder is located in the top level of the project. Keep in mind that "../" goes up one level and "../../" goes up two levels, if needed. When writing a file path to import or export a file, you can press the Tab key after you type "./", and RStudio will display a list of files and folders. Tab completion is useful for typing long variable or function names too.
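
The behaviour of "./" and "../" can be checked with base R alone. The sketch below uses a throwaway folder under tempdir() as a stand-in for a project folder; the folder name my_project is hypothetical, and only the outputs folder name comes from the default mentioned above.

```r
# Stand-in for an RStudio Project folder, created under tempdir() so that
# nothing in a real project is touched (hypothetical folder name)
project_dir <- file.path(tempdir(), "my_project")
dir.create(file.path(project_dir, "outputs"),
           recursive = TRUE, showWarnings = FALSE)

# "./outputs" and "outputs" resolve to the same folder
with_dot    <- normalizePath(file.path(project_dir, ".", "outputs"))
without_dot <- normalizePath(file.path(project_dir, "outputs"))
identical(with_dot, without_dot)

# ".." resolves to the folder that contains the project (one level up)
one_level_up <- normalizePath(file.path(project_dir, ".."))
identical(one_level_up, normalizePath(tempdir()))
```

Inside a real Project the project_dir prefix is not needed, because the project folder is already the working directory.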

2.4 Figures displayed below code chunks

Figures will also be displayed below the code chunks for some functions in DE4Rumi. If multiple figures are generated by a function, all of them will be displayed below the code chunk and will be visible one after the other in the .html output. Various controls are available to change the size of the figures and how they look. For example, figure width can be set to 8 or 10 for larger plots by adding {r fig.width = 10} at the start of the code chunk.
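
To avoid repeating the same option in every chunk, a default figure size can be set once, for instance in the setup chunk, using knitr (the package that runs the code chunks in .Rmd files). This is a sketch under assumptions: the values are illustrative rather than DE4Rumi recommendations, and fig.height is an additional knitr option not discussed above.

```r
# Only attempt this if knitr is installed
if (requireNamespace("knitr", quietly = TRUE)) {
  # Default figure size in inches for all later chunks (illustrative values)
  knitr::opts_chunk$set(fig.width = 10, fig.height = 6)

  # Confirm the default that is now in effect
  knitr::opts_chunk$get("fig.width")
}
```

An option written in an individual chunk header, such as {r fig.width = 8}, still takes precedence over these defaults for that chunk.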

To view plots in the ‘Plots’ window inside RStudio, copy the code that generates the plot, paste it directly into the ‘Console’ (the keyboard shortcut Ctrl + 2 moves focus there) and run it. This will let you Zoom and Export directly if needed. Some functions in DE4Rumi also provide their own options for exporting plots.