Introduction To RStudio

RStudio is an Integrated Development Environment (IDE). Here you can write scripts, run R code, use R packages, view plots, and manage projects. By default, the RStudio window is divided into three panes:

  • The Interactive R console/Terminal (left)
  • Environment/History/Connections (upper right)
  • Files/Plots/Packages/Help/Viewer (lower right)

RStudio Layout

Project Management

Before we dive into R, it is worth taking a moment to talk about project management. Data analysis is often incremental, and files build up over time, resulting in messy directories:

Example of a Messy Directory

Sifting through a disorganized file system makes it difficult to find files, share data and scripts, and tell different versions of a script apart. To remedy this, it is recommended to work within an R Project. Before we create the project, we should make sure we are in the home directory. To do this, click on the three dots in the Files tab:

Navigating Folders

Then enter a ~ symbol to go to your home directory.

Getting Home

R Project

For this introduction to R we will be using Alzheimer's Disease gene expression data from Srinivasan et al. 2020. When working in R it is useful to set up an R project. Opening an R project sets your working directory to the project directory, so file paths can be written relative to it. This helps ensure you are only working with files inside the project. To create a new project:

  1. Go to File > New Project
  2. New Directory
  3. New Project
  4. Create a name for your project (e.g. r_data_viz)
  5. Create Project
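
Once the project opens, a quick check from the console can confirm that the working directory points at the project folder. This is a minimal sketch; the path printed will depend on where you created the project:

```r
# Print the current working directory; inside an R project this
# should be the project folder (e.g. one ending in r_data_viz)
getwd()

# List the files and folders in the working directory
list.files()
```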

When analyzing data it is useful to create separate folders to house your raw data, scripts, and results. We can do this by clicking the New Folder icon in the Files tab:

  1. Click New Folder > Enter data > Click OK
  2. Click New Folder > Enter scripts > Click OK
  3. Click New Folder > Enter results > Click OK
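
The same folders can also be created from the R console. A minimal sketch, assuming the working directory is the project directory:

```r
# Folder names follow the steps above
folders <- c("data", "scripts", "results")

for (folder in folders) {
  # showWarnings = FALSE keeps dir.create() quiet if the folder already exists
  dir.create(folder, showWarnings = FALSE)
}

# Check that each folder now exists
dir.exists(folders)
```

Running this more than once is harmless, since existing folders are left as they are.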

Now that the project is set up, we need to download the data. Run the following commands to download the files into the data folder:

# mode = "wb" keeps the binary PNG file intact on Windows
download.file("https://raw.githubusercontent.com/BioNomad/omicsTrain/main/docs/programming_languages_tools/r_data_viz/data/ad_overview.png", destfile = "./data/ad_overview.png", mode = "wb")
download.file("https://raw.githubusercontent.com/BioNomad/omicsTrain/main/docs/programming_languages_tools/r_data_viz/data/expression_data.tsv", destfile = "./data/expression_data.tsv")
download.file("https://raw.githubusercontent.com/BioNomad/omicsTrain/main/docs/programming_languages_tools/r_data_viz/data/meta_data.tsv", destfile = "./data/meta_data.tsv")
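
After downloading, it can help to confirm the files arrived and to look at the first few rows. The sketch below assumes the .tsv files are tab-separated with a header row, which this page does not state explicitly. `preview_tsv` is a hypothetical helper written for this example, not part of R or the tutorial:

```r
# Hypothetical helper: read a tab-separated file and return its
# first n rows, or NULL (with a message) if the file is missing
preview_tsv <- function(path, n = 6) {
  if (!file.exists(path)) {
    message("File not found: ", path)
    return(NULL)
  }
  # read.delim() defaults to sep = "\t" and header = TRUE
  df <- read.delim(path)
  head(df, n)
}

list.files("./data")                       # which files are in the data folder?
preview_tsv("./data/meta_data.tsv")        # first rows of the metadata
preview_tsv("./data/expression_data.tsv")  # first rows of the expression data
```

If a download failed, the helper reports the missing file instead of stopping with an error.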

Data Principles

  • Treat data as read-only
  • Store raw data separately from cleaned data if you do need to manipulate it
  • Ensure scripts to clean data are kept in a separate scripts folder
  • Treat reproducible results as disposable
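
These principles might look as follows in a script. This is only a sketch: it builds a toy project in a temporary directory, the file names are made up, and saving the cleaned copy under results is an assumption rather than something the tutorial prescribes.

```r
# Toy project in a temporary directory so the sketch is self-contained
project <- file.path(tempdir(), "demo_project")
dir.create(file.path(project, "data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project, "results"), showWarnings = FALSE)

# Hypothetical raw file, standing in for data you downloaded
raw_path <- file.path(project, "data", "raw_counts.tsv")
write.table(data.frame(gene = c("A", "B", "C"), count = c(10, NA, 5)),
            raw_path, sep = "\t", row.names = FALSE, quote = FALSE)

# Read the raw file, but never write back to it
raw <- read.delim(raw_path)

# Cleaning step: drop rows with missing values
cleaned <- raw[complete.cases(raw), ]

# Save the cleaned copy outside data/, leaving the raw file untouched
clean_path <- file.path(project, "results", "cleaned_counts.tsv")
write.table(cleaned, clean_path, sep = "\t", row.names = FALSE, quote = FALSE)
```

Because the cleaned file can be regenerated from the raw file and the script, it is safe to delete and recreate at any time.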

Tip

Result files that can be regenerated from your scripts are good candidates to delete if you are running low on storage.

New R script

Now we will create an R script. R commands can be entered into the console, but saving these commands in a script will allow us to rerun these commands at a later date. To create an R script we will need to either:

  • Go to File > New File > R Script
  • Click the New File icon and select R Script

Creating a New R Script

Running R Code

When running R code you have a few options:

Running One Line/Chunk:

  • Place your cursor anywhere on the line of code and press Ctrl + Enter on Windows or ⌘ + Enter on macOS.

  • Highlight the line/chunk of code and hit Ctrl + Enter or ⌘ + Enter.

Running The Entire Script:

  • Click Source at the top of the script window.
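
Clicking Source is roughly equivalent to calling source() on the saved script from the console. A small sketch, using a throwaway script written to a temporary file (the script contents are illustrative only):

```r
# Write a two-line script to a temporary file
script_path <- tempfile(fileext = ".R")
writeLines(c("x <- 1 + 1", "print(x)"), script_path)

# Run every line of the script, top to bottom
source(script_path)
```

For a script saved in the project, the call would take the script's path relative to the project directory instead.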

References

  1. R for Reproducible Scientific Analysis
  2. Base R Cheat Sheet
  3. Alzheimer’s Patient Microglia Exhibit Enhanced Aging and Unique Transcriptional Activation