Introduction
Introduction To RStudio
RStudio is an Integrated Development Environment (IDE). Here you can write scripts, run R code, use R packages, view plots, and manage projects. The RStudio window is divided into three panes:
- The Interactive R console/Terminal (left)
- Environment/History/Connections (upper right)
- Files/Plots/Packages/Help/Viewer (lower right)
RStudio Layout
Project Management
Before we dive into R, it is worth taking a moment to talk about project management. Data analysis is often incremental, and files build up over time, resulting in messy directories:
Example of a Messy Directory
Sifting through a disorganized file system can make it difficult to find files, share data and scripts, and tell different versions of a script apart. To remedy this, it is recommended to work within an R Project. Before we make this project, we should make sure you are in your home directory. To do this, click on the three dots in the Files tab:
Navigating Folders
Then enter a ~ symbol to go home!
Getting Home
R Project
For the following intro to R tutorial we will be using Alzheimer's Disease gene expression data from Srinivasan et al. 2020. When working within R it is useful to set up an R project. Opening an R project sets your working directory to the project directory, so file paths can be written relative to it. This can help ensure you are only working with files within this project space. To create a new project:
- Go to File > New Project > New Directory > New Project
- Create a name for your project (e.g. r_data_viz) > Create Project
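Once the project is open, the working directory should be the project folder, so paths can be written relative to it. A minimal sketch of checking this from the console, using the name of a file downloaded later in this tutorial; the directory printed by getwd() will differ on your machine:

```r
# Print the current working directory; inside an R project this
# should be the project folder (e.g. one ending in r_data_viz).
getwd()

# Build a path relative to the project folder instead of typing
# an absolute path that only works on one machine.
expr_path <- file.path("data", "expression_data.tsv")
expr_path
```

Relative paths like this keep working if the whole project folder is moved or shared.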
When analyzing data it is useful to create folders to house your raw data, scripts, and results. We can do this by clicking the New Folder icon:
- Click New Folder > Enter data > Click OK
- Click New Folder > Enter scripts > Click OK
- Click New Folder > Enter results > Click OK
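The same folders can also be created from the console instead of the Files tab. A minimal sketch using base R's dir.create(), assuming the working directory is the project folder:

```r
# Folder names used in this tutorial
folders <- c("data", "scripts", "results")

for (folder in folders) {
  # Only create the folder if it is not already there
  if (!dir.exists(folder)) {
    dir.create(folder)
  }
}

# Confirm that all three folders now exist
dir.exists(folders)
```

Checking dir.exists() first means the snippet can be rerun without warnings about folders that already exist.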
Now that we have our project set up, we will need to download our data. We will download the files into the data folder:
download.file("https://raw.githubusercontent.com/BioNomad/omicsTrain/main/docs/programming_languages_tools/r_data_viz/data/ad_overview.png",destfile = "./data/ad_overview.png")
download.file("https://raw.githubusercontent.com/BioNomad/omicsTrain/main/docs/programming_languages_tools/r_data_viz/data/expression_data.tsv",destfile = "./data/expression_data.tsv")
download.file("https://raw.githubusercontent.com/BioNomad/omicsTrain/main/docs/programming_languages_tools/r_data_viz/data/meta_data.tsv",destfile = "./data/meta_data.tsv")
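After the downloads finish, it can help to confirm that the files arrived and to peek at one of them. The sketch below uses base R only and assumes the .tsv files are tab-separated with a header row (suggested by the file extension, not confirmed here):

```r
# Files we expect in the data folder after the downloads above
expected <- file.path("data", c("ad_overview.png",
                                "expression_data.tsv",
                                "meta_data.tsv"))

# TRUE/FALSE for each file, named by its path
found <- file.exists(expected)
names(found) <- expected
found

# Read the metadata only if it was downloaded successfully.
# Assumption: tab-separated with a header row.
meta_path <- file.path("data", "meta_data.tsv")
if (file.exists(meta_path)) {
  meta <- read.delim(meta_path, header = TRUE, sep = "\t")
  print(dim(meta))   # number of rows and columns
  print(head(meta))  # first few rows
}
```

If any entry in found is FALSE, rerun the matching download.file() call before continuing.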
Data Principles
- Treat data as read-only
- Store raw data separately from cleaned data if you do need to manipulate it
- Ensure scripts to clean data are kept in a separate scripts folder
- Treat reproducible results as disposable
Tip
Result files are good candidate files to cut if you are getting low on storage.
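These principles can be sketched in code. The example below uses a small made-up table written to a temporary folder (it is not the tutorial's data, and the file and column names are hypothetical): the raw file is read but never overwritten, and the cleaned version is written to a separate folder.

```r
# Toy project layout inside a temporary folder (illustration only)
project <- file.path(tempdir(), "toy_project")
dir.create(file.path(project, "data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project, "results"), showWarnings = FALSE)

# Made-up raw data with one missing value
raw_path <- file.path(project, "data", "raw_counts.tsv")
toy <- data.frame(sample = c("s1", "s2", "s3"), count = c(10, NA, 7))
write.table(toy, raw_path, sep = "\t", row.names = FALSE, quote = FALSE)

# Read the raw file, clean a copy in memory, and leave the raw file alone
raw <- read.delim(raw_path)
cleaned <- raw[!is.na(raw$count), ]

# Write the cleaned table to a different folder, never over the raw file
clean_path <- file.path(project, "results", "cleaned_counts.tsv")
write.table(cleaned, clean_path, sep = "\t", row.names = FALSE, quote = FALSE)

nrow(raw)      # rows in the untouched raw data
nrow(cleaned)  # rows after dropping missing values
```

Because the cleaned file can be regenerated from the raw file and the script, it is safe to delete when storage runs low.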
New R script
Now we will create an R script. R commands can be entered into the console, but saving these commands in a script will allow us to rerun these commands at a later date. To create an R script we will need to either:
- Go to File > New File > R Script
- Click the New File icon and select R Script
Creating a New R Script
Running R Code
When running R code you have a few options:
Running One Line/Chunk:
- Put your cursor on the line of code and hit Ctrl + Enter on Windows or ⌘ + Enter on macOS.
- Highlight the line/chunk of code and hit Ctrl + Enter or ⌘ + Enter.
Running The Entire Script:
- Click Source at the top of the script window.
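Running a whole script can also be done from the console with base R's source() function, which is comparable to clicking Source. A minimal sketch, using a throwaway script written to a temporary file (the script contents are made up for illustration):

```r
# Write a tiny two-line script to a temporary file (illustration only)
script_path <- tempfile(fileext = ".R")
writeLines(c("x <- 2 + 3",
             "y <- x * 10"),
           script_path)

# Run every line of the script, top to bottom
source(script_path)

# Objects created by the script are now available in the session
y
```

In a project, the call would look like source("scripts/my_script.R"), where my_script.R is a hypothetical file name.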