Introduction to Biostatistics
Biostatistics attempts to use statiscal methods to solve biological problems. This involves data so for the purpose of our biostatistics tutorials we will need to do some setup.
Setup
For the following machine learning tutorials we will be using glioblastoma data from cBioPortal. When working within R it is useful to set up an R project. R projects will set your working directory relative to the project directory. This can help ensure you are only working with files within this project space. To create a new project:
- Go to
File
>New Project
New Directory
New Project
- Create a name for your project (e.g.
biostatistics
) Create Project
When analyzing data it is useful to create a folder to house your raw data, scripts and results. We can do this by clicking the New Folder
icon to create these folders:
- Click
New Folder
> Enterdata
> Click OK - Click
New Folder
> Enterscripts
> Click OK - Click
New Folder
> Enterresults
> Click OK
Now that we have our project set up we will need to download our data. In the data
folder we will download our data and decompress it:
download.file(url = "https://cbioportal-datahub.s3.amazonaws.com/gbm_cptac_2021.tar.gz",destfile = "./data/gbm_cptac_2021.tar.gz" )
untar(tarfile = "./data/gbm_cptac_2021.tar.gz",exdir = "./data/")