Optional: AlphaFold2 Batch Script
Log into the HPC cluster’s On Demand Interface
- Open a Chrome browser and go to On Demand
- Log in with your Tufts Credentials
- On the top menu bar choose
Clusters->Tufts HPC Shell Access
- You'll see a welcome message and a bash prompt, for example for user
tutln01
:
[tutln01@login001 ~]$
- This indicates you are logged in to the login node of the cluster. DO NOT run anything from this node as it is a shared login node. To complete jobs we either need to start an interactive session to get on to a compute node (more details can be found here about this option) or we can write a batch script to submit our job to a compute node.
AlphaFold2 Batch Script
A batch script can be broken into two parts - the header section with information on how to run the job and a command section where we use UNIX commnads to do a job. Here is our script:
#!/bin/bash
#SBATCH -p preempt # partition we submit to
#SBATCH -n 8 # The number of cpu cores we would like
#SBATCH --mem=64g # The amount of RAM we would like
#SBATCH --time=2-0:00:00 # The time we think our job will take
#SBATCH -o output.%j # The name of the output file
#SBATCH -e error.%j # The name of the error file
#SBATCH -N 1 # The number of nodes we would like
#SBATCH --gres=gpu:1 # The number of GPUs
#SBATCH --exclude=c1cmp[025-026] # The nodes to exclude when using AlphaFold2
# Load the AlphaFold2 and NVIDIA modules
module load alphafold/2.1.1
nvidia-smi
# Make the results direcory
mkdir /path/to/your/home/directory/af2
# Specify where your output directory and raw data are
outputpath=/path/to/your/home/directory/af2
fastapath=/path/to/your/home/directory/data/1AXC.fasta
# Date to specify if you want to avoid using template
maxtemplatedate=2020-06-10
source activate alphafold2.1.1
# Running alphafold 2.1.1
runaf2 -o $outputpath -f $fastapath -t $maxtemplatedate -m multimer
What do these commands mean?
module load alphafold/2.1.1
: load the AlphaFold2 modulenvidia-smi
: load the NVIDIA-SMI modulemkdir /path/to/your/home/directory/af2
: make a directory for our outputsoutputpath=/path/to/your/home/directory/af2
: specify where our outputs should gofastapath=/path/to/your/home/directory/data/1AXC.fasta
: specify our input FASTA filemaxtemplatedate=2020-06-10
: specify our maximum template datesource activate alphafold2.1.1
: activate our AlphaFold2 conda environmentrunaf2 -o $outputpath -f $fastapath -t $maxtemplatedate -m multimer
: run our AlphaFold2 program-o
: output path-f
: our input FASTA file path-t
: our maximum template date-m
: whether or not this is a multimeric protein
What's up with the maxtemplatedate?
The maxtemplatedate
option is a bit more complicated. If we ask AlphaFold to predict the structure of a protein with a structure already in the Protein Data Bank (PDB) - then we have the option of using that structure in the prediction. If we do not want AlphaFold 2 to use this structure in the prediction we need to specify a date before the release date of that structure.
What do the SBATCH commands mean?
command |
description |
---|---|
#!/bin/bash |
specify our script is a bash script |
#SBATCH -p ccgpu |
The partition we are requesting, and if you don't have ccgpu access, use "preempt" |
#SBATCH -n 8 |
The number of cpu cores we would like - here it is 8 |
#SBATCH --mem=64g |
The amount of RAM we would like - here it is 64 Gigabytes |
#SBATCH --time=2-0:00:00 |
The time we think our job will take - here we say 2 days (days-hours:minutes:seconds) |
#SBATCH -o output.%j |
The name of the output file - here it is "output.jobID" |
#SBATCH -N 1 |
The number of nodes we would like - here it is 1 |
#SBATCH --gres=gpu:1 |
The number of GPUs - here we ask for 1 |
#SBATCH --exclude=c1cmp[025-026] |
These are nodes to exclude when using AlphaFold2 |
Run the AlphaFold2 Batch Script
- First save your batch script, here we will save it as
runaf2.sh
.- The bash script must end in
.sh
- The bash script must end in
- Now submit your script with the command:
sbatch runaf2.sh
- To check on the status of this script use the following command:
squeue -u your_utln
- Be sure to swap out
your_utln
with your Tufts utln!