A directory containing the files describing the various genetic tool hits that were used to construct the input MSA.
unrelaxed_model_*.pdb
A PDB file with the predicted structure, exactly as output by the model.
relaxed_model_*.pdb
A PDB file with the predicted structure after an Amber relaxation procedure has been applied to the unrelaxed prediction.
ranked_*.pdb
A PDB file with the relaxed predicted structures, reordered by model confidence (using predicted LDDT (pLDDT) scores). AlphaFold numbers these files from zero: ranked_0.pdb = highest confidence, ranked_4.pdb = lowest confidence.
ranking_debug.json
A JSON file with the pLDDT values used to perform the model ranking, and a mapping back to the original model names.
timings.json
A JSON file with the time taken to run each section of the AlphaFold pipeline.
result_model_*.pkl
A pickle file with a nested dictionary of the NumPy arrays produced directly by the model: StructureModule output, distograms, per-residue pLDDT scores, predicted TM-score, and predicted pairwise aligned errors.
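As a quick way to see what one of these pickle files holds, the sketch below loads it and lists each top-level key with the shape of its array (or the sub-keys of a nested dictionary). The function name `summarize_result` is hypothetical, and the key names in a real file depend on the AlphaFold version, so none are hard-coded here. Depending on the version, unpickling may also require the packages that the AlphaFold environment provides.

```python
import pickle
from pathlib import Path

import numpy as np


def summarize_result(pkl_path):
    """Return {key: array shape or list of sub-keys} for one result pickle."""
    with open(pkl_path, "rb") as handle:
        result = pickle.load(handle)

    summary = {}
    for key, value in result.items():
        if isinstance(value, dict):
            # Nested dictionaries are summarized by their own keys.
            summary[key] = sorted(value.keys())
        else:
            # np.shape returns () for scalars and the array shape otherwise.
            summary[key] = np.shape(value)
    return summary


if __name__ == "__main__":
    # Assumes the current directory is an AlphaFold output directory.
    for pkl_file in sorted(Path(".").glob("result_model_*.pkl")):
        print(pkl_file.name)
        for key, info in summarize_result(pkl_file).items():
            print(f"  {key}: {info}")
```

Only unpickle files you produced yourself or otherwise trust, since loading a pickle can execute arbitrary code.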
Assessing AlphaFold2 Accuracy
We can assess the accuracy of the AlphaFold prediction using:
Predicted Local Distance Difference Test (pLDDT)
Predicted Alignment Error (PAE)
Predicted Local Distance Difference Test (pLDDT)
A per-residue confidence metric ranging from 0 to 100 (100 being the highest confidence)
Regions with pLDDT below 50 may be disordered
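The 50 cutoff can be applied directly to a per-residue pLDDT array. The following is a minimal sketch, not part of the AlphaFold pipeline: the function name `low_confidence_regions` is hypothetical, the example scores are made up, and residues are numbered from 1 by their position in the array, which may differ from the numbering in your PDB file.

```python
import numpy as np


def low_confidence_regions(plddt, threshold=50.0):
    """Return (start, end) residue ranges, 1-based and inclusive, below threshold."""
    below = np.asarray(plddt) < threshold
    regions = []
    start = None
    for index, is_low in enumerate(below, start=1):
        if is_low and start is None:
            start = index  # a low-confidence stretch begins here
        elif not is_low and start is not None:
            regions.append((start, index - 1))  # the stretch ended at the previous residue
            start = None
    if start is not None:
        # The stretch runs to the end of the sequence.
        regions.append((start, len(below)))
    return regions


# Made-up scores for illustration only.
example = [92.1, 88.4, 45.0, 38.2, 47.9, 81.3, 49.5]
print(low_confidence_regions(example))  # [(3, 5), (7, 7)]
```

A low score suggests possible disorder but does not prove it, so treat the reported ranges as candidates for closer inspection.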
Predicted Alignment Error (PAE)
The color at (x, y) corresponds to the expected distance error in residue x’s position, when the prediction and true structure are aligned on residue y.
So, in the example below:
The darker color indicates a lower error
When we are aligning on residue 300, we are more confident in the position of residue 200 and less confident in the position of residue 600
The example above came from a multimer prediction
Here we see that the error is higher when assessing the position between the two chains:
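To put a number on that inter-chain difference, the PAE matrix can be averaged block by block, one block per pair of chains. This is a sketch under stated assumptions: the function name `mean_pae_by_chain_pair` is hypothetical, the chain lengths must be supplied by you, and the residues in the matrix are assumed to be ordered chain after chain. The toy matrix below is invented for illustration.

```python
import numpy as np


def mean_pae_by_chain_pair(pae, chain_lengths):
    """Average a square PAE matrix over every (chain, chain) block."""
    pae = np.asarray(pae, dtype=float)
    total = sum(chain_lengths)
    if pae.shape != (total, total):
        raise ValueError("PAE matrix size does not match the chain lengths")

    # Residue index where each chain starts, plus the final end index.
    bounds = np.concatenate(([0], np.cumsum(chain_lengths)))
    n_chains = len(chain_lengths)
    means = np.zeros((n_chains, n_chains))
    for i in range(n_chains):
        for j in range(n_chains):
            block = pae[bounds[i]:bounds[i + 1], bounds[j]:bounds[j + 1]]
            means[i, j] = block.mean()
    return means


# Toy two-chain example (lengths 2 and 3): low error within chains, high between.
toy = np.full((5, 5), 20.0)
toy[:2, :2] = 2.0
toy[2:, 2:] = 2.0
print(mean_pae_by_chain_pair(toy, [2, 3]))
```

In the result, diagonal entries are the mean within-chain errors and off-diagonal entries are the mean errors between chains. Off-diagonal values that are much larger than the diagonal ones indicate that the relative placement of the chains is uncertain.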
Plotting Structure Prediction Information
We can use the pkl files to gain insight into our structure predictions. To do so, we use a Python script provided by the VIB Bioinformatics Core, which we call vizaf2.py. First, we need to move back up two directories and load the AlphaFold module so that we have the packages needed to run our script.
cd ../../
ls
You should then see the following output:
data lig1 lig1af2.sh pcna pcnaaf2.sh plotaf2.ipynb vizaf2.py
module load alphafold/2.1.1
Now we will need to feed our script three arguments:
--input_dir input directory with model files mentioned above
--output_dir output directory to put our plots of model information