Thursday, 11 June 2015

BLAST (Basic Local Alignment Search Tool)

INTRODUCTION

     BLAST (Basic Local Alignment Search Tool) is an algorithm for comparing primary biological sequence information, such as the amino acid sequences of proteins or the nucleotides of DNA sequences, from the same or different organisms. A BLAST search lets a user compare a query sequence with the sequences kept in a database. The software emphasizes regions of local alignment in order to detect relationships among sequences that share only isolated regions of similarity.

     There are several BLAST programs with different functions, such as:

blastp - Compares an amino acid query sequence against a protein sequence database.
blastn - Compares a nucleotide query sequence against a nucleotide sequence database.
blastx - Compares a nucleotide query sequence translated in all reading frames against a protein sequence database.
tblastn - Compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames.
tblastx - Compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
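
The list above boils down to a simple lookup on the type of the query and the type of the database. Here is a minimal Python sketch of that lookup. The function name choose_blast_program and the "translated" flag are my own names for illustration and are not part of BLAST itself.

```python
def choose_blast_program(query_type, database_type, translated=False):
    """Pick a BLAST program from the query and database sequence types.

    query_type and database_type are "protein" or "nucleotide".
    translated only matters when both are nucleotide: it selects tblastx
    (six-frame translation on both sides) instead of blastn.
    """
    pair = (query_type, database_type)
    if pair == ("protein", "protein"):
        return "blastp"
    if pair == ("nucleotide", "protein"):
        return "blastx"   # query is translated in all reading frames
    if pair == ("protein", "nucleotide"):
        return "tblastn"  # database is translated in all reading frames
    if pair == ("nucleotide", "nucleotide"):
        return "tblastx" if translated else "blastn"
    raise ValueError(f"unknown sequence types: {pair}")
```

For example, a DNA query searched against a protein database maps to blastx.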



How to perform a search in BLAST?


Here is a guide to performing a BLAST search.



Step 1. Go to the NCBI website.

Step 2. Click the program you want to run.


Step 3. Enter the sequence data you want to search.


Step 4. The software will take some time to generate the results.






Step 5. The results are shown above.


     In conclusion, the BLAST programs improve the overall speed of searches while retaining good sensitivity (important as databases continue to grow) by breaking the query and database sequences into fragments ("words") and initially seeking matches between fragments. As a result, BLAST can help identify an unknown species sample or find homologous sequences in a short time, which is beneficial to all researchers.
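
To make the "words" idea concrete, here is a small Python sketch of the seeding step only: it indexes every word of a subject sequence and reports where words from the query match exactly. This is a simplified illustration and not the actual BLAST implementation. The function names and the word size of 4 are my own choices, and real BLAST goes on to extend and score the seeds, which is left out here.

```python
from collections import defaultdict


def index_words(sequence, word_size):
    """Map each word (substring of length word_size) to its start positions."""
    index = defaultdict(list)
    for i in range(len(sequence) - word_size + 1):
        index[sequence[i:i + word_size]].append(i)
    return index


def find_seeds(query, subject, word_size=4):
    """Return (query_pos, subject_pos) pairs where a word matches exactly."""
    subject_index = index_words(subject, word_size)
    seeds = []
    for q in range(len(query) - word_size + 1):
        word = query[q:q + word_size]
        for s in subject_index.get(word, []):
            seeds.append((q, s))
    return seeds
```

Because the subject is indexed once, each query word is looked up directly instead of being compared against every position, which is where the speed comes from.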










AUTO DOCK

1. Open the protein pdb file.

Go to File/Read Molecule.

Slide1.BMP

Read in a pdb file for a protein that has no hydrogen atoms.
Here it is called dhp_noh.pdb.
Before you read in the pdb file, open it in a text editor and change the iron atom from FE to Fe.
In general, every element symbol must be written as it appears in the periodic table for use in Autodock.
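
If the file has many such atoms, the same edit can be scripted. The sketch below is only one way to do it in Python, and it assumes the symbol to fix sits in the element field of the PDB record (columns 77-78). If your file carries the symbol only in the atom name field, the column range would need to change. The function name is my own.

```python
def normalize_element_case(pdb_line):
    """Rewrite the element symbol in columns 77-78 as it appears in the
    periodic table (for example FE -> Fe). Other lines are returned as is."""
    if not pdb_line.startswith(("ATOM", "HETATM")):
        return pdb_line
    body = pdb_line.rstrip("\r\n")
    if len(body) < 78:
        return pdb_line  # no element field present, leave untouched
    line_ending = pdb_line[len(body):]
    element = body[76:78].strip()
    fixed = element.capitalize().rjust(2)  # element symbols are right-justified
    return body[:76] + fixed + body[78:] + line_ending
```

Applying this function to every line of dhp_noh.pdb and writing the result to a new file keeps the original untouched in case something goes wrong.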

Slide2.BMP

The protein will look as below.

Slide3.BMP


2. Assign polar hydrogens and save the file as a pdbqt file.

Go to Edit/Hydrogens/Add.


Slide4.BMP
Select “Polar Only”.
This means that hydrogens are added only to polar atoms such as O and N.
Nonpolar C-H groups are treated as a unit in the electrostatic calculation.

Slide5.BMP

Now we need to write the file in the format used for the Autodock calculation.
It has a pdbqt extension.
Go to Grid/Macromolecule/Choose.

Slide6.BMP

Make sure the molecule is highlighted and click on “Select Molecule”.

Slide7.BMP

Write the file as a pdbqt file using the menu that appears.

   3. Determine a grid center and grid dimensions for the Autodock calculation.
      Open the pdbqt file in a text editor such as UltraEdit.
      Here we want to identify a location in the protein that will serve as the center of the grid.
      I will choose the Fe atom since I am most interested in the heme.
      You could also choose the center of mass of the protein.
      Recall that you can find the center in VMD using the commands
            set OBJECT [atomselect top protein]
            measure center $OBJECT
      By default measure center is not mass-weighted; adding "weight mass" to the command should give the center of mass.
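
The coordinates can also be pulled out of the file with a short script instead of by eye. The Python sketch below reads the fixed-column x, y, z fields of ATOM/HETATM records (columns 31-54, which pdb and pdbqt files share) and returns either the position of a named atom or the unweighted geometric center of all atoms. The function names are my own, and the geometric center is only an approximation of the center of mass.

```python
def atom_coordinates(lines):
    """Yield (atom_name, (x, y, z)) for each ATOM/HETATM record."""
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            name = line[12:16].strip()
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            yield name, xyz


def grid_center_from_atom(lines, atom_name="FE"):
    """Return the coordinates of the first atom with the given name."""
    for name, xyz in atom_coordinates(lines):
        if name.upper() == atom_name.upper():
            return xyz
    raise ValueError(f"no atom named {atom_name!r} found")


def geometric_center(lines):
    """Return the unweighted average position of all atoms."""
    coords = [xyz for _, xyz in atom_coordinates(lines)]
    if not coords:
        raise ValueError("no ATOM/HETATM records found")
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))
```

Either result can be typed into the grid center fields in the next step.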
 

Next we choose Grid/Grid Box.

Slide9.BMP

The Grid menu appears.
 

Adjust the Spacing to 1.00 so the grid is in even increments of 1.00 Angstrom.
Input the center of the grid and then adjust the grid size.   
The grid dimensions can range from 1 to 40 Angstroms.

Slide13.BMP

The 18 x 18 x 18 grid is appropriate to focus on the heme itself.
 

The 32 x 40 x 30 grid covers essentially the whole protein.

4. Read in the substrate from a pdb file and then generate the pdbqt file.

Next we need to read in the substrate.
First, check the substrate file to make sure that the atoms have the appropriate element names.
For example, bromine needs to be Br and not BR.

Slide17.BMP


To load the substrate into Autodock Tools use the Ligand/Input/Open menu selection.

Slide15.BMP

We choose tbp.pdb.

Slide16.BMP
Hide the protein and center the substrate (just for appearance).


Slide18.BMP

When the ligand is read into the program, it is automatically assigned charges.
To write the ligand to an output file, use Ligand/Output and then choose to write it as a pdbqt file.

Slide19.BMP

Autodock Tools writes the pdbqt files, but it does not generate the configuration file for “vina”, the docking routine.

5. Edit the configuration file to prepare to run Vina.

Along the way you need to save the information you have obtained, such as the grid center (for example the center of mass) and the grid box size, and write it into a very simple configuration file, shown below.

Slide20.BMP

The output is written to all.pdbqt as a series of 10 different structures of the substrate.
The exhaustiveness tells the program how hard to search.
In our case the molecule is very simple and the protein is relatively small, so we can afford a high exhaustiveness of 16 (Vina's default is 8).
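
Since the configuration file is just a few "key = value" lines, it can also be generated from the numbers collected so far. The Python sketch below builds the text using standard Vina option names (receptor, ligand, center_x, size_x, out, exhaustiveness, num_modes). The function name is my own, and the receptor and ligand file names in the usage note are assumptions, since the names of the pdbqt files written earlier are not given in this post.

```python
def vina_config_text(receptor, ligand, center, size,
                     out="all.pdbqt", exhaustiveness=16, num_modes=10):
    """Build the text of a Vina configuration file.

    center and size are (x, y, z) tuples in Angstroms.
    """
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        "",
        f"center_x = {cx}",
        f"center_y = {cy}",
        f"center_z = {cz}",
        "",
        f"size_x = {sx}",
        f"size_y = {sy}",
        f"size_z = {sz}",
        "",
        f"out = {out}",
        f"exhaustiveness = {exhaustiveness}",
        f"num_modes = {num_modes}",
    ]
    return "\n".join(lines) + "\n"
```

For example, vina_config_text("dhp_noh.pdbqt", "tbp.pdbqt", center, (18, 18, 18)) returns the text for the small grid around the heme, where center is the Fe position you recorded. Saving that text as config.txt gives the file used in the next step.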

6. Dock the substrate using Vina.

To run vina, go to the Command Prompt.

Slide21.BMP

Navigate to the appropriate directory.

Slide22.BMP

One clever trick is to put the vina program (which you download from the Autodock website) into the Windows/system32 directory.  
From the directory where the program is, type:

      copy vina.exe c:\windows\system32

Then you may type vina anywhere on your system and it will run the vina program. If you type

      vina

you get a kind of help menu.

Slide24.BMP

To run the program using the configuration file config.txt, type:

      vina --config config.txt --log log.txt

Note: use two dashes (--), not one. Once vina starts running you will see the progress on the screen.
It typically takes only a few minutes for our system.

Slide25.BMP

You can read in the final pdbqt structure using the File/Read Molecule command, and it will show you all of your docked structures.
The green substrate remains from previous work.   
We could also delete it if we want to put the protein in the center of the screen.
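
Besides looking at the structures, it helps to list their scores. Vina writes a "REMARK VINA RESULT:" line at the top of each docked structure in the output file, with the binding affinity in kcal/mol as the first number. The Python sketch below collects those values. The function name is my own.

```python
def read_vina_affinities(pdbqt_lines):
    """Return the affinity (kcal/mol) of each docked structure, in file order."""
    affinities = []
    for line in pdbqt_lines:
        if line.startswith("REMARK VINA RESULT:"):
            # Fields: REMARK, VINA, RESULT:, affinity, rmsd l.b., rmsd u.b.
            fields = line.split()
            affinities.append(float(fields[3]))
    return affinities
```

Passing the lines of all.pdbqt to this function gives one value per docked structure, so you can see at a glance how the structures rank.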

Slide27.BMP

An important final step is to read your docked inhibitor molecules into VMD and check that the molecular bonds are all in order. 
Sometimes atoms get switched or minus signs get deleted during translation of file types between pdbqt and pdb. 
A common error that we have observed is for the CAG and CAH carbon atoms to be reversed in the docked inhibitor. 
Then the bonds look like a zig-zag across the phenyl ring. 
You can fix this quickly by editing the pdb file and swapping the CAG and CAH atom names.
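
If there are many docked structures, making this swap by hand gets tedious. Here is a minimal Python sketch that swaps the two names in the atom name field (columns 13-16) of each ATOM/HETATM record. The function name is my own, and it assumes the two names have the same length so that the column alignment is preserved.

```python
def swap_atom_names(pdb_line, name_a="CAG", name_b="CAH"):
    """Swap two atom names in the atom name field of a PDB record."""
    if not pdb_line.startswith(("ATOM", "HETATM")):
        return pdb_line
    field = pdb_line[12:16]
    name = field.strip()
    if name == name_a:
        new_name = name_b
    elif name == name_b:
        new_name = name_a
    else:
        return pdb_line
    # Same-length names keep the padding inside the 4-character field intact.
    return pdb_line[:12] + field.replace(name, new_name) + pdb_line[16:]
```

Apply it to every line of the pdb file, write the result to a new file, and then check the bonds again in VMD.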