Tutorial


All input and output files used in this tutorial are available in the sample folder.
See also the Features page for further examples.

Target/Template alignment

Assuming you have already found a template candidate, you need to align it against your target sequence. In this toy example we take the sequences of two homologous proteins, both with a known 3D structure. That allows us to compare different types of Victor alignments with the "exact" one derived from the structural alignment.

The two proteins are:

  • Target = 2ANL (chain A)
  • Template = 1DP5 (chain A)

The two structures superimpose quite well (RMSD 2.03 Å) considering the low level of sequence identity (28.06%). This is the sequence alignment resulting from the 3D alignment computed by FATCAT:

3d align.png

Alignment cartoon.png



The subali application lets you choose among many different algorithms, strategies and parameters. The first step is to create a file (e.g. pair.fasta, already available in the sample folder) containing both the target and the template sequences in FASTA format, like this:

>2ANL:A Target
SENDVIELDDVANLMFYGEGEVGDNHQKFMLIFDTGSANLWVPSKKCNSIGCSTKHLYDSSKSKSYEKDGTKVEITYGSG
TVRGFFSKDLVTLGYLSLPYKFIEVTDTDDLEPLYTAAEFDGILGLGWKDLSIGSIDPIVVELKNQNKIDQALFTFYLPV
HDKHSGYLTIGGIEEKFYEGELTYEKLNHDLFWQVDLDVNFGKTSMEKANVIVDSGTSTITAPTSFINKFFKDLNVIKVP
FLPFYITTCNNKDMPTLEFKSANNTYTLEPEYYMEPLLDIDDTLCMLYILPVDIDKNTFILGDPFMRKYFTVFDYDKESI
GFAVAKN
>1DP5:A Template
GGHDVPLTNYLNAQYYTDITLGTPPQNFKVILDTGSSNLWVPSNECGSLACFLHSKYDHEASSSYKANGTEFAIQYGTGS
LEGYISQDTLSIGDLTIPKQDFAEATSEPGLTFAFGKFDGILGLGYDTISVDKVVPPFYNAIQQDLLDEKRFAFYLGDTS
KDTENGGEATFGGIDESKFKGDITWLPVRRKAYWEVKFEGIGLGDEYAELESHGAAIDTGTSLITLPSGLAEMINAEIGA
KKGWTGQYTLDCNTRDNLPDLIFNFNGYNFTIGPYDYTLEVSGSCISAITPMDFPEPVGPLAIVGDAFLRKYYSIYDLGN
NAVGLAKAI
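
Before running an alignment it can be worth checking that the input file really holds two records. The following is a minimal sketch, not part of Victor: the helper name fasta_lengths is hypothetical, and it only assumes that the file is called pair.fasta as in this tutorial. It prints the identifier and the residue count of every record.

```shell
#!/bin/sh
# Hypothetical helper (not part of Victor): print "<id> <length>" for each
# record of a FASTA file. The id is the first word of the header line,
# without the leading ">".
fasta_lengths() {
    awk '
        /^>/ { if (id != "") print id, len; id = substr($1, 2); len = 0; next }
             { gsub(/[ \t\r]/, ""); len += length($0) }   # ignore blanks and CRs
        END  { if (id != "") print id, len }
    ' "$1"
}

# Only run the check when the tutorial input file is present.
if [ -f pair.fasta ]; then
    # Warn if the file does not contain exactly two records.
    [ "$(grep -c '^>' pair.fasta)" -eq 2 ] ||
        echo "pair.fasta should contain exactly two records" >&2
    fasta_lengths pair.fasta
fi
```

With the file shown above, the output should have one line for 2ANL:A and one for 1DP5:A.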


Sequence to sequence alignment

Assuming the input file with the target and template sequences is called pair.fasta, the following command produces a basic alignment with the default parameters (see Features):

subali --in pair.fasta

The resulting alignment is this:

Default align.png


Profile to profile alignment

In most cases, including evolutionary information helps to improve the alignment quality. In this example we used PsiBlast to calculate profiles for both the target and the template sequences. The PsiBlast tool is available at:


The profiles have to be generated in a specific format.

When using the online service, go to "Formatting options" and set "Show -> Alignment as -> Plain text" and "Alignment View -> Flat query-anchored with letters for identities".

If you prefer to generate the input with the command-line tool, remember to use "-outfmt 4" ("6" in older versions of Blast, see Features). In our case the commands are:

psiblast -num_iterations 3 -db /db/blastdb/nr90 -query 2anl_A.fasta -out 2anl_A.psi -outfmt 4

psiblast -num_iterations 3 -db /db/blastdb/nr90 -query 1dp5_A.fasta -out 1dp5_A.psi -outfmt 4

The output files 2anl_A.psi and 1dp5_A.psi are provided in the sample folder.
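
The two psiblast runs above differ only in the sequence identifier, so they can be wrapped in a small loop. This is a sketch under stated assumptions: the function name make_profile and the BLAST_DB variable are hypothetical conveniences, while the psiblast options are exactly the ones used above.

```shell
#!/bin/sh
# Hypothetical variable: path to the BLAST database, defaulting to the one
# used in this tutorial.
BLAST_DB=${BLAST_DB:-/db/blastdb/nr90}

# Hypothetical helper: build the profile <id>.psi from the query <id>.fasta,
# with the same options as the two commands above.
make_profile() {
    psiblast -num_iterations 3 -db "$BLAST_DB" \
        -query "$1.fasta" -out "$1.psi" -outfmt 4
}

# Run both profiles only if psiblast is installed.
if command -v psiblast >/dev/null 2>&1; then
    for id in 2anl_A 1dp5_A; do
        make_profile "$id" || echo "psiblast failed for $id" >&2
    done
fi
```

Setting BLAST_DB in the environment before running the script selects a different database without editing the commands.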


Then, to generate the alignment, simply run:

subali --in pair.fasta --pro1 2anl_A.psi --pro2 1dp5_A.psi


Profile align.png

Evaluate 3D models

Based on the alignment created in the previous section we can easily model the target. In our case, for simplicity, we used the SwissModel online service. In the sample folder you can find two files:

  • model_default.pdb - The model obtained from the default sequence-to-sequence alignment.
  • model_profile.pdb - The model obtained from the profile-to-profile alignment.


Model default.png

We then run the following commands on the two models:

frst -v -i model_default.pdb
frst -v -i model_profile.pdb


We obtain the following output (last line):

model_default.pdb	-29822.6749	-6266.6390	 -18.1340	-223.0000	 -46.3974
model_profile.pdb	-1549.1603	-3196.3570	  -1.1504	-236.0000	 -13.6659


The five numbers after the file name are, in order, the following energies:

FRST | RAPDF | Solvation | Hydrogen | Torsion


For comparison, the experimental structure of the target obtains the following energies:

2ANL.pdb (Target)	-26691.5315	-8822.3390	 -23.3529	-224.0000	 -11.8836
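
When several models have to be compared, collecting and sorting these summary lines by hand becomes tedious. The sketch below is not part of Victor and the function name frst_rank is hypothetical. It assumes, as in the output above, that the last line printed by frst starts with the file name followed by the FRST energy, and that file names contain no spaces. It also assumes that a lower FRST energy indicates a more favourable model.

```shell
#!/bin/sh
# Hypothetical helper: run frst on every model given as an argument, keep the
# last output line of each run, and sort the lines by the second column
# (the FRST energy), lowest first.
frst_rank() {
    for model in "$@"; do
        frst -v -i "$model" | tail -n 1
    done | LC_ALL=C sort -k2,2n    # C locale so "." is the decimal separator
}

# Rank the two tutorial models only if frst is installed.
if command -v frst >/dev/null 2>&1; then
    frst_rank model_default.pdb model_profile.pdb
fi
```

The first line of the output is then the model with the lowest FRST energy.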


According to these results, the model built from the profile alignment scores worse than the one built from the default alignment. Moreover, the model generated from the default alignment appears to be more stable (lower FRST energy) than the native structure of the target protein (PDB ID 2ANL); one possible explanation is that SwissModel favours stability when generating structures. The general idea is that with Victor you can easily generate different alignments (by changing algorithms and/or parameters) and then test them by evaluating the quality of the 3D models built from them.

For an extensive discussion about these methods visit the References page.

Build loops

In this last section we show how to build a loop. In this example we take the structure 3DFR and model the loop of 4 residues from position 89 to 93.

The first step is to generate a LUT (see Features) of size 4:

loboLUT_all -a 4

After that you should find in the data folder of the Victor package the following files: aa2.lt, aa3.lt, aa4.lt.
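
A quick way to confirm that the tables were written is to look for the expected file names. This is only a sketch: the helper missing_luts and the VICTOR_DATA variable are hypothetical, and the location of the data folder depends on your installation. It applies to tables created with the -a option, where every size from 2 up to the requested one is expected.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every table aa2.lt .. aa<max>.lt
# that is missing from the given directory.
missing_luts() {
    dir=$1
    max=$2
    n=2
    while [ "$n" -le "$max" ]; do
        [ -f "$dir/aa$n.lt" ] || echo "aa$n.lt"
        n=$((n + 1))
    done
}

# Hypothetical variable: path to the data folder of the Victor package.
data_dir=${VICTOR_DATA:-./data}
if [ -d "$data_dir" ]; then
    missing_luts "$data_dir" 4
fi
```

An empty output means that aa2.lt, aa3.lt and aa4.lt are all present.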

Now, to model the loop, simply run:

lobo -i 3DFR.pdb -c A -s 89 -e 93


N.B. To save disk space the "-c" option of loboLUT_all can be used to create only the necessary LUTs in a given range:

loboLUT_all -c 10

will generate only aa2, aa3, aa5, aa10.

Create a new project

To create your own project you need a source file and a makefile. Below is a simple program that loads a PDB file into memory, followed by the necessary makefile.


#include <PdbLoader.h>
#include <Protein.h>
#include <fstream>
#include <iostream>
#include <string>

using namespace std;
using namespace Victor::Biopool;
using namespace Victor;

int main( int argc, char* argv[] ) {

    string inputFile = "MyPdbFile.pdb";
    ifstream inFile( inputFile.c_str() );
    if ( !inFile ) {         // stop early if the file cannot be opened
        cerr << "Cannot open " << inputFile << endl;
        return 1;
    }
    PdbLoader pl(inFile);    // creates the PdbLoader object

    Protein prot;            // creates the Protein object
    prot.load( pl );         // fills it with the content of the PDB file
    return 0;
}


To generate the corresponding executable you can use the following makefile (the source file is assumed to be named test.cc).


#--*- makefile -*--------------------------------------------------------------
#
#   Standard makefile
#
#------------------------------------------------------------------------------

# Path to project directory
UPDIR = /home/Victor/
# Path to subdirectories
SUBDIR =
# Path to directory for binaries
BINPATH = /home/Victor/bin

#
# Libraries and paths
#

LIBS = -lBiopool -ltools

LIB_PATH = -L.  -L/home/Victor/lib/

INC_PATH += -I. -I/home/Victor/tools/ -I/home/Victor/Biopool/Sources/

#
# Options
#

CC=g++
CFLAGS=-I. -ansi -pedantic -DNEXCEPTIONS -DLINUX -c -O3 -ffast-math -DNDEBUG -ftemplate-depth-36 -Wno-reorder -Wno-uninitialized -Wno-write-strings -Wno-narrowing

#
# Install rule
#

install:
	$(CC)   $(CFLAGS)   $(INC_PATH) -c test.cc -o test.o
	$(CC)   test.o -o test $(LIB_PATH) $(LIBS)


Note that this example uses only the Biopool and tools modules.