BioComputing

    Homology Modeling with HOMER


Quick Help and References

Description
HOMER (HOmology ModellER) is a comparative modelling server for protein structure prediction. In the manual template selection mode it builds a model structure from an alignment (in FASTA format) and a single template structure (in PDB format). The template can either be uploaded directly or be selected from the local PDB database. On request, the server performs loop modeling and sidechain optimization. It generally follows the best practices established at the biennial CASP meetings. The program output, including the constructed model and a per-residue energy profile calculated with FRST, is accessible as a series of dynamic web pages.

E-Mail address
The address is needed for identification purposes and is where the (optional) notification will be sent. Please check that it is typed correctly.

Sequence Name
An optional title for your submission. It will appear in the header of the output. We suggest you provide one, especially when sending multiple queries, as jobs may be completed in a different order from the one in which they were submitted.

Alignment
An alignment in FASTA format is required. Only the first two sequences will be considered. The order switch selects whether they are read as target, template (default) or as template, target. The target is the query sequence with unknown structure, whereas the template sequence has to match the uploaded PDB file (at least partially).
NB: The alignment will be used to construct the model exactly as provided by the user, i.e. gaps must already be placed in the alignment. The Jalview editor may be used to modify the alignment and paste it back into the form using the Paste Alignment button.

An example of a valid FASTA format input follows:

>t111
SKIVKIIGREIIDSRGNPTVEAEVHLEGGFVGMAAAPSGASTGSREALELRDGDKSRFLGKGVTKAVAAVNG
PIAQALIGK--DAKDQAGIDKIMIDLDGTENKSKFGANAILAVSLANAKAAAAAKGMPLYEHIAELNGTPGK
YSMPVPMMNIINGGEHADNNVDIQEFMIQPVGAKTVKEAIRMGSEVFHHLAKVLKAKG--MNTAVGDEGGYA
PNLGSNAEALAVIAEAVKAAGYELGKDITLAMDCAASEFYK-DGKYVLAG-----EGNKAFTSEEFTHFLEE
LTKQYPIVSIEDGLDESDWDGFAYQTKVLGDKIQLVGDDLFVTNTKILKEGIEKGIANSILIKFNQIGSLTE
TLAAIKMAKDAGYTAVISHRSGETEDATIADLAVGTAAGQIKTGSMSRSDRVAKYNQLIRIEEALGEKAPYN
GRKEIKGQA---
>1pdy
-SITKVFARTIFDSRGNPTVEVDLYTSK-GLFRAAVPSGASTGVHEALEMRDGDKSKYHGKSVFNAVKNVND
VIVPEIIKSGLKVTQQKECDEFMCKLDGTENKSSLGANAILGVSLAICKAGAAELGIPLYRHIANLAN-YDE
VILPVPAFNVINGGSHAGNKLAMQEFMILPTGATSFTEAMRMGTEVYHHLKAVIKARFGLDATAVGDEGGFA
PNILNNKDALDLIQEAIKKAGYTG--KIEIGMDVAASEFYKQNNIYDLDFKTANNDGSQKISGDQLRDMYME
FCKDFPIVSIEDPFDQDDWETWSKMTSGTT--IQIVGDDLTVTNPKRITTAVEKKACKCLLLKVNQIGSVTE
SIDAHLLAKKNGWGTMVSHRSGETEDCFIADLVVGLCTGQIKTGAPCRSERLAKYNQILRIEEELGSGAKFA
GKNFRAPS----
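
Before submitting, it can help to confirm locally that an alignment is well formed. The following is a minimal Python sketch, not part of the server: the function names `read_fasta` and `first_pair` are hypothetical, and the equal-length check reflects the general definition of an alignment rather than any documented server behaviour.

```python
def read_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def first_pair(text):
    """Return the first two records; later records are ignored.

    Raises ValueError if fewer than two sequences are present or if the
    two aligned sequences (gaps included) differ in length.
    """
    records = read_fasta(text)
    if len(records) < 2:
        raise ValueError("the alignment must contain at least two sequences")
    (name_a, seq_a), (name_b, seq_b) = records[:2]
    if len(seq_a) != len(seq_b):
        raise ValueError(
            "aligned sequences differ in length: %d vs %d" % (len(seq_a), len(seq_b))
        )
    return (name_a, seq_a), (name_b, seq_b)
```

With the default order, the first returned record is the target and the second the template.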

Template
A PDB file matching the template in the alignment is required. This PDB file can either be selected from the local PDB database by providing a valid ID, or uploaded using the appropriate button. If the PDB file contains more than one chain, the chain identifier has to be supplied; otherwise the first chain will be selected. The chain may contain more amino acids than the aligned sequence, as long as the latter can be identified within it.

NB: Please make sure that the Chain selection is either left blank or filled with a valid identifier for the server to produce meaningful results.
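
To check locally that a template sequence can be found in a given chain, something like the following Python sketch may be used. It reads one residue per C-alpha atom from the fixed-column ATOM records of a PDB file. The function names are hypothetical, and the exact-substring test is an assumption: how the server itself matches sequence to structure is not described here and may be more tolerant.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def chain_sequence(pdb_text, chain_id=None):
    """One-letter sequence of a chain, read from the CA atoms of ATOM records.

    If chain_id is None or empty, the first chain encountered is used.
    Only the first model of a multi-model file is read.
    """
    residues = []
    seen = set()
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break
        line = line.ljust(27)
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        chain = line[21]
        if not chain_id:
            chain_id = chain
        if chain != chain_id:
            continue
        # Residue number plus insertion code; skips alternate locations.
        key = (line[22:26], line[26])
        if key in seen:
            continue
        seen.add(key)
        residues.append(THREE_TO_ONE.get(line[17:20], "X"))
    return "".join(residues)


def template_matches(aligned_template, pdb_text, chain_id=None):
    """True if the ungapped template sequence occurs within the chain.

    Assumption: an exact substring match is required.
    """
    ungapped = aligned_template.replace("-", "")
    return ungapped in chain_sequence(pdb_text, chain_id)
```

Residues with non-standard names are reported as `X`, so a template containing modified residues will not match under this strict test.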

Options
If the loop modeling option is checked, the server will try to model all insertions and deletions using the fast loop modelling method LOBO, returning a model that matches the query sequence. Otherwise there may be some breaks in the chain continuity (i.e. deletions) and some residues (i.e. insertions) may be missing.
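
The insertions and deletions implied by an alignment can be listed in advance with a short Python sketch such as the one below. It is an illustration only, and `indel_segments` is a hypothetical name. A column with a target residue opposite a template gap counts as an insertion, and a target gap opposite a template residue as a deletion. The sketch does not distinguish terminal overhangs from internal loops, and how the server treats them is not stated here.

```python
def indel_segments(target, template, gap="-"):
    """Return (insertions, deletions) as lists of (start_column, length).

    Columns are 0-based positions in the alignment, gaps included.
    """
    if len(target) != len(template):
        raise ValueError("aligned sequences must have equal length")

    insertions, deletions = [], []
    kind, start = None, 0

    def close(end):
        # Store the segment that ends just before column `end`.
        if kind == "ins":
            insertions.append((start, end - start))
        elif kind == "del":
            deletions.append((start, end - start))

    for i, (t, p) in enumerate(zip(target, template)):
        if t != gap and p == gap:
            current = "ins"   # target residue without template coordinates
        elif t == gap and p != gap:
            current = "del"   # template residue absent from the target
        else:
            current = None
        if current != kind:
            close(i)
            kind, start = current, i
    close(len(target))
    return insertions, deletions
```

Long or numerous segments are the cases in which loop modelling is likely to dominate the running time.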

If the sidechain placement option is checked, the server will place and optimize all amino acid sidechain positions using SCWRL 3.0. Conserved (i.e. identical) sidechains are always copied from the template structure, as this ensures that the best geometry is maintained for functionally important residues. Non-conserved sidechains are only modelled if the sidechain placement option is selected.
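
To see which positions count as conserved in a given alignment, the identical columns can be listed with a few lines of Python. This is a sketch under the definition given above (conserved means identical); `conserved_columns` and `percent_identity` are hypothetical helper names, and the identity figure is computed over columns where both sequences have a residue.

```python
def conserved_columns(target, template, gap="-"):
    """0-based alignment columns where both sequences carry the same residue."""
    return [
        i
        for i, (t, p) in enumerate(zip(target, template))
        if t == p and t != gap
    ]


def percent_identity(target, template, gap="-"):
    """Identical columns as a percentage of columns aligned without gaps."""
    aligned = sum(
        1 for t, p in zip(target, template) if t != gap and p != gap
    )
    if aligned == 0:
        return 0.0
    return 100.0 * len(conserved_columns(target, template, gap)) / aligned
```

All remaining aligned positions are non-conserved and therefore only receive sidechains when the sidechain placement option is selected.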

If the E-mail notification option is checked, an E-mail will be sent to the user once the modelling process has finished. This is especially useful if the loop modelling option is selected, as the server may need up to a couple of hours to complete the job (for long proteins with many indels).

Output
The output of the HOMER server can be divided into four parts. The top left part contains links to the input parameters and to the transcript of the modelling session. The transcript contains the output of the HOMER modelling software during model construction. Any low-level errors are reported there, along with the exact details of modelled insertions and deletions.

The central part of the output page shows the FRST energy validation output, complete with partial and composite scores (see the FRST server help for details), and the per-residue energy profile of the modelled structure as both TEXT and PDF files. Residues with high energy are more likely to contain structural errors.

The annotated Jalview alignment shows the correspondence between residue positions in the sequence and in the structure of the model. The template sequence, colored by secondary structure element (gold for strands, red for helices), is shown below the target sequence. Positions conserved between the two sequences are colored in blue.

The bottom left part contains the control buttons to select the structure to be visualized using Jmol in the right part. This can be either the template or the model structure. The latter can be colored by the FRST energy Z-score, on a scale ranging from blue to white to red. Residues colored red are more likely to contain structural errors.
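
A blue to white to red scale of this kind can be reproduced with a simple linear interpolation, as in the Python sketch below. The limits of -2 and +2 and the assumption that higher Z-scores map towards red are illustrative choices only; the actual range and mapping used by the server are not documented here, and `zscore_to_rgb` is a hypothetical name.

```python
def zscore_to_rgb(z, z_min=-2.0, z_max=2.0):
    """Map a Z-score to an (R, G, B) tuple on a blue-white-red scale.

    Values outside [z_min, z_max] are clamped. The midpoint is white.
    The default limits are assumptions, not server settings.
    """
    z = max(z_min, min(z_max, z))
    mid = (z_min + z_max) / 2.0
    if z <= mid:
        # Blue (0, 0, 255) at z_min fading to white at the midpoint.
        f = (z - z_min) / (mid - z_min)
        level = round(255 * f)
        return (level, level, 255)
    # White at the midpoint fading to red (255, 0, 0) at z_max.
    f = (z - mid) / (z_max - mid)
    level = round(255 * (1.0 - f))
    return (255, level, level)
```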

Examples
Below is a link to sample output of the HOMER server.

Example 1   -    T0111 was a modelling target in the CASP-4 blind test in 2000. The output was generated from the sample alignment given above (see Alignment) with the PDB structure 1PDY selected as template. The latter can also be found at the RCSB PDB homepage.

References

If you use the server in work leading to publications, please cite:
  • Web server:
    Silvio C.E. Tosatto.
    The VICTOR Package for 3D Protein Structure Modelling.
    Submitted.


Additional references to components used by the server are:
  • FRST:
    Silvio C.E. Tosatto.
    The Victor/FRST Function for Model Quality Estimation.
    Journal of Computational Biology 2005; 12(10):1316-1327.

  • LOBO:
    Silvio C.E. Tosatto, Eckart Bindewald, Jürgen Hesser, Reinhard Männer.
    A Divide and Conquer Approach to Fast Loop Modeling.
    Protein Engineering 2002; 15(4):279-286.

  • SCWRL:
    Adrian A. Canutescu, Andrew A. Shelenkov, Roland L. Dunbrack Jr.
    A graph-theory algorithm for rapid protein side-chain prediction.
    Protein Science 2003; 12(9):2001-2014.

  • Jalview:
    Michele Clamp, James Cuff, Stephen M. Searle, Geoffrey Barton.
    The Jalview Java alignment editor.
    Bioinformatics 2004; 20(3):426-427.
    http://www.jalview.org/

  • Jmol:
    http://jmol.sourceforge.net/



Silvio Tosatto   08 / 2005