BioComputing

    RING - Residue Interaction Network Generator

Network files for Cytoscape


Specifications for RING output files encoding residue interaction networks

RING stores each interaction network and its calculated attributes in files using the Simple Interaction Format (SIF) recognized by CYTOSCAPE.

For this purpose we have established naming conventions for nodes, edges and the files that encode the network. The name of every file with extension .sif must contain the word pdb followed by the 4-character PDB ID code identifying the structure from which the network was derived.

For example, if the network is created from a PDB file with code 1abc, the shortest valid name is pdb1abc.sif. More information can be stored in the .sif file name, provided it is separated from the PDB ID by the symbol "_". For example, a compatible name is:

net_pdb1abc_new.sif
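
The naming rule can be checked mechanically. The following is a minimal sketch, not part of RING itself: the function name pdb_id_from_sif_name is hypothetical, and it assumes that the PDB ID is exactly the four alphanumeric characters after the word pdb and that any extra information is separated by "_".

```python
import re

# Assumption: "pdb" starts the name or follows "_", and is followed by a
# four-character ID; extra information after the ID must start with "_".
_SIF_NAME = re.compile(r"(?:^|_)pdb([0-9A-Za-z]{4})(?:_[^/\\]*)?\.sif$")


def pdb_id_from_sif_name(file_name):
    """Return the PDB ID embedded in a .sif file name, or None if absent."""
    match = _SIF_NAME.search(file_name)
    return match.group(1) if match else None


print(pdb_id_from_sif_name("net_pdb1abc_new.sif"))  # 1abc
print(pdb_id_from_sif_name("pdb1abc.sif"))          # 1abc
print(pdb_id_from_sif_name("network.sif"))          # None
```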

Node names must be unique because in the SIF format nodes with identical names are handled by CYTOSCAPE as the same node. RING lets the user choose between two node name types, a short one (default) and a long one. In the long type each residue is uniquely identified by: the word pdb followed by the four-character PDB code of the structure it comes from, the one-letter identifier of the chain to which it belongs, the PDB residue number, the insertion code (if present) and the three-letter residue name. Fields must be separated by ":"; if a field is missing, it is replaced with "_". An example of this format is as follows:

pdb1abc:A:42:_:GLU

This identifies residue 42 of chain A (Glu) in the PDB file "1abc", with no insertion code.

In the short node name type, each residue is identified by: the one-letter identifier of the chain to which it belongs, the PDB residue number, the insertion code (if present) and the three-letter residue name. As in the long type, fields must be separated by ":"; if a field is missing, it is replaced with "_". An example of this format is as follows:

A:42:_:GLU

This identifies the same amino acid as in the previous example.
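
Both node name types can be split on ":" to recover the individual fields. The sketch below is illustrative only: the Residue tuple and the function parse_node_name are hypothetical names, and it assumes that a long name has five fields, a short name has four, and that the residue number is a plain integer.

```python
from typing import NamedTuple, Optional


class Residue(NamedTuple):
    pdb_id: Optional[str]          # only available in the long name type
    chain: Optional[str]
    number: int
    insertion_code: Optional[str]
    name: str


def _blank_to_none(value):
    """Map the "_" placeholder for a missing field to None."""
    return None if value == "_" else value


def parse_node_name(node_name):
    """Split a short or long node name into its fields."""
    fields = node_name.split(":")
    if len(fields) == 5:           # long type, e.g. pdb1abc:A:42:_:GLU
        source, chain, number, icode, name = fields
        pdb_id = source[3:] if source.startswith("pdb") else source
    elif len(fields) == 4:         # short type, e.g. A:42:_:GLU
        chain, number, icode, name = fields
        pdb_id = None
    else:
        raise ValueError("unexpected node name: %r" % node_name)
    return Residue(pdb_id, _blank_to_none(chain), int(number),
                   _blank_to_none(icode), name)


print(parse_node_name("pdb1abc:A:42:_:GLU"))
print(parse_node_name("A:42:_:GLU"))
```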

An example of a connection between two nodes is:

A:42:_:GLU <edge type> A:57:_:PRO

It identifies the link between residue 42 of chain A (Glu) and residue 57 of the same chain (Pro) from the PDB file "1abc", both with no insertion code. The type of connection is specified by <edge type>. The format is directional: if the order of the source and target nodes is swapped, the line refers to a different connection.

The "tab" character should be used to separate the node names from the connection type, as specified by the SIF format.

The name of each link is defined so that it clearly specifies the type and subtype of interaction. Interaction types are as follows:

RESINT    Interaction between C-alpha atoms
IAC    Interaction between closest atoms
CNT    van der Waals interaction
HBOND    Hydrogen bond
IONIC    Salt bridge
PICATION    π-cation interaction
PIPISTACK    π-π interaction
SSBOND    Disulfide bridge
PB    Peptide Bond

Interaction subtypes, which represent the interaction sites to which the respective atoms belong, are:

mc-mc    Interaction between main chain atoms
mc-sc    Interaction between a main chain atom and a side chain atom
sc-sc    Interaction between side chain atoms

Thus, an example of an interaction written as interactionType:interactionSubtype for the pair of nodes in the previous example could be:

A:42:_:GLU hbond:mc-mc A:57:_:PRO

This example specifies that between residues Glu 42 and Pro 57 of the chain encoded in the PDB file "1abc", there is a hydrogen bond formed by their main chain atoms.
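
A line of this kind can be split into source node, interaction type, interaction subtype and target node. The following sketch is an illustration under stated assumptions: the function name parse_sif_edge is hypothetical, fields are tab-delimited as described above, and each line holds exactly one target node.

```python
def parse_sif_edge(line):
    """Split one tab-delimited SIF line into (source, type, subtype, target)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3:
        # Assumption: one source, one interaction name, one target per line.
        raise ValueError("expected three tab-separated fields: %r" % line)
    source, interaction, target = fields
    interaction_type, _, subtype = interaction.partition(":")
    return source, interaction_type, subtype or None, target


edge = parse_sif_edge("A:42:_:GLU\thbond:mc-mc\tA:57:_:PRO")
print(edge)

# The format is directional, so the swapped line is a different connection.
swapped = parse_sif_edge("A:57:_:PRO\thbond:mc-mc\tA:42:_:GLU")
print(edge != swapped)
```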

Node and interaction attributes should be placed in separate files. Node attribute files have the extension .na, while interaction attribute files have the extension .ea. See the CYTOSCAPE manual for details about the .sif format.

Each node attribute file starts with a header defining the attribute name, its type (string, double, integer) and any additional meta-information, as specified by the CYTOSCAPE format for attribute files. Each following line contains the name of a node, followed by the sign "=" and the value of the attribute. The fields are separated by the space character.

An example of a short node attribute file according to this specification could be:

ResidueName (class = String)

A:42:_:GLU = GLU
A:57:_:PRO = PRO

This file assigns to each node the 3-letter code of the corresponding residue.
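
An attribute file of this shape can be read into a dictionary keyed by node name. The sketch below is a minimal illustration rather than a reference parser: the function name parse_attribute_text is hypothetical, and it assumes the header has the form "Name (class = Type)" and that each entry uses " = " as the separator.

```python
import re

# Header such as "ResidueName (class = String)".
_HEADER = re.compile(r"^(\S+)\s*\(class\s*=\s*([\w.]+)\)\s*$")
# Assumption: only these classes need conversion; others stay strings.
_CASTS = {"String": str, "Integer": int, "Double": float}


def parse_attribute_text(text):
    """Return (attribute name, {key: typed value}) for one attribute file."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty attribute file")
    header = _HEADER.match(lines[0])
    if header is None:
        raise ValueError("missing attribute header: %r" % lines[0])
    attribute, class_name = header.groups()
    cast = _CASTS.get(class_name.rsplit(".", 1)[-1], str)
    values = {}
    for line in lines[1:]:
        key, separator, raw = line.partition(" = ")
        if not separator:
            raise ValueError("missing ' = ' in line: %r" % line)
        values[key] = cast(raw)
    return attribute, values


sample = """ResidueName (class = String)

A:42:_:GLU = GLU
A:57:_:PRO = PRO
"""
print(parse_attribute_text(sample))
```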

Each interaction attribute file starts with a header defining the attribute name, its type (string, double, integer) and any additional meta-information, as specified by CYTOSCAPE for attribute files. Each following line contains the name of an interaction (source node, interaction type in parentheses, target node), followed by the sign "=" and the value of the attribute. The fields are separated by the space character.

An example of a short interaction attributes file is:

Weight (class = Integer)

A:42:_:GLU (hbond: mc-mc) A:57:_:PRO = 12
A:86:_:SER (hbond: sc-sc) A:88:_:ARG = 4
A:142:_:SER (hbond: mc-mc) A:141:_:SER = 16

In this case, each link is assigned an integer representing the "strength" of the interaction in the network.
The standard defined here was tested in CYTOSCAPE 3.0 and earlier versions.
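
The entries of an interaction attribute file can be matched back to the edges of the network by splitting each line into its edge and its value. This is a sketch under stated assumptions: the function name parse_edge_attribute_line is hypothetical, and spaces inside the parentheses are removed so that "(hbond: mc-mc)" compares equal to the interaction name hbond:mc-mc.

```python
import re

# source, "(interaction)", target, "=", value
_EDGE_LINE = re.compile(r"^(\S+)\s+\(([^)]*)\)\s+(\S+)\s+=\s+(.*)$")


def parse_edge_attribute_line(line):
    """Return ((source, interaction, target), raw value) for one entry."""
    match = _EDGE_LINE.match(line.strip())
    if match is None:
        raise ValueError("not an edge attribute line: %r" % line)
    source, interaction, target, value = match.groups()
    # Assumption: whitespace inside the parentheses is not significant.
    interaction = interaction.replace(" ", "")
    return (source, interaction, target), value


lines = [
    "A:42:_:GLU (hbond: mc-mc) A:57:_:PRO = 12",
    "A:86:_:SER (hbond: sc-sc) A:88:_:ARG = 4",
]
weights = {}
for entry in lines:
    edge, raw = parse_edge_attribute_line(entry)
    weights[edge] = int(raw)   # the Weight attribute is declared as Integer
print(weights)
```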

RING Network Files for Cytoscape

RING collects all files generated for viewing the network in CYTOSCAPE, which follow the specifications described above, together with the multiple sequence alignment in FASTA format and the file for the node degree analysis, in a zip file called Ring_results.zip.

This file contains:

  1. A folder named after the PDB file submitted to the server. Its name can change as follows:

    • If the user entered a valid PDB code, the folder name will contain the 4-character PDB ID.

    • If the user uploaded a PDB file, the folder will be given the generic name 1ABC.

    The name also depends on the option chosen for hydrogen atoms:

    • If the user chose the option Replace all or Keep existing, the directory name will be preceded by the prefix H_.

    • If the user chose the option No correction, there will be no H_ prefix.

  2. The Readme.txt file.

The file Readme.txt contains the instructions needed to load in CYTOSCAPE the required files for the visualization of the network and its attributes; it also contains a memorandum on the encoding specification of those files.

The directory described above contains the following files whose names vary depending on chosen input options:

  1. A file with extension .sif encoding the structure of the network itself.

  2. The folder *_Edge_Atrib which contains all attribute files for network connections.

  3. The folder *_Node_Atrib which contains all attribute files for the nodes.

  4. The folder VIZ-PROPS which contains preset visualization styles ready to be loaded in the Cytoscape styles manager VizMapper, useful to graphically emphasize the different network attributes of nodes and edges.

  5. A file named conservation.fasta containing the multiple sequence alignment in FASTA format produced by running PSI-BLAST against the UniRef90 sequence database.

  6. If the user has selected Node degree sequence alignment, the following files will also be present:

    • A file with .fasta extension containing the multiple sequence alignment filtered according to the increasing degree of the corresponding nodes, as previously described.

    • A file with .anal extension containing information regarding the node degree analysis of the amino acids making up the network, as explained above.

  7. A file called ring.log containing the log of the RING session.

  8. A file called ring-input.txt describing the input parameters for the RING session.

  9. A file named with the corresponding PDB code and the extension .dssp, containing the output of the DSSP program.
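
To get an overview of a downloaded archive, its members can be grouped by file extension. The sketch below uses only the Python standard library; the function name group_archive_members is hypothetical, and nothing is assumed about the exact folder names inside the archive.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def group_archive_members(archive):
    """Group the file names in a results archive by their extension.

    `archive` is a path or a binary file-like object holding the zip file.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):      # skip directory entries
                continue
            groups[PurePosixPath(name).suffix].append(name)
    return dict(groups)
```

Calling group_archive_members("Ring_results.zip") on a downloaded archive would return a dictionary with keys such as ".sif", ".ea" or ".fasta", each mapped to the list of matching member names.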

Files related to the C-alpha network edge attributes (in the subdirectory "Alpha_Edge_Atrib")

There are 11 edge attributes for this type of network, each in a different file. Attribute file names and contents are listed below:

  1. Alpha_Edge_Atrib_Atom_Type.ea indicates the closest atoms that determine the interaction (will always be C-alpha).

  2. Alpha_Edge_Atrib_Closest_Distance.ea contains the measured distance between the atoms that determine the interaction (i.e. distance between the C-alphas).

  3. Alpha_Edge_Atrib_is_HBond.ea indicates if there is at least one hydrogen bond between the pair in contact.

  4. Alpha_Edge_Atrib_is_Salt_Bridge.ea indicates if there is a salt bridge between the pair in contact.

  5. Alpha_Edge_Atrib_is_PiCation_Interaction.ea indicates if there is a π-cation interaction between the pair in contact.

  6. Alpha_Edge_Atrib_is_PiPi_Stack.ea indicates if there is a π-π interaction between the pair in contact.

  7. Alpha_Edge_Atrib_is_Disulfide_Bond.ea indicates if there is a disulfide bridge interaction between the pair in contact.

  8. Alpha_Edge_Atrib_Interaction_Site.ea contains the involved interaction sites of the amino acids.

  9. Alpha_Edge_Atrib_Edge_Weight.ea contains the weights assigned to single interactions.

  10. Alpha_Edge_Atrib_Mutual_Info.ea contains Mutual Information assigned to single interactions.

  11. Alpha_Edge_Atrib_APC.ea contains Average Product Correction assigned to single interactions.

Files related to the C-alpha network node attributes (in the subdirectory "Alpha_Node_Atrib")

There are 12 node attributes for this type of network, each in a different file. Attribute file names and contents are listed below:

  1. Alpha_Node_Atrib_Secondary_Structure.ne indicates the secondary structure conformation of each amino acid in the sequence. The information is derived from the φ and ψ torsion angles. The allowed values are "E" for sheet, "H" for helix, "C" for coil and "" for unknown.

  2. Alpha_Node_Atrib_Secondary_Structure_DSSP3.ne indicates the secondary structure conformation of each amino acid in the sequence according to DSSP in three classes as follows: "H" = helix : DSSP's "H" (alpha helix) + "G" (3-10 helix) + "I" (π-helix) classes; "E" = strand : DSSP's "E" (extended strand) + "B" (beta-bridge) classes; "C" = the rest : DSSP's "T" (turn) + "S" (bend) + "." (the rest).

  3. Alpha_Node_Atrib_Secondary_Structure_DSSP8.ne indicates the secondary structure conformation of each amino acid in the sequence according to DSSP in eight classes as follows: "H" (alpha helix); "G" (3-10 helix); "I" (π-helix); "E" (extended strand); "B" (beta-bridge); "T" (turn); "S" (bend); "." (the rest).

  4. Alpha_Node_Atrib_Solvent_Accesibility.ne indicates the solvent accessibility of each amino acid in the sequence computed by DSSP.

  5. Alpha_Node_Atrib_Conservation.ne indicates the conservation of each amino acid in the sequence computed on the multiple sequence alignment produced by PSI-BLAST on the UniRef90 database.

  6. Alpha_Node_Atrib_Energy_FRST.ne indicates the energy of each amino acid in the sequence computed by FRST.

  7. Alpha_Node_Atrib_Energy_TAP.ne indicates the energy of each amino acid in the sequence computed by TAP.

  8. Alpha_Node_Atrib_Nodes_Degree.ne indicates the node degree.

  9. Alpha_Node_Atrib_ID.ne indicates the complete identifier of the node.

  10. Alpha_Node_Atrib_Bfactor.ne indicates the Cα B-factor of the residues.

  11. Alpha_Node_Atrib_Occupancy.ne indicates the Cα occupancy of the residues.

  12. Alpha_Node_Atrib_Comul_Mutual_Info.ne indicates the Cumulative Mutual Information of each residue.
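
The reduction from the eight DSSP classes to the three classes described in item 2 above can be written as a lookup table. This is a small sketch; the names DSSP8_TO_DSSP3 and dssp8_to_dssp3 are hypothetical and only the class symbols listed above are handled.

```python
# Eight-class DSSP symbol -> three-class symbol, as listed above.
DSSP8_TO_DSSP3 = {
    "H": "H", "G": "H", "I": "H",   # helix classes
    "E": "E", "B": "E",             # strand classes
    "T": "C", "S": "C", ".": "C",   # the rest
}


def dssp8_to_dssp3(states):
    """Convert a string of eight-class DSSP symbols to three classes."""
    return "".join(DSSP8_TO_DSSP3[state] for state in states)


print(dssp8_to_dssp3("HGIEBTS."))  # HHHEECCC
```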

Files related to the Closest Atoms Network edge attributes (in the subdirectory "Closest_allHB_Edge_Atrib")

There are 29 edge attributes reported for this type of network, each contained in a different file. Attribute file names and contents are listed below:

  1. Closest_allHB_Edge_Atrib_Atom_Type.ea contains the closest atoms determining the interaction.

  2. Closest_allHB_Edge_Atrib_Closest_Distance.ea contains the distance measured between the atoms determining the interaction.

  3. Closest_allHB_Edge_Atrib_is_HBond.ea indicates if there is at least one hydrogen bond between the pair in contact.

  4. Closest_allHB_Edge_Atrib_HBond_DA.ea contains the atom donor code – the atom acceptor code.

  5. Closest_allHB_Edge_Atrib_HB_DA_Distance.ea contains the distance between donor and acceptor atoms.

  6. Closest_allHB_Edge_Atrib_HB_DHA_Angle.ea contains the angle between the donor and the acceptor atoms, with the vertex on the hydrogen atom.

  7. Closest_allHB_Edge_Atrib_is_Salt_Bridge.ea indicates if there is a salt bridge between the pair in contact.

  8. Closest_allHB_Edge_Atrib_Salt_Positive_Negative.ea contains the 3-letter code of the positively charged amino acid – the 3-letter code of the negatively charged amino acid.

  9. Closest_allHB_Edge_Atrib_Salt_Distance.ea contains the distance between the mass centers of the charged groups.

  10. Closest_allHB_Edge_Atrib_Salt_Angle.ea contains the ρ angle of the bridge.

  11. Closest_allHB_Edge_Atrib_is_PiPi_Stack.ea indicates if there is a π-π interaction between the pair in contact.

  12. Closest_allHB_Edge_Atrib_PiPi_Stack_N.ea contains the value of n.

  13. Closest_allHB_Edge_Atrib_PiPi_Stack_P.ea contains the value of p.

  14. Closest_allHB_Edge_Atrib_PiPi_Stack_Ring_Angle.ea contains the value of the θ angle.

  15. Closest_allHB_Edge_Atrib_PiPi_Stack_Orientation.ea is the final spatial conformation of the pair.

  16. Closest_allHB_Edge_Atrib_is_PiCation_Interaction.ea indicates if there is a π-cation interaction between the pair in contact.

  17. Closest_allHB_Edge_Atrib_PiCation_C-PI.ea contains the 3-letter code of the positively charged amino acid – the 3-letter code of the aromatic amino acid.

  18. Closest_allHB_Edge_Atrib_PiCation_Distance.ea contains the distance between the cation mass center and the closest atom of the aromatic system.

  19. Closest_allHB_Edge_Atrib_PiCation_Angle.ea contains the angle between the cation and the aromatic system.

  20. Closest_allHB_Edge_Atrib_PiCation_Is_Guanidinium_ion.ea indicates if a guanidinium ion is present.

  21. Closest_allHB_Edge_Atrib_PiCation_Guanidinium_Angle.ea contains the angle θ between the plane defined by the guanidinium ion and that defined by the π-system.

  22. Closest_allHB_Edge_Atrib_PiCation_Guanidinium_Orientation.ea contains the mutual positions of the aromatic systems and the guanidinium ion.

  23. Closest_allHB_Edge_Atrib_is_Disulfide_Bond.ea indicates if there is a disulfide bridge between the pair in contact.

  24. Closest_allHB_Edge_Atrib_Disulfide_Distance.ea contains the calculated distance of the bridge.

  25. Closest_allHB_Edge_Atrib_Disulfide_Angle.ea contains the angle χ of the bridge.

  26. Closest_allHB_Edge_Atrib_Interaction_Site.ea contains the involved interaction sites of the amino acids.

  27. Closest_allHB_Edge_Atrib_Edge_Weight.ea contains the weights assigned to single interactions.

  28. Closest_allHB_Edge_Atrib_Mutual_Info.ea contains the Mutual Information assigned to single interactions.

  29. Closest_allHB_Edge_Atrib_APC.ea contains the Average Product Correction assigned to single interactions.
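
The attribute file names listed in this document follow a common pattern: a network prefix, the word Edge or Node, the marker Atrib, and the attribute name. The sketch below splits a name along that pattern; the function name split_attribute_file_name is hypothetical, and the pattern itself is an assumption read off the file names above.

```python
def split_attribute_file_name(file_name):
    """Return (network prefix, kind, attribute, extension) for a file name."""
    stem, _, extension = file_name.rpartition(".")
    network, marker, attribute = stem.partition("_Atrib_")
    if not marker:
        raise ValueError("no '_Atrib_' marker in %r" % file_name)
    prefix, _, kind = network.rpartition("_")   # kind is "Edge" or "Node"
    return prefix, kind, attribute, extension


print(split_attribute_file_name("Closest_allHB_Edge_Atrib_PiCation_C-PI.ea"))
print(split_attribute_file_name("Alpha_Node_Atrib_ID.ne"))
```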

Files related to the Closest Atoms Network node attributes (in the subdirectory "Closest_allHB_Node_Atrib")

There are 12 node attributes for this type of network, each in a different file. Attribute file names and contents are listed below:

  1. Closest_allHB_Node_Atrib_Secondary_Structure.ne indicates the secondary structure conformation of each amino acid in the sequence. The information is derived from the φ and ψ torsion angles. The allowed values are "E" for sheet, "H" for helix, "C" for coil and "" for unknown.

  2. Closest_allHB_Node_Atrib_Secondary_Structure_DSSP3.ne indicates the secondary structure conformation of each amino acid in the sequence according to DSSP in three classes as follows: "H" = helix : DSSP's "H" (alpha helix) + "G" (3-10 helix) + "I" (π-helix) classes; "E" = strand : DSSP's "E" (extended strand) + "B" (beta-bridge) classes; "C" = the rest : DSSP's "T" (turn) + "S" (bend) + "." (the rest).

  3. Closest_allHB_Node_Atrib_Secondary_Structure_DSSP8.ne indicates the secondary structure conformation of each amino acid in the sequence according to DSSP in eight classes as follows: "H" (alpha helix); "G" (3-10 helix); "I" (π-helix); "E" (extended strand); "B" (beta-bridge); "T" (turn); "S" (bend); "." (the rest).

  4. Closest_allHB_Node_Atrib_Solvent_Accesibility.ne indicates the solvent accessibility of each amino acid in the sequence computed by DSSP.

  5. Closest_allHB_Node_Atrib_Conservation.ne indicates the conservation of each amino acid in the sequence computed on the multiple sequence alignment produced by PSI-BLAST on the UniRef90 database.

  6. Closest_allHB_Node_Atrib_Energy_FRST.ne indicates the energy of each amino acid in the sequence computed by FRST.

  7. Closest_allHB_Node_Atrib_Energy_TAP.ne indicates the energy of each amino acid in the sequence computed by TAP.

  8. Closest_allHB_Node_Atrib_Degree.ne indicates the node degree.

  9. Closest_allHB_Node_Atrib_ID.ne indicates the complete identifier of the node.

  10. Closest_allHB_Node_Atrib_Bfactor.ne indicates the Cα B-factor of the residues.

  11. Closest_allHB_Node_Atrib_Occupancy.ne indicates the Cα occupancy of the residues.

  12. Closest_allHB_Node_Atrib_Comul_Mutual_Info.ne indicates the Cumulative Mutual Information of the residues.

Files related to the van der Waals Network edge attributes (in the subdirectory "VdW_allHB_Edge_Atrib")

There are 30 edge attributes reported for this type of network, each contained in a different file. Attribute file names and contents are listed below:

  1. VdW_allHB_Edge_Atrib_Atom_Type.ea contains the closest atoms.

  2. VdW_allHB_Edge_Atrib_Closest_Distance.ea contains the distance measured between the closest atoms.

  3. VdW_allHB_Edge_Atrib_is_HBond.ea indicates if there is at least one hydrogen bond between the pair in contact.

  4. VdW_allHB_Edge_Atrib_HBond_DA.ea contains the atom donor code – the atom acceptor code.

  5. VdW_allHB_Edge_Atrib_HB_DA_Distance.ea contains the distance between donor and acceptor atoms.

  6. VdW_allHB_Edge_Atrib_HB_DHA_Angle.ea contains the angle between the donor and the acceptor atoms, with the vertex on the hydrogen atom.

  7. VdW_allHB_Edge_Atrib_is_Salt_Bridge.ea indicates if there is a salt bridge between the pair in contact.

  8. VdW_allHB_Edge_Atrib_Salt_Positive_Negative.ea contains the 3-letter code of the positively charged amino acid – the 3-letter code of the negatively charged amino acid.

  9. VdW_allHB_Edge_Atrib_Salt_Distance.ea contains the distance between the mass centers of the charged groups.

  10. VdW_allHB_Edge_Atrib_Salt_Angle.ea contains the ρ angle of the bridge.

  11. VdW_allHB_Edge_Atrib_is_PiPi_Stack.ea indicates if there is a π-π interaction between the pair in contact.

  12. VdW_allHB_Edge_Atrib_PiPi_Stack_N.ea contains the value of n.

  13. VdW_allHB_Edge_Atrib_PiPi_Stack_P.ea contains the value of p.

  14. VdW_allHB_Edge_Atrib_PiPi_Stack_Ring_Angle.ea contains the value of the θ angle.

  15. VdW_allHB_Edge_Atrib_PiPi_Stack_Orientation.ea is the final spatial conformation of the pair.

  16. VdW_allHB_Edge_Atrib_is_PiCation_Interaction.ea indicates if there is a π-cation interaction between the pair in contact.

  17. VdW_allHB_Edge_Atrib_PiCation_C-PI.ea contains the 3-letter code of the positively charged amino acid – the 3-letter code of the aromatic amino acid.

  18. VdW_allHB_Edge_Atrib_PiCation_Distance.ea contains the distance between the cation mass center and the closest atom of the aromatic system.

  19. VdW_allHB_Edge_Atrib_PiCation_Angle.ea contains the angle between the cation and the aromatic system.

  20. VdW_allHB_Edge_Atrib_PiCation_Is_Guanidinium_ion.ea indicates if a guanidinium ion is present.

  21. VdW_allHB_Edge_Atrib_PiCation_Guanidinium_Angle.ea contains the angle θ between the plane defined by the guanidinium ion and that defined by the π-system.

  22. VdW_allHB_Edge_Atrib_PiCation_Guanidinium_Orientation.ea contains the mutual positions of the aromatic systems and the guanidinium ion.

  23. VdW_allHB_Edge_Atrib_is_Disulfide_Bond.ea indicates if there is a disulfide bridge between the pair in contact.

  24. VdW_allHB_Edge_Atrib_Disulfide_Distance.ea contains the calculated distance of the bridge.

  25. VdW_allHB_Edge_Atrib_Disulfide_Angle.ea contains the angle χ of the bridge.

  26. VdW_allHB_Edge_Atrib_Interaction_Site.ea contains the involved interaction sites of the amino acids.

  27. VdW_allHB_Edge_Atrib_Edge_Weight.ea contains the weights assigned to single interactions.

  28. VdW_allHB_Edge_Atrib_Van_der_Waals_contacts.ea contains the van der Waals contact scores by PROBE.

  29. VdW_allHB_Edge_Atrib_Mutual_Info.ea contains the Mutual Information.

  30. VdW_allHB_Edge_Atrib_APC.ea contains Average Product Correction.

Files related to the van der Waals Network node attributes (in the subdirectory "VdW_allHB_Node_Atrib")

There are 12 node attributes for this type of network, each in a different file. Attribute file names and contents are listed below:

  1. VdW_allHB_Node_Atrib_Secondary_Structure.ne indicates the secondary structure conformation of each amino acid in the sequence. The information is derived from the φ and ψ torsion angles. The allowed values are "E" for sheet, "H" for helix, "C" for coil and "" for unknown.

  2. VdW_allHB_Node_Atrib_Secondary_Structure_DSSP3.ne indicates the secondary structure conformation of each amino acid in the sequence according to DSSP in three classes as follows: "H" = helix : DSSP's "H" (alpha helix) + "G" (3-10 helix) + "I" (π-helix) classes; "E" = strand : DSSP's "E" (extended strand) + "B" (beta-bridge) classes; "C" = the rest : DSSP's "T" (turn) + "S" (bend) + "." (the rest).

  3. VdW_allHB_Node_Atrib_Secondary_Structure_DSSP8.ne indicates the secondary structure conformation of each amino acid in the sequence according to DSSP in eight classes as follows: "H" (alpha helix); "G" (3-10 helix); "I" (π-helix); "E" (extended strand); "B" (beta-bridge); "T" (turn); "S" (bend); "." (the rest).

  4. VdW_allHB_Node_Atrib_Solvent_Accesibility.ne indicates the solvent accessibility of each amino acid in the sequence computed by DSSP.

  5. VdW_allHB_Node_Atrib_Conservation.ne indicates the conservation of each amino acid in the sequence computed on the multiple sequence alignment produced by PSI-BLAST on the UniRef90 database.

  6. VdW_allHB_Node_Atrib_Energy_FRST.ne indicates the energy of each amino acid in the sequence computed by FRST.

  7. VdW_allHB_Node_Atrib_Energy_TAP.ne indicates the energy of each amino acid in the sequence computed by TAP.

  8. VdW_allHB_Node_Atrib_Degree.ne indicates the node degree.

  9. VdW_allHB_Node_Atrib_ID.ne indicates the complete identifier of the node.

  10. VdW_allHB_Node_Atrib_Bfactor.ne indicates the Cα B-factor of the residues.

  11. VdW_allHB_Node_Atrib_Occupancy.ne indicates the Cα occupancy of the residues.

  12. VdW_allHB_Node_Atrib_Comul_Mutual_Info.ne indicates the Cumulative Mutual Information of the residues.


A.J.M. Martin,   11 / 2010