PROFASI  Version 1.5
prf::AtomLabelDictionary Class Reference

A translator for atom labels from other conventions.

#include <AtomLabelDictionary.hh>

Public Member Functions

bool interpret (std::string three_letter_code, std::string &atomlabel)
 Interpret for a given residue what atomlabel means.
 
bool interpret (prf::OneLetterCode olc, std::string &atomlabel)
 Interpret for a given residue what atomlabel means.
 
int learn (std::vector< std::set< std::string > > &new_labels)
 Learn new labels from a bunch of label sets, one for each kind of residue.
 

Detailed Description

Sometimes one encounters atom labels that are not consistent with the conventions used in ProFASi. This happens in particular for hydrogen atoms, because PDB files label them in many different ways.

Here is the ProFASi convention: (i) A hydrogen atom is labeled with the character 'H' and two indices. (ii) The first index, a subscript, is the same as the index of the heavy atom it is attached to. The subscript is written immediately to the right of the atom type label 'H'. For instance, the index 'A' in "HA" comes from "CA", the heavy atom to which the HA is attached. (iii) If more than one hydrogen atom is attached to the same heavy atom, they are distinguished by a second index, a superscript. Superscripts are written to the left of the atom type label 'H'. Superscript numbering starts from 1.

Notice that there cannot be a label "HG11" or " HB3" with this convention. The two indices attached to the hydrogen are fundamentally of different kinds, and we insist that they be written on different sides of the 'H' for clarity.

Since we nevertheless want to be able to read PDB files generated by other programs, this class provides a translation to the ProFASi conventions. Some labels, like "HG11", are easily interpreted: perform a cyclic rotation of the characters to put the 'H' in the right place. But this does not always work. In many cases, when there are two hydrogen atoms attached to, for instance, a CB atom, they are labeled " HB2" and " HB3", i.e., the numbering is not meant to describe the hydrogens alone. Also, some translations have to be residue specific.
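
The cyclic rotation mentioned above, and the way it goes wrong, can be illustrated with a minimal sketch. The helpers strip_blanks() and rotate_hydrogen_label() are hypothetical names introduced here for illustration; they are not part of the ProFASi API, and the dictionary itself does not work this way.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not part of ProFASi: removes the padding blanks
// that fixed-width PDB atom name fields may carry (" HB3" -> "HB3").
std::string strip_blanks(const std::string &label)
{
    std::string out;
    for (char c : label) {
        if (c != ' ') out += c;
    }
    return out;
}

// Hypothetical helper, not part of ProFASi: if the label starts with 'H'
// and ends with a digit, move that trailing digit to the front.
// Returns true if the label was rotated, false if it was left unchanged.
bool rotate_hydrogen_label(std::string &label)
{
    std::string s = strip_blanks(label);
    if (s.size() < 3 || s.front() != 'H') return false;
    if (!std::isdigit(static_cast<unsigned char>(s.back()))) return false;
    // Cyclic rotation by one position: last character goes first.
    label = s.back() + s.substr(0, s.size() - 1);
    return true;
}
```

Applied to "HG11" this yields "1HG1", which has the superscript on the left and the subscript on the right as the convention requires. Applied to " HB3" it yields "3HB", which does not fit a superscript numbering that starts from 1 when only two hydrogens are attached to the CB atom. A purely mechanical rule is therefore not sufficient.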

Therefore, all translations are done using a simple table lookup. There is a learn() function that can be used to train the dictionary to recognize new labels. The updated dictionary is then saved as a file ".profasi_dictionary" in the user's home directory. The next time the same label is encountered, the learned value is used.
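
A residue-specific table lookup of this kind could be sketched as follows. The class LabelTable is a hypothetical stand-in, not the ProFASi implementation; its interpret() only mirrors the shape of the documented signature. That the return value signals whether a translation was found is an assumption, and the entries used in the usage note are illustrative only.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the dictionary; not the ProFASi implementation.
class LabelTable {
public:
    // Register a translation of a foreign label for one residue type.
    void add(const std::string &residue, const std::string &foreign,
             const std::string &profasi)
    {
        table_[std::make_pair(residue, foreign)] = profasi;
    }

    // On success the label is replaced in place and true is returned;
    // otherwise the label is left untouched and false is returned.
    // (The meaning of the return value is an assumption.)
    bool interpret(const std::string &residue, std::string &atomlabel) const
    {
        auto it = table_.find(std::make_pair(residue, atomlabel));
        if (it == table_.end()) return false;
        atomlabel = it->second;
        return true;
    }

private:
    // Key: (three-letter residue code, foreign label). Value: ProFASi label.
    std::map<std::pair<std::string, std::string>, std::string> table_;
};
```

With illustrative entries such as add("SER", "HB2", "1HB") and add("SER", "HB3", "2HB"), a call to interpret("SER", label) with label set to "HB3" would replace it with "2HB", while the same label for a residue without an entry would be left as it is. Keying on the residue as well as the label is what allows translations to be residue specific.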

Member Function Documentation

int prf::AtomLabelDictionary::learn ( std::vector< std::set< std::string > > &  new_labels)

The size of the vector should be Groups::max_grp, that is, the total number of different ligands/groups known to ProFASi. Its elements are accessed using the one letter codes for the groups. For each group there is a set of labels which can be interactively modified. Often, many of these sets will be empty. For each residue with a non-empty set of labels, the user is presented with a list of labels understood by ProFASi. The user then maps the unknown labels to known labels. The user chooses when editing stops and the function quits. The return value is the actual number of changes due to the user interaction.
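
The shape of the argument expected by learn() can be sketched without ProFASi itself. Everything below is hypothetical: kMaxGroups is a placeholder for Groups::max_grp (whose real value comes from ProFASi), plain integer indices stand in for the one letter codes, and note_unknown() and groups_needing_input() are illustrative helpers only.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Placeholder for Groups::max_grp; the value 4 is arbitrary.
const std::size_t kMaxGroups = 4;

// One set of unrecognized labels per group, indexed by group code.
using LabelSets = std::vector<std::set<std::string>>;

// Record an unrecognized label for the group with the given index.
// Returns false if the index is out of range.
bool note_unknown(LabelSets &sets, std::size_t group, const std::string &label)
{
    if (group >= sets.size()) return false;
    sets[group].insert(label);  // a set ignores repeated labels
    return true;
}

// Count the groups that have at least one label left to resolve.
std::size_t groups_needing_input(const LabelSets &sets)
{
    std::size_t n = 0;
    for (const auto &s : sets) {
        if (!s.empty()) ++n;
    }
    return n;
}
```

A caller would start from LabelSets sets(kMaxGroups), so that every group has a (possibly empty) set, and fill in labels as unrecognized ones are encountered. Using a set per group means that a label seen many times in a file is presented to the user only once.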


The documentation for this class was generated from the following files:

PROFASI: Protein Folding and Aggregation Simulator, Version 1.5
© (2005-2016) Anders Irbäck and Sandipan Mohanty
Documentation generated on Mon Jul 18 2016 using Doxygen version 1.8.2