PROFASI  Version 1.5
extract_snapshot Class Reference

Extracting a snapshot configuration.

#include <extract_snapshot.hh>

Detailed Description

The program extract_snapshot browses through the saved trajectory information from a run of a typical PROFASI simulation program such as BasicMCRun, SimTempRun or ParTempRun, and extracts the state of the population at a requested point. The output can be a pdb file, an xml PROFASI structure description file, or a text or binary structure file containing only the population information.

Typically, because of job time limits on clusters and supercomputers, a simulation requires many restarts. Every stage of such a run opens a new binary data file with a name "conf..." inside the run directory (n0, n1 etc.). In addition, with PROFASI version 1.5, there are trajectory metadata files called "traj" in the run directories. They store information about the ranges, write intervals and number of blocks present in the different conf files. For extraction of structures and properties, it is best to use these "traj" files as input to this program. The traj files contain information about which binary file to open to look for the desired structure. But since they hold no structure data themselves, if you move the traj files, you should move all the corresponding "conf..." files along with them.
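Since a traj file only points at the conf files, the two are best copied as a unit. The following is a minimal sketch, assuming that the segments sit next to the traj file and have names starting with "conf", as described above. The function name copy_run_dir is hypothetical and not part of PROFASI.

```shell
#!/bin/sh
# Hypothetical helper (not part of PROFASI): copy a run directory's
# "traj" metadata file together with every "conf..." segment beside it.
# usage: copy_run_dir SOURCE_DIR DESTINATION_DIR
copy_run_dir() {
    src=$1
    dst=$2
    [ -f "$src/traj" ] || { echo "no traj file in $src" >&2; return 1; }
    mkdir -p "$dst" || return 1
    cp "$src/traj" "$dst/" || return 1
    for f in "$src"/conf*; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$f" ] || continue
        cp "$f" "$dst/" || return 1
    done
}
```

For example, "copy_run_dir n19 backup/n19" would leave the traj file and its conf segments side by side in backup/n19. Other files in the run directory are deliberately left alone.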

PROFASI's binary configuration files store information in an assumed order. So, this program will not work if the configuration layout is changed. The original configuration files typically include random number generator information etc. in addition to the state of the population.

If the reason for extracting a structure is to start new runs from it, we recommend using the xml structure file, as it contains sequence information and works as a self-contained, human-readable record of the structure, unlike the textconf or the binary configuration files. Runs can be started with the xml configuration by using the "set_population somefile.xml" command, while with the text configuration files, you would need to use the "add_chain ..." commands as well as the "init_config" command. (See Commands to set up the population for more information on these commands.)


extract_snapshot [OPTIONS] input_trajectory_file

Options could be any of ...


$ extract_snapshot -o min_at_1293999.pdb n19/traj -c 1293999
$ extract_snapshot -o min_at_1293999.xml n19/traj -c 1293999
$ extract_snapshot -o min_at_1293999.xml -c 1293999 --raw n0/conf.bkp0 conf.bkp2 conf.bkp1

In the above examples, we extract the state of the population at cycle 1293999 and save it as a pdb or xml file. The last example shows that the binary conf files can be used directly with the raw option. This works, but is not recommended for reasons of efficiency. The format of the data in the output file is inferred from the suffix you provide, so the "-f" option is unnecessary in these examples. In the following case, the "-f" option would be necessary:

$ extract_snapshot -o weird_structure.whatonearth n0/traj -c 87999 -f pdb

Without the "-f pdb" option above, the file "weird_structure.whatonearth" would be written in the (default) xml format. When given, the option "-f" overrides any automatically inferred output file format.
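The rule just described can be written down as a small shell function. This is an illustration of the documented behaviour only, not PROFASI's actual code. The function name output_format is hypothetical, and only the pdb and xml suffixes are modelled, because the suffixes used for the text and binary formats are not stated in this section.

```shell
#!/bin/sh
# Illustration only: mimics the rule described above, not PROFASI's code.
# usage: output_format FILENAME [FORCED_FORMAT]
output_format() {
    name=$1
    forced=${2:-}
    # An explicit format (the role played by "-f") wins over the suffix.
    if [ -n "$forced" ]; then
        echo "$forced"
        return 0
    fi
    case $name in
        *.pdb) echo pdb ;;
        *.xml) echo xml ;;
        *)     echo xml ;;   # unrecognised suffix: fall back to the xml default
    esac
}

output_format min_at_1293999.pdb                # prints: pdb
output_format weird_structure.whatonearth       # prints: xml
output_format weird_structure.whatonearth pdb   # prints: pdb
```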

In order to interpret the binary data in the conf files, information about the layout of data in these files is needed. From PROFASI version 1.5, this information is embedded in the binary files. In versions 1.1 to 1.5, this information was in a separate file called "" in the same directory as the conf file. For hints on extracting information from old runs, made with PROFASI versions that have neither embedded layout information nor layout files, see the section "Analysis of data generated by older PROFASI versions" in Generating trajectory files.

The following (BASH) shell script will create files emin.xml corresponding to the minimum energy configurations from each of the runs n0...nN, and save them in the respective directories. Those population configurations can then be used to start a run with different initial configurations for each of the nodes.

$ for i in `seq 0 N` ; do
> tx=`grep ENERGY n$i/minen.pdb | awk '{print $9;}' ` ;
> extract_snapshot n$i/traj -o n$i/emin.xml -c $tx ;
> done

This works because PROFASI saves energy and Monte Carlo time information as remarks in its pdb output files when it can. Note that some of the quote marks above (used around the seq and grep commands) are back quotes.
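The loop above can be made a little more defensive. The sketch below reads the Monte Carlo time from the ninth whitespace-separated field of the first ENERGY line, exactly as the script above does, and skips run directories where the pdb file or the ENERGY remark is missing. The function names energy_cycle and extract_minima are hypothetical and not part of PROFASI. The sketch uses $(...) instead of back quotes, which is equivalent here.

```shell
#!/bin/sh
# Hypothetical helpers, not part of PROFASI.

# Print the 9th field of the first line containing ENERGY in a pdb file
# (the same field the loop above reads as the Monte Carlo time).
energy_cycle() {
    awk '/ENERGY/ { print $9; exit }' "$1"
}

# usage: extract_minima LAST_RUN_INDEX
# Processes the run directories n0 ... nLAST_RUN_INDEX.
extract_minima() {
    i=0
    while [ "$i" -le "$1" ]; do
        pdb="n$i/minen.pdb"
        if [ -f "$pdb" ]; then
            tx=$(energy_cycle "$pdb")
            if [ -n "$tx" ]; then
                extract_snapshot "n$i/traj" -o "n$i/emin.xml" -c "$tx"
            else
                echo "no ENERGY remark in $pdb, skipping" >&2
            fi
        else
            echo "$pdb not found, skipping" >&2
        fi
        i=$((i + 1))
    done
}
```

With runs n0 to n7, "extract_minima 7" would issue the same extract_snapshot commands as the loop above, one per directory that has a usable minen.pdb.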

Finally, note that if you see this line in the documentation, your version of extract_snapshot is capable of reading binary configurations written on other machines with different binary formats. If you made a run on a cluster of IBM PPC 6 processors, and want to analyze the data on your own laptop with an old Intel Pentium M in it, you don't have to convert the binaries. Use them directly:

$ extract_snapshot -o min_at_1293999.xml --raw n19/conf -c 1293999

The byte order information is one of the things written in the header part of the binary files in recent versions, and in the separate file in older versions of PROFASI. In the above example, using the raw option is not inefficient, as there is only one conf segment. If the cycle 1293999 is found in it, it will be recovered.
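To see which byte order the analysis machine itself uses, a generic shell check such as the following can help. It relies only on the standard printf, od and tr utilities and has nothing to do with how PROFASI reads its own headers. The function name host_byte_order is hypothetical.

```shell
#!/bin/sh
# Generic check, independent of PROFASI: write the two bytes 01 00 and
# read them back as one 2-byte unsigned integer in the host's byte order.
host_byte_order() {
    v=$(printf '\001\000' | od -An -tu2 | tr -d ' \n')
    case $v in
        1)   echo little-endian ;;   # low byte stored first
        256) echo big-endian ;;      # high byte stored first
        *)   echo unknown ;;
    esac
}

host_byte_order
```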

PROFASI: Protein Folding and Aggregation Simulator, Version 1.5
© (2005-2016) Anders Irbäck and Sandipan Mohanty
Documentation generated on Mon Jul 18 2016 using Doxygen version 1.8.2