PROFASI  Version 1.5
Converting binary conf files to gzipped text and back

The program conf_to_gzip can convert back and forth between the binary configuration file layout from PROFASI simulation programs and a gzipped text format.

To continue on machine B a run that was started on machine A, where the two machines have different architectures, or to analyze on machine B data generated on machine A, it is often necessary to read the simulation history stored in the configuration files. This poses an immediate problem: binary files generated on one machine can read as nonsense on another. This program is meant to deal with precisely this situation.
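
As a rough illustration of why raw binary data does not travel well, the sketch below decodes the same eight bytes once as 4-byte integers and once as a single 8-byte integer. Blaming integer width is an assumption made for this example only; the real layout of a PROFASI conf file is described by its conf.info file and is not reproduced here. The helper names sample_bytes and read_as are hypothetical.

```shell
# Illustration only: identical bytes, decoded with different assumptions
# about integer width, give different numbers. This is NOT the actual
# PROFASI conf layout.

# Emit eight fixed bytes (octal escapes keep this portable across shells).
sample_bytes() {
    printf '\001\000\000\000\002\000\000\000'
}

# Decode stdin as signed decimal integers that are $1 bytes wide.
read_as() {
    od -An -t "d$1"
}
```

Running "sample_bytes | read_as 4" and "sample_bytes | read_as 8" prints different values for the same input, which is the kind of mismatch a text representation avoids.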

There are three steps involved:

  1. Use conf_to_gzip to convert the binary file to a gzipped text file containing the same data. Note that the program does not gzip the binary data itself. That would not really "compress" anything: the binary format is already compact, because it takes advantage of a priori knowledge of the data layout, and a general-purpose algorithm like gzip's "deflate" cannot usefully compress it further. Instead, conf_to_gzip interprets the binary data as a simulation would, and then writes a gzipped file of the text representation of that data. So, if you gunzip the file generated by the program, you get an ordinary text file.
  2. Copy the gzip files to your target system. As described above, these are gzipped versions of text files. So, you can transfer them anywhere, and they will still mean the same thing.
  3. Use conf_to_gzip on the target machine to convert the gzip file to a binary configuration file suitable for that machine.
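
Since the transferred file is ordinary text once decompressed, it can be inspected with standard tools at any point in these steps. The following is a minimal sketch using only gzip and head; the function name peek_gz is hypothetical, and nothing is assumed here about what the text lines contain.

```shell
# Print the first $2 lines (default 10) of the gzipped text file $1
# without writing an uncompressed copy to disk.
peek_gz() {
    gzip -dc "$1" | head -n "${2:-10}"
}
```

For example, "peek_gz conf.txt.gz 5" shows the first five lines of the converted file.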

To interpret the binary format, the program needs a map of the data layout in that particular conf file. This information is generated by the same simulation programs, and stored in the same directory where they store their "conf" files. These files are named "conf.info". The program conf_to_gzip requires the location of a "conf.info" file to be able to interpret the conf data correctly.
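
Because a conversion cannot proceed without the matching "conf.info" file, it can be worth checking that both files are present before starting. The sketch below assumes only what is stated above, namely that "conf" and "conf.info" sit in the same directory; the function names has_conf_pair and list_incomplete_dirs are hypothetical.

```shell
# Succeed only if directory $1 holds both a "conf" file and its
# "conf.info" map.
has_conf_pair() {
    [ -f "$1/conf" ] && [ -f "$1/conf.info" ]
}

# Print every directory among the arguments that is missing one of
# the two files.
list_incomplete_dirs() {
    for dir in "$@"; do
        has_conf_pair "$dir" || printf '%s\n' "$dir"
    done
}
```

Calling "list_incomplete_dirs n0 n1 n2" prints nothing when every listed directory is ready for conversion.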

Usage example

Basic usage example:

$ conf_to_gzip -i conf.info conf -o conf.txt.gz

This takes a binary configuration file called "conf" and a file called "conf.info" containing a map of the binary data in conf, and produces a gzipped text file called "conf.txt.gz". On most modern systems, one can simply "less" it and view its contents without uncompressing it. For the reverse process, making a binary file out of a gzipped text configuration file, use:

$ conf_to_gzip -r -i conf.info conf.txt.gz -o conf

Note that the option "i" does not stand for "input"; it stands for "info_file". The input in this second case is conf.txt.gz, and the output is the binary file conf.
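
One way to avoid confusing the info file with the input file is to wrap the two commands shown above in small shell functions with a fixed argument order. This is only a sketch: the function names conf_to_text and text_to_conf and the variable CONF_TO_GZIP are hypothetical, and the only options used are the "-i", "-r" and "-o" options that appear in the commands above.

```shell
# Converter command; override it if the program is not on your PATH.
: "${CONF_TO_GZIP:=conf_to_gzip}"

# Binary -> gzipped text.
# Arguments: info file, binary conf file, output .gz file
conf_to_text() {
    "$CONF_TO_GZIP" -i "$1" "$2" -o "$3"
}

# Gzipped text -> binary (-r reverses the direction).
# Arguments: info file, gzipped text file, output binary conf file
text_to_conf() {
    "$CONF_TO_GZIP" -r -i "$1" "$2" -o "$3"
}
```

With these definitions, "conf_to_text conf.info conf conf.txt.gz" runs the first command above, and "text_to_conf conf.info conf.txt.gz conf" runs the second.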

Suppose you have run ParTemp.mex on a cluster for 3 weeks. While analyzing the data on your laptop, you realize that node 13 visited a very interesting state on cycle 1599999, but that state was not the lowest energy state. You want to use extract_snapshot to get a PDB of that state and see what it looks like. For that, you need the program state histories saved in the files n0/conf, n1/conf, etc., for each compute node. But the cluster consists of AMD Opteron processors, while your laptop has a 32-bit Intel Pentium M processor. The binary conf files are incompatible.

Of course, one solution is to log in to the cluster, and run the extract_snapshot command there. But if this is not possible for some reason, you should proceed as follows...

Whenever you copy your run data from one system to another, you should remember to convert the "conf" files to the native format of the new system. So, before you copy anything, run the following command in the directory containing n0, n1 etc. ...

# replace max_run_index with the highest node index
for j in $(seq 0 max_run_index) ; do
    conf_to_gzip -i n$j/conf.info n$j/conf -o n$j/conf.txt.gz
done

This should create a gzipped text configuration file named "conf.txt.gz" in each directory. You might want to add another line, "rm -f n$j/conf", before the "done" statement to remove the binary conf file. It can be regenerated from the conf.txt.gz file on any system when required.
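
If you do remove the binary files, it is safer to delete each one only after its conversion has demonstrably succeeded. The sketch below is one possible way to do that, not part of PROFASI: the function names convert_and_prune and convert_all and the variable CONF_TO_GZIP are hypothetical. Note that "gzip -t" only checks that the output is an intact gzip file; it says nothing about whether the text inside is correct.

```shell
# Converter command; override it if the program is not on your PATH.
: "${CONF_TO_GZIP:=conf_to_gzip}"

# Convert $1/conf to $1/conf.txt.gz. Delete the binary file only if the
# converter succeeded and the result passes gzip's integrity test.
convert_and_prune() {
    "$CONF_TO_GZIP" -i "$1/conf.info" "$1/conf" -o "$1/conf.txt.gz" &&
        gzip -t "$1/conf.txt.gz" &&
        rm -f "$1/conf"
}

# Convert every node directory given as an argument. Failures are
# reported and the loop keeps going; the return value is non-zero
# if any directory failed.
convert_all() {
    rc=0
    for dir in "$@"; do
        convert_and_prune "$dir" || { echo "failed: $dir" >&2; rc=1; }
    done
    return "$rc"
}
```

Run from the directory containing n0, n1 etc., a call such as "convert_all n0 n1 n2" leaves the binary conf file in place in any directory where the conversion failed.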

Next, copy your run directory to your laptop. Now you can create binary conf files for the laptop, as if they had been natively generated there. Type the following:

# replace max_run_index with the highest node index
for j in $(seq 0 max_run_index); do
    conf_to_gzip -r -i n$j/conf.info n$j/conf.txt.gz -o n$j/conf
done

Now you can proceed with the structure extraction with "extract_snapshot" in the usual way. The same procedure can be used to continue on one cluster a set of runs started on another.
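
The restore step on the target machine can also be written so that it discovers the node directories itself instead of relying on a known highest index. The following sketch assumes the directory naming n0, n1 etc. used in this example; the function name restore_all and the variable CONF_TO_GZIP are hypothetical.

```shell
# Converter command; override it if the program is not on your PATH.
: "${CONF_TO_GZIP:=conf_to_gzip}"

# For each node directory under run directory $1 (default: current
# directory) that holds a conf.txt.gz, rebuild the native binary conf.
# Directories without a conf.info are reported and skipped; the
# function stops with a non-zero status at the first failed conversion.
restore_all() {
    for gz in "${1:-.}"/n*/conf.txt.gz; do
        [ -f "$gz" ] || continue            # the pattern matched nothing
        dir=${gz%/conf.txt.gz}
        if [ ! -f "$dir/conf.info" ]; then
            echo "skipping $dir: no conf.info" >&2
            continue
        fi
        "$CONF_TO_GZIP" -r -i "$dir/conf.info" "$gz" -o "$dir/conf" || return 1
    done
}
```

Calling "restore_all" in the copied run directory, or "restore_all path/to/run" from elsewhere, performs the same conversions as the loop above.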


PROFASI: Protein Folding and Aggregation Simulator, Version 1.5
© (2005-2016) Anders Irbäck and Sandipan Mohanty
Documentation generated on Mon Jul 18 2016 using Doxygen version 1.8.2