PROFASI  Version 1.5
prf_utils::AdaptiveHis Class Reference

A histogram that can adjust its own range according to the data.

#include <AdaptiveHis.hh>

Inheritance diagram for prf_utils::AdaptiveHis: AdaptiveHis inherits from prf_utils::His1D.

Public Member Functions

 AdaptiveHis ()
 Default constructor.
 
 AdaptiveHis (const AdaptiveHis &)
 Copy constructor.
 
AdaptiveHis & operator= (const AdaptiveHis &)
 Assignment operator.
 
void init ()
 Initializes using info about range etc.
 
int adjust ()
 Adjust range to accommodate data, keeping bin size fixed.
 
int put (double x, int i=0)
 Put a value into the histogram.
 
void Export (const char *filename, int normmode=2, int lyout=2)
 Save histogram and out of range points to files.
 
void disable_adjust ()
 
Public Member Functions inherited from prf_utils::His1D
 His1D ()
 Default constructor.
 
 His1D (int nbl)
 Create His1D with nbl blocks (create nbl histograms)
 
 His1D (double xmn, double xmx, int npnts, int numblocks=1)
 Construct with xmin, xmax, number of bins and number of blocks.
 
 His1D (double xmn, double xmx, double bnsz, int numblocks=1)
 Construct with xmin, xmax, bin size and number of blocks.
 
 His1D (const His1D &)
 Copy constructor (copies data, so do not initialize after this!)
 
His1D & operator= (const His1D &)
 Assignment operator (copies data, do not initialize after this!)
 
void Name (std::string nm)
 Give it a name.
 
void init ()
 Initializes using info about range etc.
 
void reset ()
 Reset all data (init calls this)
 
void NBlocks (int n)
 Make it a histogram of n blocks.
 
int NBlocks () const
 Return the number of blocks.
 
long n_entries (int i) const
 number of entries in block i
 
long n_entries_in_range (int i) const
 number of entries in block i in range
 
void Range (double x0, double x1)
 Set range.
 
void Nbins (int v)
 Set number of bins.
 
int Nbins () const
 Get number of bins.
 
void set_bin_size (double sz)
 Set bin size.
 
double Xmin () const
 Get xmin.
 
double Xmax () const
 Get xmax.
 
double Xbin () const
 Get bin size.
 
double xval (int i)
 x value for the middle of i'th bin
 
double yval (int iblk, int i)
 y value for the middle of i'th bin
 
int put (double x, int iblk=0)
 put value x into the iblk block
 
int nput (double howmanytimes, double x, int iblk)
 put n identical values at once
 
double normalize ()
 normalize histogram so that each block sums to 1
 
double unnormalize ()
 unnormalize histogram so that each block sums to its occupancy
 
int Import (const char *filename)
 Import histogram data written in the format of Export function.
 
virtual His1D & operator+= (His1D &)
 Add information from another given histogram.
 

Detailed Description

This is a minor modification of the His1D class that frees the user from estimating a good range for a histogram before starting to fill it. When one starts a Monte Carlo run with a new protein, or in any other application where a histogram may be required, one often has no idea where the values of a measurement might lie. Before version 1.1 of PROFASI, one had to first make a trial run to get a good feeling for the true range of the data, and then start new production runs where the correct histogram ranges were specified. To a large extent, this is now unnecessary.

This histogram follows the data, wherever it is. You just have to declare an AdaptiveHis, fill it with data, ask it to "adjust()" once in a while, and Export() it to a file. If you then plot the file, it will have a reasonable range, with few empty bins on the left and right and few missed data points.

This does not mean that one need not think at all about the size of the values put into the histograms. The adjust() function does not change the size of the bins used for the histogram; that is central to how it works. If there are many points outside the current range, the class remembers those missed points. When statistics are collected, we pretend that there is an infinite number of bins of the size set at initialization, and only choose to do the bookkeeping on a finite range of those bins. If there are points outside our currently tracked bins, we can add a few bins to the left or right to accommodate them, without affecting the collected data at all. If we were to change the bin size during an adjustment, that would interfere with the data collected before adjust() was called.
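The idea of tracking a finite window on an infinite grid of fixed-size bins can be illustrated with a small stand-alone sketch. The struct GridWindow and all of its members are hypothetical names invented for this illustration; this is not the PROFASI implementation, only one way such bookkeeping could be arranged.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in, NOT the PROFASI implementation: a window of
// counters on a conceptually infinite grid of equal-width bins.
struct GridWindow {
    double origin;             // left edge of grid bin 0 (never changes)
    double width;              // bin width (never changes)
    long first;                // grid index of the first tracked bin
    std::vector<long> counts;  // counters for the tracked bins

    // Grid index of the bin containing x; defined for any x, tracked or not.
    long grid_index(double x) const {
        return static_cast<long>(std::floor((x - origin) / width));
    }

    // Count x if its bin is currently tracked; return false otherwise.
    bool put(double x) {
        long k = grid_index(x) - first;
        if (k < 0 || k >= static_cast<long>(counts.size())) return false;
        ++counts[static_cast<std::size_t>(k)];
        return true;
    }

    // Track nleft more bins on the left and nright more on the right.
    // Existing counters are shifted in storage but never rebinned.
    void extend(long nleft, long nright) {
        counts.insert(counts.begin(), static_cast<std::size_t>(nleft), 0L);
        counts.insert(counts.end(), static_cast<std::size_t>(nright), 0L);
        first -= nleft;
    }

    double xmin() const { return origin + first * width; }
    double xmax() const { return xmin() + counts.size() * width; }
};
```

Because the bin edges are fixed by origin and width alone, extending the window only adds empty counters. A value that was counted before the extension stays in exactly the same grid bin afterwards, which would not hold if the bin width were changed.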

So, an initial guess for the size of the data is useful. More precisely, a good initial estimate of the size of the bins is useful. If the minimum or maximum of the range is wrong, this class will take care of it. If your data values range from 3000 to 10000 and you initialize your AdaptiveHis to a range 0 to 1 with 100 bins, the adjust function will result in a huge histogram. It will have a range 3000 to 10000, with 700000 bins. But if you initialize it to 2000 to 4000 with 50 bins, you will be fine. It will once again find the correct range, but will have fewer than 200 bins.
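The bin counts quoted in this example follow from simple arithmetic: the initial range and number of bins fix the bin size, and the final range divided by that bin size gives the final number of bins. A minimal sketch, using two hypothetical helper functions (bin_width and bins_needed are not part of PROFASI):

```cpp
#include <cmath>

// Bin width implied by an initial range and an initial number of bins.
double bin_width(double xmin, double xmax, int nbins) {
    return (xmax - xmin) / nbins;
}

// Number of bins of the given fixed width needed to span [lo, hi].
long bins_needed(double lo, double hi, double width) {
    return static_cast<long>(std::ceil((hi - lo) / width));
}
```

For the first initialization, bin_width(0, 1, 100) is 0.01, and spanning 3000 to 10000 takes 7000 / 0.01 = 700000 bins. For the second, bin_width(2000, 4000, 50) is 40, and the same span takes 7000 / 40 = 175 bins.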

This class was introduced in version 1.1. It is possible that in the future, its functionality will be absorbed in His1D.

Member Function Documentation

int prf_utils::AdaptiveHis::adjust ( )

An appropriate range for the data is found by examining the out-of-range points and the occupancy of the currently used bins. We add bins only if we can fill them. The fundamental reason for the existence of any kind of histogram is that one does not wish to save each and every data point. "Similar" data points are grouped, or binned, together. Now, if there is one data point outside the current range, such that we would need to add 100 bins to the right to reach it, doing so would defeat the purpose of the histogram. It is more economical to save that data point than to create 100 more bins and remember the frequency of each. Therefore, even after repeated calls to adjust(), there might be a remaining out-of-range point that this class simply refuses to cover. When the histogram is saved, such points, if any, are saved in a separate file, and you can deal with them if you like.

The return value is non-zero if the range actually changes.
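The exact criterion used by adjust() is not spelled out here. The sketch below shows one possible rule that is consistent with the description, under a stated assumption: the upper edge is extended only as far as the number of new bins does not exceed the number of out-of-range points those bins would absorb. The function bins_to_add_right and this criterion are hypothetical and are not taken from the PROFASI source.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical criterion, NOT PROFASI's actual rule: extend the upper edge
// to the farthest out-of-range point for which the number of new bins does
// not exceed the number of points those bins would absorb.
// Precondition: every value in 'above' is >= xmax.
// Returns the number of bins to append on the right (0 = leave range alone).
long bins_to_add_right(std::vector<double> above, double xmax, double width) {
    std::sort(above.begin(), above.end());
    long best = 0;
    for (std::size_t i = 0; i < above.size(); ++i) {
        // Bins needed so that above[i] falls inside the extended range.
        long needed =
            static_cast<long>(std::floor((above[i] - xmax) / width)) + 1;
        // Points at or below above[i], all of which would then be covered.
        long covered = static_cast<long>(i) + 1;
        if (needed <= covered) best = needed;
    }
    return best;
}
```

With an upper edge of 10 and a bin size of 1, the points 10.2, 10.7 and 11.5 justify two new bins, while a lone point at 110 would need 101 bins for a single entry and is left out of range, mirroring the 100-bin example in the text.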

void prf_utils::AdaptiveHis::disable_adjust ( )
inline

Disable/enable range tracking features. One can temporarily disable the "adaptive" qualities of the histogram. This is intended for use when it is known that the incoming data for a certain stage of the program can contain nonsensical values which should have no bearing on the range. When "adjustability" is disabled, the histogram forgets new out-of-range values until it is re-enabled.

void prf_utils::AdaptiveHis::Export ( const char *  filename,
int  normmode = 2,
int  lyout = 2 
)
virtual

The histogram data is saved as in class His1D. The parameters normmode and lyout are simply passed down to the base class function. But this class also saves the out-of-range points not covered by the final range (those the function adjust() refuses to include) in a second file with the same name as the histogram file, but with the extension ".out_of_range" at the end.

Reimplemented from prf_utils::His1D.

AdaptiveHis & prf_utils::AdaptiveHis::operator= ( const AdaptiveHis & hs)

This copies the data, so do not initialize after this!

int prf_utils::AdaptiveHis::put ( double  x,
int  i = 0 
)

The value x is put into the histogram if it fits in the range. If not, it is stored in the out-of-range list. The adjust function deals with these out of range points and may put them into bins when the range is appropriately extended.
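The interplay between put(), the out-of-range list and a later range extension can be sketched with a toy class. TinyAdaptiveHis and all of its members are hypothetical and written only for illustration; the return convention of put() (1 if binned, 0 otherwise) is an assumption, and the tracking switch imitates the behaviour described for disable_adjust().

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in, NOT the PROFASI class: fixed-width bins plus a
// list of values that fell outside the current range.
class TinyAdaptiveHis {
public:
    TinyAdaptiveHis(double xmin, double width, std::size_t nbins)
        : xmin_(xmin), width_(width), counts_(nbins, 0) {}

    // Bin x if it is in range; otherwise remember it, unless tracking is
    // switched off. Assumed convention: returns 1 if binned, 0 otherwise.
    int put(double x) {
        if (try_bin(x)) return 1;
        if (tracking_) missed_.push_back(x);
        return 0;
    }

    // Append bins on the right, then re-examine the remembered values.
    // Values that still do not fit stay in the out-of-range list.
    void extend_right(std::size_t extra) {
        counts_.resize(counts_.size() + extra, 0);
        std::vector<double> pending;
        pending.swap(missed_);
        for (double x : pending)
            if (!try_bin(x)) missed_.push_back(x);
    }

    void set_tracking(bool on) { tracking_ = on; }
    long count(std::size_t i) const { return counts_[i]; }
    std::size_t n_missed() const { return missed_.size(); }

private:
    // Increment the bin containing x if it lies in the current range.
    bool try_bin(double x) {
        double k = std::floor((x - xmin_) / width_);
        if (k < 0 || k >= static_cast<double>(counts_.size())) return false;
        ++counts_[static_cast<std::size_t>(k)];
        return true;
    }

    double xmin_, width_;
    std::vector<long> counts_;
    std::vector<double> missed_;
    bool tracking_ = true;
};
```

In this sketch a missed value is not lost: it waits in the list until an extension makes room for it, and only then is it counted in a bin. Values that arrive while tracking is off are simply dropped.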



PROFASI: Protein Folding and Aggregation Simulator, Version 1.5
© (2005-2016) Anders Irbäck and Sandipan Mohanty
Documentation generated on Mon Jul 18 2016 using Doxygen version 1.8.2