yat  0.11.3pre
theplu::yat::statistics Namespace Reference

Statistical methods, classes, and functions. More...

Classes

class  AUC
 Area Under ROC Curve. More...
 
class  averager_base
 Base class for averager classes. More...
 
class  averager_base2
 Base class for averagers calculating mean and variance. More...
 
class  averager_base3
 
class  averager_base4
 
struct  Average
 Functor to take average of a range. More...
 
class  Averager
 Class to calculate simple (first and second moments) averages. More...
 
class  Averager1
 Class to calculate mean. More...
 
class  Averager3
 Class to calculate 1st, 2nd, and 3rd central moments. More...
 
class  Averager4
 Class to calculate 1st, 2nd, 3rd, and 4th central moments. More...
 
class  AveragerPair
 Class for taking care of mean and covariance of two variables. More...
 
class  AveragerWeighted
 Class to calculate averages with weights. More...
 
class  AveragerPairWeighted
 Class for taking care of mean and covariance of two variables in a weighted manner. More...
 
struct  averager_traits
 
struct  averager_traits< utility::unweighted_iterator_tag >
 
struct  averager_traits< utility::weighted_iterator_tag >
 
struct  averager
 
struct  averager_pair
 
struct  EuclideanDistance
 Calculates the Euclidean distance between two points given by elements of ranges. More...
 
class  Fisher
 Fisher's exact test. More...
 
class  FoldChange
 Score given by the difference of the group means. More...
 
class  Histogram
 Histograms provide a convenient way of presenting the distribution of a set of data. More...
 
class  Kendall
 Kendall's tau rank coefficient. More...
 
class  KolmogorovSmirnov
 Kolmogorov Smirnov Test. More...
 
class  KolmogorovSmirnovOneSample
 Kolmogorov Smirnov Test for one class. More...
 
class  Pearson
 Class for calculating Pearson correlation. More...
 
class  PearsonCorrelation
 Class for calculating Pearson correlation. More...
 
struct  PearsonDistance
 Calculates the Pearson correlation distance between two points given by elements of ranges. More...
 
class  Percentiler
 Functor to calculate percentile of a range. More...
 
class  ROC
 Receiver Operating Characteristic. More...
 
class  SAMScore
 Class for score used in Significance Analysis of Microarrays (SAM). More...
 
class  Score
 Interface Class for score classes. More...
 
class  Smoother
 Estimating a distribution in a smooth fashion. More...
 
class  SNRScore
 Class for score based on signal-to-noise ratio (SNRScore). More...
 
class  Spearman
 Spearman rank correlation coefficient. More...
 
class  TukeyBiweightEstimator
 Tukey's Biweight Estimator. More...
 
class  tScore
 Class for Fisher's t-test. More...
 
class  tTest
 Class for Student's t-test. More...
 
struct  VectorFunction
 Interface Class for vector functors. More...
 
struct  Max
 Largest element. More...
 
struct  Median
 Median element. More...
 
struct  Mean
 Mean element. More...
 
struct  Min
 Smallest element. More...
 
class  Nth_Element
 
class  WilcoxonFoldChange
 WilcoxonFoldChange. More...
 

Functions

template<typename T , typename ForwardIterator >
void add (T &o, ForwardIterator first, ForwardIterator last, const classifier::Target &target)
 
template<typename BidirectionalIterator1 , typename BidirectionalIterator2 >
void benjamini_hochberg (BidirectionalIterator1 first, BidirectionalIterator1 last, BidirectionalIterator2 result)
 Benjamini Hochberg multiple test correction.
 
double cdf_hypergeometric_P (unsigned int k, unsigned int n1, unsigned int n2, unsigned int t)
 
double pearson_p_value (double r, unsigned int n)
 one-sided p-value
 
double kurtosis (const utility::VectorBase &)
 Computes the kurtosis of the data in a vector.
 
template<class RandomAccessIterator >
double mad (RandomAccessIterator first, RandomAccessIterator last, bool sorted=false)
 Median absolute deviation from median.
 
template<class RandomAccessIterator >
double median (RandomAccessIterator first, RandomAccessIterator last, bool sorted=false)
 
template<class RandomAccessIterator >
double percentile (RandomAccessIterator first, RandomAccessIterator last, double p, bool sorted=false)
 
template<class RandomAccessIterator >
double percentile2 (RandomAccessIterator first, RandomAccessIterator last, double p, bool sorted=false)
 
double skewness (const utility::VectorBase &)
 Computes the skewness of the data in a vector.
 

Detailed Description

Statistical methods, classes, and functions.

All classes and functions related to statistical methods or functions are defined within this namespace. See Weighted Statistics.

Function Documentation

template<typename T , typename ForwardIterator >
void theplu::yat::statistics::add ( T &  o,
ForwardIterator  first,
ForwardIterator  last,
const classifier::Target &  target 
)

Adds the range [first, last) to an object of type T. The type T is required to have an add(double, bool, double) function.

template<typename BidirectionalIterator1 , typename BidirectionalIterator2 >
void theplu::yat::statistics::benjamini_hochberg ( BidirectionalIterator1  first,
BidirectionalIterator1  last,
BidirectionalIterator2  result 
)

Benjamini Hochberg multiple test correction.

Given a sorted range of p-values such that $ p_1 \le p_2 \le ... \le p_N $, a Benjamini-Hochberg corrected p-value, q, is calculated recursively as $ q_i = \min(p_i \frac{N}{i}, q_{i+1}) $ with the anchor constraint that $ q_N = p_N $.

Requirements: BidirectionalIterator1 should be a Bidirectional Iterator and BidirectionalIterator2 should be a mutable Bidirectional Iterator

Since
New in yat 0.8
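
The recursion can be sketched as follows. This is an illustration only, not the yat implementation: the name bh_adjust is hypothetical, and it takes a std::vector rather than iterator ranges. The largest p-value is left unchanged and the remaining values are filled in from the back.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of yat): applies
// q_i = min(p_i * N / i, q_{i+1}) to p-values sorted in ascending order.
std::vector<double> bh_adjust(const std::vector<double>& p)
{
  assert(std::is_sorted(p.begin(), p.end()));
  const std::size_t n = p.size();
  std::vector<double> q(n);
  if (n == 0)
    return q;
  q[n - 1] = p[n - 1];  // anchor: the largest p-value is unchanged
  for (std::size_t i = n - 1; i > 0; --i) {
    // the element at 0-based index i-1 has 1-based rank i
    const double scaled =
      p[i - 1] * static_cast<double>(n) / static_cast<double>(i);
    q[i - 1] = std::min(scaled, q[i]);
  }
  return q;
}
```

Walking backwards keeps the corrected values monotone: each q is capped by the value that follows it.
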
double theplu::yat::statistics::cdf_hypergeometric_P ( unsigned int  k,
unsigned int  n1,
unsigned int  n2,
unsigned int  t 
)

Calculates the probability of getting k or fewer from a hypergeometric distribution with parameters n1, n2, and t. The hypergeometric distribution arises in the following situation: let there be n1 ways for a "good" selection and n2 ways for a "bad" selection out of a total of n1 + n2 possibilities. Take t samples without replacement, of which k are "good" samples. Then k follows a hypergeometric distribution.

Returns
cumulative hypergeometric distribution function P(k).
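
As an illustration of the quantity described above, the cumulative probability can be obtained by summing the probability of drawing exactly j "good" samples for j up to k. The sketch below is not the yat implementation; the names log_choose and hypergeometric_cdf are hypothetical, and binomial coefficients are evaluated through std::lgamma to avoid overflow.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: log of the binomial coefficient C(n, r), r <= n.
double log_choose(unsigned int n, unsigned int r)
{
  return std::lgamma(n + 1.0) - std::lgamma(r + 1.0)
         - std::lgamma(n - r + 1.0);
}

// Hypothetical sketch of P(X <= k) for n1 "good", n2 "bad", t draws.
double hypergeometric_cdf(unsigned int k, unsigned int n1,
                          unsigned int n2, unsigned int t)
{
  assert(t <= n1 + n2);
  // smallest and largest feasible number of "good" samples to sum over
  const unsigned int lo = (t > n2) ? t - n2 : 0;
  const unsigned int hi = std::min(k, std::min(n1, t));
  const double log_denom = log_choose(n1 + n2, t);
  double sum = 0.0;
  for (unsigned int j = lo; j <= hi; ++j)
    sum += std::exp(log_choose(n1, j) + log_choose(n2, t - j) - log_denom);
  return sum;
}
```
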
double theplu::yat::statistics::kurtosis ( const utility::VectorBase &  )

Computes the kurtosis of the data in a vector.

The kurtosis measures how sharply peaked a distribution is, relative to its width. The kurtosis is normalized to zero for a Gaussian distribution.
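
A minimal sketch of a kurtosis that is zero for a Gaussian: the fourth central moment divided by the squared variance, minus 3. The name excess_kurtosis is hypothetical, and the use of population moments (dividing by n) is an assumption; the normalization yat applies may differ.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of yat): m4 / m2^2 - 3 with
// population moments, an assumed normalization.
double excess_kurtosis(const std::vector<double>& x)
{
  assert(x.size() > 1);
  const double n = static_cast<double>(x.size());
  double mean = 0.0;
  for (double v : x)
    mean += v;
  mean /= n;
  double m2 = 0.0;  // second central moment
  double m4 = 0.0;  // fourth central moment
  for (double v : x) {
    const double d = v - mean;
    m2 += d * d;
    m4 += d * d * d * d;
  }
  m2 /= n;
  m4 /= n;
  return m4 / (m2 * m2) - 3.0;
}
```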

template<class RandomAccessIterator >
double theplu::yat::statistics::mad ( RandomAccessIterator  first,
RandomAccessIterator  last,
bool  sorted = false 
)

Median absolute deviation from median.

Function is a non-mutable function.

Requirements: RandomAccessIterator should be a Data Iterator and Random Access Iterator

Since yat 0.6 the function also works with a Weighted Iterator.
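
For unweighted data the quantity can be sketched as the median of the absolute deviations from the median. The names median_of and mad_of are hypothetical, and no scaling constant is applied, which is an assumption about the definition. The weighted case is not covered.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: median of a copy, so the caller's data is untouched.
double median_of(std::vector<double> v)
{
  assert(!v.empty());
  std::sort(v.begin(), v.end());
  const std::size_t n = v.size();
  return (n % 2 == 1) ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}

// Hypothetical sketch: median of |x_i - median(x)|, unscaled.
double mad_of(const std::vector<double>& x)
{
  const double m = median_of(x);
  std::vector<double> dev;
  dev.reserve(x.size());
  for (double v : x)
    dev.push_back(std::abs(v - m));
  return median_of(dev);
}
```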

template<class RandomAccessIterator >
double theplu::yat::statistics::median ( RandomAccessIterator  first,
RandomAccessIterator  last,
bool  sorted = false 
)

The median is defined as the value in the middle. If the number of values is even, the median is the average of the two middle values. Equivalently, the median is the percentile given by p equal to 50. If sorted is false (default), the range is copied, the copy is sorted, and then used to calculate the median.

Function is a non-mutable function, i.e., first and last can be const_iterators.

Requirements: RandomAccessIterator should be a Data Iterator and Random Access Iterator

Returns
median of range
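
The copy-and-sort behaviour described above can be sketched as follows. The name median_sketch is hypothetical; the code handles plain (unweighted) values only and is not the yat implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical sketch mirroring the documented signature.
template <class RandomAccessIterator>
double median_sketch(RandomAccessIterator first, RandomAccessIterator last,
                     bool sorted = false)
{
  assert(first != last);
  if (!sorted) {
    // leave the input untouched: sort a copy and recurse on it
    std::vector<double> copy(first, last);
    std::sort(copy.begin(), copy.end());
    return median_sketch(copy.begin(), copy.end(), true);
  }
  const auto n = std::distance(first, last);
  const auto mid = n / 2;
  if (n % 2 == 1)
    return first[mid];                        // odd: the middle value
  return 0.5 * (first[mid - 1] + first[mid]); // even: average of the two
}
```

Because the unsorted path works on a copy, first and last may be const_iterators.
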
double theplu::yat::statistics::pearson_p_value ( double  r,
unsigned int  n 
)

one-sided p-value

This function uses the t-distribution to calculate the one-sided p-value. Under the null hypothesis that the true correlation is zero, the estimated correlation, r, is t-distributed after a transformation:

$ \sqrt{(n-2)} \frac{r}{\sqrt{(1-r^2)}} \in t(n-2) $

Returns
Probability that the correlation is larger than r by chance when having n samples. For r greater than or equal to 1.0, 0.0 is returned. For r less than or equal to -1.0, 1.0 is returned.
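
The transformation itself can be sketched as below. The name pearson_t_statistic is hypothetical. The one-sided p-value is then the upper tail probability of a t-distribution with n-2 degrees of freedom at this statistic; that step needs a t-distribution function (for example GSL's gsl_cdf_tdist_Q) and is left out of the sketch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: sqrt(n-2) * r / sqrt(1 - r^2) for -1 < r < 1.
double pearson_t_statistic(double r, unsigned int n)
{
  assert(n > 2);
  assert(r > -1.0 && r < 1.0);
  return std::sqrt(n - 2.0) * r / std::sqrt(1.0 - r * r);
}
```
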
template<class RandomAccessIterator >
double theplu::yat::statistics::percentile ( RandomAccessIterator  first,
RandomAccessIterator  last,
double  p,
bool  sorted = false 
)

The percentile is determined by p, a number between 0 and 100. The percentile is found by interpolation, using the formula $ percentile = (1 - \delta) x_i + \delta x_{i+1} $ where i is floor $((n - 1)p/100)$ and $ \delta $ is $ (n-1)p/100 - i $. Thus the minimum value of the vector is given by p equal to zero, the maximum is given by p equal to 100, and the median value is given by p equal to 50. If sorted is false (default), the vector is copied, the copy is sorted, and then used to calculate the percentile.

Function is a non-mutable function, i.e., first and last can be const_iterators.

Requirements: RandomAccessIterator is an iterator over a range of doubles (or any type convertible to double).

Returns
p'th percentile of range
Deprecated:
percentile2 will replace this function in the future
Note
The definition of percentile used here is not identical to the one used in percentile2 and Percentiler. The difference is smaller for large ranges.
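
A minimal sketch of linear interpolation between neighbouring sorted values, at the fractional position (n-1)p/100 with zero-based indices. The name percentile_sketch is hypothetical; the function takes its argument by value so that the caller's data is not modified.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch (not the yat implementation), p in [0, 100].
double percentile_sketch(std::vector<double> x, double p)
{
  assert(!x.empty());
  assert(p >= 0.0 && p <= 100.0);
  std::sort(x.begin(), x.end());
  const double pos = (x.size() - 1) * p / 100.0;  // fractional index
  const std::size_t i = static_cast<std::size_t>(std::floor(pos));
  const double delta = pos - static_cast<double>(i);
  if (i + 1 >= x.size())
    return x.back();  // p == 100: no right-hand neighbour to interpolate to
  return (1.0 - delta) * x[i] + delta * x[i + 1];
}
```

For the values 1, 2, 3, 4, 5 and p equal to 10, the position is 0.4, so the result lies 40% of the way from 1 to 2.
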
template<class RandomAccessIterator >
double theplu::yat::statistics::percentile2 ( RandomAccessIterator  first,
RandomAccessIterator  last,
double  p,
bool  sorted = false 
)
See Also
Percentiler
Since
New in yat 0.5
double theplu::yat::statistics::skewness ( const utility::VectorBase &  )

Computes the skewness of the data in a vector.

The skewness measures the asymmetry of the tails of a distribution.
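
One common way to quantify this asymmetry is the third central moment divided by the variance raised to the power 3/2. The sketch below uses the hypothetical name skewness_sketch and assumes population moments (dividing by n); the normalization yat applies may differ.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not part of yat): m3 / m2^(3/2) with
// population moments, an assumed normalization.
double skewness_sketch(const std::vector<double>& x)
{
  assert(x.size() > 1);
  const double n = static_cast<double>(x.size());
  double mean = 0.0;
  for (double v : x)
    mean += v;
  mean /= n;
  double m2 = 0.0;  // second central moment
  double m3 = 0.0;  // third central moment
  for (double v : x) {
    const double d = v - mean;
    m2 += d * d;
    m3 += d * d * d;
  }
  m2 /= n;
  m3 /= n;
  return m3 / std::pow(m2, 1.5);
}
```

A symmetric sample gives zero; a long right tail gives a positive value.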


Generated on Sat May 24 2014 03:33:05 for yat by  doxygen 1.8.2