yat  0.16pre
theplu::yat::statistics Namespace Reference

Statistical methods, classes, and functions. More...

Classes

class  AUC
 Area Under ROC Curve. More...
 
class  averager_base
 Base class for averager classes. More...
 
class  averager_base2
 Base class for averagers calculating mean and variance. More...
 
class  averager_base3
 
class  averager_base4
 
struct  Average
 Functor to take average of a range. More...
 
class  Averager
 Class to calculate simple (first and second moments) averages. More...
 
class  Averager1
 class to calculate mean More...
 
class  Averager3
 class to calculate 1st, 2nd, and 3rd central moments More...
 
class  Averager4
 class to calculate 1st, 2nd, 3rd, and 4th central moments More...
 
class  AveragerPair
 Class for taking care of mean and covariance of two variables. More...
 
class  AveragerWeighted
 Class to calculate averages with weights. More...
 
class  AveragerPairWeighted
 Class for taking care of mean and covariance of two variables in a weighted manner. More...
 
struct  averager_traits
 
struct  averager_traits< utility::unweighted_iterator_tag >
 
struct  averager_traits< utility::weighted_iterator_tag >
 
struct  averager
 
struct  averager_pair
 
class  Distance
 A convenience class to implement Distance. More...
 
class  EuclideanDistance
 Calculates the Euclidean distance between elements of two ranges. More...
 
class  Fisher
 Fisher's exact test. More...
 
class  FoldChange
 Score given by the difference between the group means. More...
 
class  GaussianMixture
 Data modelled as mixture of Gaussian distributions. More...
 
class  Histogram
 Histograms provide a convenient way of presenting the distribution of a set of data. More...
 
class  Kendall
 Kendall's tau rank coefficient. More...
 
class  KolmogorovSmirnov
 Kolmogorov Smirnov Test. More...
 
class  KolmogorovSmirnovOneSample
 Kolmogorov Smirnov Test for one class. More...
 
class  LikelihoodRatioTestBinomial
 Likelihood-ratio test for binomial data. More...
 
class  Pearson
 Class for calculating Pearson correlation. More...
 
class  PearsonCorrelation
 Class for calculating Pearson correlation. More...
 
struct  PearsonDistance
 Calculates the Pearson correlation distance between two points given by elements of ranges. More...
 
class  Percentiler
 Functor to calculate percentile of a range. More...
 
class  ROC
 Receiver Operating Characteristic. More...
 
class  SAMScore
 Class for score used in Significance Analysis of Microarrays (SAM). More...
 
class  Score
 Interface Class for score classes. More...
 
class  Smoother
 Estimating a distribution in a smooth fashion. More...
 
class  SNRScore
 Class for score based on signal-to-noise ratio (SNRScore). More...
 
class  Spearman
 Spearman rank correlation coefficient. More...
 
class  TukeyBiweightEstimator
 Tukey's Biweight Estimator. More...
 
class  tScore
 Class for Fisher's t-test. More...
 
class  tTest
 Class for Student's t-test. More...
 
struct  VectorFunction
 Interface Class for vector functors. More...
 
struct  Max
 Largest element. More...
 
struct  Median
 Median element. More...
 
struct  Mean
 Mean element. More...
 
struct  Min
 Smallest element. More...
 
class  Nth_Element
 
class  WilcoxonFoldChange
 WilcoxonFoldChange. More...
 

Functions

template<typename T , typename ForwardIterator >
void add (T &o, ForwardIterator first, ForwardIterator last, const classifier::Target &target)
 
template<typename BidirectionalIterator1 , typename BidirectionalIterator2 >
void benjamini_hochberg (BidirectionalIterator1 first, BidirectionalIterator1 last, BidirectionalIterator2 result)
 Benjamini Hochberg multiple test correction. More...
 
template<typename RandomAccessIterator , typename MutableRandomAccessIterator >
void benjamini_hochberg_unsorted (RandomAccessIterator first, RandomAccessIterator last, MutableRandomAccessIterator result)
 Benjamini Hochberg multiple test correction. More...
 
double cdf_hypergeometric_P (unsigned int k, unsigned int n1, unsigned int n2, unsigned int t)
 
template<typename InputIterator >
double entropy (InputIterator first, InputIterator last)
 
double pearson_p_value (double r, unsigned int n)
 one-sided p-value More...
 
double kurtosis (const utility::VectorBase &)
 Computes the kurtosis of the data in a vector. More...
 
template<class RandomAccessIterator >
double mad (RandomAccessIterator first, RandomAccessIterator last, bool sorted=false)
 Median absolute deviation from median. More...
 
template<class RandomAccessIterator >
double median (RandomAccessIterator first, RandomAccessIterator last, bool sorted=false)
 
template<class T >
double mutual_information (const T &A)
 Calculates the mutual information of A. More...
 
template<class RandomAccessIterator >
double percentile (RandomAccessIterator first, RandomAccessIterator last, double p, bool sorted=false)
 
template<class RandomAccessIterator >
double percentile2 (RandomAccessIterator first, RandomAccessIterator last, double p, bool sorted=false)
 
double skewness (const utility::VectorBase &)
 Computes the skewness of the data in a vector. More...
 

Detailed Description

Statistical methods, classes, and functions.

All classes and functions related to statistical methods or functions are defined within this namespace. See Weighted Statistics.

Function Documentation

template<typename T , typename ForwardIterator >
void theplu::yat::statistics::add ( T &  o,
ForwardIterator  first,
ForwardIterator  last,
const classifier::Target &  target 
)

Adds a range [first, last) to an object of type T. The requirement on the type T is that it has an add(double, bool, double) function.
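A minimal sketch of the pattern described above. It is not yat's implementation: ClassSums and add_range are hypothetical names, a std::vector<bool> stands in for classifier::Target, and passing a weight of 1.0 for every element is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical accumulator that satisfies the add(double, bool, double)
// requirement; it keeps one weighted sum per class.
struct ClassSums {
  double positive = 0.0;
  double negative = 0.0;
  void add(double value, bool is_positive, double weight)
  {
    (is_positive ? positive : negative) += weight * value;
  }
};

// Sketch: std::vector<bool> stands in for classifier::Target, and every
// value is added with weight 1.0 (an assumption for unweighted input).
template<typename T, typename ForwardIterator>
void add_range(T& o, ForwardIterator first, ForwardIterator last,
               const std::vector<bool>& labels)
{
  std::size_t n = 0;
  for (; first != last; ++first, ++n) {
    assert(n < labels.size());  // one label per element
    o.add(*first, labels[n], 1.0);
  }
}
```

Any type with a compatible add member can be used as the first argument, which is the point of the template parameter T.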

Type Requirements:

template<typename BidirectionalIterator1 , typename BidirectionalIterator2 >
void theplu::yat::statistics::benjamini_hochberg ( BidirectionalIterator1  first,
BidirectionalIterator1  last,
BidirectionalIterator2  result 
)

Benjamini Hochberg multiple test correction.

Given a sorted range of p-values such that $ p_1 \le p_2 \le ... \le p_N $, a Benjamini-Hochberg corrected p-value, q, is calculated recursively as $ q_i = \min(p_i \frac{N}{i}, q_{i+1}) $ with the anchor constraint that $ q_N = p_N $.
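The recursion can be sketched with the standard library alone. This is an illustration of the formula, not yat's code: bh_sorted is a hypothetical helper that works on a std::vector rather than on iterator ranges, and it starts from the largest p-value, which is left unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch only: bh_sorted is a hypothetical helper, not part of yat.
// Expects p sorted in ascending order; returns the corrected values q.
std::vector<double> bh_sorted(const std::vector<double>& p)
{
  assert(std::is_sorted(p.begin(), p.end()));
  const std::size_t N = p.size();
  std::vector<double> q(N);
  if (N == 0)
    return q;
  q[N - 1] = p[N - 1];  // anchor: the largest p-value is unchanged
  for (std::size_t i = N - 1; i > 0; --i) {
    // 0-based slot i-1 holds the element with 1-based rank i
    const double scaled =
      p[i - 1] * static_cast<double>(N) / static_cast<double>(i);
    q[i - 1] = std::min(scaled, q[i]);
  }
  return q;
}
```

Walking from the back guarantees that q is non-decreasing, since each value is capped by its successor.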

Type Requirements:

Since
New in yat 0.8
template<typename RandomAccessIterator , typename MutableRandomAccessIterator >
void theplu::yat::statistics::benjamini_hochberg_unsorted ( RandomAccessIterator  first,
RandomAccessIterator  last,
MutableRandomAccessIterator  result 
)

Benjamini Hochberg multiple test correction.

Similar to benjamini_hochberg() but does not assume that the input range, [first, last), is sorted. The resulting range is the same as if the input range were sorted, benjamini_hochberg() were called, and the result range were then restored to the original order.
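One way to get this behaviour is to sort a permutation of indices instead of the values themselves, so that each corrected value can be written back to its original position. The sketch below assumes that approach; bh_unsorted is a hypothetical name and the code is not taken from yat.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sketch only: bh_unsorted is a hypothetical helper, not part of yat.
std::vector<double> bh_unsorted(const std::vector<double>& p)
{
  assert(std::all_of(p.begin(), p.end(),
                     [](double v) { return v >= 0.0 && v <= 1.0; }));
  const std::size_t N = p.size();

  // order[r] is the index in p of the value with 0-based rank r
  std::vector<std::size_t> order(N);
  std::iota(order.begin(), order.end(), std::size_t(0));
  std::sort(order.begin(), order.end(),
            [&p](std::size_t a, std::size_t b) { return p[a] < p[b]; });

  std::vector<double> q(N);
  double next_q = 0.0;  // corrected value of the next larger p-value
  for (std::size_t rank = N; rank > 0; --rank) {
    const std::size_t idx = order[rank - 1];
    const double scaled =
      p[idx] * static_cast<double>(N) / static_cast<double>(rank);
    // the largest p-value is the anchor and is left unchanged
    const double q_i = (rank == N) ? p[idx] : std::min(scaled, next_q);
    q[idx] = q_i;  // write back to the original position
    next_q = q_i;
  }
  return q;
}
```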

Type Requirements:

Since
New in yat 0.13
double theplu::yat::statistics::cdf_hypergeometric_P ( unsigned int  k,
unsigned int  n1,
unsigned int  n2,
unsigned int  t 
)

Calculates the probability of getting k or smaller from a hypergeometric distribution with parameters n1, n2, and t. The hypergeometric distribution arises in the following situation: let there be n1 ways for a "good" selection and n2 ways for a "bad" selection out of a total of n1 + n2 possibilities. Take t samples without replacement, of which k are "good" samples. Then k follows a hypergeometric distribution.

Returns
cumulative hypergeometric distribution function P(k).
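To make the definition concrete, the cumulative probability can be written as a sum of binomial-coefficient ratios, $ P(k) = \sum_{j \le k} \binom{n_1}{j} \binom{n_2}{t-j} / \binom{n_1+n_2}{t} $. The sketch below evaluates that sum directly with std::lgamma; log_choose and hypergeometric_cdf are hypothetical names, and nothing here is claimed about how yat computes the value.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Logarithm of the binomial coefficient C(n, k), via the log-gamma function.
double log_choose(unsigned int n, unsigned int k)
{
  assert(k <= n);
  return std::lgamma(n + 1.0) - std::lgamma(k + 1.0)
         - std::lgamma(n - k + 1.0);
}

// Sketch of P(X <= k) when t samples are drawn without replacement from
// n1 "good" and n2 "bad" items. Hypothetical helper, not part of yat.
double hypergeometric_cdf(unsigned int k, unsigned int n1,
                          unsigned int n2, unsigned int t)
{
  assert(t <= n1 + n2);
  // j good samples are possible only if j <= n1, j <= t and t - j <= n2
  const unsigned int lowest = (t > n2) ? t - n2 : 0;
  const unsigned int highest = std::min(std::min(k, n1), t);
  const double log_total = log_choose(n1 + n2, t);
  double sum = 0.0;
  for (unsigned int j = lowest; j <= highest; ++j)
    sum += std::exp(log_choose(n1, j) + log_choose(n2, t - j) - log_total);
  return std::min(sum, 1.0);  // guard against rounding slightly above 1
}
```

Working in log space avoids overflow of the binomial coefficients for large arguments, at the cost of a small rounding error.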
template<typename InputIterator >
double theplu::yat::statistics::entropy ( InputIterator  first,
InputIterator  last 
)

The entropy is calculated as $ - \sum_i p_i \log p_i $ where $p_i = \frac{n_i}{\sum_j n_j} $
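A short sketch of this formula for a vector of counts $ n_i $. It assumes the natural logarithm and treats empty bins as contributing zero (the usual convention $ 0 \log 0 = 0 $); both are assumptions, and entropy_sketch is a hypothetical name rather than the yat function.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sketch only: entropy of a histogram of non-negative counts, in nats.
double entropy_sketch(const std::vector<double>& counts)
{
  double total = 0.0;
  for (double n : counts) {
    assert(n >= 0.0);
    total += n;
  }
  if (total <= 0.0)
    return 0.0;  // no observations: define the entropy as zero
  double h = 0.0;
  for (double n : counts) {
    if (n > 0.0) {  // empty bins contribute nothing
      const double p = n / total;
      h -= p * std::log(p);
    }
  }
  return h;
}
```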

Requirements:

Since
New in yat 0.12
double theplu::yat::statistics::kurtosis ( const utility::VectorBase &  )

Computes the kurtosis of the data in a vector.

The kurtosis measures how sharply peaked a distribution is, relative to its width. The kurtosis is normalized to zero for a Gaussian distribution.

template<class RandomAccessIterator >
double theplu::yat::statistics::mad ( RandomAccessIterator  first,
RandomAccessIterator  last,
bool  sorted = false 
)

Median absolute deviation from median.

Function is a non-mutable function.
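As an illustration, the median absolute deviation can be computed in two passes: take the median m of the data, then take the median of the absolute deviations |x - m|. The sketch assumes that the median of an even-sized range is the average of the two middle values; median_of and mad_sketch are hypothetical names, and the input is copied so that the caller's data is never modified.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Median of a copy of the data (the argument is taken by value on purpose).
double median_of(std::vector<double> v)
{
  assert(!v.empty());
  std::sort(v.begin(), v.end());
  const std::size_t n = v.size();
  return (n % 2 == 1) ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}

// Sketch only: median absolute deviation from the median.
double mad_sketch(const std::vector<double>& x)
{
  const double m = median_of(x);
  std::vector<double> deviation;
  deviation.reserve(x.size());
  for (double value : x)
    deviation.push_back(std::fabs(value - m));
  return median_of(deviation);
}
```

For the data 1, 2, 3, 4, 100 the median is 3 and the absolute deviations are 2, 1, 0, 1, 97, so the result is 1 and the outlier has no influence.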

Type Requirements:

Since yat 0.6 the function also works with a Weighted Iterator.

template<class RandomAccessIterator >
double theplu::yat::statistics::median ( RandomAccessIterator  first,
RandomAccessIterator  last,
bool  sorted = false 
)

The median is defined as the value in the middle. If the number of values is even, the median is the average of the two middle values. This corresponds to the percentile given by p equal to 50. If sorted is false (default), the range is copied, the copy is sorted, and then used to calculate the median.

Function is a non-mutable function, i.e., first and last can be const_iterators.
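The behaviour described above can be sketched as follows. The names middle_value and median_sketch are hypothetical, and a std::vector replaces the iterator range; the copy-then-sort step when sorted is false mirrors the description, which is why the input can be const.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Middle value of an already sorted, non-empty vector.
double middle_value(const std::vector<double>& v)
{
  assert(!v.empty());
  const std::size_t n = v.size();
  return (n % 2 == 1) ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}

// Sketch only: the input is never modified.
double median_sketch(const std::vector<double>& x, bool sorted = false)
{
  if (sorted)
    return middle_value(x);  // trust the caller, no copy needed
  std::vector<double> copy(x);
  std::sort(copy.begin(), copy.end());
  return middle_value(copy);
}
```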

Type Requirements:

Returns
median of range
template<class T >
double theplu::yat::statistics::mutual_information ( const T &  A)

Calculates the mutual information of A.

The elements in A are unnormalized probabilities of the joint distribution.

The mutual information is calculated as $ \sum \sum p(x,y) \log_2 \frac {p(x,y)} {p(x)p(y)} $ where $ p(x,y) = \frac {A_{xy}}{\sum_{x,y} A_{xy}} $; $ p(x) = \sum_y A_{xy} / \sum_{x,y} A_{xy} $; $ p(y) = \sum_x A_{xy} / \sum_{x,y} A_{xy} $
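A minimal sketch of this sum, using a vector of rows as a stand-in for a Container2D. The name mutual_information_sketch is hypothetical, and skipping cells with zero count is an assumption (the usual convention that such terms contribute nothing).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch only: A[x][y] holds unnormalized joint probabilities (counts).
// Returns the mutual information in bits.
double mutual_information_sketch(const std::vector<std::vector<double>>& A)
{
  const std::size_t rows = A.size();
  const std::size_t cols = rows ? A[0].size() : 0;
  std::vector<double> row_sum(rows, 0.0);
  std::vector<double> col_sum(cols, 0.0);
  double total = 0.0;
  for (std::size_t x = 0; x < rows; ++x) {
    assert(A[x].size() == cols);  // rectangular input
    for (std::size_t y = 0; y < cols; ++y) {
      assert(A[x][y] >= 0.0);
      row_sum[x] += A[x][y];
      col_sum[y] += A[x][y];
      total += A[x][y];
    }
  }
  if (total <= 0.0)
    return 0.0;
  double mi = 0.0;
  for (std::size_t x = 0; x < rows; ++x) {
    for (std::size_t y = 0; y < cols; ++y) {
      if (A[x][y] > 0.0) {  // empty cells contribute nothing
        const double pxy = A[x][y] / total;
        const double px = row_sum[x] / total;
        const double py = col_sum[y] / total;
        mi += pxy * std::log2(pxy / (px * py));
      }
    }
  }
  return mi;
}
```

A diagonal table, where x determines y completely, gives 1 bit for two equally likely classes, while a uniform table gives 0.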

Requirements:

  • T must be a model of Container2D
  • T::value_type must be convertible to double
Returns
mutual information in bits; to get the value in natural base (nats), multiply by M_LN2 (defined in gsl/gsl_math.h)
Since
New in yat 0.12
double theplu::yat::statistics::pearson_p_value ( double  r,
unsigned int  n 
)

one-sided p-value

This function uses the t-distribution to calculate the one-sided p-value. Given that the true correlation is zero (Null hypothesis) the estimated correlation, r, after a transformation is t-distributed:

$ \sqrt{(n-2)} \frac{r}{\sqrt{(1-r^2)}} \in t(n-2) $
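The transformation itself is easy to sketch; turning the statistic into a p-value additionally needs the cumulative t-distribution with n-2 degrees of freedom, which the C++ standard library does not provide and which is therefore left out here. The name pearson_t_statistic is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Sketch only: maps a correlation r from n samples to the statistic that
// is t-distributed with n-2 degrees of freedom under the null hypothesis.
double pearson_t_statistic(double r, unsigned int n)
{
  assert(n > 2);                // n-2 degrees of freedom must be positive
  assert(r > -1.0 && r < 1.0);  // the boundary cases are handled separately
  return std::sqrt(n - 2.0) * r / std::sqrt(1.0 - r * r);
}
```

The one-sided p-value is then the upper tail probability of the t-distribution at this value.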

Returns
Probability that the correlation is larger than r by chance when having n samples. For r larger than or equal to 1.0, 0.0 is returned. For r smaller than or equal to -1.0, 1.0 is returned.
template<class RandomAccessIterator >
double theplu::yat::statistics::percentile ( RandomAccessIterator  first,
RandomAccessIterator  last,
double  p,
bool  sorted = false 
)

The percentile is determined by p, a number between 0 and 100. The percentile is found by interpolation, using the formula $ percentile = (1 - \delta) x_i + \delta x_{i+1} $ where $ i = \lfloor (n - 1)p/100 \rfloor $ and $ \delta = (n-1)p/100 - i $. Thus the minimum value of the vector is given by p equal to zero, the maximum is given by p equal to 100, and the median value is given by p equal to 50. If sorted is false (default), the vector is copied, the copy is sorted, and then used to calculate the percentile.
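The interpolation can be sketched as follows, with $ i = \lfloor (n-1)p/100 \rfloor $ as a 0-based index into the sorted data and $ \delta $ as the fractional part of $ (n-1)p/100 $. The name percentile_sketch is hypothetical, a std::vector replaces the iterator range, and returning the last element when there is no $ x_{i+1} $ (p equal to 100) is an assumption about the boundary case.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch only: linear interpolation between neighbouring order statistics.
double percentile_sketch(const std::vector<double>& x, double p,
                         bool sorted = false)
{
  assert(!x.empty());
  assert(p >= 0.0 && p <= 100.0);
  std::vector<double> v(x);  // work on a copy, the input stays untouched
  if (!sorted)
    std::sort(v.begin(), v.end());
  const double position = (v.size() - 1) * p / 100.0;
  const std::size_t i = static_cast<std::size_t>(std::floor(position));
  const double delta = position - i;
  if (i + 1 >= v.size())
    return v.back();  // p == 100: there is no x_{i+1} to interpolate with
  return (1.0 - delta) * v[i] + delta * v[i + 1];
}
```

With the two values 10 and 20 and p equal to 25, the position is 0.25, so the result lies a quarter of the way from 10 to 20.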

Function is a non-mutable function, i.e., first and last can be const_iterators.

Requirements: RandomAccessIterator is an iterator over a range of doubles (or any type convertible to double).

Returns
p'th percentile of range
Deprecated:
percentile2 will replace this function in the future
Note
The definition of percentile used here is not identical to the one used in percentile2 and Percentiler. The difference is smaller for large ranges.
template<class RandomAccessIterator >
double theplu::yat::statistics::percentile2 ( RandomAccessIterator  first,
RandomAccessIterator  last,
double  p,
bool  sorted = false 
)
See Also
Percentiler

Type Requirements:

Since
New in yat 0.5
double theplu::yat::statistics::skewness ( const utility::VectorBase &  )

Computes the skewness of the data in a vector.

The skewness measures the asymmetry of the tails of a distribution.


Generated on Mon Oct 16 2017 02:22:39 for yat by  doxygen 1.8.5