yat  0.13.2pre
theplu::yat::statistics::KolmogorovSmirnov Class Reference

Kolmogorov Smirnov Test.

#include <yat/statistics/KolmogorovSmirnov.h>

Classes

struct  Element
 

Public Member Functions

 KolmogorovSmirnov (void)
 Constructor.
 
void add (double value, bool class_label, double weight=1.0)
 add a value
 
template<typename ForwardIterator >
void add (ForwardIterator first, ForwardIterator last)
 add a range
 
double p_value (void) const
 Large-Sample Approximation.
 
double p_value (size_t perm) const
 p-value
 
void remove (double value, bool class_label, double weight=1.0)
 Remove a data point.
 
void reset (void)
 resets everything to zero
 
double score (void) const
 Kolmogorov Smirnov statistic.
 
void shuffle (void)
 shuffle class labels
 
double signed_score (void) const
 

Friends

std::ostream & operator<< (std::ostream &, const KolmogorovSmirnov &)
 

Related Functions

(Note that these are not member functions.)

std::ostream & operator<< (std::ostream &, const KolmogorovSmirnov &)
 output operator
 

Detailed Description

Kolmogorov Smirnov Test.

Member Function Documentation

template<typename ForwardIterator >
void theplu::yat::statistics::KolmogorovSmirnov::add ( ForwardIterator  first,
ForwardIterator  last 
)

add a range

The value_type of ForwardIterator must be convertible to KolmogorovSmirnov::Element.

Insertion typically takes N*log(N) time. However, if the range is sorted, insertion takes linear time. A common use case for this function is calculating several KS statistics on the same data set (values and weights) with different class labels.

Since
New in yat 0.5
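The difference between sorted and unsorted input can be illustrated with a standalone sketch. It assumes, which this page does not confirm, that the data live in an ordered container such as std::multiset and that each element is inserted with an end-of-container hint. The names Element (with its fields), ByValue, Store, add_range and check_ordered are hypothetical stand-ins, not the yat implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for KolmogorovSmirnov::Element; field names are assumed.
struct Element {
  double value;
  bool label;
  double weight;
};

// Order elements by value only.
struct ByValue {
  bool operator()(const Element& a, const Element& b) const
  { return a.value < b.value; }
};

typedef std::multiset<Element, ByValue> Store;

// Insert [first, last) using end() as the hint. When the input is sorted
// and belongs at the end, each hinted insertion is amortized constant time,
// so the whole range costs linear time. For unsorted input the container
// falls back to its usual logarithmic insertion per element.
template<typename ForwardIterator>
void add_range(Store& store, ForwardIterator first, ForwardIterator last)
{
  for (; first != last; ++first)
    store.insert(store.end(), *first);
}

// Debug check: the store iterates in non-decreasing value order.
void check_ordered(const Store& store)
{
  Store::const_iterator prev = store.begin();
  if (prev == store.end())
    return;
  Store::const_iterator it = prev;
  for (++it; it != store.end(); ++it, ++prev)
    assert(!(it->value < prev->value));
}
```

Either input order yields the same sorted store; only the cost of building it differs.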
double theplu::yat::statistics::KolmogorovSmirnov::p_value ( void  ) const

Large-Sample Approximation.

This analytical approximation of the p-value can be used when all weights equal unity and the sample sizes n and m are large. The p-value is calculated as $ P = \displaystyle -2 \sum_{k=1}^{\infty} (-1)^k e^{-2k^2s^2} $, where s is the scaled score:

$ s = \sqrt{\frac{nm}{n+m}} \cdot \texttt{score()} $.

Since
New in yat 0.5

Following Hollander and Wolfe
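As a rough illustration of the series above, the following standalone sketch evaluates it numerically. The function name ks_large_sample_p, the cutoff for very small s, the 100-term cap and the stopping tolerance are assumptions made for this example and are not taken from yat.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper, not part of yat: evaluates
//   P = -2 * sum_{k>=1} (-1)^k * exp(-2 k^2 s^2),  s = sqrt(n*m/(n+m)) * score
// for an unweighted KS score and class sizes n and m.
double ks_large_sample_p(double score, std::size_t n, std::size_t m)
{
  assert(n > 0 && m > 0);
  assert(score >= 0.0 && score <= 1.0);
  const double nd = static_cast<double>(n);
  const double md = static_cast<double>(m);
  const double s = std::sqrt(nd * md / (nd + md)) * score;

  // Assumed cutoff: for very small s the series converges slowly (and not
  // at all for s == 0), while the p-value is indistinguishable from 1.
  if (s < 0.2)
    return 1.0;

  double sum = 0.0;
  double sign = -1.0;  // (-1)^k, starting at k = 1
  for (int k = 1; k <= 100; ++k) {
    const double term = std::exp(-2.0 * k * k * s * s);
    sum += sign * term;
    if (term < 1e-16)  // remaining terms are negligible
      break;
    sign = -sign;
  }
  // Clamp to [0, 1] to absorb rounding error.
  return std::min(1.0, std::max(0.0, -2.0 * sum));
}
```

For example, with n = m = 50 and a score of 0.2 the scaled score is s = 1, and the series evaluates to roughly 0.27.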

double theplu::yat::statistics::KolmogorovSmirnov::p_value ( size_t  perm) const

p-value

Performs a permutation test using perm label randomizations and calculates how often a score equal to or larger than score() is obtained.

See Also
shuffle
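The permutation test described above can be sketched independently of the class. This is a minimal illustration for unit weights only; the names Sample, ks_score and permutation_p, the explicit seed parameter, and the choice to return the plain fraction hits/perm are assumptions of this example, since the page does not state how the returned value is normalized.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical stand-in for the class data: (value, class label) pairs.
typedef std::vector<std::pair<double, bool> > Sample;

// Unweighted KS statistic: largest gap between the two empirical CDFs.
double ks_score(Sample data)
{
  std::size_t n_true = 0;
  for (std::size_t i = 0; i < data.size(); ++i)
    if (data[i].second) ++n_true;
  const std::size_t n_false = data.size() - n_true;
  assert(n_true > 0 && n_false > 0);

  std::sort(data.begin(), data.end());
  std::size_t c_true = 0, c_false = 0;
  double best = 0.0;
  for (std::size_t i = 0; i < data.size(); ) {
    const double x = data[i].first;
    // Consume all points tied at x before comparing the CDFs.
    for (; i < data.size() && data[i].first == x; ++i) {
      if (data[i].second) ++c_true;
      else ++c_false;
    }
    const double gap = static_cast<double>(c_true) / n_true
                     - static_cast<double>(c_false) / n_false;
    best = std::max(best, std::fabs(gap));
  }
  return best;
}

// Fraction of perm label shuffles whose score is >= the observed score.
double permutation_p(const Sample& data, std::size_t perm, unsigned seed)
{
  assert(perm > 0);
  const double observed = ks_score(data);
  Sample shuffled(data);
  std::vector<char> labels(data.size());
  for (std::size_t i = 0; i < data.size(); ++i)
    labels[i] = data[i].second ? 1 : 0;

  std::mt19937 rng(seed);
  std::size_t hits = 0;
  for (std::size_t p = 0; p < perm; ++p) {
    // Shuffling the labels keeps both class sizes fixed.
    std::shuffle(labels.begin(), labels.end(), rng);
    for (std::size_t i = 0; i < data.size(); ++i)
      shuffled[i].second = (labels[i] != 0);
    if (ks_score(shuffled) >= observed) ++hits;
  }
  return static_cast<double>(hits) / static_cast<double>(perm);
}
```

With this normalization the smallest non-zero result is 1/perm, so a larger perm gives finer resolution at the cost of more score evaluations.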
void theplu::yat::statistics::KolmogorovSmirnov::remove ( double  value,
bool  class_label,
double  weight = 1.0 
)

Remove a data point.

Exceptions
utility::runtime_error if no data point exists with value, class_label, and weight.
Since
New in yat 0.9
double theplu::yat::statistics::KolmogorovSmirnov::score ( void  ) const

Kolmogorov Smirnov statistic.

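$ \sup_x | F_{\textrm{True}}(x) - F_{\textrm{False}}(x) | $, where $ F(x) = \frac{\sum_{i:x_i\le x}w_i}{\sum_i w_i} $ is the weighted empirical distribution function, with the sums taken over the data points of the class given by the subscript.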
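The weighted statistic can be computed with a single sweep over the sorted values. The sketch below is a standalone illustration of the formula, not the yat implementation; the Point record, by_value and weighted_ks_score are hypothetical names, and treating tied values as a single step of the distribution functions is an assumption of this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical record mirroring the arguments of add(value, class_label, weight).
struct Point {
  double value;
  bool label;
  double weight;
};

bool by_value(const Point& a, const Point& b) { return a.value < b.value; }

// sup_x |F_True(x) - F_False(x)| with F(x) = (sum of w_i with x_i <= x) / (sum of w_i),
// each sum restricted to one class.
double weighted_ks_score(std::vector<Point> data)
{
  double total_true = 0.0, total_false = 0.0;
  for (std::size_t i = 0; i < data.size(); ++i) {
    if (data[i].label) total_true += data[i].weight;
    else total_false += data[i].weight;
  }
  assert(total_true > 0.0 && total_false > 0.0);

  std::sort(data.begin(), data.end(), by_value);
  double cum_true = 0.0, cum_false = 0.0, sup = 0.0;
  for (std::size_t i = 0; i < data.size(); ) {
    const double x = data[i].value;
    // Add every point tied at x before evaluating the difference.
    for (; i < data.size() && data[i].value == x; ++i) {
      if (data[i].label) cum_true += data[i].weight;
      else cum_false += data[i].weight;
    }
    sup = std::max(sup, std::fabs(cum_true / total_true - cum_false / total_false));
  }
  return sup;
}
```

Because both distribution functions are step functions that change only at data values, checking the difference after each distinct value is enough to find the supremum.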

void theplu::yat::statistics::KolmogorovSmirnov::shuffle ( void  )

shuffle class labels

This is equivalent to calling reset() and re-adding the values with shuffled class labels.

Since
New in yat 0.5
double theplu::yat::statistics::KolmogorovSmirnov::signed_score ( void  ) const

Same as score() but keeps the sign; in other words, abs(signed_score()) == score().

A positive score implies that values in class true are, on average, smaller than values in class false.

Since
New in yat 0.5


Generated on Wed Jan 4 2017 02:23:08 for yat by  doxygen 1.8.5