yat  0.8.3pre
theplu::yat::statistics::KolmogorovSmirnov Class Reference

Kolmogorov-Smirnov test.

#include <yat/statistics/KolmogorovSmirnov.h>

Classes

struct  Element

Public Member Functions

 KolmogorovSmirnov (void)
 Constructor.
void add (double value, bool class_label, double weight=1.0)
 add a value
template<typename ForwardIterator >
void add (ForwardIterator first, ForwardIterator last)
 add a range
double p_value (void) const
 Large-Sample Approximation.
double p_value (size_t perm) const
 p-value
void reset (void)
 resets everything to zero
double score (void) const
 Kolmogorov Smirnov statistic.
void shuffle (void)
 shuffle class labels
double signed_score (void) const

Friends

std::ostream & operator<< (std::ostream &, const KolmogorovSmirnov &)

Related Functions

(Note that these are not member functions.)

std::ostream & operator<< (std::ostream &, const KolmogorovSmirnov &)
 output operator

Detailed Description

Kolmogorov-Smirnov test.


Member Function Documentation

template<typename ForwardIterator >
void theplu::yat::statistics::KolmogorovSmirnov::add ( ForwardIterator  first,
ForwardIterator  last 
)

add a range

The value_type of ForwardIterator must be convertible to KolmogorovSmirnov::Element.

Insertion typically takes N*log(N) time. However, if the range is sorted, insertion takes linear time. A common use case for this function is calculating several KS statistics on one data set (values and weights) with different class labels.

Since:
New in yat 0.5

double theplu::yat::statistics::KolmogorovSmirnov::p_value ( void  ) const

Large-Sample Approximation.

This analytical approximation of the p-value can be used when all weights equal unity and the sample sizes n and m are large. The p-value is calculated as a function of s, where s is the scaled score:

$ s = \sqrt{\frac{nm}{n+m}} \, \textrm{score()} $

Following Hollander and Wolfe.

Since:
New in yat 0.5
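The following is a minimal, self-contained sketch of how a large-sample p-value can be computed from the scaled score s defined above. It does not use yat. The series it evaluates is the textbook Kolmogorov limiting distribution, which is an assumption here, since the exact formula used by p_value() is not reproduced on this page. The function names scaled_score and ks_asymptotic_p_value are hypothetical.

```cpp
#include <cmath>
#include <cstddef>

// Scaled score s = sqrt(n*m/(n+m)) * score, as defined in the text above.
// n and m are the two sample sizes; score is the KS statistic.
double scaled_score(double score, std::size_t n, std::size_t m)
{
  const double nn = static_cast<double>(n);
  const double mm = static_cast<double>(m);
  return std::sqrt(nn * mm / (nn + mm)) * score;
}

// ASSUMPTION: textbook Kolmogorov series, not taken from the yat source:
//   P(s) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 * k^2 * s^2)
double ks_asymptotic_p_value(double s)
{
  // The series converges slowly for tiny s, where the limit is 1 to
  // well below double precision, so return 1 directly.
  if (s < 0.2)
    return 1.0;

  double sum = 0.0;
  double sign = 1.0;
  for (int k = 1; k <= 100; ++k) {
    const double term = std::exp(-2.0 * k * k * s * s);
    sum += sign * term;
    if (term < 1e-12)  // remaining terms are negligible
      break;
    sign = -sign;
  }

  // Clamp to [0, 1] to guard against rounding at the extremes.
  const double p = 2.0 * sum;
  if (p < 0.0)
    return 0.0;
  if (p > 1.0)
    return 1.0;
  return p;
}
```

With two samples of size 8 and a score of 0.5, scaled_score gives s = 1, and ks_asymptotic_p_value(1.0) evaluates to roughly 0.27. Sample sizes this small are outside the regime where the approximation is reliable, so the numbers only illustrate the arithmetic.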

double theplu::yat::statistics::KolmogorovSmirnov::p_value ( size_t  perm ) const

p-value

Performs a permutation test using perm label randomizations and calculates how often a score equal to or larger than score() is obtained.

See also:
shuffle
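The permutation test described above can be sketched independently of yat as follows. This is an illustration under assumptions, not the library's implementation: the helper permutation_p_value, its seed parameter, and the choice to return the fraction of permutations (rather than a count) are all hypothetical. The statistic is passed in as a callable so the sketch works with any score function.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper, not part of yat.
// Statistic is any callable taking (values, labels) and returning a double.
// Labels are stored as int (nonzero means class "true") to avoid the proxy
// references of std::vector<bool>.
template <typename Statistic>
double permutation_p_value(const std::vector<double>& values,
                           std::vector<int> labels,  // copied, shuffled locally
                           Statistic statistic,
                           std::size_t perm,
                           unsigned seed = 0)
{
  // Arbitrary choice for this sketch: with no permutations there is no
  // evidence against the null hypothesis, so report 1.
  if (perm == 0)
    return 1.0;

  const double observed = statistic(values, labels);
  std::mt19937 rng(seed);

  std::size_t hits = 0;
  for (std::size_t i = 0; i < perm; ++i) {
    // Randomize which value carries which class label.
    std::shuffle(labels.begin(), labels.end(), rng);
    if (statistic(values, labels) >= observed)
      ++hits;
  }

  // Fraction of randomizations scoring at least as high as the observed data.
  return static_cast<double>(hits) / static_cast<double>(perm);
}
```

Because the labels are shuffled while the values stay in place, the class sizes are preserved in every randomization, which is what a label-permutation test requires.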

double theplu::yat::statistics::KolmogorovSmirnov::score ( void  ) const

Kolmogorov Smirnov statistic.

$ \sup_x | F_{\textrm{True}}(x) - F_{\textrm{False}}(x) | $ where $ F(x) = \frac{\sum_{i:x_i\le x}w_i}{ \sum w_i} $

void theplu::yat::statistics::KolmogorovSmirnov::shuffle ( void  )

shuffle class labels

This is equivalent to calling reset() and re-adding the values with shuffled class labels.

Since:
New in yat 0.5

double theplu::yat::statistics::KolmogorovSmirnov::signed_score ( void  ) const

Same as score() but keeping the sign; in other words, abs(signed_score())==score().

A positive score implies that values in class true are, on average, smaller than values in class false.

Since:
New in yat 0.5
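To make the relation between score() and signed_score() concrete, here is a minimal sketch of the weighted statistic, computed directly from the definition of F(x) as a weighted cumulative fraction. It is written without yat. The struct Sample and the functions ks_signed_score and ks_score are hypothetical names, and the sort-then-scan approach is an assumption about one reasonable way to evaluate the supremum, not a description of the library's internals.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical record for this sketch; it is not yat's Element.
struct Sample {
  double value;
  bool label;
  double weight;
};

// Largest deviation F_True(x) - F_False(x), keeping its sign.
// Positive means class true accumulates weight earlier, i.e. its
// values tend to be smaller.
double ks_signed_score(std::vector<Sample> data)
{
  std::sort(data.begin(), data.end(),
            [](const Sample& a, const Sample& b) { return a.value < b.value; });

  double total_true = 0.0;
  double total_false = 0.0;
  for (const Sample& s : data) {
    if (s.label)
      total_true += s.weight;
    else
      total_false += s.weight;
  }
  // Sketch choice: the statistic is undefined with an empty class; return 0.
  if (total_true <= 0.0 || total_false <= 0.0)
    return 0.0;

  double cum_true = 0.0;
  double cum_false = 0.0;
  double best = 0.0;
  std::size_t i = 0;
  while (i < data.size()) {
    // Absorb all samples tied at this value before comparing the CDFs.
    const double x = data[i].value;
    while (i < data.size() && data[i].value == x) {
      if (data[i].label)
        cum_true += data[i].weight;
      else
        cum_false += data[i].weight;
      ++i;
    }
    const double diff = cum_true / total_true - cum_false / total_false;
    if (std::fabs(diff) > std::fabs(best))
      best = diff;
  }
  return best;
}

// Unsigned statistic: the absolute value of the signed one.
double ks_score(const std::vector<Sample>& data)
{
  return std::fabs(ks_signed_score(data));
}
```

For the unit-weight data (1, true), (2, true), (3, false), (4, false), the true class reaches a cumulative fraction of 1 before the false class accumulates any weight, so the signed score is +1. Swapping the labels gives -1, and the unsigned score is 1 in both cases.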

The documentation for this class was generated from the following file:

Generated on Thu Dec 20 2012 03:12:59 for yat by  doxygen 1.8.0-20120409