public class KolmogorovSmirnovDistribution extends Object implements Serializable
Treats the distribution of the two-sided P(D_n < d), where D_n = sup_x |G(x) - G_n(x)| for the theoretical cdf G and the empirical cdf G_n.
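As an illustration of the statistic itself, the following is a minimal sketch of how D_n could be computed from a sample, assuming a continuous theoretical cdf G. The class `KsStatisticSketch` and its method `statistic` are hypothetical names introduced here; they are not part of this API.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical helper for illustration only; not part of the documented class. */
final class KsStatisticSketch {

    private KsStatisticSketch() { }

    /**
     * Returns D_n = sup_x |G(x) - G_n(x)| for the given sample,
     * assuming the theoretical cdf G is continuous.
     */
    static double statistic(double[] sample, DoubleUnaryOperator cdf) {
        double[] x = sample.clone();
        Arrays.sort(x);
        int n = x.length;
        double dn = 0.0;
        for (int i = 0; i < n; i++) {
            double g = cdf.applyAsDouble(x[i]);
            // G_n jumps from i/n to (i+1)/n at x[i], so the supremum
            // is reached just before or exactly at one of the jumps.
            double below = g - (double) i / n;
            double above = (double) (i + 1) / n - g;
            dn = Math.max(dn, Math.max(below, above));
        }
        return dn;
    }

    public static void main(String[] args) {
        // Uniform(0, 1) cdf used as the theoretical distribution G.
        DoubleUnaryOperator uniform = v -> Math.min(1.0, Math.max(0.0, v));
        System.out.println(statistic(new double[] {0.1, 0.4, 0.7}, uniform));
    }
}
```

The value returned by such a helper is what would be passed as d to the cdf methods of this class.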
This implementation is based on [1] with certain quick decisions for extreme values given in [2].
In short, when wanting to evaluate P(D_n < d), the method in [1] is to write d = (k - h) / n for positive integer k and 0 <= h < 1. Then P(D_n < d) = (n! / n^n) * t_kk, where t_kk is the (k, k)'th entry in the special matrix H^n, i.e. H to the n'th power.
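The steps above can be sketched in plain double arithmetic as follows. This is a simplified illustration under stated assumptions, not the implementation of this class: the class name `KsMatrixSketch` is hypothetical, the construction of H follows the matrix layout commonly given for this method, and there is no rescaling against overflow or underflow, so it is only suitable for small n and 0 < d < 1.

```java
/** Hypothetical sketch of the matrix method; plain doubles, no overflow scaling. */
final class KsMatrixSketch {

    private KsMatrixSketch() { }

    /** Approximates P(D_n < d) for 0 < d < 1 and small n. */
    static double cdf(int n, double d) {
        if (n <= 0 || d <= 0.0 || d >= 1.0) {
            throw new IllegalArgumentException("sketch requires n > 0 and 0 < d < 1");
        }
        int k = (int) Math.ceil(n * d);   // d = (k - h) / n
        double h = k - n * d;             // 0 <= h < 1
        int m = 2 * k - 1;                // H is an m x m matrix

        // Ones on and below the first superdiagonal, zeros above it.
        double[][] hMat = new double[m][m];
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < m; j++) {
                hMat[i][j] = (i - j + 1 < 0) ? 0.0 : 1.0;
            }
        }
        // Corrections in the first column and the last row depend on h.
        for (int i = 0; i < m; i++) {
            hMat[i][0] -= Math.pow(h, i + 1);
            hMat[m - 1][i] -= Math.pow(h, m - i);
        }
        if (2 * h - 1 > 0) {
            hMat[m - 1][0] += Math.pow(2 * h - 1, m);
        }
        // Divide entry (i, j) by (i - j + 1)! where that factorial is defined.
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < m; j++) {
                for (int g = 2; g <= i - j + 1; g++) {
                    hMat[i][j] /= g;
                }
            }
        }

        double result = power(hMat, n)[k - 1][k - 1];   // t_kk
        for (int i = 1; i <= n; i++) {
            result *= (double) i / n;                   // multiply by n! / n^n
        }
        return result;
    }

    /** Computes a^p by repeated squaring. */
    static double[][] power(double[][] a, int p) {
        int m = a.length;
        double[][] result = new double[m][m];
        for (int i = 0; i < m; i++) {
            result[i][i] = 1.0;
        }
        double[][] base = a;
        while (p > 0) {
            if ((p & 1) == 1) {
                result = multiply(result, base);
            }
            base = multiply(base, base);
            p >>= 1;
        }
        return result;
    }

    private static double[][] multiply(double[][] a, double[][] b) {
        int m = a.length;
        double[][] c = new double[m][m];
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < m; j++) {
                double s = 0.0;
                for (int l = 0; l < m; l++) {
                    s += a[i][l] * b[l][j];
                }
                c[i][j] = s;
            }
        }
        return c;
    }
}
```

For n = 1 the exact distribution is P(D_1 < d) = 2d - 1 on 0.5 <= d < 1, which gives a simple sanity check for a sketch like this.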
References:
Constructor and Description |
---|
KolmogorovSmirnovDistribution(int nIn) |
Modifier and Type | Method and Description |
---|---|
double | cdf(double d) Calculates P(D_n < d) using the method described in [1] with quick decisions for extreme values given in [2] (see above). |
double | cdf(double d, boolean exact) Calculates P(D_n < d) using the method described in [1] with quick decisions for extreme values given in [2] (see above). |
double | cdfExact(double d) Calculates P(D_n < d) using the method described in [1] with quick decisions for extreme values given in [2] (see above). |
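The "quick decisions for extreme values" mentioned in the method descriptions can be pictured with the well-known closed forms for the tails of the distribution. The sketch below is an assumption about what such shortcuts look like; the exact thresholds used by this class are not stated on this page, and `KsTailSketch` is a hypothetical name.

```java
/** Hypothetical sketch of closed-form shortcuts for extreme values of d. */
final class KsTailSketch {

    private KsTailSketch() { }

    /**
     * Returns P(D_n < d) when d lies in a range with a closed form,
     * or NaN when the general matrix method would be needed instead.
     */
    static double quick(int n, double d) {
        double ninv = 1.0 / n;
        double ninvHalf = 0.5 * ninv;
        if (d <= ninvHalf) {
            return 0.0;                        // D_n is never below 1 / (2n)
        }
        if (d <= ninv) {
            // n! * (2d - 1/n)^n, accumulated factor by factor
            double res = 1.0;
            double f = 2.0 * d - ninv;
            for (int i = 1; i <= n; i++) {
                res *= i * f;
            }
            return res;
        }
        if (1.0 - ninv <= d && d < 1.0) {
            return 1.0 - 2.0 * Math.pow(1.0 - d, n);
        }
        if (d >= 1.0) {
            return 1.0;                        // D_n never exceeds 1
        }
        return Double.NaN;                     // no shortcut applies
    }

    public static void main(String[] args) {
        System.out.println(quick(10, 0.01));
        System.out.println(quick(10, 0.95));
        System.out.println(quick(10, 0.5));
    }
}
```

Returning NaN for the middle range is a choice made for this sketch only, to signal that the matrix computation from [1] is required.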
public KolmogorovSmirnovDistribution(int nIn)
Parameters:
nIn - Number of observations
Throws:
NotStrictlyPositiveException - if n <= 0
public double cdf(double d)
Calculates P(D_n < d) using the method described in [1] with quick decisions for extreme values given in [2] (see above). Unlike cdfExact(double), the result is not exact, because calculations are based on double rather than BigFraction.
Parameters:
d - statistic
Returns:
P(D_n < d)
Throws:
MathArithmeticException - if the algorithm fails to convert h to a BigFraction in expressing d as (k - h) / m for integer k, m and 0 <= h < 1.
public double cdfExact(double d)
Calculates P(D_n < d) using the method described in [1] with quick decisions for extreme values given in [2] (see above). The result is exact in the sense that BigFraction/BigReal is used everywhere at the expense of very slow execution time. Almost never choose this in real applications unless you are very sure; this is almost solely for verification purposes. Normally, you would choose cdf(double).
Parameters:
d - statistic
Returns:
P(D_n < d)
Throws:
MathArithmeticException - if the algorithm fails to convert h to a BigFraction in expressing d as (k - h) / m for integer k, m and 0 <= h < 1.
public double cdf(double d, boolean exact)
Calculates P(D_n < d) using the method described in [1] with quick decisions for extreme values given in [2] (see above).
Parameters:
d - statistic
exact - whether the probability should be calculated exactly using BigFraction everywhere at the expense of very slow execution time, or whether double should be used in convenient places to gain speed. Almost never choose true in real applications unless you are very sure; true is almost solely for verification purposes.
Returns:
P(D_n < d)
Throws:
MathArithmeticException - if the algorithm fails to convert h to a BigFraction in expressing d as (k - h) / m for integer k, m and 0 <= h < 1.
Copyright © 2019 CNES. All rights reserved.