vignettes/webs/analytic_and_numeric_methods.rmd
This document explains how and when distr6 uses analytical or numerical results. To learn how to use decorators, see the decorators tutorial.
distr6 operates under a strict rule that implemented distributions, of class `SDistribution`, and similarly implemented kernels, of class `Kernel`, should only contain analytical methods. This ensures that all results are as precise as possible and are computed as efficiently as possible.
Deciding which methods to include in the core distribution interfaces was not straightforward. We decided that methods involving infinite sums should not be included, as any numerical approximation of them would not be sufficiently accurate. However, we allowed the use of the `beta`, `gamma`, and `digamma` functions from base R packages, as these are widely used in the community, are well documented and have been widely tested. We additionally chose to include the `gammaz` and `gammainc` methods from `pracma` and `expint`. Both packages use C implementations and appear to be in wide usage, although we encourage the user to examine these further if interested.
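As a rough illustration of what "analytical" means here, the sketch below writes two closed-form expressions using only the base R special functions named above. The helper names `beta_pdf` and `gamma_entropy` are hypothetical and are not part of distr6.

```r
# Closed-form pdf of a Beta(shape1, shape2) distribution, normalised with beta().
beta_pdf <- function(x, shape1, shape2) {
  x^(shape1 - 1) * (1 - x)^(shape2 - 1) / beta(shape1, shape2)
}

# Closed-form differential entropy of a Gamma(shape, rate) distribution,
# written with gamma() and digamma().
gamma_entropy <- function(shape, rate) {
  shape - log(rate) + log(gamma(shape)) + (1 - shape) * digamma(shape)
}

x <- c(0.1, 0.5, 0.9)
beta_check <- all.equal(beta_pdf(x, 2, 3), dbeta(x, 2, 3))
exp_entropy <- gamma_entropy(shape = 1, rate = 1) # Gamma(1, 1) is Exponential(1)
```

No numerical integration or summation is involved, so the only error is floating-point rounding in the special functions themselves.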
For distributions where no closed form analytical expression can be found, details of this are found in the ‘Details’ section of the distribution’s help page, and attempting to run the function will throw an error. To impute numerical results, decorators are used.
Currently distr6 has three decorators:

- `CoreStatistics`
- `ExoticStatistics`
- `FunctionImputation`

`CoreStatistics` contains the mathematical and statistical functions commonly used to describe a probability distribution, for example expectation, variance, kurtosis and skewness. In addition, there is a generalised expectation function `genExp` that can be used to find any generic expectation of the distribution of the form $\mathbb{E}[f(X)]$, where $f$ can be any function acting on $X$. This is particularly useful for more complex functions such as `kthmoment`.
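The idea of a generalised expectation can be sketched in base R as a numerical integral of $f(x)$ against the density. The helper `gen_exp` below is a hypothetical stand-in for illustration only, assuming a continuous distribution; it is not the `genExp` implementation in distr6.

```r
# Hypothetical helper: approximate E[f(X)] for a continuous X with density `pdf`
# by integrating f(x) * pdf(x) over the support.
gen_exp <- function(f, pdf, lower = -Inf, upper = Inf) {
  integrate(function(x) f(x) * pdf(x), lower, upper)$value
}

# Density of a Normal distribution with mean 2 and standard deviation 1.
pdf <- function(x) dnorm(x, mean = 2, sd = 1)

first_moment  <- gen_exp(function(x) x, pdf)   # approximates E[X]
second_moment <- gen_exp(function(x) x^2, pdf) # approximates E[X^2]
```

Because the result comes from `integrate`, it is an approximation and carries the integration error of that routine.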
`ExoticStatistics` contains functions that are less commonly needed by the general user but are useful for modelling and survival analysis. These include survival, hazard and any p-norm of the pdf.
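For orientation, the quantities named above can be written down in base R for a standard Normal. This is a minimal sketch using the usual textbook definitions; the function names are hypothetical and the code does not reproduce distr6's internals.

```r
pdf <- dnorm
cdf <- pnorm

# Survival function: probability of exceeding x.
survival <- function(x) 1 - cdf(x)

# Hazard function: density conditional on surviving to x.
hazard <- function(x) pdf(x) / survival(x)

# p-norm of the pdf: (integral of |pdf(x)|^p dx)^(1/p), computed numerically.
pdf_pnorm <- function(p, lower = -Inf, upper = Inf) {
  integrate(function(x) abs(pdf(x))^p, lower, upper)$value^(1 / p)
}

haz_at_0 <- hazard(0)
two_norm <- pdf_pnorm(2)
```

For the standard Normal the squared density integrates to $1/(2\sqrt{\pi})$, which gives a convenient check on the numerical 2-norm.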
`FunctionImputation` imputes the pdf/cdf/quantile/random functions using numerical methods, depending on which are provided. These follow a strict hierarchy to maximise efficiency and minimise error from numerical approximations. The hierarchy is as follows, where a name of the form `x2y` denotes imputing `y` from `x`:

- `cdf2pdf`; `rand2pdf`; `quantile2pdf`
- `quantile2cdf`; `pdf2cdf`; `rand2cdf`
- `cdf2quantile`; `pdf2quantile`; `rand2quantile`
- `quantile2rand`; `cdf2rand`; `pdf2rand`

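To make the naming concrete, here is a minimal base R sketch of three such strategies. The numerical techniques chosen (central finite difference, root finding with `uniroot`, inverse transform sampling) are common textbook choices and are assumptions for illustration; they are not necessarily the methods distr6 uses, and the helper signatures are hypothetical.

```r
# cdf2pdf: differentiate the cdf with a central finite difference.
cdf2pdf <- function(cdf, x, h = 1e-5) {
  (cdf(x + h) - cdf(x - h)) / (2 * h)
}

# cdf2quantile: solve cdf(q) = p for q with a bracketing root finder.
# The search interval is an assumption and must contain the quantile.
cdf2quantile <- function(cdf, p, interval = c(-10, 10)) {
  uniroot(function(q) cdf(q) - p, interval = interval, tol = 1e-10)$root
}

# quantile2rand: inverse transform sampling from uniform draws.
quantile2rand <- function(qfun, n) {
  qfun(runif(n))
}

set.seed(42)
pdf_at_0 <- cdf2pdf(pnorm, 0)               # compare with dnorm(0)
q_975    <- cdf2quantile(pnorm, 0.975)      # compare with qnorm(0.975)
draws    <- quantile2rand(qnorm, 1000)      # approximate standard Normal sample
```

Each step introduces its own approximation error, which is one reason to start from whichever provided function needs the fewest numerical operations.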
When a decorator is constructed, only the functions that do not exist analytically are added. For example, if a distribution has an analytical `variance` function as part of its core interface, then this will not be overloaded by the `CoreStatistics` decorator.
```r
n <- Normal$new(decorators = "CoreStatistics")
n$mean # The analytical method
#> function (...)
#> {
#>     unlist(self$getParameterValue("mean"))
#> }
#> <environment: 0x1209e2ea8>
```
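The "only add what is missing" rule can be pictured with plain lists of functions. This is a toy sketch in base R, not how distr6 decorators are implemented; `add_missing`, `core` and `numeric_methods` are hypothetical names.

```r
# Core interface with analytical methods for a standard Normal.
core <- list(
  mean = function() 0,
  variance = function() 1
)

# Candidate numerical methods a decorator might offer.
numeric_methods <- list(
  variance = function() stop("numerical variance should not replace the analytical one"),
  skewness = function() integrate(function(x) x^3 * dnorm(x), -Inf, Inf)$value
)

# Add only those methods whose names are not already present.
add_missing <- function(object, methods) {
  new_names <- setdiff(names(methods), names(object))
  object[new_names] <- methods[new_names]
  object
}

decorated <- add_missing(core, numeric_methods)
```

After decoration, `decorated$variance` is still the analytical function, while `decorated$skewness` has been added numerically.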
Similarly, an analytical method is always preferred where possible. For example, if an analytical `cdf` expression is available, then the `survival` function in `ExoticStatistics` is defined as `1 - cdf`; otherwise it is calculated by numerical integration.
```r
n <- Normal$new(decorators = "ExoticStatistics")
n$survival(2) == n$cdf(2, lower.tail = FALSE) # The analytical method
#> [1] TRUE
```
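The preference for the analytical route can be sketched as a simple fallback in base R. The constructor `make_survival` is hypothetical and assumes a continuous distribution with unbounded upper support; it only illustrates the rule described above.

```r
# Build a survival function: use 1 - cdf when a cdf is supplied,
# otherwise integrate the pdf over the upper tail.
make_survival <- function(pdf, cdf = NULL) {
  if (!is.null(cdf)) {
    function(x) 1 - cdf(x)
  } else {
    function(x) integrate(pdf, lower = x, upper = Inf)$value
  }
}

s_analytic <- make_survival(dnorm, pnorm) # uses the closed-form cdf
s_numeric  <- make_survival(dnorm)        # falls back to numerical integration

difference <- abs(s_analytic(2) - s_numeric(2))
```

The two agree closely for the Normal distribution, but only the first avoids integration error and the cost of running the integrator.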
Finally, even if an analytical expression for `expectation` exists, the `CoreStatistics` decorator always adds a `genExp` method for a generalised numerical expectation formula. This is required for numerical results for variance, moments, etc.
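A short base R sketch shows why a generalised expectation is enough to obtain these other quantities numerically. The helpers `gen_exp` and `num_central_moment` are hypothetical and assume a continuous distribution supported on the whole real line; they are not the distr6 methods.

```r
# Hypothetical numerical expectation E[f(X)] for a density `pdf`.
gen_exp <- function(f, pdf) {
  integrate(function(x) f(x) * pdf(x), -Inf, Inf)$value
}

# k-th central moment E[(X - mu)^k], built from two expectations.
num_central_moment <- function(k, pdf) {
  mu <- gen_exp(identity, pdf)
  gen_exp(function(x) (x - mu)^k, pdf)
}

# Normal distribution with mean 1 and standard deviation 2.
pdf <- function(x) dnorm(x, mean = 1, sd = 2)

num_var    <- num_central_moment(2, pdf) # variance, analytically sd^2
num_fourth <- num_central_moment(4, pdf) # analytically 3 * sd^4
```

Each moment here needs two numerical integrals, so errors can compound, which is consistent with preferring analytical expressions whenever they exist.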
See the decorators tutorial for details on how to decorate a Distribution.