Decorators, Analytical and Numerical Methods

This document is for understanding how and when distr6 uses analytical or numerical results. For a tutorial on how to use decorators, see here.

distr6 operates under a strict rule that implemented distributions, of class SDistribution, and similarly implemented kernels, of class Kernel, should only contain analytical methods. This ensures that all results are as precise as possible and ensure optimal efficiency.

It was not a straightforward decision deciding which methods should be included in the core distribution interfaces. We made the decision that methods involving infinite sums should not be included as any approximation via numerical integration would not be sufficiently accurate. However we allowed the use of the beta, gamma, and digamma functions from base R packages, as these are widely used in the community, have good documentation and have been widely tested. We additionally chose to include the gammaz and gammainc methods from pracma and expint, both packages use C implementations and appear to be in wide usage although we encourage the user to examine these further if interested.

For distributions where no closed form analytical expression can be found, details of this are found in the ‘Details’ section of the distribution’s help page, and attempting to run the function will throw an error. To impute numerical results, decorators are used.

Currently distr6 has three decorators:

CoreStatistics
ExoticStatistics
FunctionImputation

CoreStatistics contains all commonly used mathematical and statistical functions used to describe a probability distribution, for example: expectation, variance, kurtosis and skewness. In addition there is a generalised expectation function genExp that can be used to find any generic expectation of the distribution of the form $E[f(X)]$ where $f(X)$ can be any function acting on $X$ . This is particularly useful for more complex functions such as kthmoment.

ExoticStatistics contains less commonly used functions for the general user but useful functions for modelling and survival analysis. These include survival, hazard and any p-norm of pdf.

FunctionImputation imputes the pdf/cdf/quantile/random using numerical methods, depending which are provided. These follow a strict hierarchy to maximise efficiency and minimise error from numerical approximations. The hierarchy is as follows:

To impute the pdf: cdf2pdf; rand2pdf; quantile2pdf
To impute the cdf: quantile2cdf; pdf2cdf; rand2cdf
To impute the quantile: cdf2quantile; pdf2quantile; rand2quantile
To impute the rand: quantile2rand; cdf2rand; pdf2rand

When a decorator is constructed, only the functions that do not exist analytically are added. For example, if a distribution has an analytical variance function as part of its core interface, then this will not be overloaded by the CoreStatistics decorator.

n <- Normal$new(decorators = "CoreStatistics")
n$mean # The analytical method
#> function (...) 
#> {
#>     unlist(self$getParameterValue("mean"))
#> }
#> <environment: 0x1209e2ea8>

Similarly where possible an analytical method is always preferred. For example, if an analytical cdf expression is available, then the survival function in ExoticStatistics is defined as 1 - cdf otherwise a numerical integration is calculated.

n <- Normal$new(decorators = "ExoticStatistics")
n$survival(2) == n$cdf(2, lower.tail = F) # The analytical method
#> [1] TRUE

Finally, even if an analytical expression for expectation exists, the CoreStatistics decorator always adds a genExp method for a generalised numerical expectation formula, this is required for numerical results for variance, moments, etc.

See the decorators tutorial for details on how to decorate a Distribution.

2024-07-08