vignettes/webs/create_sdistribution.rmd
This tutorial assumes that you have a good knowledge of R6 and so we will not be going through the basics of inheritance and private/public methods/variables.
All implemented probability distributions (excluding Kernels) in
distr6 inherit from the SDistribution class, which means they share a
common interface. The only difference between these distributions is
that some will have methods missing where no analytic results are
available. See the UML diagram for an overview of how this all fits
together. A core design principle in distr6 is that only analytical
methods are defined in the SDistribution child classes; all numerical
results are available through decorators. See the decorators tutorial
for more information on decorators, and the analytical and numerical
article for further discussion of analytical and numerical methods. In
summary, when creating your own SDistribution class, please do not put
any numerical methods in the core interface. If a closed-form expression
cannot be found, omit the method entirely and it can be imputed with a
decorator. If your desired method is not available in one of our
decorators but you think it is useful, see the creating a decorator
extension guidelines.
Every class inheriting from SDistribution must have the following public variables: `name`, `short_name`, `description` and `package`.
For the Normal distribution, this looks like:
Normal <- R6::R6Class("Normal", inherit = SDistribution, lock_objects = FALSE)
Normal$set("public", "name", "Normal")
Normal$set("public", "short_name", "Norm")
Normal$set("public", "description", "Normal Probability Distribution.")
Normal$set("public", "package", "stats")
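The effect of the `lock_objects` argument can be seen in a small sketch that uses only the R6 package. The two toy classes and the `try_add` helper are hypothetical and not part of distr6. The sketch assumes that the relevant behaviour is the ability to attach new members to an object after it has been constructed; the tutorial itself does not describe how decorators do this internally.

```r
library(R6)

# Two toy classes (hypothetical, not part of distr6) that differ only in lock_objects
Locked <- R6Class("Locked", public = list(x = 1))
Unlocked <- R6Class("Unlocked", public = list(x = 1), lock_objects = FALSE)

# Try to attach a new member to an already-constructed object
try_add <- function(obj) {
  tryCatch({
    obj$extra <- function() "added after construction"
    "added"
  }, error = function(e) "locked")
}

locked_result <- try_add(Locked$new())      # default lock_objects = TRUE
unlocked_result <- try_add(Unlocked$new())  # lock_objects = FALSE

# The set method adds a member to the class generator itself,
# so objects created afterwards carry it
Unlocked$set("public", "name", "Toy")
toy_name <- Unlocked$new()$name
```

With the default `lock_objects = TRUE`, the attempt to add `extra` raises an error, whereas the unlocked object accepts the new member.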
Note:

- The `lock_objects` setting must not be left out, as it ensures decorators work correctly.
- The `set` method is used to add private/public variables and methods.

For the full list of methods to (ideally) include, see the
‘Statistical Methods’ section in the help page of SDistribution,
`?SDistribution`. This does not include pdf/cdf/quantile/rand; these are
defined in the constructor and not in the class. Once again, if no
closed-form analytical expression is available, omit the method
completely.
Some methods are included by default and can therefore be omitted from the class definition.
Additionally, the pgf method returns NaN if omitted, but this can be
overridden by including the method in the class definition.
Below is an example of adding four methods to the Normal distribution:
Normal$set("public", "mean", function() {
  return(self$getParameterValue("mean"))
})
Normal$set("public", "variance", function() {
  return(self$getParameterValue("var"))
})
Normal$set("public", "skewness", function() {
  return(0)
})
Normal$set("public", "mgf", function(t) {
  return(exp((self$getParameterValue("mean") * t) + (self$getParameterValue("var") * t^2 * 0.5)))
})
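Before adding a closed-form method, it can be worth checking the expression against a numerical result during development. The sketch below is a standalone sanity check in base R, not part of the class definition, and the parameter values are arbitrary. It compares the closed-form moment generating function used in the `mgf` method above with numerical integration of exp(t * x) against the Normal density.

```r
# Arbitrary illustrative values for the mean, variance and mgf argument
mu <- 1
sigma2 <- 4
t <- 0.3

# Closed-form expression, as used in the mgf method
closed_form <- exp((mu * t) + (sigma2 * t^2 * 0.5))

# Numerical counterpart: E[exp(t * X)] for X ~ Normal(mu, sigma2)
numerical <- integrate(
  function(x) exp(t * x) * dnorm(x, mean = mu, sd = sqrt(sigma2)),
  lower = -Inf, upper = Inf
)$value

# TRUE when the two agree up to the integration tolerance
mgf_matches <- isTRUE(all.equal(closed_form, numerical, tolerance = 1e-4))
```

Only the closed-form expression belongs in the class; the numerical version is used here purely as a development-time check.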
Note:

- The use of the `self` keyword and the `getParameterValue` method in the `Normal` methods above; we looked at this in the custom distribution tutorial.
- Method names match those in the `?SDistribution` help page; this ensures that the automated S3 dispatch methods run correctly.

The constructor for all SDistribution objects looks the same. Below is the constructor for the Normal distribution, which we will talk through as an example.
initialize = function(mean = NULL, var = NULL, sd = NULL, prec = NULL,
                      decorators = NULL) {
  super$initialize(
    decorators = decorators,
    support = Reals$new(),
    symmetry = "sym",
    type = Reals$new()
  )
}
Note the following: the parameter set is handled by `getParameterSet.Normal`.

And that’s it!
In a separate script called getParameterSet.R we have the generic and
dispatch methods for every SDistribution. param6 is used to
handle parameter sets. Below is the getParameterSet method
for the Normal distribution:
getParameterSet.Normal <- function(object, ...) {
  pset(
    prm("mean", "reals", 0, tags = "required"),
    prm("var", "posreals", 1, tags = c("linked", "required")),
    prm("sd", "posreals", tags = c("linked", "required")),
    prm("prec", "posreals", tags = c("linked", "required")),
    trafo = function(x, self) {
      vars <- sds <- precs <- NULL
      if (any(grepl("sd", names(x)))) {
        sds <- list_element(x, "sd")
        vars <- setNames(as.list(unlist(sds) ^ 2),
                         gsub("sd", "var", names(sds)))
      } else if (any(grepl("prec", names(x)))) {
        precs <- list_element(x, "prec")
        vars <- setNames(as.list(1 / unlist(precs)),
                         gsub("prec", "var", names(precs)))
      }
      if (is.null(vars)) {
        vars <- list_element(x, "var")
      }
      if (is.null(sds)) {
        sds <- setNames(as.list(sqrt(unlist(vars))),
                        gsub("var", "sd", names(vars)))
      }
      if (is.null(precs)) {
        precs <- setNames(as.list(1 / unlist(vars)),
                          gsub("var", "prec", names(vars)))
      }
      unique_nlist(c(vars, sds, precs, x))
    }
  )
}
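The relationship that the `trafo` function maintains between the linked parameters can be sketched in base R without param6. The helper `link_normal_scale` below is hypothetical and not part of distr6 or param6. It is a simplified version that ignores the name handling used for vectorised distributions and only mirrors the order of precedence: `sd` first, then `prec`, otherwise `var`.

```r
# Hypothetical helper, not part of distr6 or param6: given one of
# var, sd or prec, derive the other two so that all three stay consistent
link_normal_scale <- function(var = NULL, sd = NULL, prec = NULL) {
  if (!is.null(sd)) {
    var <- sd^2          # sd takes precedence when supplied
  } else if (!is.null(prec)) {
    var <- 1 / prec      # otherwise fall back to precision
  }
  stopifnot(!is.null(var), var > 0)
  list(var = var, sd = sqrt(var), prec = 1 / var)
}

from_sd <- link_normal_scale(sd = 2)
from_prec <- link_normal_scale(prec = 0.25)
```

Both calls describe the same distribution, so they produce the same `var`, `sd` and `prec` values.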
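Note:

- The function name, `getParameterSet.Normal`, follows the S3 `generic.class` pattern.
- Parameters are defined as `prm` objects.
- The `trafo` function is used to calculate the other linked arguments. This is vectorised to handle VectorDistributions.

## Summary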
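That’s everything that is required to create your own SDistribution class. In summary, the different components are:

- public variables: `name`, `short_name`, `description` and `package`
- public methods, as listed in `?SDistribution`
- the constructor
- a `getParameterSet` dispatch method, written in the `getParameterSet.R` script

## Extension Guidelines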