This tutorial assumes that you have a good knowledge of R6 and so we will not be going through the basics of inheritance and private/public methods/variables.

SDistribution Class

All implemented probability distributions (excluding Kernels) in distr6 inherit from the SDistribution class. This means they share a common interface. The only differences between these distributions is that some will have methods missing as no analytic results are available. See the uml diagram for an overview of how this all fits in together. A core design principle in distr6 is that only analytical methods are defined in the SDistribution child classes, all numerical results are available through decorators. See the decorators tutorial for more information on decorators and the analytical and numerical article for further discussions on analytical and numerical methods. The summary is that when creating your own SDistribution class, please do not put any numerical methods in the core interface, if a closed form expression cannot be found, omit the method entirely and it can be imputed with a decorator. If your desired method is not available in one of our decorators but you think it is useful, see the creating a decorator extension guidelines.

Creating an SDistribution

SDistribution Variables

Every class inheriting from SDistribution must have the following public variables:

  • name - Full (unique) name of probability distribution
  • short_name - Short name (unique) id for distribution
  • description - Short description, usually just the name
  • package - The package in which the d/p/q/r functions are written

For the Normal distribution, the above all looks like

Normal <- R6::R6Class("Normal", inherit = SDistribution, lock_objects = F)
Normal$set("public","name","Normal")
Normal$set("public","short_name","Norm")
Normal$set("public","description","Normal Probability Distribution.")
Normal$set("public","package","stats")

Note:

  1. It is very important that lock_objects=F is not left out as it ensures decorators work correctly.
  2. We use the R6 convention of defining the class as succinctly as possible (class name and inherited classes) and then using the set method to add private/public variables/methods

SDistribution Methods

For the full list of methods to (optimally) include see the ‘Statistical Methods’ section in the help pages of SDistribution: ?SDistribution. This does not include pdf/cdf/quantile/rand, these are defined in the constructor and not in the class. Once again, if there is no closed form analytical expression possible, omit the method completely.

The following methods are included by default and can therefore by omitted from the class definition:

  1. prec
  2. correlation
  3. stdev
  4. median
  5. iqr

Additionally the pgf method returns NaN if omitted but this can be overloaded by including the method in the class definition. Below is an example of adding four methods to the Normal distribution

Normal$set("public","mean",function(){
  return(self$getParameterValue("mean"))
})
Normal$set("public","variance",function(){
  return(self$getParameterValue("var"))
})
Normal$set("public","skewness",function(){
  return(0)
})
Normal$set("public", "mgf", function(t){
  return(exp((self$getParameterValue("mean") * t) + (self$getParameterValue("var") * t^2 * 0.5)))
})

Note:

  1. Methods that require parameters use the self keyword and the getParameterValue method, we looked at this in the custom distribution tutorial
  2. The arguments to the methods are not optional, they must have the names given in the ?SDistribution help page, this ensures that the automated S3 dispatch methods run correctly

The Constructor

The constructor for all SDistribution objects looks the same, below is the constructor for the Normal distribution, which we will talk through as an example.

initialize = function(mean = NULL, var = NULL, sd = NULL, prec = NULL,
                      decorators = NULL) {
  super$initialize(
    decorators = decorators,
    support = Reals$new(),
    symmetry = "sym",
    type = Reals$new()
  )

Note the following:

  1. The arguments to the constructor include all possible parameterisations; all are NULL by default as defaults are set in getParameterSet.Normal.
  2. Every constructor is a simple call to the super initialize method and we only pass through the objects symmetry, support, type, and decorators.

And that’s it!

getParameterSet

In a separate script called getParameterSet.R we have the generic and dispatch methods for every SDistribution. param6 is used to handle parameter sets. Below is the getParameterSet method for the Normal distribution

getParameterSet.Normal <- function(object, ...) {
pset(
  prm("mean", "reals", 0, tags = "required"),
  prm("var", "posreals", 1, tags = c("linked", "required")),
  prm("sd", "posreals", tags = c("linked", "required")),
  prm("prec", "posreals", tags = c("linked", "required")),
  trafo = function(x, self) {

    vars <- sds <- precs <- NULL

    if (any(grepl("sd", names(x)))) {
      sds <- list_element(x, "sd")
      vars <- setNames(as.list(unlist(sds) ^ 2),
                       gsub("sd", "var", names(sds)))
    } else if (any(grepl("prec", names(x)))) {
      precs <- list_element(x, "prec")
      vars <- setNames(as.list(1 / unlist(precs)),
                       gsub("prec", "var", names(precs)))
    }

    if (is.null(vars)) {
      vars <- list_element(x, "var")
    }

    if (is.null(sds)) {
      sds <- setNames(as.list(sqrt(unlist(vars))),
                        gsub("var", "sd", names(vars)))
    }

    if (is.null(precs)) {
      precs <- setNames(as.list(1 / unlist(vars)),
                        gsub("var", "prec", names(vars)))
    }

    unique_nlist(c(vars, sds, precs, x))
  }
)
}

Note:

  1. This is a dispatch method so the method name is getParameterSet.Normal
  2. Defaults are set within the prm objects
  3. Parameters that are ‘linked’, i.e. where only one can be set are given the ‘linked’ tag. All parameters have the ‘required’ tag.
  4. The transformation trafo function is used to calculate the other linked arguments. This is vectorised to handle VectorDistributions.

## Summary

That’s everything that is required to create your own SDistribution class. In summary the different components include

  1. The 5 public variables
  2. Public methods: ‘statistical methods’ section in ?SDistribution
  3. The constructor
  4. getParameterSet dispatch method, written in the getParameterSet.R script

## Extension Guidelines