Probability distributions and Random Number Generation
======================================================

Probability distributions
-------------------------

To some extent, statistics is founded on an understanding of probability distributions. Drawing random samples from specific probability distributions is also ubiquitous in applied statistics and useful in many contexts, not least for building an appreciation of how different probability distributions behave.

.. code:: r

    help(Distributions)

.. raw:: html
Distributions {stats} | R Documentation |
Density, cumulative distribution function, quantile function and random variate generation for many standard probability distributions are available in the stats package.
The functions for the density/mass function, cumulative distribution function, quantile function and random variate generation are named in the form dxxx, pxxx, qxxx and rxxx respectively.
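As a minimal sketch of this naming scheme, the normal distribution (suffix norm) can be used to show all four prefixes. The seed value and the evaluation points are arbitrary choices for illustration.

```r
# The four function families, illustrated with the normal distribution.
d <- dnorm(0)        # density at x = 0
p <- pnorm(1.96)     # cumulative probability P(X <= 1.96)
q <- qnorm(0.975)    # quantile: the x with P(X <= x) = 0.975

set.seed(42)         # arbitrary seed, only so the draws are reproducible
r <- rnorm(5)        # five random variates

print(c(density = d, cdf = p, quantile = q))
print(r)
```

The same pattern applies to the other suffixes listed on this page, for example dbinom, pbinom, qbinom and rbinom.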
For the beta distribution see dbeta.
For the binomial (including Bernoulli) distribution see dbinom.
For the Cauchy distribution see dcauchy.
For the chi-squared distribution see dchisq.
For the exponential distribution see dexp.
For the F distribution see df.
For the gamma distribution see dgamma.
For the geometric distribution see dgeom. (This is also a special case of the negative binomial.)
For the hypergeometric distribution see dhyper.
For the log-normal distribution see dlnorm.
For the multinomial distribution see dmultinom.
For the negative binomial distribution see dnbinom.
For the normal distribution see dnorm.
For the Poisson distribution see dpois.
For the Student's t distribution see dt.
For the uniform distribution see dunif.
For the Weibull distribution see dweibull.
For less common distributions of test statistics see pbirthday, dsignrank, ptukey and dwilcox (and see the ‘See Also’ section of cor.test).
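The two special cases mentioned in the list above (Bernoulli as a binomial, geometric as a negative binomial) can be checked numerically. This is a small sketch; the probability 0.3 and the range of x are illustrative values, not taken from the help page.

```r
p <- 0.3      # illustrative success probability
x <- 0:10     # number of failures before the first success

# Geometric distribution versus negative binomial with size = 1
geom_direct <- dgeom(x, prob = p)
geom_as_nb  <- dnbinom(x, size = 1, prob = p)
print(all.equal(geom_direct, geom_as_nb))

# Bernoulli distribution as a binomial with a single trial
bern <- dbinom(c(0, 1), size = 1, prob = p)
print(bern)   # P(X = 0) and P(X = 1)
```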
See also RNG, about random number generation in R.
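One practical consequence of how random number generation works in R is that draws can be made reproducible by fixing the seed with set.seed. A minimal sketch, with an arbitrary seed value:

```r
set.seed(123)   # arbitrary seed
a <- runif(3)

set.seed(123)   # resetting the same seed restarts the same stream
b <- runif(3)

print(identical(a, b))
```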
The CRAN task view on distributions, http://cran.r-project.org/web/views/Distributions.html, mentioning several CRAN packages for additional distributions.
Binomial {stats} | R Documentation |
Density, distribution function, quantile function and random generation for the binomial distribution with parameters size and prob.
This is conventionally interpreted as the number of ‘successes’ in size trials.
dbinom(x, size, prob, log = FALSE)
pbinom(q, size, prob, lower.tail = TRUE, log.p = FALSE)
qbinom(p, size, prob, lower.tail = TRUE, log.p = FALSE)
rbinom(n, size, prob)
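==========  ================================================================
Argument    Description
==========  ================================================================
x, q        vector of quantiles.
p           vector of probabilities.
n           number of observations. If length(n) > 1, the length is taken to be the number required.
size        number of trials (zero or more).
prob        probability of success on each trial.
log, log.p  logical; if TRUE, probabilities p are given as log(p).
lower.tail  logical; if TRUE (default), probabilities are P[X ≤ x], otherwise, P[X > x].
==========  ================================================================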
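The lower.tail and log.p arguments are easiest to understand by comparing them with the default call. The sketch below uses illustrative values (q = 3, size = 10, prob = 0.5) that are not part of the help page.

```r
q <- 3; size <- 10; prob <- 0.5   # illustrative values

lower <- pbinom(q, size, prob)                       # P[X <= 3]
upper <- pbinom(q, size, prob, lower.tail = FALSE)   # P[X > 3]
print(lower + upper)                                 # the two tails sum to 1

# log.p = TRUE returns the logarithm of the probability
log_lower <- pbinom(q, size, prob, log.p = TRUE)
print(all.equal(log_lower, log(lower)))
```

Requesting the upper tail or the log directly is generally preferable to computing 1 - pbinom(...) or log(pbinom(...)) by hand when the probabilities are very small.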
The binomial distribution with size = n and prob = p has density
p(x) = choose(n, x) p^x (1-p)^(n-x)
for x = 0, …, n.
Note that binomial coefficients can be computed by choose in R.
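The density formula above can be evaluated directly with choose and compared against dbinom. This is only a check of the formula for illustrative values of n and p; it is not how dbinom is computed internally.

```r
n <- 10; p <- 0.3   # illustrative parameters
x <- 0:n            # the full support

# Direct evaluation of choose(n, x) p^x (1-p)^(n-x)
manual <- choose(n, x) * p^x * (1 - p)^(n - x)

print(all.equal(manual, dbinom(x, size = n, prob = p)))
print(sum(manual))  # the probabilities over the support sum to 1
```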
If an element of x is not integer, the result of dbinom is zero, with a warning.
p(x) is computed using Loader's algorithm, see the reference below.
The quantile is defined as the smallest value x such that F(x) ≥ p, where F is the distribution function.
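This definition of the quantile can be checked by a brute-force search over the support. The values of size, prob and p below are illustrative assumptions.

```r
size <- 10; prob <- 0.5; p <- 0.3   # illustrative values

x <- qbinom(p, size, prob)

# Smallest x in the support with F(x) >= p, found by direct search
support  <- 0:size
smallest <- min(support[pbinom(support, size, prob) >= p])

print(c(qbinom = x, search = smallest))
```

Because the distribution is discrete, pbinom(qbinom(p, ...), ...) is in general greater than or equal to p rather than exactly equal to it.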
dbinom gives the density, pbinom gives the distribution function, qbinom gives the quantile function and rbinom generates random deviates.
If size is not an integer, NaN is returned.
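The behaviour for non-integer inputs described above can be observed directly. In this sketch the warnings are suppressed so that only the returned values are shown; the inputs 2.5 and 10.5 are arbitrary non-integer values.

```r
# Non-integer x: dbinom returns 0 (a warning is raised, suppressed here)
res_x <- suppressWarnings(dbinom(2.5, size = 10, prob = 0.5))

# Non-integer size: NaN is returned (again with a warning)
res_size <- suppressWarnings(dbinom(2, size = 10.5, prob = 0.5))

print(res_x)
print(res_size)
```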
The length of the result is determined by n for rbinom, and is the maximum of the lengths of the numerical arguments for the other functions.
The numerical arguments other than n are recycled to the length of the result. Only the first elements of the logical arguments are used.
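The length and recycling rules can be illustrated with a few short calls. The argument values are chosen only to make the recycling visible.

```r
# x has length 3, size and prob have length 1 and are recycled
d1 <- dbinom(0:2, size = 2, prob = 0.5)

# x has length 4, size has length 2 and is recycled as 1, 2, 1, 2
d2 <- dbinom(c(0, 0, 0, 0), size = c(1, 2), prob = 0.5)

# For rbinom the result length comes from n, not from size or prob
set.seed(1)   # arbitrary seed
r <- rbinom(5, size = c(1, 100), prob = 0.5)

print(d1)
print(d2)
print(length(r))
```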
For dbinom a saddle-point expansion is used: see Catherine Loader (2000). Fast and Accurate Computation of Binomial Probabilities; available from http://www.herine.net/stat/software/dbinom.html.
pbinom uses pbeta.
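The link between pbinom and pbeta rests on a standard identity between the binomial distribution function and the incomplete beta function: P[X ≤ k] equals the upper tail of a Beta(k + 1, n - k) distribution evaluated at p, for k < n. The sketch below checks that identity numerically for illustrative parameters; it does not claim to reproduce the exact internal call made by pbinom.

```r
size <- 12; prob <- 0.35   # illustrative parameters
k <- 0:(size - 1)          # the identity is stated for k < size

via_pbinom <- pbinom(k, size, prob)
via_pbeta  <- pbeta(prob, k + 1, size - k, lower.tail = FALSE)

print(all.equal(via_pbinom, via_pbeta))
```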
qbinom uses the Cornish–Fisher Expansion to include a skewness correction to a normal approximation, followed by a search.
rbinom (for size < .Machine$integer.max) is based on Kachitvichyanukul, V. and Schmeiser, B. W. (1988) Binomial random variate generation. Communications of the ACM, 31, 216–222.
For larger values it uses inversion.
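Whatever algorithm generates them, draws from rbinom should have a mean close to size * prob and a variance close to size * prob * (1 - prob). A small simulation sketch, with an arbitrary seed, sample size and parameters:

```r
set.seed(2024)                    # arbitrary seed
size <- 20; prob <- 0.25          # illustrative parameters
draws <- rbinom(10000, size = size, prob = prob)

# Compare sample moments with the theoretical ones
print(c(sample_mean = mean(draws), theory_mean = size * prob))
print(c(sample_var  = var(draws),  theory_var  = size * prob * (1 - prob)))
```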
See Distributions for other standard distributions, including dnbinom for the negative binomial, and dpois for the Poisson distribution.
require(graphics)
# Compute P(45 < X < 55) for X Binomial(100,0.5)
sum(dbinom(46:54, 100, 0.5))

## Using "log = TRUE" for an extended range :
n <- 2000
k <- seq(0, n, by = 20)
plot (k, dbinom(k, n, pi/10, log = TRUE), type = "l", ylab = "log density",
      main = "dbinom(*, log=TRUE) is better than log(dbinom(*))")
lines(k, log(dbinom(k, n, pi/10)), col = "red", lwd = 2)
## extreme points are omitted since dbinom gives 0.
mtext("dbinom(k, log=TRUE)", adj = 0)
mtext("extended range", adj = 0, line = -1, font = 4)
mtext("log(dbinom(k))", col = "red", adj = 1)
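The first computation in the example above, P(45 < X < 55), can also be obtained from the distribution function, since for an integer-valued X it equals P[X ≤ 54] - P[X ≤ 45]. A short sketch comparing the two routes:

```r
# Summing the density over 46, ..., 54
direct <- sum(dbinom(46:54, 100, 0.5))

# Difference of the distribution function at 54 and 45
via_cdf <- pbinom(54, 100, 0.5) - pbinom(45, 100, 0.5)

print(c(direct = direct, via_cdf = via_cdf))
print(all.equal(direct, via_cdf))
```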