Common Severity Distributions

Severity refers to the per-claim loss to the insurer. We discuss the most common continuous claim severity distributions.


The continuous uniform distribution or rectangular distribution is a family of symmetric probability distributions such that for each member of the family, all intervals of the same length on the distribution's support are equally probable. The support is defined by the two parameters, [math]a[/math] and [math]b[/math], which are its minimum and maximum values. The distribution is often abbreviated [math]U(a,b)[/math].

Probability Density Function

The probability density function of the continuous uniform distribution is:

[[math]] f(x)=\begin{cases} \frac{1}{b - a} & \mathrm{for}\ a \le x \le b, \\ 0 & \mathrm{for} \ x \lt a \ \mathrm{or} \ x \gt b \end{cases} [[/math]]

The values of [math]f(x)[/math] at the two boundaries [math]a[/math] and [math]b[/math] are usually unimportant because they do not alter the underlying probability distribution.

Cumulative distribution function

The cumulative distribution function is:

[[math]] F(x)= \begin{cases} 0 & \text{for }x \lt a \\[8pt] \frac{x-a}{b-a} & \text{for }a \le x \le b \\[8pt] 1 & \text{for }x \gt b \end{cases} [[/math]]

Its inverse is:

[[math]]F^{-1}(p) = a + p (b - a) \,\,\text{ for } 0 \lt p \lt 1.[[/math]]

Moment-generating function

The moment-generating function is:

[[math]] M_X(t) = \operatorname{E}\left[e^{tX}\right] = \frac{e^{tb}-e^{ta}}{t(b-a)} \quad \text{for } t \neq 0, [[/math]]

from which we may calculate the raw moments [math]m_k[/math]

[[math]]m_1=\frac{a+b}{2},m_2=\frac{a^2+ab+b^2}{3},m_k=\frac{1}{k+1}\sum_{i=0}^k a^ib^{k-i}. [[/math]]

If [math]X[/math] is uniformly distributed, written [math]X \sim U(a,b) [/math], then

[[math]] \operatorname{E}[X] = m_1 = (a+b)/2,\operatorname{Var}[X] = m_2 - m_1^2 = (b-a)^2/12. [[/math]]
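As a quick numerical check of the raw-moment formula, the following minimal sketch evaluates [math]m_k[/math] from the sum above and recovers the mean and variance. The function names and the example values [math]a = 2[/math], [math]b = 8[/math] are illustrative assumptions, not part of the original text.

```python
def uniform_raw_moment(a: float, b: float, k: int) -> float:
    """k-th raw moment of U(a, b): m_k = (1/(k+1)) * sum_{i=0}^{k} a^i * b^(k-i)."""
    return sum(a**i * b**(k - i) for i in range(k + 1)) / (k + 1)


def uniform_mean_var(a: float, b: float):
    """Mean and variance of U(a, b), using Var[X] = m_2 - m_1^2."""
    m1 = uniform_raw_moment(a, b, 1)
    m2 = uniform_raw_moment(a, b, 2)
    return m1, m2 - m1**2


# Hypothetical example: losses uniform on [2, 8]
mean, var = uniform_mean_var(2.0, 8.0)
print(mean, var)  # mean equals (a+b)/2, variance equals (b-a)^2/12
```

For these values the mean is 5 and the variance is 3, which agrees with [math](b-a)^2/12 = 36/12[/math].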

Related Distributions

  • If [math]X \sim U(0,1)[/math], then [math]Y = -\lambda^{-1} \ln(X)[/math] has an exponential distribution with rate parameter [math]\lambda[/math].
  • If [math]X[/math] has a standard uniform distribution, then [math]Y = X^n[/math] has a beta distribution with parameters [math]1/n[/math] and [math]1[/math]. (Note this implies that the standard uniform distribution is a special case of the beta distribution, with parameters 1 and 1.)
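The first relation above is the basis of inverse transform sampling. The sketch below, which assumes a hypothetical helper name and an arbitrary rate of 0.5, maps standard uniform draws to exponential draws and compares the sample mean with [math]1/\lambda[/math].

```python
import math
import random


def exponential_from_uniform(u: float, lam: float) -> float:
    """Map u in (0, 1] to an exponential draw with rate lam via Y = -ln(u) / lam."""
    if not 0.0 < u <= 1.0:
        raise ValueError("u must lie in (0, 1]")
    if lam <= 0.0:
        raise ValueError("lam must be positive")
    return -math.log(u) / lam


rng = random.Random(12345)  # fixed seed so the run is repeatable
lam = 0.5                   # illustrative rate, not taken from the text

# rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1] and avoids log(0)
draws = [exponential_from_uniform(1.0 - rng.random(), lam) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)  # should be close to 1 / lam = 2
```

Since [math]1 - X[/math] is also standard uniform when [math]X[/math] is, using it in place of [math]X[/math] does not change the distribution of the result.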


The beta distribution is a family of continuous probability distributions defined on the interval [math][0,1][/math] and parametrized by two positive shape parameters, denoted by [math]\alpha[/math] and [math]\beta[/math], that appear as exponents of the random variable and control the shape of the distribution. The beta distribution has been applied to model the behavior of random variables limited to intervals of finite length in a wide variety of disciplines and is a suitable model for the random behavior of percentages and proportions.

Probability density function

The probability density function of the beta distribution, for 0 ≤ [math]x[/math] ≤ 1, and shape parameters [math]\alpha,\beta[/math] > 0, is a power function of the variable [math]x[/math] and of its reflection (1 - [math]x[/math]) as follows:

[[math]]\begin{align*} f(x;\alpha,\beta) & = \mathrm{constant}\cdot x^{\alpha-1}(1-x)^{\beta-1} \\ & = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{\int_0^1 u^{\alpha-1} (1-u)^{\beta-1}\, du} \\[6pt] & = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, x^{\alpha-1}(1-x)^{\beta-1} \\[6pt] & = \frac{1}{B(\alpha,\beta)} x^{\alpha-1}(1-x)^{\beta-1} \end{align*}[[/math]]

where [math]\Gamma(z)[/math] is the gamma function. The beta function, [math]B[/math], is a normalization constant to ensure that the total probability integrates to 1. In the following, a random variable [math]X[/math] beta-distributed with parameters [math]\alpha[/math] and [math]\beta[/math] will be denoted by:

[[math]]X \sim \operatorname{Beta}(\alpha, \beta).[[/math]]

Cumulative distribution function

The cumulative distribution function is

[[math]]F(x;\alpha,\beta) = \dfrac{B(x;\alpha,\beta)}{B(\alpha,\beta)} = I_x(\alpha,\beta)[[/math]]

where [math]B(x;\alpha,\beta)[/math] is the incomplete beta function and [math]I_x(\alpha,\beta)[/math] is the regularized incomplete beta function.

Mean and Variance

If [math]X \sim \operatorname{Beta}(\alpha,\beta)[/math], then its expected value and variance are given by:

[[math]] \operatorname{E}[X] = \frac{\alpha}{\alpha + \beta}, \ \ \operatorname{Var}[X] = \frac{\alpha \beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}. [[/math]]

Higher Moments

The moment generating function is

[[math]]\begin{align*} M_X(\alpha; \beta; t) = \operatorname{E}\left[e^{tX}\right] = \int_0^1 e^{tx} f(x;\alpha,\beta)\,dx &= {}_1F_1(\alpha; \alpha+\beta; t) \\ &= \sum_{n=0}^\infty \frac {\alpha^{(n)}} {(\alpha+\beta)^{(n)}}\frac {t^n}{n!}\\ &= 1 +\sum_{k=1}^{\infty} \left( \prod_{r=0}^{k-1} \frac{\alpha+r}{\alpha+\beta+r} \right) \frac{t^k}{k!} \end{align*}[[/math]]

In particular [math]M_X(\alpha; \beta; 0) = 1 [/math].

Using the moment generating function, the k-th raw moment is given by[1] the factor

[[math]]\prod_{r=0}^{k-1} \frac{\alpha+r}{\alpha+\beta+r} [[/math]]

multiplying the (exponential series) term [math]\left(\frac{t^k}{k!}\right)[/math] in the series of the moment generating function

[[math]]\operatorname{E}[X^k]= \frac{\alpha^{(k)}}{(\alpha + \beta)^{(k)}} = \prod_{r=0}^{k-1} \frac{\alpha+r}{\alpha+\beta+r}[[/math]]

where [math]x^{(k)} = x(x+1)\cdots(x+k-1)[/math] is the Pochhammer symbol representing the rising factorial. The moments can also be written in a recursive form as

[[math]]\operatorname{E}[X^k] = \frac{\alpha + k - 1}{\alpha + \beta + k - 1}\operatorname{E}[X^{k - 1}].[[/math]]
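The recursion lends itself to a short loop. The following sketch (function name and parameter values are illustrative assumptions) builds the raw moments starting from [math]\operatorname{E}[X^0] = 1[/math] and derives the variance as [math]\operatorname{E}[X^2] - \operatorname{E}[X]^2[/math].

```python
def beta_raw_moments(alpha: float, beta: float, k_max: int) -> list:
    """Return [E[X^0], ..., E[X^k_max]] for X ~ Beta(alpha, beta).

    Uses E[X^k] = (alpha + k - 1) / (alpha + beta + k - 1) * E[X^(k-1)].
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    moments = [1.0]  # E[X^0] = 1
    for k in range(1, k_max + 1):
        moments.append(moments[-1] * (alpha + k - 1) / (alpha + beta + k - 1))
    return moments


# Hypothetical example: Beta(2, 3)
m = beta_raw_moments(2.0, 3.0, 2)
beta_mean = m[1]                 # alpha / (alpha + beta)
beta_var = m[2] - m[1] ** 2      # second raw moment minus squared mean
```

For [math]\alpha = 2[/math] and [math]\beta = 3[/math] this gives a mean of 0.4 and a variance of 0.04, in agreement with the product formula for [math]k = 1, 2[/math].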


The normal (or Gaussian) distribution is a very common continuous probability distribution. Normal distributions are important in statistics and are often used in the natural and social sciences to represent real-valued random variables whose distributions are not known.[2][3] The normal distribution is sometimes informally called the bell curve.

Probability Density Function

The probability density of the normal distribution is:

[[math]] f(x \; | \; \mu, \sigma^2) = \frac{1}{\sqrt{2\sigma^2\pi} } \; e^{ -\frac{(x-\mu)^2}{2\sigma^2} } [[/math]]

where [math]\mu[/math] is the mean of the distribution, [math]\sigma[/math] is the standard deviation and [math]\sigma^2[/math] is the variance.

Standard normal distribution

The simplest case of a normal distribution is known as the standard normal distribution. This is a special case when [math]\mu = 0[/math] and [math]\sigma = 1[/math], and it is described by this probability density function:

[[math]]\phi(x) = \frac{e^{- \frac{\scriptscriptstyle 1}{\scriptscriptstyle 2} x^2}}{\sqrt{2\pi}}\, [[/math]]

The factor [math]1/\sqrt{2\pi}[/math] in this expression ensures that the total area under the curve [math]\phi(x)[/math] is equal to one. The ½ in the exponent ensures that the distribution has unit variance (and therefore also unit standard deviation). This function is symmetric around [math]x = 0[/math], where it attains its maximum value [math]1/\sqrt{2\pi}[/math]; and has inflection points at +1 and −1.

General normal distribution

Every normal distribution is a version of the standard normal distribution whose domain has been stretched by a factor [math]\sigma[/math] (the standard deviation) and then translated by [math]\mu[/math] (the mean value):

[[math]] f(x \mid \mu, \sigma) =\frac{1}{\sigma} \phi\left(\frac{x-\mu}{\sigma}\right). [[/math]]

The probability density must be scaled by [math]1/\sigma[/math] so that the integral is still 1.

If [math]Z[/math] is a standard normal deviate, then [math]X = Z\sigma + \mu[/math] will have a normal distribution with expected value [math]\mu[/math] and standard deviation [math]\sigma[/math]. Conversely, if [math]X[/math] is a general normal deviate, then [math]Z = (X - \mu)/\sigma[/math] will have a standard normal distribution.

Moment Generating Function

For a normal distribution with mean [math]\mu[/math] and standard deviation [math]\sigma[/math], the moment generating function exists and is equal to

[[math]]M(t) = \operatorname{E}\left[e^{tX}\right] = e^{ \mu t} e^{\frac12 \sigma^2 t^2 }.[[/math]]

For any non-negative integer [math]p[/math], the central moments are

[[math]] \mathrm{E}\left[(X-\mu)^p\right] = \begin{cases} 0 & \text{if }p\text{ is odd,} \\ \sigma^p\,(p-1)!! & \text{if }p\text{ is even.} \end{cases} [[/math]]

Here n!! denotes the double factorial, that is, the product of every number from n to 1 that has the same parity as n.

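[[math]]\begin{array}{c|l|l} \text{Order} & \text{Non-central moment} & \text{Central moment} \\ \hline 1 & \mu & 0 \\ 2 & \mu^2 + \sigma^2 & \sigma^2 \\ 3 & \mu^3 + 3\mu\sigma^2 & 0 \\ 4 & \mu^4 + 6\mu^2\sigma^2 + 3\sigma^4 & 3\sigma^4 \\ 5 & \mu^5 + 10\mu^3\sigma^2 + 15\mu\sigma^4 & 0 \end{array}[[/math]]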
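
The non-central moments tabulated above can be reproduced numerically. The sketch below uses the recursion [math]\operatorname{E}[X^n] = \mu \operatorname{E}[X^{n-1}] + (n-1)\sigma^2 \operatorname{E}[X^{n-2}][/math], which is a standard identity for the normal distribution but is not stated in the text above; the function name and example parameters are likewise assumptions.

```python
def normal_raw_moments(mu: float, sigma: float, n_max: int) -> list:
    """Return [E[X^0], ..., E[X^n_max]] for X ~ N(mu, sigma^2).

    Recursion (assumed, not from the text):
    E[X^n] = mu * E[X^(n-1)] + (n - 1) * sigma^2 * E[X^(n-2)].
    """
    moments = [1.0]
    if n_max >= 1:
        moments.append(mu)
    for n in range(2, n_max + 1):
        moments.append(mu * moments[n - 1] + (n - 1) * sigma**2 * moments[n - 2])
    return moments


# Hypothetical example: mu = 1, sigma = 2
raw = normal_raw_moments(1.0, 2.0, 5)

# Compare with the closed forms from the table
mu, s2 = 1.0, 4.0
table = [1.0, mu, mu**2 + s2, mu**3 + 3 * mu * s2,
         mu**4 + 6 * mu**2 * s2 + 3 * s2**2,
         mu**5 + 10 * mu**3 * s2 + 15 * mu * s2**2]
```

With [math]\mu = 1[/math] and [math]\sigma = 2[/math], both the recursion and the table give the moments 1, 5, 13, 73 and 281 for orders 1 through 5.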

Cumulative distribution function

The cumulative distribution function (CDF) of the standard normal distribution, usually denoted by the capital Greek letter [math]\Phi[/math], is the integral

[[math]]\Phi(x)\; = \;\frac{1}{\sqrt{2\pi}} \int_{-\infty}^x e^{-t^2/2} \, dt.[[/math]]

For a generic normal distribution with mean [math]\mu[/math] and standard deviation [math]\sigma[/math], the cumulative distribution function is

[[math]]F(x)\;=\;\Phi\left(\frac{x-\mu}{\sigma}\right)\;=\; \frac12\left[1 + \operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right] [[/math]]

with [math]\operatorname{erf}(x)[/math] the error function.
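
The error-function form is convenient for computation because `math.erf` is available in the Python standard library. The following minimal sketch evaluates the normal CDF that way; the function name and the example figures (mean 1000, standard deviation 200) are illustrative assumptions.

```python
import math


def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """CDF of N(mu, sigma^2) via F(x) = 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


# Hypothetical example: probability that a N(1000, 200^2) amount is at most 1300,
# i.e. Phi(1.5)
p = normal_cdf(1300.0, mu=1000.0, sigma=200.0)
```

Standardizing first and calling `normal_cdf(1.5)` gives the same value, which mirrors the identity [math]F(x) = \Phi((x-\mu)/\sigma)[/math].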


A lognormal (or log-normal) distribution is a continuous probability distribution of a random variable whose logarithm is normally distributed. Thus, if the random variable [math]X[/math] is lognormally distributed, then [math]Y = \ln(X)[/math] has a normal distribution. Likewise, if [math]Y[/math] has a normal distribution, then [math]X = \exp(Y)[/math] has a lognormal distribution.


Let [math]X[/math] be a lognormally distributed random variable, and let [math]\mu[/math] and [math]\sigma[/math] be, respectively, the mean and standard deviation of its natural logarithm. Then the logarithm of [math]X[/math] is normally distributed, and we can write [math]X[/math] as

[[math]] X=e^{\mu+\sigma Z} [[/math]]

with [math]Z[/math] a standard normal variable. On a logarithmic scale, [math]\mu[/math] and [math]\sigma[/math] can be called the location parameter and the scale parameter, respectively.

Cumulative distribution function

The cumulative distribution function is

[[math]] \begin{align*} \operatorname{P}(X\leq x) = \operatorname{P}\left(e^{\mu + \sigma Z} \leq x\right) &= \operatorname{P}\left(Z \leq \frac{\ln x - \mu}{\sigma}\right) \\ &=\Phi\left(\frac{\ln x - \mu}{\sigma}\right) \end{align*} [[/math]]

where [math]\Phi[/math] is the cumulative distribution function of the standard normal distribution.

Probability density function

To derive the probability density function, we can simply differentiate the cumulative distribution function:

[[math]] \frac{d}{dx} \Phi\left(\frac{\ln x - \mu}{\sigma}\right) = (\sigma x)^{-1}\phi\left(\frac{\ln x - \mu}{\sigma}\right) = \frac{1}{ x\sigma \sqrt{2 \pi}}\exp\left[-\frac {(\ln x - \mu)^{2}} {2\sigma^{2}}\right],\ \ x \gt 0. [[/math]]


All moments of the lognormal distribution exist, and

[[math]]\operatorname{E}[X^n] = e^{n\mu + \frac{1}{2}n^2\sigma^2}[[/math]]

(which can be derived by letting [math]z=\frac{\ln(x) - (\mu+n\sigma^2)}{\sigma}[/math] within the integral). However, the expected value [math]\operatorname{E}[e^{t X}][/math] is not defined for any positive value of the argument [math]t[/math] as the defining integral diverges. In consequence the moment generating function is not defined.[4] This is related to the fact that the lognormal distribution is not uniquely determined by its moments.

The arithmetic mean, expected square, arithmetic variance, and arithmetic standard deviation of a lognormally distributed variable [math]X[/math] can be derived easily from its moments:

[[math]]\begin{align*} & \operatorname{E}[X] = e^{\mu + \tfrac{1}{2}\sigma^2}, \\ & \operatorname{E}[X^2] = e^{2\mu + 2\sigma^2}, \\ & \operatorname{Var}[X] = \operatorname{E}[X^2] - \operatorname{E}[X]^2 = (\operatorname{E}[X])^2(e^{\sigma^2} - 1) = e^{2\mu + \sigma^2} (e^{\sigma^2} - 1), \\ & \operatorname{SD}[X] = \sqrt{\operatorname{Var}[X]} = e^{\mu + \tfrac{1}{2}\sigma^2}\sqrt{e^{\sigma^2} - 1} = \operatorname{E}[X] \sqrt{e^{\sigma^2} - 1}. \end{align*}[[/math]]
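
In practice one often needs to go the other way and recover [math]\mu[/math] and [math]\sigma[/math] from a target mean and variance. Inverting the formulas above gives [math]\sigma^2 = \ln(1 + \operatorname{Var}[X]/\operatorname{E}[X]^2)[/math] and [math]\mu = \ln \operatorname{E}[X] - \sigma^2/2[/math]. The sketch below implements both directions; the function names and the example figures are assumptions made for illustration.

```python
import math


def lognormal_mean_var(mu: float, sigma: float):
    """Mean and variance of a lognormal with log-scale parameters mu and sigma."""
    mean = math.exp(mu + 0.5 * sigma**2)
    var = mean**2 * math.expm1(sigma**2)  # E[X]^2 * (e^{sigma^2} - 1)
    return mean, var


def lognormal_params_from_mean_var(mean: float, var: float):
    """Invert the formulas above: recover (mu, sigma) from the mean and variance."""
    if mean <= 0.0 or var <= 0.0:
        raise ValueError("mean and var must be positive")
    sigma2 = math.log1p(var / mean**2)    # ln(1 + Var / mean^2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


# Hypothetical example: severity with mean 1000 and standard deviation 500
mu_hat, sigma_hat = lognormal_params_from_mean_var(1000.0, 500.0**2)
check_mean, check_var = lognormal_mean_var(mu_hat, sigma_hat)
```

Feeding the recovered parameters back into `lognormal_mean_var` should reproduce the original mean and variance up to floating-point error.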


The Pareto distribution, named after the Italian civil engineer, economist, and sociologist Vilfredo Pareto, is a power-law probability distribution that is characterized by a scale parameter [math]\theta[/math] and a shape parameter [math]\alpha[/math], which is known as the tail index.

Cumulative distribution function

The cumulative distribution function of a Pareto random variable with parameters [math]\alpha[/math] and [math]\theta[/math] is:

[[math]]F_X(x) = \begin{cases} 1-\left(\frac{\theta}{x}\right)^\alpha & x \ge \theta, \\ 0 & x \lt\theta. \end{cases}[[/math]]

Probability density function

It follows, by differentiation, that the probability density function is

[[math]]f_X(x)= \begin{cases} \frac{\alpha \theta^\alpha}{x^{\alpha+1}} & x \ge \theta, \\ 0 & x \lt \theta. \end{cases} [[/math]]

When plotted on linear axes, the distribution assumes the familiar J-shaped curve which approaches each of the orthogonal axes asymptotically. All segments of the curve are self-similar (subject to appropriate scaling factors).


The expected value of a random variable [math]X[/math] following a Pareto distribution is

[[math]]\operatorname{E}(X)= \begin{cases} \infty & \alpha\le 1, \\ \frac{\alpha \theta}{\alpha-1} & \alpha\gt1 \end{cases}[[/math]]

and the variance is

[[math]]\operatorname{Var}(X)= \begin{cases} \infty & \alpha\in(1,2], \\ \left(\frac{\theta}{\alpha-1}\right)^2 \frac{\alpha}{\alpha-2} & \alpha\gt2. \end{cases}[[/math]]

The raw moments are

[[math]]\operatorname{E}[X^n]= \begin{cases} \infty & \alpha \le n, \\ \frac{\alpha \theta^n}{\alpha-n} & \alpha \gt n. \end{cases}[[/math]]
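
The case distinctions in these formulas are easy to get wrong in code, so a small sketch may help. The function names and example parameters below are assumptions; the quantile function is obtained by solving [math]F_X(x) = p[/math] for [math]x[/math], which gives [math]x = \theta (1-p)^{-1/\alpha}[/math].

```python
import math


def pareto_raw_moment(alpha: float, theta: float, n: int) -> float:
    """E[X^n] for a Pareto with shape alpha and scale theta; infinite when alpha <= n."""
    if alpha <= n:
        return math.inf
    return alpha * theta**n / (alpha - n)


def pareto_variance(alpha: float, theta: float) -> float:
    """Var[X] = E[X^2] - E[X]^2, finite only for alpha > 2.

    The text lists the infinite case for 1 < alpha <= 2; this sketch also
    reports inf for alpha <= 1, where the mean itself is infinite (an assumption).
    """
    if alpha <= 2:
        return math.inf
    return pareto_raw_moment(alpha, theta, 2) - pareto_raw_moment(alpha, theta, 1) ** 2


def pareto_quantile(alpha: float, theta: float, p: float) -> float:
    """Inverse CDF: solve 1 - (theta / x)^alpha = p for x, with 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return theta * (1.0 - p) ** (-1.0 / alpha)


# Hypothetical example: alpha = 3, theta = 2
pareto_mean = pareto_raw_moment(3.0, 2.0, 1)  # alpha * theta / (alpha - 1)
pareto_var = pareto_variance(3.0, 2.0)        # (theta / (alpha - 1))^2 * alpha / (alpha - 2)
```

For [math]\alpha = 3[/math] and [math]\theta = 2[/math], both the mean and the variance equal 3, matching the closed-form expressions above.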


The exponential distribution (a.k.a. negative exponential distribution) is the probability distribution that describes the time between events in a Poisson process, i.e. a process in which events occur continuously and independently at a constant average rate. It is a particular case of the gamma distribution. It is the continuous analogue of the geometric distribution, and it has the key property of being memoryless.

Probability density function

The probability density function (pdf) of an exponential distribution is

[[math]] f(x;\lambda) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0, \\ 0 & x \lt 0. \end{cases}[[/math]]

Here [math]\lambda \gt 0[/math] is the parameter of the distribution, often called the rate parameter. The distribution is supported on the interval [math][0, \infty)[/math]. If a random variable [math]X[/math] has this distribution, we write [math]X \sim \operatorname{Exp}(\lambda).[/math]

Cumulative distribution function

The cumulative distribution function is given by

[[math]]F(x;\lambda) = \begin{cases} 1-e^{-\lambda x} & x \ge 0, \\ 0 & x \lt 0. \end{cases}[[/math]]

Alternative parameterization

A commonly used alternative parametrization is to define the probability density function of an exponential distribution as

[[math]]f(x;\theta) = \begin{cases} \frac{1}{\theta} e^{-\frac{x}{\theta}} & x \ge 0, \\ 0 & x \lt 0. \end{cases}[[/math]]

where [math]\theta \gt 0[/math] is the mean, standard deviation, and scale parameter of the distribution, and is the reciprocal of the rate parameter [math]\lambda[/math] defined above. In this specification, [math]\theta[/math] is a survival parameter in the sense that if a random variable [math]X[/math] is the duration of time that a given biological or mechanical system manages to survive and [math]X \sim \operatorname{Exp}(\theta)[/math], then [math]\operatorname{E}[X] = \theta[/math]. That is to say, the expected duration of survival of the system is [math]\theta[/math] units of time. The parametrization involving the rate parameter arises in the context of events arriving at a rate [math]\lambda[/math], when the time between events (which might be modeled using an exponential distribution) has a mean of [math]\theta = \lambda^{-1}[/math].

This alternative parametrization is the one to expect on the exam.


The mean or expected value of an exponentially distributed random variable [math]X[/math] with rate parameter [math]\lambda[/math] is given by [math]\operatorname{E}[X] = 1/\lambda = \theta[/math] and the variance is given by [math]\operatorname{Var}[X] = 1/\lambda^2 = \theta^2[/math]. In fact, the moment generating function of an exponential distribution is given by

[[math]] \operatorname{E}[e^{tX}] = \lambda \int_{0}^{\infty}e^{-x(\lambda - t)} dx = \frac{\lambda}{\lambda - t}, \textrm{for} \ \ t \lt \lambda [[/math]]

which in turn gives the raw moments [math]\operatorname{E}[X^n] = \theta^n n!.[/math]


An exponentially distributed random variable [math]T[/math] obeys the relation

[[math]]\operatorname{P} \left (T \gt s + t | T \gt s \right ) = \operatorname{P}(T \gt t), \qquad \forall s, t \ge 0.[[/math]]

When [math]T[/math] is interpreted as the waiting time for an event to occur relative to some initial time, this relation implies that, if [math]T[/math] is conditioned on a failure to observe the event over some initial period of time [math]s[/math], the distribution of the remaining waiting time is the same as the original unconditional distribution. For example, if an event has not occurred after 30 seconds, the conditional probability that occurrence will take at least 10 more seconds is equal to the unconditional probability of observing the event more than 10 seconds relative to the initial time.
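
The 30-second example can be checked directly from the survival function [math]\operatorname{P}(T \gt t) = e^{-t/\theta}[/math]. The sketch below is a minimal illustration; the function names and the mean waiting time of 20 seconds are assumptions.

```python
import math


def exp_survival(t: float, theta: float) -> float:
    """P(T > t) for an exponential with mean theta: 1 - F(t) = exp(-t / theta)."""
    return math.exp(-t / theta) if t >= 0.0 else 1.0


def conditional_survival(s: float, t: float, theta: float) -> float:
    """P(T > s + t | T > s) = P(T > s + t) / P(T > s)."""
    return exp_survival(s + t, theta) / exp_survival(s, theta)


theta = 20.0  # hypothetical mean waiting time in seconds
lhs = conditional_survival(30.0, 10.0, theta)  # 10 more seconds, given 30 have passed
rhs = exp_survival(10.0, theta)                # unconditional 10 seconds
```

The two quantities agree up to rounding because [math]e^{-(s+t)/\theta} / e^{-s/\theta} = e^{-t/\theta}[/math] for every [math]s \ge 0[/math].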


The gamma distribution is a two-parameter family of continuous probability distributions. The common exponential distribution and chi-squared distribution are special cases of the gamma distribution.

A random variable [math]X[/math] that is gamma-distributed with shape [math]\alpha[/math] and scale [math]\theta[/math] is denoted by

[[math]]X \sim \Gamma(\alpha, \theta) \equiv \textrm{Gamma}(\alpha, \theta).[[/math]]

If [math]\alpha[/math] is a positive integer, then the distribution represents an Erlang distribution; i.e., the sum of [math]\alpha[/math] independent exponentially distributed random variables, each of which has a mean of [math]\theta[/math].

Probability Density Function

The probability density function using the shape-scale parametrization is

[[math]]f(x;\alpha,\theta) = \frac{x^{\alpha-1}e^{-\frac{x}{\theta}}}{\theta^{\alpha}\Gamma(\alpha)} \quad \text{ for } x \gt 0 \text{ and } \alpha, \theta \gt 0.[[/math]]

Here [math]\Gamma(\alpha)[/math] is the gamma function evaluated at [math]\alpha[/math].

Cumulative Distribution Function

The cumulative distribution function is the regularized gamma function:

[[math]] F(x;\alpha,\theta) = \int_0^x f(u;\alpha,\theta)\,du = \frac{\gamma\left(\alpha, \frac{x}{\theta}\right)}{\Gamma(\alpha)}[[/math]]

where [math]\gamma\left(\alpha, \frac{x}{\theta}\right)[/math] is the lower incomplete gamma function.

Mean and Variance

If [math]X [/math] has a Gamma( [math]\alpha[/math], [math]\theta[/math] ) distribution, then [math]\operatorname{E}[X] = \alpha \theta [/math] and [math]\operatorname{Var}[X] = \alpha \theta ^2 [/math].
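
The shape-scale density and the moment formulas can be sketched with the standard library alone. The function names and example values are assumptions; the density is evaluated on the log scale with `math.lgamma` to avoid overflow for large [math]\alpha[/math], which is a design choice of this sketch rather than something the text prescribes.

```python
import math


def gamma_pdf(x: float, alpha: float, theta: float) -> float:
    """Density x^(alpha-1) * exp(-x/theta) / (theta^alpha * Gamma(alpha)) for x > 0."""
    if alpha <= 0.0 or theta <= 0.0:
        raise ValueError("alpha and theta must be positive")
    if x <= 0.0:
        return 0.0
    log_pdf = ((alpha - 1.0) * math.log(x) - x / theta
               - alpha * math.log(theta) - math.lgamma(alpha))
    return math.exp(log_pdf)


def gamma_mean_var(alpha: float, theta: float):
    """E[X] = alpha * theta and Var[X] = alpha * theta^2."""
    return alpha * theta, alpha * theta**2


# With alpha = 1 the density reduces to the exponential density (1/theta) * exp(-x/theta)
gamma_value = gamma_pdf(3.0, 1.0, 2.0)
exp_value = 0.5 * math.exp(-1.5)

# Hypothetical severity example: alpha = 2.5, theta = 400
gamma_mean, gamma_var = gamma_mean_var(2.5, 400.0)
```

The comparison with the exponential density illustrates the special case mentioned at the start of this section.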


If [math]X_i[/math] has a Gamma([math]\alpha_i[/math], [math]\theta[/math]) distribution for [math]i = 1,\ldots,N[/math] (i.e., all distributions have the same scale parameter [math]\theta[/math]), then

[[math]] \sum_{i=1}^N X_i \sim\mathrm{Gamma} \left( \sum_{i=1}^N \alpha_i, \theta \right)[[/math]]

provided all [math]X_i[/math] are independent.


The Weibull distribution is a continuous probability distribution parametrized by a shape parameter [math]\tau \gt 0[/math] and a scale parameter [math] \theta \gt 0 [/math].

Probability Density Function

The probability density function of a Weibull random variable is:[5]

[[math]] f(x;\tau,\theta) = \begin{cases} \frac{\tau}{\theta}\left(\frac{x}{\theta}\right)^{\tau-1}e^{-(x/\theta)^{\tau}} & x\geq0 ,\\ 0 & x\lt0. \end{cases}[[/math]]

Cumulative Distribution Function

The cumulative distribution function for the Weibull distribution is

[[math]] F(x;\tau,\theta) = 1- e^{-(x/\theta)^\tau}\, [[/math]]

for [math]x \geq 0[/math], and 0 otherwise.


The moment generating function of the logarithm of a Weibull distributed random variable is given by[1]

[[math]]E\left[e^{t\log X}\right] = \theta^t\Gamma\left(\frac{t}{\tau}+1\right)[[/math]]

where [math]\Gamma[/math] is the gamma function.

In particular, the kth raw moment of [math]X[/math] is given by

[[math]]\operatorname{E}\left[X^k\right] = \theta^k \Gamma\left(1+\frac{k}{\tau}\right).[[/math]]

The mean and variance of a Weibull random variable can be expressed as

[[math]]\operatorname{E}(X) = \theta \Gamma\left(1+\frac{1}{\tau}\right)\,[[/math]]


[[math]]\operatorname{Var}(X) = \theta^2\left[\Gamma\left(1+\frac{2}{\tau}\right) - \left(\Gamma\left(1+\frac{1}{\tau}\right)\right)^2\right]\,.[[/math]]
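
These gamma-function expressions are straightforward to evaluate with `math.gamma`. The sketch below uses hypothetical function names and example parameters; setting [math]\tau = 1[/math] gives a handy sanity check, since the Weibull distribution then reduces to an exponential distribution with mean [math]\theta[/math].

```python
import math


def weibull_raw_moment(k: int, tau: float, theta: float) -> float:
    """E[X^k] = theta^k * Gamma(1 + k / tau)."""
    if tau <= 0.0 or theta <= 0.0:
        raise ValueError("tau and theta must be positive")
    return theta**k * math.gamma(1.0 + k / tau)


def weibull_mean_var(tau: float, theta: float):
    """Mean and variance from the first two raw moments."""
    m1 = weibull_raw_moment(1, tau, theta)
    m2 = weibull_raw_moment(2, tau, theta)
    return m1, m2 - m1**2


# tau = 1 reduces to an exponential with mean theta and variance theta^2
w_mean, w_var = weibull_mean_var(1.0, 5.0)

# tau = 2, theta = 1: the mean is Gamma(1.5) = sqrt(pi) / 2
w_mean2, _ = weibull_mean_var(2.0, 1.0)
```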

Burr Distribution

The Burr Type XII distribution or simply the Burr distribution[6] is a continuous probability distribution for a non-negative random variable. It is most commonly used to model household income.

Probability Density Function

The Burr (Type XII) distribution has probability density function:

[[math]]f(x;\gamma,\alpha,\theta) = \alpha \theta^{-1} \gamma \frac{(x/\theta)^{\gamma-1}}{(1+(x/\theta)^{\gamma})^{\alpha+1}}\![[/math]]

and cumulative distribution function:

[[math]]F(x;\gamma,\alpha,\theta) = 1-\left(1+ (x/\theta)^{\gamma}\right)^{-\alpha} .[[/math]]


The raw moments of a Burr distribution can be expressed in terms of the Beta function [math]\operatorname{B}[/math]:

[[math]] \begin{align*} \operatorname{E}[X^k] = \int_0^{\infty}x^k f(x;\gamma,\alpha,\theta) dx &= \alpha \theta^k \int_0^{\infty} u^{k/\gamma} (1 + u)^{-(\alpha + 1)} du \\ &= \alpha \theta^k \int_0^1 (1-v)^{k/\gamma}v^{\alpha - k/\gamma - 1} dv \\ &= \alpha \theta^k \operatorname{B}(k/\gamma + 1,\alpha - k/\gamma) \end{align*} [[/math]]

To be consistent with [7], we can also express the raw moments in terms of the Gamma function:

[[math]] \operatorname{E}[X^k] = \alpha \theta^k \operatorname{B}(k/\gamma + 1,\alpha - k/\gamma) = \frac{\alpha \theta^k \Gamma(k/\gamma + 1)\Gamma(\alpha - k/\gamma)}{\Gamma(\alpha + 1)} = \frac{\theta^k \Gamma(k/\gamma + 1) \Gamma(\alpha - k/\gamma)}{\Gamma(\alpha)}. [[/math]]
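
The gamma-function form translates directly into code. In the sketch below the function and parameter names are assumptions, and the moment is reported as infinite when [math]k \ge \alpha\gamma[/math], the condition under which the defining integral diverges (the text above does not state this condition explicitly). Working on the log scale with `math.lgamma` is a choice of this sketch to limit overflow.

```python
import math


def burr_raw_moment(k: int, alpha: float, shape_gamma: float, theta: float) -> float:
    """E[X^k] = theta^k * Gamma(k/gamma + 1) * Gamma(alpha - k/gamma) / Gamma(alpha).

    Returns inf when k >= alpha * gamma, where the integral diverges (assumed convention).
    """
    if alpha <= 0.0 or shape_gamma <= 0.0 or theta <= 0.0:
        raise ValueError("all parameters must be positive")
    r = k / shape_gamma
    if r >= alpha:
        return math.inf
    log_moment = (k * math.log(theta) + math.lgamma(r + 1.0)
                  + math.lgamma(alpha - r) - math.lgamma(alpha))
    return math.exp(log_moment)


# Hypothetical example: gamma = 1, alpha = 3, theta = 2, so the mean is theta / (alpha - 1)
burr_mean = burr_raw_moment(1, 3.0, 1.0, 2.0)
burr_second = burr_raw_moment(2, 3.0, 1.0, 2.0)
```

With [math]\gamma = 1[/math] the formula simplifies to [math]\theta\,\Gamma(2)\Gamma(\alpha-1)/\Gamma(\alpha) = \theta/(\alpha-1)[/math], so the example mean equals 1.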


  1. Johnson, Kotz & Balakrishnan 1994
  2. Normal Distribution, Gale Encyclopedia of Psychology
  3. Casella & Berger (2001, p. 102)
  4. Heyde, CC. (1963), "On a property of the lognormal distribution", Journal of the Royal Statistical Society, Series B (Methodological), 25 (2): 392–393, doi:10.1007/978-1-4419-5823-5_6
  5. Papoulis, Athanasios; Pillai, S. Unnikrishna (2002). Probability, Random Variables, and Stochastic Processes (4th ed.). Boston: McGraw-Hill. ISBN 0-07-366011-6.
  6. Burr, I. W. (1942). "Cumulative frequency functions". Annals of Mathematical Statistics 13 (2): 215–232. doi:10.1214/aoms/1177731607. 

Wikipedia References