# Aggregate Models

An insurance company will sell insurance policies that enable the purchaser to make a claim when a certain well-defined event occurs within a specified time period. An actuary will be called upon to examine the distributional (stochastic) properties of portfolios of such policies. We consider two basic approaches: the individual risk model and the collective risk model. The individual risk model is less granular than the collective risk model since it doesn't try to model claim frequency and claim size (severity) separately. We will denote by $S$ the random variable representing the aggregate claims (total loss to the insurer) associated with a portfolio of policies.

## The Individual Risk Model

In the individual risk model, we let $X_i$ represent the claim (loss) size associated with the $i\textrm{th}$ policy in a portfolio of $n$ policies:

[$]S = \sum_{i=1}^n X_i \, .[$]

It is assumed that $X_1,\ldots,X_n$ are mutually independent but not necessarily identically distributed.
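When each $X_i$ takes values on a common grid, the distribution of $S$ can be obtained exactly by repeated convolution. The following is a minimal sketch under that assumption; the three-policy portfolio and the names `convolve` and `aggregate_pmf` are hypothetical and serve only as illustration.

```python
def convolve(p, q):
    """PMF of the sum of two independent non-negative integer-valued losses."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def aggregate_pmf(policies):
    """PMF of S = X_1 + ... + X_n for independent policies.

    Each policy is a list whose entry k is P[X_i = k] (losses in whole units).
    """
    pmf = [1.0]  # empty sum: S = 0 with probability 1
    for policy in policies:
        pmf = convolve(pmf, policy)
    return pmf


# Hypothetical portfolio: independent, but not identically distributed.
policies = [
    [0.9, 0.1],       # loss of 1 with probability 0.1
    [0.8, 0.0, 0.2],  # loss of 2 with probability 0.2
    [0.7, 0.2, 0.1],  # loss of 1 or 2
]
pmf_S = aggregate_pmf(policies)
mean_S = sum(k * p for k, p in enumerate(pmf_S))
```

The mean of $S$ computed this way equals the sum of the individual means, as expected for any sum of random variables.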

## The Collective Risk Model

In the collective risk model, we let $N$ denote the random variable representing the number of individual claims and let $Y_i$ represent the claim (loss) size associated with the $i\textrm{th}$ individual claim:

[$]S = \sum_{i=1}^N Y_i \,.[$]

The following additional constraints are also assumed:

• $Y_1, Y_2, \ldots$ are mutually independent and identically distributed.
• $N$ is independent of each $Y_i$.
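The structure of the collective risk model can be illustrated by simulation: first draw the claim count $N$, then draw $N$ independent severities and add them. The sketch below assumes a small hypothetical frequency distribution and exponentially distributed severities; neither choice comes from the text.

```python
import random


def simulate_aggregate_loss(freq_pmf, severity_mean, rng):
    """Draw one value of S = Y_1 + ... + Y_N.

    freq_pmf[k] = P[N = k]. Severities are exponential with the given mean,
    an illustrative assumption; any i.i.d. severity law would do.
    """
    n = rng.choices(range(len(freq_pmf)), weights=freq_pmf)[0]
    return sum(rng.expovariate(1.0 / severity_mean) for _ in range(n))


rng = random.Random(12345)            # fixed seed for reproducibility
freq_pmf = [0.5, 0.3, 0.15, 0.05]     # hypothetical P[N = 0], ..., P[N = 3]
draws = [simulate_aggregate_loss(freq_pmf, 100.0, rng) for _ in range(50_000)]

sample_mean = sum(draws) / len(draws)
share_zero = sum(1 for s in draws if s == 0) / len(draws)
```

With these inputs $\operatorname{E}[N] = 0.75$, so the sample mean should be close to $75$, and about half of the simulated periods should show no claims at all.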

### Relation to Compound Distributions

A compound probability distribution is the probability distribution that results from assuming that a random variable is distributed according to some parametrized distribution, with the parameters of that distribution being assumed to be themselves random variables. The compound distribution is the result of marginalizing over the intermediate random variables that represent the parameters of the initial distribution. An important type of compound distribution occurs when the parameter being marginalized over represents the number of random variables in a summation of random variables as is the case with the collective risk model.

### Mean and Variance

Mean and variance of the compound distribution derive in a simple way from the law of total expectation and the law of total variance. The mean equals

[$] \mu_S = \operatorname{E}[S] = \operatorname{E}[N]\operatorname{E}[Y_i] = \mu_N \mu_Y [$]

and the variance equals

[$] \begin{align*} \sigma^2_S = \operatorname{Var}[S] &= \operatorname{E}[N]\operatorname{Var}[Y_i] + \operatorname{E}[Y_i]^2 \operatorname{Var}[N] \\ &=\mu_N \sigma^2_Y + \mu^2_Y \sigma_N^2. \end{align*} [$]
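Both formulas can be checked numerically on a small example by building the exact distribution of $S$ through conditioning on $N$. This is a sketch only; the frequency and severity distributions are hypothetical, and the helper names are not from the text.

```python
def convolve(p, q):
    """PMF of the sum of two independent non-negative integer-valued variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def compound_pmf(freq_pmf, sev_pmf):
    """PMF of S = sum over n of P[N = n] times the n-fold convolution of sev_pmf."""
    total = [0.0]
    power = [1.0]  # 0-fold convolution: point mass at 0
    for p_n in freq_pmf:
        total += [0.0] * (len(power) - len(total))
        for k, v in enumerate(power):
            total[k] += p_n * v
        power = convolve(power, sev_pmf)
    return total


def moments(pmf):
    """Mean and variance of a distribution on 0, 1, 2, ..."""
    mean = sum(k * p for k, p in enumerate(pmf))
    second = sum(k * k * p for k, p in enumerate(pmf))
    return mean, second - mean ** 2


freq_pmf = [0.5, 0.3, 0.2]      # hypothetical P[N = 0], P[N = 1], P[N = 2]
sev_pmf = [0.0, 0.5, 0.3, 0.2]  # hypothetical P[Y = 0], ..., P[Y = 3]

mu_N, var_N = moments(freq_pmf)
mu_Y, var_Y = moments(sev_pmf)
pmf_S = compound_pmf(freq_pmf, sev_pmf)
mean_S, var_S = moments(pmf_S)
```

Here `mean_S` agrees with `mu_N * mu_Y`, and `var_S` agrees with `mu_N * var_Y + mu_Y**2 * var_N`, up to floating-point error.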

### Moment Generating Function

The moment generating function of $S$ can be expressed in terms of the moment generating functions of $Y$ and $N$:

[$] M_{S}(t) = M_N\left(\ln\left(M_Y(t)\right)\right). [$]
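The identity follows from conditioning on $N$. A brief sketch of the argument, using only that the $Y_i$ are i.i.d. and independent of $N$:

[$] \begin{align*} M_S(t) = \operatorname{E}\left[e^{tS}\right] &= \operatorname{E}\left[\operatorname{E}\left[e^{t(Y_1+\cdots+Y_N)} \mid N\right]\right] \\ &= \operatorname{E}\left[M_Y(t)^N\right] = \operatorname{E}\left[e^{N \ln M_Y(t)}\right] = M_N\left(\ln M_Y(t)\right). \end{align*} [$]

The same argument applied to $t^S$ instead of $e^{tS}$ gives the probability generating function $P_S(t) = P_N(P_Y(t))$.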

## Compound Poisson Distribution

If the claim frequency $N$ has a Poisson distribution with mean $\theta$, then $S$ is said to have a compound Poisson distribution.

### Properties

We have (see Mean and Variance)

[$] \mu_S = \mu_N \mu_Y,\,\sigma^2_S = \mu_N \sigma^2_Y + \mu^2_Y \sigma_N^2. [$]

Since $\operatorname{E}[N] = \operatorname{Var}[N] = \theta$, these formulae reduce to:

[$]\mu_S = \theta \mu_Y,\,\sigma^2_S = \theta \operatorname{E}\left[Y^2\right].[$]

The probability generating function of $S$ has a simple representation in terms of the probability generating function of $Y$:

[$]P_S(t) = \textrm{e}^{\theta(P_Y(t) - 1)}.[$]
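As a numerical illustration, the closed form for the probability generating function can be compared with the defining series $\operatorname{E}[P_Y(t)^N]$, summed directly over the Poisson probabilities. The values of `theta`, `sev_pmf` and `t` below are hypothetical, and the function names are not from the text.

```python
import math


def pgf(pmf, t):
    """Probability generating function of a distribution on 0, 1, 2, ..."""
    return sum(p * t ** k for k, p in enumerate(pmf))


def compound_poisson_pgf(theta, sev_pmf, t):
    """Closed form: exp(theta * (P_Y(t) - 1))."""
    return math.exp(theta * (pgf(sev_pmf, t) - 1.0))


def compound_poisson_pgf_series(theta, sev_pmf, t, n_max=60):
    """E[P_Y(t)^N], summed term by term and truncated after n_max claims."""
    py = pgf(sev_pmf, t)
    total, p_n = 0.0, math.exp(-theta)  # p_n starts at P[N = 0]
    for n in range(n_max + 1):
        total += p_n * py ** n
        p_n *= theta / (n + 1)          # Poisson step: p_{n+1} = p_n * theta / (n + 1)
    return total


theta = 2.0
sev_pmf = [0.0, 0.5, 0.3, 0.2]  # hypothetical P[Y = 0], ..., P[Y = 3]

closed = compound_poisson_pgf(theta, sev_pmf, 0.5)
series = compound_poisson_pgf_series(theta, sev_pmf, 0.5)

mu_Y = sum(k * p for k, p in enumerate(sev_pmf))
second_moment_Y = sum(k * k * p for k, p in enumerate(sev_pmf))
mean_S = theta * mu_Y              # theta * E[Y]
var_S = theta * second_moment_Y    # theta * E[Y^2]
```

For these inputs the two evaluations of the generating function agree to machine precision.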

## Compound Negative Binomial Distribution

If the claim frequency $N$ has a negative binomial distribution with parameters $r$ and $\beta$, then $S$ is said to have a compound negative binomial distribution.

### Properties

Since $\mu_N = r\beta$ and $\sigma^2_{N} = \mu_N (1 + \beta)$, then

[$]\mu_S = r\beta \mu_Y,\,\sigma^2_S = r\beta \left(\beta \operatorname{E}[Y]^2 + \operatorname{E}[Y^2] \right).[$]

The probability generating function of $S$ equals

[$]P_S(t) = [1 - \beta(P_Y(t) - 1)]^{-r}.\,[$]

## Compound Binomial Distribution

If the claim frequency $N$ has a binomial distribution with size parameter $n$ and success probability $p$, then $S$ is said to have a compound binomial distribution.

### Properties

Since $\mu_N = pn$ and $\sigma^2_{N} = \mu_N (1-p)$, then

[$]\mu_S = np \mu_Y,\,\sigma^2_S = np \left(\operatorname{E}[Y^2] - p \operatorname{E}[Y]^2 \right).[$]

The probability generating function of $S$ can be computed:

[$]P_S(t) = [1 + p(P_Y(t) - 1)]^{n}.\,[$]
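The variance formulas for the compound Poisson, negative binomial and binomial cases are all instances of the general formula $\sigma^2_S = \mu_N \sigma^2_Y + \mu^2_Y \sigma_N^2$. The sketch below checks this with hypothetical parameter values; `compound_moments` is an illustrative helper, not a library function.

```python
def compound_moments(mu_N, var_N, mu_Y, second_moment_Y):
    """Mean and variance of S from the first two moments of N and Y."""
    var_Y = second_moment_Y - mu_Y ** 2
    return mu_N * mu_Y, mu_N * var_Y + mu_Y ** 2 * var_N


# Hypothetical severity moments: E[Y] and E[Y^2].
mu_Y, ey2 = 1.7, 3.5

# Hypothetical frequency parameters.
theta = 2.0          # Poisson mean
r, beta = 3.0, 0.5   # negative binomial
n, p = 10, 0.2       # binomial

poisson = compound_moments(theta, theta, mu_Y, ey2)
negbin = compound_moments(r * beta, r * beta * (1 + beta), mu_Y, ey2)
binom = compound_moments(n * p, n * p * (1 - p), mu_Y, ey2)

# Closed-form variances as stated for each special case.
poisson_var = theta * ey2
negbin_var = r * beta * (beta * mu_Y ** 2 + ey2)
binom_var = n * p * (ey2 - p * mu_Y ** 2)
```

Relative to the Poisson case with the same $\mu_N$, the negative binomial adds the term $\mu_N \beta \mu_Y^2$ to the variance and the binomial subtracts $\mu_N p \mu_Y^2$.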

## Panjer Recursion

The Panjer recursion is an algorithm for computing an approximation to the probability distribution of a compound random variable $S = \sum_{i=1}^N X_i\,$, where $N\,$ and the $X_i\,$ are random variables of special types. In more general cases the distribution of $S$ is a compound distribution. The recursion for the special cases considered was introduced in a paper by Harry Panjer (Distinguished Emeritus Professor, University of Waterloo). The content below is based primarily on the Wikipedia article on the Panjer recursion and on an arXiv preprint.

### Preliminaries

We are interested in the compound random variable $S = \sum_{i=1}^N X_i\,$ where $N\,$ and $X_i\,$ fulfill the following preconditions.

#### Claim size distribution

We assume the $X_i\,$ to be i.i.d. and independent of $N\,$. Furthermore, the $X_i\,$ have to be distributed on a lattice $\delta \mathbb{N}_0\,$ with lattice width $\delta \gt0\,$. We write

[$]f_k = P[X_i = \delta k].\,[$]

In actuarial practice, the distribution of $X_i\,$ is obtained by discretising the claim size distribution, for example with the rounding, upper, or lower methods.

#### Claim number distribution

The number of claims $N$ is a random variable, said to have a "claim number distribution", which takes the values $0, 1, 2, \ldots$. For the Panjer recursion, the probability distribution of $N$ has to be a member of the Panjer class, otherwise known as the (a,b,0) class of distributions. This class consists of all counting random variables which fulfill the following relation:

[$]P[N=k] = p_k= \left(a + \frac{b}{k} \right) \cdot p_{k-1},~~k \ge 1.\, [$]

for some $a$ and $b$ which fulfill $a+b \ge 0\,$. The initial value $p_0\,$ is determined such that $\sum_{k=0}^\infty p_k = 1.\,$
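The recursion can be used directly to generate the probabilities $p_k$. The sketch below uses the standard $(a, b, p_0)$ values for the Poisson, binomial and negative binomial distributions (the latter in the $r$, $\beta$ parametrization used above); the numerical parameter values are hypothetical.

```python
import math


def ab0_pmf(a, b, p0, k_max):
    """Probabilities p_0, ..., p_{k_max} from p_k = (a + b/k) * p_{k-1}."""
    probs = [p0]
    for k in range(1, k_max + 1):
        probs.append((a + b / k) * probs[-1])
    return probs


# Poisson with mean lam: a = 0, b = lam, p_0 = exp(-lam).
lam = 2.0
poisson_pmf = ab0_pmf(0.0, lam, math.exp(-lam), 10)

# Binomial(n, p): a = -p/(1-p), b = (n+1)p/(1-p), p_0 = (1-p)^n.
n, p = 5, 0.3
binomial_pmf = ab0_pmf(-p / (1 - p), (n + 1) * p / (1 - p), (1 - p) ** n, n)

# Negative binomial(r, beta): a = beta/(1+beta), b = (r-1)beta/(1+beta),
# p_0 = (1+beta)^(-r).
r, beta = 3.0, 0.5
negbin_pmf = ab0_pmf(beta / (1 + beta), (r - 1) * beta / (1 + beta),
                     (1 + beta) ** (-r), 200)
```

In each case $a + b \ge 0$, and the binomial is the only family with $a \lt 0$.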

The Panjer recursion makes use of this iterative relationship to specify a recursive way of constructing the probability distribution of $S$. In the following, $W_N(x)\,$ denotes the probability generating function of $N$; closed forms for the members of the class are tabulated under the (a,b,0) class of distributions.

### Recursion

The algorithm now gives a recursion to compute the $h_k =P[S = \delta k] \,$. The starting value is $h_0 = W_N(f_0)\,$ with the special cases

[$]h_0=p_0\cdot \exp(f_0 b) \quad \text{ if } \quad a = 0,\,[$]

and

[$]h_0=\frac{p_0}{(1-f_0a)^{1+b/a}} \quad \text{ for } \quad a \ne 0,\,[$]

and the recursion proceeds, for $k \ge 1$, with

[$]h_k=\frac{1}{1-f_0a}\sum_{j=1}^k \left( a+\frac{b\cdot j}{k} \right) \cdot f_j \cdot h_{k-j}.\,[$]

Panjer recursion algorithm (quantile estimation):
1. Initialization: calculate $f_0$ and $h_0$, and set $H_0=h_0$ and $n=1$.
2. Calculate $h_n=\frac{1}{1-af_0}\sum_{j=1}^{n}\left(a+\frac{bj}{n}\right)f_jh_{n-j}$.
3. Calculate $H_n=H_{n-1}+h_n$.
4. If $H_n$ is larger than the required quantile level $\alpha$, e.g. $\alpha=0.999$, stop: the estimate of the quantile $q_\alpha$ is $n\times\delta$. Otherwise increase $n$ by one and return to step 2.
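The steps above can be sketched in a few lines of Python. This is an illustrative implementation, not a reference one: the function name `panjer_quantile`, the safety cap `n_max` and the compound Poisson example are assumptions made for the sketch.

```python
import math


def panjer_quantile(a, b, p0, f, delta, alpha, n_max=100_000):
    """Panjer recursion for an (a,b,0) frequency and lattice severity f.

    f[k] = P[X = k * delta]. Returns (h, q), where h[k] = P[S = k * delta]
    and q = n * delta for the first n whose cumulative probability H_n
    exceeds alpha (or n_max * delta if the cap is reached first).
    """
    f0 = f[0]
    # Starting value h_0 = W_N(f_0), using the two special cases.
    if a == 0:
        h0 = p0 * math.exp(f0 * b)
    else:
        h0 = p0 / (1.0 - f0 * a) ** (1.0 + b / a)

    h, H, n = [h0], h0, 0
    while H <= alpha and n < n_max:
        n += 1
        total = 0.0
        # f_j = 0 beyond the end of the list, so the sum can stop there.
        for j in range(1, min(n, len(f) - 1) + 1):
            total += (a + b * j / n) * f[j] * h[n - j]
        h.append(total / (1.0 - a * f0))
        H += h[-1]
    return h, n * delta


# Hypothetical compound Poisson example: a = 0, b = theta, p_0 = exp(-theta).
theta = 2.0
f = [0.0, 0.5, 0.3, 0.2]
h, q = panjer_quantile(0.0, theta, math.exp(-theta), f, delta=1.0, alpha=0.999)
```

The first values can be checked by hand: $h_1 = \operatorname{P}[N=1] f_1$ and $h_2 = \operatorname{P}[N=1] f_2 + \operatorname{P}[N=2] f_1^2$.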

## Panjer Extensions

The Panjer recursion formula can be extended to the $(a,b,1)$ class of frequency distributions. A distribution is said to be in the $(a,b,1)$ Panjer class if it satisfies

[$] p_n=\left(a+\frac{b}{n}\right)p_{n-1}, \quad \mbox{for}\quad n\geq 2\quad \mbox{and}\quad a,b\in \mathbb{R}. [$]

For frequency distributions in the $(a,b,1)$ class, the recursion becomes

[$] \begin{eqnarray*} h_n&=&\frac{(p_1-(a+b)p_0)f_n+\sum_{j=1}^{n}\left(a+bj/n\right)f_j h_{n-j}}{1-af_0}, \quad n\geq 1, \nonumber \\ h_0&=&\sum\limits_{k = 0}^\infty {(f_0)^k p_k}. \end{eqnarray*} [$]
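A sketch of the extended recursion follows, with a zero-truncated Poisson frequency ($p_0 = 0$, $a = 0$, $b = \lambda$) as a hypothetical example. The function name, the truncation of the series for $h_0$ at `k_max` terms and the parameter values are assumptions of the sketch.

```python
import math


def panjer_ab1(a, b, p0, p1, f, n_terms, k_max=200):
    """Extended Panjer recursion for an (a,b,1) frequency distribution.

    f[k] = P[X = k * delta]; returns h[0..n_terms] with h[k] = P[S = k * delta].
    """
    f0 = f[0]
    # h_0 = sum_k f_0^k p_k, truncated after k_max terms.
    h0, pk, f0k = p0, p1, f0
    for k in range(1, k_max + 1):
        h0 += f0k * pk
        pk *= a + b / (k + 1)  # p_{k+1} = (a + b/(k+1)) p_k, valid for k + 1 >= 2
        f0k *= f0

    h = [h0]
    for n in range(1, n_terms + 1):
        fn = f[n] if n < len(f) else 0.0
        total = (p1 - (a + b) * p0) * fn
        for j in range(1, min(n, len(f) - 1) + 1):
            total += (a + b * j / n) * f[j] * h[n - j]
        h.append(total / (1.0 - a * f0))
    return h


# Zero-truncated Poisson: p_0 = 0, p_1 = lam * exp(-lam) / (1 - exp(-lam)).
lam = 2.0
p1 = lam * math.exp(-lam) / (1.0 - math.exp(-lam))
f = [0.0, 0.5, 0.3, 0.2]  # hypothetical severity on {0, 1, 2, 3}
h = panjer_ab1(0.0, lam, 0.0, p1, f, n_terms=60)
```

Since $f_0 = 0$ and $p_0 = 0$ in this example, $h_0 = 0$: with at least one claim and strictly positive claim sizes, the aggregate loss cannot be zero.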

## Discretization

### Method of Rounding

Severity distributions are typically continuous, so a discretisation step is required before the Panjer recursion can be applied. To concentrate the severity, whose continuous distribution function is $F(x)$, on $\{0,\delta,2\delta,\ldots\}$, one can choose $\delta\gt0$ and use the central difference approximation

[$] \begin{eqnarray*} f_0&=&F(\delta/2),\\ f_n&=&F(n\delta+\delta/2)-F(n\delta-\delta/2),\quad n=1,2,\ldots\;. \end{eqnarray*} [$]

The compound discrete density $h_n$ is then calculated using the Panjer recursion, and the compound distribution function is obtained as $H_n=\sum_{i=0}^n h_i$.

Discretisation can also be done via the forward and backward differences:

[$] \begin{eqnarray*} f_n^U=F(n\delta+\delta)-F(n\delta);\quad f_n^L=F(n\delta)-F(n\delta-\delta). \end{eqnarray*} [$]

Running the Panjer recursion with $f^U$ and $f^L$ in place of $f$ gives compound densities $h^U$ and $h^L$, and hence upper and lower bounds for the compound distribution:

[$] \begin{eqnarray*} H_n^U=\sum_{i=0}^n h_i^U;\quad H_n^L=\sum_{i=0}^n h_i^L. \end{eqnarray*} [$]
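The three discretisation schemes differ only in where the distribution function is evaluated. The sketch below applies them to an exponential severity, which is an assumption made for illustration; the function names and the `method` labels are hypothetical.

```python
import math


def exp_cdf(x, mean):
    """Distribution function of an exponential severity (illustrative choice)."""
    return 1.0 - math.exp(-x / mean) if x > 0 else 0.0


def discretize(cdf, delta, n_max, method="rounding"):
    """Lattice probabilities f_0, ..., f_{n_max} on {0, delta, 2*delta, ...}."""
    if method == "rounding":   # central difference
        f = [cdf(delta / 2)]
        f += [cdf(n * delta + delta / 2) - cdf(n * delta - delta / 2)
              for n in range(1, n_max + 1)]
    elif method == "upper":    # forward difference, f^U
        f = [cdf(n * delta + delta) - cdf(n * delta) for n in range(n_max + 1)]
    elif method == "lower":    # backward difference, f^L
        f = [cdf(n * delta) - cdf(n * delta - delta) for n in range(n_max + 1)]
    else:
        raise ValueError(f"unknown method: {method}")
    return f


def severity_cdf(x):
    return exp_cdf(x, 2.0)     # hypothetical mean claim size of 2


delta, n_max = 0.5, 20
f_round = discretize(severity_cdf, delta, n_max, "rounding")
f_upper = discretize(severity_cdf, delta, n_max, "upper")
f_lower = discretize(severity_cdf, delta, n_max, "lower")
```

The cumulative sums of `f_upper`, `f_round` and `f_lower` equal $F((n+1)\delta)$, $F((n+\tfrac{1}{2})\delta)$ and $F(n\delta)$ respectively, so the forward difference moves probability towards smaller claims and the backward difference towards larger ones.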

### Local Moment Matching

With the local moment matching method, the discretization is performed by matching $p$ moments of a discrete distribution to those of the claim size distribution on each interval in a collection of intervals of equal width.

We start with $\delta \gt 0$ and consider the distribution, say $F(x)$, of the claim size on consecutive intervals $[x_k,x_k + p\delta)$, where $x_{k+1} = x_k + p\delta$. On each interval, the discretization is found by determining probabilities $w^k_j$, $j=0,\ldots,p$, such that

[$] \sum_{j=0}^p (x_k + j\delta)^r w_j^k = \int_{x_k}^{x_k + p\delta}x^r \, dF(x) [$]

for all $0 \leq r \leq p$. The discretization sets the probability weight for the point $x_k + \delta j$, $0 \lt j \lt p$, equal to $w^k_j$, and sets the probability weight for the shared endpoint $x_{k+1} = x_k + p\delta$ to $w^k_p + w^{k+1}_0$. By setting $r=0$ and summing over $k$, we see that the discretization method produces a valid probability distribution. Furthermore, the resulting discrete probability distribution matches the first $p$ moments of the distribution $F(x)$.
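For $p = 1$ the linear system on each interval has two unknowns and can be solved by hand: with $m_0$ and $m_1$ the zeroth and first partial moments of $F$ on $[x_k, x_k+\delta)$, one gets $w^k_1 = (m_1 - x_k m_0)/\delta$ and $w^k_0 = m_0 - w^k_1$. The sketch below does this for an exponential severity with $x_k = k\delta$; the exponential assumption (used for closed-form partial moments), the function name and the parameter values are illustrative only.

```python
import math


def local_moment_matching_p1(mean, delta, n_intervals):
    """Local moment matching with p = 1 for an exponential severity.

    Returns lattice probabilities on {0, delta, ..., n_intervals * delta}
    that preserve the probability and the first moment on every interval.
    """
    f = [0.0] * (n_intervals + 1)
    for k in range(n_intervals):
        lo, hi = k * delta, (k + 1) * delta
        e_lo, e_hi = math.exp(-lo / mean), math.exp(-hi / mean)
        m0 = e_lo - e_hi                                # integral of dF over [lo, hi)
        m1 = (lo + mean) * e_lo - (hi + mean) * e_hi    # integral of x dF over [lo, hi)
        w1 = (m1 - lo * m0) / delta                     # weight placed at the right endpoint
        w0 = m0 - w1                                    # weight placed at the left endpoint
        f[k] += w0
        f[k + 1] += w1   # shared endpoint collects w_1^k + w_0^{k+1}
    return f


mean, delta, n_intervals = 2.0, 0.5, 40   # hypothetical values
f = local_moment_matching_p1(mean, delta, n_intervals)
total_mass = sum(f)
first_moment = sum(k * delta * fk for k, fk in enumerate(f))
```

Here `total_mass` equals $F(20)$ and `first_moment` equals $\int_0^{20} x \, dF(x)$, so both the probability and the mean over the discretised range are reproduced exactly, up to floating-point error.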