# Introduction to Credibility

Credibility describes an approach used by actuaries to improve statistical estimates. In a typical application of credibility, the actuary has an estimate $X$ based on a small set of data, and an estimate $M$ based on a larger but less relevant set of data. The credibility estimate is $ZX + (1-Z)M$, where $Z$ is a number between 0 and 1 (called the "credibility weight" or "credibility factor") calculated to balance the sampling error of $X$ against the possible lack of relevance (and therefore modeling error) of $M$. The case $Z = 1$ corresponds to full credibility, whereas $Z \lt 1$ corresponds to partial credibility.
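As a small illustration of the weighted average $ZX + (1-Z)M$, the following Python sketch blends two estimates. The function name `credibility_estimate` and the figures are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def credibility_estimate(x: float, m: float, z: float) -> float:
    """Blend the insured's own estimate x with the broader estimate m using weight z."""
    if not 0.0 <= z <= 1.0:
        raise ValueError("credibility weight z must lie in [0, 1]")
    return z * x + (1.0 - z) * m


# Hypothetical figures: the insured's own data suggests 1200,
# the larger but less relevant data set suggests 1000.
estimate = credibility_estimate(x=1200.0, m=1000.0, z=0.25)
print(estimate)  # 1050.0
```

With $Z = 1$ the function returns $X$ unchanged (full credibility), and with $Z = 0$ it returns $M$.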

When an insurance company calculates the premium it will charge, it divides the policy holders into groups. For example, it might divide motorists by age, sex, and type of car. The division is made balancing the two requirements that the risks in each group are sufficiently similar and the group sufficiently large that a meaningful statistical analysis of the claims experience can be done to calculate the premium. This compromise means that none of the groups contains only identical risks. The problem is then to devise a way of combining the experience of the group with the experience of the individual risk to calculate the premium better. Credibility theory provides a solution to this problem.

There are two broad approaches used in credibility theory to determine how the individual experience affects the premium: classical credibility, also referred to as limited fluctuations credibility, and greatest accuracy credibility, also referred to as European credibility. This page focuses on the former approach.

## Basic Variables of Interest

The basic variables of interest related to claims generated by an insured over a fixed time period are the following:

• Exposure level $E$ expressed in units of exposure, also sometimes referred to as the volume of the insured's experience. Typical examples of units of exposure are number of claims, number of employees, square footage of building(s), etc.
• Claim frequency $N$ (number of claims generated)
• Aggregate loss $S$
• Claim severity $S/N$ (average claim size)
• Pure premium $S/E$ (loss per unit of exposure)
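The relationships between these variables can be sketched in a few lines of Python. The claim sizes and the exposure level below are made-up values used only for illustration.

```python
# Hypothetical experience of one insured over a fixed period.
claim_sizes = [1200.0, 800.0, 2500.0, 500.0]  # individual claim amounts
exposure = 500.0                              # exposure level E, in units of exposure

claim_count = len(claim_sizes)            # claim frequency N
aggregate_loss = sum(claim_sizes)         # aggregate loss S
severity = aggregate_loss / claim_count   # claim severity S/N
pure_premium = aggregate_loss / exposure  # pure premium S/E

print(claim_count, aggregate_loss, severity, pure_premium)  # 4 5000.0 1250.0 10.0
```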

## Full Credibility

We are interested in finding a criterion that will establish whether the insured deserves full credibility ($Z = 1$). In classical credibility theory, the decision to assign full credibility to an insured is based solely on having an adequately large sample size. More precisely, suppose $X_1,\ldots,X_n$ denotes an i.i.d. sequence of random variables representing a random sample of a relevant variable of interest (claim size, claim frequency, etc.) associated with an insured. Classical credibility states that full credibility will be assigned when the following full credibility standard holds:

[$] \operatorname{P}\left[-r\operatorname{E}[\overline{X}] \leq \overline{X} - \operatorname{E}[\overline{X}] \leq r \operatorname{E}[\overline{X}]\right ] \geq p [$]

with $\overline{X}$ the sample mean, $0 \lt p \lt 1$ and $0 \lt r \lt 1$. Set $\mu = \operatorname{E}[\overline{X}] = \operatorname{E}[X_1]$ and $\sigma^2 = \operatorname{Var}[X_1] \lt\infty$. By the central limit theorem:

[$] \operatorname{P}\left[-r\mu \leq \overline{X} - \mu \leq r \mu\right ] \sim \operatorname{P}\left(\left | \operatorname{N}(0,\operatorname{Var}(\overline{X})) \right | \leq r\mu \right) = \operatorname{P}\left(\left | \operatorname{N}(0,1) \right | \leq \frac{r\mu \sqrt{n}}{\sigma} \right) [$]

where $\operatorname{N}(0,1)$ denotes a standard normal random variable and the approximation holds as $n$ tends to infinity. Using this normal approximation for large $n$, the full credibility criterion becomes

[$] \begin{equation} \label{criterion} \operatorname{P}\left(\left | \operatorname{N}(0,1) \right | \leq \frac{r\mu \sqrt{n}}{\sigma} \right) \geq p \Leftrightarrow n \geq \left(\frac{z_{\frac{1 + p}{2}}}{r}\right)^2 \cdot \frac{\sigma^2}{\mu^2} \end{equation} [$]

with $z_{\alpha}$ denoting the $\alpha$-quantile of the standard normal distribution.
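The criterion \ref{criterion} is straightforward to evaluate numerically. The sketch below uses `statistics.NormalDist` from the Python standard library to obtain the quantile $z_{(1+p)/2}$; the function name `full_credibility_n` and the parameter values $p = 0.90$, $r = 0.05$ are illustrative assumptions, not part of the theory above.

```python
from math import ceil
from statistics import NormalDist


def full_credibility_n(p: float, r: float, mean: float, variance: float) -> float:
    """Minimum sample size n satisfying n >= (z_{(1+p)/2} / r)^2 * variance / mean^2."""
    if not 0.0 < p < 1.0 or not 0.0 < r < 1.0:
        raise ValueError("p and r must both lie strictly between 0 and 1")
    z = NormalDist().inv_cdf((1.0 + p) / 2.0)  # standard normal quantile
    return (z / r) ** 2 * variance / mean ** 2


# Assumed example: squared coefficient of variation equal to 1.
n_min = full_credibility_n(p=0.90, r=0.05, mean=1.0, variance=1.0)
print(n_min)        # roughly 1082.2
print(ceil(n_min))  # smallest integer sample size meeting the standard
```

Tightening the tolerance $r$ or raising the probability $p$ both increase the required sample size, as does a larger ratio $\sigma^2/\mu^2$.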

### Standard For Claim Severity

Suppose $N$ claims are made with associated claim sizes $Y_1,\ldots,Y_N$, assumed i.i.d. with mean $\mu_Y$ and variance $\sigma_Y^2$. To establish a standard for claim severity, we set $n = N$, $X_i = Y_i$, $\mu = \mu_Y$ and $\sigma^2 = \sigma_Y^2$ in \ref{criterion}:

[$] \begin{equation} \label{criterion-s} N \geq \left(\frac{z_{\frac{1 + p}{2}}}{r}\right)^2 \cdot \frac{\sigma_Y^2}{\mu_Y^2} \end{equation} [$]

### Standard For Claim Frequency

To establish a standard for claim frequency, one needs to determine an appropriate $n$ value in \ref{criterion}. In the frequency case, the $n$ in \ref{criterion} refers to the number of exposure units associated with the insured. This approach makes sense since the exposure units represent the number of homogeneous exposures in an insurance portfolio (the insured). To connect with \ref{criterion}, the $X_i$ refer to the claim frequencies associated with the single exposure units $i = 1,\ldots, E$. More precisely, if $\mu_f$ denotes the expected claim frequency per exposure unit and $\sigma_f^2$ denotes the variance of the claim frequency per exposure unit, then the standard for claim frequency is given by

[$] \begin{equation} \label{criterion-f} E \geq \left(\frac{z_{\frac{1 + p}{2}}}{r}\right)^2 \cdot \frac{\sigma_f^2}{\mu_f^2} \end{equation} [$]

It is common to express the standard for claim frequency in terms of the expected claim frequency:

[$] \begin{equation} \label{criterion-ef} \operatorname{E}[N] = \mu_f E \geq \left(\frac{z_{\frac{1 + p}{2}}}{r}\right)^2 \cdot \frac{\sigma_f^2}{\mu_f} \end{equation} [$]

It is often the case that $\operatorname{E}[N]$ is unknown in advance. In such a scenario, it is common practice to use the observed number of claims $N$ as an estimate.
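The two forms of the frequency standard, \ref{criterion-f} and \ref{criterion-ef}, can be compared numerically. The sketch below assumes, purely for illustration, that the claim frequency per exposure unit is Poisson distributed, so that $\sigma_f^2 = \mu_f$; the function name and the parameter values are hypothetical.

```python
from statistics import NormalDist


def frequency_standard(p: float, r: float, mu_f: float, var_f: float) -> tuple[float, float]:
    """Return (exposure units needed, expected number of claims needed)."""
    z = NormalDist().inv_cdf((1.0 + p) / 2.0)
    base = (z / r) ** 2
    exposure_needed = base * var_f / mu_f ** 2   # standard in exposure units
    expected_claims_needed = base * var_f / mu_f  # standard in expected claims
    return exposure_needed, expected_claims_needed


# Assumption: Poisson frequency with 0.1 expected claims per exposure unit,
# hence variance 0.1 per exposure unit as well.
exposure_needed, claims_needed = frequency_standard(p=0.90, r=0.05, mu_f=0.1, var_f=0.1)
print(round(claims_needed, 1))    # about 1082.2 expected claims
print(round(exposure_needed, 1))  # ten times as many exposure units
```

Under the Poisson assumption the ratio $\sigma_f^2 / \mu_f$ equals 1, so the standard in terms of expected claims reduces to $(z_{(1+p)/2}/r)^2$ regardless of $\mu_f$.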

### Standard For Aggregate Loss

As is the case with claim frequencies, we will need to work with exposure units to establish the standard for aggregate losses. The aggregate loss can be written as $S = Y_1 + \cdots + Y_N$. Assuming the claim sizes $Y_i$ are i.i.d. and independent of $N$, the mean and variance of the loss are

[$] \mu_S = \operatorname{E}[S] = \operatorname{E}[N]\operatorname{E}[Y_i] = \mu_N \cdot \mu_Y,\ \ \sigma^2_S = \mu_N \cdot \sigma^2_Y + \mu^2_Y \cdot \sigma_N^2 [$]

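with $\mu_N = E \cdot \mu_f$ and $\sigma_N^2 = E \cdot \sigma_f^2$. To get the standard in terms of the exposure units, we use \ref{criterion-f} with the mean and variance of the aggregate loss per exposure unit, $\mu_S / E = \mu_f \cdot \mu_Y$ and $\sigma_S^2 / E = \mu_f \cdot \sigma_Y^2 + \mu_Y^2 \cdot \sigma_f^2$, in place of $\mu_f$ and $\sigma_f^2$: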

[$] \begin{equation} \label{criterion-al} E \geq \left(\frac{z_{\frac{1 + p}{2}}}{r}\right)^2 \cdot \frac{\mu_f \cdot \sigma^2_Y + \mu^2_Y \cdot \sigma_f^2}{\mu_Y^2 \cdot \mu_f^2} = \left(\frac{z_{\frac{1 + p}{2}}}{r}\right)^2 \cdot \mu_f^{-1} \cdot \left( \frac{\sigma_Y^2}{\mu_Y^2} + \frac{\sigma_f^2}{\mu_f}\right). \end{equation} [$]

If we multiply both sides of \ref{criterion-al} by $\mu_f$, we obtain the criterion in terms of the expected number of claims $\mu_N = \mu_f \cdot E$:

[$] \begin{equation} \label{criterion-len} \mu_N \geq \left(\frac{z_{\frac{1 + p}{2}}}{r}\right)^2 \cdot \left( \frac{\sigma_Y^2}{\mu_Y^2} + \frac{\sigma_f^2}{\mu_f}\right) \end{equation} [$]

From \ref{criterion-s} and \ref{criterion-ef}, we conclude that the standard for aggregate loss, expressed in terms of the expected claim frequency, is the sum of the standard for severity and the standard for claim frequency, expressed in terms of expected claim frequency.

The standard for full credibility for the pure premium is the same as the standard for aggregate loss since the pure premium is just the aggregate loss divided by the exposure.
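The decomposition of the aggregate loss standard \ref{criterion-len} into a severity part and a frequency part can be checked numerically. In the sketch below the function name and all parameter values are hypothetical; the frequency is assumed to have variance equal to its mean, and the claim size is assumed to have a squared coefficient of variation of 4.

```python
from statistics import NormalDist


def aggregate_loss_standard(p: float, r: float, mu_f: float, var_f: float,
                            mu_y: float, var_y: float) -> tuple[float, float, float]:
    """Return (severity part, frequency part, total), all in expected number of claims."""
    z = NormalDist().inv_cdf((1.0 + p) / 2.0)
    base = (z / r) ** 2
    severity_part = base * var_y / mu_y ** 2  # right-hand side of the severity standard
    frequency_part = base * var_f / mu_f      # right-hand side of the frequency standard
    return severity_part, frequency_part, severity_part + frequency_part


# Assumed inputs: mean claim size 2000 with variance 4 * 2000^2,
# and 0.1 expected claims per exposure unit with variance 0.1.
sev, freq, total = aggregate_loss_standard(
    p=0.90, r=0.05, mu_f=0.1, var_f=0.1, mu_y=2000.0, var_y=4 * 2000.0 ** 2
)
print(round(sev, 1), round(freq, 1), round(total, 1))
```

Dividing `total` by $\mu_f$ gives the corresponding standard in exposure units, as in \ref{criterion-al}.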

## Partial Credibility

When the standard for full credibility isn't met, one can turn to partial credibility ($Z \lt 1$). When considering partial credibility, one needs to determine how much credibility should be assigned to the experience of the insured, that is, how large $Z$ should be. The only partial credibility rule examined here is the square root rule.

### Square Root Rule

The square root rule simply sets $Z$ as the square root of the expected number of claims, based on the exposure level, divided by the number of claims needed for full credibility, $n_F$:

[$] \begin{equation} \label{sqrule} Z = \sqrt{\frac{\operatorname{E}[N]}{n_F}} \end{equation} [$]

As we have already seen, when $\operatorname{E}[N]$ is sufficiently large to satisfy the full credibility standard, $Z$ is set to 1; consequently, it is always the case that $Z \lt 1$ when using the square root rule \ref{sqrule}. When $\operatorname{E}[N]$ is unknown, $N$, the number of claims observed, is used instead in formula \ref{sqrule}.
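A minimal Python sketch of the square root rule follows. The function name `square_root_rule` is hypothetical, and the full credibility standard of 1600 claims is an arbitrary round figure chosen so that the result is easy to verify by hand. The cap at 1 reflects the convention that full credibility is assigned once the standard is met.

```python
from math import sqrt


def square_root_rule(expected_claims: float, n_full: float) -> float:
    """Credibility weight Z = sqrt(E[N] / n_F), capped at 1 (full credibility)."""
    if expected_claims < 0 or n_full <= 0:
        raise ValueError("expected_claims must be >= 0 and n_full must be > 0")
    return min(1.0, sqrt(expected_claims / n_full))


# Hypothetical standard of 1600 claims for full credibility.
print(square_root_rule(400, 1600))   # 0.5 (partial credibility)
print(square_root_rule(2000, 1600))  # 1.0 (standard met, full credibility)
```

The resulting weight can then be used in the credibility estimate $ZX + (1-Z)M$.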