# An introduction to risk measures

This post is an introduction to measures of risk. Examples of risk measures are considered. The post also focuses on coherent risk measures, which possess several desirable properties for risk measures.

In actuarial applications, an important focus is on developing loss distributions for insurance products. When the loss distribution is a probability-based model, we can also use the model to derive a description of risk. Often, a measure of risk is given by one number that describes the level of exposure to risk. One example of a measure of risk is value-at-risk (VaR), discussed here, which can quantify the likelihood of an adverse outcome. VaR can then be used to determine the amount of capital required to withstand such adverse outcomes. Actuaries and risk managers, along with investors, regulators and rating agencies, are particularly interested in using VaR and other measures of risk to quantify the ability of an enterprise to withstand various potential adverse events.

Two important examples of measures of risk – value-at-risk and tail-value-at-risk – have been discussed in this previous post.

Examples of Risk Measures

A measure of risk (or a risk measure) is a mapping that maps a given loss distribution to the real numbers. In a general discussion, we use the Greek letter $\rho$ to denote the mapping. For example, if $X$ is the random variable that describes the losses, then $\rho(X)$ is a number that is intended to quantify the risk exposure. We first look at some examples.

Example 1 (Premium Principles)
Some of the early risk measures in actuarial science were based on the so-called premium principles. The purpose is to develop an appropriate premium to charge for a given risk. The following gives several risk measures based on premium principles.

$\rho(X)=E[X]$ (Equivalence Principle)

$\rho(X)=(1+k) \ E[X]$ (Expected Value Principle)

$\rho(X)=E[X]+ k \ Var(X)$ (Variance Principle)

$\rho(X)=E[X]+k \ \sqrt{Var(X)}$ (Standard Deviation Principle)

Each case is a rule for assigning a premium to an insurance risk. In each of the last three cases, $k$ is a fixed constant where $k \ge 0$. The fixed constant $k$ reflects an adjustment that may include expenses and profit and/or a risk loading. The equivalence principle does not account for any risk loading; it simply charges the expected losses, i.e. the pure premium.
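As a rough illustration, the four principles can be computed from the mean and variance of a loss. The sketch below uses a hypothetical two-point loss and an arbitrary loading $k=0.1$; the function names `mean_var` and `premiums` are ours and not from any library.

```python
import math

def mean_var(outcomes):
    """Mean and variance of a discrete loss given as (value, probability) pairs."""
    mean = sum(x * p for x, p in outcomes)
    var = sum((x - mean) ** 2 * p for x, p in outcomes)
    return mean, var

def premiums(outcomes, k):
    """Return the four premium-principle risk measures for a loading k >= 0."""
    mean, var = mean_var(outcomes)
    return {
        "equivalence": mean,
        "expected_value": (1 + k) * mean,
        "variance": mean + k * var,
        "standard_deviation": mean + k * math.sqrt(var),
    }

# Hypothetical loss: 0 with probability 0.9, 1000 with probability 0.1.
loss = [(0.0, 0.9), (1000.0, 0.1)]
result = premiums(loss, k=0.1)
for name, value in result.items():
    print(f"{name:>20}: {value:,.2f}")
```

Note how the variance principle is measured in squared units of the loss, so its loading is far larger than the others for the same $k$. In practice $k$ would be scaled differently for each principle.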

Example 2 (Standard Deviation Principle)
This example highlights the standard deviation premium principle:

$\rho(X)=\mu+k \ \sigma$

where $\mu=E[X]$ and $\sigma^2$ is the variance of $X$.

To shed light on the standard deviation principle, suppose that the loss $X$ has a normal distribution. The $\rho(X)$ resembles a quantile in the normal distribution where $k$ is a z-score. With $k=1.645$, $P(X>\mu+1.645 \ \sigma)=0.05$. This means that $\mu+1.645 \ \sigma$ is a threshold that indicates the level of adverse outcome such that the probability of exceeding this threshold is 5%. With $k=2.326$, $P(X>\mu+2.326 \ \sigma)=0.01$. The higher threshold $\mu+2.326 \ \sigma$ is the level of adverse outcome so that there is a 1% chance of exceeding it.

If the loss $X$ does not have a normal distribution, the coefficient $k$ would not be the same as the ones given for the normal distribution. Whatever the loss distribution is, the fixed constant $k$ is usually chosen such that the probability of losses exceeding the risk measure equals some pre-determined small probability level.
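The z-scores quoted for the normal case can be checked with the Python standard library. This is a minimal sketch; the loss parameters `mu` and `sigma` are hypothetical values chosen only for the check.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# k is the z-score whose right-tail probability equals the target level.
for tail_prob in (0.05, 0.01):
    k = std_normal.inv_cdf(1 - tail_prob)
    print(f"tail probability {tail_prob}: k = {k:.3f}")

# Check with an arbitrary (hypothetical) normal loss.
mu, sigma = 1000.0, 250.0
loss = NormalDist(mu, sigma)
threshold = mu + 1.645 * sigma
exceed_prob = 1 - loss.cdf(threshold)
print(f"P(X > {threshold}) = {exceed_prob:.4f}")
```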

Example 3 (Value-at-Risk)
When the loss $X$ does not have a normal distribution, the same constants $k$ in Example 2 cannot be used. Instead, we can start with a small tail probability $1-p$ (5%, 1%, 0.5% etc.), which corresponds to a security level $p$ of 95%, 99%, 99.5% etc., and determine the $100p$th percentile of $X$. The corresponding $k$ can then be determined. This is the basic idea behind the risk measure of value-at-risk.

The risk measure of value-at-risk (VaR) is mathematically a percentile of the loss distribution. Given that $X$ represents the losses, $\text{VaR}_p(X)$ is the $100p$th percentile of $X$. The value-at-risk (VaR) discussed here is for gauging the exposure of risk with respect to insurance losses such that the probability of exceeding the threshold is $1-p$, where $p$ is the pre-determined security level. See this previous post for a more detailed discussion.

Example 4 (Tail-Value-at-Risk)
We now briefly discuss tail-value-at-risk (TVaR). Suppose that $X$ is the random variable that models losses. As before we assume that $X$ takes on positive real numbers. The tail-value-at-risk at the $100p$% security level, denoted by $\text{TVaR}_p(X)$, is the expected loss on the condition that the loss exceeds the $100p$th percentile of $X$. The following is a more succinct way of describing it.

$\text{TVaR}_p(X)=E(X \lvert X > \pi_p)$

where $\pi_p=\text{VaR}_p(X)$. Tail-value-at-risk is a risk measure that is in many ways superior to VaR. The risk measure VaR is merely a cutoff point and does not describe the tail behavior beyond the VaR threshold. TVaR reflects the shape of the tail beyond the VaR threshold. See this previous post for a more detailed discussion.
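To see the difference between the two measures on data, the sketch below estimates both from simulated losses. The exponential loss with mean 10, the sample size and the seed are all assumptions made for illustration, and the percentile convention used in `empirical_var_tvar` (a name of our own) is only one of several in common use.

```python
import math
import random

def empirical_var_tvar(sample, p):
    """Estimate VaR_p by an order statistic and TVaR_p by the mean of larger losses."""
    ordered = sorted(sample)
    n = len(ordered)
    # Index of the 100p-th percentile (one of several common conventions).
    index = min(max(math.ceil(p * n) - 1, 0), n - 1)
    var_p = ordered[index]
    tail = [x for x in ordered if x > var_p]
    tvar_p = sum(tail) / len(tail) if tail else var_p
    return var_p, tvar_p

random.seed(2018)  # fixed seed so the run is repeatable
theta = 10.0       # hypothetical exponential mean
sample = [random.expovariate(1 / theta) for _ in range(200_000)]

var_95, tvar_95 = empirical_var_tvar(sample, 0.95)
print(f"empirical VaR  at 95%: {var_95:.2f}")
print(f"empirical TVaR at 95%: {tvar_95:.2f}")
```

For an exponential loss with mean $\theta$, the exact values are $-\theta \ \text{ln}(1-p)$ and $\theta (1-\text{ln}(1-p))$, roughly 29.96 and 39.96 here, so the estimates should land close to those figures.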

Desirable Properties of Risk Measures

One important consideration when choosing risk measures concerns the combining of risk entities in a given company or enterprise. For example, an insurance company may have different divisions specializing in different insurance products and services. The results of the risk measure applied individually to the different divisions should be consistent with the results of the risk measure applied to the entire company. For example, it is desirable for the risk measure for two risks combined to be no greater than the sum of the two results applied to the two risks individually. The following is a list of four desirable properties for a risk measure. Any risk measure that satisfies these four properties is said to be a coherent risk measure.

Coherent Risk Measure
Let $X$ and $Y$ be two loss random variables. The risk measure $\rho$ is a coherent risk measure if it satisfies the following four properties.

1. $\rho(X+Y) \le \rho(X)+\rho(Y)$ (Subadditivity)
2. If $X \le Y$ with probability 1, then $\rho(X) \le \rho(Y)$ (Monotonicity)
3. For any constant $c >0$, $\rho(c X)=c \ \rho(X)$ (Positive Homogeneity)
4. For any constant $c >0$, $\rho(X+c)=\rho(X)+c$ (Translation Invariance)

The first one, subadditivity, requires that for any two random loss variables $X$ and $Y$, the risk measure for $X+Y$ be no greater than the risk measures for $X$ and $Y$ separately. This is a sensible requirement for a risk measure. This reflects the notion that combining risks leads to diversification and thus to a reduction of total overall risk. Otherwise, a large enterprise would simply be broken into smaller entities in order to reduce risk.

The property of monotonicity is another sensible requirement. If the random loss $X$ is always less than or equal to the random loss $Y$, then it makes sense for the risk measure of $X$ to be no greater than the risk measure of $Y$. For example, if the risk measure represents the surplus or reserve that is required to cover a random loss, then we would want the reserve for the lesser loss to be no greater than the reserve for a larger potential loss.

The property of positive homogeneity specifies that the risk measure of a constant multiple of the random loss should be the constant multiple of the risk measure. For example, expressing the loss random variable in another currency should indicate the same level of reserve or surplus.

Translation invariance says that the risk measure of combining a random loss and a fixed loss should be the risk measure of the random loss plus the fixed loss. The reserve to cover a fixed loss should be just the fixed loss.

A Closer Look at Examples

Now that we have a list of desirable properties a risk measure should have, which of the risk measures discussed above possess these properties? We comment on these risk measures one by one.

Example 5 (Equivalence Principle)
The risk measure based on the equivalence principle is a coherent risk measure since it satisfies all four properties. Each of the four properties follows from a basic property of expected value.

• $\rho(X+Y)=E(X+Y)=E(X)+E(Y)=\rho(X)+\rho(Y)$
• If $X \le Y$ always, then $\rho(X)=E(X) \le E(Y)=\rho(Y)$.
• $\rho(cX)=E(c X)=c \ E(X)=c \ \rho(X)$
• $\rho(X+c)=E(X+c)=E(X)+c=\rho(X)+c$

Example 6 (Expected Value Principle)
Of the four measures based on the premium principles, the equivalence principle is the only one that is coherent. The risk measure based on the expected value principle satisfies the first three properties but, whenever $k>0$, fails translation invariance. Here’s the derivation.

• $\rho(X+Y)=(1+k) \ E(X+Y)=(1+k) \ E(X)+(1+k) \ E(Y)=\rho(X)+\rho(Y)$
• If $X \le Y$ always, then $\rho(X)=(1+k) \ E(X) \le (1+k) \ E(Y)=\rho(Y)$.
• $\rho(cX)=(1+k) \ E(c X)=c \ (1+k) \ E(X)=c \ \rho(X)$
• $\rho(X+c)=(1+k) \ E(X+c)=(1+k) \ E(X)+(1+k) \ c \ne \rho(X)+c$
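To make the failure concrete, here is a small worked case with hypothetical numbers ($k=0.2$, $E(X)=100$, $c=50$) chosen only for illustration:

$\displaystyle \rho(X+c)=(1+0.2)(100+50)=180 \quad \text{whereas} \quad \rho(X)+c=(1+0.2)(100)+50=170$

The extra 10 is the loading $k \ c$ applied to the fixed amount $c$, which is why the two sides agree only when $k=0$.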

Example 7 (Variance Principle)
Now, consider the risk measure based on the variance principle. Recall the risk measure based on the variance principle is of the form $\rho(L)=E(L)+k Var(L)$ where $L$ represents the random losses and $k$ is some positive constant. First we show that it does not satisfy the subadditivity property.

$\displaystyle \begin{aligned} \rho(X+Y)&=E(X+Y)+k Var(X+Y) \\&=E(X)+E(Y)+k Var(X)+k Var(Y)+2 k Cov(X,Y) \\&=\rho(X)+\rho(Y)+2 k Cov(X,Y) \end{aligned}$

When $Cov(X,Y)>0$, i.e. the risk $X$ and the risk $Y$ are positively correlated, we have $\rho(X+Y)>\rho(X)+\rho(Y)$. Under this risk measure, there is no benefit of risk reduction in combining risks that are positively correlated.

Next we demonstrate why the variance principle fails the monotonicity. Recall that $\rho$ satisfies the property of monotonicity if $\rho(X) \le \rho(Y)$ for any risks $X$ and $Y$ such that $X \le Y$ for all scenarios. For the risk measure $\rho(L)=E(L)+k Var(L)$, we show that for each $k>0$, we can find random variables $X$ and $Y$ such that $X \le Y$ always and $\rho(X)>\rho(Y)$. To this end, we define random variables $X_n$ and $Y$ for each positive integer $n$.

Let $n$ be any positive integer. Let $X_n$ be a random variable such that $P(X_n=-n)=0.1$ and $P(X_n=90)=0.9$. Let $Y$ be a constant value of 100, i.e. $P(Y=100)=1$. Then it is straightforward to verify that

$E(X_n)=81-0.1n$

$E(X_n^2)=7290+0.1n^2$

$Var(X_n)=7290+0.1n^2-(81-0.1n)^2=0.09 n^2+16.2 n+729$

$E(Y)=100$

$Var(Y)=0$

For $\rho(X_n)=81-0.1 n+k (0.09 n^2+16.2 n+729)>\rho(Y)=100$, the constant $k$ must satisfy the following.

$\displaystyle k>\frac{19+0.1 n}{0.09 n^2+16.2 n+729} \longrightarrow 0$ as $n \longrightarrow \infty$

Note that the quantity on the right approaches zero as $n$ becomes large. For each $k>0$ (however small), we can always choose an integer $n$ large enough such that the above inequality is satisfied. Then for the pair $X_n$ and $Y$, $X_n \le Y$ always and $\rho(X_n) >\rho(Y)$. Thus monotonicity fails for the variance principle.
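The counterexample can be checked numerically. The sketch below searches for the smallest $n$ for which $\rho(X_n)>\rho(Y)=100$ at a few arbitrarily chosen values of $k$; the helper names and the search cap `max_n` are ours.

```python
def rho_variance(outcomes, k):
    """Variance-principle risk measure E(L) + k Var(L) for a discrete loss."""
    mean = sum(x * p for x, p in outcomes)
    var = sum((x - mean) ** 2 * p for x, p in outcomes)
    return mean + k * var

def smallest_violating_n(k, max_n=100_000):
    """Smallest n with rho(X_n) > rho(Y) = 100, or None if not found by max_n."""
    for n in range(1, max_n + 1):
        # X_n takes the value -n with probability 0.1 and 90 with probability 0.9.
        x_n = [(-float(n), 0.1), (90.0, 0.9)]
        if rho_variance(x_n, k) > 100.0:
            return n
    return None

found = {k: smallest_violating_n(k) for k in (0.1, 0.01, 0.001)}
for k, n in found.items():
    x_n = [(-float(n), 0.1), (90.0, 0.9)]
    print(f"k = {k}: n = {n}, rho(X_n) = {rho_variance(x_n, k):.4f} > 100")
```

In every case $X_n$ never exceeds 90, so $X_n \le Y=100$ always, yet the variance loading pushes $\rho(X_n)$ above $\rho(Y)$. Smaller values of $k$ simply require a larger $n$.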

The following derivation shows that positive homogeneity fails and translation invariance is satisfied.

$\rho(c \ X)=c \ E(X)+k \ c^2 \ Var(X) \ne c \ \rho(X)=c \ E(X)+k \ c \ Var(X)$

$\rho(X+c)=E(X+c)+k \ Var(X+c)=E(X)+c+k \ Var(X)=\rho(X)+c$

Example 8 (Standard Deviation Principle)
The risk measure according to the standard deviation principle is also not coherent even though it is an improvement over the variance principle. The standard deviation principle satisfies all properties except monotonicity.

To show subadditivity, note that $\sqrt{Var(X+Y)} \le \sqrt{Var(X)}+\sqrt{Var(Y)}$. This is derived as follows:

$\displaystyle \begin{aligned} Var(X+Y)&=Var(X)+Var(Y)+2 Cov(X,Y) \\&=Var(X)+Var(Y)+2 Corr(X,Y) \sqrt{Var(X)} \sqrt{Var(Y)} \\&\le Var(X)+Var(Y)+2 \sqrt{Var(X)} \sqrt{Var(Y)} \\&= (\sqrt{Var(X)}+\sqrt{Var(Y)})^2 \end{aligned}$

In the above derivation, $Corr(X,Y)$ refers to the correlation coefficient, which is always $\le 1$. Taking square roots of both sides gives the stated inequality. The following shows the subadditivity.

$\displaystyle \begin{aligned} \rho(X+Y)&=E(X+Y)+k \sqrt{Var(X+Y)} \\&\le E(X)+E(Y) +k (\sqrt{Var(X)}+\sqrt{Var(Y)}) \\&=\rho(X)+\rho(Y) \end{aligned}$

The following derivation verifies positive homogeneity and translation invariance.

$\displaystyle \begin{aligned} \rho(c \ X)&=E(c \ X)+k \sqrt{Var(c \ X)} \\&=c \ E(X)+ k \sqrt{c^ 2 \ Var(X)} \\&=c \ E(X)+ k \ c \ \sqrt{Var(X)} \\&=c \ \rho(X) \end{aligned}$

$\displaystyle \begin{aligned} \rho(X+c)&=E(X+c)+k \ \sqrt{Var(X+c)} \\&=E(X)+c+k \ \sqrt{Var(X)} \\&=\rho(X)+c \end{aligned}$

We now show that the standard deviation principle does not satisfy monotonicity. The proof is similar to the one for variance principle in Example 7 but with a crucial adjustment. For the risk measure $\rho(L)=E(L)+k \sqrt{Var(L)}$, we show that for each $k>0$, we can find random variables $X$ and $Y$ such that $X \le Y$ always and $\rho(X)>\rho(Y)$. To this end, we define random variables $X_n$ and $Y$ for each positive integer $n$.

Let $n$ be any positive integer. Let $X_n$ be a random variable such that $P(X_n=- x_n)=10^{-n}$ and $P(X_n=90)=1-10^{-n}$ where $x_n$ is the positive number defined by $x_n=10^{n+1}-90$. Let $Y$ be a constant value of 100, i.e. $P(Y=100)=1$. Then it is straightforward to verify that

$E(X_n)=80$

$E(X_n^2)=10^{n+2}+6300$

$Var(X_n)=10^{n+2}+6300-80^2=10^{n+2}-100$

$E(Y)=100$

$Var(Y)=0$

For $\rho(X_n)=80+k \ \sqrt{10^{n+2}-100} > \rho(Y)=100$, the constant $k$ must satisfy the following inequality.

$\displaystyle k > \frac{20}{\sqrt{10^{n+2}-100}} \longrightarrow 0$ as $n \longrightarrow \infty$

For any $k >0$ (however small), we can always find an integer $n$ large enough so that the above inequality holds. Then the pair of random variables $X_n$ and $Y$ are such that $X_n \le Y$ always and $\rho(X_n)>\rho(Y)$. Thus monotonicity fails for the standard deviation principle.
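The same kind of numerical check works here. The sketch below builds $X_n$ exactly as defined above and searches for the smallest $n$ that violates monotonicity for a few arbitrary values of $k$; the helper names and the cap `max_n` are ours, and plain floating point is assumed to be accurate enough for the small range of $n$ searched.

```python
import math

def rho_std_dev(outcomes, k):
    """Standard-deviation-principle risk measure E(L) + k sqrt(Var(L))."""
    mean = sum(x * p for x, p in outcomes)
    var = sum((x - mean) ** 2 * p for x, p in outcomes)
    return mean + k * math.sqrt(var)

def x_n_outcomes(n):
    """X_n equals -(10^(n+1) - 90) with probability 10^(-n), and 90 otherwise."""
    q = 10.0 ** (-n)
    x_n = 10.0 ** (n + 1) - 90.0
    return [(-x_n, q), (90.0, 1.0 - q)]

def smallest_violating_n(k, max_n=12):
    """Smallest n with rho(X_n) > rho(Y) = 100, or None if not found by max_n."""
    for n in range(1, max_n + 1):
        if rho_std_dev(x_n_outcomes(n), k) > 100.0:
            return n
    return None

found = {k: smallest_violating_n(k) for k in (1.0, 0.1, 0.01, 0.001)}
for k, n in found.items():
    print(f"k = {k}: n = {n}, rho(X_n) = {rho_std_dev(x_n_outcomes(n), k):.4f}")
```

The rare but very large negative outcome keeps the mean at 80 while inflating the standard deviation, which is the crucial adjustment compared with Example 7.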

Example 9 (Value-at-Risk)
The risk measure of VaR is not coherent despite its attractive qualities. An example of the violation of subadditivity can be found in this article. Example 1 in this article gives two independent risks $X$ and $Y$ such that $\text{VaR}_{0.99}(X+Y)>\text{VaR}_{0.99}(X)+\text{VaR}_{0.99}(Y)$.
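A violation of this kind is easy to construct. The sketch below is our own illustration, not the example from the cited article: two independent risks that each lose 100 with probability 0.008 and nothing otherwise. For these discrete losses, VaR is taken to be the smallest value $x$ with $P(X \le x) \ge p$, which is an assumption about the percentile convention.

```python
from itertools import product

def var_discrete(dist, p):
    """Smallest x with P(X <= x) >= p for a discrete distribution {value: prob}."""
    cumulative = 0.0
    for value in sorted(dist):
        cumulative += dist[value]
        if cumulative >= p:
            return value
    return max(dist)

def convolve(dist_x, dist_y):
    """Distribution of X + Y for independent discrete X and Y."""
    total = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        total[x + y] = total.get(x + y, 0.0) + px * py
    return total

# Hypothetical risk: lose 100 with probability 0.008, otherwise lose nothing.
single = {0: 0.992, 100: 0.008}
combined = convolve(single, single)

var_single = var_discrete(single, 0.99)
var_sum = var_discrete(combined, 0.99)
print(f"VaR of each risk alone: {var_single}")
print(f"VaR of the combined risk: {var_sum}")
```

Each risk alone has less than a 1% chance of a loss, so its 99% VaR is zero. The chance that at least one of the two produces a loss exceeds 1%, so the 99% VaR of the sum is positive and subadditivity fails.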

Example 10 (Tail-Value-at-Risk)
One advantage of TVaR is that it describes the tail behavior beyond the threshold of VaR. Another is that TVaR is a coherent risk measure. This fact has been verified in the article Artzner, P., Delbaen, F., Eber, J.-M., & Heath, D. (1997). Thinking Coherently. Risk, 10, 68-71.

$\copyright$ 2018 – Dan Ma

# Tail-value-at-risk of a mixture

The risk measures of value-at-risk and tail-value-at-risk are discussed in the preceding post. This post extends the preceding post with an algorithm on evaluating the tail-value-at-risk of a mixture distribution with discrete mixing weights.

The preceding post introduces the notions of value-at-risk (VaR) and tail-value-at-risk (TVaR). These are two particular examples of risk measures that are useful for insurance companies and other enterprises in a risk management context. For $0<p<1$, VaR at the security level $p$ gives the threshold such that the probability of a loss more adverse than the threshold is at most $1-p$. Thus in our context VaR is a percentile of the loss distribution. TVaR is a conditional expected value. At the security level $p$, TVaR is the expected value of the losses given that the losses exceed the threshold VaR.

The preceding post gives several representations of TVaR. It also gives the formula for TVaR for several distributions – exponential, Pareto, normal and lognormal. We now discuss TVaR of a mixture distribution. If a distribution is the mixture of two distributions and if each of the individual distributions has a clear formulation of TVaR, can we mix the two TVaR's? The answer is that we can provided some adjustments are made. The following gives the formula.

Suppose that the loss $X$ is a mixture of two distributions represented by the random variables $X_1$ and $X_2$, with weights $w$ and $1-w$, respectively. Let $\pi_p$ be the $100p$th percentile of the loss $X$, i.e. $\pi_p=\text{VaR}_p(X)$. Then the tail-value-at-risk at the $100p$ percent security level is:

$\displaystyle \begin{aligned} \text{TVaR}_p(X)&=\pi_p+\frac{1}{1-p} \biggl[w \times P(X_1>\pi_p) \times e_{X_1}(\pi_p)\\& \ \ +(1-w) \times P(X_2>\pi_p) \times e_{X_2}(\pi_p)\biggr] \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (a) \end{aligned}$

Compare this with formula (3) in the preceding post. That formula shows that TVaR is $\pi_p+e(\pi_p)$. In other words, TVaR is VaR plus the mean excess loss function evaluated at $\pi_p$. The content within the square brackets in Formula (a) is a weighted average of the two individual mean excess loss functions, adjusted by multiplying each with the corresponding probability $P(X_1>\pi_p)$ or $P(X_2>\pi_p)$. This formula is useful if $\pi_p$ (VaR) can be calculated and if the mean excess loss functions are accessible.

We give an example and then show the derivation of Formula (a).

Example 1
The mean excess loss function of an exponential distribution is constant. Let’s consider the mixture of two exponential distributions. Suppose that losses follow a mixture of two exponential distributions where one distribution has mean 5 (75% weight) and the other has mean 10 (25% weight). Determine the VaR and TVaR at the security level 99%.

First, calculate the 99th percentile of the mixture, which is the solution to the following equation.

$\displaystyle 0.75 e^{-x/5}+0.25 e^{-x/10}=1-0.99=0.01$

By letting $y=e^{-x/10}$, we solve the following equation.

$\displaystyle 0.75 y^2+0.25 y-0.01=0$

Use the quadratic formula to solve for $y$. Then solve for $x$. The following is the 99th percentile of the loss $X$.

$\displaystyle \pi_p=-10 \times \text{ln} \biggl(\frac{-1+\sqrt{1.48}}{6} \biggr)=33.2168$

The following gives the TVaR.

$\displaystyle \begin{aligned} \text{TVaR}_p(X)&=\pi_p+\frac{1}{1-0.99} \biggl[0.75 \times e^{-\pi_p/5} \times 5 +0.25 \times e^{-\pi_p/10} \times 10 \biggr] \\&=42.7283 \end{aligned}$

Note that the mean excess loss for the first exponential distribution is 5 and for the second one is 10 (the unconditional means). The survival functions $P(X_1>\pi_p)$ and $P(X_2>\pi_p)$ are also easy to evaluate. Once the percentile $\pi_p$ of the mixture is calculated, the formula is straightforward to apply. In this example, the two exponential parameters are set so that the calculation of percentiles uses the quadratic formula. If the parameters are set differently, then we can use software to evaluate the required percentile.
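When the parameters do not lead to a quadratic equation, the percentile can be found numerically. The sketch below uses bisection on the survival function of an exponential mixture and then applies Formula (a); the function names and the tolerance are our own choices, and the code is specific to exponential components, whose mean excess loss equals the mean.

```python
import math

def mixture_survival(x, weights, means):
    """P(X > x) for a mixture of exponential distributions."""
    return sum(w * math.exp(-x / m) for w, m in zip(weights, means))

def mixture_var(p, weights, means, tol=1e-10):
    """Solve P(X > x) = 1 - p by bisection."""
    lo, hi = 0.0, 1.0
    while mixture_survival(hi, weights, means) > 1 - p:
        hi *= 2  # expand until the root is bracketed
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mixture_survival(mid, weights, means) > 1 - p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def mixture_tvar(p, weights, means):
    """Formula (a) for exponential components (mean excess loss = mean)."""
    pi_p = mixture_var(p, weights, means)
    excess = sum(w * math.exp(-pi_p / m) * m for w, m in zip(weights, means))
    return pi_p + excess / (1 - p)

weights, means = (0.75, 0.25), (5.0, 10.0)
var_99 = mixture_var(0.99, weights, means)
tvar_99 = mixture_tvar(0.99, weights, means)
print(f"VaR  at 99%: {var_99:.4f}")
print(f"TVaR at 99%: {tvar_99:.4f}")
```

With the weights and means of Example 1, the results should agree with the values 33.2168 and 42.7283 obtained above from the quadratic formula.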

Deriving the formula

Suppose that $X$ is the mixture of $X_1$, with weight $w$, and $X_2$, with weight $1-w$. The density function for $X_1$ is $f_1(x)$ and the density function for $X_2$ is $f_2(x)$. The density function of $X$ is then $f(x)=w f_1(x)+(1-w) f_2(x)$. We derive from the basic definition of TVaR. Let $\pi_p$ be the $100p$th percentile of $X$.

$\displaystyle \begin{aligned} \text{TVaR}_p(X)&=\frac{\int_{\pi_p}^\infty x f(x) \ dx}{1-p}\\&=\pi_p+\frac{\int_{\pi_p}^\infty (x-\pi_p) f(x) \ dx}{1-p} \\&=\pi_p+\frac{\int_{\pi_p}^\infty (x-\pi_p) (w f_1(x)+(1-w) f_2(x)) \ dx}{1-p} \\&=\pi_p+\frac{1}{1-p} \biggl[w \int_{\pi_p}^\infty (x-\pi_p) f_1(x) \ dx +(1-w) \int_{\pi_p}^\infty (x-\pi_p) f_2(x) \ dx\biggr] \\&=\pi_p+\frac{1}{1-p} \biggl[w \ P(X_1>\pi_p) \ \frac{\int_{\pi_p}^\infty (x-\pi_p) f_1(x) \ dx}{P(X_1>\pi_p)}\\& \ \ +(1-w) \ P(X_2>\pi_p) \ \frac{\int_{\pi_p}^\infty (x-\pi_p) f_2(x) \ dx}{P(X_2>\pi_p)} \biggr] \\&=\pi_p+\frac{1}{1-p} \biggl[w \ P(X_1>\pi_p) \ e_{X_1}(\pi_p) +(1-w) \ P(X_2>\pi_p) \ e_{X_2}(\pi_p) \biggr] \end{aligned}$

The formula derived here is for a mixture of two distributions. It is straightforward to extend it to a mixture of any finite number of distributions.
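As a sketch of that extension, suppose $X$ is a mixture of $X_1,\dots,X_n$ with weights $w_1,\dots,w_n$ summing to 1 (the index $n$ and the weights $w_i$ are notation introduced here, not in the derivation above). Repeating the same steps term by term suggests:

$\displaystyle \text{TVaR}_p(X)=\pi_p+\frac{1}{1-p} \sum_{i=1}^n w_i \ P(X_i>\pi_p) \ e_{X_i}(\pi_p)$

Here $\pi_p$ is the $100p$th percentile of the mixture $X$, not of any individual component.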

Practice Problems

Practice problems are available in the companion blog to reinforce the concepts of value-at-risk and tail-value-at-risk. Practice Problems 10-G and 10-H in that link are for TVaR of mixtures.

$\copyright$ 2018 – Dan Ma

# Value-at-risk and tail-value-at-risk

In actuarial applications, an important focus is on developing loss distributions for insurance products. In such applications, it is desirable to employ risk measures to evaluate the exposure to risk. Such risk measures are indicators, often one or a small set of numbers, that inform actuaries and risk managers about the degree to which the risk bearing entity is subject to various aspects of risk. This post introduces two risk measures – value-at-risk (VaR) and tail-value-at-risk (TVaR).

Value-at-Risk

One natural question for a risk bearing entity (e.g. insurance companies and other enterprises) is: what is the chance of an adverse outcome? Value-at-risk (VaR) provides a ready answer to this question. Mathematically speaking, VaR is a quantile of the distribution of aggregate losses. For example, VaR at the 99% probability level indicates the level of adverse outcome such that the probability of exceeding this threshold is 1%. More broadly, VaR is the amount of capital required to ensure, with a high level of confidence, that the risk bearing entity does not become insolvent. The security level or probability level is chosen arbitrarily. In practice, it is usually a high number such as 95%, 99% or 99.5%. The preference is for a higher security level when evaluating the risk exposure for the entire enterprise. For a sub unit of the enterprise, the security level may be set to a lower number such as 95%.

Suppose that $X$ is a random variable that models the loss distribution in question. We assume that the support of $X$ is the set of positive real numbers or some appropriate subset. The value-at-risk (VaR) of $X$ at the $100p$% security level, denoted by $\text{VaR}_p(X)$, is the $100p$th percentile of $X$.

In the current discussion, we focus on loss distributions that are continuous random variables. Thus $\text{VaR}_p(X)$ is the value $\pi_p$ such that $P(X>\pi_p)=1-p$. In some ways, VaR is an attractive risk measure. Mathematically speaking, VaR has a clear and simple definition. For certain probability models, VaR can be evaluated in closed form. For those models that have no closed form for percentiles, VaR can be evaluated using software. However, VaR has limitations (this point will be briefly discussed below).

Example 1
Suppose that the loss $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$. Then $\text{VaR}_p(X)=\mu+z_p \cdot \sigma$ where $z_p$ is the $100p$th percentile of the standard normal distribution (i.e. normal with mean 0 and standard deviation 1).

VaR for normal distribution is identical to the risk measure called standard deviation principle. For any loss $X$ with mean $\mu$ and variance $\sigma^2$, the quantity $\mu+k \cdot \sigma$ for some fixed constant $k$ is a risk measure called the standard deviation principle. The constant $k$ is usually chosen such that losses will exceed $\mu+k \cdot \sigma$ with a pre-determined small probability. For losses that are normally distributed, $k=1.645$ for security level $p=0.95$ and $k=2.326$ for security level $p=0.99$.

Comment
The value-at-risk discussed here is for gauging the exposure of risk with respect to insurance losses. Thus we assume that the random variable $X$ is one that takes on positive real values (or some appropriate subset of the positive real number line). In this context, the adverse outcomes would be the extremely high values of the positive real numbers. In other words, the adverse outcomes we wish to guard against would be the right tails of the loss distribution. Thus $\text{VaR}_p(X)$ is evaluated for high values for $p$ such as 0.95 or 0.99 or higher. In the actuarial loss point of view, VaR is the high right tail of the loss distribution.

VaR is also used extensively in the banking and investment industries. The random variable $X$ in that setting is a profit and loss distribution, which can extend into the negative real numbers (when there are losses). The adverse outcomes for banking and investment applications would be the extreme left tails of the profit and loss distribution. Thus when VaR is evaluated at the security level 95%, we actually calculate the 5th percentile of the profit and loss distribution.

Tail-Value-at-Risk

Tail-value-at-risk (TVaR) is a risk measure that is in many ways superior to VaR. The risk measure VaR is merely a cutoff point and does not describe the tail behavior beyond the VaR threshold. We will see that TVaR reflects the shape of the tail beyond the VaR threshold.

Suppose that $X$ is the random variable that models losses. As before we assume that $X$ takes on positive real numbers. The tail-value-at-risk at the $100p$% security level, denoted by $\text{TVaR}_p(X)$, is the expected loss on the condition that the loss exceeds the $100p$th percentile of $X$.

Just as in the discussion of VaR, the discussion of TVaR focuses on continuous distributions. As before, we use $\pi_p$ to denote the $100p$th percentile of $X$. The quantity $\text{TVaR}_p(X)$ can be expressed as follows:

$\displaystyle (1) \ \ \ \ \ \text{TVaR}_p(X)=E(X \lvert X> \pi_p)=\frac{\int_{\pi_p}^\infty x f(x) \ dx}{1-F(\pi_p)}=\frac{\int_{\pi_p}^\infty x f(x) \ dx}{1-p}$

The above formulation of TVaR is the one based on the definition. To use (1) to obtain TVaR, evaluate VaR and then evaluate the integral on the right side. TVaR is obtained by dividing the integral result by $1-p$. However, there are other formulations that will give more insight into TVaR. The following is another formulation.

$\displaystyle (2) \ \ \ \ \ \text{TVaR}_p(X)=\frac{\int_p^1 \text{VaR}_w(X) \ dw}{1-p}$

The above is derived by the substitution $w=F(x)$ where $F(x)$ is the cumulative distribution function (CDF) of $X$. From a computation standpoint, (2) is not easy to use since it is usually hard to integrate VaR. The value of (2) is in the insight. From (2), we see that TVaR can be viewed as the average of all VaR at the level $w$ above $p$. Thus TVaR tells us much more about the tail of the loss distribution than VaR for just one security level. The following is another formulation.
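Although (2) is rarely convenient by hand, it is easy to approximate numerically. The sketch below averages VaR over levels above $p$ for an exponential loss and compares the result with the closed form $\theta (1-\text{ln}(1-p))$. The parameters, the midpoint rule and the number of steps are all assumptions made for illustration.

```python
import math

def var_exponential(w, theta):
    """VaR at level w for an exponential loss with mean theta."""
    return -theta * math.log(1 - w)

def tvar_by_averaging_var(p, theta, steps=100_000):
    """Approximate formula (2) with a midpoint rule on the interval [p, 1]."""
    width = (1 - p) / steps
    total = 0.0
    for i in range(steps):
        w = p + (i + 0.5) * width  # midpoint, so w = 1 is never evaluated
        total += var_exponential(w, theta)
    return total * width / (1 - p)

theta, p = 10.0, 0.95  # hypothetical parameters
approx = tvar_by_averaging_var(p, theta)
exact = theta * (1 - math.log(1 - p))
print(f"average of VaR above p: {approx:.4f}")
print(f"closed-form TVaR:       {exact:.4f}")
```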

$\displaystyle \begin{aligned} (3) \ \ \ \ \ \text{TVaR}_p(X)&=\frac{\int_{\pi_p}^\infty x f(x) \ dx}{1-p} \\&=\pi_p+\frac{\int_{\pi_p}^\infty (x-\pi_p) f(x) \ dx}{1-p} \\&=\pi_p+e(\pi_p) \\&=\text{VaR}_p(X)+e(\text{VaR}_p(X))\end{aligned}$

In (3), the function $e(x)$ is the mean excess loss function evaluated at $x$. Thus TVaR is the VaR plus the mean excess loss evaluated at VaR. Thus TVaR is VaR plus the average excess of all losses exceeding the threshold of VaR. As a result, TVaR gives information on the tail to the right of VaR. For those parametric loss distributions that have accessible formulations for the limited expectation $E(X \wedge x)$, the following is a useful formulation for TVaR.

$\displaystyle \begin{aligned} (4) \ \ \ \ \ \text{TVaR}_p(X)&=\text{VaR}_p(X)+e(\text{VaR}_p(X)) \\&=\text{VaR}_p(X)+\frac{E[X]-E[X \wedge \text{VaR}_p(X)]}{1-p} \end{aligned}$

See this blog post in a companion blog for more information on mean excess loss function and limited expectation $E(X \wedge x)$.
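As a quick check of (4), the sketch below applies it to a Pareto (Type II Lomax) loss and compares the result with VaR plus the mean excess loss $(\theta+\text{VaR}_p(X))/(\alpha-1)$. The limited expectation formula used in `pareto_limited_expectation` is the standard one for this distribution but is not stated in this post, so treat it as an assumption. The parameter values are hypothetical.

```python
import math

def pareto_var(p, alpha, theta):
    """100p-th percentile of the Pareto (Lomax) distribution."""
    return theta * ((1 - p) ** (-1 / alpha) - 1)

def pareto_limited_expectation(d, alpha, theta):
    """E[X ^ d] for the Pareto (Lomax) distribution, assuming alpha != 1."""
    return theta / (alpha - 1) * (1 - (theta / (d + theta)) ** (alpha - 1))

def pareto_tvar_formula4(p, alpha, theta):
    """TVaR from formula (4): VaR + (E[X] - E[X ^ VaR]) / (1 - p)."""
    var_p = pareto_var(p, alpha, theta)
    mean = theta / (alpha - 1)
    limited = pareto_limited_expectation(var_p, alpha, theta)
    return var_p + (mean - limited) / (1 - p)

def pareto_tvar_mean_excess(p, alpha, theta):
    """TVaR as VaR plus the Pareto mean excess loss (theta + VaR) / (alpha - 1)."""
    var_p = pareto_var(p, alpha, theta)
    return var_p + (theta + var_p) / (alpha - 1)

alpha, theta, p = 3.0, 2000.0, 0.99  # hypothetical parameters
via_formula4 = pareto_tvar_formula4(p, alpha, theta)
via_mean_excess = pareto_tvar_mean_excess(p, alpha, theta)
print(f"formula (4):       {via_formula4:.2f}")
print(f"VaR + mean excess: {via_mean_excess:.2f}")
```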

Tail-value-at-risk is also known as conditional tail expectation (CTE) as well as tail conditional expectation (TCE). CTE and TCE are widely used in North America. In Europe, TVaR is also known as expected shortfall (ES).

Formulas for VaR and TVaR

Many distributions have CDFs that allow relatively easy computation of percentiles. Thus VaR is quite accessible for these parametric distributions. This link has a table of distributional information for parametric distributions that includes value-at-risk. Distributions in this table that have formulas for VaR include Burr distribution, inverse Burr distribution, Pareto distribution, inverse Pareto distribution, loglogistic distribution, paralogistic distribution, inverse paralogistic distribution, Weibull distribution, inverse Weibull distribution, exponential distribution, inverse exponential distribution.

The calculation for TVaR is not so accessible in the table in the given link. In fact, for some distributions that have heavy tails, TVaR is not even defined since TVaR involves an average value of the excess of losses above a threshold. In the remainder of the post, we present formulas for TVaR for four distributions – exponential, Pareto, normal and lognormal.

| Distribution | VaR | TVaR |
|---|---|---|
| Exponential | $-\theta \text{ln}(1-p)$ | $\theta (1- \text{ln}(1-p))$ |
| Pareto | $\displaystyle \frac{\theta}{(1-p)^{1/\alpha}}-\theta$ | $\displaystyle \text{VaR}_p(X)+\frac{\theta+\text{VaR}_p(X)}{\alpha-1}$, $\alpha>1$ |
| Pareto (alternative form) | (same as above) | $\displaystyle \text{VaR}_p(X) \ \frac{\alpha}{\alpha-1}+\frac{\theta}{\alpha-1}$, $\alpha>1$ |
| Pareto (alternative form) | (same as above) | $\displaystyle \frac{\theta}{\alpha-1} \biggl[1+ \frac{\alpha}{\theta} \ \text{VaR}_p(X) \biggr]$, $\alpha>1$ |
| Normal | $\mu+z_p \ \sigma$ | $\displaystyle \mu+ \sigma \ \frac{\phi(z_p)}{1-p}$ |
| Lognormal | $\displaystyle e^{\mu+z_p \ \sigma}$ | $\displaystyle e^{\mu+0.5 \ \sigma^2} \biggl[\frac{\Phi(\sigma-z_p)}{1-p} \biggr]$ |

The exponential distribution in the table is parametrized by a scale parameter $\theta$, which happens to be the mean of the distribution (see here for more information).

The Pareto distribution in the table is the Pareto Type II Lomax distribution discussed here. It is parametrized by a shape parameter $\alpha$ and a scale parameter $\theta$. The table gives three formulations of TVaR. Note that in order for the mean to exist, the shape parameter must be greater than 1. Thus TVaR is not defined when $\alpha \le 1$.

For the normal distribution, the parameters are $\mu$ (mean) and $\sigma$ (standard deviation). The lognormal distribution is parametrized by $\mu$ and $\sigma$, meaning that the logarithm of the lognormal distribution is a normal distribution with mean $\mu$ and standard deviation $\sigma$. To use the formulas for TVaR, note that $\phi$ and $\Phi$ are the probability density function and the cumulative distribution function of the standard normal distribution, respectively. Thus $\phi$ and $\Phi$ are:

$\displaystyle \phi(x)=\frac{1}{\sqrt{2 \pi}} \ e^{-0.5 x^2}$

$\displaystyle \Phi(x)=\int_{-\infty}^x \frac{1}{\sqrt{2 \pi}} \ e^{-0.5 t^2} \ dt$

Of course $\Phi(x)$ is evaluated by using a table or software and not by evaluating the integral.
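The formulas in the table translate directly into code. The sketch below is one possible transcription using the Python standard library, with `NormalDist` supplying $\phi$, $\Phi$ and $z_p$. The function names and the parameter values in the usage lines are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: pdf is phi, cdf is Phi

def exponential_var_tvar(p, theta):
    var_p = -theta * math.log(1 - p)
    return var_p, theta * (1 - math.log(1 - p))

def pareto_var_tvar(p, alpha, theta):
    if alpha <= 1:
        raise ValueError("TVaR requires alpha > 1 (the mean must exist)")
    var_p = theta * ((1 - p) ** (-1 / alpha) - 1)
    return var_p, var_p + (theta + var_p) / (alpha - 1)

def normal_var_tvar(p, mu, sigma):
    z_p = STD_NORMAL.inv_cdf(p)
    return mu + z_p * sigma, mu + sigma * STD_NORMAL.pdf(z_p) / (1 - p)

def lognormal_var_tvar(p, mu, sigma):
    z_p = STD_NORMAL.inv_cdf(p)
    var_p = math.exp(mu + z_p * sigma)
    tail = STD_NORMAL.cdf(sigma - z_p) / (1 - p)
    return var_p, math.exp(mu + 0.5 * sigma ** 2) * tail

# Usage with hypothetical parameters at the 95% security level.
p = 0.95
results = {
    "exponential": exponential_var_tvar(p, theta=10.0),
    "pareto": pareto_var_tvar(p, alpha=3.0, theta=2000.0),
    "normal": normal_var_tvar(p, mu=0.0, sigma=1.0),
    "lognormal": lognormal_var_tvar(p, mu=0.0, sigma=1.0),
}
for name, (var_p, tvar_p) in results.items():
    print(f"{name:>12}: VaR = {var_p:10.4f}, TVaR = {tvar_p:10.4f}")
```

In each case TVaR exceeds VaR, as it must, since it averages the losses beyond the VaR threshold.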

Comment

The risk measure of value-at-risk, though simple in concept and calculation, has shortcomings. One undesirable aspect is that VaR does not possess certain desirable properties among risk measures. VaR is not a coherent risk measure. To be a coherent risk measure, it must satisfy four properties, one of which is subadditivity. When a risk measure is subadditive, the risk measure of two risks combined into one will not be greater than the sum of the risk measures for the two risks treated separately. It is desirable to have the benefits of diversification by combining several risks into one. When the risk measure is not subadditive, it cannot model the diversification of risks. It has been shown that VaR is not a subadditive risk measure. On the other hand, TVaR has been shown to be a coherent measure. It satisfies subadditivity and three other desirable properties for risk measures. The topic of risk measures is to be discussed in a subsequent post.

A Formula for a Mixture

A relatively easy-to-use formula for the tail-value-at-risk of a mixture distribution is shown in the next post.

Practice Problems

Practice problems are available in the companion blog to reinforce the concepts of value-at-risk and tail-value-at-risk.

$\copyright$ 2017 – Dan Ma