This post discusses the basic properties of the lognormal distribution. The lognormal distribution is a transformation of the normal distribution through exponentiation. As a result, some of the mathematical properties of the lognormal distribution can be derived from the normal distribution. The normal distribution is applicable in many situations but not in all situations. The normal density curve is a bell-shaped curve and is thus not appropriate in phenomena that are skewed to the right. In such situations, the lognormal distribution can be a good alternative to the normal distribution. In an actuarial setting, the lognormal distribution is an excellent candidate for a model of insurance claim sizes.
Defining the distribution
In this post, the notation log refers to the natural log function, i.e., logarithm to the base $e$. Thus log(2) = 0.693147181.
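A random variable $Y$ is said to follow a lognormal distribution if $\log(Y)$ follows a normal distribution. A lognormal distribution has two parameters $\mu$ and $\sigma$, which are the mean and standard deviation of the normal random variable $\log(Y)$. To be more precise, the definition is restated as follows:
A random variable $Y$ is said to follow a lognormal distribution with parameters $\mu$ and $\sigma$ if $\log(Y)$ follows a normal distribution with mean $\mu$ and standard deviation $\sigma$.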
Many useful probability distributions are transformations of other known distributions. The above definition shows that a normal distribution is the transformation of a lognormal distribution under the natural logarithm. Start with a lognormal distribution; taking the natural log of it gives a normal distribution. The other direction is actually more informative, i.e., a lognormal distribution is the transformation of a normal distribution by the exponential function. Start with a normal random variable $X$; exponentiating it gives a lognormal distribution, i.e., $Y=e^X$ is lognormal. The following summarizes these two transformations.
$$Y \text{ is lognormal} \longrightarrow X=\log(Y) \text{ is normal}$$
$$X \text{ is normal} \longrightarrow Y=e^{X} \text{ is lognormal}$$
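The two transformations can be checked with a small simulation. The following is a minimal sketch using only the Python standard library; the parameter values, the sample size and the seed are assumptions chosen for illustration and do not come from the post.

```python
import math
import random
import statistics

rng = random.Random(0)   # fixed seed so the sketch is reproducible
mu, sigma = 0.5, 0.8     # hypothetical parameters, for illustration only

# Start with normal draws X and exponentiate to get lognormal draws Y = e^X.
x_samples = [rng.gauss(mu, sigma) for _ in range(100_000)]
y_samples = [math.exp(x) for x in x_samples]

# Going the other direction: the log of the lognormal draws is normal again,
# so its sample mean and standard deviation should be close to mu and sigma.
log_y = [math.log(y) for y in y_samples]
print(statistics.fmean(log_y), statistics.stdev(log_y))

# Exponentiation always gives positive values.
print(min(y_samples) > 0)
```

The sample statistics of `log_y` will only be approximately equal to the chosen parameters, since they are estimates from random draws.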
Comparing Normal and Lognormal
Normally one of the first things to focus on is the probability density function when studying a continuous probability model. In the case of the lognormal distribution, a natural way to start is to focus on the relationship between lognormal distribution and normal distribution. In this section, we compare the following:
- The lognormal distribution with parameters $\mu$ = 0 and $\sigma$ = 1 (standard lognormal distribution).
- The normal distribution with mean 0 and standard deviation 1 (standard normal distribution).
Suppose that the random variable $Y$ follows the standard lognormal distribution. This means that $X=\log(Y)$ follows the normal distribution with mean 0 and standard deviation 1, i.e., the standard normal distribution. Going the other direction, $Y=e^X$ is the lognormal distribution we start with.
The standard normal $X$ takes on values from $-\infty$ to $\infty$. But about 99.7% of the time, $X$ takes on values between -3 and 3 (in general, a normal random variable takes on values within 3 standard deviations of its mean about 99.7% of the time).
Since exponentiation always gives positive values, the standard lognormal random variable $Y$ always takes on positive values. Since $X$ is rarely outside of -3 and +3, the standard lognormal random variable $Y$ takes on values between $e^{-3}$ = 0.049787068 and $e^{3}$ = 20.08553692 about 99.7% of the time. Thus observing a standard lognormal value over 20 would be an extremely rare event. The following figure shows the graphs of the standard normal density function and the standard lognormal density function.
In Figure 1, the standard normal density curve is a symmetric bell-shaped curve, with mean and median located at x = 0. The standard lognormal density is located entirely over the positive x-axis. This is because the exponential function always gives positive values regardless of the sign of the argument. The lognormal density curve in Figure 1 is not symmetric and is a unimodal curve skewed toward the right. All the standard normal values on the negative x-axis, when exponentiated, are in the interval (0, 1). Thus in Figure 1, the lower half of the lognormal probabilities lie in the interval x = 0 to x = 1 (i.e., the median of this lognormal distribution is x = 1). The other half of the lognormal probabilities lie in the interval $(1, \infty)$. Such a lopsided assignment of probabilities shows that the lognormal distribution is a positively skewed distribution (skewed to the right).
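The numbers quoted for the standard lognormal distribution can be reproduced in a few lines. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Probability that a standard normal value falls between -3 and 3.
prob_within_3 = z.cdf(3) - z.cdf(-3)

# The matching interval for the standard lognormal is (e^-3, e^3).
lower, upper = math.exp(-3), math.exp(3)

# Half of the lognormal probability sits on (0, 1), since log(1) = 0.
prob_below_1 = z.cdf(math.log(1))

print(prob_within_3)
print(lower, upper)
print(prob_below_1)
```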
In the above paragraph, the lower half of the normal distribution on $(-\infty, 0)$ is matched with the lognormal distribution on the interval $(0, 1)$. Such interval matching can tell us a great deal about the lognormal distribution. Another example: about 75% of the standard normal distributional values lie below x = 0.67. Thus in Figure 1, about 75% of the lognormal probabilities lie in the interval $(0, 1.95)$, where $e^{0.67} \approx 1.95$. Another example: what is the probability that a value from the lognormal distribution in Figure 1 lies between 1 and 3.5? The matching normal interval is $(0, 1.25)$, where $\log(3.5) \approx 1.25$. The normal probability in this interval is 0.8944 - 0.5 = 0.3944. Thus for a randomly generated value from the standard lognormal distribution, there is a 39.44% chance that it is between 1 and 3.5.
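The interval matching idea translates directly into code: take logs of the lognormal endpoints and use the standard normal CDF. The sketch below is one way to do it with the Python standard library; the helper name `std_lognormal_prob` is hypothetical.

```python
import math
from statistics import NormalDist

z = NormalDist()  # standard normal

def std_lognormal_prob(a, b):
    """P(a < Y <= b) for a standard lognormal Y, assuming 0 < a < b."""
    # Match the lognormal interval (a, b) with the normal interval (log a, log b).
    return z.cdf(math.log(b)) - z.cdf(math.log(a))

# Probability that a standard lognormal value lies between 1 and 3.5.
p_1_to_3_5 = std_lognormal_prob(1, 3.5)

# 75th percentile of the standard lognormal: e raised to the normal percentile.
q75 = math.exp(z.inv_cdf(0.75))

print(p_1_to_3_5, q75)
```

The results differ slightly from the 0.3944 and 1.95 in the text because the text rounds the z-values to two decimal places (1.25 and 0.67) for table lookup.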
The interval matching idea is very useful for computing lognormal probabilities (e.g. cumulative distribution function) and for finding lognormal percentiles. This idea is discussed further below to make it work for any lognormal distribution, not just the standard lognormal distribution. The following compares the two cumulative distribution functions (CDFs).
Note that the standard normal CDF basically reaches the level of y = 1 when the x-values get close to 3.0. The lognormal CDF approaches 1.0 too, but at a much slower rate. The lognormal CDF is close to 1 when x = 10 and is rapidly approaching 1 after that point.
Though the lognormal distribution is a skewed distribution, some lognormal distributions are less skewed than others. The lognormal distributions with larger values of the parameter $\sigma$ tend to be more skewed. The following is a diagram of three lognormal density curves that demonstrates this point. Note that the density curve with the small $\sigma$ of 0.25 roughly resembles a bell curve.
How to compute lognormal probabilities and percentiles
Let $Y$ be a random variable that follows a lognormal distribution with parameters $\mu$ and $\sigma$. Then the related normal random variable is $X=\log(Y)$, which has mean $\mu$ and standard deviation $\sigma$. If we raise $e$ to $X$, we get back the lognormal $Y$.
Continuing the interval matching idea, the lognormal interval $Y \le y$ will match with the normal interval $X \le \log(y)$. Both intervals receive the same probability in their respective distributions. The following states this more clearly.
$$P(Y \le y) = P(X \le \log(y))$$
On the other hand, the normal interval $X \le x$ will match with the lognormal interval $Y \le e^x$. The same probability is assigned to both intervals in their respective distributions. This idea is stated as follows:
$$P(X \le x) = P(Y \le e^{x})$$
The first matching, $P(Y \le y) = P(X \le \log(y))$, gives the cumulative distribution function of the lognormal distribution (argument $y$), which is evaluated as the CDF of the corresponding normal distribution at $\log(y)$. One obvious application of the second matching, $P(X \le x) = P(Y \le e^x)$, is an easy way to find percentiles for the lognormal distribution. It is relatively easy to find the corresponding percentile of the normal distribution. Then the lognormal percentile is $e$ raised to the corresponding percentile of the normal distribution. For example, the median of the normal distribution is at the mean $\mu$. Then the median of the lognormal distribution is $e^{\mu}$.
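The calculations for both lognormal probabilities and lognormal percentiles involve finding normal probabilities, which can be obtained using software or using a table of probability values of the standard normal distribution. To use the table approach, each normal CDF is converted to the standard normal CDF as follows:
$$P(X \le x) = P\left(\frac{X-\mu}{\sigma} \le \frac{x-\mu}{\sigma}\right) = \Phi(z)$$
where $z$ is the z-score, which is the ratio $z=(x-\mu)/\sigma$, and $\Phi(z)$ is the cumulative distribution function of the standard normal distribution, which can be looked up from a table based on the z-score. In light of this, the lognormal CDF can be expressed as follows:
$$F_Y(y) = P(Y \le y) = \Phi\left(\frac{\log(y)-\mu}{\sigma}\right)$$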
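In software, the table lookup is replaced by a call to a standard normal CDF. The following is a minimal sketch of the lognormal CDF built this way, using `statistics.NormalDist` from the Python standard library; the function name `lognormal_cdf` is an illustrative choice.

```python
import math
from statistics import NormalDist

def lognormal_cdf(y, mu, sigma):
    """P(Y <= y) for a lognormal Y with parameters mu and sigma (sigma > 0)."""
    if y <= 0:
        return 0.0  # a lognormal variable takes positive values only
    z = (math.log(y) - mu) / sigma  # z-score of log(y) under the normal
    return NormalDist().cdf(z)      # standard normal CDF, Phi(z)

# The median of the lognormal is e^mu, so the CDF there should be 0.5.
print(lognormal_cdf(math.exp(2.5), 2.5, 1.5))
```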
The following quick example demonstrates how this works.
Example 1
If $Y$ is lognormally distributed with parameters $\mu=2.5$ and $\sigma=1.5$,
- what is the probability ?
- what is the 95th percentile of this lognormal distribution?
The first answer is , which is calculated as follows:
The z-score for the 95th percentile for the standard normal distribution is z = 1.645. Then the 95th percentile for the normal distribution with mean 2.5 and standard deviation 1.5 is x = 2.5 + 1.645 (1.5) = 4.9675. Then apply the exponential function to obtain $e^{4.9675} \approx 143.67$, which is the desired lognormal 95th percentile.
As the matching $P(X \le x) = P(Y \le e^x)$ and Example 1 suggest, to find a lognormal percentile, first find the percentile for the corresponding normal distribution. If $x_p$ is the $100p$th percentile of the normal distribution, then $e^{x_p}$ is the $100p$th percentile of the lognormal distribution. Usually, we can first find the $100p$th percentile $z_p$ for the standard normal distribution. Then the normal percentile we need is $x_p=\mu + z_p \sigma$. The lognormal percentile is then:
$$e^{\mu + z_p \sigma}$$
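The percentile recipe can be sketched as a short function. This is one possible implementation using `statistics.NormalDist.inv_cdf` from the Python standard library in place of a z-table; the function name `lognormal_percentile` is hypothetical.

```python
import math
from statistics import NormalDist

def lognormal_percentile(p, mu, sigma):
    """100p-th percentile of a lognormal with parameters mu and sigma, 0 < p < 1."""
    z_p = NormalDist().inv_cdf(p)      # percentile of the standard normal
    return math.exp(mu + z_p * sigma)  # exponentiate the normal percentile

# 95th percentile and median for the parameters mu = 2.5, sigma = 1.5.
p95 = lognormal_percentile(0.95, 2.5, 1.5)
median = lognormal_percentile(0.5, 2.5, 1.5)
print(p95, median)
```

The 95th percentile computed this way is slightly below the 143.67 obtained in the text, because the text uses the rounded table value z = 1.645.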
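The above discussion shows that the explicit form of the lognormal density curve is not needed in computing lognormal probabilities and percentiles. For the sake of completeness, the following shows the probability density functions of both the normal distribution and the lognormal distribution.
$$\text{Normal:} \quad f_X(x)=\frac{1}{\sigma \sqrt{2 \pi}} \, e^{-\frac{(x-\mu)^2}{2 \sigma^2}}, \quad -\infty<x<\infty$$
$$\text{Lognormal:} \quad f_Y(y)=\frac{1}{y \, \sigma \sqrt{2 \pi}} \, e^{-\frac{(\log(y)-\mu)^2}{2 \sigma^2}}, \quad 0<y<\infty$$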
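The cumulative distribution function for the lognormal distribution is then
$$F_Y(y)=\int_0^y \frac{1}{t \, \sigma \sqrt{2 \pi}} \, e^{-\frac{(\log(t)-\mu)^2}{2 \sigma^2}} \, dt$$
Of course, we do not have to evaluate this integral, since the lognormal CDF can be obtained from the corresponding normal CDF.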
One application of the lognormal PDF is to use it to find the mode (by taking its derivative and finding the critical value). The mode of the lognormal distribution with parameters $\mu$ and $\sigma$ is $e^{\mu-\sigma^2}$.
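The mode formula can be sanity-checked numerically without doing the calculus. The sketch below evaluates the lognormal density on a grid and compares the location of the peak with $e^{\mu-\sigma^2}$. The grid search is a crude illustrative device, not the derivation referred to in the text, and the parameter values are assumptions.

```python
import math

def lognormal_pdf(y, mu, sigma):
    """Density of a lognormal with parameters mu and sigma at the point y."""
    if y <= 0:
        return 0.0
    coeff = 1.0 / (y * sigma * math.sqrt(2 * math.pi))
    return coeff * math.exp(-((math.log(y) - mu) ** 2) / (2 * sigma ** 2))

mu, sigma = 0.0, 1.0                # standard lognormal, for illustration
mode = math.exp(mu - sigma ** 2)    # closed-form mode

# Crude grid search for the peak of the density on (0, 5].
grid = [i / 1000 for i in range(1, 5001)]
numeric_mode = max(grid, key=lambda y: lognormal_pdf(y, mu, sigma))

print(mode, numeric_mode)
```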
How to find lognormal moments
To find the mean and higher moments of the lognormal distribution, we once again rely on basic information about the normal distribution. For any random variable $X$ (normal or otherwise), its moment generating function, if it exists, is defined by $M_X(t)=E(e^{tX})$. The following is the moment generating function of the normal distribution with mean $\mu$ and standard deviation $\sigma$.
$$M_X(t)=e^{\mu t+\frac{1}{2} \sigma^2 t^2}$$
As before, let $Y$ be a random variable that follows a lognormal distribution with parameters $\mu$ and $\sigma$. Then $Y=e^X$ where $X$ is normal with mean $\mu$ and standard deviation $\sigma$. Then $E(Y)=E(e^X)$ is simply the normal moment generating function evaluated at 1. In fact, the $k$th moment of $Y$, $E(Y^k)=E(e^{kX})$, is simply the normal mgf evaluated at $k$. Because the mgf of the normal distribution is defined at any real number, all moments for the lognormal distribution exist. The following gives the moments explicitly.
$$E(Y)=e^{\mu+\frac{1}{2} \sigma^2}$$
$$E(Y^k)=e^{k \mu+\frac{1}{2} k^2 \sigma^2}$$
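In particular, the variance and standard deviation are:
$$Var(Y)=E(Y^2)-E(Y)^2=e^{2\mu+2\sigma^2}-e^{2\mu+\sigma^2}=(e^{\sigma^2}-1) \, e^{2\mu+\sigma^2}$$
$$\sigma_Y=\sqrt{e^{\sigma^2}-1} \; e^{\mu+\frac{1}{2} \sigma^2}$$
These two formulas give the variance and standard deviation if the parameters $\mu$ and $\sigma$ are known. They do not need to be committed to memory, since they can always be generated from the moments $E(Y^k)=e^{k\mu+\frac{1}{2}k^2\sigma^2}$. As indicated before, the lognormal median is $e^{\mu}$, which is always less than the mean, which is $e$ raised to $\mu+\frac{1}{2}\sigma^2$. So the mean is greater than the median by a factor of $e$ raised to $\frac{1}{2}\sigma^2$. The mean being greater than the median is another sign that the lognormal distribution is skewed right.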
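Since every moment is the normal mgf evaluated at an integer, the moments, variance and mean-to-median ratio are one-liners in code. The following is a minimal sketch; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def lognormal_moment(k, mu, sigma):
    """k-th raw moment E(Y^k): the normal mgf evaluated at k."""
    return math.exp(k * mu + 0.5 * k ** 2 * sigma ** 2)

def lognormal_variance(mu, sigma):
    """Var(Y) = E(Y^2) - E(Y)^2, generated from the raw moments."""
    return lognormal_moment(2, mu, sigma) - lognormal_moment(1, mu, sigma) ** 2

mu, sigma = 1.0, 0.5  # hypothetical parameters

mean = lognormal_moment(1, mu, sigma)
median = math.exp(mu)
variance = lognormal_variance(mu, sigma)

# Closed form of the variance, for comparison with the moment-based value.
closed_form = (math.exp(sigma ** 2) - 1) * math.exp(2 * mu + sigma ** 2)

print(mean, median, mean / median)  # the ratio equals e^(sigma^2 / 2)
print(variance, closed_form)
```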
Example 2
Suppose $Y$ follows a lognormal distribution with mean 12.18 and variance 255.02. Determine the probability that $Y$ is greater than its mean.
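With the given information, we have:
$$E(Y)=e^{\mu+\frac{1}{2} \sigma^2}=12.18$$
$$Var(Y)=(e^{\sigma^2}-1) \, e^{2\mu+\sigma^2}=255.02$$
From the last equation, we can solve for $\sigma$. The following shows the derivation, which uses the fact that $e^{2\mu+\sigma^2}=E(Y)^2$.
$$(e^{\sigma^2}-1) \, (12.18)^2=255.02$$
$$e^{\sigma^2}=1+\frac{255.02}{12.18^2} \approx 2.7190$$
$$\sigma^2=\log(2.7190) \approx 1.0003$$
Thus we can take $\sigma=1$. Then plug $\sigma=1$ into $e^{\mu+\frac{1}{2}\sigma^2}=12.18$ to get $\mu=\log(12.18)-0.5 \approx 2$. The desired probability is:
$$P(Y>12.18)=1-\Phi\left(\frac{\log(12.18)-2}{1}\right)=1-\Phi(0.5)=1-0.6915=0.3085$$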
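The steps in this example can be sketched in code: recover the parameters from the mean and variance, then evaluate the normal CDF at the log of the mean. The helper name `lognormal_params_from_mean_var` is hypothetical. Unlike the text, the sketch does not round $\sigma$ to 1, so the result agrees with 0.3085 only approximately.

```python
import math
from statistics import NormalDist

def lognormal_params_from_mean_var(mean, variance):
    """Solve E(Y) = mean and Var(Y) = variance for the parameters (mu, sigma)."""
    sigma2 = math.log(1 + variance / mean ** 2)  # from (e^{sigma^2} - 1) * mean^2 = variance
    mu = math.log(mean) - sigma2 / 2             # from e^{mu + sigma^2 / 2} = mean
    return mu, math.sqrt(sigma2)

mu, sigma = lognormal_params_from_mean_var(12.18, 255.02)

# P(Y > mean) = 1 - Phi((log(mean) - mu) / sigma)
prob_above_mean = 1 - NormalDist().cdf((math.log(12.18) - mu) / sigma)

print(mu, sigma, prob_above_mean)
```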
For any random variable $Y$, a linear transformation of $Y$ is the random variable $aY+b$ where $a$ and $b$ are real constants. It is well known that if $Y$ follows a normal distribution, any linear transformation of $Y$ also follows a normal distribution. Does this apply to the lognormal distribution? A linear transformation of a lognormal distribution may not have distributional values over the entire positive x-axis. For example, if $Y$ is lognormal, then $Y+1$ is technically not lognormal since the values of $Y+1$ lie in $(1, \infty)$ and not $(0, \infty)$. Instead, we focus on the transformations $cY$ where $c>0$ is a constant. We have the following fact.
If $Y$ has a lognormal distribution with parameters $\mu$ and $\sigma$, then $cY$ has a lognormal distribution with parameters $\mu+\log(c)$ and $\sigma$.
The effect of the constant adjustment of the lognormal distribution is on the $\mu$ parameter, which is adjusted by adding the natural log of the constant $c$. Note that the adjustment on $\mu$ is addition and not multiplication. The $\sigma$ parameter is unchanged.
One application of the transformation $cY$ is that of inflation. For example, suppose $Y$ represents claim amounts in a given calendar year arising from a group of insurance policies. If the insurance company expects that the claim amounts in the next year will increase by 10%, then $1.1Y$ is the random variable that models next year's claim amounts. If $Y$ is assumed to be lognormal, then the effect of the 10% inflation is on the $\mu$ parameter as indicated above.
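To see why the transformation $cY$ works as described, let's look at the cumulative distribution function of $T=cY$.
$$F_T(y)=P(cY \le y)=P\left(Y \le \frac{y}{c}\right)=F_Y\left(\frac{y}{c}\right)=\int_0^{y/c} \frac{1}{t \, \sigma \sqrt{2 \pi}} \, e^{-\frac{(\log(t)-\mu)^2}{2 \sigma^2}} \, dt$$
Taking the derivative of the last item above, we obtain the probability density function $f_T(y)$.
$$f_T(y)=\frac{1}{(y/c) \, \sigma \sqrt{2 \pi}} \, e^{-\frac{(\log(y/c)-\mu)^2}{2 \sigma^2}} \cdot \frac{1}{c}=\frac{1}{y \, \sigma \sqrt{2 \pi}} \, e^{-\frac{(\log(y)-(\mu+\log(c)))^2}{2 \sigma^2}}$$
Comparing with the lognormal density function $f_Y(y)=\frac{1}{y \sigma \sqrt{2\pi}} e^{-(\log(y)-\mu)^2/(2\sigma^2)}$, the last line is the lognormal density function with parameters $\mu+\log(c)$ and $\sigma$.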
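The effect of the constant $c$ can also be checked numerically by comparing $P(cY \le y)=F_Y(y/c)$ with the CDF of a lognormal distribution whose $\mu$ is shifted by $\log(c)$. The sketch below uses the 10% inflation factor from the text; the parameter values $\mu=2$ and $\sigma=1$ and the evaluation points are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def lognormal_cdf(y, mu, sigma):
    """P(Y <= y) for a lognormal Y with parameters mu and sigma."""
    if y <= 0:
        return 0.0
    return NormalDist().cdf((math.log(y) - mu) / sigma)

mu, sigma = 2.0, 1.0  # hypothetical parameters of this year's claims
c = 1.1               # 10% inflation

max_gap = 0.0
for y in (1.0, 5.0, 20.0, 100.0):
    via_scaling = lognormal_cdf(y / c, mu, sigma)          # P(cY <= y) = P(Y <= y/c)
    via_shift = lognormal_cdf(y, mu + math.log(c), sigma)  # lognormal(mu + log c, sigma)
    max_gap = max(max_gap, abs(via_scaling - via_shift))
    print(y, via_scaling, via_shift)

print(max_gap)  # only floating-point noise separates the two computations
```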
Distributional quantities involving higher moments
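As the moment formula $E(Y^k)=e^{k\mu+\frac{1}{2}k^2\sigma^2}$ shows, all moments exist for the lognormal distribution. As a result, any distributional quantity that is defined using moments can be described explicitly in terms of the parameters $\mu$ and $\sigma$. We highlight three such distributional quantities: coefficient of variation, coefficient of skewness and kurtosis. The following shows their definitions. The calculation is done by plugging in the moments obtained from the moment formula.
$$CV=\frac{\sigma_Y}{\mu_Y}$$
$$\gamma_1=\frac{E[(Y-\mu_Y)^3]}{\sigma_Y^3}$$
$$\gamma_1=\frac{E(Y^3)-3 \mu_Y \sigma_Y^2-\mu_Y^3}{(\sigma_Y^2)^{1.5}}$$
$$\beta_2=\frac{E[(Y-\mu_Y)^4]}{\sigma_Y^4}$$
$$\beta_2=\frac{E(Y^4)-4 \mu_Y E(Y^3)+6 \mu_Y^2 \sigma_Y^2+3 \mu_Y^4}{\sigma_Y^4}$$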
The above definitions are made for any random variable $Y$. The notations $\mu_Y$ and $\sigma_Y$ are the mean and standard deviation of $Y$, respectively. The coefficient of variation (CV) is the ratio of the standard deviation to the mean. It is a standardized measure of dispersion of a probability distribution or frequency distribution. The coefficient of skewness $\gamma_1$ is defined as the ratio of the third central moment about the mean to the cube of the standard deviation. The second line in the above definition is an equivalent form that is in terms of the mean, variance and the third raw moment, which may be easier to calculate in some circumstances. Kurtosis $\beta_2$ is defined to be the ratio of the fourth central moment about the mean to the square of the second central moment about the mean. The second line in the definition gives an equivalent form that is in terms of the mean, variance and the third and fourth raw moments.
The above general definitions of CV, $\gamma_1$ and $\beta_2$ can be evaluated for the lognormal distribution. The mean, variance and higher raw moments can be obtained by using the moment formula $E(Y^k)=e^{k\mu+\frac{1}{2}k^2\sigma^2}$. Then it is a matter of plugging the relevant items into the above definitions. The following example shows how this is done.
Example 3
Determine the CV, $\gamma_1$ and $\beta_2$ of the lognormal distribution in Example 2.
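The calculation in Example 2 shows that the lognormal parameters are $\mu=2$ and $\sigma=1$. Now use the moment formula $E(Y^k)=e^{k\mu+\frac{1}{2}k^2\sigma^2}$ to get the ingredients.
$$\mu_Y=E(Y)=e^{2.5} \approx 12.1825$$
$$E(Y^2)=e^{6} \approx 403.4288$$
$$E(Y^3)=e^{10.5} \approx 36315.50$$
$$E(Y^4)=e^{16} \approx 8886110.52$$
$$\sigma_Y^2=e^{6}-e^{5} \approx 255.0156$$
$$\sigma_Y=\sqrt{e^{6}-e^{5}} \approx 15.9692$$
Right away, CV = $\frac{15.9692}{12.1825} \approx 1.3108$. The following shows the calculation for skewness and kurtosis.
$$\gamma_1=\frac{E(Y^3)-3 \mu_Y \sigma_Y^2-\mu_Y^3}{(\sigma_Y^2)^{1.5}}=\frac{e^{10.5}-3 e^{2.5} (e^{6}-e^{5})-e^{7.5}}{(e^{6}-e^{5})^{1.5}} \approx 6.1849$$
$$\beta_2=\frac{E(Y^4)-4 \mu_Y E(Y^3)+6 \mu_Y^2 \sigma_Y^2+3 \mu_Y^4}{\sigma_Y^4}=\frac{e^{16}-4 e^{2.5} e^{10.5}+6 e^{5} (e^{6}-e^{5})+3 e^{10}}{(e^{6}-e^{5})^{2}} \approx 113.9364$$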
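The plug-in calculation for the three quantities can be sketched as follows. The function names are hypothetical; the formulas are the raw-moment forms of the coefficient of variation, skewness and kurtosis, applied to the parameters $\mu=2$ and $\sigma=1$ found in Example 2.

```python
import math

def lognormal_moment(k, mu, sigma):
    """k-th raw moment E(Y^k) of a lognormal with parameters mu and sigma."""
    return math.exp(k * mu + 0.5 * k ** 2 * sigma ** 2)

def shape_measures(mu, sigma):
    """Return (CV, skewness, kurtosis) computed from the first four raw moments."""
    m1, m2, m3, m4 = (lognormal_moment(k, mu, sigma) for k in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    sd = math.sqrt(var)
    cv = sd / m1
    skew = (m3 - 3 * m1 * var - m1 ** 3) / sd ** 3
    kurt = (m4 - 4 * m1 * m3 + 6 * m1 ** 2 * var + 3 * m1 ** 4) / var ** 2
    return cv, skew, kurt

cv, skew, kurt = shape_measures(2.0, 1.0)
print(cv, skew, kurt)
```

For comparison, a normal distribution has skewness 0 and kurtosis 3, so the values printed here reflect a heavily right-skewed, heavy-tailed distribution.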
Is there a moment generating function for the lognormal distribution?
Because the normal distribution has a moment generating function, all moments exist for the lognormal distribution (see the moment formula $E(Y^k)=e^{k\mu+\frac{1}{2}k^2\sigma^2}$ above). Does the moment generating function exist for the lognormal distribution? Whenever the mgf exists for a distribution, its moments can be derived from the mgf. What about the converse? That is, when all moments exist for a given distribution, does it mean that its moment generating function would always exist? The answer is no. It turns out that the lognormal distribution is a counterexample. We conclude this post by showing this fact.
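Let $Y$ be the standard lognormal distribution, i.e., $X=\log(Y)$ has the standard normal distribution. We show that the expectation $E(e^{tY})$ diverges to infinity when $t>0$.
$$E(e^{tY})=E(e^{t e^{X}})=\frac{1}{\sqrt{2 \pi}} \int_{-\infty}^{\infty} e^{t e^{x}} \, e^{-\frac{x^2}{2}} \, dx$$
$$\ge \frac{1}{\sqrt{2 \pi}} \int_{0}^{\infty} e^{t e^{x}-\frac{x^2}{2}} \, dx$$
$$\ge \frac{1}{\sqrt{2 \pi}} \int_{0}^{\infty} e^{t \left(1+x+\frac{x^2}{2!}+\frac{x^3}{3!}\right)-\frac{x^2}{2}} \, dx=\infty$$
The last integral in the above derivation diverges to infinity. Note that the Taylor series expansion of $e^x$ is $1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+\cdots$. In the last step, $e^x$ is replaced by $1+x+\frac{x^2}{2!}+\frac{x^3}{3!}$, which is no larger than $e^x$ when $x \ge 0$. Then the exponent in the last integral is a third degree polynomial with a positive coefficient for the $x^3$ term. Thus this third degree polynomial goes to infinity as x goes to infinity. Since the last integral is infinite, the mgf is infinite as well.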
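The divergence argument can be illustrated numerically by looking at the exponents rather than the integrals themselves. The sketch below is not a proof; it tabulates the exponent $t e^x - x^2/2$ and its cubic lower bound for one assumed value $t=0.5$ at a few assumed points, showing that both grow without bound so that the integrand cannot decay.

```python
import math

def log_integrand(x, t):
    """Exponent t*e^x - x^2/2 of the integrand (constant 1/sqrt(2*pi) dropped)."""
    return t * math.exp(x) - x * x / 2

def cubic_lower_bound(x, t):
    """Exponent after replacing e^x by its third-degree Taylor polynomial."""
    return t * (1 + x + x ** 2 / 2 + x ** 3 / 6) - x * x / 2

t = 0.5                   # any t > 0 leads to the same conclusion
xs = (0, 5, 10, 20, 40)   # illustrative evaluation points

for x in xs:
    print(x, log_integrand(x, t), cubic_lower_bound(x, t))
```

For smaller positive values of $t$ the cubic bound stays negative over a longer stretch before the $x^3$ term takes over, but it still eventually increases to infinity.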
Practice problems to reinforce the calculation can be found here.
2017 – Dan Ma