Some people believe specific probability distribution functions provide more reliable estimates than the simple scenario approach described above. However, while specific probability distribution functions can provide more precision, this additional precision is often spurious. Specific probability distributions often provide less accurate estimates, because their assumptions are inappropriate and distort the issues being modelled. The exceptions to this rule arise when the assumptions inherent in some distribution functions clearly hold and a limited data set can be used to estimate distribution parameters effectively. In such cases it may be appropriate to replace a specification like that of Table 10.1 with a distribution function specification. However, it is counterproductive and dangerous to do so if the nature of these assumptions is not clearly understood or they are not clearly applicable. For example, Normal (Gaussian) distributions should not be used unless the Central Limit Theorem is clearly understood and applicable (for a discussion of what is involved, see Gordon and Pressman, 1978). Table 10.4 indicates the distributions that are often assumed and their associated assumptions.

Instead of estimating parameters for an appropriate theoretical distribution, an alternative approach is to fit a theoretical distribution to a limited number of elicited probability estimates. This can serve to reduce the number of probabilities that have to be elicited to produce a complete probability distribution.

This approach is facilitated by the use of computer software packages such as MAINOPT or @Risk. MAINOPT is a tool that models 'bathtub' curves for reliability analysis. The generic bathtub-shaped curve shows the probability of failure of a component at a particular time, given survival to that point in time. The analyst specifies parameters that define the timing of the 'burn-in', 'steady-state', and 'wear-out' periods, together with failure rates for each period. The software then produces appropriate bathtub and failure density curves. Woodhouse (1993) gives a large number of examples in the context of maintenance and reliability of industrial equipment.

Table 10.4—Applicability of theoretical probability distributions

| distribution | applicability |
| --- | --- |
| Poisson | distribution of the number of independent rare events that occur infrequently in space, time, volume, or other dimensions. Specify λ, the average number of rare events in one unit of the dimension (e.g., the average number of accidents in a given unit of time) |
| exponential | useful for modelling time to failure of a component where the length of time a component has already operated does not affect its chance of operating for an additional period. Specify λ, the average time to failure, or 1/λ, the probability of failure per unit time |
| uniform | appropriate where any value in the specified range [L, U] is equally likely. Specify L and U. f(x) = 1/(U − L) for L ≤ x ≤ U, 0 elsewhere; mean = (U + L)/2; variance = (U − L)²/12 |
| standard Normal | f(x) = exp(−x²/2)/√(2π); mean = 0; variance = 1. Appropriate for the distribution of the mean value of the sum of a large number of independent random variables (or a small number of Normally distributed variables). Let Y₁, Y₂, …, Yₙ be independent and identically distributed random variables with mean μ and variance σ² < ∞. Define xₙ = √n(Ȳ − μ)/σ, where Ȳ = (1/n)ΣYᵢ. Then the distribution function of xₙ converges to the standard Normal distribution function as n → ∞. Requires μ and σ² to be estimated |
| standard Normal (binomial approximation) | if y represents the number of 'successes' in n independent trials of an event for which p is the probability of 'success' in a single trial, then the variable x = (y − np)/√(np(1 − p)) has a distribution that approaches the standard Normal distribution as the number of trials becomes large. The approximation is fairly good as long as np > 5 when p ≤ 0.5 and n(1 − p) > 5 when p > 0.5. Requires specification of p and n |

Figure 10.3—The triangular distribution: lower bound L, most likely value M, upper bound U; expected value = (L + M + U)/3
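As a quick numerical check of two of the properties quoted in Table 10.4, the following sketch (plain Python with only the standard library; the parameter values are illustrative) simulates the exponential distribution's memoryless property and the uniform distribution's mean and variance:

```python
import math
import random

random.seed(42)
lam = 0.5  # failure rate; 1/lam is the average time to failure

# Exponential memorylessness: P(T > s + t | T > s) equals P(T > t),
# so time already survived does not change the chance of surviving longer.
samples = [random.expovariate(lam) for _ in range(200_000)]
s, t = 1.0, 2.0
survived_s = [x for x in samples if x > s]
cond = sum(1 for x in survived_s if x > s + t) / len(survived_s)
uncond = sum(1 for x in samples if x > t) / len(samples)
print(cond, uncond)  # both ≈ exp(-lam * t) ≈ 0.37

# Uniform on [L, U]: mean (U + L)/2 and variance (U - L)^2 / 12.
L, U = 10.0, 30.0
u = [random.uniform(L, U) for _ in range(200_000)]
mean = sum(u) / len(u)
var = sum((x - mean) ** 2 for x in u) / len(u)
print(mean)  # ≈ (U + L)/2 = 20
print(var)   # ≈ (U - L)^2 / 12 ≈ 33.3
```

This kind of simulation check is a cheap way to confirm that a chosen theoretical distribution behaves the way its assumptions claim before it is used in an analysis.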

A popular choice for many situations is the triangular distribution. This distribution is simple to specify, covers a finite range with values in the middle of the range more likely than values at the extremes, and can also show a degree of skewness if appropriate. As shown in Figure 10.3, this distribution can be specified completely by just three values: the most likely value, an upper bound or maximum value, and a lower bound or minimum value.
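Given the three values L, M, and U of Figure 10.3, the triangular distribution is also easy to sample from directly. The sketch below (plain Python; the parameter values are illustrative) uses inverse-transform sampling and checks the expected value (L + M + U)/3 against a simulation. Python's standard library also provides `random.triangular(low, high, mode)` for the same purpose.

```python
import random

def triangular_sample(L, M, U, rng=random):
    """Inverse-transform sample from a triangular distribution.

    CDF below the mode: F(x) = (x - L)^2 / ((U - L)(M - L));
    above the mode:     F(x) = 1 - (U - x)^2 / ((U - L)(U - M)).
    """
    u = rng.random()
    cut = (M - L) / (U - L)  # F(M): probability mass below the mode
    if u < cut:
        return L + ((U - L) * (M - L) * u) ** 0.5
    return U - ((U - L) * (U - M) * (1 - u)) ** 0.5

random.seed(1)
L, M, U = 5.0, 8.0, 17.0  # illustrative duration estimate, e.g. in weeks
xs = [triangular_sample(L, M, U) for _ in range(200_000)]
mean = sum(xs) / len(xs)
print(mean)  # ≈ (L + M + U) / 3 = 10
```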

Alternatively, assessors can provide 'optimistic' and 'pessimistic' estimates in place of maximum and minimum possible values, where there is an x% chance of exceeding the optimistic value and a (100 — x)% chance of exceeding the pessimistic value. A suitable value for x to reflect the given situation is usually 10, 5, or 1%.
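For a triangular distribution, the 'optimistic' and 'pessimistic' values for a given x can be read off the inverse CDF. A sketch (plain Python; the L, M, U values are illustrative, and the estimate is framed as a bigger-is-better quantity so that the optimistic value is the one with an x% chance of being exceeded, matching the text):

```python
def triangular_quantile(p, L, M, U):
    """Inverse CDF of a triangular distribution: the value below which
    a fraction p of outcomes fall."""
    cut = (M - L) / (U - L)  # CDF evaluated at the mode
    if p <= cut:
        return L + ((U - L) * (M - L) * p) ** 0.5
    return U - ((U - L) * (U - M) * (1 - p)) ** 0.5

L, M, U = 5.0, 8.0, 17.0  # illustrative bounds and most likely value
x = 0.10                  # the usual 10% case

# Value with an x% chance of being exceeded (the optimistic estimate):
optimistic = triangular_quantile(1 - x, L, M, U)
# Value with a (100 - x)% chance of being exceeded (the pessimistic estimate):
pessimistic = triangular_quantile(x, L, M, U)
print(pessimistic, optimistic)
```

Working in the opposite direction — recovering L and U from elicited 10% and 90% values plus a most likely value — requires a numerical solve of the same two CDF equations.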

In certain contexts, estimation of a triangular distribution may be further simplified by assuming a particular degree of skewness. For example, in the case of activity durations in a project-planning network, Williams (1992) and Golenko-Ginzburg (1988) have suggested that durations tend to have a 1:2 skew, with the most likely value lying one-third of the way along the range (i.e., 2(M − L) = (U − M) in Figure 10.3).
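Under this 1:2 skew assumption only the two bounds need to be elicited, and the mode follows. A minimal sketch (the range values are illustrative):

```python
def skewed_mode(L, U):
    """Mode implied by a 1:2 skew: M lies one-third of the way from L to U."""
    return L + (U - L) / 3.0

L, U = 10.0, 40.0  # illustrative duration range
M = skewed_mode(L, U)
print(M)  # one-third along the range
print(abs(2 * (M - L) - (U - M)) < 1e-9)  # the relation quoted in the text holds
```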

The triangular distribution is often thought to be a convenient choice of distribution for the cost and duration of many activities where the underlying processes are obscure or complex. Alternative theoretical distributions such as the Beta, Gamma, and Berny (Berny, 1989) distributions can be used to model more rounded, skewed distributions, but their analytical forms lack the simplicity and transparency of the triangular distribution (Williams, 1992). In the absence of any theoretical reasons for preferring them, and given limited precision in estimates of distribution parameters, it is doubtful whether use of Beta, Gamma, or Berny distributions has much to offer over the use of the simple triangular distribution.

In our view, for reasons indicated earlier, it is doubtful that triangular distributions offer any advantages over the use of the approach illustrated in Example 10.5. They may cause significant underestimation of extreme values. The use of an absolute maximum value also raises difficulties (discussed in the next subsection) about whether or not the absolute value is elicited directly from the estimator.
