Montecarlo Tutorial

8/10/2019 Montecarlo Tutorial

1/31

Tutorial for Monte Carlo Simulation with Excel

Copyright by Moshe Cohen, [email protected]

Table of Contents

Page

Introduction 2Part I. Macros of Excel 3

1. Setting of Excel 2003 and earlier versions 4

2. Setting of Excel 2007 53. Loading macros to Visual Basic Editor 7

Part II. Deciding probability weights 9

1. Estimation of the distribution form a sample 10

2. Simple considerations for distribution choice 123. More detailed information from the histogram 13

4. Probability density functions of distributions 165. Selecting a distribution without a sample 16

Part III. Simple static simulation 21

1. Running the simulation 21

2. Appreciating the results 23Part IV. More examples 25

1. Portfolio analysis 24

2. Project planning 253. Travel time estimation 26

4. Reliability of systems 29Index 31


2/31

2

Introduction

Many decisions must be made involving unpredictable figures1such as a share price six

months from now. Other examples are sales forecasts, time duration of performing a

completely new task, future selling prices, inflation and exchange rates.

A general strategy that may indicate what decision is most advisable is to assume several

possible realizations of the figures called scenarios -, calculate a numerical outcome

from each scenario and weigh the outcomes by the probability of the respective

realizations. Nowadays this procedure is computerized and it is possible to checkhundreds or thousands of scenarios and thereby increase the credibility of such an

analysis. The procedure is called staticorMonte-Carlosimulation(or just simulationin

the rest of this tutorial). Excel can carry out static simulations with the aid of suitablemacros.

This tutorial expects some familiarity with Excel but none for statistics or probability. Itillustrates static simulations and introduces the minimal theory underlying probabilities of

realizations. The tools for these are the macros available in

http://my.jce.ac.il/bani/StaticSimulation/Sources .

The macros are free in the sense of GNU Lesser General Public License (LGPL) thatmeans approximately that they can be freely used for any purpose, including commercial

one; but please read LGPL for the exact meaning. The macros can make essentially the

same job as the commercial programs ExpertFit and Crystal Ball that sell forhundreds of dollars.

At first it is explained how to attach probability weights (in Part II) aided by macro

ChooseDistribution and only then how to use the Simulate macro that performs the

simulation run (in Part III). If you have not done yet, please, install the macros and set upExcel so that you can rerun all the examples given in this tutorial (explained in Part I).

Part IV shows an assortment of simple example problems.

An introductory example used in Parts II and III is the problem of an American call

option. It is a decision whether it is worthwhile to pay a given sum now (called

premium) for the right to buy a share on an agreed future date (called expiration date) at a

given fixed price (called strike price) agreed today. Obviously, this depends on the priceof the share at the expiration date. The outcome is supposed to aid the decision maker.

The outcome is based on the difference between the expense and the gross profit while

the expense is the sum of the premium and the strike price and the gross profit is theshares market value at the expiration date. It is worthwhile to pay now the premium only

if profit is more probable than loss.

1More exactly, stochastic variables


3/31

3

PART I. Macros of Excel

There are 22 files at http://my.jce.ac.il/bani/StaticSimulation and its subdirectory (besidesthis file) listed also in Table 1.

No. Source file File types

1 BoundsFrm.frm

Source files

2 chooseFrm.frm

3 FitFrm.frm

4 mainForm.frm

5 htpFrm.frm

6 progressForm.frm

7 ChooseDistribution.bas

8 FitModule.bas

9 Simulate.bas

10 Utility.bas

11 BoundsFrm.frx

Files containing information for the above *.frm files.

Each one must be in the same directory as thecorresponding *.frm files

12 chooseFrm.frx

13 FitFrm.frx

14 mainForm.frx

15 htpFrm.frx

16 progressForm.frx

17 Empty.xls Empty Excel file containing the above macros

18 Samples.xls Excel file with all the sample problems of this tutorial

19 GNULesserGeneralPblicLicense.txt The licenses that cover the Visual Basic macros andforms20 GNUlicense.txt

21 DrawDistributionSetup.exe Program drawing probability density function

22 MonteCralo.zip Zip file containing all the above

Table 1. Files for Monte Carlo simulation

You may wish to download just the zip file (#22) and open its contents on your computer.

File #21 is not absolutely necessary.

If you are not interested in the source files themselves, you may download only the Excelfiles. Please, save and keep intact the original Empty.xlsas is and work only on its copies.

Empty.xlscontains only the macros.

Macros are suspected to be malicious by Excel and therefore many Excel installations are

routinely instructed to ignore them. Therefore, first of all one has to assure that Excelprograms are ready to run macros. For the 2007 version one also has to assure that themacros are saved in the workbooks. Some default settings in Excel must be altered for

running, saving and examining macros. How to do them is described in the next sections.


4/31

4

1. Settings of Excel 2003 and earlier versions

The security level is set as follows. Go to security option by clicking on the sub menu

Security that can be reached by Tools > Macro (see Fig 1.). The appropriate settingis shown in Fig. 2. If the Macros button is clicked (see Fig 2.) a list of available

macros are displayed. The examination of the macros source code is possible via Tools >Macro > Visual Basic Editor (also in Fig 2., about the Editor see further in Section 3.).Check if menu Tools contains the item Data An1alysis (Like can be seen in Fig. 1).

If not, you have to install the Data An1alysis add-in by selecting Tools > Add-Ins

that pops up a list (Fig. 3) where ToolPak is to be selected.

Figure 1. Macro menu in Excel 2003

Figure 2. Excel 2003 security options


5/31

5

2. Settings of Excel 2007

Clicking the Office button (in the left upper corner) opens a drop down menu

where the Excel Options button (located at the bottom of the menu) is to be clicked.

Here it is needed to change at least two or possibly up to four settings concerning:

the format of saved workbook under Save, the security of the macros under Trust Center,

the Visual Basic Editor (if you want to observe/alter the macro) under Popularand

making histograms (if you wish so) under Add-Ins

Figure 3. Choice of Add-In for Statistical Data Analysis

The change is necessary because the default of Excel 2007 is not to save macros, not torun them, not to let users use Visual Basic and not to include any add-ins. Please note that

if settings are changed they become effective only after closing and reopening Excel.

The format that saves macros is shown in Fig 4.

Figure 4. Setting that saves macros

The security should be set as shown in Fig. 5 in the Trust Center window. The TrustCenter window pops up when the button Trust Center Settings is pressed (located at

the right bottom of Trust Center option)

Macros can be run by clicking the Macros button in the View tab of the ribbon (Fig.

6). It opens a list of all available macros.


6/31

6

The examination of the macro source code is possible in Excel 2007 only if theDeveloper tab is placed on the ribbon. This is also not a default setting and it is

necessary to request. This is done in the Popular part of Excel options. The setting is

shown in Fig 7. After this operation the Developer tab may be open where clicking theVisual Basic button pops up the Visual Basic Editor (see Fig.8; about the Editor see

under section 3).

Figure 5. Setting for running macros

Figure 6. The Macros button

Figure 7. Requesting the Developer tab


7/31

7

Check if the ribbon of Data contains the Data Analysis button in the Analysisgroup as in Fig. 9. If not, you have to request it by entering Excel Options (Fig. 6)

selecting Add-Ins and clicking the Go button in the bottom. A list of add-ins is

popped up where Analysis ToolPak is to be selected (Fig. 3).

Figure 8. The Developer tab

Figure 9. The Data ribbon having the Data Analysis button

3. Loading macros to Visual Basic Editor

You may download the source files and load them yourself to any Excel file that is

supposed to run simulations. For adding the files to an Excel workbook, you have to open

the Visual Basic Editor. How to open was described above for each version.

You can import all the source files through the File sub menu of the editor as shown in

Fig. 11: click Import File that opens a window for selecting the source file. Theimporting is for the active workbook, so if the Excel program has no workbooks,

importing is disabled.

Even if you do not have any desire to look at the source, it may be useful for checking itabout any malicious content such as writing on your disk without your permission2. For

checking purposes you can read the source files with a simple text editor such as

Notepad.exe where an Excel macro cannot do any damage.

2This advice is a good way to ovoid malicious code (virus, Trojan horse, spy ware etc) in a macro. Of

course, this is possible only when the source is available.


8/31

8

Figure 10. The File menu of VBA editor


9/31

9

PART II. Deciding probability weights

It is shown in probability theory that weighing the outcomes with the probability of therealizations is equivalent with choice of realizations with high probability more often

than those with low probability. More exactly, realizations in simulation are chosen atrandom and proportionately to their probability. By doing so the estimate of the expectedoutcome may be calculated as a simple average of the scenarios outcomes.

Excel provides functions that ensures both random choice (function RAND()) and

proportionality to probability. The pertinent Excel functions are listed in Table 2; how tochoose among them, what are the meanings and their appropriate values are the topic of

this part.

StatisticalDistribution

Excel Function

Beta =BETAINV(RAND(), alpha, beta, lower bound, upper bound)

Exponential =-LN(RAND())*meanGamma =GAMMAINV(RAND(),shape parameter, scale parameter)

Log-normal =LOGINV(RAND(),mean of ln(X), standard deviation of ln(X))

Normal =NORMINV(RAND(),mean, standard deviation)

Table 2. Excel functions suitable for getting random realizations

The Excel functions relate to what is termed in probability theory as statistical

distributions3(or just distributions) listed at the left column. Macro4ChooseDistribution

helps to decide what function with what parameters to choose for a given problem. The

way to start a macro is:

in Excel 2003 and earlier versions: Tools > Macro > Macros (see Fig 1.)

in Excel 2007: click Macros button at the View tab (see Fig 6.).This lists on a pop-up window all the available5 macros of the workbook. Select

ChooseDistribution and click the Run button6. It pops up the window shown in Fig 11.

Figure 11. Opening window of macro ChooseDistribution

3For more on distributions consult http://en.wikipedia.org/wiki/Statistical_distributions4Macro #7 of Table 1.5You cannot see macro ChooseDistribution if it has not been installed; see how to install it in Part I.6You cannot run the macro if the security/trust check is to high. See the topic in Part I


10/31

10

Recall from the Introduction that for the American call option problem you want to make

an educated guess about the share price at the expiration date. You may either estimate

the price based on historical data (if available); in statistical parlance it called as

estimationbased on a sample. The sample is the historical data and its individual items

are referred as observations. Alternatively, you can guess the share price withoutprevious sample. These are the two options presented in Fig. 11. We shall discuss the firstpossibility in section 1 while the second on section 5. In the intermediate sections

important techniques and concepts are introduced.

1. Estimation of the distribution form a sample

Suppose that the expiration date is six months from now and it is possible to find

historical data about prices of 20 shares that behave similarly to the one underconsideration. So we can find the price increase or decrease (in percents) from their

prices six month ago to the prices today. We assume, for the sake of illustration, that

similar change will occur to our share in the future. The increases are displayed in Table3, while a negative number means that the price went down.

8.03 -2.58 1.59 9.17 1.61 7.30 15.66 0.59 9.00 17.98

1.11 7.66 7.14 14.82 8.95 10.58 7.11 5.62 10.86 2.62

Table 3. Increases in prices of 20 shares in percents

Pressing button OK (while fit a sample is selected) pops up another window (shown

in Fig. 14) that has a warning about the appropriate position of the sample. One cannot go

on until the sample is arranged in a single column starting with cell A1 on a sheet thatcontains only the sample as shown in Fig 12.

After the sample is arranged properly on the active worksheet (one where the user can

readily write if so wishes), fitting the sample is simply performed by clicking the Fit

button (see Fig. 14). It rearranges and fills the active sheet (see Fig. 13) and displays theresult as shown in Fig. 14.

It selects the (shifted) exponential distribution as the only distribution (from the fivedistributions mentioned in range B13:F13) that fits the sample.

Note how the active sheet looks after fitting: the sample is pushed down 13 rows, itsaverage and standard deviation are given in range B2:B3. The original sample starting

in cell A14 is sorted. This is necessary for calculating the so-called Kolmogov-Smirnov

(K-S) statistic7 that is the maximal absolute difference between the sample distribution

and the assumed theoretical distribution with parameters in rows 6-12. This is shown forthe five distributions in range B13:F13. Distributions with reasonable fit must be below

the 95% critical value8. The smaller this statistic the closer is the sample to a proposed

7For theory see http://en.wikipedia.org/wiki/Kolmogorov_Smirnov.8May be obtained from http://www.york.ac.uk/depts/maths/tables/kolmogorovsmirnov.ps, but bare is mind

that in our case we must use the two-sided test.


11/31

11

distribution. The preferred distribution in the case under consideration is the exponentialthat is shifted by the offset in row 6. Thus, to get a realization of the share price six

months from now is given by=-4.165-LN(RAND())*15.436.

Figure 12. Organizing the sample on an Excel sheet

Figure 13. The worksheet after fitting.


12/31

12

Figure 14. The fitting window

Similarly, in the beta distributed case the appropriate realization is given as

Rows 14 and downwards are the values of the assumed theoretical distribution for thearguments in column A of the same rows. Column of the log-normal distribution is empty

because it is defined only for samples having strictly positive observations.

2. Simple considerations for distribution choice.

The fact that the best fit was found as exponential may upset many financialmathematicians whose first reaction would be before consulting your sample that log-

normal distribution9should be the best choice. Such reaction highlights the principle that

the distribution carries implicit assumptions about the sample that may not be deduced

from the sample, but known to the user. There are simple questions10

that any user mayconsider before choosing a distribution. They may direct the choice to a theoretically

reasonable distribution. They are:

Is there any value below which no realization is possible? For example, timedurations cannot be negative, thus the value below which no realization ispossible in this case is 0. Such a value is called lower bound.

Is there any value above which no realization is possible? If the value is some

percent from a given total then the value above which no realization is possible is100. Such a value is called upper bound.

Is there a value that is more probable than others? Such a value is called mode.

9The log-normal distribution is, however, valid not for the percent change but for the ratio between the

present price and the price six months ago. This was raised for methodological reasons to open the

discussion for theoretical considerations.10The financial mathematicians choice of log-normal distribution is a result of much more sophisticated

considerations beyond the scope of this tutorial


13/31

13

Are the realizations deviations from the mode above the mode and below themode equiprobable? If so, the distribution is termed symmetric. Deviations from

a nominal weight or size due to slight technical variations in production

processes11are often assumed to be symmetric. The nominal values serve as the

modes.

Non-existing of a lower bound means that any negative value is possible, non-existing of

upper bound means that values may be large without any limitation, although largedeviations above the mode are extremely rare.

The answers for three questions above for the five distributions are shown in Table 4.The alpha and beta in the table are the parameters of the distribution of Table 2.

Distribution Is Lower Bound? Is Upper Bound? Symmetric?

Beta Yes Yes Only if alpha=beta

Exponential Yes No No

Gamma Yes No No

Log-normal Yes No No

Normal No No Yes

Table 4. Some basic features of distributions.

As a rule, exponential and log-normal distributions need not be candidates as gammadistribution can replace them. Exponential distribution is a special case of gamma

distribution and gamma distribution may be indistinguishably close to log-normal

distribution12

. Assuming existence of upper and lower bound for our case, the betadistribution looks more plausible than the others.

3. More detailed information from the histogram.

The histogram is a graphical summary of the data that Excel may provide. It shows how

many cases fall into given, disjoint ranges, called bins. For appreciating it, the best is to

show a histogram for our case with accompanying clarifications. The procedure ofgetting it is as follows.

1. Define the boundaries of the bins. This may be done as seen in Fig. 15 in columnC. They are six increasing numbers, while the first is less than the minimum ofthe data (see Fig. 12) and the last one is larger than the maximum. The number of

boundaries is the number of bins plus 1. The recommended number of bins is

given in Fig. 16.2. Request the window listing Excels statistical procedures from menu13 Tools >

Data Analysis (in Excel 2003 or earlier) or by clicking the Data Analysis

11This holds only if the process is under control. Indeed, large part of statistical quality control is based on

analyses of possible asymmetries.12See the comparison in subsection 4 further13In some installation the data analysis does not appear under Tools menu. See Part I. how to cause it

appear.


14/31

14

button on the ribbon of the Data tab (in Excel 2007). This opens a list of theavailable statistical methods. Histogram should be selected.

3. The selection opens a window shown in Fig. 15 where the location of the data(A1:A20 supposing the situation from Fig. 12) is entered. The output range shouldbe an empty cell that below and right to it there are empty cells so that the output

can be written there. The output is shown in Fig 17 in the range E1:F8. In columnE has the upper boundaries of the bins and in column F the number of casesfalling in the bin. Thus, there are five numbers in the bin whose boundaries are -3

and 2.

Figure 15. Generating histogram

Figure 16. Proposed number of bins


15/31

15

Figure 17. Drawing the histogram

4. Select the range F3:F7 (the number cases in the bins). Call now the chart wizard

of Excel by clicking the button (in Excel 2003 or earlier) or through theColumn chart button in the Insert tab (in Excel 2007). In Excel 2003 it opens

a window as shown in Fig. 17, in Excel 2007 is to be selected. Thissupplies the histogram that is displayed (for the data from A1:A20) in Fig. 18.

Figure 18. Histogram of the data from Fig. 12

The histogram yields visually a much richer description of data than the rudimental

information used in Table 4.

The mathematical abstraction14 of the histogram is the so-called probability densityfunction, abbreviated as pdf. A reasonable distribution may be one that resembles the

histogram. For this reason, it is necessary to shortly review the pdfs of distributions.

14If the number of bins tends to infinity and at the same time the rage of bins tends to 0 then the histogram

tends to the pdf.


16/31

16

4. Probability density functions of distributions

The program DrawDistribution.exe aided by InvF.dll(they must be in the same folder)shows the pdf for various distributions. A few remarks are in order:

the lower bounds of all the distributions having lower bounds is set to 0

the upper bound of the beta distribution is set to 1 the order of the parameters are not always the same as in Excel (as displayed in

Table 2)

Figures 19 through 30 are the pdfs of several distributions to show the influence of the

parameters on pdf. They were produced by program DrawDistribution. The horizontalaxis shows the range of values containing most realizations.

The general shape of the normal distribution (Figs 19 through 21) is the same for allparameters. The standard deviation of pdf shown in Figs 19 and 20 are equal to 1 while

the values of the means are different. Note that the corresponding ranges are the same:

6.576-1.424 = 2.576-(-2.576). The effect of changing the value of the mean is that eachpoint is translated by the difference between the two means (4 in the case of Figs 19 and

20).

Figure 19. Normal pdf with mean=0,

standard deviation=1


standard deviation=1

Figs 19 and 21 are different only in standard deviation. Both are centered at 0 (the mode

is 0), but the corresponding range in Fig 21 is twice that of Fig 19. Hence, the standard

deviation can stretch the horizontal axis. Such stretching parameter that does notinfluence the shape or location is called scale parameter. The parameter that changes the

shape is called the shapeparameter. Normal distributions have the same overall shape.

Another distribution where the shape is unchanged is the exponential distribution (offset

0) shown in Fig. 22. Its mode is always 0 and the mean is its scale parameter.

The gamma distribution is another one that has a scale parameter. The shape depends on

the shape parameter see Figs 23 and 24. The larger the shape parameter, the farther isthe mode from the lower bound (0) and the more symmetric the pdf is.


17/31

17


standard deviation=2Figure 22. Exponential pdf with mean=1

Figure 23. Gamma pdf with scale

parameter=1 and shape parameter=3

Figure 24. Gamma pdf with scale

parameter=1 and shape parameter=6

The pdf of the log-normal distribution is shown in Figs 25 and 26. The shape is a

complicate function of the parameters. Log-normal distributions are similar to gamma

distributions.

Figure 25. Log-normal pdf with ln-mean

parameter=0 and ln-standard deviationparameter=1

Figure 26. Log-normal pdf with ln-mean

parameter=0 and ln-standard deviationparameter=0.5

As indicated in Table 4, the beta pdf is symmetric if alpha=beta parameters. This is

corroborated by Figs 27 and 28. It also may be observed that the larger the parameters,the more concentrated is the pdf around the mode. If alpha is lower than beta parameter

then the mode tends to the lower bound, if alpha is larger than beta parameter then the


18/31

18

mode tends to the upper bound as demonstrated by Figs 29 and 30. The tendency isstronger when the imbalance between the two parameters is larger

Figure 27. Beta pdf with alpha parameter=3

and beta parameter=3







Finally note that the beta pdf may be very similar to normal pdf (compare Figs 21 and28). The normal distribution may be closely approximated by a beta distribution where

alpha=beat=4.3 and the bounds are mean3(standard deviation); sign is for the lowerand + for the upper bound. The beta approximation is preferable if normal distribution

may lead to negative values with probability that is not negligible.

Beta pdf may be also similar to gamma pdf (compare Figs 23 and 29). Moreover, beta

distribution can be also applied to the case where any value is equally probable between

bounds:=BETAINV(RAND(),1,1,lower bound, upper bound)

5. Selecting a distribution without a sample

It was demonstrated in the previous section that beta pdf is the most versatile and can

imitate other distributions. Therefore, the second option selected in pop-up window ofFig. 11 suggests the beta distribution and pops up a window where three characteristic

values of the beta distribution can be set (Fig 31): the two bounds and the mode as the

most typical value.


19/31

19

Figure 31. Input window when no sample is available

Consulting the histogram (Figs 17 and 18), one can think of the bounds as -3 and 22.

There are actually two modes (with frequencies 5 and 10), but possibly the bigger one(between 7 and 10) is dominant. The estimate of the bigger mode is the average of the

bins boundaries: (7+12)/2 = 9.5. To compensate for the smaller mode, for input to

window of Fig. 31, the lower boundary (7) is used rather than 9.5. The result is shown in

Fig 32. The calculation adds a new sheet, calculates15

the pdf in column D and draws thepdf..

The pdf may be made more concentrated or less concentrated around by selectingnarrower or wider and clicking the modify button. It shows visually what happens the

pdf. Moreover, different proposed arguments may be written in the cumulative

distribution. By doing so, selecting the with different parameters option and clickingthe Modify option a new pdf also recalculates and redraws the pdf.

15This is an approximation to the pdf or more exactly to pdf multiplied by a constant that has no influence

on the shape of pdf and is satisfactory for the purpose.


20/31

20

Figure 32. Output resulted from the input displayed in Fig 31


21/31

21

Part III Simple static simulations

In the problem of the American call (see Introduction) assume that:

the percentage rise of the market price of the share at the expiration date from its

present market price is beta distributed with the parameters shown in Fig 31 anddiscussed before that figure,

to profit from purchasing the option materializes only if the market price risesmore than 7% and out of 1000 scenarios, if there are more scenarios with profit

than with loss then purchasing the option is worthwhile.

1. Running the simulation

Before trying out all the scenarios one scenario must be built on an Excel sheet used as a

template for the other ones. This may be done for our problem as shown in Fig 33.

Figure 33. Template scenario for the American call problem

The values of the cells B2 and B3 were calculated by formulas. The formula in cell B2 is

BETAINV(RAND(),2.6,3.4,-3,22), the same as suggested in Fig 32 for the random

variable. (If you try out this problem cell B2 may contain another value 16). It is a price

that is randomly selected from all the possible prices with this distribution, thus it is anunpredictable figure, as was referred to in the Introduction. Formula in cell B3 is

=IF(B1 Macro > Macros (see Fig 1.)

in Excel 2007: click Macros button at the View tab (see Fig 6.).This lists on a pop-up window all the available19 macros of the workbook. Select

Simulate and click the Run button. It pops up the window shown in Fig 34.

The active cell (the cell with wide black frame) must be on the outcome cell B3 in the

example. There are four options for output. If Summary is selected the number of

16The reason is that the choice is random. The truth is that it only looks so, what is called pseudo-random.

This issue is however beyond the scope of this tutorial.17Such functions are widely used in modern probability theory and called indicator functions.18Macro #9 of Table 1.19You cannot see macro Simulate if it has not been installed; see how to install it in Part I.


22/31

22

scenarios is performed and only the average and standard deviation of the outcomes arecomputed. Besides this information there are possibilities to get the following if the

options are checked:

the outcomes of all the scenarios

histogram of the outcomes

fitting a distribution to the outcomes.

Figure 34. Simulation input form

Having clicking the Start button 1000 (or whatever is written for number of scenarios inthe simulation input form) scenarios are chosen at random and a new sheet is inserted

where all the required output is written. The sheet is shown in Fig 35.

Figure 35 Results of simulation

The result is 0.545 (so in 545 scenarios out of 1000 showed profit and 455 scenarios loss), so the profitable scenarios outweigh those with losses. (Again, if you try out this

problem range B2:B3 may contain slightly different values due to randomness.)


23/31

23

2. Appreciating the results

If scenarios are chosen at random, it may be possible that by doing so we got mostlyexceptionally good ones or bad ones. The probability that this happens may be

sufficiently small especially when many scenarios are tried. But how many is manythat is satisfactory? Are 1000 sufficient or too many? What is a reasonable number?These are the question that we deal here with.

The key is the standard deviation estimate (0.49822) and the number of scenarios tried

(1000). For illustrative purposes the same problem is rerun for different number ofscenarios see Table 5.

Numberof

scenariosAverage

Standarddeviation

tstatistic

Lower 95%confidence

limit

Upper 95%confidence

limit

9 0.56 0.53 2.31 0.15 0.96

25 0.36 0.49 2.06 0.16 0.56

100 0.47 0.50 1.98 0.37 0.57

900 0.55 0.50 1.96 0.52 0.59

1600 0.57 0.50 1.96 0.55 0.59

4900 0.55 0.50 1.96 0.54 0.57

Infinity 0.55 0.50 1.96 0.55 0.55

Table 5. Simulation of the American call option for different scenarios

The last row of the table is not obtained by simulation but rather by theoretical

calculation that is possible in this simple case. For the other rows Number of scenarios,Average and Standard deviation were obtained by actual simulations. The last three

columns are computed from the first three. To appreciate them the concepts of t statisticand confidence limit must be clarified.

Confidence limitsin statistics mean the bounds of the range that includes the true value of

the outcome in a given probability. Thus for 25 scenarios there is 95% chance that thetrue value20is between 0.16 and 0.56. Note that true value is known to be 0.55, the value

of average in the last row. The true value is indeed between the confidence limits for the

other cases as well. This should happen in 95% of the cases on the average, thus, in 19out of 20 cases. The limits are easily computed by the formula (1):

Average t(Standard deviation) /SQRT(Number of scenarios). (1)

where t is the t statistic from the table, sign is used for the lower and sign + for the

upper confidence limit. For example the lower limit for 25 scenarios is computed as:

=0.36 2.060.49 /SQRT(25)

20Its official name in statistical theory is the expected value.


24/31

24

The t statistic is given by the Excel function TINV(0.05,( Number of scenarios)-1).

The width between the confidence limits reduces when the number of scenariosincreases. The reduction of the width is however not uniform. For the present problem the

width around the true value depicted in Fig 36. For a very few observations any average

outcome is possible, for sample size larger than 100 the width reduction becomes veryslow. The width becomes 0 only at infinity, but of course, it is not reasonable to demand

infinite number of scenarios. Some realistic width must be tolerated.

Figure 36. Confidence limits vs. sample size

Now we are in a position to be able to answer the question what is the number of

reasonable replications (=scenarios or sample size) for a problem. It is one for which theconfidence limits (as given by (1)) are narrow enough. The following procedure is

therefore recommended for the appropriate sample size, N:

1) Assume that the required maximal width between 95% confidence limit is .2) For N1 simulation runs get the average and the standard deviation. N1may be in

the range 50-200 or any number that seems intuitively reasonable.3) Calculate the confidence limits given by (1). If the upper limit is not higher than

the lower limit + then no further runs are necessary, N1 is a sufficient samplesize and procedure may stop. Else perform also the following.

4) Calculate the required number of simulations21: N = 16((Standard deviation)/)25) Run N - N1more simulations and combine the results with the runs made in 2)

21The formula is a result of approximating the value of the t statistic by 2 in (1), equating the difference

between the upper and lower confidence limits as given by (1) to and solving the equation for the

number of scenarios, N.


25/31

25

Part IV More examples

We shall illustrate the possible uses of simulation in four sample problems. They are very

small to keep them simple and readily understandable, however, the size of problems arelimited only by Excel itself, the computer and the users imagination. Also the derivation

of the outcome that is called in the sequel the model, may look complicated because thederivation uses simple arguments, without reference to special techniques22

. However,with using such techniques models may be constructed routinely, or even by a computer

code, if so desired. The interested reader may turn to the author ([email protected]).

1. Portfolio analysis

Assume a portfolio consists of 3 assets: Real Property, State Bonds and Shares. Assume

that the expected return (as %) and its variability of each asset are as given in Table 6

Assets Expected return % Standard deviation of return %

Real Property 6 1.5

State Bonds 1 0.1

Shares 8 3

Table 6. The data of the assets in the portfolio

If we have 100 units in Real Property, 50 units in state bonds and 50 units in shares, what

is the expected return and its standard deviation?

We assume that the returns are normally distributed23. Then a sheet can be constructed as

shown in Fig. 37. Column C (Expected return per share) has the random returns (see

formulas in Table 7 and a random result in Fig 37) and column D has their contributions

to unit portfolio. Their sum (in cell D6) is the return of a unit portfolio, the outcome ofthe model. It is the subject of simulation (see Fig 37.)

AssetsExpected return per

share unit portfolio

Real Property =NORMINV(RAND(),6,1.5) =C3*B3/200

State Bonds =NORMINV(RAND(),1,0.1) =C4*B4/200

Shares =NORMINV(RAND(),8,3) =C5*B5/200

Sum =SUM(D3:D5)

Table 7. The formulas for the portfolio problem

22Such techniques are mostly in the domain of Operations Research.23The assumption is also for independence. Actually, returns of assets areinterdependent: in bear market

returns of most assets go down, in bull market they go usually up. If they were independent the going up or

down should happen independently. If the covariance matrix (statistical measure of correlations) can be

estimated we still can perform simulation but it is beyond the scope of this tutorial. Interested readers may

turn to the author ([email protected])


26/31

26

Figure 37. The model of portfolio and its simulation setting

Fig 37. also displays the setting of simulation. As a result, the true average is between

5.2% and 5.4% but any value approximately between 2.5% and 8% is possible.

2. Project planning24

Consider the planning of a project about some device from conception to a readyprototype. The assumption is that the whole project may be dissected to well-defined

tasks called as activities for which time durations can be estimated. The problem is to

estimate the time duration of the whole project, thus, how to translate uncertainty aboutactivity durations to uncertainty of project duration.

Some parts of several activities can be undertaken simultaneously that can shorten the

time of the whole project. The estimate of project duration is calculated assumingmaximal possible parallel execution of activities. There are some activities, however, that

cannot start without finishing others. For example it is impossible to put the roof before

some walls are built. The activity of building the walls is called a predecessor of theactivity of building the roof. The foundations are predecessor of roof but not animmediate predecessor. Predecessors lengthen project duration, so they are relevant for

estimation of project duration.

Assume that the activities, their immediate predecessors and their time estimate are given

in Table 8. A further concept is the milestone that is the earliest point of time (counted

from the beginning of the project) that a specific activity may start or the earliest point oftime that the whole project is finished. Note that

24The problem dealt in this Section is called stochastic PERT in the project management literature.


27/31

27

the milestone of activities with no predecessors is 0 (by the way that time iscounted and the fact of maximum parallelism is desired),

milestones of other activities is the time when each of its predecessors arefinished

the milestone of the whole project is the outcome of the model.

In this project there are four milestones listed in Fig 38. Note that the milestone ofFunctional design and Documentation is the same as both may start as soon as Basicspecification is done.

Activity Immediate predecessorsMinimum

timeMaximum

time

Basic specification none 2 4

Market research none 1 3

Functional design Basic specification 2 3

Documentation Basic specification 4 5

Detailed design and prototype Market research, Functional design 5 8

Table 8. Data of the activities comprising the project

Figure 38. The model of project duration and its simulation setting

Fig 38. also shows the simulation settings. The earliest time of milestone #1 is 0 and the

others are based on formulas presented in Table 9. For example, the earliest time of


28/31

28

milestone #4 (end of project) is when both Documentation (its starting time is in cell B3)and Detailed design and prototype (its starting time is in cell B4) are done. Note that in

this case other activities finish is implicit in cells B3 and B4. As only bounds are

available in Table 8, any value between the bounds it taken with equal probability that

leads to functions (see end of Section 4 of Part II).

Milestone # Earliest time

2

3

4

Table 9. The formulas for the project duration problem

As a result, the true average is between 11.9 and 12 but any value approximately between9.5 and 14.5 is possible.

3.Travel time estimation25

Consider the prediction of travel time with varying traffic conditions. Assume that for

each section of all the possible routes from origin to destination it is possible to tell the

distribution of time needed to traverse the section. The problem is illustrated for themap of Fig 39. as a graph. The origin is node A, the destination is node D, node C is an

intersection and the arrows are the sections. The minimal and maximal traverse times

through the sections are listed in Table 10.

Section minimum time maximum time

A->B 2 4

A->C 3 9

B->C 2 4

C->D 3 7

Figure 39. Map of the travel time problem Table 10. Data for the travel time problem

Define the immediate predecessor nodesas the set of nodes from which there is a section

to a given node. A has no predecessors, nodes B and C each has a single predecessorwhile node C has two predecessor nodes: A and B.

The Excel sheet (Fig 40) calculates the shortest time from node A to a given node. The

shortest time for node A is clearly 0. For the other nodes the shortest time is the minimum

over the set of all immediate predecessor nodes the sum of the shortest routes to apredecessor and the traverse time to the node under consideration. The actual formula for

this problem is given in Table 11.

As a result, the true average is between 10.1 and 10.3 but any value approximately

between 6.5 and 13.5 is possible.

25The problem dealt in this Section is called stochastic shortest route in the network algorithms literature.


29/31

29

Figure 40. The model of travel time and its simulation setting

Town Earliest time

Table 11. The formulas for the project duration problem

4.Reliability of systems

Consider a system consisting of two batteries and a camera. The system can work with asingle battery, but cannot if either the camera or both batteries are broken down. The

lifetimes of the three components are random but the maximum and the most typical

lifetime are given for each in Table 12. The lower bound of lifetimes is 0. The problem isto find the lifetime of the whole system.

Clearly, the systems lifetime is the minimum between the cameras and the maximallifetime of one of the two batteries.


30/31

30

Component Maxmum Most typical

Lifetime in hours

Camera 1300 400

Battery A 900 280

Battery B 900 280

Table 12. The lifetimes of the systems components

Reasonable distributions of the lifetimes are found by the macro ChooseDistribution

without a sample26

and rounding the parameters to a single decimal place27

. The sheet inFig. 41. has the model, the formulas in column B are in Table 41.

.

Figure 41. The model of reliability and its simulation setting

System Life time

!" #$!!%&'%(

$)%&$

"% (*(!%)

Table 41. The formulas for the reliability problem

As a result, the true average is between 325 and 343 but any value approximately

between 80 and 640 is possible.

26See Section 5 of Part II.27In lifetime analysis it is common to use not beta distribution offered by ChooseDistribution, rather

Weibull distribution that has a theoretical justification, beyond the scope of this tutorial.


31/31

Index

page page

activity 26 outcome 2

American call 2 pdf 15beta distribution 9 PERT 26

bin 13 predecessor 26

bound: lower 12 probability density function 15bound: upper 12 sample 10

call 2 scale parameter 16

ChooseDistribution macro 2 scenario 2

confidence limits 23 security 4

covariance matrix 25 shape parameter 16

Data analysis package in Excel 4 shotest route 28

distribution 9 Simulate macro 2distribution: symmetric 13 stachastic 2

estimation 9 static simulation 2

exponential distribution 9 statistical distribution 9

gamma distribution 9 statistical distribution: beta 9

histogram 13 statistical distribution: exponential 9

immediate predecessor 26 statistical distribution: gamma 9

Kolmogorov-Smirnov test 10 statistical distribution: log-normal 9

K-S test 10 statistical distribution: normal 9

log-normal distribution 9 statistical distribution: symmetric 13

lower bound 12 statistical distribution: t Student 24

macro 2 statistical distribution: Weibull 30

macro: ChooseDistribution 2 statistical independence 25

macro: Simulate 2 statistical test: Kolmogorov-Smirnov 10

milestone 26 statistical test: K-S 10

mode 12 statistical test: one sided 10

model 25 statistical test: two sided 10

Monte-Carlo 21 symmetric distribution 13

Monte-Carlo simulation 21 t distribution 24

normal distribution 9 trust 5numerical outcome 2 two sided test 10

observations 10 upper bound 12

one sided test 10 Visual Basic Editor 4

option 2 Weibull distribution 30

Montecarlo Tutorial

Documents

Transcript of Montecarlo Tutorial