Determining the optimal sample size in the Monte Carlo experiments


Selçuk Journal of Applied Mathematics (Selçuk J. Appl. Math.), Vol. 7, No. 2, pp. 103-111, 2006


Mustafa Y. ATA

Department of Statistics, Faculty of Arts and Sciences, Gazi University, Ankara. E-mail: yavuzata@gazi.edu.tr

Summary. A convergence criterion for Monte Carlo estimates is proposed which can be used as a stopping rule for Monte Carlo experiments. The proposed criterion searches for a convergence band of a given width and length such that the probability of the Monte Carlo sample variance falling outside this band is practically null. After convergence to the process variance has been realized according to the new rule, a confidence interval in the usual statistical sense can be determined for the steady-state mean of the process.

Key words: sequential confidence interval, stopping rule, Monte Carlo, convergence

1. Introduction

In Monte Carlo (MC) experiments, the aim of which is to estimate the expected value of some stochastic process, statistically valid confidence intervals (CIs) can be constructed only if the steady-state mean and the variance of the process have the properties that

$$\lim_{n\to\infty}\left(\bar{x}_n=\frac{1}{n}\sum_{i=1}^{n}x_i\right)\overset{\text{w.p.1}}{\Longrightarrow}\mu;\qquad \lim_{n\to\infty}\left(s_n^2=\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x}_n)^2\right)\overset{\text{w.p.1}}{\Longrightarrow}\sigma^2 \tag{1}$$

where the MC sample $\{x_i : i = 1(1)n\}$ is a realization of the stochastic process $\{X_1, X_2, \ldots\}$ composed of independent and identically distributed (i.i.d.) random variables. If the process variance $\sigma^2$ is known, the required precision can be stated in terms of the absolute error, $\varepsilon > |\bar{x}_n-\mu|$, and by appealing to the Central Limit Theorem (CLT) a CI can be constructed as

$$\Pr\left(|\bar{x}_n-\mu|<z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)=1-\alpha \tag{2}$$

where $z_{\alpha/2}$ is the upper $100(1-\alpha/2)$ percentile of the standard normal distribution. Hence, relying on the CI in Eq.(2), the deterministic sample size can be determined as

$$n(\varepsilon,\alpha;\sigma^2<\infty)=\left\lceil\left(\frac{z_{\alpha/2}\,\sigma}{\varepsilon}\right)^{2}\right\rceil \tag{3}$$

according to the absolute error criterion, which will be referred to here as the Acceptable Confidence Interval Rule (ACIR).
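As a small illustration of Eq.(3), the sketch below computes the ACIR sample size for a known process standard deviation. The function name `acir_sample_size` and the example values are assumptions chosen for illustration; they do not come from the article.

```python
import math
from statistics import NormalDist


def acir_sample_size(epsilon, alpha, sigma):
    """Eq. (3): n = ceil((z_{alpha/2} * sigma / epsilon) ** 2)."""
    # Upper alpha/2 point of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil((z * sigma / epsilon) ** 2)


# Hypothetical example: Bernoulli(0.5) process (sigma = 0.5),
# absolute error 0.01 at the 95% confidence level.
n = acir_sample_size(epsilon=0.01, alpha=0.05, sigma=0.5)
print(n)  # 9604
```

Because the required size grows with the inverse square of the absolute error, halving the error roughly quadruples the number of MC sample points.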

In practice, the variance of the process is also unknown and must be estimated simultaneously with the process mean. The moments of the process can be estimated by one of two general approaches. In the fixed-sample-size procedure, referred to here as FCIR, a single simulation run of an arbitrary fixed length $n_F$ is performed and, using the computed variance of the MC sample of size $n_F$ as the unbiased point estimate of the process variance, a deterministic sample size is chosen according to Eq.(3) for a given acceptable CI half width (ACIHW). An experimenter using this procedure has to be content with either a default value of the confidence level or a default value of the CI half width. Having selected an acceptable CI half width, the experimenter usually prefers to perform the MC experiment with excessively long runs so that the default confidence level does not fall below a reasonable level. Almost all published MC studies fix the MC sample size at an excessive number, usually some multiple of 1000, i.e. $1000k$, $k = 1, 2, \ldots$, and, being sure that convergence is attained, do not state any confidence measure. Therefore, in this procedure, the saving in computing time from not estimating the process variance may be outweighed by the time required to generate the redundant MC sample points.

In the sequential procedures, the length of a single simulation run is increased sequentially until an "acceptable" CI with a given width and confidence level can be constructed. The stopping rule proposed by Chow and Robbins [1], based on Eq.(3), is the standard sequential procedure for determining the optimal MC sample size, and it will be referred to here as the Sequential Confidence Interval Rule (SCIR). If the computational time of generating an MC sample point is common to both procedures and the default MC sample size in the sequential procedure is $n_S$, which is much lower than $n_F$, then the SCIR is $100[(n_F-n_S)/n_F]\%$ more efficient than the FCIR.
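The sequential idea can be sketched as follows: keep sampling until the CI half width computed from the running sample variance drops to the acceptable level, then compare the resulting sample size with a fixed run length. This is a minimal sketch under stated assumptions, not the exact procedure of Chow and Robbins [1]: the names `scir` and `relative_saving`, the warm-up guard `n_min`, the cap `n_max`, the seed and the fixed length 10000 are all hypothetical.

```python
import math
import random
from statistics import NormalDist


def scir(draw, epsilon, alpha, n_min=30, n_max=1_000_000):
    """Sample until z_{alpha/2} * sqrt(s_n^2 / n) <= epsilon (sequential rule)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n, total, total_sq = 0, 0.0, 0.0
    mean = var = float("nan")
    while n < n_max:
        x = draw()
        n += 1
        total += x
        total_sq += x * x
        if n < n_min:  # assumed guard against stopping on very few points
            continue
        mean = total / n
        var = max((total_sq - n * mean * mean) / (n - 1), 0.0)
        if z * math.sqrt(var / n) <= epsilon:
            break
    return n, mean, var


def relative_saving(n_fixed, n_seq):
    """Percentage of sample points saved relative to a fixed run length."""
    return 100.0 * (n_fixed - n_seq) / n_fixed


rng = random.Random(12345)  # arbitrary seed
n_seq, mean, var = scir(rng.random, epsilon=0.01, alpha=0.05)
print(n_seq, round(mean, 4), round(var, 5))
print(relative_saving(10_000, n_seq))
```

For U(0,1) variates the stopping point should land near the value given by Eq.(3) with the true variance 1/12, i.e. a little above 3000 points, so a fixed run of 10000 points would spend most of its effort on redundant sample points.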

If one assumes that the process is stationary and not autocorrelated, then there is no theoretical difficulty in constructing statistically valid CIs for the process mean and the variance, respectively, such as

$$\Pr\left(|\bar{x}_n-\mu|<t_{1-\alpha/2;\,n-1}\sqrt{\frac{s_n^2}{n}}\right)=1-\alpha \tag{4}$$

$$\Pr\left[\frac{(n-1)s_n^2}{\chi^2_{1-\alpha/2;\,n-1}}<\sigma^2<\frac{(n-1)s_n^2}{\chi^2_{\alpha/2;\,n-1}}\right]=1-\alpha \tag{5}$$

where $t_{1-\alpha/2;\,n-1}$ is the $(1-\alpha/2)$ quantile of the $t$ distribution with $n-1$ degrees of freedom, and $\chi^2_{\alpha/2;\,n-1}$ and $\chi^2_{1-\alpha/2;\,n-1}$ are the $(\alpha/2)$ and $(1-\alpha/2)$ quantile points, respectively, of the chi-square distribution with $n-1$ degrees of freedom. The main argument in the present study is that it is more reasonable to obtain first the MC sample estimate of the process variance by reducing the half width of sequentially constructed CIs for the process variance in Eq.(5) to a predetermined level of precision, i.e.

$$\delta=\left[\frac{(n^*-1)\,s_{n^*}^2}{\chi^2_{\alpha/2;\,n^*-1}}-\frac{(n^*-1)\,s_{n^*}^2}{\chi^2_{1-\alpha/2;\,n^*-1}}\right] \tag{6}$$

and to determine the MC sample size according to (3). However, since the chi-square distribution is not symmetric, constructing sequential confidence intervals with equal probability tails for the MC process variance is not a trivial job and requires a considerable amount of computation [2]. Therefore, in estimating the process variance, the sequential confidence interval approach is not appropriate to apply directly from the point of view of computational efficiency. In this article, a convergence criterion is introduced which can be used to devise a stopping rule for MC experiments. The newly introduced convergence criterion is applied in estimating the process variance, and then the optimal MC sample size is determined by (3). It is therefore a hybrid procedure consisting of the sequential and fixed-sample-size procedures.

The general framework and the definition of the new criterion are given in the next section. Then the empirical distribution of the convergence band length is explored via some Monte Carlo experiments, and finally some concluding remarks are made.

2. Empirical Convergence

Let $X$ be a random variable with the distribution function $F(x)$ and $x_i$ be a random sample from $F(x)$ constituting the virtual observation in the $i$th trial of an MC experiment. A pooled sample consisting of all virtual observations from the 1st to the latest, $n$th, trial constitutes the MC sample $\{x_i : i = 1(1)n\}$, from which the MC sample mean

$$\bar{x}_n=\frac{1}{n}\sum_{i=1}^{n}x_i \tag{7}$$

and the mean squares of the distances from the MC sample mean, i.e. the MC sample variance,

$$s_n^2=\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x}_n)^2 \tag{8}$$

for each $n = 1, 2, \ldots$ are computed. So, the sequence of MC sample means and MC sample variances up to a large number $N$, i.e.

$$\{\bar{x}_n,\ s_n^2 : n = 1(1)N\} \tag{9}$$

is a realization of an MC process. An MC process is strictly stationary in the sense that the MC sample mean will converge in probability to the expected value $\mu = \langle X\rangle$, if it exists, i.e.

$$\Pr\left(|\bar{x}_n-\mu|<\varepsilon\right)\xrightarrow[n\to\infty]{}1 \tag{10}$$

For any given sufficiently large $n$, with the MC sample mean in (7) and the MC sample variance in (8), a $100(1-\alpha)\%$ CI can be constructed with a half width of

$$h(n)=z_{\alpha/2}\sqrt{\frac{s_n^2}{n}} \tag{11}$$

which will surely converge to zero as $n\to\infty$, but slowly after some large $n = n_0$ where $s_n^2$ gets sufficiently close to the true process variance $\sigma^2$. In the sequential stopping rule, for a given acceptable CI half width ($\varepsilon$), the MC experiment is ended where

$$h(n)=z_{\alpha/2}\sqrt{\frac{s_n^2}{n}}<\varepsilon \tag{12}$$

If the half width in (11) is likely to reach the acceptable level only after the MC sample variance has converged to the true variance of the process, then a stopping rule can be based directly on the precision level of this convergence. Let $\delta$ be the precision level of this convergence. Then a sequential interval always covering the MC sample variances in (8) can be constructed with the upper and lower limits, respectively,

$$U(s_n^2)=\begin{cases}U(s_{n-1}^2), & I_n=0\\ s_n^2+\delta, & I_n=1\end{cases};\qquad L(s_n^2)=\begin{cases}L(s_{n-1}^2), & I_n=0\\ s_n^2-\delta, & I_n=1\end{cases} \tag{13}$$

where

$$I_n=\begin{cases}0, & L(s_{n-1}^2)\le s_n^2\le U(s_{n-1}^2)\\ 1, & \{s_n^2<L(s_{n-1}^2)\}\vee\{s_n^2>U(s_{n-1}^2)\}\end{cases}$$

The number of adjacent MC sample variances which share the same upper and lower limits in Eq.(13), i.e. $I_n = 0$ for all of them, is a random variable $Z$ whose observed values can be defined as

$$z_n\leftarrow\begin{cases}z_{n-1}+1, & I_n=0\\ 0, & I_n=1\end{cases} \tag{14}$$

Eq.(13) and Eq.(14) constitute a sequence of shifting bands with fixed half width $\delta$ and variable length $z_n$, in which the MC sample variance is trapped.
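One step of the shifting band in Eq.(13) and Eq.(14) can be sketched as a small update function. The name `update_band`, the made-up sequence of variances and the choice of starting with an empty band (so that the first variance always opens a new band) are assumptions for illustration only.

```python
def update_band(s2, upper, lower, length, delta):
    """Apply Eqs. (13)-(14) to one new sample variance s2.

    Returns the (upper, lower, length) triple after the update.
    """
    if lower <= s2 <= upper:
        # I_n = 0: the variance stays in the band, which grows by one.
        return upper, lower, length + 1
    # I_n = 1: the band is re-centred on s2 and its length is reset.
    return s2 + delta, s2 - delta, 0


# Hypothetical sequence of MC sample variances, half width 0.01.
variances = [0.30, 0.26, 0.255, 0.251, 0.262, 0.24]
upper, lower, length = float("-inf"), float("inf"), 0  # empty starting band
lengths = []
for s2 in variances:
    upper, lower, length = update_band(s2, upper, lower, length, delta=0.01)
    lengths.append(length)
print(lengths)  # [0, 0, 1, 2, 3, 0]
```

The band opened at 0.26 traps the next three variances, and the drop to 0.24 shifts the band and restarts the count.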

If the sample space of the random variable $Z$ is $\mathcal{Z}=\{z_k = k : k = 0, 1, 2, \ldots\}$ and $k$ is an index variable whose values correspond to the points of $\mathcal{Z}$, then, replacing the convergence index $n$ by $k$, Eq.(1) and Eq.(10) can be restated for the MC sample variance and the true variance of the process as

$$\lim_{k\to\infty}\left(s_k^2\right)\overset{\text{w.p.1}}{\Longrightarrow}\sigma^2;\qquad \Pr\left(|s_k^2-\sigma^2|<\delta\right)\xrightarrow[k\to\infty]{}1 \tag{15}$$

because as $n\to\infty$, also $k\to\infty$. For a given $\delta$, the upper and lower limits in Eq.(13) converge to some constants, $L(s_k^2)\to\sigma^2-\delta$ and $U(s_k^2)\to\sigma^2+\delta$ as $k\to\infty$. Thus it can be asserted that an acceptable convergence band (ACB) with a half width $\delta^*$ and a length $z^*$ is achieved simultaneously with Eq.(12), i.e. $n = n_0 + z^* - 1 \Leftrightarrow z_n = z^*$, and a stopping rule can be proposed as

$$n(\delta^*, z^*)=\{n : z_n = z^*,\ n = 1, 2, \ldots\} \tag{16}$$

In this article, leaving the further asymptotic discussion of this proposition for future research, we will be content with an MC empirical verification of Eq.(15). The ACB half width $\delta^*$ determines the floating-point precision of the MC estimate, and can be set to $10^{-d}/2$, where $d$ is the desired number of significant digits after the decimal point. For a given $\delta^*$, the ACB length $z^*$ can be determined so as to be confident that the probability of the MC sample variance falling out of the ACB($\delta^*$, $z^*$) is practically null for $n > n_0 + z^*$. An optimal value for $z^*$ can be determined based on the information


3. Monte Carlo Experiments

The probability law in Eq.(16) can be approximated by some MC experiments. The MC experiments performed to obtain the empirical distribution in Eq.(16) showed that the random variable in question can be modelled well as a logarithmic series variate [3] whose density function is

$$\Pr(z)=c^{z}\{-z\ln(1-c)\}^{-1},\qquad z = 1, 2, \ldots;\ 0<c<1 \tag{18}$$

with the asymptotically unbiased estimator of the shape parameter $c$

$$\hat{c}=1-\frac{\hat{f}(z=1)}{\bar{z}} \tag{19}$$
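The logarithmic series probabilities in Eq.(18) and the estimator in Eq.(19) can be checked numerically. In the sketch below, the names `log_series_pmf` and `estimate_c` are hypothetical, and the relative frequencies are replaced by exact probabilities truncated far in the tail, an assumption made only so that the estimator can be verified against a known $c$.

```python
import math


def log_series_pmf(z, c):
    """Eq. (18): Pr(z) = c**z / (-z * ln(1 - c)) for z = 1, 2, ..."""
    return c ** z / (-z * math.log(1.0 - c))


def estimate_c(rel_freq):
    """Eq. (19): c_hat = 1 - f(z = 1) / mean of z.

    rel_freq maps each observed length z to its relative frequency.
    """
    mean = sum(z * f for z, f in rel_freq.items())
    return 1.0 - rel_freq[1] / mean


# Exact probabilities for c = 0.9, truncated at z = 2000.
exact = {z: log_series_pmf(z, 0.9) for z in range(1, 2001)}
print(sum(exact.values()))  # close to 1
print(estimate_c(exact))    # close to 0.9
```

The estimator works because, for this distribution, the ratio of Pr(z = 1) to the mean equals 1 - c.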

 

In the MC experiments, three stochastic processes were used to generate the MC sample points:

a discrete process of i.i.d. Bernoulli variates with a success probability of 0.5, denoted by B(0.5);

a continuous process of i.i.d. Uniform variates with mean 0.5 and variance 0.083333, denoted by U(0,1);

a stationary autoregressive process defined as $x_t = 0.3\,x_{t-1} + e_t$ and denoted by AR(0.3), where the steady-state mean and variance of the process are 0 and 1.098901, respectively, and $e_t$ are i.i.d. Normal variates with mean 0 and variance 1.
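The three processes listed above can be sketched as simple generators. The function names, the seed and the zero starting value of the autoregressive process are assumptions; the article does not say how its generators were implemented or initialized.

```python
import itertools
import random


def bernoulli(rng, p=0.5):
    """i.i.d. Bernoulli variates, B(0.5) by default."""
    while True:
        yield 1.0 if rng.random() < p else 0.0


def uniform01(rng):
    """i.i.d. Uniform variates on (0, 1)."""
    while True:
        yield rng.random()


def ar1(rng, phi=0.3, x0=0.0):
    """x_t = phi * x_{t-1} + e_t with e_t ~ N(0, 1); x0 is an assumed start."""
    x = x0
    while True:
        x = phi * x + rng.gauss(0.0, 1.0)
        yield x


# Steady-state variance of the AR process: 1 / (1 - phi**2).
print(round(1.0 / (1.0 - 0.3 ** 2), 6))  # 1.098901

rng = random.Random(2006)  # arbitrary seed
print(list(itertools.islice(ar1(rng), 3)))
```

The printed variance agrees with the value 1.098901 quoted for the autoregressive process, and 1/12 gives the 0.083333 quoted for U(0,1).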

The design points were determined by selecting, respectively, reasonable values for the levels of precision in convergence to the true value of the process variance, i.e. for the ACB widths in each of the three experiments with the processes given above, as 0.01, 0.001, and 0.001. Then, at each design point, 100 replications of a long MC simulation run were performed until $z = 51$ was observed in each independent run. The frequencies of $z = 1, 2, \ldots, 50$ were accumulated over all the replications and transformed to the relative frequencies $f(z = k)$; then an unbiased estimate of $c$ in (18) was obtained as

$$\hat{c}=1-\frac{f(z=1)}{\sum_{k=1}^{50}k\,f(z=k)} \tag{20}$$



For a randomly selected case, the relative frequencies, taken as the MC empirical probabilities, and the theoretical probabilities computed from Eq.(18) by inserting the MC estimates of $c$ are plotted together in Fig.1, to give an idea of how well the hypothesized density in Eq.(18) fits the empirical one.

Fig.1. The empirical and the theoretical probabilities of the CB length $z$ (horizontal axis: CB length, 1 to 49; vertical axis: empirical and theoretical probabilities, 0 to 0.4).

The global results of the MC experiments are summarized in Table 1, which indicates that a reasonable value for the shape parameter $c$ in Eq.(18) is about 0.9.

The figures under the columns $\bar{x}(\bar{n})$ and $s^2(\bar{n})$ are the grand averages of the MC sample means and variances estimated in each replicate by the Shifting Convergence Band Rule (SCBR) with ACB($\delta^*$, 50).


According to the experimental results, appropriate values for $z^*$ can be specified for a desired level of confidence $1-\alpha$, as defined in the previous section, from the relation

$$\Pr(z=z^*)\cong 0.9^{z^*}\{2.3\,z^*\}^{-1}\le\alpha \tag{21}$$
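One way to read Eq.(21) is to take the smallest band length whose logarithmic series probability does not exceed the chosen level. The sketch below follows that reading, which is an assumption on our part, as are the name `acb_length` and the example levels; it uses the exact constant $-\ln(1-0.9)=\ln 10\approx 2.3$.

```python
import math


def log_series_pmf(z, c=0.9):
    """Eq. (18) evaluated at the empirical shape parameter c = 0.9."""
    return c ** z / (-z * math.log(1.0 - c))


def acb_length(alpha, c=0.9):
    """Smallest z* with Pr(z = z*) <= alpha; the pmf decreases in z."""
    z = 1
    while log_series_pmf(z, c) > alpha:
        z += 1
    return z


print(acb_length(0.01))   # 13
print(acb_length(0.001))  # 27
```

Under the same reading, the length 50 used in the experiments corresponds to a probability below 0.0001.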

An algorithm of the criterion, to be used as an automated stopping rule for standard MC experiments, is presented in the appendix. The algorithm of the SCBR can be embedded in a typical Monte Carlo algorithm where the aim is to estimate the expected value of a process with a specified confidence level. After the convergence to the process variance is realized within a CB of half width $\delta^*$ and length $z^*$, it returns the MC sample size $n$, the MC estimate of the process mean and the MC estimate of the process variance. In the algorithm of SCBR, $U_n$ and $L_n$ are the upper and the lower limits of the CB, respectively, at the $n$th trial.
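The stopping rule described above can be sketched end to end as a single loop. This is a minimal sketch under stated assumptions rather than the author's implementation: the name `scbr`, the empty starting band, the skipping of the first trial (where the sample variance is undefined), the cap `n_max` and the seed are all choices made here for illustration.

```python
import random


def scbr(draw, delta, z_star, n_max=10_000_000):
    """Run an MC experiment until the sample variance stays inside a band of
    half width delta for z_star consecutive trials.

    Returns (n, mean estimate, variance estimate), the variance estimate
    being the centre of the final band.
    """
    n, total, total_sq = 0, 0.0, 0.0
    upper, lower, z = float("-inf"), float("inf"), 0  # empty starting band
    mean = float("nan")
    while z < z_star and n < n_max:
        x = draw()
        n += 1
        total += x
        total_sq += x * x
        if n < 2:  # the sample variance needs at least two points
            continue
        mean = total / n
        s2 = (total_sq - n * mean * mean) / (n - 1)
        if lower <= s2 <= upper:
            z += 1  # variance trapped: the band grows
        else:
            upper, lower, z = s2 + delta, s2 - delta, 0  # shift the band
    return n, mean, upper - delta


rng = random.Random(2006)  # arbitrary seed
n, mean_hat, var_hat = scbr(rng.random, delta=0.001, z_star=50)
print(n, round(mean_hat, 4), round(var_hat, 4))
```

With the returned variance estimate, the deterministic sample size of Eq.(3) can then be computed for the desired CI half width, which is the hybrid step the article describes.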

4. Concluding Remarks.

If a confidence interval for the steady-state mean of a stochastic process is to be constructed via the CLT, the MC estimate of the process variance is a prerequisite. It can be proposed that the convergence to the process variance be determined by adapting the ACIR to the case of the MC sample variance. But constructing sequential confidence intervals for the process variance is inefficient, since the distribution of the sample statistic is not symmetric, and determining the upper and lower points of the distribution with tails of equal probability on both sides therefore requires extra computational effort. The presently proposed stopping rule, however, bypasses this difficulty by using the empirical convergence concept. It searches for a convergence band of a given width and length such that the probability of the Monte Carlo sample variance falling outside this band is practically null, which is certainly more efficient computationally than constructing sequentially valid confidence intervals for the process variance. After the convergence to the process variance has been realized according to the new rule, a confidence interval in the usual statistical sense can be determined for the steady-state mean of the process.

Appendix : An algorithm for SCBR

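Initialize the band limits $U_0$ and $L_0$; set the ACB half width $\delta^*$ and length $z^*$; $S_0 \leftarrow 0$; $Q_0 \leftarrow 0$

Do Steps 1-6 until $z_n = z^*$;

Step 1: $n \leftarrow n + 1$; generate $x_n$

Step 2: $S_n = S_{n-1} + x_n$; $Q_n = Q_{n-1} + x_n^2$

Step 3: $\bar{x}_n = S_n / n$; $s_n^2 = (Q_n - n\bar{x}_n^2)/(n-1)$

Step 4: $I_n = 0$ if $L_{n-1} \le s_n^2 \le U_{n-1}$; $I_n = 1$ if $\{s_n^2 < L_{n-1}\} \vee \{s_n^2 > U_{n-1}\}$

Step 5: $U_n = U_{n-1}$ if $I_n = 0$, $U_n = s_n^2 + \delta^*$ if $I_n \neq 0$; $L_n = L_{n-1}$ if $I_n = 0$, $L_n = s_n^2 - \delta^*$ if $I_n \neq 0$

Step 6: $z_n \leftarrow z_{n-1} + 1$ if $I_n = 0$; $z_n \leftarrow 0$ if $I_n \neq 0$

Return $n^* \leftarrow n$; $\hat{\sigma}^2 \leftarrow U_n - \delta^*$; $\hat{\mu} \leftarrow \bar{x}_n$

References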

1. Y.S. Chow and H. Robbins, On the asymptotic theory of fixed-width confidence intervals for the mean, Annals of Mathematical Statistics, 36, 457-462 (1965).

2. A.M. Mood, F. Graybill and D.C. Boes, Introduction to the Theory of Statistics, 3 ed., McGraw-Hill, Tokyo (1974).

3. Merran Evans, Nicholas Hastings and Brian Peacock, Statistical Distributions, 2 ed., John Wiley and Sons, Inc., New York (1993).