By: Bradley Setzler

Re-posted from: http://juliaeconomics.com/2014/06/18/parallel-processing-in-julia-bootstrapping-the-mle/

* Scripts that reproduce the results of this tutorial in Julia are located here and here.

To review our extended example, we first simulated the OLS DGP, solved for the MLE of OLS, bootstrapped the MLE, and computed various non-parametric p-values.

The only part of our previous work that may have slowed down your computer was forming the 1,000 bootstrap samples (*samples*). In this post, we will see how to speed up the bootstrapping by using more than one processor on our computers.

**The Estimator that We Wish to Bootstrap**

We are interested in the following MLE estimator, adapted from our previous work. The only difference is that I will include more independent variables in by setting instead of , knowing that this will slow down our computers. I have to adjust *trueParams* and some other parts of the code to make room for the additional independent variables. First, we import the two needed packages and generate the data:

using Distributions using Optim N=1000 K=10 genX = MvNormal(eye(K)) X = rand(genX,N) X = X' constant = ones(N) X = [constant X] genEpsilon = Normal(0, 1) epsilon = rand(genEpsilon,N) trueParams = [-K/2:K/2]*.02 Y = X*trueParams + epsilon

Next, we define the likelihood of OLS:

function loglike(rho,x,y) beta = rho[1:K+1] sigma2 = exp(rho[K+2]) residual = y-x*beta dist = Normal(0, sigma2) contributions = logpdf(dist,residual) loglikelihood = sum(contributions) return -loglikelihood end

Finally, we define the bootstrap function, which includes within it a wrapper for the likelihood:

function bootstrapSamples(B) println("hi") M=convert(Int,floor(N/2)) samples = zeros(B,K+2) for b=1:B theIndex = randperm(N)[1:M] x = X[theIndex,:] y = Y[theIndex,:] function wrapLoglike(rho) return loglike(rho,x,y) end samples[b,:] = optimize(wrapLoglike,params0,method=:cg).minimum end samples[:,K+2] = exp(samples[:,K+2]) println("bye") return samples end

The reason for the cute *println* statements will become apparent later. Save all of this in a file called “bootstrapFunction.jl”.

**Parallel Bootstrapping**

We now have the function, *bootstrap Samples*, saved in the file “bootstrapFunction.jl”, that we wish to run in parallel. The idea is that, instead of running the MLE times on one processor, we run it times on each of four processors, then collect the results back to the first processor. Unfortunately, Julia makes it very difficult to parallelize functions within the same file that the functions are defined; the user must explicitly tell Julia every parameter and function that is needed on each processor, which is a frustrating process of copy-pasting the command

*@everywhere*all around your code.

Instead, we separate our work into two files, the first one you have already created. Create a second file in the same location on your computer as “bootstrapFunction.jl”. In the second file, write the following commands:

srand(2) addprocs(4) require("bootstrapFunction.jl") B=1000 b=250 samples_pmap = pmap(bootstrapSamples,[b,b,b,b]) samples = vcat(samples_pmap[1],samples_pmap[2],samples_pmap[3],samples_pmap[4])

The command *addprocs(4)* means to add four processors into your Julia session. The command *require* is the key to getting this right: it ensures that all of the functions and parameters contained in “bootstrapFunction.jl” are available to each processor. *pmap* is the function that actually applies *bootstrap Samples* to each value in the list

*[b,b,b,b]*. I set the list next to it to be

*[b,b,b,b]*, where

*b=250*, so that

*bootstrapEstimates*would collect 250 bootstrap samples on each processor. The resulting matrices from all four processors are stored in the length-4 vector called

*samples*

*_pmap*. Finally, we vertically stack (

*vcat*) the four matrices into one matrix,

*, which is exactly the matrix of bootstrap samples we have seen previously.*

*samples*Tip: For parallel processing in Julia, separate your code into two files, one file containing the functions and parameters that need to be run in parallel, and the other file managing the processing and collecting results. Use the

requirecommand in the managing file to import the functions and parameters to all processors.

**Results**

Once you have run the outer file, you can do speed tests to compare the two ways of getting 1,000 bootstrap samples. We use the *@elapsed* command to time each code. Here are the results:

julia> @elapsed pmap(bootstrapSamples,[b,b,b,b]) From worker 2: hi From worker 3: hi From worker 4: hi From worker 5: hi From worker 5: bye From worker 2: bye From worker 4: bye From worker 3: bye 8.687256164

Each of the processors tells us “hi” and “bye” when it begins and finishes its job, respectively. The total time was just under 9 seconds. By comparison, the following code will compute 1,000 bootstrap samples on only one processor, as in our previous work:

julia> @elapsed bootstrapSamples(B) hi bye 21.335191688

So, in the absence of parallel processing, the same result would require just over 21 seconds. Thus, parallel processing on a four-core personal computer allowed us to reduce the time required to bootstrap our estimator by a factor greater than 3.

Bradley J. Setzler