Author Archives: Metals, Magnets, and Miscellaneous Materials

Runge-Kutta Methods

Ordinary Differential Equation Solvers: Runge-Kutta Methods

Christina Lee

So what’s an Ordinary Differential Equation?

Differential Equation means we have some equation (or equations) that have derivatives in them.

The ordinary part differentiates them from partial differential equations, the ones with curly $\partial$ derivatives. Here, we only have one independent variable, let’s call it $t$, and one or more dependent variables, let’s call them $x_1, x_2, …$. In partial differential equations, we can have more than one independent variable.

This ODE can either be written as a system of the form

$\frac{d x_1}{dt}=f_1(t,x_1,x_2,...,x_n)$

$\frac{d x_2}{dt}=f_2(t,x_1,x_2,...,x_n)$

…

$\frac{d x_n}{dt}=f_n(t,x_1,x_n,...,x_n)$

or a single n’th order ODE of the form

$f_n(t,x) \frac{d^n x}{dt^n}+...+f_1(t,x) \frac{dx}{dt}+f_0(t,x)=0$

that can be rewritten in terms of a system of first order equations by performing variable substitutions such as

$\frac{d^i x}{dt^i}=x_i$

Though STEM students such as I have probably spent thousands of hours pouring of ways to analytically solve both ordinary and partial differential equations, unfortunately, the real world is rarely so kind as to provide us with an exactly solvable differential equation. At least for interesting problems.

We can sometimes approximate the real world as an exactly solvable situation, but for the situation we are interested in, we have to turn to numerics. I’m not saying those thousand different analytic methods are for nothing. We need an idea ahead of time of what the differential equation should be doing, to tell if it’s misbehaving or not. We can’t just blindly plug and chug.

Today will be about introducing four different methods based on Taylor expansion to a specific order, also known as Runge-Kutta Methods. We can improve these methods with adaptive stepsize control, but that is a topic for another time, just like the other modern types of solvers such as Richardson extrapolation and predictor-corrector.

Nonetheless, to work with ANY computational differential equation solver, you need to understand the fundamentals of routines like Euler and Runge-Kutta, their error propagation, and where they can go wrong. Otherwise, you might misinterpret the results of more advanced methods.

WARNING: If you need to solve a troublesome differential equation for a research problem, use a package, like DifferentialEquations. These packages have much better error handling and optimization.

Let’s first add our plotting package and colors.

Pkg.update()
Pkg.add("Gadfly")
Pkg.add("Colors")
Pkg.add("Lazy")
using Gadfly
# Solarized Colors that I like working with
# http://ethanschoonover.com/solarized
using Colors
base03=parse(Colorant,"#002b36");
base02=parse(Colorant,"#073642");
base01=parse(Colorant,"#586e75");
base00=parse(Colorant,"#657b83");
base0=parse(Colorant,"#839496");
base1=parse(Colorant,"#839496");
base2=parse(Colorant,"#eee8d5");
base3=parse(Colorant,"#fdf6e3");

yellow=parse(Colorant,"#b58900");
orange=parse(Colorant,"#cb4b16");
red=parse(Colorant,"#dc322f");
magenta=parse(Colorant,"#d33682");
violet=parse(Colorant,"#6c71c4");
blue=parse(Colorant,"#268bd2");
cyan=parse(Colorant,"#3aa198");
green=parse(Colorant,"#859900");

We will be benchmarking our solvers on one of the simplest and most common ODE’s,

$\frac{d}{d t}x=x \;\;\;\;\;\;\; x(t)=C e^t$

Though this only has one dependent variable, we want to structure our code so that we can accommodate a series of dependent variables, $y_1,y_2,…,y_n$, and their associated derivative functions. Therefore, we create a function for each dependent variable, and then push it into an array declared as type Function.

function f1(t::Float64,x::Array{Float64,1})
    return x[1]
end
f=Function[]
push!(f,f1)

Euler’s Method

Stepping

Approximating the slope each step.

First published in Euler’s Instutionum calculi integralis in 1768, this method gets a lot of milage, and if you are to understand anything, this method is it.

We march along with step size $h$, and at each new point, calculate the slope. The slope gives us our new direction to travel for the next $h$.

We can determine the error from the Taylor expansion of the function

$x_{n+1}=x_n+h f(x_n,t) + \mathcal{O}(h^2).$

In case you haven’t seen it before, the notation $\mathcal{O}(x)$ stands for “errors of the order x”.
Summing over the entire interval, we accumuluate error according to

$N\mathcal{O}(h^2)= \frac{x_f-x_0}{h}\mathcal{O}(h^2)=h$ ,

making this a first order method. Generally, if a technique is $n$th order in the Taylor expansion for one step, its $(n-1)$th order over the interval.

function Euler(f::Array{Function,1},t0::Float64,x::Array{Float64,1},h::Float64)
    d=length(f)
    xp=copy(x)
    for ii in 1:d
        xp[ii]+=h*f[ii](t0,x)
    end

    return t0+h,xp
end

Implicit Method or Backward Euler

If $f(t,x)$ has a form that is invertible, we can form a specific expression for the next step. For example, if we use our exponential,

$x_{n+1}=x_n+ h f(t_{n+1},x_{n+1})$

$x_{n+1}-h x_{n+1}=x_n$

$x_{n+1}=\frac{x_n}{1-h}$

This expression varies for each differential equation and only exists if the function is invertible.

function Implicit(f::Array{Function,1},t0::Float64,x::Array{Float64,1},h::Float64)
    return t0+h,x[1]/(1-h)
end

2nd Order Runge-Kutta

So in the Euler Method, we could just make more, tinier steps to achieve more precise results. Here, we make bettter steps. Each step itself takes more work than a step in the first order methods, but we win by having to perform fewer steps.

This time, we are going to work with the Taylor expansion up to second order,

$x_{n+1}=x_n+h f(t_n,x_n) + \frac{h^2}{2} f^{\prime}(t_n,x_n)+ \mathcal{O} (h^3).$

Define

$k_1=f(t_n,x_n),$

so that we can write down the derivative of our $f$ function, and the second derivative (curvature), of our solution,

$f^{\prime}(t_n,x_n)=\frac{f(t_n+h/2,x_n+h k_1/2)-k_1}{h/2}+\mathcal{O}(h^2).$

Plugging this expression back into our Taylor expansion, we get a new expression for $x_{n+1}$

$x_{n+1}=x_n+hf(t_n+h/2,x_n+h k_1/2)+\mathcal{O}(h^3)$

We can also interpret this technique as using the slope at the center of the interval, instead of the start.

function RK2(f::Array{Function,1},t0::Float64,x::Array{Float64,1},h::Float64)
    d=length(f)

    xp=copy(x)
    xk1=copy(x)

    for ii in 1:d
        xk1[ii]+=f[ii](t0,x)*h/2
    end
    for ii in 1:d
        xp[ii]+=f[ii](t0+h/2,xk1)*h
    end

    return t0+h,xp
end

4th Order Runge-Kutta

Wait! Where’s 3rd order? There exists a 3rd order method, but I only just heard about it while fact-checking for this post. RK4 is your dependable, multi-purpose workhorse, so we are going to skip right to it.

$k_1= f(t_n,x_n)$

$k_2= f(t_n+h/2,x_n+h k_1/2)$

$k_3 = f(t_n+h/2, x_n+h k_2/2)$

$k_4 = f(t_n+h,x_n+h k_3)$

$x_{n+1}=x_n+\frac{h}{6}\left(k_1+2 k_2+ 2k_3 +k_4 \right)$

I’m not going to prove here that the method is fourth order, but we will see numerically that it is.

Note: I premultiply the $h$ in my code to reduce the number of times I have to multiply $h$.

function RK4(f::Array{Function,1},t0::Float64,x::Array{Float64,1},h::Float64)
    d=length(f)

    hk1=zeros(x)
    hk2=zeros(x)
    hk3=zeros(x)
    hk4=zeros(x)

    for ii in 1:d
        hk1[ii]=h*f[ii](t0,x)
    end
    for ii in 1:d
        hk2[ii]=h*f[ii](t0+h/2,x+hk1/2)
    end
    for ii in 1:d
        hk3[ii]=h*f[ii](t0+h/2,x+hk2/2)
    end
    for ii in 1:d
        hk4[ii]=h*f[ii](t0+h,x+hk3)
    end

    return t0+h,x+(hk1+2*hk2+2*hk3+hk4)/6
end

This next function merely iterates over a certain number of steps for a given method.

function Solver(f::Array{Function,1},Method::Function,t0::Float64,
        x0::Array{Float64,1},h::Float64,N::Int64)
    d=length(f)
    ts=zeros(Float64,N+1)
    xs=zeros(Float64,d,N+1)

    ts[1]=t0
    xs[:,1]=x0

    for i in 2:(N+1)
        ts[i],xs[:,i]=Method(f,ts[i-1],xs[:,i-1],h)
    end

    return ts,xs
end

N=1000
xf=10
t0=0.
x0=[1.]
dt=(xf-t0)/N

tEU,xEU=Solver(f,Euler,t0,x0,dt,N);
tIm,xIm=Solver(f,Implicit,t0,x0,dt,N);
tRK2,xRK2=Solver(f,RK2,t0,x0,dt,N);
tRK4,xRK4=Solver(f,RK4,t0,x0,dt,N);

xi=tEU
yi=exp(xi);

errEU=reshape(xEU[1,:],N+1)-yi
errIm=reshape(xIm[1,:],N+1)-yi
errRK2=reshape(xRK2[1,:],N+1)-yi;
errRK4=reshape(xRK4[1,:],N+1)-yi;

plot(x=tEU,y=xEU[1,:],Geom.point,
Theme(highlight_width=0pt,default_color=green,
default_point_size=3pt))

lEU=layer(x=tEU,y=xEU[1,:],Geom.point,
Theme(highlight_width=0pt,default_color=green,
default_point_size=3pt))

lIm=layer(x=tIm,y=xIm[1,:],Geom.point,
Theme(highlight_width=0pt,default_color=yellow,
default_point_size=3pt))

lRK2=layer(x=tRK2,y=xRK2[1,:],Geom.point,
Theme(highlight_width=0pt,default_color=cyan,
default_point_size=2pt))

lRK4=layer(x=tRK4,y=xRK4[1,:],Geom.point,
Theme(highlight_width=0pt,default_color=violet,
default_point_size=4pt))

lp=layer(x->e^x,-.1,10,Geom.line,Theme(default_color=red))


plot(lp,lEU,lIm,lRK2,lRK4,
Guide.manual_color_key("Legend",["Euler","Implicit","RK2","RK4","Exact"],
[green,yellow,cyan,violet,red]),
Coord.cartesian(xmin=9.5,xmax=10.1))

lEU=layer(x=xi,y=errEU,Geom.point,
Theme(highlight_width=0pt,default_color=green,
default_point_size=1pt))

lIm=layer(x=xi,y=errIm,Geom.point,
Theme(highlight_width=0pt,default_color=yellow,
default_point_size=1pt))

lRK2=layer(x=xi,y=errRK2,Geom.point,
Theme(highlight_width=0pt,default_color=cyan,
default_point_size=1pt))

lRK4=layer(x=xi,y=errRK4,Geom.point,
Theme(highlight_width=0pt,default_color=violet,
default_point_size=1pt))

plot(lEU,lIm,lRK2,lRK4,Scale.y_asinh,
Guide.manual_color_key("Legend",["Euler","Implicit","RK2","RK4"],
[green,yellow,cyan,violet]))

Scaling of the Error

I talked above about the error scaling either as $h,h^2$, or $h^4$. I won’t just talk but here will numerically demonstrate the relationship as well. For a variety of different step sizes, the below code calculates the final error for each method. Then we will plot the error and see how it scales.

t0=0.
tf=1.
dx=tf-t0
x0=[1.]

dt=collect(.001:.0001:.01)

correctans=exp(tf)
errfEU=zeros(dt)
errfIm=zeros(dt)
errfRK2=zeros(dt)
errfRK4=zeros(dt)



for ii in 1:length(dt)
    N=round(Int,dx/dt[ii])
    dt[ii]=dx/N

    tEU,xEU=Solver(f,Euler,t0,x0,dt[ii],N);
    tIm,xIm=Solver(f,Implicit,t0,x0,dt[ii],N);
    tRK2,xRK2=Solver(f,RK2,t0,x0,dt[ii],N);
    tRK4,xRK4=Solver(f,RK4,t0,x0,dt[ii],N);

    errfEU[ii]=xEU[1,end]-correctans
    errfIm[ii]=xIm[1,end]-correctans
    errfRK2[ii]=xRK2[1,end]-correctans
    errfRK4[ii]=xRK4[1,end]-correctans
end

lEU=layer(x=dt,y=errfEU,Geom.point,
Theme(highlight_width=0pt,default_color=green,
default_point_size=1pt))

lIm=layer(x=dt,y=errfIm,Geom.point,
Theme(highlight_width=0pt,default_color=yellow,
default_point_size=1pt))

lRK2=layer(x=dt,y=errfRK2*10^5,Geom.point,
Theme(highlight_width=0pt,default_color=cyan,
default_point_size=1pt))

lRK4=layer(x=dt,y=errfRK4*10^10,Geom.point,
Theme(highlight_width=0pt,default_color=violet,
default_point_size=1pt))

tempEU(x)=(errfEU[end]*x/.01)
tempIm(x)=(errfIm[end]*x/.01)
#le=layer([tempEU,tempIm],0,.01,Geom.line,Theme(default_color=base01))
le=layer(tempEU,0,.01,Geom.line,Theme(default_color=base01))
lei=layer(tempIm,0,.01,Geom.line,Theme(default_color=base01))


temp2(x)=(errfRK2[end]*(x/.01)^2*10^5)
l2=layer(temp2,0,.01,Geom.line,Theme(default_color=base00))

temp4(x)=(errfRK4[end]*(x/.01)^4*10^10)
l4=layer(temp4,0,.01,Geom.line,Theme(default_color=base00))

xl=Guide.xlabel("h")
ylrk2=Guide.ylabel("Error e-5")
ylrk4=Guide.ylabel("Error e-10")
yl=Guide.ylabel("Error")

pEU=plot(lEU,lIm,le,lei,xl,yl,Guide.title("Euler and Implicit, linear error"))
p2=plot(lRK2,l2,xl,ylrk2,Guide.title("RK2, error h^2"))
p4=plot(lRK4,l4,xl,ylrk4,Guide.title("RK4, error h^4"))
vstack(hstack(p2,p4),pEU)

Arbitrary Order

While I have presented 4 concrete examples, many more exist. For any choice of variables $a_i, \beta_{i,j},a_i$ that fulfill

$x_{n+1}=x_n+h\left(\sum_{i=1}^s a_i k_i \right)+ \mathcal{O}(h^p)$

with

$k_i=f\left(t_n+\alpha_i h,x_n+h\left(\sum_{j=1}^s \beta_{ij} k_j \right) \right)$

we have a Runge-Kutta method of order $p$, where $p\geq s$. The Butcher tableau provides a set of consistent coefficients.

These routines are also present in the M4 folder in a module named diffeq.jl. For later work, you may simply import the module.

Stay tuned for when we tuned these routines to the stiff van der Pol equations!

Gadfly

I’m Moving to Gadfly

### Christina Lee

So far in this blog, all my plots have either used PyPlot, or the experimental GLVisualize. BUT, NO LONGER!

While writing up my research notes, I’ve been trying to improve my PyPlot graphs. I was hoping for them to look nicer; Julia’s PyPlot was saying no. We had a falling out over background colors and resolution. So, I have decided to move on to a plotting partner more willing to cope with my demands.

When I first started learning Julia, Gadfly seemed too unfamiliar, and control too tricky. I didn’t give it much further thought. After some digging, Gadfly has changed my mind, and I hope today I can change yours too. Right now, like Julia, Gadfly doesn’t have the maturity or documentation of PyPlot, but once it does, you better watch out.

Gadfly preferentially produces plots in SVG (scalable vector graphics) form, instead of the rasterized form of PNG or JPG favored by PyPlot. As displayed in the figure left, rasterized images are composed of colored squares, so don’t look the best when highly magnifified. SVG, which is based on XML, stores the position of vector elements and scales to high resolution without artifacts. Tools such as Inkscape also work in SVG, and can edit the individual pieces of a graph since they haven’t been rasterized together.

Let’s plot some graphs!

_{_{By Yug, modifications by 3247 (Unknown) [CC BY-SA 2.5 (http://creativecommons.org/licenses/by-sa/2.5)], via Wikimedia Commons}}
</div>

# Time to load our Packages
#Pkg.update()
#Pkg.add("Gadfly")
using Gadfly

#Pkg.add("Compose")
using Compose

# Some test data
x=collect(1:.4:10);
y1=sin(x);
y2=cos(x);

# A simple plot
plot(x=x,y=y1)

Unlike PyPlot, we can give plot functions as well as points. After receiving a function, a max value, and a min value, Gadfly figures everything else out for itself.

For the function, we can pass in either an inbuilt function, a user defined function, or an anonymous function.

The brackets [ ] allow us to group multiple functions together in the plot.

# plotting function

function testfunction(x::Real)
    return 1-x^2
end

plot([cos,testfunction,x->x^2],0,1)

But that’s just one set of points…
How do we plot more than one set of data? That’s where layers come in. Instead of using plot, we use layer, and set it as a variable. We can then plot those layers.

p1=layer(x=x,y=y1,Geom.point)
p2=layer(x=x,y=y2,Geom.point)
plot(p1,p2)

Different style of plotting: Geom

Now, what if we want to plot something other than lines?
Gadfly allows a wide variety of other ways to plot our data. We control this with the Geom (Geometry) object.

plot(x=x,y=y1,Geom.point,Geom.line)

plot(x=x,y=y1,ymin=y1-.1,ymax=y1+.1,Geom.point,Geom.errorbar)

plot(x=x,y=y1,ymin=y1-.1,ymax=y1+.1,Geom.line,Geom.ribbon)

plot(x=x,y=y1,Geom.bar)

plot(x=x,y=y1,Geom.step)

plot(x=y1,y=y2,Geom.path())

Edit Point and Line style

Disclaimer: I’ve learned how to do the next two things by reading the code, not the documentation.

It seems like they are newer and less developed features than everything else I’m discussing here. The syntax seems less polished than in other areas, so I believe it’s still under construction.

z=collect(0:.2:2)

xp=[z;z+.5;z+1;z+1.5;z+2;z+2.5;z+3;z+3.5]
yp=[z;z;z;z;z;z;z;z]
sh=[ones(z);2*ones(z);3*ones(z);4*ones(z);5*ones(z);6*ones(z);7*ones(z);8*ones(z)]

plot(x=xp,y=yp,shape=sh,Geom.point,Theme(default_point_size=5pt))

# or Compose.mm for smaller sizes
# These ratios and numbers changed around how ever you like

dash = 6 * Compose.cm
dot = 1.5 * Compose.cm
gap = 1 * Compose.cm

l1=layer(x=x,y=zeros(x),Geom.line,Theme(line_style=[dot]))
l2=layer(x=x,y=ones(x),Geom.line,Theme(line_style=[dash]))
l3=layer(x=x,y=2*ones(x),Geom.line,Theme(line_style=[gap]))
l4=layer(x=x,y=3*ones(x),Geom.line,Theme(line_style=[dot,dash,dot]))
l5=layer(x=x,y=4*ones(x),Geom.line,Theme(line_style=[gap,dash]))

plot(l1,l2,l3,l4,l5,Coord.Cartesian(ymin=-.5,ymax=4.5),Theme(grid_line_width=0pt))

Guide: Labeling Axes

Where Geom alters how we plot, Guide alters the labeling.
Guide ties in with Compose.jl through the Guide.annotate command, but that will take a tutorial in itself.

plot(x=x,y=y2,color=y2,
Guide.xlabel("xlabel"),Guide.ylabel("ylabel"),Guide.title("Something explanatory"),
Guide.colorkey("y2"))

Here’s something we can do with a combination of Guide and Scale. Using Guide, we can set where we specifically want our xticks to be, namely at multiples of $\pi$. But then, the labeling would write out some irrational number, making the plot look horrible. So, we create a function that takes in a number and outputs a string label for Scale to use.

function labelname(x::Real)
    n=round(Int,x/π) #nearest integer*pi
    if n==0
        return "0"
    elseif n==1
        return "π"
    end
    return("$n π")
end

xmarks=[0,π,2π,3π]
ymarks=[-1,0,1]

plot(x=x,y=y2,
Guide.yticks(ticks=ymarks),
Guide.xticks(ticks=xmarks),Scale.x_continuous(labels=labelname))

Some other cool things we can do with Scale:
* Automatically transform the axis according to log, log10,log2, asinh, or sqrt.
* Write numbers in :plain, :scientific, :engineering, or :auto

plot(x->exp(x),0,10,Scale.y_log,Scale.x_continuous(format=:scientific))

Themes

I mostly chose Gadfly because of the control I could have with the Theme command. http://dcjones.github.io/Gadfly.jl/themes.html has a much more exhaustive list than what I will be demonstrating with here.

# Solarized Colors that I like working with
# http://ethanschoonover.com/solarized
using Colors
base03=parse(Colorant,"#002b36");
base02=parse(Colorant,"#073642");
base01=parse(Colorant,"#586e75");
base00=parse(Colorant,"#657b83");
base0=parse(Colorant,"#839496");
base1=parse(Colorant,"#839496");
base2=parse(Colorant,"#eee8d5");
base3=parse(Colorant,"#fdf6e3");

yellow=parse(Colorant,"#b58900");
orange=parse(Colorant,"#cb4b16");
red=parse(Colorant,"#dc322f");
magenta=parse(Colorant,"#d33682");
violet=parse(Colorant,"#6c71c4");
blue=parse(Colorant,"#268bd2");
cyan=parse(Colorant,"#3aa198");
green=parse(Colorant,"#859900");

plot(x=x,y=y1,
    Theme(highlight_width=0pt, # lose the white ring around the points
    default_point_size=3pt,    # size of the dots
    default_color=magenta,     # color of the dots
    background_color=base03,   # ... background color ...
    grid_color=base2,     # the lines
    minor_label_color=base2,  # numbers on axes color
    key_label_color=base2, # color key numbers color
    key_title_color=cyan,  # label of color key color
    major_label_color=cyan, # axes label and title color
    major_label_font_size=20pt, # font size for axes label and title
    minor_label_font_size=15pt, #font size for numbers on axes
    panel_opacity=1 #setting background to opaque
);)

Coord: Setting the boundries

The documentation states that you can change the range of the axes using Scale, but I’ve found it better to use this format to set my min and max values.

plot(x=x,y=y1,Geom.line,Coord.Cartesian(ymin=0,ymax=1,xmin=0,xmax=10))

So far, we have covered seperately Guide, Themes, and partially Coord and Scale. Individually, each aspect doesn’t add too much clunkiness to the code. However, if we start to add everything together, then the plot function would look quite nasty.

Luckily, just like layers for data points, we can put our modifiers into variables. Then we can simply call the variables in our plot function.

This also helps for when we want to use one theme for a series of graphs. We can define the theme variable up top, and then include it in every graph there after, never having to declare it again. This helps me to keep my graphs uniform in style.

function labelname(x::Real)
    n=round(Int,x/π)
    if n==0
        return "0"
    elseif n==1
        return "π"
    else
        return("$n π")
    end
    return("$n π")
end

xticks=[0,π,2π,3π]
yticks=[-1,0,1]

data=layer(x=x,y=y1,ymin=y1-.1,ymax=y1+.1,Geom.line,Geom.ribbon)
f=layer(cos,0,3π)

yt=Guide.yticks(ticks=yticks)
xt=Guide.xticks(ticks=xticks)
xs=Scale.x_continuous(labels=labelname)

t= Theme(highlight_width=0pt,
    default_point_size=3pt,
    default_color=blue,
    background_color=base03,
    grid_color=base2,
    minor_label_color=base2,
    key_label_color=base2,
    key_title_color=cyan,
    major_label_color=cyan,
    major_label_font_size=20pt,
    minor_label_font_size=15pt,
    panel_opacity=1)

xl=Guide.xlabel("time")
yl=Guide.ylabel("y axis")
GT=Guide.title("Important title")

plot(data,f,yt,xt,xs,t,xl,yl,GT)

Although we still have to call quite a few variables, this is a much simpler way of doing things.

Saving the Figure

Julia naturally saves to SVG (or SVGJS).
We have to specify the x and y dimensions of the plot, but since these images rescale so easily, we can choose some reasonable numbers.

p=plot(data,f,yt,xt,xs,t,xl,yl,GT)
draw(SVG("myplot.svg", 15cm, 9cm), p)

If you want to save to PNG, PDF, or PS, the package Cairo.jl provides that ability.

using Cairo
draw(PNG("myplot.png",15cm,9cm),p)

# Happy Plotting!

“`julia

“`

Monte Carlo Ferromagnet

Post Prerequisites: Monte Carlo Calculation of pi, Monte Carlo Markov Chain

Prerequisites: Statistical Mechanics

The physics

What process turns liquid water into a solid? As temperature lowers, materials transition from a regime where the desire to maximize entropy dominates to desiring to minimize energy. In transitions like liquid-solid, the energy arises from atom-atom forces. This problem falls into a category called first order phase transitions, which are difficult to study.

Instead of looking at the liquid-solid problem to understand phase transitions, we will look at a magnetic material undergoing a second order phase transition. In our approximation of the problem, the atoms exist at fixed positions on a lattice, and interact with their neighbor according to the following Hamiltonian,
\begin{equation}
{\cal H} = -J \sum_{<i,j>} S^z_i S_j^z
\end{equation}
, which is basically just the energy. This Ising Model has nearest neighbors interacting, and each spin variablesolely points in the $\pm z$ direction.

At a given temperature $T$, inverse temperature $\beta=1/T$, $k_b=1$, the occupancy of a given configuration $c_i$ follows the Maxwell-Boltzmann Probability Distribution,
\begin{equation}
P(c_i)=\frac{\mathrm{e}^{-\beta E(c_i)}}{\sum\limits_j \mathrm{e}^{-\beta E(c_j)}}
\end{equation}
We need to determine observables given this probability distribution.

The Numerics

In the post on the Markov Chain, each state was a location on our grid. Now our state is one configuration of all our $N$ spins. That means for an Ising spin we have $2^N$ different configurations. Yeah… We aren’t going to be exploring all of those.

So how do we get from one state to the next?

First, we choose a spin that we could flip. This potential spin is chosen randomly for convenience’s sake.

Then we use the Metropolis-Hastings Algorithm to decide whether or no to flip the spin and generate a new configuration.

We split the solution into two cases based on $\alpha = \frac{\pi_i}{\pi_j}= \frac{P(c_i)}{P(c_j)}$:

If $\alpha > 1$ then always accept the transition, $p_{i \to j} = 1$
If $\alpha < 1$, accept the transition with probability $\alpha = p_{i \to j}$

Does this obey detailed balance?

From the Monte Carlo Markov Chain post, we take our equation for detailed balance

\begin{equation}
\frac{p_{i \to j}}{ p_{j \to i}} = \frac{\pi_i}{\pi_j}
\end{equation}

How about ergodicity? We can reach any configuration by flipping spins.

For the Ising Ferromagnet, our $\alpha$ is
\begin{equation}
\alpha= \frac{Z}{Z} \frac{\mathrm{e}^{-\beta E_i}}{\mathrm{e}^{-\beta E_j}}
= \mathrm{e}^{\beta \left(E_j – E_i \right)}
\end{equation}
which is simply a function of difference in energy between two states. Therefore we don’t need to know the absolute energy at each point in time, just how the spin flip changes its local environment.

Tunable Parameters

I will say a little bit about the tunable parameters here, but Monte Carlo simulations are as much an art as a science, so I leave it up to you to play with the numbers and build intuition for what works.

Temperature

On the left I am performing a low temperature search of a couch store. I found a minimum, and even though I got stuck there, I thoroughly got to know what that minimum had to offer. On the right though, I’m performing a high temperature search of the Alexander Hills, Death Valley, California in pursuit of Neoproterzoic carbonates and dimictites. I needed to be able to explore a lot more of the environment.

While in a physical system like a magnet, temperature has a physical meaning, we can create other types of situations, like optimization or approximating a probability distribution, where we use temperature to describe how stuck we are to a particular minimum. This intuition also plays a role in interpreting the results of of our simulation.

Temperature can also provide other complications when we are studying critical phenomena, phase transitions. Near the critical temperature, computations get much more difficult. Elaboration in later post.

Size of Lattice

Larger lattice gives more precise results.
Larger lattice takes more memory and time.
Finite Size effects, to be disscussed later, display some interesting physics.

Coupling Constant J

Usually normalized out anyway…

Number of time steps

Each Monte Carlo time step is one complete sweep of the lattice, $N$ random flip attempts. The more time steps, the more accurate the results, but the longer you have to wait.

When to Measure

Though a time step does include $N$ spin flips, that doesn’t mean we have actually moved to a truly new configuration. If we want to randomly sample our configuration space, we need to wait a bit longer to sample. Too short a time, and our measurements are coupled and not accurate. Too long, and we are wasting computations. In a later post, I will look at the autocorrelation time, which is a good way to figure our if your measurement time is good.

Other potential Options

Some less important things: periodic versus open boundary conditions, shape and symmetries of simulation cell.

using PyPlot;
push!(LOAD_PATH,".")
using Lattices;

Instead of going into calculating all the lattice parameters again, we will use a class I define in the file Lattices.jl . This class contains

Lattice Types

Chain
Square
Honeycomb
You can edit the file to make your own types.

Once a lattice is created, it contains Members of Type:

name, a string
l, length in number of unit cells
dim, dimension of lattice
a, array containing the basis vectors by which positions are generated
unit, array of positions inside a single unit
N, number of total sites
X, array of positions
nnei, number of nearest neighbors
neigh, Array of nearest neighbors [i][j], where i is site and j is 1:nnei

Today, I will just look at the square lattice, since that indicates much of the standard phase transition properties. Some of the lattices I have shown (kagome, triangular, …) are special frustrated lattices, and thus will behave very wierdly in this situation.

## Define l here
l=50;

lt=MakeLattice("Square",l);
S=ones(Int8,l,l);  #Our spins
dt=1/(lt.N);

# The energy contribution of just one site
function dE(i::Int)
    Eii=0;
    for j in 1:lt.nnei
        Eii+=S[lt.neigh[i,j]];
    end
    Eii*=-J*S[i];  # we are computing J sz_i sz_j for one i
    return Eii;
end
# The energy of the entire lattice
function E()
    Evar=0;
    for k in 1:lt.N
        Evar+=.5*dE(k);
    end
    return Evar;
end
# The magnetization of the entire lattice
function M()
    Mvar=0;
    for k in 1:lt.N
        Mvar+=S[k];
    end
    return Mvar;
end
"defined functions"

Adjustable Parameters

I have set up the simulation so that you can perform two different things. For one, you can set video=true and t to a small variable. Then in a new window you see what the configuration looks like each time you measure.

Or you can set video=false and t to a large variable, and actually measure the statistics for the system over a bunch of configurations.

beta=.7;
J=1;
t=100000;
video=false;
nskip=10;   # don't measure every sweep= better decorrelation
"Parameters set"

nmeas=Int64(t/nskip); # how many times we will measure
Ma=Array{Int32}(nmeas); # our magnetization measurements
Ea=Array{Int32}(nmeas); # our energy measurements
"done"

tm=1; #Our measurement time step
pygui(true)
for ti in 1:t
    for j in 1:lt.N
        i = rand(1:lt.N); #Choosing a random site
        de=dE(i);
        if(de>0 || rand()<exp(2*beta*de) )
            S[i]=-S[i]; #Switch the sign
        end
    end
    if isapprox(mod(ti,nskip),0)
        Ma[tm]=M();
        Ea[tm]=E();
        tm+=1;

        if video==true
            pygui(true)
            pcolor(S,cmap="winter")
            magnow=round(M()/lt.N,3)
            title("t: $ti  M: $magnow ")
            draw()
            sleep(.5)
            #nm=dec(ti,4)
            #savefig("Magnetimages2/pic_l50_bp7_$nm.png")
        end
    end
end
Mave=mean(Ma/lt.N);
Mstd=std(Ma/lt.N);
Eave=mean(Ea/lt.N);
Estd=std(Ea/lt.N);
Mave, Mstd

pygui(false)
title("Magnetization versus Time, Beta=$beta")
xlabel("Monte Carlo Steps")
ylabel("Average Magnetization")
plot(collect(1:10:5000),Ma[1:500]/lt.N)
annotate(" White Noise = Well Mixing",
xy=[500, Mave ],
xytext=[1000, Mave-.0125],
    xycoords="data",
    arrowprops=Dict("facecolor"=>"green"))

pygui(false)
plt[:hist](Ma/lt.N,bins=30,normed=true,label="Samples Normed");
x=collect(Mave-Mstd*3:.001:Mave+Mstd*3)
gaussian=1/(Mstd*sqrt(2*pi))*exp(-(x-Mave).^2/(2*Mstd^2));
plot(x,gaussian,linewidth=5,label="Gaussian Fit")
title("Magnetization Sample Distribution beta=$beta")
xlabel("Mangetization")
ylabel("Normed Counts")
legend(loc="upper right")
#savefig("Ferromagnet/maghist_bp3.png")

pygui(false)
title("Energy versus Time, Beta=$beta")
xlabel("Monte Carlo Steps")
ylabel("Average Energy")
plot(collect(1:10:5000),Ea[1:500]/lt.N)
annotate(" White Noise = Well Mixing",
    xy=[500, Eave ],
xytext=[1000, Eave-.03],
   xycoords="data",
   arrowprops=Dict("facecolor"=>"green"))

pygui(false)
plt[:hist](Ea/lt.N,bins=30,normed=true,label="Samples Normed");
x=collect(Eave-3*Estd:.001:Eave+3*Estd)
gaussian=1/(Estd*sqrt(2*pi))*exp(-(x-Eave).^2/(2*Estd^2));
plot(x,gaussian,linewidth=5,label="Gaussian Fit")
title("Energy Histogram, beta=$beta")
xlabel("Energy")
ylabel("Normed Counts")
#savefig("Ferromagnet/Ehist_bp3.png")

Example Results

So here are some example results I got.

Paramagnet

States of a Paramagnet at $\beta=0.2$

Samples over Time

Paramagnet Magnetization Samples. A properly tuned simulations should be showing white noise without correlations or getting stuck in particular areas.

Magnetization Histogram

Energy Histogram

Magnet

States of a Paramagnet at $\beta=0.7$

Magnet Histogram

Histogram of a magnetization at $\beta=0.7$