Tag Archives: ODE

Implicit ODE Solvers Are Not Universally More Robust than Explicit ODE Solvers, Or Why No ODE Solver is Best

Re-posted from: https://www.stochasticlifestyle.com/implicit-ode-solvers-are-not-universally-more-robust-than-explicit-ode-solvers-or-why-no-ode-solver-is-best/

A very common adage in ODE solvers is that if you run into trouble with an explicit method, usually some explicit Runge-Kutta method like RK4, then you should try an implicit method. Implicit methods, because they are doing more work, solving an implicit system via a Newton method having “better” stability, should be the thing you go to on the “hard” problems.

This is at least what I heard at first, and then I learned about edge cases. Specifically, you hear people say “but for hyperbolic PDEs you need to use explicit methods”. You might even intuit from this “PDEs can have special properties, so sometimes special things can happen with PDEs… but ODEs, that should use implicit methods if you need more robustness”. This turns out to not be true, and really understanding the ODEs will help us understand better why there are some PDE semidiscretizations that have this “special cutout”.

What I want to do in this blog post is more clearly define what “better stability” actually means, and show that it has certain consequences that can sometimes make explicit ODE solvers actually more robust on some problems. And not just some made-up problems, lots of real problems that show up in the real world.

A Quick Primer on Linear ODEs

First, let’s go through the logic of why implicit ODE solvers are considered to be more robust, which we want to define in some semi-rigorous way as “having a better chance to give an answer closer to the real answer”. In order to go from semi-rigorous into a rigorous definition, we can choose a test function, and what better test function to use than a linear ODE. So let’s define a linear ODE:

$$u’ = \lambda u$$

is the simplest ODE. We can even solve it analytically, $u(t) = \exp(\lambda t)u(0)$. For completeness, we can generalize this to a linear system of ODEs, where instead of having a scalar $u$ we can let $u$ be a vector, in which case the linear ODE has a matrix of parameters $A$, i.e.

$$u’ = Au$$

In this case, if $A$ is diagonalizable, $A = P^{-1}DP$, then we can replace $A$:

$$u’ = P^{-1}DP u$$

$$Pu’ = DPu$$

or if we let $w = Pu$, then

$$w’ = Dw$$

where $D$ is a diagonal matrix. This means that for every element of $w$ we have the equation:

$$w_i’ = \lambda_i w_i$$

where $w_i$ is the vector in the direction of the $i$th eigenvector of $A$, and $\lambda_i$ is the $i$th eigenvalue of $A$. Thus our simple linear ODE $u’ = \lambda u$ tells us about general linear systems along the eigenvectors. Importantly, since even for real $A$ we can have $\lambda$ be a complex number, i.e. real-valued matrices can have complex eigenvalues, it’s important to allow for $\lambda$ to be complex to understand all possible systems.

But why is this important for any other ODE? Well by the Hartman-Grobman theorem, for any sufficiently nice ODE:

$$u’ = f(u)$$

We can locally approximate the ODE by:

$$u’ = Au$$

where $A = f'(u)$, i.e. $A$ is the linear system defined by the Jacobian local to the point. This is effectively saying any “sufficiently nice” system (i.e. if $f$ isn’t some crazy absurd function and has properties like being differentiable), you can understand how things locally move by looking at the system approximated by a linear system, where the right linear approximation is given by the Jacobian. And we know that linear systems then boil down generally to just the scalar linear system, and so understanding the behavior of a solver on the scalar linear system tells us a lot about how it will do “for small enough h”.

Okay, there are lots of unanswered questions, such as what if $A$ is not diagonalizable? What if $f$ is not differentiable? What if the system is very nonlinear so the Jacobian changes very rapidly? But under assumptions that things are nice enough, we can say that if a solver does well on $u’ = \lambda u$ then it is probably some idea of good.

Why implicit ODE solvers are “better”, i.e. more robust

So now we have a metric by which we can analyze ODEs: if they have good behavior on $u’ = \lambda u$, then they are likely to be good in general. So what does it mean to have good behavior on $u’ = \lambda u$? One nice property would be to at least be asymptotically correct for the most basic statement, i.e. does it go to zero when it should? If you have $u’ = \lambda u$ and $\lambda$ is negative, then the analytical solution $u(t) = \exp(\lambda t)u(0)$ goes to zero as $t$ goes to infinity. So a good question to ask is, for a given numerical method, for what values of $h$ (the time step size) does the numerical method give a solution that goes to zero, and for which $h$ does it get an infinitely incorrect answer?

To understand this, we just take a numerical method and plug in the test equation. So the first thing to look at is Euler’s method. For Euler’s method, we step forward by $h$ by assuming the derivative is constant along the interval, or:

$$u_{n+1} = u_n + hf(u_n)$$

When does this method give a solution that is asymptotically consistent? With a little bit of algebra:

$$u_{n+1} = u_n + h\lambda u_n$$

$$u_{n+1} = (1 + h\lambda) u_n$$

Let $z = h\lambda$, which means

$$u_{n+1} = (1 + z) u_n$$

This is a discrete dynamical system which has the analytical solution:

$$u_n = u_0 (1+z)^{n}$$

Note that if $1 + z > 1$, then $(1+z)^n$ keeps growing as $n$ increases, so this goes to infinity, while if $1 + z < 1$ it goes to zero. But since $\lambda$ can actually be a complex number, the analysis is a little bit more complex (pun intended), but it effectively means that if $z$ is in the unit circle shifted to the left in the complex plane by 1, then $u_n \rightarrow 0$. This gives us the definition of the stability region, $G(z)$ is the region for which $u_n \rightarrow 0$, and this is the shifted unit circle in the complex plane for explicit Euler.

This shows a pretty bad property for this method. For any given $\lambda$ with negative real part, there is a maximum $h$, actually $h = 1/\lambda$, such that for any larger step size we don’t just get a bad answer, we can get an infinitely bad answer, i.e. the analytical solution goes to zero but the numerical solution goes to infinity!

So, is there a method that doesn’t have this bad property? In comes the implicit methods. If you run the same analysis with implicit Euler,

$$u_{n+1} = u_n + hf(u_{n+1})$$

$$u_{n+1} = u_n + h\lambda u_{n+1}$$

$$(1-z) u_{n+1} = u_n$$

$$u_{n+1} = \frac{1}{1-z} u_n$$

Then we have almost an “inverse” answer, i.e. $G(z)$ is everything except the unit circle in the complex plane shifted to the right. This means that for any $\lambda$ with negative real part, for any $h$ the implicit Euler method has $u_n \rightarrow 0$, therefore it’s never infinitely wrong.

Therefore it’s just better, QED.

This then generalizes to more advanced methods. For example, the stability region of RK4

an explicit method has a maximum $h$, while the stability region of BDF2

an implicit method does not. You can even prove it’s impossible for any explicit method to have this “good” property, so “implicit methods are better”. QED times 2, done deal.

Wait a second, what about that other “wrongness”?

Any attentive student should immediately throw their hand up. “Teacher, given the $G(z)$ you said, you also have that for any $\lambda$ where $\text{Re}(\lambda)>1$, you also have that $u_n \rightarrow 0$, but in reality the analytical solution has $u(t) \rightarrow \infty$, so implicit Euler is infinitely wrong! And explicit Euler has the correct asymptotic behavior since it goes to infinity!”

That is completely correct! But it can be easy to brush this off with “practical concerns”. If you have a real model which has positive real eigenvalues like that, then it’s just going to explode to infinity. Those kinds of models aren’t really realistic? Energy goes to infinity, angular momentum goes to infinity, the chemical concentration goes to infinity: whatever you’re modeling just goes crazy! If you’re in this scenario, then your model is probably wrong. Or if the model isn’t wrong, the numerical methods aren’t very good anyways. If you analyze the error propagation properties, you’ll see the error of the numerical method also increases exponentially! So this is a case you shouldn’t be modeling anyways.

Seeing this robustness in practice

Therefore if you need a more accurate result, use an implicit method. And you don’t need to go to very difficult models to see this manifest in practice. Take the linear ODE:

$$T’ = 5(300-T)$$

with $T(0) = 320$. This is a simple model of cooling an object with a constant temperature influx. It’s easy to analytically solve, you just have an exponential fall in the temperature towards $T = 300$ the steady state. But when we solve it with an explicit method at default tolerances, that’s not what we see:

using OrdinaryDiffEq
function cooling(du,u,p,t)
    du[1] = 5.0*(300-u[1])
end
u0 = [310.0]
tspan = (0.0,10.0)
prob = ODEProblem(cooling, u0, tspan)
sol = solve(prob, Tsit5())
 
using Plots
plot(sol, title="RK Method, Cooling Problem")
savefig("rk_cooling.png")

We see that the explicit method gives oscillations in the solution! Meanwhile, if we take a “robust” implicit method like the BDF method from the classic C++ library SUNDIALS, we can solve this:

using Sundials
sol = solve(prob, CVODE_BDF())
plot(sol, title="BDF Method, Cooling Problem")
savefig("bdf_cooling.png")

Sure it’s not perfectly accurate, but at least it doesn’t give extremely wrong behavior. We can decrease tolerances to make this all go away,

But the main point is that the explicit method is just generally “less robust”, you have to be more careful, it can give things that are just qualitatively wrong.

This means that “good tools”, tools that have a reputation for robustness, they should default to just using implicit solvers because that’s going to be better. And you see that in tools like Modelica. For example, the Modelica University’s playground and other tools in the space like OpenModelica and Dymola, default to implicit solvers like DASSL. And you can see they do great on this problem by default!

Modelica tools gives a good answer out of the box.

So QED, that’s the “right thing to do”: if you want to be robust, stick to implicit methods.

But why oscillations?

Hold up a bit… why does the explicit method give oscillations? While we know that’s wrong, it would be good to understand why it gives the qualitatively wrong behavior that it does. It turns out that this falls right out of the definition of the method. If you go back to the definition of explicit Euler on the test problem, i.e.

$$u_{n+1} = u_n + hf(u_n)$$

then substitute in:

$$u_{n+1} = (1 + h\lambda) u_{n}$$

If we think about our stability criteria $G(z)$ another way, its boundaries are exactly the value by which the next $u_{n+1}$ would have a negative real part. So the analytical solution is supposed to go to zero, but the “bad” behavior is when we choose a step size $h$ such that if we extrapolate out with a straight line for $h$ long in time, then we will “jump” over this zero, something that doesn’t happen in the analytical solution. But now let’s think about what happens in that case. If you jump over zero, then $u_n < 0$ (think real right now), so therefore the derivative of the next update points in the other direction, i.e. we're still going towards zero, but now from the negative side we go up to zero. But since $\|1 + h\lambda\| > 1$, we have that $\|u_{n+1}\| > \|u_n\|$, i.e. the norm of the solution keeps growing. So you jump from positive to negative, then negative to positive, then positive to negative, where the jumps are growing each time. This is the phantom oscillations of the explicit ODE solver!

So what’s happening is the default tolerances of the explicit ODE solver were large enough that the chosen $h$s were in the range of the phantom oscillation behavior, and so you just need to cap $h$ below that value, which is dependent on the real part of the eigenvalue of $h$ (you can do the same analysis with complex numbers, but that just gives rotations in the complex plane along with the real part oscillation).

But if explicit methods give oscillations, what’s going on with implicit ODE solvers with large $h$? Let’s look at the update equation again:

$$u_{n+1} = \frac{1}{1-z} u_n$$

now instead of multiplying each time by $(1-z)$, we divide by it. This means that when $\lambda < 0$ (or $\text{Re}(\lambda) < 0$ to be more exact), then for any $h$ we have that $\|u_{n+1}\| < \|u_{n}\|$. Therefore, we might jump over the zero with a big enough $h$, but we are guaranteed that our "jump size" is always shrinking. Thus for any $h$, we will get to zero because we're always shrinking in absolute value. This means that implicit methods are working because they have a natural dampening effect. So:

Explicit methods introduce spurious oscillations, but implicit methods naturally damp oscillations

This explains in more detail why we saw what we saw: the explicit method when the error tolerance is sufficiently high will introduce oscillations that don’t exist, while the implicit method will not have this behavior. This is a more refined version of the “energy doesn’t go to infinity!”, now it’s “energy doesn’t come from nowhere in real systems”, and because of this implicit solvers give a better qualitative answer. This is why they are more robust, which is why robust software for real engineers just always default to them.

Wait a second… do we always want that?

You should now be the student in the front row raising your hand, “implicit methods are always dampening… is that actually a good idea? Are you sure that’s always correct?” And the answer is… well it’s not. And that then gives us exactly the failure case for which implicit methods are less robust. If you have a system that is supposed to actually oscillate, then this “hey let’s always dampen everything to make solving more robust” actually leads to very wrong answers!

To highlight this, let’s just take a simple oscillator. You can think of this as a harmonic oscillator, or you can think about it as a simple model of a planet going around a star. However you want to envision it, you can write it out as a system of ODEs:

$$u_1′ = 500u_2$$
$$u_2′ = -500u_1$$

This is the linear ODE $u’ = Au$ where $A = [0\ 500; -500\ 0]$, which has complex eigenvalues with zero real part. In other words, the analytical solution is $\sin(500t)$ and $\cos(500t)$, just a pure oscillation that just keeps going around and around in circles. If we solve this with an explicit ODE solver:

function f(du,u,p,t)
    du[1] = 500u[2]
    du[2] = -500u[1]
end
u0 = [1.0,1.0]
tspan = (0.0,1.0)
prob = ODEProblem(f, u0, tspan)
sol = solve(prob, Tsit5())
 
plot(sol, title="RK Method", idxs=(1,2))
savefig("rk_oscillate.png")

we can see that it generally gets the right answer. Over time you get some drift where the energy is slowly increasing due to numerical error in each step, but it’s going around in circles relatively well. However, our “robust implicit method”…

sol = solve(prob, CVODE_BDF())
plot(sol, title="BDF Method", idxs=(1,2))
savefig("bdf_oscillate.png")

is just not even close. And you can see that even our “robust Modelica tools” completely fall apart:

It says the answer goes to zero! Even when the analytical solution is just a circle! But we can understand why this is the case: the software developers made the implicit assumption that “dampening oscillations is always good, because generally that’s what happens in models, so let’s always do this by default so people get better answers”, and the result of this choice is that if someone puts in a model of the Earth going around the sun, then oops the Earth hits the sun pretty quickly.

Conclusion: ODE solvers make trade-offs, you need to make the right ones for your domain

This gives us the conclusion: there is no “better” or “more robust” ODE solver method, it’s domain-specific. This is why the Julia ODE solver package has hundreds of methods, because each domain can be asking for different properties that they want out of the method. Explicit methods are not generally faster, they are also something that tends to preserve (or generate) oscillations. Implicit methods are not generally more robust, they are methods which work by dampening transients, which is a good idea for some models but not for others. But then there’s a ton of nuance. For example, can you construct an explicit ODE solver so that on such oscillations you don’t get energy growth? You can! Anas5(w) is documented as “4th order Runge-Kutta method designed for periodic problems. Requires a periodicity estimate w which when accurate the method becomes 5th order (and is otherwise 4th order with less error for better estimates)”, i.e. if you give it a canonical frequency 500 it will be able to do extremely well on this problem (and being a bit off in that estimate still works, it just has energy growth that is small).

What about what was mentioned at the beginning of the article, “for hyperbolic PDEs you need to use explicit methods”? This isn’t a “special behavior” of PDEs, this is simply because for this domain, for example advective models of fluids, you want to conserve fluid as it moves. If you choose an implicit method, it “dampens” the solver, which means you get that as you integrate you get less and less fluid, breaking the conservation laws and giving qualitatively very incorrect solutions. If you use explicit methods, you don’t have this extraneous dampening, and this gives a better looking solution. But you can go even further and develop methods for which, if $h$ is sufficiently small, then you get little to no dampening. These are SSP methods, which we say are “for Hyperbolic PDEs (Conservation Laws)” but in reality what we mean is “when you don’t want things to dampen”.

But the point is, you can’t just say “if you want a better solution, use an implicit solver”. Maybe in some domains and for some problems that is true, but in other domains and problems that’s not true. And many numerical issues can stem from the implicit assumptions that follow from the choice being made for the integrator. Given all of this, it should be no surprise that much of the Modelica community has had many problems handling fluid models, the general flow of “everything is a DAE” → “always use an implicit solver” → “fluid models always dampen” → “we need to fix the dampening” could be fixed by making different assumptions at the solver level.

So, the next time someone tells you should just use ode15s or scipy.integrate.radau in order to make things robust without knowing anything about your problem, say “umm actually”.

Little Extra Details

The article is concluded. But here’s a few points I couldn’t fit into the narrative I want to mention:

Trapezoidal is cool

One piece I didn’t fit in here is that the Trapezoidal method is cool. The dampening property comes from L-stability, i.e. $G(z) \rightarrow 0$ as $\text{Re}(z) \rightarrow -\infty$. This is a stricter form of stability, since instead of just being stable for any finite $\lambda$, this also enforces that you are stable at the limit of bigger lambdas. “Most” implicit solvers that are used in practice, like Implicit Euler, have this property, and you can show the dampening is directly related to this property. But you can have an implicit method that isn’t L-stable. Some of these methods include Adams-Bashforth-Moulton methods, which are not even A-stable so they tend to have stability properties and act more like explicit methods. But the Trapezoidal method is A-stable without being L-stable, so it doesn’t tend to dampen while it tends to be also pretty stable. Though it’s not as stable, and the difference between “is stable for any linear ODE” versus “actually stable for nonlinear ODEs” (i.e. B-stability) is pronounced on real-world stiff problems. What this means in human terms is that the Trapezoidal method tends to not be stable enough for hard stiff problems, but it also doesn’t artificially dampen, so it can be a good default in cases where you know you have “some stiffness” but also want to keep some oscillations. One particular case of this is in some electrical circuit models with natural oscillators.

Lower order methods have purposes too

“All ODE solvers have a purpose”, I give some talks that give the justification for many high order methods, so in general “higher order is good if you solve with stricter tolerances and need more precision”. But lower order methods can be better because the higher order methods require that more derivatives of $f$ are defined, and if that’s not the case (like derivative discontinuities), then lower order methods will be more efficient. So even implicit Euler has cases where it’s better than higher order BDF methods, and it has to do with “how nice” $f$ is.

BDF methods like DASSL are actually α-stable

I said that generally implicit methods that you use are A-stable. That’s also a small lie to make the narrative simpler. The BDF methods which Sundials, DASSL, LSODE, FBDF, etc. use are actually α-stable, which means that they are actually missing some angle α of the complex plane for stability. The stability regions look like this:

So these BDF methods are actually pretty bad for other reasons on very oscillatory problems! Meanwhile, things like Rosenbrock methods can also solve DAEs while actually being L-stable, which can make them more stable in many situations where there’s oscillations towards a steady state. So there’s a trade-off there… again every method has a purpose. But this is another “ode15s is more stable than ode23s”… “well actually…”

The post Implicit ODE Solvers Are Not Universally More Robust than Explicit ODE Solvers, Or Why No ODE Solver is Best appeared first on Stochastic Lifestyle.

Runge-Kutta Methods

Ordinary Differential Equation Solvers: Runge-Kutta Methods

Christina Lee

So what’s an Ordinary Differential Equation?

Differential Equation means we have some equation (or equations) that have derivatives in them.

The ordinary part differentiates them from partial differential equations, the ones with curly $\partial$ derivatives. Here, we only have one independent variable, let’s call it $t$, and one or more dependent variables, let’s call them $x_1, x_2, …$. In partial differential equations, we can have more than one independent variable.

This ODE can either be written as a system of the form

$\frac{d x_1}{dt}=f_1(t,x_1,x_2,...,x_n)$

$\frac{d x_2}{dt}=f_2(t,x_1,x_2,...,x_n)$

…

$\frac{d x_n}{dt}=f_n(t,x_1,x_n,...,x_n)$

or a single n’th order ODE of the form

$f_n(t,x) \frac{d^n x}{dt^n}+...+f_1(t,x) \frac{dx}{dt}+f_0(t,x)=0$

that can be rewritten in terms of a system of first order equations by performing variable substitutions such as

$\frac{d^i x}{dt^i}=x_i$

Though STEM students such as I have probably spent thousands of hours pouring of ways to analytically solve both ordinary and partial differential equations, unfortunately, the real world is rarely so kind as to provide us with an exactly solvable differential equation. At least for interesting problems.

We can sometimes approximate the real world as an exactly solvable situation, but for the situation we are interested in, we have to turn to numerics. I’m not saying those thousand different analytic methods are for nothing. We need an idea ahead of time of what the differential equation should be doing, to tell if it’s misbehaving or not. We can’t just blindly plug and chug.

Today will be about introducing four different methods based on Taylor expansion to a specific order, also known as Runge-Kutta Methods. We can improve these methods with adaptive stepsize control, but that is a topic for another time, just like the other modern types of solvers such as Richardson extrapolation and predictor-corrector.

Nonetheless, to work with ANY computational differential equation solver, you need to understand the fundamentals of routines like Euler and Runge-Kutta, their error propagation, and where they can go wrong. Otherwise, you might misinterpret the results of more advanced methods.

WARNING: If you need to solve a troublesome differential equation for a research problem, use a package, like DifferentialEquations. These packages have much better error handling and optimization.

Let’s first add our plotting package and colors.

Pkg.update()
Pkg.add("Gadfly")
Pkg.add("Colors")
Pkg.add("Lazy")
using Gadfly
# Solarized Colors that I like working with
# http://ethanschoonover.com/solarized
using Colors
base03=parse(Colorant,"#002b36");
base02=parse(Colorant,"#073642");
base01=parse(Colorant,"#586e75");
base00=parse(Colorant,"#657b83");
base0=parse(Colorant,"#839496");
base1=parse(Colorant,"#839496");
base2=parse(Colorant,"#eee8d5");
base3=parse(Colorant,"#fdf6e3");

yellow=parse(Colorant,"#b58900");
orange=parse(Colorant,"#cb4b16");
red=parse(Colorant,"#dc322f");
magenta=parse(Colorant,"#d33682");
violet=parse(Colorant,"#6c71c4");
blue=parse(Colorant,"#268bd2");
cyan=parse(Colorant,"#3aa198");
green=parse(Colorant,"#859900");

We will be benchmarking our solvers on one of the simplest and most common ODE’s,

$\frac{d}{d t}x=x \;\;\;\;\;\;\; x(t)=C e^t$

Though this only has one dependent variable, we want to structure our code so that we can accommodate a series of dependent variables, $y_1,y_2,…,y_n$, and their associated derivative functions. Therefore, we create a function for each dependent variable, and then push it into an array declared as type Function.

function f1(t::Float64,x::Array{Float64,1})
    return x[1]
end
f=Function[]
push!(f,f1)

Euler’s Method

Stepping

Approximating the slope each step.

First published in Euler’s Instutionum calculi integralis in 1768, this method gets a lot of milage, and if you are to understand anything, this method is it.

We march along with step size $h$, and at each new point, calculate the slope. The slope gives us our new direction to travel for the next $h$.

We can determine the error from the Taylor expansion of the function

$x_{n+1}=x_n+h f(x_n,t) + \mathcal{O}(h^2).$

In case you haven’t seen it before, the notation $\mathcal{O}(x)$ stands for “errors of the order x”.
Summing over the entire interval, we accumuluate error according to

$N\mathcal{O}(h^2)= \frac{x_f-x_0}{h}\mathcal{O}(h^2)=h$ ,

making this a first order method. Generally, if a technique is $n$th order in the Taylor expansion for one step, its $(n-1)$th order over the interval.

function Euler(f::Array{Function,1},t0::Float64,x::Array{Float64,1},h::Float64)
    d=length(f)
    xp=copy(x)
    for ii in 1:d
        xp[ii]+=h*f[ii](t0,x)
    end

    return t0+h,xp
end

Implicit Method or Backward Euler

If $f(t,x)$ has a form that is invertible, we can form a specific expression for the next step. For example, if we use our exponential,

$x_{n+1}=x_n+ h f(t_{n+1},x_{n+1})$

$x_{n+1}-h x_{n+1}=x_n$

$x_{n+1}=\frac{x_n}{1-h}$

This expression varies for each differential equation and only exists if the function is invertible.

function Implicit(f::Array{Function,1},t0::Float64,x::Array{Float64,1},h::Float64)
    return t0+h,x[1]/(1-h)
end

2nd Order Runge-Kutta

So in the Euler Method, we could just make more, tinier steps to achieve more precise results. Here, we make bettter steps. Each step itself takes more work than a step in the first order methods, but we win by having to perform fewer steps.

This time, we are going to work with the Taylor expansion up to second order,

$x_{n+1}=x_n+h f(t_n,x_n) + \frac{h^2}{2} f^{\prime}(t_n,x_n)+ \mathcal{O} (h^3).$

Define

$k_1=f(t_n,x_n),$

so that we can write down the derivative of our $f$ function, and the second derivative (curvature), of our solution,

$f^{\prime}(t_n,x_n)=\frac{f(t_n+h/2,x_n+h k_1/2)-k_1}{h/2}+\mathcal{O}(h^2).$

Plugging this expression back into our Taylor expansion, we get a new expression for $x_{n+1}$

$x_{n+1}=x_n+hf(t_n+h/2,x_n+h k_1/2)+\mathcal{O}(h^3)$

We can also interpret this technique as using the slope at the center of the interval, instead of the start.

function RK2(f::Array{Function,1},t0::Float64,x::Array{Float64,1},h::Float64)
    d=length(f)

    xp=copy(x)
    xk1=copy(x)

    for ii in 1:d
        xk1[ii]+=f[ii](t0,x)*h/2
    end
    for ii in 1:d
        xp[ii]+=f[ii](t0+h/2,xk1)*h
    end

    return t0+h,xp
end

4th Order Runge-Kutta

Wait! Where’s 3rd order? There exists a 3rd order method, but I only just heard about it while fact-checking for this post. RK4 is your dependable, multi-purpose workhorse, so we are going to skip right to it.

$k_1= f(t_n,x_n)$

$k_2= f(t_n+h/2,x_n+h k_1/2)$

$k_3 = f(t_n+h/2, x_n+h k_2/2)$

$k_4 = f(t_n+h,x_n+h k_3)$

$x_{n+1}=x_n+\frac{h}{6}\left(k_1+2 k_2+ 2k_3 +k_4 \right)$

I’m not going to prove here that the method is fourth order, but we will see numerically that it is.

Note: I premultiply the $h$ in my code to reduce the number of times I have to multiply $h$.

function RK4(f::Array{Function,1},t0::Float64,x::Array{Float64,1},h::Float64)
    d=length(f)

    hk1=zeros(x)
    hk2=zeros(x)
    hk3=zeros(x)
    hk4=zeros(x)

    for ii in 1:d
        hk1[ii]=h*f[ii](t0,x)
    end
    for ii in 1:d
        hk2[ii]=h*f[ii](t0+h/2,x+hk1/2)
    end
    for ii in 1:d
        hk3[ii]=h*f[ii](t0+h/2,x+hk2/2)
    end
    for ii in 1:d
        hk4[ii]=h*f[ii](t0+h,x+hk3)
    end

    return t0+h,x+(hk1+2*hk2+2*hk3+hk4)/6
end

This next function merely iterates over a certain number of steps for a given method.

function Solver(f::Array{Function,1},Method::Function,t0::Float64,
        x0::Array{Float64,1},h::Float64,N::Int64)
    d=length(f)
    ts=zeros(Float64,N+1)
    xs=zeros(Float64,d,N+1)

    ts[1]=t0
    xs[:,1]=x0

    for i in 2:(N+1)
        ts[i],xs[:,i]=Method(f,ts[i-1],xs[:,i-1],h)
    end

    return ts,xs
end

N=1000
xf=10
t0=0.
x0=[1.]
dt=(xf-t0)/N

tEU,xEU=Solver(f,Euler,t0,x0,dt,N);
tIm,xIm=Solver(f,Implicit,t0,x0,dt,N);
tRK2,xRK2=Solver(f,RK2,t0,x0,dt,N);
tRK4,xRK4=Solver(f,RK4,t0,x0,dt,N);

xi=tEU
yi=exp(xi);

errEU=reshape(xEU[1,:],N+1)-yi
errIm=reshape(xIm[1,:],N+1)-yi
errRK2=reshape(xRK2[1,:],N+1)-yi;
errRK4=reshape(xRK4[1,:],N+1)-yi;

plot(x=tEU,y=xEU[1,:],Geom.point,
Theme(highlight_width=0pt,default_color=green,
default_point_size=3pt))

lEU=layer(x=tEU,y=xEU[1,:],Geom.point,
Theme(highlight_width=0pt,default_color=green,
default_point_size=3pt))

lIm=layer(x=tIm,y=xIm[1,:],Geom.point,
Theme(highlight_width=0pt,default_color=yellow,
default_point_size=3pt))

lRK2=layer(x=tRK2,y=xRK2[1,:],Geom.point,
Theme(highlight_width=0pt,default_color=cyan,
default_point_size=2pt))

lRK4=layer(x=tRK4,y=xRK4[1,:],Geom.point,
Theme(highlight_width=0pt,default_color=violet,
default_point_size=4pt))

lp=layer(x->e^x,-.1,10,Geom.line,Theme(default_color=red))


plot(lp,lEU,lIm,lRK2,lRK4,
Guide.manual_color_key("Legend",["Euler","Implicit","RK2","RK4","Exact"],
[green,yellow,cyan,violet,red]),
Coord.cartesian(xmin=9.5,xmax=10.1))

lEU=layer(x=xi,y=errEU,Geom.point,
Theme(highlight_width=0pt,default_color=green,
default_point_size=1pt))

lIm=layer(x=xi,y=errIm,Geom.point,
Theme(highlight_width=0pt,default_color=yellow,
default_point_size=1pt))

lRK2=layer(x=xi,y=errRK2,Geom.point,
Theme(highlight_width=0pt,default_color=cyan,
default_point_size=1pt))

lRK4=layer(x=xi,y=errRK4,Geom.point,
Theme(highlight_width=0pt,default_color=violet,
default_point_size=1pt))

plot(lEU,lIm,lRK2,lRK4,Scale.y_asinh,
Guide.manual_color_key("Legend",["Euler","Implicit","RK2","RK4"],
[green,yellow,cyan,violet]))

Scaling of the Error

I talked above about the error scaling either as $h,h^2$, or $h^4$. I won’t just talk but here will numerically demonstrate the relationship as well. For a variety of different step sizes, the below code calculates the final error for each method. Then we will plot the error and see how it scales.

t0=0.
tf=1.
dx=tf-t0
x0=[1.]

dt=collect(.001:.0001:.01)

correctans=exp(tf)
errfEU=zeros(dt)
errfIm=zeros(dt)
errfRK2=zeros(dt)
errfRK4=zeros(dt)



for ii in 1:length(dt)
    N=round(Int,dx/dt[ii])
    dt[ii]=dx/N

    tEU,xEU=Solver(f,Euler,t0,x0,dt[ii],N);
    tIm,xIm=Solver(f,Implicit,t0,x0,dt[ii],N);
    tRK2,xRK2=Solver(f,RK2,t0,x0,dt[ii],N);
    tRK4,xRK4=Solver(f,RK4,t0,x0,dt[ii],N);

    errfEU[ii]=xEU[1,end]-correctans
    errfIm[ii]=xIm[1,end]-correctans
    errfRK2[ii]=xRK2[1,end]-correctans
    errfRK4[ii]=xRK4[1,end]-correctans
end

lEU=layer(x=dt,y=errfEU,Geom.point,
Theme(highlight_width=0pt,default_color=green,
default_point_size=1pt))

lIm=layer(x=dt,y=errfIm,Geom.point,
Theme(highlight_width=0pt,default_color=yellow,
default_point_size=1pt))

lRK2=layer(x=dt,y=errfRK2*10^5,Geom.point,
Theme(highlight_width=0pt,default_color=cyan,
default_point_size=1pt))

lRK4=layer(x=dt,y=errfRK4*10^10,Geom.point,
Theme(highlight_width=0pt,default_color=violet,
default_point_size=1pt))

tempEU(x)=(errfEU[end]*x/.01)
tempIm(x)=(errfIm[end]*x/.01)
#le=layer([tempEU,tempIm],0,.01,Geom.line,Theme(default_color=base01))
le=layer(tempEU,0,.01,Geom.line,Theme(default_color=base01))
lei=layer(tempIm,0,.01,Geom.line,Theme(default_color=base01))


temp2(x)=(errfRK2[end]*(x/.01)^2*10^5)
l2=layer(temp2,0,.01,Geom.line,Theme(default_color=base00))

temp4(x)=(errfRK4[end]*(x/.01)^4*10^10)
l4=layer(temp4,0,.01,Geom.line,Theme(default_color=base00))

xl=Guide.xlabel("h")
ylrk2=Guide.ylabel("Error e-5")
ylrk4=Guide.ylabel("Error e-10")
yl=Guide.ylabel("Error")

pEU=plot(lEU,lIm,le,lei,xl,yl,Guide.title("Euler and Implicit, linear error"))
p2=plot(lRK2,l2,xl,ylrk2,Guide.title("RK2, error h^2"))
p4=plot(lRK4,l4,xl,ylrk4,Guide.title("RK4, error h^4"))
vstack(hstack(p2,p4),pEU)

Arbitrary Order

While I have presented 4 concrete examples, many more exist. For any choice of variables $a_i, \beta_{i,j},a_i$ that fulfill

$x_{n+1}=x_n+h\left(\sum_{i=1}^s a_i k_i \right)+ \mathcal{O}(h^p)$

with

$k_i=f\left(t_n+\alpha_i h,x_n+h\left(\sum_{j=1}^s \beta_{ij} k_j \right) \right)$

we have a Runge-Kutta method of order $p$, where $p\geq s$. The Butcher tableau provides a set of consistent coefficients.

These routines are also present in the M4 folder in a module named diffeq.jl. For later work, you may simply import the module.

Stay tuned for when we tuned these routines to the stiff van der Pol equations!

#MonthOfJulia Day 21: Differential Equations

Yesterday we had a look at Julia’s support for Calculus. The next logical step is to solve some differential equations. We’ll look at two packages today: Sundials and ODE.

Sundials

The Sundials package is based on a library which implements a number of solvers for differential equations. First off you’ll need to install that library. In Ubuntu this is straightforward using the package manager. Alternatively you can download the source distribution.

sudo apt-get install libsundials-serial-dev

Next install the Julia package and load it.

julia> Pkg.add("Sundials")
julia> using Sundials

To demonstrate we’ll look at a standard “textbook” problem: a damped harmonic oscillator (mass on a spring with friction). This is a second order differential equation with general form

$\ddot{x} + a \dot{x} + b x = 0$

where $x$ is the displacement of the oscillator, while $a$ and $b$ characterise the damping coefficient and spring stiffness respectively. To solve this numerically we need to convert it into a system of first order equations:

$\begin{aligned} \dot{x} &= v \\ \dot{v} &= - a v - b x \end{aligned}$

We’ll write a function for those relationships and assign specific values to $a$ and $b$ .

julia> function oscillator(t, y, ydot)
           ydot[1] = y[2]
           ydot[2] = - 3 * y[1] - y[2] / 10
       end
oscillator (generic function with 2 methods)

Next the initial conditions and time steps for the solution.

julia> initial = [1.0, 0.0];                   # Initial conditions
julia> t = float([0:0.125:30]);                # Time steps

And finally use cvode() to integrate the system.

julia> xv = Sundials.cvode(oscillator, initial, t);
julia> xv[1:5,:]
5x2 Array{Float64,2}:
 1.0        0.0     
 0.97676   -0.369762
 0.908531  -0.717827
 0.799076  -1.02841 
 0.65381   -1.28741

The results for the first few time steps look reasonable: the displacement (left column) is decreasing and the velocity (right column) is becoming progressively more negative. To be sure that the solution has the correct form, have a look at the Gadfly plot below. The displacement (black) and velocity (blue) curves are 90° out of phase, as expected, and both gradually decay with time due to damping. Looks about right to me!

ODE

The ODE package provides a selection of solvers, all of which are implemented with a consistent interface (which differs a bit from Sundials).

julia> using ODE

Again we need to define a function to characterise our differential equations. The form of the function is a little different with the ODE package: rather than passing the derivative vector by reference, it’s simply returned as the result. I’ve consider the same problem as above, but to spice things up I added a sinusoidal driving force.

julia> function oscillator(t, y)
           [y[2]; - a * y[1] - y[2] / 10 + sin(t)]
       end
oscillator (generic function with 2 methods)

We’ll solve this with ode23(), which is a second order adaptive solver with third order error control. Because it’s adaptive we don’t need to explicitly specify all of the time steps, just the minimum and maximum.

julia> a = 1;                                  # Resonant
julia> T, xv = ode23(oscillator, initial, [0.; 40]);
julia> xv = hcat(xv...).';                     # Vector{Vector{Float}} -> Matrix{Float}

The results are plotted below. Driving the oscillator at the resonant frequency causes the amplitude of oscillation to grow with time as energy is transferred to the oscillating mass.

If we move the oscillator away from resonance the behavior becomes rather interesting.

julia> a = 3;                                  # Far from resonance

Now, because the oscillation and the driving force aren’t synchronised (and there’s a non-rational relationship between their frequencies) the displacement and velocity appear to change irregularly with time.

How about a double pendulum (a pendulum with a second pendulum suspended from its end)? This seemingly simple system exhibits a rich range of dynamics. It’s behaviour is sensitive to initial conditions, one of the characteristics of chaotic systems.

First we set up the first order equations of motion. The details of this system are explained in the video below.

julia> function pendulum(t, y)
           Y = [
           6 * (2 * y[3] - 3 * cos(y[1] - y[2]) * y[4]) / (16 - 9 * cos(y[1] - y[2])^2);
           6 * (8 * y[4] - 3 * cos(y[1] - y[2]) * y[3]) / (16 - 9 * cos(y[1] - y[2])^2)
           ]
           [
           Y[1];
           Y[2];
           - (Y[1] * Y[2] * sin(y[1] - y[2]) + 3 * sin(y[1])) / 2;
           - (sin(y[2]) - Y[1] * Y[2] * sin(y[1] - y[2])) / 2;
           ]
       end
pendulum (generic function with 1 method)

Define initial conditions and let it run…

julia> initial = [pi / 4, 0, 0, 0];            # Initial conditions -> deterministic behaviour
T, xv = ode23(pendulum, initial, [0.; 40]);

Below are two plots which show the results. The first is a time series showing the angular displacement of the first (black) and second (blue) mass. Next is a phase space plot which shows a different view of the same variables. It’s clear to see that there is a regular systematic relationship between them.

Next we’ll look at a different set of initial conditions. This time both masses are initially located above the primary vertex of the pendulum. This represents an initial configuration with much more potential energy.

julia> initial = [3/4 * pi, pi, 0, 0];         # Initial conditions -> chaotic behaviour

The same pair of plots now illustrate much more interesting behaviour. Note the larger range of angles, θ₂, achieved by the second bob. With these initial conditions the pendulum is sufficiently energetic for it to “flip over”. Look at the video below to get an idea of what this looks like with a real pendulum.

It’s been a while since I’ve played with any Physics problems. That was fun. The full code for today is available at github. Come back tomorrow when I’ll take a look at Optimisation in Julia.

The post #MonthOfJulia Day 21: Differential Equations appeared first on Exegetic Analytics.

juliabloggers.com

A Julia Language Blog Aggregator