Tag Archives: C++

10 examples of embedding Julia in C/C++

By: Abel Soares Siqueira

Re-posted from: https://blog.esciencecenter.nl/10-examples-of-embedding-julia-in-c-c-66282477e62c?source=rss----ab3660314556--julia

A beginner-friendly collection of examples

A wooden table with 4 cardboard boxes, one inside the other. In the back, the C++ and Julia logos.
We can put C/C++ inside Julia, and Julia inside C/C++, but wait until we call C/C++ from Julia from C/C++. This image is derived from the photos of Andrej Lišakov and Kelli McClintock found on Unsplash.

I have been asked recently about how easy it would be to use Julia from inside C/C++. That was a very interesting question that I was eager to figure out. This question gave me the chance to explore a few toy problems, but I had some issues figuring out the basics. This post aims to help people in a similar situation. I don’t make any benchmarks or claims. Rather, I want to help kickstart future investigations.

The 10 examples in this post are more-or-less ordered by increasing difficulty. I explain some basic bits of Makefile, which can be skipped, if you know what you are doing.

Please note that none of this was done in a production environment or in a real project. Furthermore, I use Linux, so some details are platform-specific. I also recommend checking the code on GitHub so you can see the full result. If possible, leave a star, so I can use it as an interest metric.

Target audience

This post should be useful for people evaluating whether using Julia inside C/C++ will be a good idea, or are just very curious about it. I assume some knowledge of Julia and C/C++.

Where is libjulia.so

First, install Julia and remember where you are installing it. I usually use Jill — which is my bash script to install Julia — with the following commands:
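The exact commands are in the jill README; from memory, the installation is roughly the following two steps (the script location and branch name are assumptions, so check the repository before running):

wget https://raw.githubusercontent.com/abelsiqueira/jill/main/jill.sh
sudo bash jill.sh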

In this case, julia will be installed in /opt/julias/julia-x.y.z/ with a link placed in /usr/local/bin/.

You can try finding out where your Julia is installed using which to find the full path of julia, then listing with -l to check whether that is a link and where it points. For instance,

ls -l $(dirname $(which julia))/julia*

shows all relevant links.

Now, that folder, which I will call JULIA_DIR, should have folders include and lib, and therefore the files $JULIA_DIR/include/julia/julia.h and $JULIA_DIR/lib/libjulia.so should exist.

The 10 examples

Now we start with the examples. Since we are using C/C++, I will start from 0, I guess. These examples are some of the files in the GitHub repository. Look for the files sqrtX.cpp, integrationX.cpp, and linear-algebraX.cpp. Notice that there are more files than examples, because some are mostly repetition.

0: The basics

Let’s make sure that we can build and run a basic example, taken directly from the official documentation:
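The example in question is essentially the short program below (reproduced from the embedding chapter of the Julia manual; check the docs for your Julia version, since details such as the JULIA_DEFINE_FAST_TLS macro have changed slightly over time):

#include <julia.h>
JULIA_DEFINE_FAST_TLS // only define this once, in an executable (older versions use JULIA_DEFINE_FAST_TLS())

int main(int argc, char *argv[])
{
    // Required: set up the Julia context
    jl_init();

    // Run a Julia command
    jl_eval_string("print(sqrt(2.0))");

    // Strongly recommended: let Julia run finalizers and flush pending writes before exiting
    jl_atexit_hook(0);
    return 0;
}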

I hope the code comments are self-explanatory. To compile this code, let’s prepare a very simple Makefile.
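The Makefile itself is in the repository; a minimal sketch consistent with the explanation below might look like this (the JULIA_DIR value is a placeholder for your own installation):

CARGS = -fPIC -g
JULIA_DIR = /opt/julias/julia-1.8.0
JLARGS = -I$(JULIA_DIR)/include/julia -L$(JULIA_DIR)/lib \
         -Wl,-rpath,$(JULIA_DIR)/lib -ljulia

main.exe: main.cpp
	g++ $(CARGS) $< $(JLARGS) -o $@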

Explaining:

  • -fPIC: Position independent code. This is needed because we will work with shared libraries
  • -g: Adding debug information because we will probably need it
  • JULIA_DIR: It’s our Julia dir!
  • -I…: Include path for julia.h
  • -L…: Linking path for libjulia.so
  • -Wl,…: Linking path for the linker
  • -ljulia: libjulia.so
  • main.exe: main.cpp: The file main.exe needs the file main.cpp
  • On line 7 there must be a TAB, not spaces
  • $<: Expands to the left-most requirement (main.cpp)
  • $@: Expands to the target (main.exe)
  • The Makefile expression as a whole: “To build a main.exe, look for main.cpp and run the command g++ … main.cpp … -o main.exe”

Enter make main.exe in your terminal, and then ./main.exe. Your output should be 1.4142135623730951. I was not expecting this to just work, but it did. I hope you have a similar experience.

1: Expanding the basics

The simplest way to execute anything in Julia is to use jl_eval_string. Variables created using jl_eval_string remain in the Julia interpreter scope, so you can access them:

jl_eval_string("x = sqrt(2.0)");
jl_eval_string("print(x)");

Returned values can be stored as pointers of type jl_value_t. To access their values, use jl_unbox_SOMETYPE. For instance:

jl_value_t *x = jl_eval_string("sqrt(2.0)");
double x_value = jl_unbox_float64(x);

Finally, you can also store pointers to Julia functions using jl_get_function. The returned type is a pointer to a jl_function_t and we can’t just use it as a C++ function. Instead, we will use jl_call1 and jl_box_float64.

jl_function_t *sqrt = jl_get_function(jl_base_module, "sqrt");
jl_value_t *x = jl_call1(sqrt, jl_box_float64(2.0));

In the code above we have jl_base_module, which is everything in Base of Julia. The other common module is jl_main_module, which will include whatever we load (using) or create.

The function jl_call1 is used to execute a Julia function with 1 argument. Variants with 0 to 3 arguments also exist, but if you need to call a function with more arguments, you need to use jl_call. We will get to that later.

In the end, we have something like this:
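The original file is in the repository; a sketch of how the pieces fit together, using only the calls shown above, is:

#include <stdio.h>
#include <julia.h>
JULIA_DEFINE_FAST_TLS

int main() {
    jl_init();

    // Evaluate a string and unbox the result
    jl_value_t *x = jl_eval_string("sqrt(2.0)");
    double x_value = jl_unbox_float64(x);

    // Or fetch sqrt from Base and call it on a boxed argument
    jl_function_t *sqrt_fun = jl_get_function(jl_base_module, "sqrt");
    jl_value_t *y = jl_call1(sqrt_fun, jl_box_float64(2.0));

    printf("%.16f %.16f\n", x_value, jl_unbox_float64(y));

    jl_atexit_hook(0);
    return 0;
}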

2: Exceptions

Now, try changing the 2.0 in the code above to -1.0. If you call sqrt(-1.0) in Julia you have a DomainError. But if you run the code with the change above, you will have a segmentation fault.

The problem is not in the call itself, though; it is in the subsequent unboxing of x, which is now a null pointer. To check for exceptions on the Julia side we can use jl_exception_occurred. It returns a pointer to the exception, or 0 (NULL) if there is none, so it can be used in a conditional statement.

To check the contents of jl_exception_occurred, we can use showerror from Julia’s Base.

After calling sqrt of -1.0, add the following:
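A sketch of that check (the exact printing flourish and whether the program exits may differ from the original):

jl_value_t *ex = jl_exception_occurred();
if (ex) {
    jl_function_t *showerror = jl_get_function(jl_base_module, "showerror");
    printf("Exception:\n");
    jl_call2(showerror, jl_stderr_obj(), ex);   // let Julia print the error message
    printf("\nI quit!\n");
    exit(1);                                    // optional: bail out after the error
}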

The first line creates a variable ex and assigns the exception (or NULL) to it. The if condition then evaluates ex, which behaves as false when there is no exception and as true when one was thrown.

Then, we call showerror from Julia and pass it Julia’s error stream, obtained with jl_stderr_obj(), and the exception. We add some flourish printing before and after the error message.

To call this a few times, we can create a function called handle_julia_exception wrapping it, and move it to auxiliary files (we’ll call them aux.h and aux.cpp). However, since we are printing, we would have to add iostream or stdio to our auxiliary files, and we don’t want that. Instead, what we can do is use jl_printf and jl_stderr_stream to print using only julia.h.

Therefore we can create the following files:
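A sketch of these two files, using only julia.h for the printing as discussed:

// aux.h
#pragma once
#include <julia.h>

void handle_julia_exception();

// aux.cpp
#include <cstdlib>
#include "aux.h"

// Check for a pending Julia exception; if there is one, print it with showerror and quit.
void handle_julia_exception() {
    jl_value_t *ex = jl_exception_occurred();
    if (ex) {
        jl_printf(jl_stderr_stream(), "Exception:\n");
        jl_call2(jl_get_function(jl_base_module, "showerror"), jl_stderr_obj(), ex);
        jl_printf(jl_stderr_stream(), "\nI quit!\n");
        jl_atexit_hook(1);
        exit(1);
    }
}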

In our main file we can just add #include "aux.h" and call handle_julia_exception() directly. And since we are already here, we can also create a wrapper for jl_eval_string that checks for exceptions as well. Add the following to your auxiliary files:

// To your aux.h
jl_value_t *handle_eval_string(const char* code);

and

// To your aux.cpp (remember to #include <cassert> at the top for assert)
jl_value_t *handle_eval_string(const char *code) {
    jl_value_t *result = jl_eval_string(code);
    handle_julia_exception();
    assert(result && "Missing return value but no exception occurred!");
    return result;
}

And modify your Makefile accordingly:

main.exe: main.cpp aux.o
	g++ $(CARGS) $< $(JLARGS) aux.o -o $@

%.o: %.cpp
	g++ $(CARGS) -c $< $(JLARGS) -o $@

The % in the Makefile acts as a wildcard.

After running the newest version, you should see something like:

Exception:
DomainError with -1.0:
sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).
I quit!

Other possibilities for handling exceptions are JL_TRY and JL_CATCH, but I don’t have an example for them yet.

If you think that example would be useful, leave a comment.

3: Including Julia files and calling functions with 4 arguments or more

Let’s move on to something a little more convoluted, the trapezoid method for computing approximations for integrals.

Animation of Trapezoid approximation to the integral of a function.

The idea of the method is to approximate the integral of the function — the blue-shaded region — by a finite amount of trapezoid areas (caveat: the geometric interpretation only applies to positive functions). As the animation suggests, by increasing the number of trapezoids, we tend to get better approximations for the integral.

The neat formula for the approximation is

Trapezoid formula. LaTeX: \int_a^b f(x) \,\text{d}x \approx \frac{b - a}{2n}\left(f(a) + 2 \sum_{i = 1}^{n - 1} f(x_i) + f(b)\right)

A basic implementation of the trapezoid method in Julia is as follows:
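The aux.jl in the repository may differ slightly; a straightforward version matching the formula above is:

# aux.jl
function trapezoid(f, a, b, n)
    x = range(a, b, length=n + 1)
    h = (b - a) / n
    return h * (f(a) / 2 + sum(f, x[2:end-1]) + f(b) / 2)
end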

Don’t worry, we are not allocating when using range, not even when accessing [2:end-1].

Write this down to a file called aux.jl and include it using the code below:

handle_eval_string("include(\"aux.jl\")");

Now we can access trapezoid as any other Julia code, for instance, using jl_eval_string or jl_get_function. It is important to notice that trapezoid is not part of the Base module. Instead, we must use the Main module through jl_main_module.

To test this function, let’s compute the integral of x^2 from 0 to 1. We will use the evaluator to compute x -> x^2, which is the notation for anonymous functions in Julia. The result should be 1/3.
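A sketch of this call follows (the number of trapezoids, 100000 here, is an arbitrary choice):

jl_function_t *trapezoid = jl_get_function(jl_main_module, "trapezoid");
jl_value_t *f = handle_eval_string("x -> x^2");
jl_value_t *args[4] = {f, jl_box_float64(0.0), jl_box_float64(1.0), jl_box_int64(100000)};
jl_value_t *result = jl_call(trapezoid, args, 4);
printf("Integral of x^2 from 0 to 1: %f\n", jl_unbox_float64(result));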

Notice that the function x -> x^2 was created with a handle_eval_string, which calls jl_eval_string. The return value of jl_eval_string is a jl_value_t *, but surprise, a jl_function_t is actually just another name for jl_value_t. The difference is just for readability purposes.

The trapezoid function has 4 arguments, therefore we have to use the general jl_call that we mentioned before. The arguments of jl_call are the function, an array of jl_value_t * arguments, and the number of arguments.

4: C function from Julia from C

How about computing the integral of a C function? We will need to access it through Julia to be able to pass it to a Julia function. First, we must create the function in C. Create a file my_c_func.cpp with the following contents:
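A sketch of that file; as noted later in this example, the function used is x cubed:

// my_c_func.cpp
extern "C" double my_c_func(double x) {
    return x * x * x;
}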

It is important that we use extern "C" here, otherwise, C++ will mangle the function name. If you use C instead of C++, then this will not be an issue, but we intend to use C++ down the road. We will compile this code to a shared library, not only a .o object. Therefore, add the following to your Makefile:

lib%.so: %.o
	ld -shared $< -o $@

ld is the linker and -shared is because we want a shared library. Furthermore, you should modify the following:

main.exe: main.cpp aux.o libmy_c_func.so

Now, when you run make main.exe, the libmy_c_func.so library will be compiled.

Finally, to call this function, we use the same string evaluator and Julia’s ccall.
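In a sketch, that wrapper is a one-liner (the ccall arguments are explained right below):

jl_value_t *f = handle_eval_string(
    "x -> ccall((:my_c_func, \"libmy_c_func.so\"), Cdouble, (Cdouble,), x)");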

The ccall function has 4+ arguments:

  • (:my_c_func, "libmy_c_func.so"): A tuple with the function name and the library;
  • Cdouble: Return type;
  • (Cdouble,): Tuple with the types of the arguments;
  • Then, all the arguments. In this case, only x.

That is it. This change is enough to make the code run. Notice that the function is x^3, so the integral result should be 1 / 4. Those are the only differences in the code.

5: Using a package

Instead of implementing our own integration method, we can use some existing one. One option is QuadGK.jl. To install it, open julia, press ], and enter add QuadGK.

An important note here is that I have not investigated much into maintaining a separate environment for these packages. If you know more about this subject, don’t hesitate to leave a comment.

Here is the code:

handle_eval_string("using QuadGK");
jl_value_t *integrator = handle_eval_string(
"(f, a, b, n) -> quadgk(f, a, b, maxevals=n)[1]"
);

Just like that we can compute the integral, and compare it with our implementation. Let’s use a harder integral to make things more interesting:

The integral of 1 over 1 plus x squared from 0 to 1 is Pi over 4. LaTeX: \int_0^1 \frac{1}{1 + x^2} \text{d}x = \frac{\pi}{4}.

Here is the complete code for this example:
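The full listing is in the repository; a condensed sketch of its core, reusing the helpers and aux.jl from the previous examples (the number of trapezoids is a placeholder, so the error values will not match exactly), could be:

#include <cmath>
#include <iostream>
#include <julia.h>
#include "aux.h"

JULIA_DEFINE_FAST_TLS

int main() {
    jl_init();

    handle_eval_string("include(\"aux.jl\")");   // our trapezoid implementation
    handle_eval_string("using QuadGK");

    // Integrand 1 / (1 + x^2); the exact integral on [0, 1] is pi / 4
    jl_value_t *f = handle_eval_string("x -> 1 / (1 + x^2)");
    jl_value_t *args[4] = {f, jl_box_float64(0.0), jl_box_float64(1.0), jl_box_int64(100)};

    jl_function_t *trapezoid = jl_get_function(jl_main_module, "trapezoid");
    double approx = jl_unbox_float64(jl_call(trapezoid, args, 4));
    std::cout << "Integral of 1 / (1 + x^2) is approx: " << approx << "\n"
              << "Error: " << M_PI / 4 - approx << std::endl;

    jl_value_t *integrator = handle_eval_string("(f, a, b, n) -> quadgk(f, a, b, maxevals=n)[1]");
    approx = jl_unbox_float64(jl_call(integrator, args, 4));
    std::cout << "Integral of 1 / (1 + x^2) is approx: " << approx << "\n"
              << "Error: " << M_PI / 4 - approx << std::endl;

    jl_atexit_hook(0);
    return 0;
}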

The results you should see are

Integral of 1 / (1 + x^2) is approx: 0.785394
Error: 4.16667e-06
Integral of 1 / (1 + x^2) is approx: 0.785398
Error: -1.11022e-16

6: Using the Distributions package

The package Distributions contains various probability-related tools. We are going to use the Normal distribution’s PDF (Probability Density Function) and CDF (Cumulative Distribution Function) in this example. Don’t worry if you don’t know what these mean, we won’t need to understand the concept, only the formulas.

The Normal distribution with mean Mu (µ) and standard deviation Sigma (σ) has PDF given by

Normal probability density function. LaTeX: f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}

And the CDF of a PDF is

Cumulative distribution function definition. LaTeX: F(x) = \int_{-\infty}^x f(t) \text{d}t

What we will do is use the Distributions package to access the PDF and compute the CDF integral using QuadGK. We will then compare it to the existing CDF function in Distributions.

Once more, there is no big secret. You only have to create the Normal object on the Julia side and use Julia closures to define one-argument jl_function_t wrappers for the PDF and CDF. This is the code:

handle_eval_string("normal = Normal()");
jl_function_t *pdf = handle_eval_string("x -> pdf(normal, x)");
jl_function_t *cdf = handle_eval_string("x -> cdf(normal, x)");

The full code is below
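A condensed sketch of the comparison itself, continuing from the closures above and assuming using Distributions and using QuadGK have already been evaluated (the x range and print format here are my own placeholders):

// CDF computed as the integral of the pdf from -Inf to x, using quadgk
jl_value_t *cdf_integral = handle_eval_string(
    "x -> quadgk(t -> pdf(normal, t), -Inf, x)[1]");

for (double x = -2.0; x <= 2.0; x += 1.0) {
    double c1 = jl_unbox_float64(jl_call1(cdf, jl_box_float64(x)));
    double c2 = jl_unbox_float64(jl_call1(cdf_integral, jl_box_float64(x)));
    printf("x: %+.2f  cdf: %e  quadgk: %e  diff: %e\n", x, c1, c2, c1 - c2);
}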

7: Creating a class to wrap the Distributions package

To complicate it a little bit more, let’s create a class wrapping the Distributions package. The basic idea will be a constructor to call Normal , and C++ functions wrapping pdf and cdf. This can be done simply by having a call to handle_eval_string or by creating the function with jl_get_function and calling jl_call_X.

However, to make it more efficient, we want to avoid frequent calls to the functions that deal with strings. One solution is to store the functions returned by jl_get_function and just use them when necessary. To do that, we will use static members in C++.

The two files below show the implementation of our class:
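The actual Normal.h and Normal.cpp are in the repository; a sketch capturing the structure described below (static function pointers, a distributions_loaded flag, and thin pdf/cdf wrappers) could be:

// Normal.h
#pragma once
#include <julia.h>

class Normal {
public:
    Normal(double mu = 0.0, double sigma = 1.0);
    double pdf(double x) const;
    double cdf(double x) const;

private:
    static void initialize();              // loads Distributions and fetches the functions
    static bool distributions_loaded;
    static jl_function_t *normal_constructor, *pdf_func, *cdf_func;
    jl_value_t *dist;                      // the Distributions.Normal object for this instance
};

// Normal.cpp
#include "Normal.h"
#include "aux.h"

bool Normal::distributions_loaded = false;
jl_function_t *Normal::normal_constructor = nullptr;
jl_function_t *Normal::pdf_func = nullptr;
jl_function_t *Normal::cdf_func = nullptr;

void Normal::initialize() {
    handle_eval_string("using Distributions");
    normal_constructor = jl_get_function(jl_main_module, "Normal");
    pdf_func = jl_get_function(jl_main_module, "pdf");
    cdf_func = jl_get_function(jl_main_module, "cdf");
    distributions_loaded = true;
}

Normal::Normal(double mu, double sigma) {
    if (!distributions_loaded) initialize();
    // Note: a production version should also root `dist` against the garbage collector.
    dist = jl_call2(normal_constructor, jl_box_float64(mu), jl_box_float64(sigma));
    handle_julia_exception();
}

double Normal::pdf(double x) const {
    return jl_unbox_float64(jl_call2(pdf_func, dist, jl_box_float64(x)));
}

double Normal::cdf(double x) const {
    return jl_unbox_float64(jl_call2(cdf_func, dist, jl_box_float64(x)));
}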

As you can see, we keep a distributions_loaded flag to let the constructor know that the static variables can be used. In the initialization function, we define the necessary functions. The actual implementation of the constructor and the PDF and CDF functions is straightforward.

We can use this new class in our main file easily:
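A sketch of that main file (the distribution parameters are placeholders, so the numbers will not match the table below exactly):

#include <cstdio>
#include <julia.h>
#include "Normal.h"

JULIA_DEFINE_FAST_TLS

int main() {
    jl_init();
    Normal normal;   // default standard normal; the original example may use other parameters
    for (double x = -4.0; x <= 4.0; x += 1.0)
        printf("x: %+.2e  pdf: %+.2e  cdf: %+.2e\n", x, normal.pdf(x), normal.cdf(x));
    jl_atexit_hook(0);
    return 0;
}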

Don’t forget to update your Makefile by replacing aux.o with aux.o Normal.o, i.e., add Normal.o next to aux.o. The result of this execution is

x: -4.00e+00  pdf: +4.97e-08  cdf: +1.12e-08
x: -3.00e+00 pdf: +2.73e-06 cdf: +7.07e-07
x: -2.00e+00 pdf: +8.29e-05 cdf: +2.52e-05
x: -1.00e+00 pdf: +1.39e-03 cdf: +5.11e-04
x: +0.00e+00 pdf: +1.30e-02 cdf: +5.95e-03
x: +1.00e+00 pdf: +6.68e-02 cdf: +4.04e-02
x: +2.00e+00 pdf: +1.90e-01 cdf: +1.64e-01
x: +3.00e+00 pdf: +3.00e-01 cdf: +4.18e-01
x: +4.00e+00 pdf: +2.62e-01 cdf: +7.13e-01

8: Linear algebra: Arrays, Vectors, and Matrices

Let’s start our linear algebra exploration with a matrix-vector multiplication and solving a linear system. We will define the following:

x is a vector of ones and A is a matrix with n in the diagonal, 1 below the diagonal and -1 above the diagonal. LaTeX: x = \begin{bmatrix} 1 \\ 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}, A = \begin{bmatrix} n & -1 & -1 & \cdots & -1 \\ 1 & n & -1 & \cdots & -1 \\ 1 & 1 & n & \cdots & -1 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & 1 & \cdots & n \end{bmatrix}

Let’s start with some code:
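A sketch of that setup (the dimension n is a placeholder; variable names may differ from the repository):

size_t n = 10;   // placeholder dimension

// Array types: 1-dimensional and 2-dimensional arrays of Float64
jl_value_t *vector_type = jl_apply_array_type((jl_value_t *) jl_float64_type, 1);
jl_value_t *matrix_type = jl_apply_array_type((jl_value_t *) jl_float64_type, 2);

// Allocate the two vectors and the matrix on the Julia side
jl_array_t *x = jl_alloc_array_1d(vector_type, n);
jl_array_t *y = jl_alloc_array_1d(vector_type, n);
jl_array_t *A = jl_alloc_array_2d(matrix_type, n, n);

// Tell the garbage collector not to touch these while we use them
JL_GC_PUSH3(&x, &y, &A);

// C pointers to the underlying data
double *xData = (double *) jl_array_data(x);
double *AData = (double *) jl_array_data(A);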

The first two statements define the vector and matrix types. Notice that we make explicit that the first has 1 dimension and the second has 2 dimensions.

The next 3 statements allocate the memory for the two vectors x and y, and the matrix A, using the array types we previously defined.

Finally, we have a JL_GC_PUSH3, which informs Julia’s Garbage Collector to not touch this memory. Naturally, we will have to pop these eventually.

Lastly, we declare C arrays pointing to the Julia data. You will notice that AData is a 1-dimensional array because Julia implements dense matrices as a linearized array by columns. That means that the element(i,j) will be at the linearized position i + j * nrows — using 0-based indexing.

To fill the values of the vector x and the matrix A , we can use the code below:
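A sketch of the filling loop, remembering that A is stored by columns:

for (size_t j = 0; j < n; j++) {
    xData[j] = 1.0;
    for (size_t i = 0; i < n; i++) {
        if (i == j)
            AData[i + j * n] = n;       // diagonal
        else if (i > j)
            AData[i + j * n] = 1.0;     // below the diagonal
        else
            AData[i + j * n] = -1.0;    // above the diagonal
    }
}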

The product of A and x is pretty much the same as any function we had so far.
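A sketch of that product, calling Base’s * through jl_call2:

jl_function_t *mul = jl_get_function(jl_base_module, "*");
jl_array_t *Ax = (jl_array_t *) jl_call2(mul, (jl_value_t *) A, (jl_value_t *) x);
double *AxData = (double *) jl_array_data(Ax);   // in a longer program Ax should also be GC-rooted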

The noteworthy part of this code is that we have to cast the arrays to jl_value_t * to use them as arguments to jl_call2, and the output is cast back to jl_array_t *. Similarly, we can use jl_array_data(Ax) to access the contents of the product.

We can also use mul! to compute the product in place, i.e., without allocating more memory:
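A sketch of the in-place version, assuming using LinearAlgebra is evaluated first (which also makes lu available for the next step):

handle_eval_string("using LinearAlgebra");
jl_function_t *mul_inplace = jl_get_function(jl_main_module, "mul!");
jl_call3(mul_inplace, (jl_value_t *) y, (jl_value_t *) A, (jl_value_t *) x);   // y = A * x, no new allocation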

Notice that we use jl_main_module because mul! comes from LinearAlgebra, which we brought into Main with using LinearAlgebra.

Finally, we move on to solving the linear system. To do that, let’s use the LU factorization and the ldiv! function. The \ (backslash) operator is usually used here, but we choose ldiv! to solve the linear system in place.

jl_function_t *lu_fact = jl_get_function(jl_main_module, "lu");
jl_value_t *LU = jl_call1(lu_fact, (jl_value_t *) A);
jl_function_t *ldiv = jl_get_function(jl_main_module, "ldiv!");
jl_call3(ldiv, (jl_value_t *) y, LU, (jl_value_t *) Ax);

The last call defines y as the solution of the linear system Ay = (Ax) . Since A is non-singular, we expect y and x to be sufficiently close (numerical errors could appear here). We can verify this using

double *yData = (double *) jl_array_data(y);
double norm2 = 0.0;
for (size_t i = 0; i < n; i++) {
    double dif = yData[i] - xData[i];
    norm2 += dif * dif;
}
cout << "|x - y|² = " << norm2 << endl;

My result was 6.48394e-26 .

To finalize this code, we have to run

JL_GC_POP();

This allows the Julia Garbage Collector to collect the allocated memory. The complete code can be seen below:

9: Sparse matrices

For our next example, we will solve a heat equation in one spatial dimension, using a discretization of time and space called Backward Time Centered Space (BTCS), which is not quick to explain. Check these notes for a thorough explanation.

For our interests, it suffices to say that we will be solving a sparse linear system multiple times, where the matrix is the one below:

Tridiagonal matrix, where the diagonal stores 1 plus 2 times kappa, and the off-diagonal values are -kappa. LaTeX: A = \begin{bmatrix} 1 + 2\kappa & -\kappa \\ -\kappa & 1 + 2\kappa & -\kappa \\ & \ddots & \ddots & \ddots \\ & & -\kappa & 1 + 2\kappa & -\kappa \\ & & & -\kappa & 1 + 2\kappa \end{bmatrix}

We don’t have to store this matrix as a dense matrix (like in the previous example). Instead, we want to store only the relevant elements. To do that, we will create three vectors: one for the row indexes, one for the column indexes, and one for the values at those positions.

The code is below:

long int rows[3 * n - 2], cols[3 * n - 2];
double vals[3 * n - 2];
for (size_t i = 0; i < n; i++) {
    rows[i] = i + 1;
    cols[i] = i + 1;
    vals[i] = (1 + 2 * kappa);
    if (i < n - 1) {
        rows[n + i] = i + 1;
        cols[n + i] = i + 2;
        vals[n + i] = -kappa;
        rows[2 * n + i - 1] = i + 2;
        cols[2 * n + i - 1] = i + 1;
        vals[2 * n + i - 1] = -kappa;
    }
}

Now, we will create a sparse matrix using the sparse function from the SparseArrays module in Julia. For that, we allocate two array types, one for the integers, and one for the floating point numbers.
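A sketch of that call, wrapping the C arrays from above with jl_ptr_to_array_1d (the last argument, 0, says Julia does not own the buffers; long int matches Int64 on 64-bit Linux):

handle_eval_string("using SparseArrays");

jl_value_t *int_vector_type   = jl_apply_array_type((jl_value_t *) jl_int64_type, 1);
jl_value_t *float_vector_type = jl_apply_array_type((jl_value_t *) jl_float64_type, 1);

jl_function_t *sparse = jl_get_function(jl_main_module, "sparse");
jl_value_t *A_sparsematrix = jl_call3(sparse,
    (jl_value_t *) jl_ptr_to_array_1d(int_vector_type,   rows, 3 * n - 2, 0),
    (jl_value_t *) jl_ptr_to_array_1d(int_vector_type,   cols, 3 * n - 2, 0),
    (jl_value_t *) jl_ptr_to_array_1d(float_vector_type, vals, 3 * n - 2, 0));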

On the jl_call3 , we also call jl_ptr_to_array_1d to directly create and return a Julia vector wrapping the data we give it.

The A_sparsematrix is a Julia sparse matrix. Many of the matrix operations that work with dense matrices will work with sparse matrices. To test a different factorization, let’s use the function ldl from the LDLFactorizations package.
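A sketch of the factorization call (LDLFactorizations must be installed first, just like QuadGK earlier):

handle_eval_string("using LDLFactorizations");
jl_function_t *ldl = jl_get_function(jl_main_module, "ldl");
jl_value_t *ldlObj = jl_call1(ldl, A_sparsematrix);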

Now, we can use ldiv! with ldlObj instead of the LU factorization that we used in the previous example. There is one catch, though. Since we are using ldlObj “for a while”, we need to prevent the Garbage Collector from cleaning it up. But the JL_GC_PUSHX macros can only be called once per scope. Therefore, to use them again we have to create an internal scope. So something like the following:
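A sketch of that internal scope:

{
    // A new scope, so JL_GC_PUSH1/JL_GC_POP can be used again independently
    JL_GC_PUSH1(&ldlObj);

    // ... the time-stepping loop calling ldiv! with ldlObj goes here ...

    JL_GC_POP();
}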

The complete code is below:

In the algorithm, we define u as the initial vector, then solve the linear system with u as the right-hand side to obtain unew. Then we assign unew to u and repeat. Each u is an approximation to the solution of the heat equation at a specific moment in time.

You will notice that, in addition to computing the solution, we also plot it using the Plots package, showing the solution at several moments in time. This makes the code much slower, unfortunately. The result can be seen below:

The image shows the solution starting from the function given by the exponential of the negative of the distance from the center. The other curves show the decay of this function, each one flatter and more similar to a symmetric quadratic with negative curvature.
Plot of heat equation solution at different moments in time.

Finalizing and open questions

I hope these 10 examples are helpful to get you started with embedding Julia in C. There are many more things not covered here, in particular things I do not know. Some of them are:

  • How to deal with strings?
  • How to deal with keyword arguments?
  • How to deal with installing packages and environments?
  • How to make it faster (e.g., using precompiled images)?

I will be on the lookout for future projects to investigate these. In the meantime, like and follow for more Julia and C/C++ content.

References and extra material


10 examples of embedding Julia in C/C++ was originally published in Netherlands eScience Center on Medium, where people are continuing the conversation by highlighting and responding to this story.

A Comparison Between Differential Equation Solver Suites In MATLAB, R, Julia, Python, C, Mathematica, Maple, and Fortran

By: Christopher Rackauckas

Re-posted from: http://www.stochasticlifestyle.com/comparison-differential-equation-solver-suites-matlab-r-julia-python-c-fortran/

Many times a scientist is choosing a programming language or a software for a specific purpose. For the field of scientific computing, the methods for solving differential equations are what’s important. What I would like to do is take the time to compare and contrast between the most popular offerings.
This is a good way to reflect upon what’s available and find out where there is room for improvement. I hope that by giving you the details for how each suite was put together (and the “why”, as gathered from software publications) you can come to your own conclusion as to which suites are right for you.

(Full disclosure, I am the lead developer of DifferentialEquations.jl. You will see at the end that DifferentialEquations.jl does offer pretty much everything from the other suite combined, but that’s no accident: our software organization came last and we used these suites as a guiding hand for how to design ours.)

Quick Summary Table

If you just want a quick summary, I created a table which has all of this information. You can find it here (click for PDF):

Comparison Of Differential Equation Solver Software

MATLAB’s Built-In Methods

Due to its popularity, let’s start with MATLAB’s built in differential equation solvers. MATLAB’s differential equation solver suite was described in a research paper by its creator Lawrence Shampine, and this paper is one of the most highly cited SIAM Scientific Computing publications. Shampine also had a few other papers at this time developing the idea of a “methods for a problem solving environment” or a PSE. The idea is pretty simple: users of a problem solving environment (the examples from his papers are MATLAB and Maple) do not have the same requirements as more general users of scientific computing. Instead of focusing on efficiency, the key for this group is to have a clear and neatly defined (universal) interface which has a lot of flexibility.

The MATLAB ODE Suite does extremely well at hitting these goals. MATLAB documents its ODE solvers very well, there’s a similar interface for using each of the different methods, and it tells you in a table in which cases you should use the different methods.

But the modifications to the methods go even further. Let’s take for example the classic ode45. That method just works and creates good plots, right? Well, Shampine added a little trick to it. When you solve an equation using ode45, the Runge-Kutta method uses a “free” interpolation to fill in some extra points. So between any two steps that the solver takes, it automatically adds in 4 extra points using a 4th order interpolation. This is because high order ODE solvers are good enough at achieving “standard user error tolerances” that they actually achieve quite large timesteps, and in doing so step too infrequently to make a good plot. Shampine’s scheme is a good quick fix to this problem which most people probably never knew was occurring under the hood!

There’s quite a bit of flexibility here. The methods allow you to use complex numbers. You’re given access to the “dense output function” (this is the function which computes the interpolations). There’s a few options you can tweak. Every one of these methods is setup with event handling, and there are methods which can handle differential-algebraic equations. There are also dde23 and ddesd for delay differential equations, and in the financial toolbox there’s an Euler-Maruyama method for SDEs.

While MATLAB does an excellent job at giving a large amount of easily available functionality, where it lacks is performance. There’s a few reasons for this. For one thing, these modifications like adding extra points to the solution array can really increase the amount of memory consumed if the ODE system is large! This actually has an impact in other ways. There’s a very good example of this in ode45. ode45 is based on the Dormand-Prince 5(4) pair. However, in 1999, the same year the MATLAB ODE Suite was published, Shampine released a paper with a new 5(4) pair which was more efficient than the Dormand-Prince method. So that begs the question, why wasn’t this used in the MATLAB ODE Suite (it’s clear Shampine knew about it!)? (I actually asked him in an email) The reason is because its interpolation requires calculating some extra steps, so it’s less efficient if you are ALWAYS interpolating. But since ode45 is always interpolating in order to make the plots look nicer, this would get in the way. Essentially, it can be more efficient, but MATLAB sets things up for nice plotting and not pure efficiency.

But there are other areas where more efficient methods were passed up during the development phase of the ODE suite. For example, Hairer’s benchmarks in his book Solving Ordinary Differential Equations I and II (the second is for stiff problems), along with the benchmarks from the Julia DifferentialEquations.jl suite, consistently show that high order Runge-Kutta methods are usually the most efficient methods for high accuracy solving of nonstiff ODEs. These benchmarks both consistently show that, for the same error, high order Runge-Kutta methods (like order >6) can solve the equation much faster than methods like Adams methods. But MATLAB does not offer high order Runge-Kutta methods and only offers ode113 (an Adams method) for high-accuracy solving.

Some of this is due to a limitation within MATLAB itself. MATLAB’s ODE solver requires taking in a user-defined function, and since this function is defined in MATLAB its function calls are very inefficient and expensive. Thus MATLAB’s ODE solver suite can become more efficient by using methods which reduce the number of function calls (which multistep methods do). But this isn’t the only case where efficient methods are missing. Both Hairer and the JuliaDiffEq benchmarks show that high-order Rosenbrock methods are the most efficient for low-medium accuracy stiff ODEs, but MATLAB doesn’t offer these methods. It does offer ode23s, a low-order Rosenbrock method, and ode15s which is a multistep method. ode15s is supposed to be the “meat and potatoes of the MATLAB Suite”, but it cannot handle all equations of since its higher order methods (it’s adaptive order) are not L-stable (and not even A-stable). For this reason there’s a few other low order SDIRK methods (ode23tb, an ESDIRK method for highly stiff problems) which are recommended to fill in the gaps, but none of the higher order variants which are known to be more efficient for many equations.

This pattern goes beyond the ODE solvers. The DDE solvers are all low order, and in the case of ddesd, it’s a low accuracy method which is fast for getting plots correct but not something which converges to many decimal places all too well since it doesn’t explicitly track discontinuities. This is even seen in the paper on the method which shows the convergence only matches dde23 to graphical accuracy on a constant-delay problem. Again, this fits in with the mantra of the suite, but may not hit all demographics. Shampine specifically made a separate version of ddesd for Fortran for people who are interested in efficiency, which is another way of noting that the key of ddesd is features and automatic usage, and not hardcore scientific computing efficiency. The mentioned SDE solver from the financial toolbox is only order 0.5 and thus requires quite a small dt to be accurate.

And I can keep going, but I think you get the moral of the story. This suite was created with one purpose in mind: to make it very easy to solve a wide array of differential equations and get a nice plot out. It does a very good job at doing so. But it wasn’t made with efficiency in mind, and so it’s missing a lot of methods that may be useful if you want high efficiency or high accuracy. Adding these wouldn’t make sense to the MATLAB suite anyways since it would clutter the offering. MATLAB is about ease of use, and not efficiency, and it does extremely well at what it set out to do. For software that was first made before the Y2K crisis with just a few methods added later, it still holds up very well.

Hairer’s Solvers (Fortran)

Next I want to bring up some Fortran solvers because they will come up later. Hairer’s Fortran solvers are a set of methods with similar interfaces that were designed with efficiency in mind. Many of these methods are classics: dopri5, dop853, radau, and rodas will specifically show up in many of the suites which are discussed later. These methods are not too flexible: they don’t allow event handling (though with enough gusto you can use the dense output to write your own), or numbers that aren’t double-precision floating point numbers (it’s Fortran). They have a good set of options for tweaking parameters to make the adaptive timestepping more efficient, though you may have to read a few textbooks to know exactly what they do. And that’s the key to them: they will solve an ODE, stiff or non-stiff, and they will do so pretty efficiently, but nothing more. But even then, they show their age, which keeps them from being “perfectly efficient”. These solvers include their own linear algebra routines which don’t multithread like standard BLAS and LINPACK implementations, meaning that they won’t make full use of modern CPU architectures. The computations don’t necessarily use SIMD or FMA. But most of all, to use it directly you need to use Fortran which would be a turn-off for many people.

There is some flexibility in the offering. There are symplectic solvers for second order ODEs, the stiff solvers allow for solving DAEs in mass matrix form, there’s a constant-lag nonstiff delay differential equation solver (RETARD), there is a fantastic generalization of radau to stiff state-dependent delay differential equations (RADAR5), and there’s some solvers specifically for some “mechanical ODEs” commonly found in physical problems. Of course, to get this all working you’ll need to be pretty familiar with Fortran, but this is a good suite to look at if you are.

ODEPACK and Netlib ODE Solvers (Fortran)

ODEPACK is an old set of ODE solvers which accumulated over time. You can learn a bit about its history by reading this interview with Alan Hindmarsh. I also bundle the Netlib suite together with it. There’s a wide variety of methods in there. There’s old Runge-Kutta methods due to Shampine, some of Shampine’s old multistep methods ddebdf and ddeabm, etc. The reason why this pack is really important though is because of the Lawrence Livermore set of methods, specifically LSODE and its related methods (LSODA, LSODR, VODE, etc.). It includes methods for implicit ODEs (DAEs) as well. These are a group of multistep methods which are descended from GEAR, the original BDF code. They are pretty low level and thus allow you to control the solver step by step, and some of them have “rootfinding capabilities”. This means that you can use these to add event handling, though an event handling interface will take some coding. There’s lots of options and these are generally pretty performant for a large array of problems, but they do show their age in the same way that the Hairer codes do. Their linear algebra setups do not make use of modern BLAS and LINPACK, which in practical terms means they don’t make full use of modern computer architectures to speed things up. They don’t always make use of SIMD and other modern CPU acceleration features. They are the vanilla ODE solvers which have existed through time. In particular, LSODA is popular because this is the most widely distributed method which automatically detects stiffness and switches between integration methods, though it should be pointed out that there is a performance penalty from this feature.

Sundials and ARKCODE (C++ and Fortran)

Sundials’ CVODE is a re-write of VODE to C which is a descendant of LSODE which is itself a descendant of the original GEAR multistep code. Yes, this has a long history. But you should think of it as “LSODE upgraded”: it makes use of modern BLAS/LINPACK, and also a bunch of other efficient C/Fortran linear solvers, to give a very efficient Adams and BDF method. Its solver IDA is like CVODE but handles implicit ODEs (DAEs). The interface for these is very similar to the ODEPACK interface, which means you can control it step by step and use the rootfinding capabilities to write your own event handling interface. Since the Adams methods handle nonstiff ODEs and the BDF methods handle stiff ODEs, this performance plus flexibility makes it the “one-stop-shop” for ODE solving. Many different scientific computing software internally are making use of Sundials because it can handle just about anything. Well, it does have many limitations. For one, this only works with standard C++ numbers and arrays. There’s no stochasticity or delays allowed. But more importantly, Hairer’s and DifferentialEquations.jl’s benchmarks show that these multistep methods are usually not the most efficient methods since high order Rosenbrock, (E)SDIRK, and fully implicit RK methods are usually faster for the same accuracy for stiff problems, and high order Runge-Kutta are faster than the Adams methods for the same accuracy. Multistep methods are also not very stable at their higher orders, and so at higher tolerances (lower accuracy) these methods may fail to have their steps converge on standard test problems (see this note in the ROBER tests), meaning that you may have to increase the accuracy (and thus computational cost) due to stiffness issues. But, since multistep methods only require a single function evaluation per step (or are implicit in only single steps), they are the most efficient for asymptotically hard problems (i.e. when the derivative calculation is very expensive or the ODE is a large 10,000+ system). For this reason, these methods excel at solving large discretizations of PDEs. To top it off, there are parallel (MPI) versions for using CVODE/IDA for HPC applications.

I also want to note that recently Sundials added ARKCODE, a set of Runge-Kutta methods. These include explicit Runge-Kutta methods and SDIRK methods, including additive Runge-Kutta methods for IMEX methods (i.e. you can split out a portion to solve explicitly so that the implicit portion is cheaper when you know your problem is only partly or semi stiff). This means that it covers the methods which I mentioned earlier are more efficient in many of the cases (though it is a bit lacking on the explicit tableaus and thus could be more efficient, but that’s just details).

If you are using C++ or Fortran and want to write to only one interface, the Sundials suite is a great Swiss Army knife. And if you have an asymptotically large problem or very expensive function evaluations, this will be the most efficient as well. Plus the parallel versions are cool! You do have to live with the limitations of low-level software forcing specific number and array types, along with the fact that you need to write your own event handling, but if you’re “hardcore” and writing in a compiled language this suite is a good bet.

SciPy’s Solvers (Python)

Now we come to SciPy’s suite. If you look at what it offers, the names should now start to sound familiar: LSODA, VODE, dopri5, dop853. That is right: SciPy is simply a wrapper over Hairer’s and ODEPACK’s methods. Because it writes a generic interface though, it doesn’t have the granularity that is offered by ODEPACK, meaning that you don’t have step-by-step control and no event handling. The tweaking options are very basic as well. Basically, they let you define a function in Python, say what timeframe to solve it on, give a few tolerances, and it calls Fortran code and spits back a solution. It only has ODE solvers, no differential-algebraic, delay, or stochastic solvers. It only allows the basic number types and does no event handling. Super vanilla, but gets the job done? As with the methods analysis, it has the high order Runge-Kutta methods for efficient solving of nonstiff equations, but it’s missing Rosenbrock and SDIRK methods entirely, opting to only provide the multistep methods.

I should note here that it has the same limitation as MATLAB though, namely that the user’s function is Python code. Since the derivative function is where the ODE solver spends most of its time (for sufficiently difficult problems), this means that even though you are calling Fortran code, you will lose out on a lot of efficiency. Still, if efficiency isn’t a big deal and you don’t need bells and whistles, this suite will do the basics.

deSolve Package (R)

There’s not much to say other than that deSolve is very much like SciPy’s suite. It wraps almost the same solvers, has pretty much the same limitations, and has the same efficiency problem since in this case it’s calling the user-provided R function most of the time. One advantage is that it does have event handling. Less vanilla with a little bit more features, but generally the same as SciPy.

PyDSTool (Python)

PyDSTool is an odd little beast. Part of the software is for analytic continuation (i.e. bifurcation plotting). But another part of it is for ODE solvers. It contains one ODE solver which is written in Python itself and it recommends against actually using this for efficiency reasons. Instead, it wraps some of the Hairer methods, specifically dopri5 and radau, and recommends these. But it’s different than SciPy in that it takes in the specification of the ODE as a string, and compiles it to a C function, and uses this inside the ODE solver. By doing so, it’s much more efficient. We still note that its array of available methods is small and it offers radau which is great for high accuracy ODEs and DAEs, but is not the most efficient at lower accuracy so it would’ve been nice to see Rosenbrock and ESDIRK methods. It has some basic event handling and methods for DDEs (again as wrappers to a Hairer method). This is a pretty good suite if you can get it working, though I do caution that getting the extra (non-Python) methods setup and compiled is nontrivial. One big point to note is that I find the documentation spectacularly difficult to parse. Together, it’s pretty efficient and has a good set of methods which will do basic event handling and solve problems at double precision.

JiTCODE and JiTCSDE (Python)

JiTCODE is another Python library which makes things efficient by compiling the function that the user provides. It uses the SciPy integrators and does something similar to PyDSTool in order to get efficiency. I haven’t tried it out myself but I’ll assume this will get you as efficient as though you used it from Fortran. However, it is lacking in the features department, not offering advanced things like arbitrary number types, event handling, etc. But if you have a vanilla ODE to solve and you want to easily do it efficiently in Python, this is a good option to look at.

Additionally, JiTCDDE is a version for constant-lag DDEs similar to dde23. JiTCSDE is a version for stochastic differential equations. It uses the high order (strong order 1.5) adaptive Runge-Kutta method for diagonal noise SDEs developed by Rackauckas (that’s me) and Nie which has been demonstrated as much more efficient than the low order and fixed timestep methods found in the other offerings. It employs the same compilation setup as JitCODE so it should create efficient code as well. I haven’t used this myself but it would probably be a very efficient ODE/DDE/SDE solver if you want to use Python and don’t need events and other sugar.

Boost ODEINT Solver Library (C++)

The Boost ODEINT solver library has some efficient implementations of some basic explicit Runge-Kutta methods (including high order RK methods) and some basic Rosenbrock methods (including a high order one). Thus it can be pretty efficient for solving the most standard stiff and nonstiff ODEs. However, its implementations do not make use of advanced timestepping techniques (PI-controllers and Gustafsson acceleration) which makes it require more steps to achieve the same accuracy as some of the more advanced software, making it not really any more efficient in practice. It doesn’t have event handling, but it is flexible with the number and array types you can put in there via C++ templates. It and DifferentialEquations.jl are the only two suites that are mentioned that allow for solving the differential equations on the GPU. Thus if you are familiar with templates and really want to make use of them, this might be the library to look at, otherwise you’re probably better off looking elsewhere like Sundials.

GSL ODE Solvers (C)

To say it lightly, the GSL ODE solvers are kind of a tragedy. Actually, they are just kind of weird. When comparing their choices against what is known to be efficient according to the ODE research and benchmarks, the methods that they choose to implement are pretty bizarre like extrapolation methods which have repeatedly been shown to not be very efficient, while not including other more important methods. But they do have some of the basics like multistep Adams and BDF methods. This, like Boost, doesn’t do all of the fancy PI-controlled adaptivity and everything though, so YMMV. This benchmark, while not measuring runtime and only using function evaluations (which can be very very different according to more sophisticated benchmarks like the Hairer and DifferentialEquations.jl ones!), clearly shows that the GSL solvers can take way too many function evaluations because of this and thus, since it’s using methods similar to LSODA/LSODE/dopri5, probably have much higher runtimes than they should.

Mathematica’s ODE Solvers

Mathematica’s ODE solvers are very sophisticated. It has a lot of explicit Runge-Kutta methods, including newer high order efficient methods due to Verner and Shampine’s efficient method mentioned in the MATLAB section. These methods can be almost 5x faster than the older high order explicit RK methods which themselves are the most efficient class of methods for many nonstiff ODEs, and thus these do quite well. It includes interpolations to make their solutions continuous functions that plot nicely. Its native methods are able to make full use of Mathematica and its arbitrary precision, but sadly most of the methods it uses are wrappers for the classic solvers. Its stiff solvers mainly call into Sundials or LSODA. By using LSODA, it tends to be able to do automatic stiffness detection by default. It also wraps Sundials’ IDA for DAEs. It uses a method of steps implementation over its explicit Runge-Kutta methods for solving nonstiff DDEs efficiently, and includes high order Runge-Kutta methods for stochastic differential equations (though it doesn’t do adaptive timestepping in this case). One nice feature is that all solutions come with an interpolation to make them continuous. Additionally, it can use the symbolic form of the user-provided equation in order to create a function for the Jacobian, and use that (instead of a numerical differentiation fallback like all of the previous methods) in the stiff solvers for more accuracy and efficiency. It also has symplectic methods for solving second order ODEs, and its event handling is very expressive. It’s very impressive, but since it’s using a lot of wrapped methods you cannot always make use of Mathematica’s arbitrary precision inside of these numerical methods. Additionally, its interface doesn’t give you control over all of the fine details of the solvers that it wraps.

Maple’s ODE Solvers

Maple’s ODE solvers are pretty basic. It defaults to a 6th order explicit RK method due to Verner (not the newer more efficient ones though), and for stiff problems it uses a high order Rosenbrock method. It also wraps LSODE like everything else. It has some basic event handling, but not much more. As another symbolic programming language, it computes Jacobians analytically to pass to the stiff solvers like Mathematica, helping it out in that respect, but its offerings pale in comparison to Mathematica.

FATODE (Fortran)

FATODE is a set of methods written in Fortran. It includes explicit Runge-Kutta methods, SDIRK methods, Rosenbrock methods and fully implicit RK methods. Thus it has something that’s pretty efficient for pretty much every case. What makes it special is that it includes the ability to do sensitivity analysis calculations. It can’t do anything but double-precision numbers and doesn’t have event handling, but the sensitivity calculations make it quite special. If you’re a FORTRAN programmer, this is worth a look, especially if you want to do sensitivity analysis.

DifferentialEquations.jl (Julia)

Okay, time for DifferentialEquations.jl. I left it for last because it is by far the most complex of the solver suites, and pulls ideas from many of them. While most of the other suites offer no more than about 15 methods on the high end (with most offering about 8 or fewer), DifferentialEquations.jl offers 200+ methods and is continually growing. Like the standard Python and R suites, it offers wrappers to Sundials, ODEPACK, and Hairer methods. However, since Julia code is always JIT compiled, its wrappers are more akin to PyDSTool or JiTCODE in terms of efficiency. Thus all of the standard methods mentioned before are available in this suite.

But then there are the native Julia methods. For ODEs, these include explicit Runge-Kutta methods, (E)SDIRK methods, and Rosenbrock methods. In each of these categories it has a huge amount of variety, offering pretty much every method from the other suites along with some unique methods. Some unique methods to point out are that it has the only 5th order Rosenbrock method, it has the efficient Verner methods discussed in the Mathematica part, it has newer 5th order methods (i.e. it includes the Bogacki-Shampine method discussed as an alternative to ode45’s tableau, along with an even newer tableau due to Tsitouras which is even more efficient). It has methods specialized to reduce interpolation error (the OwrenZen methods), and methods which are strong-stability preserving (for hyperbolic PDEs). By default it returns the solutions as continuous functions via a high order interpolation (though this can be turned off to make the saving more efficient). Each of these implementations employ a lot of extra tricks for efficiency. For example, the interpolation is “lazy”, meaning that if the method requires extra function evaluations for the continuous form, it will only do those extra calculations when the continuous function is used (so when you directly ask for it or when you plot). This is just a peek at the special things the library does to gain an edge.

The native Julia methods benchmark very well as well, and all of the benchmarks are openly available. Essentially, these methods make use of the native multithreading of modern BLAS/LINPACK, FMA, SIMD, and all of the extra little compiler goodies that allows code to be efficient, along with newer solver methods which theoretically reduce the amount of work that’s required to get the same error. They even allow you to tweak a lot of the internals and swap out the linear algebra routines to use parallel solvers like PETSc. The result is that these methods usually outperform the classic C/Fortran methods which are wrapped. Additionally, it has ways to symbolically calculate Jacobians like Mathematica/Maple, and instead of defaulting to numerical differentiation the stiff solvers fall back to automatic differentiation which is more efficient and has much increased accuracy.

Its event handling is the most advanced out of all of the offerings. You can change just about anything. You can make your ODE do things like change its size during the solving if you want, and you can make the event handling adjust and adapt internal solver parameters. It’s not a hyperbole to say that the user is given “full control” since the differential equation solvers themselves are written as essentially a method on the event handling interface, meaning that anything it can do internally you can do.

The variety of methods is also much more diverse than the other offerings. It has symplectic integrators like Hairer’s suite, but has more high and low order methods. It has a range of Runge-Kutta Nystrom methods for efficiently solving second order ODEs. It has the same high order adaptive method for diagonal noise SDEs as JiTCSDE, but also includes high order adaptive methods specifically for additive noise SDEs. It also has methods for stiff SDEs in Ito and Stratonovich interpretations, and allows for event handling in the SDE case (with the full flexibility). It has DDE solvers for constant-lag and state-dependent delays, and it has stiff solvers for each of these cases. The stiff solvers also all allow for solving DAEs in mass matrix form (though fully implicit ODEs are possible, but can only be solved using a few methods like a wrapper to Sundials’ IDA and doesn’t include event handling here quite yet).

It allows arbitrary numbers and arrays like Boost. This means you can use arbitrary precision numbers in the native Julia methods, or you can use “array-like” objects to model multiscale biological organisms instead of always having to use simple contiguous arrays. It has addons for things like sensitivity analysis and parameter estimation. Also like Boost, it can solve equations on the GPU by using a GPUArray.

It hits the strong points of each of the previously mentioned suites because it was designed from the get-go to do so. And it benchmarks extremely well. The only downside is that, because it is so feature and performance focused, its documentation is heavy. The beginning tutorial will give you the basics (making it hopefully as easy as MATLAB to pick up), but pretty much every control knob is available, making the extended portion of the documentation a long read.

Conclusion

Let’s take a step back and summarize this information. DifferentialEquations.jl objectively has the largest feature-set, swamping most of the others while wrapping all of the common solvers. Since it also features solvers for the non-ordinary differential equations and its unique Julia methods also benchmarks well, I think it’s clear that DifferentialEquations.jl is by far the best choice for “power-users” who are looking for a comprehensive suite.

As for the other choices from scripting languages, MATLAB wasn’t designed to have all of the most efficient methods, but it’ll handle basic equations with delays and events and output good plots. R’s deSolve is similar in most respects to MATLAB. SciPy’s offering is lacking in comparison to MATLAB and R’s due to the lack of event handling. But MATLAB/Python/R all have efficiency problems due to the fact that the user’s function is written in the scripting language. JiTCODE and PyDSTool are two Python offerings that make the interface to the Fortran solvers more efficient than straight SciPy. Mathematica and Maple will do symbolic pre-calculations to speed things up and can JiT compile functions, along with offering pretty good event handling, and thus their wrappers are more like DifferentialEquations.jl in terms of flexibility and efficiency (and Mathematica had a few non-wrapper goodies mentioned as well). So in a pinch when not all of the bells and whistles are necessary, each of these scripting language suites will get you by. Behind DifferentialEquations.jl, I would definitely put Mathematica’s suite second for scripting languages with everything else much further behind.

If you’re already hardcore and writing in C++/Fortran, Sundials is a pretty good “one-stop-shop” to get everything you need, especially when you add in ARKCODE. Still, you’re going to have to write a lot of stuff yourself to make the rootfinding into an event handling interface, but if you put the work in this will do you well. Hairer’s codes are a great set of classics which cover a wide variety of equations, and FATODE is the only non-DifferentialEquations.jl suite which offers a way to calculate sensitivity equations (and its sensitivity equations are more advanced). Any of these will do you well if you want to really get down-and-dirty in a compiled language and write a lot of the interfaces yourself, but they will be a sacrifice in productivity with no clear performance gain over the scripting language methods which also include some form of JIT compilation. With these in mind, I don’t really see a purpose for the GSL or Boost suites, and the ODEPACK methods are generally outdated.

I hope this review helps you make a choice which is right for you.

The post A Comparison Between Differential Equation Solver Suites In MATLAB, R, Julia, Python, C, Mathematica, Maple, and Fortran appeared first on Stochastic Lifestyle.

Julia calling C: A more minimal example

Earlier I presented a minimal example of Julia calling C. It mimics how one would go about writing C code, wrapping it in a library, and then calling it from Julia. Today I came across an even more minimal way of doing that while reading an excellent blog post on Julia’s syntactic loop fusion. Associated with the blog post was a notebook that explores the matter further.

Basically, you can write your C in a string and pass it directly to the compiler. It goes something like

using Libdl  # needed for Libdl.dlext on Julia 0.7 and later; on older versions Libdl is available by default

C_code = """
       double mean(double a, double b) {
         return (a+b) / 2;
       }
       """
const Clib = tempname()
# pipe the C code into gcc and compile it straight into a shared library
open(`gcc -fPIC -O3 -xc -shared -o $(Clib * "." * Libdl.dlext) -`, "w") do f
     print(f, C_code)
end

The tempname function generates a unique temporary file path. On my Linux system Clib will be a string like "/tmp/juliaivzRkT". That path is used to generate a library name "/tmp/juliaivzRkT.so" which will then be used in the ccall:

julia> x=ccall((:mean,Clib),Float64,(Float64,Float64),2.0,5.0)
3.5

This approach would not be recommended if you are writing anything sophisticated in C. However, it is fun to experiment with for short bits of C code that you might like to call from Julia. It saves you the hassle of creating a Makefile, compiling, etc…