My Julia workflow

By: Tamás K. Papp

Re-posted from: https://tpapp.github.io/post/julia-workflow/

This is a summary of the workflow I find ideal for working with Julia. Although the manual has a section on workflow, it does not mention all the tools that I find useful, so perhaps this will benefit some users of Julia.

I use Emacs, with julia-mode (for editing the source) and julia-repl (REPL integration). The latter is my own package; you can use ESS instead, which has some advantages (eg multiple inferior processes) and disadvantages (no ANSI terminal support). The choice of an editor is highly subjective: at the end of the day, all you need is one that is capable of sending code to the REPL and can, in turn, be used by the REPL to open a file at a particular point. I use

ENV["EDITOR"] = "emacsclient"

in ~/.juliarc.jl to ensure this. This helps me find code with the @edit macro.
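For example, to jump to a method definition from the REPL (any method works; cos is just an illustration):

@edit cos(1.0)   # opens the file defining this method in emacsclient
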

Small code snippets and experiments (below ~30 lines) just go into files, from which I send regions of code to the REPL. For throwaway code, I frequently just open a file in /tmp/, which gets removed automatically after the next reboot.

Even very small projects get their own package. This way I get version control[1] and a sensible structure for unit tests set up automatically. I put my own packages in their own directory, keeping them separate from Pkg.dir(). This allows me to use the same package across Julia versions, and makes Pkg.update() ignore them. I tell Julia where they are with

const LOCAL_PACKAGES = expanduser("~/src/julia-local-packages/")
push!(LOAD_PATH, LOCAL_PACKAGES)

I create local packages with

import PkgDev
PkgDev.generate("MyPkg", "MIT"; path = joinpath(LOCAL_PACKAGES, "MyPkg"))

Then I open the file and start working on it with

using MyPkg

I use Revise.jl to automate reloading.[2] This package has changed my workflow completely; it can cope with most changes, except for type redefinitions. For these, I need to restart the REPL.
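One way to enable it in a session (Revise tracks packages loaded after it, per its documentation) is simply:

using Revise   # load before the packages you want tracked
using MyPkg
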

To test my code, I use Pkg.test with RoguePkg.jl, which lets it find packages outside Pkg.dir() for testing and benchmarks:

Pkg.test(pkg_for"MyPkg")

  [1] I use the amazing magit for interacting with git; having recently obtained funding on Kickstarter, it is bound to become even more convenient.
  [2] You just need to set it up once according to its documentation; after that, it is automatic.

PyData Warsaw 2017 example

By: Bogumił Kamiński

Re-posted from: http://juliasnippets.blogspot.com/2017/10/pydata-warsaw-2017-example.html

Following several requests, in this post I present the Asian option pricing example that I discussed during PyData Warsaw 2017 (after the first day of talks, I highly recommend everyone come tomorrow, as it is an excellent event).

The problem is taken from the book Foundations and Methods of Stochastic Simulation by Barry Nelson, where it is solved using VBA in section 4.5.

I will not repeat the whole discussion of the model, but focus on the numerical aspect. We are asked to calculate the value of the expression

E[ e^(−rT) · max(X̄ − K, 0) ],    where X̄ = (1/T) ∫₀ᵀ X(t) dt,

and X is generated by a continuous state geometric Brownian motion. We will use Monte Carlo simulation to approximate the above expected value.

In a single pass of the Monte Carlo simulation, the time average is approximated by a discrete sum:

X̄ ≈ (1/m) Σᵢ X(iT/m),    i = 1, …, m.

The parameters we will use in the simulation are: T=1, r=0.05, K=55, σ=0.3, m=1000, and the starting value of X at time 0 is 50. We will run 100,000 replications of the Monte Carlo simulation and calculate the average.

I want to compare five implementations of the above problem:

  • Julia using loops
  • Julia using vectorized code
  • Python using loops
  • Numba using loops
  • NumPy using vectorized code
First come the Julia codes (running time is given in a comment at the end):
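(The listings below are my reconstruction under the standard GBM discretization, not the original ones; the names asian_loop and asian_vec are mine.)

# Loop version: simulate each path with a scalar recursion.
function asian_loop(T, r, K, σ, X0, m, numsim)
    Δ = T / m
    payoff = 0.0
    for _ in 1:numsim
        X = X0
        s = 0.0
        for _ in 1:m
            X *= exp((r - σ^2 / 2) * Δ + σ * sqrt(Δ) * randn())
            s += X
        end
        payoff += max(s / m - K, 0.0)
    end
    exp(-r * T) * payoff / numsim
end

using Random  # for randn! (part of Base on Julia 0.6)

# Vectorized version: draw all m shocks for a path at once and
# update a preallocated buffer in place to avoid extra allocation.
function asian_vec(T, r, K, σ, X0, m, numsim)
    Δ = T / m
    z = zeros(m)
    payoff = 0.0
    for _ in 1:numsim
        randn!(z)
        z .= (r - σ^2 / 2) * Δ .+ σ * sqrt(Δ) .* z  # log-increments
        cumsum!(z, z)                               # cumulative log-path
        payoff += max(X0 * sum(exp, z) / m - K, 0.0)
    end
    exp(-r * T) * payoff / numsim
end

asian_loop(1.0, 0.05, 55.0, 0.3, 50.0, 1000, 100_000)  # estimated option value

Note the in-place randn!, broadcast assignment, and cumsum! in the vectorized variant; these are the in-place operations referred to below.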
And here are the codes in Python:
The Julia implementation is a bit fancier, as it promotes the arguments to a common type at run time, while in Python I pass all values as floats (which is simpler). Also, the vectorized Julia code uses in-place updating. In general, I tried to write each implementation as normal code in the respective language.

The key take-aways are the following:

  • standard Python is a no-go solution for this problem;
  • loops in Julia are the fastest;
  • somewhat surprisingly, vectorized Julia code is faster than Numba, although the former has to allocate more memory;
  • the NumPy implementation is around three times slower than vectorized Julia;
  • vectorized Julia code hugely benefits from in-place operations (which avoid memory allocation); however, even without these optimizations it was faster than Numba.
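
To measure running times like these in Julia, BenchmarkTools is the standard tool (shown here with the hypothetical asian_loop from the sketch above):

using BenchmarkTools
@btime asian_loop(1.0, 0.05, 55.0, 0.3, 50.0, 1000, 100_000)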

Switching from Common Lisp to Julia

By: Tamás K. Papp

Re-posted from: https://tpapp.github.io/post/common-lisp-to-julia/

I have written this post for developers in the Common Lisp community
who asked why I am switching to Julia. It may only be relevant for the
small set of people who use Common Lisp for scientific computing.

I used Common Lisp for scientific computing for a while, from 2008 to
about 2015, in combination with R and C++. This choice may surprise
people who don’t know about projects like Maxima or FEMLISP, but
Common Lisp is not a bad language for scientific computing: it has a
great FFI, compilers like SBCL can generate very fast code with a few
hints, and the language itself is composed of convenient features that
interact nicely.

However, around 2012 I started to become very frustrated with Common
Lisp. Despite various attempts, it became very clear that libraries
for scientific computing were not going to take off: there were many
one-person efforts (including mine), but very few of
them evolved into general tools.

Initially, I was puzzled by this: Common Lisp is an extremely
convenient and productive language. Experienced Lisp hackers can write
very complex, fast, and elegant libraries in reasonably short
time. Why did this not happen for numerical code?

The problem with Common Lisp

Now I think that one of the main reasons for this is that while you
can write scientific code in CL that will be (1) fast, (2) portable,
and (3) convenient, you cannot do all of these at the same time.
Arrays provide a convenient example of this.

Consider

(make-array 5 :element-type 'double-float)

The standard does not guarantee that this gives you an array of
double-float: it may (if the implementation provides such specialized
arrays); otherwise you get an array of element type T. This turned out
to be a major difficulty for implementing portable scientific code in
Common Lisp.

However, this gets worse: while you can tell a function that operates
on arrays that these arrays have element type double-float, you
cannot dispatch on this, as Common Lisp does not have parametric
types. For example, if you want to write a sum as

(defmethod mysum ((vec vector))
  (let ((s 0))
    (loop for elt across vec
       do (incf s elt))
    s))

you can dispatch on the argument being a vector, but not on the
element type, so the compiled code typically remains generic
(operating on boxed values).

You can of course branch on the array element types and maybe
even paper over the whole mess with sufficient macrology (which is
what LLA ended up doing), but this
approach is not very extensible, as eventually you end up hardcoding a
few special types for which your functions will be "fast", otherwise
they have to fall back to a generic, boxed type. With multiple
arguments, the number of combinations explodes very quickly.

How Julia solves this problem

A comparable native implementation in Julia would be[1]:

function mysum(vec::AbstractVector{T}) where T
    s = zero(T)
    for elt in vec
        s += elt
    end
    s
end

This is still generic: it works for all subtypes of AbstractVector
(including vectors and vector-like objects), but notice how the generated code
is conditional on the element type:

julia> @code_warntype mysum([1, 2, 3])
Variables:
  #self#::#mysum
  vec::Array{Int64,1}
  elt::Int64
  #temp#::Int64
  s::Int64

Body:
  begin 
      s::Int64 = 0 # line 3:
      #temp#::Int64 = 1
      4: 
      unless (Base.not_int)((#temp#::Int64 === (Base.add_int)((Base.arraylen)(vec::Array{Int64,1})::Int64, 1)::Int64)::Bool)::Bool goto 14
      SSAValue(2) = (Base.arrayref)(vec::Array{Int64,1}, #temp#::Int64)::Int64
      SSAValue(3) = (Base.add_int)(#temp#::Int64, 1)::Int64
      elt::Int64 = SSAValue(2)
      #temp#::Int64 = SSAValue(3) # line 4:
      s::Int64 = (Base.add_int)(s::Int64, elt::Int64)::Int64
      12: 
      goto 4
      14:  # line 6:
      return s::Int64
  end::Int64

julia> @code_warntype mysum([1.0, 2.0, 3.0])
Variables:
  #self#::#mysum
  vec::Array{Float64,1}
  elt::Float64
  #temp#::Int64
  s::Float64

Body:
  begin 
      s::Float64 = (Base.sitofp)(Float64, 0)::Float64 # line 3:
      #temp#::Int64 = 1
      4: 
      unless (Base.not_int)((#temp#::Int64 === (Base.add_int)((Base.arraylen)(vec::Array{Float64,1})::Int64, 1)::Int64)::Bool)::Bool goto 14
      SSAValue(2) = (Base.arrayref)(vec::Array{Float64,1}, #temp#::Int64)::Float64
      SSAValue(3) = (Base.add_int)(#temp#::Int64, 1)::Int64
      elt::Float64 = SSAValue(2)
      #temp#::Int64 = SSAValue(3) # line 4:
      s::Float64 = (Base.add_float)(s::Float64, elt::Float64)::Float64
      12: 
      goto 4
      14:  # line 6:
      return s::Float64
  end::Float64

I mentioned "vector-like objects" above, since I can choose different
representations for special objects. For example, to do calculations
with a vector of 1s, I can define

struct Ones{T <: Number} <: AbstractVector{T}
    len::Int
end

At this point, in order to calculate the sum above, I have two choices:

  1. Implement the relevant interface, with functions like

    Base.size(x::Ones) = (x.len,)

    and similarly for element access, etc. This would generate specialized, reasonably efficient code for the method above, but it would still iterate over the "elements" (see the sketch after this list).

  2. In addition, I can define

    mysum(vec::Ones{T}) where T = vec.len * one(T)

    which would provide a method for mysum.
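
A quick check of both choices (the getindex definition is my illustrative addition; size and getindex are what the documented AbstractArray interface requires):

Base.getindex(x::Ones{T}, i::Int) where T = one(T)  # every "element" is one(T)

mysum(Ones{Float64}(4))  # 4.0 via the generic loop; once the specialized
                         # method is defined, the same call skips iteration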

A sufficiently rich parametric type system with multiple dispatch
integrated into the language and supported by a JIT compiler is the
secret weapon of Julia. Most of the time, you don’t have to do
anything, as specialization happens automatically for concrete
types. Sometimes, you have to help the compiler a bit, by writing code
where the result is type stable, ie the result type depends only on
the type (not the value) of the arguments and can be inferred by the
compiler. And sometimes you have to be careful not to mess up type
inference: for example, the zero(T) above gives a 0 of type T,
ensuring a result type that does not change during the summation.
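
A standard illustration of type (in)stability, adapted from the Julia
manual rather than the post: the first function returns an Int for
negative input and a Float64 otherwise, while the second always
matches the input type:

pos_unstable(x) = x < 0 ? 0 : x            # 0 is an Int even when x is a Float64
pos_stable(x) = x < zero(x) ? zero(x) : x  # result type equals typeof(x)

@code_warntype pos_unstable(-1.5)  # flags the Union{Float64,Int64} return type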

Comparison of other language features

While I would say that multiple dispatch with parametric types
designed into the language from the ground up is the most important
feature of Julia, there are other language features worth comparing to
Common Lisp.

Metaprogramming is supported. Because of infix syntax, the AST is not
as simple as S-expressions, but the tools for working with it are
evolving fast. That said, I don’t write as many macros as I did in
Common Lisp. Parametric types are so powerful that I rarely need
macros for performance reasons, and instead of syntax extensions, I
often go for zero-cost abstraction via functions and wrapper types. An
interesting metaprogramming tool in Julia is generated functions,
which allow code generation based on argument types; I use these
frequently. The equivalent of reader macros in Julia is called
non-standard string literals.
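
As a toy illustration of a generated function (my example, not the
post's): the body is computed once per argument type, here unrolling a
sum over the fields of a struct:

struct Point
    x::Float64
    y::Float64
end

@generated function fieldsum(p)
    terms = [:(p.$f) for f in fieldnames(p)]  # inside @generated, p is the type
    :(+($(terms...)))
end

fieldsum(Point(1.0, 2.0))  # 3.0, with the field access unrolled at compile time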

The foreign function interface of Julia is seamlessly integrated into
the language and very convenient to use. Docstrings are almost the
same as in Common Lisp, but they support Markdown. Strings are UTF8 by
default, and very fast. The
community is very vibrant, open,
and helpful. Simple questions get an answer within minutes,
complicated ones (eg compiler internals) may take a bit longer, but
are usually answered within a few hours or a day at most. If you are
coming from the Common Lisp community, you will see quite a few
familiar faces.
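
Returning to the FFI for a moment, the classic one-liner from the
Julia manual calls C's cos from libm with no wrapper code:

ccall((:cos, "libm"), Float64, (Float64,), 1.0)  # ≈ 0.5403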

The library ecosystem already surpasses that of Common Lisp, at least
for scientific programming. High-quality, well-tested code is
available for linear algebra including sparse matrices (most of it in
the standard library), optimization, differential equations, and
automatic differentiation. The
latter is simply
amazing: by providing a type for dual numbers and a few operations,
forward-mode AD can be used without any implementation
overhead. Plotting libraries are available (mostly using foreign
function backends), and R-like "dataframes" are under development.
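
For example, with ForwardDiff.jl (a one-line sketch of what this
enables):

using ForwardDiff
ForwardDiff.derivative(x -> x^2 + sin(x), 1.0)  # 2 + cos(1) ≈ 2.5403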

Conclusion

Common Lisp has been and remains a great language, with many excellent
features that preceded other languages by decades. It has an ANSI
standard, which means that portable code written decades ago
will run on a recent implementation. This is a great advantage, but at
the same time this freezes the language development at the point the
standard was finalized. No matter how flexible and forward-looking it
is, it cannot predict and accommodate all possible advances for
decades.

In contrast, Julia is rapidly evolving. At this stage, code that was
written half a year ago is very likely to be broken with the most
recent release.[2] The pace of change will most likely slow down a bit after 1.0 is released, but for now, expect a lot of churn.

On the other hand, programmers who used Common Lisp for scientific
computing have always expected to get their hands dirty, since so
little existing code was available. This is a good time to consider
investing into Julia instead: you will get more done with less work,
and you still get to program in a very elegant language that traces a
lot of its roots to the Lisp family.


  [1] This is not the fastest, nor the most precise implementation, just a comparable example.
  [2] An elegant deprecation mechanism is available, but it can’t deal with some fundamental language changes.