Author Archives: Julia Computing, Inc.

Unums.jl: implementing a custom number format in Julia

Unums

The unum is a floating point format proposed by John Gustafson as an alternative to the now ubiquitous IEEE 754 formats. The proposal and justification are explained in his somewhat ambitiously titled book The End of Error; his slides on the topic also give a good overview.

The two defining features of the unum format are:

  • a variable-width storage format for both the significand and exponent, and
  • a “ubit”, which determines whether the unum corresponds to an exact number (u=0), or an open interval between consecutive exact unums (u=1). In this way, the unums cover the entire extended real number line [-∞,+∞].

For performing computation with the format, Gustafson proposes using interval arithmetic with a pair of unums, which he calls a ubound. This guarantees that the resulting interval contains the exact solution (he also suggests a more complicated ubox method, but I won’t discuss this further).

Unums.jl

The Unums.jl package provides an implementation of the unum format and ubound arithmetic in Julia.

New instances can be constructed from existing numbers:

julia> x = Unum22(1)
Unums.Unum{2,2,UInt16}
1.0

julia> y = Unum22(2.1)
Unums.Unum{2,2,UInt16}
(2.0,2.125)

As 2.1 (or, more correctly, the Float64 nearest 2.1) cannot be represented as an exact Unum, the resulting Unum represents an open interval.

The low-level bit representation of a unum is similar to that of an IEEE 754 floating point number, but with extra fields for the ubit and the sizes of the exponent and fraction:

|sign|exponent|fraction|ubit|exponent size|fraction size|

This is available via the print_bits function:

julia> Unums.print_bits(x)
|0|01|0|0|01|00|

julia> Unums.print_bits(y)
|0|1|0000|1|00|11|
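
To make the layout concrete, here is a minimal sketch that decodes the printed fields by hand. The helper decode_unum_bits is hypothetical and not part of Unums.jl. It assumes the conventions of Gustafson's original proposal: the two size fields store the width minus one, the exponent bias is 2^(es-1) - 1, and a set ubit means the open interval up to the next exact unum of the same width. Infinities and NaN are not handled.

```julia
# Hypothetical helper (not part of Unums.jl): decode the text printed by
# print_bits, |sign|exponent|fraction|ubit|exponent size|fraction size|.
# Returns the (lower, upper) end points; they coincide for an exact unum.
function decode_unum_bits(s::AbstractString)
    fields = split(strip(s, '|'), '|')
    length(fields) == 6 || throw(ArgumentError("expected six |-separated fields"))
    sgn, expo, frac, ubit, esize, fsize = fields
    es = parse(Int, esize; base=2) + 1      # exponent width in bits (assumed: stored as width - 1)
    fs = parse(Int, fsize; base=2) + 1      # fraction width in bits (assumed: stored as width - 1)
    (length(expo) == es && length(frac) == fs) ||
        throw(ArgumentError("field widths disagree with the size fields"))
    e = parse(Int, expo; base=2)
    f = parse(Int, frac; base=2)
    bias = 2^(es - 1) - 1
    scale = (e == 0 ? 1 : e) - bias         # subnormals share the smallest exponent
    hidden = e == 0 ? 0 : 1                 # hidden leading bit of the significand
    mag = ldexp(hidden + f / 2^fs, scale)
    ulp = ldexp(1.0, scale - fs)            # gap to the next exact unum at this width
    lo, hi = ubit == "1" ? (mag, mag + ulp) : (mag, mag)
    return sgn == "1" ? (-hi, -lo) : (lo, hi)
end

decode_unum_bits("|0|01|0|0|01|00|")    # (1.0, 1.0)
decode_unum_bits("|0|1|0000|1|00|11|")  # (2.0, 2.125)
```

Applied to the two strings printed above, this reproduces the exact value 1.0 and the open interval (2.0,2.125).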

Standard arithmetic operations are defined, and Unums will automatically be promoted to Ubounds:

julia> x+y
Unums.Ubound{2,2,UInt16}
(3.0,3.125)

julia> x/y
Unums.Ubound{2,2,UInt16}
(0.9375,1.0)

One of Julia’s most powerful features is its generic methods. In particular, there are several linear algebra routines, such as LU and QR factorizations, which can be applied to an array of any numeric type, such as Quaternions, simply by defining the necessary elementary operations (+, -, *, / and sqrt) and a few simple methods such as abs and copysign.
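
As a small illustration of the same idea that does not depend on Unums.jl, the sketch below runs the generic LU factorization on a matrix of exact rationals. It is written for current Julia (1.x), where the routines live in the LinearAlgebra standard library; the matrix A is just an arbitrary example.

```julia
using LinearAlgebra

# Exact rationals stand in for any user-defined numeric type here.
A = Rational{Int}[2 1 1; 4 3 3; 8 7 9]

F = lu(A)                       # the same generic routine, now with Rational elements

# Rationals do not round, so the factors reproduce the row-permuted input exactly.
println(eltype(F.L))
println(F.L * F.U == A[F.p, :])
```

No rational-specific factorization code is involved: the routine only needs the arithmetic and comparison methods that Rational already defines.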

We can make use of them here with Unums:

julia> X = map(Unum34,randn(10,6))
10x6 Array{Unums.Unum{3,4,UInt64},2}:
  (0.76873779296875,0.7687454223632812)       …  (-1.520599365234375,-1.5205841064453125)
  (1.505889892578125,1.5059051513671875)          (1.6515960693359375,1.651611328125)
 (-0.21679306030273438,-0.21679115295410156)      (0.640228271484375,0.6402359008789062)
  (0.3451271057128906,0.34513092041015625)       (-0.313323974609375,-0.3133201599121094)
  (0.3553314208984375,0.3553352355957031)        (-1.01434326171875,-1.0143280029296875)
  (2.5906982421875,2.590728759765625)         …  (-0.7313995361328125,-0.7313919067382812)
  (0.5986404418945312,0.5986480712890625)         (0.007965564727783203,0.007965683937072754)
  (0.4354515075683594,0.435455322265625)         (-1.214202880859375,-1.2141876220703125)
 (-1.7028656005859375,-1.702850341796875)         (0.68011474609375,0.6801223754882812)
 (-0.9994583129882812,-0.99945068359375)         (-1.211334228515625,-1.2113189697265625)

julia> Q,R = qr(X)
(
10x6 Array{Unums.Ubound{3,4,UInt64},2}:
 (-0.203277587890625,-0.2031707763671875)      …   (0.5075531005859375,0.8198089599609375)
 (-0.39812469482421875,-0.3980598449707031)       (-0.13979148864746094,0.17265892028808594)
  (0.057305335998535156,0.057314395904541016)     (-0.3092536926269531,-0.1703510284423828)
 (-0.09124469757080078,-0.09122943878173828)      (-0.062105655670166016,0.12235355377197266)
 (-0.09394264221191406,-0.09392642974853516)       (0.016231060028076172,0.21294784545898438)
 (-0.6849288940429688,-0.684814453125)         …  (-0.4050254821777344,-0.11310958862304688)
 (-0.15826988220214844,-0.15824127197265625)      (-0.26529693603515625,-0.04947376251220703)
 (-0.11512374877929688,-0.11510562896728516)       (0.2563362121582031,0.3899116516113281)
  (0.4501228332519531,0.4501953125)               (-0.3648338317871094,-0.1708221435546875)
  (0.264190673828125,0.2642326354980469)           (0.362701416015625,0.576568603515625)     ,

6x6 Array{Unums.Ubound{3,4,UInt64},2}:
 (-3.782867431640625,-3.78271484375)  …   (0.4373779296875,0.4379119873046875)
   0.0                                   (-1.2593994140625,-1.256591796875)
   0.0                                    (0.31097412109375,0.31465911865234375)
   0.0                                    (1.9207763671875,1.9372100830078125)
   0.0                                   (-0.14617156982421875,-0.10152435302734375)
   0.0                                …  (-2.2342529296875,-2.19671630859375)       )

We can check that the matrix reconstructed from the factors contains the original matrix X:

julia> Y = Q*R
10x6 Array{Unums.Ubound{3,4,UInt64},2}:
  (0.7685317993164062,0.7689743041992188)   …  (-1.89501953125,-1.15325927734375)
  (1.5057373046875,1.5060577392578125)          (1.2778778076171875,2.025543212890625)
 (-0.2168140411376953,-0.2167682647705078)      (0.4716682434082031,0.81317138671875)
  (0.3450927734375,0.3451690673828125)         (-0.5455322265625,-0.08253669738769531)
  (0.35529327392578125,0.3553733825683594)     (-1.248748779296875,-0.7854843139648438)
  (2.5904541015625,2.59100341796875)        …  (-1.071014404296875,-0.38515472412109375)
  (0.5985794067382812,0.5987167358398438)      (-0.2518424987792969,0.2725715637207031)
  (0.4354095458984375,0.4355010986328125)      (-1.37896728515625,-1.0521697998046875)
 (-1.703033447265625,-1.7026824951171875)       (0.4458961486816406,0.9188003540039062)
 (-0.9995574951171875,-0.9993515014648438)     (-1.4760894775390625,-0.9517135620117188)

julia> all(map(in,X,Y))
true

Unfortunately, as with any interval arithmetic approach, the problem is that the intervals grow as the number of operations increases. In this case, as each column of the orthogonal matrix Q depends on the previous ones, the widths increase with column number:

julia> W = map(Unums.width,Q)
10x6 Array{Float64,2}:
 0.000106812  0.000665665  0.00137138   0.00675392  0.0597467  0.312256
 6.48499e-5   0.000686646  0.00142288   0.0072937   0.0620613  0.31245
 9.05991e-6   0.000112534  0.000808716  0.00331306  0.0285912  0.138903
 1.52588e-5   0.000366211  0.000694275  0.00553513  0.0377731  0.184459
 1.62125e-5   0.000196457  0.00105286   0.00414467  0.0401134  0.196717
 0.000114441  0.000652313  0.00124741   0.00605774  0.0500031  0.291916
 2.86102e-5   0.000366211  0.000835419  0.00431061  0.0451851  0.215823
 1.81198e-5   0.000141144  0.000701904  0.00354385  0.0270338  0.133575
 7.24792e-5   0.000396729  0.000806808  0.00379539  0.0399399  0.194012
 4.19617e-5   0.000411987  0.000822067  0.00435543  0.0425873  0.213867

julia> using Gadfly

julia> spy(W, Scale.color_log10)

Conclusions

I think the key lesson here is that although interval arithmetic is an extremely powerful
tool, intervals cannot be used as simple substitutes for numbers. John Gustafson proposes
using fused operations for polynomial evaluation and dot products, and it should be possible
to test these ideas out in Julia, for example by overloading the dot function,
or via macros to detect reused bindings in expressions such as x*(2.0+x).
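
The effect of a reused binding can be seen with a toy interval type. This is only a sketch: the Interval type and the square helper are made up for illustration, there is no outward rounding, and the rewrite shown is ordinary algebra rather than Gustafson's fused operations.

```julia
# Toy closed interval over Float64. A real implementation would also round
# the lower bound down and the upper bound up; that is omitted here.
struct Interval
    lo::Float64
    hi::Float64
end

Base.:+(c::Real, a::Interval) = Interval(c + a.lo, c + a.hi)
Base.:-(a::Interval, c::Real) = Interval(a.lo - c, a.hi - c)

function Base.:*(a::Interval, b::Interval)
    products = (a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi)
    return Interval(minimum(products), maximum(products))
end

# Squaring knows both factors are the same number, so it never goes negative.
function square(a::Interval)
    lo2, hi2 = a.lo^2, a.hi^2
    return a.lo <= 0 <= a.hi ? Interval(0.0, max(lo2, hi2)) :
                               Interval(min(lo2, hi2), max(lo2, hi2))
end

x = Interval(-1.0, 1.0)

naive = x * (2.0 + x)            # the two occurrences of x are treated as independent
fused = square(1.0 + x) - 1.0    # same polynomial, (x + 1)^2 - 1, with x used once

println("naive: [", naive.lo, ", ", naive.hi, "]")   # naive: [-3.0, 3.0]
println("fused: [", fused.lo, ", ", fused.hi, "]")   # fused: [-1.0, 3.0]
```

Both results contain the true range [-1, 3] of x*(2+x) on [-1, 1], but only the second is tight, because the naive product cannot know that both factors refer to the same x.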

Fully leveraging the power of interval arithmetic does require a slightly different mindset, however,
using specialised algorithms such as interval Newton’s method.
Unfortunately, such approaches need to be designed from scratch, and as a result we can no
longer fully leverage Julia’s generic functions.
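
For a flavour of what such a specialised algorithm looks like, here is a minimal sketch of the interval Newton iteration for f(x) = x^2 - 2 on [1, 2]. The Interval type and function names are illustrative only, and the arithmetic is plain Float64 without outward rounding, so the strict enclosure guarantee is lost in the last few bits.

```julia
struct Interval
    lo::Float64
    hi::Float64
end

midpoint(X::Interval) = (X.lo + X.hi) / 2
width(X::Interval) = X.hi - X.lo

# One interval Newton step for f(x) = x^2 - 2, whose derivative over X is
# enclosed by [2*X.lo, 2*X.hi]. Assumes X.lo > 0 so that enclosure excludes 0.
function newton_step(X::Interval)
    m = midpoint(X)
    fm = m^2 - 2
    q1, q2 = fm / (2 * X.lo), fm / (2 * X.hi)
    N = Interval(m - max(q1, q2), m - min(q1, q2))      # N(X) = m - f(m) / f'(X)
    return Interval(max(X.lo, N.lo), min(X.hi, N.hi))   # intersect N(X) with X
end

function interval_newton(X::Interval; tol = 1e-10, maxiter = 50)
    for _ in 1:maxiter
        width(X) <= tol && break
        X = newton_step(X)
    end
    return X
end

root = interval_newton(Interval(1.0, 2.0))
println("enclosure of sqrt(2): [", root.lo, ", ", root.hi, "]")
```

Unlike the QR example, the interval here shrinks at every step: the first step already narrows [1, 2] to [1.375, 1.4375]. The algorithm was designed around intervals rather than having them substituted into code written for numbers.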

I didn’t really discuss the utility of variable-width encoding. As it stands, the Unum
format is far from the most efficient storage format, with a lot of redundant encodings.
This problem has been recognised by John Gustafson in his recent presentation on
“Unums version 2.0”.
Given the constraints of existing computing architectures, it may be that a byte-aligned
format (similar to UTF-8) might be more amenable to optimization.

Transpiling Julia to C – The LLVM-CBackend


Julia Computing carried out this work under contract from the Johns Hopkins University Applied Physics Laboratory (JHU APL) for the Federal Aviation Administration (FAA) to support its Airborne Collision Avoidance System X (ACAS X) program. JuliaCon 2015 had a very interesting talk by Robert Moss on this topic.

In my last blog post on the requirements for statically compiling julia, I promised to describe in more detail the ability to transpile Julia code into C output. I would now like to announce the open-source release of an updated LLVM CBackend that is able to handle the broad set of intrinsic operations used by Julia.

Julia already has a backend for compiling to the LLVM IR. The Static and Ahead of Time (AOT) compiled Julia work improved this backend to be able to handle any valid Julia method directly, instead of only those with sufficient type specialization. The updated LLVM-CBE project is then able to map the LLVM IR to the appropriate C construct, using only a limited subset of the C language (at this time, the output is nearly C89 compliant). The final executable then could be compiled by some other C compiler and would not have any dependency on the llvm project.

There are a few caveats to be aware of:

  • The LLVM IR is more expressive than C, so some parts of the output are not compatible with the JIT code. This means that you should not try to use the system image compiled by a C compiler with the JIT enabled. Additionally, C has much more undefined behavior than LLVM or Julia, so some places in the code may be more verbose than you might have expected to force C to give the intended behavior.

  • This is not a true cross-compiler. This is not the fault of the backend, but of the Julia program generator phase, which encodes many platform-specific assumptions as it generates function definitions. This is the same situation you would encounter if you were to, for example, run a C file through cpp and then try to compile the output using a cross compiler. Some examples include:
    • OS-specializations
    • Endianness assumptions
    • Word (pointer) size
    I am hopeful that the first two will be addressable in the future. Indeed, the coreimg.jl / inference.ji file is already able to avoid making any platform-specific assumption other than sizeof(uintptr_t) and can be used cross-compiled in this manner.
  • The resulting binary includes code for every method that was added to the system image, with no tree-shaking. The main reason for this is that regardless of the contents of the program, the semantics of Julia require that the __init__ functions of each module will be called (with all of their side-effects). And beyond trivial programs, it is computationally infeasible to statically determine which methods might be called without solving the halting problem. It may be possible to statically bound the answer, but I expect that it will be more beneficial to better modularize the way code for Base is loaded.

Notwithstanding those limitations, the Julia compiler backend was previously not capable of handling all constructs. The backend has always been capable of compiling fully type-specialized method definitions to native code. This turns out to be sufficient for compiling unspecialized versions of most functions as well, by specifying a less specialized type signature for the lambda to the compiler. This works because the semantics of a Julia program (notably dispatch) are defined as operating on the runtime types of the arguments. The ability to infer the types of variables and arguments ahead-of-time, in order to unbox them and devirtualize the calls, is important as an optimization, but neither optimization impacts the correctness of the code.

There were a few cases where the backend couldn’t handle the generic method signature without additional machinery: static parameters, intrinsics, and comprehensions.

Static method parameters represent a side channel of additional information, computed as a function of the method type signature and the runtime call types. So while f{T}(x::T) = T is a very simple method definition, it required some extra care to compile it correctly for any x. Compiling this generically requires the ability to pass T as an argument. However, the caller knows x and not T so it can’t readily provide this extra information. Usually, the Julia backend knows the method signature type exactly, which allows it to fill in T as a constant during compilation and avoids the issue. For the static compilation case, the method dispatch system needs to instead insert this extra parameter into the call. It does this by adding a hidden argument to the call signature, and then marking the function pointer as needing to be called with this extra argument.
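
The following sketch shows the idea at the source level. It uses the current where syntax for the post's f{T}(x::T) = T, and the names f_explicit and call_f are made up: they are a hand-written analogy for the hidden argument, not the mechanism the compiler actually uses.

```julia
# Current (Julia 1.x) spelling of the post's `f{T}(x::T) = T`.
f(x::T) where {T} = T

# Hand-written analogue of the hidden argument: T is passed explicitly,
# and the call site has to compute it from the runtime type of x.
f_explicit(T::Type, x) = T
call_f(x) = f_explicit(typeof(x), x)

samples = Any[1, 2.5, "three"]          # element types are unknown until runtime
println([f(v) for v in samples])
println([call_f(v) for v in samples])
```

Both lines print the same list of types. The difference is only in who supplies T: the specialized method has it baked in, while the generic version receives it as an extra value.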

A second challenge is intrinsics. These calls are function-like, but do not have a dynamic behavioral model behind them. The code-generator simply fails if it was not able to statically predict one of these types (it was the job of generic_unbox and auto_unbox to try to avoid reaching this failure situation). In practice, this isn’t usually an issue since the method signatures for these functions in Base have been strongly statically typed. However, guaranteeing that this can’t happen required implementing a generic version of every intrinsic function, which can be called if the static type is not sufficiently well-known. The updated LLVM CBackend even makes direct use of some of these dynamic copies of LLVM’s normal intrinsics to implement support for emulation of i128 operations when they are not natively supported by the compiler (for example, when using MSVC or when not using x64).

Another case is comprehensions. Currently this is an open issue. This could be worked around using the same mechanism that is used to add static parameters. But since this issue affects correctness of more than just statically compiled code, the general fix is being actively worked on for the v0.5 release.


CBackend usage

These instructions assume you have built Julia from source. If you want to use it independent of Julia, please see the instructions in the repo README instead.

Overview:

  1. Build julia from source
  2. Install llvm-cbe
  3. Generate julia output .bc file with --compile=all
  4. Convert julia output .bc -> .c and compile

Start by defining the directory of the julia src root folder where you have previously built an (unmodified) version of julia master (>=v0.5-) as an environment variable:

JULIA_ROOT=`pwd`/julia

Install / compile llvm-cbe

# Build LLVM 3.7.1, for example, here we build it in-tree
make -C $JULIA_ROOT/deps compile-llvm LLVM_VER=3.7.1

# installs llvm-cbe to $JULIA_ROOT/deps/build/llvm-3.7.1/build_Release/Release/bin
git clone git@github.com:JuliaComputing/llvm-cbe.git $JULIA_ROOT/deps/srccache/llvm-3.7.1/projects/llvm-cbe
make -C $JULIA_ROOT/deps/build/llvm-3.7.1/build_Release/projects

Generate the statically-compiled Julia object file

Follow the steps in my previous blog post to create a .bc file with the desired content.

Convert to C source file (.c) and build

Running the llvm-cbe binary on this LLVM bitcode file converts it into a C program:

$JULIA_ROOT/deps/build/llvm-3.7.1/build_Release/Release/bin/llvm-cbe \
    sys-plus.bc -o sys-plus.cbe.c

This output file then can be integrated into your normal toolchain and should work with your compiler. For gcc and clang, I have tested with the following flags to suppress uninteresting lint flags while demonstrating compliance to the standard:

WARN='-std=c99 -pedantic -Wall -Wextra -Wno-unused-variable -Wno-unused-function -Wno-unused-parameter -Wno-sign-compare -Wno-unused-but-set-variable -Wno-long-long -Wno-invalid-noreturn'

The resulting output can then be used in place of the default system image (with JIT compilation disabled):

./julia -J <file>.so --compile=no <ARGS>

or it could be linked into a larger embedded application executable and initialized with the embedding API:

jl_options.compile_enabled = JL_OPTIONS_COMPILE_OFF;
jl_options.image_file = argv[0];
julia_init(JL_IMAGE_CWD);

Static and Ahead of Time (AOT) compiled Julia

On running Julia code without a JIT


Julia Computing carried out this work under contract from the Johns
Hopkins University Applied Physics Laboratory (JHU APL) for the Federal Aviation
Administration (FAA) to support its Traffic-Alert and Collision Avoidance
System (TCAS) program. JuliaCon 2015 had a very interesting talk by Robert Moss on this topic. Part of this work was also sponsored by Blackrock, Inc.

I’m often asked when I tell someone about Julia: “What makes it fast?” and “Why can’t <insert favorite dynamic language> do the same?” That’s not an easy question, since the answer has many parts, many of them nuanced and sometimes specific to a particular application – or even developer. Being fast is one benefit, but exploring the answer to this question also reveals some other applications: static compilation (i.e. removing the JIT dependency entirely), theorem proving, static memory allocation, and more! Answering this question requires an understanding of the traditional definitions of static vs. dynamic languages, and how Julia fits into that spectrum.

Many languages, including Julia, support templated code, macros, or other forms of source code generation. In traditional static languages, these have often been written in their own language dialect. This makes the distinction between the application and the generator functions immediately clear to the reader. But it also means the user must actually learn two dialects – and how they interact – to be fully proficient in the one language. This templating language may be simple (such as C Preprocessor macros), but may also be a full Turing-complete interpreter (such as C++ templates).

Dynamic languages, by contrast, have commonly exposed similar functionality by providing an eval function. The reasoning is that since all code is being dynamically interpreted, there is no disadvantage for some of this code not being available to the compiler until “just-in-time” for the code to be executed. The distinction between application and generator is still fairly clear: the generator function ends with a call to eval.


It’s easy to blur the line between these two camps, however. For example, if a C++ program links against libclang (for example, the cling project), it is possible to program in the dynamic style. Or if a program written in a dynamic language doesn’t use eval, then it can be transpiled to avoid the runtime interpreter[1]. Julia embraces this hybridization. But to discuss the possibility of static compilation requires an understanding of this distinction between these two phases in the life cycle of the execution of code.

Julia follows in the Lisp tradition and provides tools for manipulating the language using the language itself. This can make it non-obvious to the reader which parts of the code are generators for application logic, and which parts of the program are the actual application logic. But this is also what complicates attempts to answer the initial question of whether Julia programs can be statically compiled – and what that question really means. If compilation is defined as finding the most efficient mapping of the source code onto the primitive instructions understood by the machine, then accurate static analysis is a prerequisite for the compiler to be able to optimize this translation. If the entire program can be statically transformed, then it is possible to generate compiled binaries and remove the runtime dependency for a parser / interpreter / compiler. And while compiler instruction selection is probably the most common static analysis, it is far from the only possible static analysis pass. For example, theorem proving, automated testing, and race detection are all active research areas.

A user’s first encounter with the Julia language is usually at the interactive REPL prompt, and then by writing script files in a similar style. At this top-level scope, all forms of dynamic evaluation are permitted: new types can be defined; functions created; variables can be modified and introspected via reflection; and modules can be defined and imported. However, once the user defines a local scope (for example, a function definition, let block, for loop), only static constructs can be used, with three exceptions provided for user flexibility: eval, macros, and generated functions. This is important, because it means that if the programmer avoids using these three dynamic constructs for the application logic, it is possible to statically analyze and compile the program generated as a result of running the user’s code file.

Let’s take a closer look at each of these cases:

  1. A call to eval can be used as an escape hatch to invoke top-level expressions and the compiler from inside a function. This is akin to its purpose in a typical dynamic language (with the exception that it cannot introspect or modify a local variable, which many other languages do allow). Julia provides many constructs intended to help the user avoid needing this functionality, including closure (nested) functions, dynamic dispatch, type parameters, and macros.

  2. Macro calls are demarcated by @ to distinguish them from regular function calls. They are functionally equivalent to the code templating features of many traditional static languages (albeit more ergonomic since they are implemented in Julia itself, in the style of Lisp). They are run after parsing, but before the code is executed. Indeed, there is no mechanism for invoking them at runtime so the existence of their definitions in a program does not cause problems.

    Aside: If you’ve ever encountered the error: “unsupported or misplaced expression $”, this is specifically the runtime-macro behavior that is “unsupported”. Indeed, the syntax for a runtime invocation of a macro would be:

    $(quote @macrocall $(args...) end)
    

  3. Generated (aka staged) functions cannot be statically compiled. These functions are equivalent to calling eval on a new anonymous function computed as a function of the input types (a JIT-parsed lambda, if you will), and optionally memoizing the result. Therefore, it is possible to statically compile the memoization cache. This makes them, in this regard, superior to an unadorned eval call. But in the general case, generated functions are black boxes to the compiler and thus cannot be analyzed statically.
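
The three constructs can be tried out in a few lines. This is a small sketch in current Julia (1.x) syntax with made-up names (eval_cannot_see_locals, @twice, static_sizeof); it is meant to be run at top level, for example pasted into the REPL.

```julia
# 1. eval works at top level (module scope), so it cannot see local variables.
function eval_cannot_see_locals()
    hidden_local = 42
    try
        Core.eval(@__MODULE__, :hidden_local)   # looks up a *global* of this name
        return :found
    catch err
        return err isa UndefVarError ? :not_visible : rethrow()
    end
end

# 2. A macro is expanded after parsing and before the code runs.
macro twice(ex)
    return quote
        $(esc(ex))
        $(esc(ex))
    end
end

counter = Ref(0)
@twice counter[] += 1                            # expands to two increments
expanded = @macroexpand @twice counter[] += 1    # inspect the generated code

# 3. A generated function builds its body from the *types* of its arguments.
@generated function static_sizeof(x)
    n = sizeof(x)          # inside the generator, `x` is the argument's type
    return :(return $n)    # the body that is compiled for that type
end

println(eval_cannot_see_locals())
println(counter[])
println(static_sizeof(Int32(1)), " ", static_sizeof(1.0))
```

In the third case the number returned is computed once per argument type and then embedded in the compiled body as a constant, which is the memoization described above.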

So there you have it. If you avoid eval and generated functions, any language – including Julia – can be statically compiled.

But that still leaves all of the important questions unanswered, such as: (a) why does this matter? (b) how can we use it? (c) what makes Julia special?

Julia is special because it was designed from the start as a dynamic language of the ilk described above, but one in which the programmer often can describe to the compiler the extent to which those features are used by a particular function. The built-in library of functionality (aka Base) was developed to provide this information and take advantage of these principles, which continues to influence authors of extension modules (aka packages) to also follow these principles. For example, the Julia community seems to have coined the term “type-stability” to describe a concept that static / compiled languages have historically enforced and dynamic / scripting languages have historically disregarded. These considerations are what allow Julia to claim both flexibility and speed. These concerns can be very difficult to retrofit onto a legacy codebase. Put another way, the speed potential of a language consists almost entirely of the properties that the compiler is able to prove ahead-of-time so that they don’t need to be checked at runtime. And the flexibility comes from being able to get those runtime checks automatically whenever they are needed. Type-checking (and unboxing) is one aspect of these checks, but there are many other properties that can be computed such as stack allocation, statically-determined memory lifetimes, constant propagation, and call de-virtualization. (For a more complete discussion of these properties, see Oscar Blumberg’s Green Fairy Analysis)
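
As a concrete example of type-stability, compare the two made-up functions below. The sketch uses the reflection helper Base.return_types to ask the compiler what it can prove; that helper reports inference results and is not a stable public API, so treat the exact types it prints as indicative only.

```julia
# Returns an Int or a Float64 depending on the *value* of x: not type-stable.
unstable(x) = x > 0 ? x : 0.0

# Always returns a value of the same type as x: type-stable.
stable(x) = x > 0 ? x : zero(x)

inferred_unstable = only(Base.return_types(unstable, (Int,)))
inferred_stable   = only(Base.return_types(stable, (Int,)))

println("unstable(::Int) -> ", inferred_unstable)
println("stable(::Int)   -> ", inferred_stable)
println(isconcretetype(inferred_unstable), " ", isconcretetype(inferred_stable))
```

For the stable version the compiler can prove a single concrete return type ahead of time, so callers need no runtime type check. For the unstable one it can only prove a union, and the check has to happen when the code runs.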

This also means turning Julia code into Julia binaries requires no tricks, complicated incantations, or obscure limitations. In fact, the Julia runtime / compiler is already silently doing this for you on a regular basis. Sorry, if you were hoping for something really spectacular here – but that’s also not quite the end of the story, since we can exercise some direct control over it.

So now let’s pull back the covers on some of the options for the Julia binary. You may have glossed over this long list at some point (abridged):

~$ julia --help

julia [switches] -- [programfile] [args...]

-v, --version             Display version information
-h, --help                Print this message
-J, --sysimage <file>     Start up with the given system image file
--precompiled={yes|no}    Use precompiled code from system image if available
--compilecache={yes|no}   Enable/disable incremental precompilation of modules
--startup-file={yes|no}   Load ~/.juliarc.jl
-e, --eval <expr>         Evaluate <expr>
-E, --print <expr>        Evaluate and show <expr>
-P, --post-boot <expr>    Evaluate <expr>, but don't disable interactive mode (deprecated, use -i -e instead)
-L, --load <file>         Load <file> immediately on all processors
--compile={yes|no|all}    Enable or disable compiler, or request exhaustive compilation
--output-o name           Generate an object file (including system image data)
--output-ji name          Generate a system image data file (.ji)
--output-bc name          Generate LLVM bitcode (.bc)
--output-incremental=no   Generate an incremental output file (rather than complete)

What you may not have been as readily aware of is that many of these options are used internally to handle various modes of operation. For instance, -p n (or addprocs(n)) will launch extra copies of julia on the indicated hosts with --worker.

The --output-*, --compile, and --sysimage options are the ones that will be of primary interest for investigating the static compilation abilities of Julia.

When building the Julia language runtime from the .jl source files in base, the Julia runtime library code is run with a flag that tells it where to save the resulting application – code and variable declarations – after executing the input commands. The first call to ./julia during the source compilation evaluates the coreimg.jl file and writes a bytecode representation of the resulting Julia Inference analysis code to inference.ji:

./julia --output-ji inference.ji coreimg.jl

Then the system builds upon that image, to compile the entire Base system, by evaluating sysimg.jl in the runtime environment previously defined and saved to the inference.ji file:

./julia --output-o sys.o --sysimage inference.ji --startup-file=no \
	sysimg.jl

And since it compiled some of those functions to native code (due to directives in the precompile.jl file or other heuristics), that native code can be linked into a dynamic library for fast startup:

cc -shared -o sys.so sys.o -ljulia

In normal usage, the Julia runtime only invokes the compiler when a function is called. This is an intentional trade-off that incurs higher memory usage and longer compile times (aka JIT warm-up), with the expectation that the additional information from the presence of the types will enable the compiler to generate simpler code with fewer runtime operations – resulting in a net time savings. There’s another assumption in this behavior as well: that the compiler will be available at runtime.
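
The warm-up is easy to observe from the REPL. The sketch below uses a made-up function sumsq and times the first and second call for each argument type; the actual numbers depend on the machine and Julia version, so none are shown. It also uses precompile, which asks for a specialization to be compiled without calling the function.

```julia
sumsq(v) = sum(x -> x^2, v)

floats = rand(1000)
t_first  = @elapsed sumsq(floats)   # includes compiling sumsq for Vector{Float64}
t_second = @elapsed sumsq(floats)   # reuses the native code generated above

ints = collect(1:1000)
t_newtype = @elapsed sumsq(ints)    # a new argument type triggers the compiler again

# Request compilation for another signature ahead of the first call.
compiled = precompile(sumsq, (Vector{Float32},))

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
println("new type:    ", t_newtype, " s")
println("precompiled for Vector{Float32}: ", compiled)
```

The first call for each new argument type is typically much slower than later calls, because that is when the compiler runs.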

There are cases, however, where the user may want to or need to avoid running the compiler at runtime. The --compile=<yes|no|all> flag makes this possible (the default is yes). When Julia is run with the --compile=all flag, the compiler is invoked for all functions in the system image, so that the resulting sys-all.so binary contains native code for all functions defined in Base:

./julia --output-o sys-all.o --sysimage sys.so --startup-file=no \
	--compile=all --eval nothing
cc -shared -o sys-all.so sys-all.o -ljulia

This dynamic library no longer requires the compiler, which can be demonstrated by disabling the compiler by command line argument:

./julia --compile=no --sysimage sys-all.so

Or the compiler can be removed from the library entirely:

make JULIACODEGEN=none

resulting in a much smaller libjulia.so file, but one which will throw an error if the system needs to use any methods at runtime that haven’t been pre-compiled to binary code.


I think it is worth mentioning here that the ./julia binary itself is actually just a very small utility wrapper for parsing the command line arguments and loading the actual Julia runtime from the sys.so dynamic library. This allows the same binary file to be used as both a dynamic library file and an executable, instead of needing to create two different output files. But the linker could instead be invoked slightly differently to embed the compiled code directly into the executable[2]:

cc -o julia-app sys.o repl.c -ljulia

Package code is handled similarly to the system image, so these same principles apply also to modules loaded with Base.__precompile__(true) / Base.compilecache("Package"). These commands invoke the ./julia program in compiler-mode like the examples above, with the addition of an incremental flag. This extra flag tells it that the output file should only include the delta of the code and definitions that are part of that package:

./julia --output-ji pkg.ji --sysimage sys.so --output-incremental=yes \
	pkg.jl

I think that about covers the current capabilities of Julia’s static compilation engine. Over time, I’m sure that I, and the rest of the team at Julia Computing Inc., will be adding many more under-the-hood features and optimizations to expand further on these powerful capabilities. This will allow Julia to be used on a broad variety of resource-constrained compute devices, many of which disallow JIT compilation or simply aren’t powerful enough for it to be beneficial – web-browsers (e.g. emscripten), smartphones, IoT devices (e.g. the Raspberry Pi), etc.

One other application of static analysis that I hadn’t yet touched on is the ability to convert Julia code to another language, such as C. In my next post, I plan to dive further into this capability and show how that can be done.


Supplemental Tools

Since Julia users often come to be aware of, and sometimes even fluent in, the esoterica of such tools as code_llvm and code_native, I feel it would be remiss of me if I didn’t point out that there are several standalone tools for analyzing the static files generated above. For complete documentation, refer to the llvm webpage for these tools.

  • To use most of these tools, you will need to start by re-running the command of interest above, and specifying --output-bc instead of (or in addition to) --output-o

  • llvm-dis : converts the .bc (llvm bitcode) binary file to .ll (llvm assembly text)
    • roughly equivalent to code_llvm
  • llc : compiles a .bc or .ll file to .o (equivalent to the file from --output-o)

  • llvm-objdump : disassembles a .o file to .S
    • roughly equivalent to code_native


[1]: This observation forms the basis of the JIT compilers for many popular languages such as JavaScript.


[2]: There is near infinite variety in the flags that can be passed to cc to compile and link files. I’ve neglected to mention paths and a few flags that are frequently essential such as -L, -I, and -Wl,-rpath,$(pwd). I’m assuming here that the reader already has a toolchain configured for their purpose, so I’ve opted for trying to show a simple example clearly rather than trying to teach all of the nuances, which could fill a whole blog post of its own.