Re-posted from: https://blog.glcs.io/type-stability
This post was written by Steven Whitaker.
The Julia programming language is a high-level language that boasts the ability to achieve C-like speeds. Julia can run fast, despite being a dynamic language, because it is compiled and has smart type-inference.
Type-inference is the process of the Julia compiler reasoning about the types of objects, enabling compilation to create efficient machine code for the types at hand.
However, Julia code can be written in a way that prevents type-inference from succeeding, specifically by writing type-unstable functions. (I'll explain type-instability later on.) When type-inference fails, Julia has to compile generic machine code that can handle any type of input, sacrificing the C-like performance and instead running more like an interpreted language such as Python.
Fortunately, there are tools that Julia developers can use to track down what code causes type-inference to fail. Among the most powerful of these tools are SnoopCompile.jl and Cthulhu.jl. Using these tools, developers can fix type-inference failures and restore the C-like performance they were hoping to achieve.
In this post, we will learn about type-stability and how it impacts performance. Then we will see how to use SnoopCompile.jl and Cthulhu.jl to locate and resolve type-instabilities.
Type-Stability
A function is type-stable if the type of the function's output can be concretely determined given the types of the inputs to the function, without any runtime information.
To illustrate, consider the following function methods:
```julia
f(x::Int) = "stable"
f(x::Float64) = rand(Bool) ? 1 : 2.0
```
In this example, if we call `f(x)` where `x` is an `Int`, the compiler can figure out that the output will be a `String` without knowing the value of `x`, so `f(x::Int)` is type-stable. In other words, it doesn't matter whether `x` is 1, -1, or 176859431; the return value will always be a `String` if `x` is an `Int`.
On the other hand, if we call `f(x)` where `x` is a `Float64`, the compiler doesn't know whether the output will be an `Int` or a `Float64` because that depends on the result of `rand(Bool)`, which is computed at runtime. Therefore, `f(x::Float64)` is type-unstable.
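One quick way to see what the compiler infers (a side note, not from the original post) is `Base.return_types`, which reports the inferred return type for a given argument-type signature:

```julia
f(x::Int) = "stable"
f(x::Float64) = rand(Bool) ? 1 : 2.0

# The Int method infers to a single concrete type:
Base.return_types(f, (Int,))      # [String]

# The Float64 method only infers to a Union of possibilities:
Base.return_types(f, (Float64,))  # [Union{Float64, Int64}]
```

`Base.return_types` is mainly an interactive exploration tool, but it makes the stable/unstable distinction concrete.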
Here’s a more subtle example of type-instability:
```julia
function g(x)
    if x < 0
        return 0
    else
        return x
    end
end
```
In this example, `g(x)` is type-unstable because the output will either be an `Int` or whatever the type of `x` is, and it all depends on the value of `x`, which isn't known at compile time. (Note, however, that `g(x)` is type-stable if `x` is an `Int` because then both branches of the `if` statement return the same type of value.)
And sometimes a function that might look type-stable can be type-unstable depending on the input. For example:
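A common idiom for fixing this kind of instability (my own sketch, not from the original post) is to return `zero(x)` instead of the literal `0`, so both branches produce the same type:

```julia
# Type-unstable: the branches can return different types (Int vs. typeof(x)).
g(x) = x < 0 ? 0 : x

# Type-stable: `zero(x)` has the same type as `x`, so both branches agree.
g_stable(x) = x < 0 ? zero(x) : x

Base.return_types(g, (Float64,))         # [Union{Float64, Int64}]
Base.return_types(g_stable, (Float64,))  # [Float64]
```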
```julia
h(x::Array) = x[1] + 1
```
In this case, `h([1])` is type-stable, but `h(Any[1])` is not. Why? Because with `h([1])`, `x` is a `Vector{Int}`, so the compiler knows that the type of `x[1]` will be `Int`. On the other hand, with `h(Any[1])`, `x` is a `Vector{Any}`, so the compiler thinks `x[1]` could be of any type.
To reiterate: a function is type-stable if the compiler can figure out the concrete type of the return value given only the types of the inputs, without any runtime information.
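You can check this directly (a quick sketch of my own): the element type of the array determines what the compiler knows about `x[1]`.

```julia
h(x::Array) = x[1] + 1

eltype([1])     # Int64 — concrete, so the compiler knows `x[1]` is an Int
eltype(Any[1])  # Any — abstract, so `x[1]` could be anything

# Consequently, inference succeeds for one signature and gives up on the other:
Base.return_types(h, (Vector{Int},))  # [Int64]
Base.return_types(h, (Vector{Any},))  # [Any]
```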
When Compilation Occurs
Another aspect of type-inference that is useful to understand is when compilation (including type-inference) occurs.
In a static language like C, an entire program is compiled before any code runs. This is possible because the types of all variables are known in advance, so machine code specific to those types can be generated in advance.
In an interpreted language like Python, code is never compiled to machine code because variables are dynamic, meaning their types aren't known until the variables are actually used (i.e., at runtime).
Julia programs can lie pretty much anywhere between the extremes of C and Python, and where on that spectrum a program lies depends on type-stability.
In a just-in-time (JIT) compiled language like Julia, compilation occurs once types are known.
- If a Julia program is completely type-stable, type-inference can figure out the types of all variables in the program before running any code. As a result, the entire program can be compiled as if it were written in a static language. This is what allows Julia to achieve C-like speeds.
- If a Julia program is entirely type-unstable, every function has to be compiled individually. In this case, compilation occurs at the moment the function is called because that's when the runtime information of all the input types is finally known. Furthermore, the machine code for a type-unstable function cannot be efficient because it must be able to handle a wide range of potential types. As a result, despite being compiled, the code runs essentially like an interpreted language.

Running a Julia program with type-instabilities is like driving down the street and hitting all the red lights. Julia will compile all the code for which type-inference succeeds and then start running. But when the program reaches a function call that could not be inferred, that's like a car stopping at a red light: Julia stops running the code to compile the function call, now that it knows the runtime types of the inputs. After the function is compiled, the program can continue execution, like how the car can continue driving once the light turns green.
Type-Stability and Performance
As this analogy implies, and as I've stated before, type-stability has performance implications. Type-instabilities can cause various performance degradations, including:
- Dynamic (aka runtime) dispatch. If the compiler knows the input types to a function, the generated machine code can include a call to the specific method determined by those types. But if the compiler doesn't know those types, the machine code has to include instructions to perform dynamic dispatch. As a result, rather than jumping directly to the correct method, Julia has to spend runtime CPU cycles to look up the correct method to call.
- Increased memory allocations. If the compiler doesn't know what type a variable will have, it's impossible to put it in a register or even allocate stack space for it. As a result, it has to be heap-allocated and managed by the garbage collector.
- Suboptimal compiled code. Imagine summing the contents of an array in a loop. If the compiler knows the array contains just `Float64`s, it can perform optimizations to compute the sum as efficiently as possible, e.g., by using specialized CPU instructions. Such optimizations cannot occur if the compiler doesn't know what type of data it's working with.
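A rough sketch of that summing scenario (the `mysum` helper here is hypothetical, not from the post): the same loop over the same values compiles very differently depending on the array's element type.

```julia
function mysum(v)
    s = 0.0
    for x in v
        s += x  # direct float add for Vector{Float64}; dynamic dispatch for Vector{Any}
    end
    return s
end

xs = rand(1000)       # Vector{Float64}: concrete eltype, tight machine code
ys = Vector{Any}(xs)  # Vector{Any}: same values, but every `+` is resolved at runtime

mysum(xs) ≈ mysum(ys)  # same result, very different speed
```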
Here's an example (inspired by this Stack Overflow answer) that illustrates the impact type-stability can have on performance:
```julia
# Type-unstable because `x` is a non-constant global variable.
x = 0
f() = [i + x for i = 1:10^6]

# Type-stable because `y` is constant and therefore always an `Int`.
const y = 0
g() = [i + y for i = 1:10^6]

using BenchmarkTools
@btime f()  # 16.868 ms (1998983 allocations: 38.13 MiB)
@btime g()  # 190.755 μs (3 allocations: 7.63 MiB)
```
Note that the type-unstable version is two orders of magnitude slower! Also note, however, that this is an extreme example where essentially the entire computation is type-unstable. In practice, some type-instabilities will not impact performance very much. Type-stability mainly matters in "hot loops", i.e., in parts of the code that run very frequently and contribute to a significant portion of the program's overall run time.
Detecting Type-Instabilities with SnoopCompile.jl
Now the question is, how do we know if or where our code is type-unstable? One excellent tool for discovering where type-instabilities occur in code is SnoopCompile.jl. This package provides functionality for reporting how many times a Julia program needs to stop to compile code. (Remember that a perfectly type-stable program can compile everything in one go, so every time execution stops for compilation indicates a type-instability was encountered.)
Let's use an example to illustrate how to use SnoopCompile.jl. First, the code we want to analyze:
```julia
module Original

struct Alg1 end
struct Alg2 end

function process(alg::String)
    if alg == "alg1"
        a = Alg1()
    elseif alg == "alg2"
        a = Alg2()
    end
    data = get_data(a)
    result = _process(a, data)
    return result
end

get_data(::Alg1) = (1, 1.0, 0x00, 1.0f0, "hi", [0.0], (1, 2.0))

function _process(::Alg1, data)
    val = data[1]
    if val < 0
        val = -val
    end
    result = map(data) do d
        process_item(d, val)
    end
    return result
end

process_item(d::Int, val) = d + val
process_item(d::AbstractFloat, val) = d * val
process_item(d::Unsigned, val) = d - val
process_item(d::String, val) = d * string(val)
process_item(d::Array, val) = d .+ val
process_item(d::Tuple, val) = d .- val

get_data(::Alg2) = rand(5)
_process(::Alg2, data) = error("not implemented")

end
```
We'll use the `@snoop_inference` macro to analyze this code. Note that this macro should be used in a fresh Julia session (after loading the code to be analyzed, but before running anything) to get the most accurate analysis results.
```julia
julia> using SnoopCompileCore

julia> tinf = @snoop_inference Original.process("alg1");

julia> using SnoopCompile

julia> tinf
InferenceTimingNode: 0.144601/0.247183 on Core.Compiler.Timings.ROOT() with 8 direct children
```
You can consult the SnoopCompile.jl docs for more information about what we just did, but for now, notice that displaying `tinf` revealed 8 direct children. That means compilation occurred 8 times while running `Original.process("alg1")`. If this function were completely type-stable, `@snoop_inference` would have reported just 1 direct child, so we know there are type-instabilities somewhere.
Each of the 8 direct children is an inference trigger, i.e., calling the specific method indicated in the inference trigger caused compilation to occur. We can collect the inference triggers:
```julia
julia> itrigs = inference_triggers(tinf)
 Inference triggered to call process(::String) from eval (./boot.jl:430) inlined into REPL.eval_user_input(::Any, ::REPL.REPLBackend, ::Module) (/cache/build/tester-amdci5-12/julialang/julia-release-1-dot-11/usr/share/julia/stdlib/v1.11/REPL/src/REPL.jl:261)
 Inference triggered to call process_item(::Int64, ::Int64) from #1 (./REPL[1]:30) with specialization (::var"#1#2")(::Int64)
 Inference triggered to call process_item(::Float64, ::Int64) from #1 (./REPL[1]:30) with specialization (::var"#1#2")(::Float64)
 Inference triggered to call process_item(::UInt8, ::Int64) from #1 (./REPL[1]:30) with specialization (::var"#1#2")(::UInt8)
 Inference triggered to call process_item(::Float32, ::Int64) from #1 (./REPL[1]:30) with specialization (::var"#1#2")(::Float32)
 Inference triggered to call process_item(::String, ::Int64) from #1 (./REPL[1]:30) with specialization (::var"#1#2")(::String)
 Inference triggered to call process_item(::Vector{Float64}, ::Int64) from #1 (./REPL[1]:30) with specialization (::var"#1#2")(::Vector{Float64})
 Inference triggered to call process_item(::Tuple{Int64, Float64}, ::Int64) from #1 (./REPL[1]:30) with specialization (::var"#1#2")(::Tuple{Int64, Float64})
```
The first inference trigger corresponds to compiling the top-level `process` function we called (this is the inference trigger we always expect to see). But then it looks like Julia had to stop running to compile several different methods of `process_item`.
Inference triggers tell us that type-instabilities existed when calling the given functions, but what we really want to know is where these type-instabilities originated. You'll note that each displayed inference trigger above also indicates the calling function by specifying `from <calling function>`. (Note that the `from #1` in the above example indicates `process_item` was called from an anonymous function.)
We can use `accumulate_by_source` to get an aggregated view of what functions made calls via dynamic dispatch:
```julia
julia> mtrigs = accumulate_by_source(Method, itrigs)
2-element Vector{SnoopCompile.TaggedTriggers{Method}}:
 eval_user_input(ast, backend::REPL.REPLBackend, mod::Module) @ REPL ~/.julia/juliaup/julia-1.11.5+0.x64.linux.gnu/share/julia/stdlib/v1.11/REPL/src/REPL.jl:247 (1 callees from 1 callers)
 (::var"#1#2")(d) @ Main REPL[1]:30 (7 callees from 7 callers)
```
From this, we can see that the example code really has only one problematic function: the anonymous function `var"#1#2"`.
Diving in with Cthulhu.jl
Now that we have a rough idea of where the type-instabilities come from, we can drill down into the code and pinpoint the precise causes with Cthulhu.jl. We can use the `ascend` function on an inference trigger to start investigating:
```julia
julia> using Cthulhu

julia> ascend(itrigs[2])  # Skip `itrigs[1]` because that's the top-level compilation that should always occur.
```
`ascend` provides a menu that shows `process_item` and the anonymous function. Select the anonymous function and press Enter. Here's a screenshot of the Cthulhu output:
[Screenshot: annotated Cthulhu.jl output]
Reading the output of Cthulhu.jl takes some time to get used to (especially when it can't display source code, as in this example), but the main thing to remember is that red is bad. See the Cthulhu.jl README for more information.
In this example, the source of the type-instability was fairly easy to pinpoint. I annotated the screenshot to indicate where the type-instability arose: this `Core.Box` thing. These are always bad; they are essentially containers that can hold values of any type, hence the type-instability that arises when accessing the contents. In this particular case, `Core.getfield(#self#, :val)` indicates `val` is a variable that was captured by the anonymous function.
Once we determine what caused the type-instability, the solution varies on a case-by-case basis. Some potential solutions include:
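A minimal standalone reproduction of this boxing issue (my own sketch, mirroring the pattern in `_process`): reassigning a variable that a closure captures forces Julia to wrap it in a `Core.Box`, and a `let` block avoids the problem by introducing a fresh binding that is never reassigned.

```julia
function boxed()
    val = 1
    if val < 0
        val = -val       # `val` is reassigned *and* captured below, so it gets boxed
    end
    f = d -> d + val     # the closure sees a Core.Box, so `d + val` can't be inferred
    return f(2)
end

function unboxed()
    val = 1
    if val < 0
        val = -val
    end
    f = let val = val    # fresh, never-reassigned binding: no box
        d -> d + val
    end
    return f(2)
end
```

Both functions return the same value; only the inferability of the closure differs.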
- Ensure different branches of an `if` statement return data of the same type.
- Add a type annotation to help out inference. For example, given `x = Any[1]`, write `y = do_something(x[1]::Int)`.
- Make sure a container type has a concrete element type. For example, `x = Int[]`, not `x = []`.
- Avoid loops over heterogeneous `Tuple`s.
- Use `let` blocks to define closures. (See this section of the Julia manual for more details.)
We'll use this last solution in our example. The anonymous function in question is defined by the `do` block in `_process`. So, let's fix the issue of the captured variable `val`:
```julia
module Corrected

# All other code is the same as in module `Original`.

function _process(::Alg1, data)
    val = data[1]
    if val < 0
        val = -val
    end
    f = let val = val
        d -> process_item(d, val)
    end
    result = map(f, data)
    return result
end

# All other code is the same as in module `Original`.

end
```
Now let's see what `@snoop_inference` says:
```julia
julia> using SnoopCompileCore

julia> tinf = @snoop_inference Corrected.process("alg1");

julia> using SnoopCompile

julia> tinf
InferenceTimingNode: 0.113669/0.183888 on Core.Compiler.Timings.ROOT() with 1 direct children
```
There's just one direct child. Hooray, type-stability!
Let’s see how performance compares:
```julia
julia> using BenchmarkTools

julia> @btime Original.process("alg1");
  220.506 ns (16 allocations: 496 bytes)

julia> @btime Corrected.process("alg1");
  51.104 ns (8 allocations: 288 bytes)
```
Awesome, the improved code is ~4 times faster!
Summary
In this post, we learned about type-stability and how type-instabilities affect compilation and runtime performance. We also walked through an example that demonstrated how to use SnoopCompile.jl and Cthulhu.jl to pinpoint the sources of type-instability in a program. Even though the example in this post was a relatively easy fix, the principles discussed apply to more complicated programs as well. And, of course, check out the documentation for SnoopCompile.jl and Cthulhu.jl for further examples to bolster your understanding.
Do you have type-instabilities that plague your Julia code? Contact us, and we can help you out!
Additional Links
- SnoopCompile.jl Docs
- Documentation for SnoopCompile.jl.
- Cthulhu.jl Docs
- Documentation (the package’s README) for Cthulhu.jl.
- Julia Performance Tips
- Very good tips for improving the performance of Julia code.
- GLCS Software Development
- Connect with us for Julia development help.
- Upcoming JuliaCon Talk Announcement
- Check out our JuliaCon 2025 talk announcement!