The Emergent Features of JuliaLang: Part II – Traits

By: Lyndon White

Re-posted from: https://invenia.github.io/blog/2019/11/06/julialang-features-part-2/

Introduction

This blog post is based on a talk originally given at a Cambridge PyData Meetup, and also at a London Julia Users Meetup.

This is the second of a two-part series of posts on the Emergent Features of JuliaLang.
That is to say, features that were not designed, but came into existence from a combination of other features.
While Part I was more or less a grab-bag of interesting tricks,
Part II (this post) focuses solely on traits, which are, in practice, much more useful than anything discussed in Part I.
This post is fully independent of Part I.

Traits

Inheritance can be a gun with only one bullet, at least in the case of single inheritance like Julia uses.
Single-inheritance means that, once something has a super-type, there can’t be another.
Traits are one way to get something similar to multiple inheritance under such conditions.
They allow for a form of polymorphism that is orthogonal to the inheritance tree,
and can be implemented using functions of types.
In an earlier blog post this is explained using a slightly different take.

Sometimes people claim that Julia doesn’t have traits, which is not correct.
Julia does not have syntactic sugar for traits, nor does it have the ubiquitous use of traits that some languages feature.
But it does have traits, and in fact they are even used in the standard library.
In particular, the iterator interface uses IteratorSize and IteratorEltype, and several other interfaces use traits as well.

These are commonly called Holy traits, not out of religious glorification, but after Tim Holy.
They were originally proposed to make StridedArrays extensible.
Ironically, even though they are fairly well established today, StridedArray still does not use them.
There are ongoing efforts
to add more traits to arrays, which one day will no doubt lead to powerful and general BLAS-type functionality.

There are a few ways to implement traits, though all are broadly similar.
Here we will focus on the implementation based on concrete types.
Similar things can be done with Type{<:SomeAbstractType} (ugly, but flexible),
or even with values if they are of types that constant-fold (like Bool),
particularly if you are happy to wrap them in Val when dispatching on them (an example of this will be shown later).

There are a few advantages to using traits:

  • Can be defined after the type is declared (unlike a supertype).
  • Don’t have to be created up-front, so types can be added later (unlike a Union).
  • Otherwise-unrelated types can be given a common trait (unlike a supertype).

A motivating example from Python: the AsList function

In TensorFlow's Python code, _AsList is a helper function:

def _AsList(x):
    return x if isinstance(x, (list, tuple)) else [x]

This converts scalars to single-item lists, which is useful because the rest of the code then only needs to handle lists.
This is not really idiomatic Python code, but TensorFlow uses it in the wild.
However, _AsList fails for numpy arrays:

>>> _AsList(np.asarray([1,2,3]))
[array([1, 2, 3])]

This can be fixed in the following way:

def _AsList(x):
    return x if isinstance(x, (list, tuple, np.ndarray)) else [x]

But where will it end? What if other packages want to extend this?
What about other functions that also depend on whether something is a list or not?
The answer here is traits, which give the ability to mark types as having particular properties.
In Julia, in particular, dispatch works on these traits.
Often they will compile out of existence (due to static dispatch, during specialization).

Note that at some point we have to document what properties a trait requires (e.g. what methods must be implemented).
There is no static type-checker enforcing this, so it may be helpful to write a test suite which checks that anything declaring a trait actually implements its requirements.

The Parts of Traits

Traits have a few key parts:

  • Trait types: the different traits a type can have.
  • Trait function: what traits a type has.
  • Trait dispatch: using the traits.

To understand how traits work, it is important to understand the type of types in Julia.
Types are values, so they have a type themselves: DataType.
However, they also have the special pseudo-supertype Type: a type T acts like an instance of Type{T}, i.e. T isa Type{T}.

julia> typeof(String)
DataType

julia> String isa Type{String}
true

julia> String isa Type{<:AbstractString}
true

We can dispatch on Type{T} and have it resolved at compile time.
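For example, we can write methods that dispatch on the type itself rather than on instances of it (a small sketch; the function name is hypothetical):

typedemo(::Type{String}) = "dispatched on String itself"
typedemo(::Type{<:Integer}) = "dispatched on a subtype of Integer"

julia> typedemo(String)
"dispatched on String itself"

julia> typedemo(Int)
"dispatched on a subtype of Integer"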

Trait Type

This is the type that is used to attribute a particular trait.
In the following example we will consider a trait that highlights the properties of a type for statistical modeling.
This is similar to MLJ's ScientificTypes.jl, or the StatsModels schema.

abstract type StatQualia end

struct Continuous <: StatQualia end
struct Ordinal <: StatQualia end
struct Categorical <: StatQualia end
struct Normable <: StatQualia end

Trait Function

The trait function takes a type as input, and returns an instance of the trait type.
We use the trait function to declare what traits a particular type has.
For example, we can say things like floats are continuous, booleans are categorical, etc.

statqualia(::Type{<:AbstractFloat}) = Continuous()
statqualia(::Type{<:Integer}) = Ordinal()

statqualia(::Type{<:Bool}) = Categorical()
statqualia(::Type{<:AbstractString}) = Categorical()

statqualia(::Type{<:Complex}) = Normable()
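
We can check these declarations in the REPL:

julia> statqualia(Float64)
Continuous()

julia> statqualia(String)
Categorical()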

Using Traits

To use a trait we need to re-dispatch upon it.
This is where we take the type of an input, and invoke the trait function on it to get objects of the trait type, then dispatch on those.

In the following example we are going to define a bounds function, which will return some indication of the range of values a particular collection contains.
It will be defined on a collection of objects with a particular trait, and it will be defined differently depending on which statqualia they have.

using LinearAlgebra

# This is the trait re-dispatch; get the trait from the type
bounds(xs::AbstractVector{T}) where T = bounds(statqualia(T), xs)

# These functions dispatch on the trait
bounds(::Categorical, xs) = unique(xs)
bounds(::Normable, xs) = maximum(norm.(xs))
bounds(::Union{Ordinal, Continuous}, xs) = extrema(xs)

Using the above:

julia> bounds([false, false, true])
2-element Array{Bool,1}:
 false
 true

julia> bounds([false, false, false])
1-element Array{Bool,1}:
 false

julia> bounds([1,2,3,2])
(1, 3)

julia> bounds([1+1im, -2+4im, 0+-2im])
4.47213595499958

We can also extend traits after the fact: for example, if we want to add the
property that vectors have norms defined, we could define:

julia> statqualia(::Type{<:AbstractVector}) = Normable()
statqualia (generic function with 6 methods)

julia> bounds([[1,1], [-2,4], [0,-2]])
4.47213595499958
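
Recalling the earlier note that trait contracts are not statically checked, here is a minimal sketch of an interface test built on the definitions above (the helper name is hypothetical):

using Test

# Anything declaring a statqualia should also support `bounds`
function test_statqualia_interface(xs::AbstractVector{T}) where T
    @test statqualia(T) isa StatQualia
    @test bounds(xs) !== nothing
end

test_statqualia_interface([1.0, 2.0])
test_statqualia_interface(["a", "b"])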

Back to AsList

First, we define the trait type and trait function:

struct List end
struct Nonlist end

islist(::Type{<:AbstractVector}) = List()
islist(::Type{<:Tuple}) = List()
islist(::Type{<:Number}) = Nonlist()

Then we define the trait dispatch:

aslist(x::T) where T = aslist(islist(T), x)
aslist(::List, x) = x
aslist(::Nonlist, x) = [x]

This allows aslist to work as expected.

julia> aslist(1)
1-element Array{Int64,1}:
 1

julia> aslist([1,2,3])
3-element Array{Int64,1}:
 1
 2
 3

julia> aslist([1])
1-element Array{Int64,1}:
 1

As discussed above, this approach is fully extensible.

Dynamic dispatch as fallback

All the traits discussed so far have been fully static, and they compile away.
We can also write runtime trait code, at a small runtime cost.
The following makes a runtime call to hasmethod to look up whether the given type
has an iterate method defined.
(There are plans to make hasmethod resolve at compile time,
but for now it can only be done at runtime.)
Similar code to this can be used to dispatch on the values of objects.

islist(T) = hasmethod(iterate, Tuple{T}) ? List() : Nonlist()

We can see that this works on strings: since strings are iterable, aslist does not wrap them in an array.

julia> aslist("ABC")
"ABC"

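As mentioned above, similar code can be used to dispatch on the values of objects. A minimal hypothetical sketch: because the returned Val type depends on the runtime value, this dispatch also happens at runtime, at a similar small cost.

ispositive(x::Real) = x > 0 ? Val(true) : Val(false)

describe_sign(x) = describe_sign(ispositive(x), x)
describe_sign(::Val{true}, x) = "$x is positive"
describe_sign(::Val{false}, x) = "$x is not positive"

julia> describe_sign(3)
"3 is positive"
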
Traits on functions

We can also attach traits to functions, because each function is the only instance of its own singleton type,
e.g. foo isa typeof(foo).
We can use this idea for declarative input transforms.
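
This singleton property is easy to verify in the REPL:

julia> mymodel() = 1;

julia> mymodel isa typeof(mymodel)
true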

As an example, different functions may expect observations to be arranged differently.
More specifically, different machine learning models might expect their inputs to be:

  • An iterator of observations.
  • A matrix with observations in rows.
  • A matrix with observations in columns.

This isn’t a matter of personal preference or different field standards;
there are good performance-related reasons to choose one of the above options, depending on what operations are needed.
As a user of a model, however, we should not have to think about this.

The following examples use LIBSVM.jl and DecisionTree.jl.
In practice we could just use the MLJ interface instead,
which takes care of this kind of thing.

To demonstrate the simple use of a few ML libraries, we first need some basic functions to deal with our data.
In particular, we want to define get_true_classes, which, as naturally written, expects an iterator of observations.

using Statistics
is_large(x) = mean(x) > 0.5
get_true_classes(xs) = map(is_large, xs)

inputs = rand(100, 1_000);  # 100 features, 1000 observations
labels = get_true_classes(eachcol(inputs));

For simplicity, we will test on our training data,
which should be avoided in general (outside of simple validation that training is working).

The first library we will consider is LIBSVM, which expects the data to be a matrix with 1 observation per column.

using LIBSVM
svm = svmtrain(inputs, labels)

estimated_classes_svm, probs = svmpredict(svm, inputs)
mean(estimated_classes_svm .== labels)

Next, let’s try DecisionTree.jl, which expects data to be a matrix with 1 observation per row.

using DecisionTree
tree = DecisionTreeClassifier(max_depth=10)
fit!(tree, permutedims(inputs), labels)

estimated_classes_tree = predict(tree, permutedims(inputs))
mean(estimated_classes_tree .== labels)

As we can see above, we had to know what each function needed, and use eachcol and permutedims to modify the data.
The user should not need to remember these details; they should simply be encoded in the program.

Traits to the Rescue

We will attach a trait to each function that needs to rearrange its inputs.
There is a more sophisticated version of this,
which also attaches traits to inputs saying how the observations are currently arranged, or lets the user specify.
For simplicity, we will assume the data starts out as a matrix with 1 observation per column.

We are considering three possible ways a function might like its data to be arranged.
Each of these will define a different trait type.

abstract type ObsArrangement end

struct IteratorOfObs <: ObsArrangement end
struct MatrixColsOfObs <: ObsArrangement end
struct MatrixRowsOfObs <: ObsArrangement end

We can encode this information about each function's expectations into a trait,
rather than forcing the user to look it up in the documentation.

# Our initial code:
obs_arrangement(::typeof(get_true_classes)) = IteratorOfObs()

# LIBSVM
obs_arrangement(::typeof(svmtrain)) = MatrixColsOfObs()
obs_arrangement(::typeof(svmpredict)) = MatrixColsOfObs()

# DecisionTree
obs_arrangement(::typeof(fit!)) = MatrixRowsOfObs()
obs_arrangement(::typeof(predict)) = MatrixRowsOfObs()

We are also going to attach some simple traits to the data types to say whether or not they contain observations.
We will use value types for this,
rather than fully declaring the trait types, so we can skip straight to declaring the trait functions:

# All matrices contain observations
isobs(::AbstractMatrix) = Val{true}()

# Otherwise, it must be an iterator of vectors to contain observations
isobs(::T) where T = Val{eltype(T) <: AbstractVector}()
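
A quick check that the trait function behaves as intended:

julia> isobs(rand(10, 5))
Val{true}()

julia> isobs([[1, 2], [3, 4]])
Val{true}()

julia> isobs("not observations")
Val{false}()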

Next, we can define model_call: a function which uses the traits to decide how
to rearrange the observations before calling the function, based on the type of the function and on the type of the argument.

function model_call(func, args...; kwargs...)
    return func(maybe_organise_obs.(func, args)...; kwargs...)
end

# trait re-dispatch: don't rearrange things that are not observations
maybe_organise_obs(func, arg) = maybe_organise_obs(func, arg, isobs(arg))
maybe_organise_obs(func, arg, ::Val{false}) = arg
function maybe_organise_obs(func, arg, ::Val{true})
    organise_obs(obs_arrangement(func), arg)
end

# The heavy lifting for rearranging the observations
organise_obs(::IteratorOfObs, obs_iter) = obs_iter
organise_obs(::MatrixColsOfObs, obsmat::AbstractMatrix) = obsmat

organise_obs(::IteratorOfObs, obsmat::AbstractMatrix) = eachcol(obsmat)
function organise_obs(::MatrixColsOfObs, obs_iter)
    reduce(hcat, obs_iter)
end

function organise_obs(::MatrixRowsOfObs, obs)
    permutedims(organise_obs(MatrixColsOfObs(), obs))
end
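
For example, rearranging a matrix with observations in columns for a function that wants observations in rows:

julia> X = [1 2 3; 4 5 6];  # 2 features, 3 observations (one per column)

julia> organise_obs(MatrixRowsOfObs(), X)
3×2 Array{Int64,2}:
 1  4
 2  5
 3  6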

Now, rather than calling things directly, we can use model_call,
which takes care of rearranging things.
Notice that the code no longer needs to be aware of the particular cases for each library, which makes things much easier for the end-user: just use model_call and don’t worry about how the data is arranged.

inputs = rand(100, 1_000);  # 100 features, 1000 observations
labels = model_call(get_true_classes, inputs);

using LIBSVM
svm = model_call(svmtrain, inputs, labels)

estimated_classes_svm, probs = model_call(svmpredict, svm, inputs)
mean(estimated_classes_svm .== labels)

using DecisionTree
tree = DecisionTreeClassifier(max_depth=10)
model_call(fit!, tree, inputs, labels)

estimated_classes_tree = model_call(predict, tree, inputs)
mean(estimated_classes_tree .== labels)

Conclusion: JuliaLang is not magic

In this series of posts we have seen a few examples of how certain features can give rise to other features.
Unit syntax, closure-based objects, contextual compiler passes, and traits all fall out of combinations of other features.
We have also seen that Types turn out to be very powerful, especially with multiple dispatch.

The Emergent Features of JuliaLang: Part I

By: Lyndon White

Re-posted from: https://invenia.github.io/blog/2019/10/30/julialang-features-part-1/

Introduction

This blog post is based on a talk originally given at a Cambridge PyData Meetup, and also at a London Julia Users Meetup.

Julia is known to be a very expressive language with many interesting features.
It is worth considering that some features were not planned,
but simply emerged from combinations of other features.
This post will describe how several interesting features are implemented:

  1. Unit syntactic sugar,
  2. Pseudo-OO objects with public/private methods,
  3. Dynamic Source Transformation / Custom Compiler Passes (Cassette),
  4. Traits.

Some of these (1 and 4) should be used when appropriate,
while others (2 and 3) are rarely appropriate, but are instructive.

Part I of this post (which you are reading now) will focus on the first three.
Part II of this post is about traits, which deserve their own post
because they are one of the most powerful and interesting Julia features.
They emerge from types, multiple dispatch,
and the ability to dispatch on the types themselves (rather than just instances).
We will review both the common use of traits on types,
and also the very interesting—but less common—use of traits on functions.

There are many other emergent features which are not discussed in this post.
For example, creating a vector using Int[] is an overload of
getindex, and constructors are overloads of ::Type{<:T}().

Juxtaposition Multiplication: Convenient Syntax for Units

Using units in certain kinds of calculations can be very helpful for protecting against arithmetic mistakes.
The package Unitful provides the ability
to do calculations using units in a very natural way, and makes it easy to write
things like “2 meters” as 2m. Here is an example of using Unitful units:

julia> using Unitful.DefaultSymbols

julia> 1m * 2m
2 m^2

julia> 10kg * 15m / 1s^2
150.0 kg m s^-2

julia> 150N == 10kg * 15m / 1s^2
true

How does this work? The answer is Juxtaposition Multiplication—a literal number
placed before an expression results in multiplication:

julia> x = 0.5π
1.5707963267948966

julia> 2x
3.141592653589793

julia> 2sin(x)
2.0

To make this work, we need to overload multiplication to invoke the constructor of the unit type.
Below is a simplified version of what goes on under the hood of Unitful.jl:

abstract type Unit <: Real end
struct Meter{T} <: Unit
    val::T
end

Base.:*(x::Any, U::Type{<:Unit}) = U(x)

Here we are overloading multiplication not with an instance of a unit subtype (like Meter(2)),
but with the subtype itself (Meter).
This is what ::Type{<:Unit} refers to. We can see this if we write:

julia> 5Meter
Meter{Int64}(5)

This shows that we create a Meter object with val=5.

To get to a full units system, we then need to define methods for everything that numbers need to work with,
such as addition and multiplication. The final result is units-style syntactic sugar.
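
For instance, continuing the simplified sketch above (illustrative only, not Unitful's actual implementation), addition could be defined as:

Base.:+(a::Meter, b::Meter) = Meter(a.val + b.val)

julia> 2Meter + 3Meter
Meter{Int64}(5)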

Closures give us “Classic Object-Oriented Programming”

First, it is important to emphasize: don’t do this in real Julia code.
It is unidiomatic, and likely to hit edge-cases the compiler doesn’t optimize well
(see for example the infamous closure boxing bug).
This is all the more important because it often requires boxing (see below).

“Classic OO” has classes with member functions (methods) that can see all the fields and methods;
outside the methods of the class, only public fields and methods can be seen.
This idea was originally posted for Julia in one of Jeff Bezanson’s rare
Stack Overflow posts,
which uses a classic functional programming trick.

Let’s use closures to define a Duck type as follows:

function newDuck(name)
    # Declare fields:
    age=0

    # Declare methods:
    get_age() = age
    inc_age() = age += 1

    quack() = println("Quack!")
    speak() = quack()

    # Declare public:
    ()->(get_age, inc_age, speak, name)
end

This implements various things we would expect from “classic OO” encapsulation.
We can construct an object and call public methods:

julia> bill_the_duck = newDuck("Bill")
#7 (generic function with 1 method)

julia> bill_the_duck.get_age()
0

Public methods can change private fields:

julia> bill_the_duck.inc_age()
1

julia> bill_the_duck.get_age()
1

Outside the class we can’t access private fields:

julia> bill_the_duck.age
ERROR: type ##7#12 has no field age

We also can’t access private methods:

julia> bill_the_duck.quack()
ERROR: type ##7#12 has no field quack

However, accessing their functionality via public methods works:

julia> bill_the_duck.speak()
Quack!

How does this work?

Closures are instances of anonymous structs, whose fields are the closed-over variables they directly reference.
All the public fields/methods are directly referenced by the returned closure,
but the private fields (e.g. age, quack) are not: they are instead closed over by the inner methods that use them.
We can see those private methods and fields by accessing the fields of the public method closures:

julia> bill_the_duck.speak.quack
(::var"#quack#10") (generic function with 1 method)

julia> bill_the_duck.speak.quack()
Quack!
julia> bill_the_duck.inc_age.age
Core.Box(1)

An Aside on Boxing

The Box type is similar to the Ref type in function and purpose.
Box is the type Julia uses for variables that are closed over, but which might be rebound to a new value.
This is the case for primitives (like Int) which are rebound whenever they are incremented.
It is important to be clear on the difference between mutating the contents of a variable,
and rebinding that variable name.

julia> x = [1, 2, 3, 4];

julia> objectid(x)
0x79eedc509237c203

julia> x .= [10, 20, 30, 40];  # mutating contents

julia> objectid(x)
0x79eedc509237c203

julia> x = [100, 200, 300, 400];  # rebinding the variable name

julia> objectid(x)
0xfa2c022285c148ed

In closures, boxing applies only to rebinding, though the closure bug
means Julia will sometimes over-eagerly box variables because it believes that they might be rebound.
This has no bearing on what the code does, but it does impact performance.

While this kind of code itself should never be used (since Julia has a perfectly
functional system for dispatch and works well without “Classic OO”-style encapsulation),
knowing how closures work opens other opportunities to see how they can be used.
In our ChainRules.jl project,
we are considering the use of closures as callable named tuples as part of
a difficult problem
in extensibility and defaults.

Cassette, etc.

Cassette is built around a notable Julia feature, which goes by several names:
Custom Compiler Passes, Contextual Dispatch, Dynamically-scoped Macros or Dynamic Source Rewriting.
This same trick is used in IRTools.jl,
which is at the heart of the Zygote automatic differentiation package.
This is an incredibly powerful and general feature, and was the result of a very specific
Issue #21146 and a very casual
PR #22440 suggesting that it might be useful for one particular case.
Between them, these discussions describe everything about Cassette, though only in passing.

One of the most important things the Custom Compiler Pass feature enables is call overloading.

Call Overloading

One of the core uses of Cassette is the ability to effectively overload what it means to call a function.
Call overloading is much more general than operator overloading, as it applies to every call (special-cased as appropriate), whereas operator overloading applies to just one function call and just one set of types.

To give a concrete example of how call overloading is more general,
operator overloading/dispatch (multiple or otherwise) would allow us
to overload, for example, sin(::T) for different types T.
This way sin(::DualNumber) could be specialized to be different from sin(::Float64),
so that with a DualNumber as input it could calculate the dual (non-real) part using cos (which is the derivative of sin). This is exactly how
DualNumbers operate.
However, operator overloading can’t express the notion that, for all functions f, if the input type is DualNumber, then f should calculate the dual part using the derivative of f.
Call overloading allows for much more expressivity, and massively simplifies the implementation of automatic differentiation.

The above written out in code:

function Cassette.overdub(::AutoDualCtx, f, d::DualNumber)
    res = ChainRules.frule(f, val(d))
    if res !== nothing
        f_val, pushforward = res
        f_diff = pushforward(partial(f))
        return Dual(f_val, extern(f_diff))
    else
        return f(d)
    end
end

That is just one of the more basic things that can be done with Cassette.
Before we can explain how Cassette works,
we need to understand how generated functions work,
and before that we need to understand the different ways code is presented in Julia,
as it is compiled.

Layers of Representation

Julia has many representations of the code it moves through during compilation.

  • Source code: CodeTracking.definition(String, ...),
  • Abstract Syntax Tree (AST): CodeTracking.definition(Expr, ...),
  • Untyped Intermediate Representation (IR): @code_lowered,
  • Typed Intermediate Representation: @code_typed,
  • LLVM: @code_llvm,
  • ASM: @code_native.

We can retrieve the different representations using the functions/macros indicated.

The Untyped IR is of particular interest as this is what is needed for Cassette.
This is a linearization of the AST, with the following properties:

  • Only 1 operation per statement (nested expressions get broken up);
  • the return values for each statement are accessed as SSAValue(index);
  • variables become Slots;
  • control-flow becomes jumps (like Goto);
  • function names become qualified as GlobalRef(mod, func).

Untyped IR is stored in a CodeInfo object.
It is not particularly pleasant to write, though for an intermediate representation it is fairly readable.
The particularly hard part is that expression return values are SSAValues, which are defined by position,
so if code is added, the return values need to be renumbered.
Core.Compiler, Cassette and IRTools define various helpers to make this easier, but for our simple example we don't need them.
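
For example, here is the Untyped IR for a simple function (REPL output; the exact display varies a little between Julia versions):

julia> foo() = 2*(1+1);

julia> @code_lowered foo()
CodeInfo(
1 ─ %1 = 1 + 1
│   %2 = 2 * %1
└──      return %2
)

Note that each statement is a single operation, and later statements refer to earlier results by their SSA position.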

Generated Functions

Generated functions
take types as inputs and return the AST (Abstract Syntax Tree) for what code should run, based on information in the types. This is a kind of metaprogramming.
Take, for example, a function f whose input is an N-dimensional array (of type AbstractArray{T, N}).
A generated function for f might construct code with N nested loops to process each element.
Because the generated code is specialized on exactly the information carried in the types, this can give substantial performance improvements.
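
As a small illustration of the idea, here is a hypothetical sketch (describe_dims is not a real API; the NamedTuple example below is more realistic):

@generated function describe_dims(x::AbstractArray{T,N}) where {T,N}
    # This string is built at code-generation time, from type information alone
    msg = "array of $T with $N dimensions"
    return :( $msg )
end

julia> describe_dims(rand(2, 3))
"array of Float64 with 2 dimensions"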

For a more detailed example, let’s consider merging NamedTuples.
This could be done with a regular function:

function merge(a::NamedTuple{an}, b::NamedTuple{bn}) where {an, bn}
    names = merge_names(an, bn)
    types = merge_types(names, typeof(a), typeof(b))
    NamedTuple{names,types}(map(n->getfield(sym_in(n, bn) ? b : a, n), names))
end

This checks at runtime what fields each NamedTuple has, in order to decide what will be in the merge.
However, we actually know all this information based on the types alone,
because the list of fields is stored as part of the type (that is, the an and bn type parameters).

@generated function merge(a::NamedTuple{an}, b::NamedTuple{bn}) where {an, bn}
    names = merge_names(an, bn)
    types = merge_types(names, a, b)
    vals = Any[ :(getfield($(sym_in(n, bn) ? :b : :a), $(QuoteNode(n)))) for n in names ]
    :( NamedTuple{$names,$types}(($(vals...),)) )
end
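
The generated version mirrors what Base itself does, so we can see the behaviour using Base's merge:

julia> merge((a=1, b=2), (b=3, c=4))
(a = 1, b = 3, c = 4)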

Recall that a generated function returns an AST, which is the case in the examples above.
However, it is also allowed to return a CodeInfo for Untyped IR.

Making Cassette

There are two key facts mentioned above:

  1. Generated Functions are allowed to return CodeInfo for Untyped IR.
  2. @code_lowered f(args...) provides the ability to retrieve the CodeInfo for the given method of f.

The core of Cassette is to use a generated function
to call code defined in a new CodeInfo of Untyped IR,
based on a modified copy of the code returned by @code_lowered.
To properly understand how this works, let’s run through a manual example (originally from a JuliaCon talk on MagneticReadHead.jl).

We define a generated function rewritten, which makes a copy of the Untyped IR (the CodeInfo object it gets back from @code_lowered), and then mutates it, replacing each call with a call to the function call_and_print.
Finally, it returns the new CodeInfo to be run when it is called.

call_and_print(f, args...) = (println(f, " ", args); f(args...))

@generated function rewritten(f)
    ci = deepcopy(@code_lowered f.instance())
    for ii in eachindex(ci.code)
        if ci.code[ii] isa Expr && ci.code[ii].head==:call
            func = GlobalRef(Main, :call_and_print)
            ci.code[ii] = Expr(:call, func, ci.code[ii].args...)
        end
    end
    return ci
end

We can see that this works:

julia> foo() = 2*(1+1)
foo (generic function with 1 method)

julia> rewritten(foo)
+ (1, 1)
* (2, 2)
4

Rather than replacing each call with call_and_print,
we could instead call a function that does the work we are interested in,
and then calls rewritten on the function that would have been called
(this requires extending rewritten to accept and pass through the arguments):

function work_and_recurse(f, args...)
    println(f, " ", args)
    rewritten(f, args...)
end

So, not only does the function we call get rewritten, but so does every function it calls, all the way down.

This is how Cassette and IRTools work.
There are a few complexities and special cases that need to be taken care of,
but this is the core of it: recursive invocation of generated functions that rewrite the IR, starting from what is returned by @code_lowered.

Wrapping up Part I

In this part we covered the basics of:

  1. Unit syntactic sugar,
  2. Pseudo-OO objects with public/private methods,
  3. Dynamic Source Transformation / Custom Compiler Passes (Cassette).

In Part II, we will discuss Traits!