Static lists in Julia

By: Julia on μβ

Re-posted from: https://matbesancon.github.io/post/2019-03-30-static-list/


This post explores the possibility to build static lists in Julia, meaning
lists for which the size is known at compile-time. This is inspired by
a post
on a Scala equivalent but will take different roads to see more than a plain port.
Of course, this implementation is not that handy nor efficient but
is mostly meant to push the limits of the type system,
especially a trick of using recursive types as values
(replacing a dependent type system).
Some other references:

Table of Contents

First thoughts: value type parameter

Julia allows developers to define type parameters.
In the case of a list, the most obvious one may be the
type of data it contains:

abstract type MyList{T} end

Some types are however parametrized on other things, if we look at the
definition of AbstractArray for example:

  AbstractArray{T,N}

Supertype for N-dimensional arrays (or array-like types) with elements of type T.

The two type parameters are another type T and integer N for the
dimensionality (tensor rank). The only constraint for a value to be
an acceptable type parameter is to be composed of plain bits, complying
with isbitstype.

This looks great, we could define our StaticList
directly using integers.

"""
A static list of type `T` and length `L`
"""
abstract type StaticList{T,L} end

struct Nil{T} <: StaticList{T,0} end

StaticList{T}() where T = Nil{T}()
StaticList(v::T) where T = Cons(v, Nil{T}())

struct Cons{T,L} <: StaticList{T,L+1}
    h::T
    t::StaticList{T,L}
    function Cons(v::T, t::StaticList{T,L}) where {T,L}
        new{T,L}(v,t)
    end
end

# Usage:
# Cons(3, Nil{Int}()) is of type StaticList{Int,1}
# Cons(4, Cons(3, Nil{Int}())) is of type StaticList{Int,2}

If you try to evaluate this code, you will get an error:

ERROR: MethodError: no method matching +(::TypeVar, ::Int64)

Pretty explicit, you cannot perform any computation on values used as type
parameters. With more complex operations, this could make the compiler hang,
crash or at least perform poorly (we would be forcing the compiler to execute
this code at compile-time).

One way there might be around this is macros or replacing sub-typing with
another mechanism. For the macro-based approach,
ComputedFieldTypes.jl
does exactly that. More discussion on computed type parameters in
[1] and [2].

Edit: using integer type parameters can be achieved using ComputedFieldTypes.jl as such:

julia> using ComputedFieldTypes

julia> abstract type StaticList{T,L} end

julia> struct Nil{T} <: StaticList{T,0} end

julia> @computed struct Cons{T,L} <: StaticList{T,L}
           h::T
           t::StaticList{T,L-1}
           function Cons(v::T, t::StaticList{T,L0}) where {T,L0}
               L = L0+1
               new{T,L}(v,t)
           end
       end

julia> Cons(3, Nil{Int}())
Cons{Int64,1,0}(3, Nil{Int64}())

julia> Cons(4, Cons(3, Nil{Int}()))
Cons{Int64,2,1}(4, Cons{Int64,1,0}(3, Nil{Int64}()))

This might be the neatest option for building the StaticList.

Recursive natural numbers

We can use the same technique as in the Scala post, representing natural
number using recursive types.

  • ZeroLength is a special singleton type
  • Next{L} represents the number following the one represented by L

We can modify our previous example:

"""
A type parameter for List length, the numerical length can be retrieved
using `length(l::Length)`
"""
abstract type Length end
struct ZeroLength <: Length end
struct Next{L<:Length} <: Length end

"""
A linked list of size known at compile-time
"""
abstract type StaticList{T,L<:Length} end

struct Nil{T} <: StaticList{T,ZeroLength} end

StaticList{T}() where T = Nil{T}()
StaticList(v::T) where T = Cons(v, Nil{T}())

struct Cons{T,L<:Length} <: StaticList{T,Next{L}}
    h::T
    t::StaticList{T,L}
    function Cons(v::T, t::StaticList{T,L}) where {T,L<:Length}
        new{T,L}(v,t)
    end
end

"""
By default, the type of the Nil is ignored if different
from the type of first value
"""
Cons(v::T,::Type{Nil{T1}}) where {T,T1} = Cons(v, Nil{T}())

We can then define basic information for a list, its length:

Base.length(::Type{ZeroLength}) = 0
Base.length(::Type{Next{L}}) where {L} = 1 + length(L)

Base.eltype(::StaticList{T,L}) where {T,L} = T
Base.length(l::StaticList{T,L}) where {T,L} = length(L)

One thing should catch your attention in this block,
we use a recursive definition of length for the Length type,
which means we can blow our compiler. However, both of the definitions
are static, in the sense that they don’t use type information, so
the final call should reduce to spitting out the length cached at compile-time.
You can confirm this is the case by checking the produced assembly instructions with @code_native.
We respected our contract of a list with size known at compile-time.

Implementing a list-y behaviour

This part is heavily inspired by the DataStructures.jl list implementation,
as such we will not re-define methods with semantically similar but
implement them for our list type. Doing so for your own package
allows user to switch implementation for the same generic code.

The first operation is being able to join a head with an existing list:

DataStructures.cons(v::T,l::StaticList{T,L}) where {T,L} = Cons(v,l)

"""
Allows for `cons(v,Nil)`. Note that the `Nil` type is ignored.
"""
DataStructures.cons(v::T,::Type{Nil}) where {T} = StaticList(v)

(::Colon)(v::T,l::StaticList{T,L}) where {T,L} = DataStructures.cons(v, l)
(::Colon)(v::T,::Type{Nil}) where {T,L} = DataStructures.cons(v, Nil{T})

Implementing the odd ::Colon methods allows for a very neat syntax:

l0 = StaticList{Int}()
l1 = 1:l0
l2 = 2:l1

Unlike the Scala post, we are not using the :: operator which
is reserved for typing expressions in Julia.
We can add a basic head and tail methods, which allow querying
list elements without touching the inner structure. This
will be useful later on.

DataStructures.head(l::Cons{T,L}) where {T,L} = l.h
DataStructures.tail(l::Cons{T,L}) where {T,L} = l.t

Testing list equality can be done recursively, dispatching on the three
possible cases:

==(l1::StaticList, l2::StaticList) = false

function ==(l1::L1,l2::L2) where {T1,L,T2,L1<:Cons{T1,L},L2<:Cons{T2,L}}
    l1.h == l2.h && l1.t == l2.t
end

"""
Two `Nil` are always considered equal, no matter the type
"""
==(::Nil,::Nil) = true

We can now define basic higher-order functions, such as zip below,
and implement the iteration interface.

function Base.zip(l1::Nil{T1},l2::StaticList{T2,L2}) where {T1,T2,L2}
    Nil{Tuple{T1,T2}}
end

function Base.zip(l1::Cons{T1,L1},l2::Cons{T2,L2}) where {T1,L1,T2,L2}
    v = (l1.h, l2.h)
    Cons(v,zip(l1.t,l2.t))
end

Base.iterate(l::StaticList, ::Nil) = nothing
function Base.iterate(l::StaticList, state::Cons = l)
    (state.h, state.t)
end

Iterating over our lists is fairly straight-forward, and will be more efficient than
the recursive implementations of the higher-order functions, we still kept it for
equality checking, more a matter of keeping a functional style in line with the Scala post.

The case of list reversal is fairly straightforward: iterate and accumulate
the list in a new one.

function Base.reverse(l::StaticList{T,L}) where {T,L}
    l2 = Nil{T}
    for h in l
        l2 = Cons(h, l2)
    end
    l2
end

We define the cat operation between multiple lists.

function Base.cat(l1::StaticList{T,L},l2::StaticList{T,L}) where {T,L}
    l = l2
    for e in reverse(l1)
        l = Cons(e, l)
    end
    l
end

The reverse is necessary to keep the order of the two lists.

Special-valued lists

Now that we have a basic static list implementation, we can spice things up.
StaticList is just an abstract type in our case, not an algebraic data type
as in common functional implementations, meaning we can define other sub-types.

Imagine a numeric list, with a series of zeros or ones somewhere.
Instead of storing all of them, we can find a smart way of representing them.
Let us define a static list of ones:

struct OnesStaticList{T<:Number,L<:Length} end

Base.iterate(l::OnesStaticList, ::Type{ZeroLength}) = nothing
function Base.iterate(l::OnesStaticList{T,L}, state::Type{Next{L1}} = L) where $
    (one(T), L1)
end

This list corresponds to the 1 value of type T, repeated for all elements.
In a similar fashion, one can define a ZeroList:

struct ZerosStaticList{T<:Number,L<:Length} end

Base.iterate(l::ZerosStaticList, ::Type{ZeroLength}) = nothing
function Base.iterate(l::ZerosStaticList{T,L}, state::Type{Next{L1}} = L) where$
    (zero(T), L1)
end

One thing to note is that these lists are terminal, in the sense that they cannot
be part of a greater list. To fix this, we can add a tail to these as follows:

struct ZerosStaticList{T<:Number,L<:Length,TL<:StaticList{T,<:Length}}
	t::TL
end

Base.iterate(l::ZerosStaticList, ::Type{ZeroLength}) = l.t
function Base.iterate(l::ZerosStaticList{T,L}, state::Type{Next{L1}} = L) where$
    (zero(T), L1)
end

The t field of the list contains the tail after the series of zeros,
we can thus build a much simpler representation in case of long constant series.
In a similar fashion, one could define a constant list of N elements, storing
the value just once.

Multi-typed lists

There is one last extension we can think of with this data structure.
Since we have a recursive length parameter, why not add it a type at each new node?

abstract type TLength end
struct TZeroLength <: TLength end
struct TNext{T,L<:TLength} <: TLength end

abstract type TStaticList{L<:TLength} end

struct TNil <: TStaticList{TZeroLength} end

struct TCons{T, L<:TLength} <: TStaticList{TNext{T,L}}
    h::T
    t::TStaticList{L}
    function TCons(v::T, t::TStaticList{L}) where {T,L<:TLength}
        new{T,L}(v,t)
    end
end

With such construct, all nodes can be of a different type T, without
removing the type information from the compiler.

julia> TCons(3,TNil())
TCons{Int64,TZeroLength}(3, TNil())

julia> TCons("ha", TCons(3,TNil()))
TCons{String,TNext{Int64,TZeroLength}}("ha", TCons{Int64,TZeroLength}(3, TNil()))

One interesting thing to note here is that the type takes the same
structure as the list itself:

Type: either a T and a TLength containing the rest of the type, or TNil
Data: either a value of a given type and the rest of the list, or empty list

Conclusion

The Julia type system and compiler allow for sophisticated specifications
when designing data structures, which gives it a feel of compiled languages.
This however should not be abused, in our little toy example, the type parameter
grows in complexity as the list does, which means the compiler has to carry out
some computation.

If you want some further compile-time tricks, Andy Ferris’s
workshop at JuliaCon 2018 details how to perform compile-time computations
between bits and then bytes.

If you have any idea how to implement StaticList using integer parameters instead
of custom struct I would be glad to exchange. Porting this to
use ComputedFieldTypes.jl might be a fun
experiment.

Feel free to reach out any way you prefer, Twitter,
email to exchange or discuss this post.


Sources

Header image source: https://pxhere.com/en/photo/742575

[1] A proposal on Julia “Defer calculation of field types until type parameters are known”, julia/issues/18466
[2] Discussion on compile-time computations on Discourse