Playing with Chain.jl

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2021/01/15/chain.html

Introduction

DataFrames.jl was designed to support chaining of operations well.
For a long time my favorite package that helped with this was Pipe.jl.
It is very easy to understand how it works and is clear visually.

There are many alternative packages that support chaining, but they all required
much higher mental effort from the developer to master them. However, in
November 2020 Chain.jl was created and it really is as simple as
Pipe.jl, but at the same time more powerful. In this post I briefly investigate
what it has to offer.

This post was written with Julia 1.5.3, Chain.jl 0.4.2, and Combinatorics 1.0.2.

Experimenting with Chain.jl

The Chain.jl README.md does a really great job of explaining why and
how of the package so I refer you to the website to read the details. In short
it introduces:

  • a macro @chain and an extra annotation,
  • an @aside annotation that can be used inside a @chain block to produce
    side effects,
  • _ is used to signal where the value of the previous expression should be
    inserted (unless it is a first argument in which case _ can be omitted).

Let me give one exemplary usage of the @chain macro. Assume we have a 6
element set and want to get all permutations of its 4 element subsets (if you
ever try to implement a Mastermind solver you might need it).
Here is how you can generate it using Chain.jl:

julia> using Chain

julia> using Combinatorics

julia> @chain 1:6 begin
                 combinations(4)
                 @aside println("# of combinations: ", length(_))
                 collect # we could skip this step
                 @. permutations
                 mapreduce(collect, vcat, _)
             end
# of combinations: 15
360-element Array{Array{Int64,1},1}:
 [1, 2, 3, 4]
 [1, 2, 4, 3]
 [1, 3, 2, 4]
 ⋮
 [6, 4, 5, 3]
 [6, 5, 3, 4]
 [6, 5, 4, 3]

Note that in the first call combinations(4) Chain.jl has put _ implicitly
as the first argument of the combinations function, so the actual call is
combinations(1:6, 4).

In line collect result of combinations(1:6, 4) is passed as a single
argument to collect (you do not need to write parentheses). Similarly in line
@. permutations we use the same pattern but this time we broadcast the
permutations function over a collection passed from the previous step of the
chain. If we want to pass other than the first argument then _ is used as
shown in the mapreduce(collect, vcat, _) line.

In the second line @aside is executed but is ignored in the pipeline. Note
that it would be tempting to write

@aside println("# of combinations: ", length)

instead of

@aside println("# of combinations: ", length(_))

The reason is that length takes only one argument. However, in this case a
call to length is nested so you have to pass _ explicitly.

You can see that Chain.jl has two key features:

  • everything is wrapped in beginend block,
  • there is no visual separator (like standard |> in e.g. Pipe.jl)
    signaling an end of the expression.

Many people will find that it exactly fits their needs, but here is an
alternative syntax that I have found to be potentially usable with
Chain.jl:

@chain 1:6 (
    combinations(4);
    @aside println("# of combinations: ", length(_));
    collect; # we could skip it
    @. permutations;
    mapreduce(collect, vcat, _);
)

which produces the same result.

The difference here is that I replace beginend block with ( and ), so
it is a bit less typing. In this case one has to separate the expressions with
;. I also added ; at the end of the last expression, though it is not
strictly necessary, as in this way you can safely add/remove lines in @chain
without changing the remaining lines.

If having to add ; in this style is good or bad is a matter of taste. On one
hand it adds typing, but on the other hand it clearly shows the end of one
expression (which in beginend style is not explicit, sometimes it might
be confusing if someone needed to add a line break in an expression, and e.g.
indentation should be used to signal line continuation then).

Also () style has an additional benefit that in some editors it is easy to
select the code block enclosed in the parentheses if you would need to
copy-paste the contents of @chain.

Conclusions

I think Chain.jl is excellent. If you like chaining function calls in
your code I really recommend you to check it out.