The @view and @views macros: are you sure you know how they work?

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2022/06/10/view.html

Introduction

Macros in Julia are a very nice element of the language. You can easily visually
identify macros as they are called using the @ prefix. I most often use the
@assert, @show, @spawn, @edit, and @time macros. However, one needs to
understand how macros work to confidently use them.

Today I want to write about typical situations when using
the @view and @views macros to create views can lead to surprising results.

The post was tested under Julia 1.7.2.

The @view macro and expression range

Assume we have a vector representing 24 months of data and we want to compute
correlation between the first 12 observations and the last 12 observations.

Here is a simple way to do it:

julia> using Random

julia> using Statistics

julia> Random.seed!(1234);

julia> x = randn(24);

julia> cor(x[1:12], x[13:24])
-0.018264675865149734

However, this operation copies data. If we want to avoid this we can use
views. To start let us check the view function:

julia> cor(view(x, 1:12), view(x, 13:24))
-0.018264675865149734

Let us check if indeed this reduces allocations first (using the @allocated
macro):

julia> @allocated cor(x[1:12], x[13:24])
336

julia> @allocated cor(view(x, 1:12), view(x, 13:24))
112

Indeed it does. Of course in our toy example the benefit is minimal.
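
To make the difference between copying and viewing concrete, here is a minimal sketch (the names c and v are mine, used only for illustration). It shows that indexing with a range creates an independent array, while view shares memory with its parent:

```julia
x = collect(1.0:24.0)

c = x[1:12]        # indexing with a range copies the data
v = view(x, 1:12)  # a view refers to the memory of x

x[1] = 100.0       # mutate the parent array

c[1]               # still 1.0, the copy is unaffected
v[1]               # 100.0, the view sees the change
parent(v) === x    # true, the view wraps the original vector
```

This sharing is what saves allocations, and it also means that writing to a view modifies the original data.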

Now we are ready to get to the main point of my post. Suppose we have
cor(x[1:12], x[13:24]) and want to use views. As you can see in the
code above, turning indexing into a view call requires rewriting the
expressions. This is where the @view macro comes in handy.

julia> cor(@view x[1:12], @view x[13:24])
ERROR: LoadError: ArgumentError: Invalid use of @view macro: argument must be a reference expression A[...].

Or does it? We run into a problem when using the macro. Let us check the
Julia Manual:

Macros are invoked with the following general syntax:

@name expr1 expr2 ... or @name(expr1, expr2, ...)

Since we invoked @view using the first style Julia eagerly considers
everything that follows it as a single expression. In this case the whole
x[1:12], @view x[13:24] part of code is passed to @view as a single
expression and we get an error.
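
The parsing behavior can be inspected without triggering the error. The sketch below uses Meta.parse, which only parses the code and does not expand the macro; the names ex, outer, payload, and ex2 are hypothetical helpers of mine:

```julia
# Meta.parse only parses the code; it does not expand or run the macro.
ex = Meta.parse("cor(@view x[1:12], @view x[13:24])")

length(ex.args) - 1           # number of arguments cor receives: 1
outer = ex.args[2]            # the single argument is the outer @view call
outer.args[1]                 # Symbol("@view")
payload = outer.args[end]     # the expression handed to the outer @view
payload.head                  # :tuple, not :ref, hence the error

# With parentheses each macro call gets exactly one reference expression.
ex2 = Meta.parse("cor(@view(x[1:12]), @view(x[13:24]))")
length(ex2.args) - 1          # 2
```

In the first form cor would receive only one argument, because the outer @view has consumed everything up to the closing parenthesis.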

We need to use the second macro invocation style in this case:

julia> cor(@view(x[1:12]), @view(x[13:24]))
-0.018264675865149734

Now all worked as expected. The only downside is that it feels a bit
inconvenient to write @view(...). Fortunately, Julia’s creators have thought
about it and designed a @views macro which turns every array slicing
(i.e., array[...]) operation in a passed expression to a view. In our case
this would be:

julia> @views cor(x[1:12], x[13:24])
-0.018264675865149734

The @views macro surprises

Let us check using the @macroexpand macro if indeed @views works as I promised:

julia> @macroexpand @views cor(x[1:12], x[13:24])
:(cor((Base.maybeview)(x, 1:12), (Base.maybeview)(x, 13:24)))

What is this strange Base.maybeview function? It is like getindex, but returns
a view for array slicing operations, while remaining equivalent to getindex
for scalar indices and non-array types. So it is almost the same as view.
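
A small sketch of this behavior follows. Note that Base.maybeview is an internal, unexported function, so I call it directly only for illustration; the example data is made up:

```julia
x = [10.0, 20.0, 30.0]

# Base.maybeview is an internal (unexported) helper, called here only to illustrate.
Base.maybeview(x, 1:2) isa SubArray   # true: a range index gives a view
Base.maybeview(x, 2)                  # 20.0: a scalar index behaves like getindex
view(x, 2) isa SubArray               # true: view always wraps, even for a scalar index

t = (10, 20, 30)                      # a tuple is not an array
Base.maybeview(t, 2)                  # 20: falls back to getindex
```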

Let us see the difference between using @view and @views:

julia> @view x[1]
0-dimensional view(::Vector{Float64}, 1) with eltype Float64:
0.9706563288552144

julia> @views x[1]
0.9706563288552144

Most of the time using a 0-dimensional view or a scalar will not make a
difference, however, sometimes it does. Let us have a look:

julia> similar(@view x[1])
0-dimensional Array{Float64, 0}:
1.40721121e-315

julia> similar(@views x[1])
ERROR: MethodError: no method matching similar(::Float64)

So things are, unfortunately, not as simple as you might expect, especially if
you are writing generic code and do not know upfront if you will use a scalar
index or not.
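
Here is one more way to see the difference, as a minimal sketch with made-up data. A 0-dimensional view is a container that can be read and written with v[], while the result of @views with a scalar index is just a value:

```julia
x = [1.0, 2.0, 3.0]

v = @view x[1]     # 0-dimensional view into x
s = @views x[1]    # plain Float64 value

v[] = 10.0         # writes through to the parent array
x[1]               # 10.0

v isa AbstractArray  # true
s isa AbstractArray  # false
s                    # 1.0, captured before x was modified

# If generic code always needs a container, calling view directly
# behaves the same way for scalar and range indices.
all(view(x, idx) isa SubArray for idx in (1, 1:2))  # true
```

Whether always getting a container is what you want depends on the code that consumes the result, so treat the last line as one possible option rather than a recommendation.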

Having learned what I have written above you might think that the following code
will not work:

julia> x, y, z = [1, 2, 3], [2, 3, 4], [4, 5, 6]
([1, 2, 3], [2, 3, 4], [4, 5, 6])

julia> @views x[1] = y[1] + z[1]
6

julia> x
3-element Vector{Int64}:
 6
 2
 3

However, it produces a correct result. What is the reason? Let us check:

julia> @macroexpand @views x[1] = y[1] + z[1]
:(x[1] = (Base.maybeview)(y, 1) + (Base.maybeview)(z, 1))

As we can see @views is smart enough not to apply the view transformation
to the left hand side of the assignment. This feature is not documented in its
docstring, but can be found (and is even commented about) in the source code of
the @views macro (in the _views function to be precise).
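
The same rule applies when slices appear on both sides of the assignment. The sketch below is my own example, and the expansion mentioned in the comment is inferred by analogy with the @macroexpand output shown above:

```julia
x, y = [1, 2, 3], [2, 3, 4]

# By analogy with the expansion above, this becomes roughly
# x[1:2] = Base.maybeview(y, 2:3): the right-hand side is a view,
# while the left-hand side stays an ordinary indexed assignment.
@views x[1:2] = y[2:3]

x   # [3, 4, 3]
y   # [2, 3, 4], unchanged
```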

Interestingly, this rule is in play even in the example given in
the docstring of the @views macro. Let us check it:

julia> A = zeros(3, 3);

julia> @macroexpand @views for row in 1:3
                    b = A[row, :]
                    b[:] .= row
                end
:(for row = 1:3
      #= REPL[67]:2 =#
      b = (Base.maybeview)(A, row, :)
      #= REPL[67]:3 =#
      b[:] .= row
  end)

We can see that since b[:] is on the left-hand side of the assignment it is not
touched by @views.
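
To see what the loop does when it is actually run, here is a short sketch that compares it with the same loop without @views (the second matrix B is my addition for contrast):

```julia
A = zeros(3, 3)
B = zeros(3, 3)

# With @views, b is a view of a row, so writing to b updates A.
@views for row in 1:3
    b = A[row, :]
    b[:] .= row
end

# Without @views, b is a copy of the row, so B is left untouched.
for row in 1:3
    b = B[row, :]
    b[:] .= row
end

A == [1 1 1; 2 2 2; 3 3 3]   # true
all(iszero, B)               # true
```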

Conclusions

What are the lessons learned?

  1. Macros in Julia are powerful. Well designed macros can make developer’s
    life easier and code more readable.
  2. Macros can be tricky. The most common problem when using macros is the
    @name expr1 style of invocation, which can process “too much” of your code
    unexpectedly.
  3. @views is not equivalent to multiple invocations of @view. There are
    subtle differences between them. Fortunately these differences matter
    mostly in generic code, so it is mainly package developers who need to be
    aware of them.

DataFrames.jl for work and pleasure

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2022/06/03/selectors.html

Introduction

This week I read a post on Why Vim is better than VSCode.
In it the author discusses at length the operator – text object – motion
pattern in Vim. The post argues that it is not only efficient but also fun to
learn and use.

It reminded me of the structure of the operation specification language
we have in DataFrames.jl that follows the pattern:

input columns => transformation => output column names
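
As a side illustration in plain Julia (no DataFrames.jl needed), such a specification is just a nested Pair, because => is right-associative. The column names :a and :a_sum below are hypothetical:

```julia
spec = :a => sum => :a_sum   # => is right-associative

spec isa Pair          # true
spec.first             # :a, the input column
spec.second.first      # sum, the transformation
spec.second.second     # :a_sum, the output column name
```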

I have already written two posts about this topic. Therefore, today I decided
to focus on the fun part of using the minilanguage.

The post is written under Julia 1.7.2 and DataFrames.jl 1.3.4.

The challenge

The user has some data frame and wants to drop a :col column from it,
but the user is not sure if this column is present in the data frame.

Let us first create two test data frames on which we will test our solutions:

julia> using DataFrames

julia> df1 = DataFrame(a=1:2, b=3:4)
2×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1      3
   2 │     2      4

julia> df2 = DataFrame(a=1:2, col=["drop", "me"], b=3:4)
2×3 DataFrame
 Row │ a      col     b
     │ Int64  String  Int64
─────┼──────────────────────
   1 │     1  drop        3
   2 │     2  me          4

A basic approach

A natural thing to try is using the Not selector for this task. Let us
check it:

julia> select(df1, Not(:col))
ERROR: ArgumentError: column name :col not found in the data frame

julia> select(df2, Not(:col))
2×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1      3
   2 │     2      4

The operation worked on df2, but failed on df1.

You might ask why the Not selector is so restrictive. The reason is to avoid bugs.
You could accidentally mistype a column name and then, if such an operation
worked instead of erroring, your incorrect result would propagate.

An intermediate solution

A first solution that comes to mind is to drop the column only if it is present
in a data frame so you might write something like this:

julia> select(df1, names(df1) .!= "col")
2×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1      3
   2 │     2      4

julia> select(df2, names(df2) .!= "col")
2×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1      3
   2 │     2      4

This works, but you need to write the name of the source data frame twice,
so the solution feels a bit heavy.

The fun part

What is the way I find nice to do this operation then? Here is the approach:

julia> select(df1, Cols(!=("col")))
2×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1      3
   2 │     2      4

julia> select(df2, Cols(!=("col")))
2×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1      3
   2 │     2      4

We are combining a couple of slightly advanced features here.

First, !=("col") creates a function that compares its argument to "col" using
!=. It is a very nice feature of Base Julia that it allows partial function
application for the != operator.

Next the Cols function accepts a predicate, in our case !=("col"). Then it
selects all columns of a data frame for which this predicate returns true.
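
Both steps can be tried in plain Julia. The sketch below only mimics the selection on a vector of column names (pred and colnames are hypothetical names of mine); it is not how DataFrames.jl implements Cols:

```julia
pred = !=("col")       # a one-argument function: y -> y != "col"

pred("a")              # true
pred("col")            # false

# Mimic the selection on a plain vector of column names
# (an illustration only, not how DataFrames.jl implements Cols).
colnames = ["a", "col", "b"]
filter(pred, colnames)          # ["a", "b"]
filter(pred, ["a", "b"])        # ["a", "b"], no error when "col" is absent
```

Because the predicate is simply evaluated on each name, a missing column is not an error, which is exactly the behavior we wanted.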

Conclusions

The beauty of Julia is that it not only does the job you want done, but is also
quite fun to code with. At the same time, its design often helps you catch
common bugs in your code (like the Not behavior I have described in this post).

Enjoy!

Julia ♥ ABM #0: Agent-based modelling in Julia

By: Frederik Banning

Re-posted from: https://forem.julialang.org/fbanning/dummy-agent-based-modelling-in-julia-pla

Designing an agent-based model is never easy. Especially in the beginning when you have the rough ideas for the model in your head, maybe sketched some of the core behavioural rules, and gathered the references for them in an unordered list. You will advance through the regular steps of deciding on a model purpose, formulating precise research questions, properly formalising your model’s inner workings, maybe putting its core functions in mathematical notation or possibly even pseudo-code, …

At some point, however, you will find yourself starting up your IDE of choice (some do this step rather sooner than later) and stare at the blank file before you. You may smile out of anticipation of the grand things to come, the epiphanies that will strike you out of nowhere to make your code readable, easy to navigate, and performant at the same time. Or you may already release a deep sigh at the thought of endless iterations over your algorithms to get them to do what you intend them to do, or you might start to feel discouraged after a short while because of the lack of good documentation for things that feel like core features to you. You don’t know where to start and what to do next, how to structure your code so that it stays maintainable in the long run, who to ask for help if you encounter a road block on your journey, and so on and so forth.

If you nodded at least once or twice while reading the paragraphs above, this little series might be for you. My name is Fred and I’m currently a PhD student at the Chair of Macroeconomics at Ruhr-University Bochum in Germany. In my free time I do volunteer work for the German Red Cross and I’m also a co-host of the Mikroökonomen podcast (in German). I neither have a background in computer science, nor am I outstandingly knowledgeable in the area of agent-based modelling (besides it being my day job). But over time I’ve picked up more and more about how to code in Julia and how its community works. And due to how approachable both the language and its community are, I’ve ended up contributing to some Julia packages (most notably Agents.jl and InteractiveDynamics.jl) and recently even started my own package (OSMMakie.jl) as a side project.

I started my ventures into the field of ABMs a few years ago during a course of my master’s programme. We learned about behavioural rules of economic agents, some algorithmic thinking, and how to code in NetLogo, and I had a lot of fun working with it. Over the past years, I’ve even had the opportunity to assist in teaching the practical sides of agent-based modelling with NetLogo and how students can use it to tackle their own research questions. Ever since my first contact with ABMs, I’ve been hooked on the concepts of Computational Economics to explore and hopefully better understand the underlying complex relationships in the economic parts of societies.

Even though I initially learned and later taught agent-based modelling via the great tool that is NetLogo, I relatively quickly came to realise that its great prototyping abilities, easy-to-learn syntax, and built-in visualisations come with some trade-offs. Two of them still stand out the most to me:

  1. For one, it’s relatively slow compared to other ABM frameworks and, to be quite frank, there’s not much that we as regular users can do about this. Have a look here for one attempt to quantify the differences between popular frameworks. As can be seen from this, Julia is well suited for executing highly computationally intensive ABMs. After all, agent-based models are at their core not really much more than repetitive number crunching under a set of given rules. And that, by chance, is a domain in which Julia really shines.

  2. Users also need at least a second tool or programming language to analyse the data generated by their ABMs and plot it in a visually appealing way (preferably even in a publishable form). Julia solves this issue both by design and through its rich package ecosystem which allows agent-based modellers to code continuous pipelines from the model itself into data wrangling and finally visualisation. No need for intermediate data conversion and, maybe more importantly, no need to learn multiple syntaxes.

It’s this kind of uniform approach utilising just a single programming language for all your needs, that can allow researchers to free up some precious mental capacities and gain speed in writing clear and concise code. All while allowing others to reproduce your findings and modify or extend your models. Taking these two points together, I firmly believe that Julia is indeed a great candidate for writing agent-based models.

A quick side note: Agent-based modelling has a broad community ranging from disciplines such as physics and biology to social sciences and economics. Due to my personal background, this series will naturally approach the topic of ABMs from the perspective of economics. This means that in the examples I choose, the agents are more likely to represent humans, firms, or other kinds of institutions than, for example, cells, particles, or animals. Do not let this distract you from the fact that ABMs can be used for a multitude of great applications and scenarios from various scientific backgrounds. At their core, the technicalities stay the same – it’s just the story and the subjects that change.

For now, this is just a dummy post to test this new platform in the broader Julia community (although I have already spent a non-negligible amount of time on it). And while I’m planning to contribute a few more posts to this series, it’s as of yet unclear how long this will last. In a way, it’s an attempt to give something back to the Julia community besides answering questions on Discourse, Zulip, Slack and GitHub. Hopefully it will succeed in conveying my enthusiasm for doing agent-based modelling in Julia and convince others to try it out for themselves. It’s not particularly harder to code good ABMs in Julia than it is in any other framework like NetLogo or Mesa. But it might take some time to get used to the “Julia way” when coming from another language. This is what I’m trying to help with. Maybe you’ll like it, maybe you won’t. In the end, I would be glad to hear about your experiences, no matter if they’re positive or negative.

Designing an agent-based model is never easy.
So let’s not make the coding part any harder than it has to be.

So long,
Fred