Values’ mutability in Julia

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2022/04/29/mutability.html

Introduction

This week chapter 4 of my upcoming Julia for Data Analysis book has been
released by Manning in MEAP. Therefore, following the plan I have announced
in this post, I will discuss a topic related to the material I cover
in chapter 4, but that was not included in the book.

In chapter 4 I give an introduction to working with collections in Julia.
A fundamental topic that is related with this subject is understanding
that values in Julia can be either mutable (like Vector or Dict) or
immutable (like Tuple or NamedTuple).

In this post I want to present an example showing the relevance of this
distinction.

The codes I use were tested under Julia 1.7.2 and BenchmarkTools.jl 1.3.1.

How to check if some value is mutable?

If you have some value you can check if it is mutable using the ismutable
function. Let us check it on some example:

julia> x = big(1)
1

julia> typeof(x)
BigInt

julia> ismutable(x)
true

As you can see the value having BigInt type is mutable. This means that
it can be changed in-place. Therefore, you now know that you must be
careful when passing BigInt values to functions as you cannot assume
that such functions will not change the passed value.

Let us check that this is indeed the case:

julia> x
1

julia> Base.GMP.flipsign!(x, -1)
-1

julia> x
-1

In this case the Base.GMP.flipsign! function mutated the value bound to the
x variable name.

Is mutability of BigInt useful?

You might ask why BigInt values are made mutable. This is a bit surprising
as other standard numeric types like Int64, Bool, or Float64 are
immutable. The reason is that in certain cases it allows to perform operations
on BigInt values in a faster way. Here is an example of computing a sum
of elements of an array storing BigInt values:

julia> using BenchmarkTools

julia> v = collect(big(1):big(1_000_000));

julia> function sum1(v::AbstractArray{BigInt})
           s = big(0)
           for x in v
               s += x
           end
           return s
       end
sum1 (generic function with 1 method)

julia> function sum2(v::AbstractArray{BigInt})
           s = big(0)
           for x in v
               Base.GMP.MPZ.add!(s, x)
           end
           return s
       end
sum2 (generic function with 1 method)

julia> @btime sum1($v)
  97.111 ms (2000002 allocations: 45.78 MiB)
500000500000

julia> @btime sum2($v)
  27.560 ms (3 allocations: 48 bytes)
500000500000

julia> @btime sum($v)
  27.557 ms (3 allocations: 48 bytes)
500000500000

The difference between sum1 and sum2 is that the former uses + to make
addition and the later uses Base.GMP.MPZ.add!, which updates its first
argument in-place. As you can see sum2 allocates much less memory and is
faster. Additionally I have shown you that sum has essentially the same
performance. This suggests that sum has a special method for performing
summation of BigInt values. Indeed this is the case, the implementation of
this method can be found in base/gmp.jl file and is as follows:

sum(arr::Union{AbstractArray{BigInt}, Tuple{BigInt, Vararg{BigInt}}}) =
    foldl(MPZ.add!, arr; init=BigInt(0))

We can see that it uses an in-place add! function.

Note that both in sum2 and sum functions it was crucial to use a new
BigInt value as an accumulator. The point is that since add! mutates
it in-place we must not use any value stored in the source array for this
purpose.

Conclusions

I hope that you will find the presented example useful. As a final comment let
me highlight one convention.

In our codes the add! function has a ! suffix. This is a convention that
signals that it might mutate its arguments. This is indeed what happens both in
sum2 and sum functions to the accumulator value. However, sum2 and sum
functions do not need a ! in their names as in their implementation the passed
array is not mutated (only accumulator value that is created inside these
functions is mutated).