Tag Archives: julialang

DataFrames.jl indexing rules

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2022/10/28/indexing.html

Introduction

In DataFrames.jl 1.4 release we reached the target state of data frame indexing
that we aimed for when we designed 1.0 release.

In this post I want to present you what mental model you should have when
thinking about indexing into a data frame. The reason why learning this is
important is that we needed to extend standard indexing rules that are defined
for arrays in Julia to cover all scenarios that users need when working with
data frames.

I will focus on working with a single column of a data frame as this is the most
common indexing scenario.

The code was tested under Julia 1.8.2 and DataFrames.jl 1.4.2.

Rule 1: indexing always requires passing both row and column selector

When you index into a data frame you must pass exactly two dimensions: a row
selector and a column selector like df[row_selector, column_selector].
When indexing you can think of a data frame as of a matrix.

Rule 2: all indexing that works on matrices works the same way for data frames

The benefit of the rule that data frame follows matrix indexing is that it is
easy to remember. If you know how matrix indexing works then all translates
directly to a data frame.

Here are some examples of extracting a column or its fragment from a data frame.

julia> using DataFrames

julia> df = DataFrame(a=1:3, b=4:6)
3×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1      4
   2 │     2      5
   3 │     3      6

julia> df[1, 1]
1

julia> df[1:2, 1]
2-element Vector{Int64}:
 1
 2

julia> df[:, 2]
3-element Vector{Int64}:
 4
 5
 6

(As I commented in the Introduction in this post I concentrate on getting a
single column from a data frame so the second index was always an integer.)

One important rule of Julia Base indexing is that when you get a column or its
part from a matrix a copy is made, except if you extract a single element. This
is exactly what happens for data frame: df[:, 2] is a copy of a second
column stored in df. You can check it by writing e.g.:

julia> df[:, 2] === eachcol(df)[2]
false

julia> df[:, 2] == eachcol(df)[2]
true

We see that we get the same data, but not the same object.

If you instead want a view of a column without copying the data, you can use
@view exactly like you would do it with matrices:

julia> @view df[1, 1]
0-dimensional view(::Vector{Int64}, 1) with eltype Int64:
1

julia> @view df[1:2, 1]
2-element view(::Vector{Int64}, 1:2) with eltype Int64:
 1
 2

julia> @view df[:, 2]
3-element view(::Vector{Int64}, :) with eltype Int64:
 4
 5
 6

As you can see, again all worked as if df were a matrix.

The same rules that worked for getting data from a data frame work for setting
data in a data frame. You have two options here: normal assignment and
broadcasted assignment.

julia> df[1:2, 1] = [11, 12]
2-element Vector{Int64}:
 11
 12

julia> df[:, 2] .= 100
3-element view(::Vector{Int64}, :) with eltype Int64:
 100
 100
 100

julia> df
3×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │    11    100
   2 │    12    100
   3 │     3    100

These operations, again, work exactly like for matrices. In particular they
are in-place, that is no memory is allocated when performing them. The data
is written into already allocated column. This rule is important as it means
that by such assignment you cannot change the element type of a column:

julia> df[:, 1] = ["a", "b", "c"]
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Int64

julia> df[:, 2] .= "x"
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Int64

In summary, standard array indexing works the same way for matrices and for data
frames.

In what follows I describe the extensions to the indexing rules that are
DataFrames.jl specific.

Rule 3: you can use strings or symbols to pass column names

The first extension is related to the fact that standard matrices have to be
indexed by integers. In DataFrames.jl you can alternatively use string or symbol
to select a column when indexing. Here are some basic examples:

julia> df[1:2, "a"]
2-element Vector{Int64}:
 11
 12

julia> df[:, :b] .= 1000
3-element view(::Vector{Int64}, :) with eltype Int64:
 1000
 1000
 1000

This is intuitive so far. However, this rule leads to one extension. It is
related to the fact that you can pass a column name that does not exist yet
in a data frame. In this case if you pass : as a column selector a new column
in a data frame will be allocated (that is copy of the source data will be
performed) and the new column will be added at the end of the data frame:

julia> df[:, "d"] = [-1, -2, -3]
3-element Vector{Int64}:
 -1
 -2
 -3

julia> df
3×3 DataFrame
 Row │ a      b      d
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │    11   1000     -1
   2 │    12   1000     -2
   3 │     3   1000     -3

This rule has two additional special cases. The first is when a data frame
has no columns yet. In such a situation you can add any vector to a data frame:

julia> df2 = DataFrame()
0×0 DataFrame

julia> df2[:, :x] = [1, 2]
2-element Vector{Int64}:
 1
 2

julia> df2
2×1 DataFrame
 Row │ x
     │ Int64
─────┼───────
   1 │     1
   2 │     2

What is non-standard with this rule? We allow changing the number of rows
in a data frame from zero to whatever is needed. For a data frame having some
columns changing their number of rows is not allowed with indexing.

The second special case is when you try to create a new column in a view of
a data frame. It is not allowed in general, but if the view was created with
: as a column selector we accept it (the reason is that in this case view
subsets only rows, but does not change columns; this is a common use case
in practice). Here is an example:

julia> df
3×3 DataFrame
 Row │ a      b      d
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │    11   1000     -1
   2 │    12   1000     -2
   3 │     3   1000     -3

julia> dfv = @view df[[3, 1], :]
2×3 SubDataFrame
 Row │ a      b      d
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     3   1000     -3
   2 │    11   1000     -1

julia> dfv[:, :e] = [1, 2]
2-element Vector{Int64}:
 1
 2

julia> dfv
2×4 SubDataFrame
 Row │ a      b      d      e
     │ Int64  Int64  Int64  Int64?
─────┼─────────────────────────────
   1 │     3   1000     -3       1
   2 │    11   1000     -1       2

julia> df
3×4 DataFrame
 Row │ a      b      d      e
     │ Int64  Int64  Int64  Int64?
─────┼──────────────────────────────
   1 │    11   1000     -1        2
   2 │    12   1000     -2  missing
   3 │     3   1000     -3        1

As you can see in this case a new column gets missing for rows that are not
present in the dfv data frame.

Rule 4: You can use Not to negate row selection

This is a pretty simple rule. Instead of selecting rows, you can use Not
to say which rows you want to drop when indexing:

julia> df
3×4 DataFrame
 Row │ a      b      d      e
     │ Int64  Int64  Int64  Int64?
─────┼──────────────────────────────
   1 │    11   1000     -1        2
   2 │    12   1000     -2  missing
   3 │     3   1000     -3        1

julia> df[Not(2), :d]
2-element Vector{Int64}:
 -1
 -3

Rule 5: There is a special ! row selector

The special ! row selector selects all rows similarly to :, but it has
a different behavior in relation to column allocation.

When extracting the data from a data frame if you use ! no copying of data is
made. Instead, you just get a column as it is stored in the source data frame
(or an appropriate view if you work with a view of a data frame):

julia> df[!, 1]
3-element Vector{Int64}:
 11
 12
  3

julia> df[!, 1] === eachcol(df)[1]
true

julia> dfv[!, 1]
2-element view(::Vector{Int64}, [3, 1]) with eltype Int64:
  3
 11

The reason why ! is allowed for when extracting a column from a data frame
is performance. In general it is not safe to use !. You should prefer :
as copying data (in R like style) is safer. However, in some cases, when your
operations are memory-bound or you need to maximize performance, you have !
at hand to avoid unnecessary operations.

For writing data into a data frame if you use ! selector it has also a
different behavior than :. Recall, that : is always in-place. Instead
! stores a fresh column in a data frame.

julia> df
3×4 DataFrame
 Row │ a      b      d      e
     │ Int64  Int64  Int64  Int64?
─────┼──────────────────────────────
   1 │    11   1000     -1        2
   2 │    12   1000     -2  missing
   3 │     3   1000     -3        1

julia> df[!, "a"] = ["a", "b", "c"]
3-element Vector{String}:
 "a"
 "b"
 "c"

julia> dfv[!, "e"] .= "x"
2-element Vector{String}:
 "x"
 "x"

julia> df
3×4 DataFrame
 Row │ a       b      d      e
     │ String  Int64  Int64  Any
─────┼───────────────────────────────
   1 │ a        1000     -1  x
   2 │ b        1000     -2  missing
   3 │ c        1000     -3  x

How the fresh column is stored depends on the type of data frame you use and the
type of operation:

  • if you use standard assignment on a data frame (df[!, "a"] = ["a", "b", "c"])
    then source vector is not copied, but is stored as-is (so this kind of storage
    is the fastest way to store a column in a data frame).
  • if you use broadcasted assignment (df[!, "a"] .= ["a", "b", "c"]) or assign
    into a view of a data frame (dfv[!, "a"] = ["x", "x"]) then a new column is
    freshly allocated (this kind of behavior is needed if you want to change
    element type of the column already present in a data frame, but want a copy
    of the source vector to be made for safety reasons).

So for assignment with ! the basic rule to remember is that it replaces the
existing column in a data frame. Then the additional rule is that normal
assignment on a data frame is non-copying, and broadcasted assignment or
using a view of a data frame allocates a copy.

These rules were designed to cover all possible scenarios that users might need
when working with columns of a data frame.

Rule 6: property access works the same as if you used ! as row selector

With this rule we are getting to the point that changed in DataFrames.jl 1.4
(but was planned and announced much earlier).

When you write df.a it is always treated the same as df[!, "a"], for
extracting a column from a data frame, for assignment, and for broadcasted
assignment.

Since DataFrames.jl 1.4 this is all you need to remember about property access
to a data frame. We decided on this rule as it is easy to remember and it does
not add any new concepts or exceptions for users to learn.

However, unfortunately, this is the place where we had to deviate from how
property access works in Base Julia. Let me explain the issue step by step.

If you write df.a = ["a", "b", "c"] then you expect that column a in df
data frame is replaced by ["a", "b", "c"]. And this is what happens. Recall
that this is what df[!, "a"] = ["a", "b", "c"] also does.

Now this means that, if we want df.a .= ["a", "b", "c"] to work the same as
df[!, "a"] .= ["a", "b", "c"], then both operations replace column a in df.

And here we have a slight inconsistency, as in Base Julia users would expect
that df.a .= ["a", "b", "c"] would work in-place, like
df[:, "a"] .= ["a", "b", "c"]. This is not the case.

Therefore you need to remember that in DataFrames.jl if you use property access
syntax for setting data it always replaces the colum: both when doing assignment
and broadcasted assignment.

The reason why we decided on this design is twofold. First is learning. We have
a simple rule: ! selector and property access work the same way always.
The second reason is convenience. Most users preferred the following operation:

julia> df3 = DataFrame(c=1:3)
3×1 DataFrame
 Row │ c
     │ Int64
─────┼───────
   1 │     1
   2 │     2
   3 │     3

julia> df3.c .= 'x'
3-element Vector{Char}:
 'x': ASCII/Unicode U+0078 (category Ll: Letter, lowercase)
 'x': ASCII/Unicode U+0078 (category Ll: Letter, lowercase)
 'x': ASCII/Unicode U+0078 (category Ll: Letter, lowercase)

julia> df3
3×1 DataFrame
 Row │ c
     │ Char
─────┼──────
   1 │ x
   2 │ x
   3 │ x

to replace column c with a vector of 'x' values. Note that instead if you
use : as row selector you get a bit surprising result:

julia> df3 = DataFrame(c=1:3)
3×1 DataFrame
 Row │ c
     │ Int64
─────┼───────
   1 │     1
   2 │     2
   3 │     3

julia> df3[:, "c"] .= 'x'
3-element view(::Vector{Int64}, :) with eltype Int64:
 120
 120
 120

julia> df3
3×1 DataFrame
 Row │ c
     │ Int64
─────┼───────
   1 │   120
   2 │   120
   3 │   120

The reason of this behavior is that Char is silently converted to Int:

julia> convert(Int, 'x')
120

What is the general benefit of the behavior we adopted? When you write
df[!, "a"] .= 'x' or df.a .= 'x' you are sure that as a result you will
have a freshly allocated column "a" in df containing 'x' values as all
its elements, independent of the fact if column "a" was already present
in the data frame or not. So this is like git push --force operation.
It is guaranteed to succeed no matter what the starting state of the data
frame was (of course assuming that the object on the right hand side supports
broadcasting and has proper dimensions).

Why has it taken us so long to reach this state?

The reason why we landed with these rules only in DataFrames.jl 1.4 is that
before Julia 1.7 it was not possible to support the behavior we wanted.
Therefore we had to wait till Julia 1.7 to be released to achieve consistency
between property access and ! row selector behavior in DataFrames.jl in all
cases.

So people porting code from DataFrames.jl earlier than 1.4 or from Julia earlier
than 1.7 will notice a change that previously df.col .= value was in-place,
and currently it allocates a new column. We have considered the risks that this
change will be problematic for users, and the conclusion was that:

  • In a vast majority of code this change will not affect the result and users
    will not even notice this change.
  • It might affect the code that is performance critical (as an extra copy is now
    made). However, if this is the case it is easy to fix the issue (and it will
    not cause the code to be broken).
  • It might affect the code that relied on the fact that previously a conversion
    during in-place assignment was made (and e.g. relied on the fact that such
    assignment does not change element type of the column). In this case such code
    would indeed be broken. In this situation the conclusion was that such cases,
    although possible in practice, are most likely quite rare and if present would
    affect only experienced Julia users who have a good grasp of conversion upon
    assignment rules in Base Julia. We concluded that such users would be able
    to identify the problematic cases and change df.col .= value to
    df[:, :col] .= value in their code to get back the old behavior.

Conclusions

I hope you found this post useful in building your understanding how
the details of indexing in DataFrames.jl work and what are their intended
use cases.

Additionally, wanted to present you the mental process we went through when we
were making hard design decisions in DataFrames.jl development team. In this
process we had to balance three things:

  • user convenience (especially taking into account the fact that target audience
    of DataFrames.jl are data scientists, who sometimes are not computer science
    experts);
  • internal (within DataFrames.jl) consistency of rules that one needs to learn;
  • minimization of cases when we diverge from what Base Julia defines for similar
    objects (the challenge was that data frame is neither a matrix nor a struct,
    and has different design requirements, but we borrow the syntax from both).

In this post I have not covered all the details of DataFrames.jl indexing rules.
If you want to learn about the indexing behavior every supported scenario please
check the Indexing section of the DataFrames.jl manual.

What is the shortest path to AlphaTensor?

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2022/10/21/distances.html

Introduction

Recently I have talked with my friend who teaches undergraduate algebra class.
We discussed how one can show usefulness of matrix multiplication to students.
At the same time a paper about AlphaTensor has been published in nature
showing how reinforcement learning was used to discover fast matrix
multiplication algorithms.

Trying to combine these two events I thought of a problem where multiplication
is much more expensive than addition. You might ask why? Because this condition
is required for algorithms like ones discovered by AlphaTensor to be indeed
beneficial. One natural candidate is of course multiplication of large
matrices, but I wanted something where multiplication of underlying elements of
a matrix is much more expensive than their addition. At the same time I wanted
a nice example for my friend.

After some thought I decided to use some graph theory. The problem I will write
about today is finding shortest paths between all pairs of vertices in a graph.
One of the standard algorithms that can be used to perform this task is
Floyd–Warshall. Interestingly, the path finding problem can be expressed
in terms of matrix multiplication, as is mentioned here. See e.g.
chapter 5.9 in “The Design and Analysis of Computer Algorithms”
for an introductory explanation.

In this post I will not go into all details of the algorithmic discussion (to
leave something for teachers of algebra classes, as it requires giving an
introduction to both linear algebra and group theory). Instead, I want to show
you, by example, how to compute shortest paths using matrix multiplication and
check that indeed we get the expected result. This means that the post today is
a bit harder than usual to follow every detail of what I do, but I hope you will
still enjoy the narrative.

The presented code was tested under Julia 1.8.2 and Graphs.jl 1.7.4.

Setting up the scene

Let us first generate some test graph and compute distances between all
its nodes using Floyd-Warshall algorithm. In the example we will use
a well known Zachary’s karate club graph.

julia> using Graphs

julia> g = smallgraph(:karate)
{34, 78} undirected simple Int64 graph

julia> connected_components(g)
1-element Vector{Vector{Int64}}:
 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  25, 26, 27, 28, 29, 30, 31, 32, 33, 34]

julia> g_distances = floyd_warshall_shortest_paths(g).dists
34×34 Matrix{Int64}:
 0  1  1  1  1  1  1  1  1  2  …  2  3  2  2  3  2  1  2  2
 1  0  1  1  2  2  2  1  2  2     3  3  2  2  3  1  2  2  2
 1  1  0  1  2  2  2  1  1  1     3  3  1  1  2  2  2  1  2
 1  1  1  0  2  2  2  1  2  2     3  3  2  2  3  2  2  2  2
 1  2  2  2  0  2  1  2  2  3     3  4  3  3  4  3  2  3  3
 1  2  2  2  2  0  1  2  2  3  …  3  4  3  3  4  3  2  3  3
 ⋮              ⋮              ⋱  ⋮              ⋮
 2  2  1  2  3  3  3  2  2  2     2  2  2  0  2  2  1  2  1
 3  3  2  3  4  4  4  3  2  2     2  1  2  2  0  2  2  1  1
 2  1  2  2  3  3  3  2  1  2  …  3  2  2  2  2  0  2  1  1
 1  2  2  2  2  2  2  2  2  2     1  2  2  1  2  2  0  1  1
 2  2  1  2  3  3  3  2  1  2     2  2  2  2  1  1  1  0  1
 2  2  2  2  3  3  3  3  1  1     2  1  1  1  1  1  1  1  0

julia> maximum(g_distances)
5

Let us look at the plot of the graph:

Zachary's karate club graph

Indeed, we can notice that the graph has one connected component and the biggest
distance between two nodes is 5, e.g. between nodes 16 and 17.

We will want now to reproduce the same distances using matrix multiplication
and next find the shortest paths also using matrix multiplication.

The first step: distances

To calculate distances let us define Distance type (I make it as simple
as possible to concentrate on matrix multiplication). I supply this type
with custom addition and multiplication definitions that are used for
distance calculation.

julia> struct Distance
           d::Float64
       end

julia> Base.:+(a::Distance, b::Distance) = Distance(min(a.d, b.d))

julia> Base.:*(a::Distance, b::Distance) = Distance(a.d + b.d)

julia> Base.zero(::Distance) = Distance(Inf)

julia> Base.show(io::IO, d::Distance) = show(io, d.d)

julia> Base.float(d::Distance) = d.d

We will start our computations with adjacency matrix, where we put Inf
for nodes that are not connected:

julia> n = nv(g)
34

julia> dm = Distance.([has_edge(g, i, j) ? 1 : i == j ? 0 : Inf
                       for i in 1:n, j in 1:n])
34×34 Matrix{Distance}:
 0.0  1.0  1.0  1.0  1.0  1.0  …  Inf  Inf  1.0  Inf  Inf
 1.0  0.0  1.0  1.0  Inf  Inf     Inf  1.0  Inf  Inf  Inf
 1.0  1.0  0.0  1.0  Inf  Inf     Inf  Inf  Inf  1.0  Inf
 1.0  1.0  1.0  0.0  Inf  Inf     Inf  Inf  Inf  Inf  Inf
 1.0  Inf  Inf  Inf  0.0  Inf     Inf  Inf  Inf  Inf  Inf
 1.0  Inf  Inf  Inf  Inf  0.0  …  Inf  Inf  Inf  Inf  Inf
 ⋮                        ⋮    ⋱       ⋮
 Inf  Inf  1.0  Inf  Inf  Inf     Inf  Inf  1.0  Inf  1.0
 Inf  Inf  Inf  Inf  Inf  Inf     0.0  Inf  Inf  1.0  1.0
 Inf  1.0  Inf  Inf  Inf  Inf  …  Inf  0.0  Inf  1.0  1.0
 1.0  Inf  Inf  Inf  Inf  Inf     Inf  Inf  0.0  1.0  1.0
 Inf  Inf  1.0  Inf  Inf  Inf     1.0  1.0  1.0  0.0  1.0
 Inf  Inf  Inf  Inf  Inf  Inf     1.0  1.0  1.0  1.0  0.0

In the implementation Inf means that at some point we have not yet found
any path between nodes so their distance is infinite.

We are now ready to solve the problem. Since we know that maximal distance
in the graph is five, we just can do:

julia> dm^5
34×34 Matrix{Distance}:
 0.0  1.0  1.0  1.0  1.0  1.0  …  3.0  2.0  1.0  2.0  2.0
 1.0  0.0  1.0  1.0  2.0  2.0     3.0  1.0  2.0  2.0  2.0
 1.0  1.0  0.0  1.0  2.0  2.0     2.0  2.0  2.0  1.0  2.0
 1.0  1.0  1.0  0.0  2.0  2.0     3.0  2.0  2.0  2.0  2.0
 1.0  2.0  2.0  2.0  0.0  2.0     4.0  3.0  2.0  3.0  3.0
 1.0  2.0  2.0  2.0  2.0  0.0  …  4.0  3.0  2.0  3.0  3.0
 ⋮                        ⋮    ⋱       ⋮
 2.0  2.0  1.0  2.0  3.0  3.0     2.0  2.0  1.0  2.0  1.0
 3.0  3.0  2.0  3.0  4.0  4.0     0.0  2.0  2.0  1.0  1.0
 2.0  1.0  2.0  2.0  3.0  3.0  …  2.0  0.0  2.0  1.0  1.0
 1.0  2.0  2.0  2.0  2.0  2.0     2.0  2.0  0.0  1.0  1.0
 2.0  2.0  1.0  2.0  3.0  3.0     1.0  1.0  1.0  0.0  1.0
 2.0  2.0  2.0  2.0  3.0  3.0     1.0  1.0  1.0  1.0  0.0

to get the required distances. Let us check:

julia> float.(dm^5) == g_distances
true

Also note that further multiplications of dm do not change anything:

julia> dm^5 == dm^6
true

However, clearly five multiplications are needed:

julia> dm^5 == dm^4
false

In this example addition is min and multiplication is +. Can we do better
and make a bigger difference in cost of these two operations? Yes, we can.
Let us, instead of calculating distances calculate shortest paths.

The final blow: shortest paths

Let us now use the same pattern to compute shortest paths. First define
Path type:

julia> struct Path
           p::Union{Nothing, Vector{Int}}
       end

julia> function Base.:+(a::Path, b::Path)
           isnothing(a.p) && return b
           isnothing(b.p) && return a
           return length(a.p) < length(b.p) ? a : b
       end

julia> function Base.:*(a::Path, b::Path)
           isnothing(a.p) && return Path(nothing)
           isnothing(b.p) && return Path(nothing)
           @assert last(a.p) == first(b.p)
           return Path([a.p; b.p[2:end]])
       end

julia> Base.zero(::Path) = Path(nothing)

julia> Base.show(io::IO, p::Path) = show(io, p.p)

julia> Distance(p::Path) = Distance(isnothing(p.p) ? Inf : length(p.p) - 1)

julia> Base.:(==)(a::Path, b::Path) = a.p == b.p

This time the design is a bit more complex. We denote non-existent path with
nothing. The plus is that clearly multiplication, which involves vector
concatenation, is significantly more expensive than addition (I know it could
be optimized, but I wanted to keep the code simple).

Let us check if indeed still matrix multiplication gives us the desired solution:

julia> pm = [has_edge(g, i, j) ? Path([i, j]) : Path(i == j ? [i] : nothing)
             for i in 1:n, j in 1:n]
34×34 Matrix{Path}:
 [1]      [1, 2]   [1, 3]   [1, 4]   …  nothing   nothing
 [2, 1]   [2]      [2, 3]   [2, 4]      nothing   nothing
 [3, 1]   [3, 2]   [3]      [3, 4]      [3, 33]   nothing
 [4, 1]   [4, 2]   [4, 3]   [4]         nothing   nothing
 [5, 1]   nothing  nothing  nothing     nothing   nothing
 [6, 1]   nothing  nothing  nothing  …  nothing   nothing
 ⋮                                   ⋱
 nothing  nothing  [29, 3]  nothing     nothing   [29, 34]
 nothing  nothing  nothing  nothing     [30, 33]  [30, 34]
 nothing  [31, 2]  nothing  nothing  …  [31, 33]  [31, 34]
 [32, 1]  nothing  nothing  nothing     [32, 33]  [32, 34]
 nothing  nothing  [33, 3]  nothing     [33]      [33, 34]
 nothing  nothing  nothing  nothing     [34, 33]  [34]

julia> pm5 = pm^5
34×34 Matrix{Path}:
 [1]              [1, 2]           …  [1, 32, 34]
 [2, 1]           [2]                 [2, 31, 34]
 [3, 1]           [3, 2]              [3, 33, 34]
 [4, 1]           [4, 2]              [4, 14, 34]
 [5, 1]           [5, 1, 2]           [5, 1, 32, 34]
 [6, 1]           [6, 1, 2]        …  [6, 1, 32, 34]
 ⋮                                 ⋱
 [29, 32, 1]      [29, 3, 2]          [29, 34]
 [30, 34, 32, 1]  [30, 34, 31, 2]     [30, 34]
 [31, 9, 1]       [31, 2]          …  [31, 34]
 [32, 1]          [32, 1, 2]          [32, 34]
 [33, 32, 1]      [33, 31, 2]         [33, 34]
 [34, 32, 1]      [34, 31, 2]         [34]

julia> Distance.(pm5) == dm^5
true

julia> Distance.(pm^6) == dm^5
true

julia> validate(g, p) = all(i -> has_path(g, p.p[i], p.p[i+1]), 1:length(p.p)-1)
validate (generic function with 1 method)

julia> all(p -> validate(g, p), pm5)
true

The validate function checks if indeed some Path is a path in our graph g.

Conclusions

So where is the AlphaTensor stuff in this post? Well – it turns out that it
was a click bait. Why cannot we apply AlphaTensor algorithm to our problem?
The issue is that it requires the set of values we operate on to be a
ring, while the non-standard addition and multiplication we have
defined provide only a tropical semiring. Therefore, standard matrix
multiplication algorithm works here, as it requires only addition and
multiplication. However, AlphaTensor algorithm requires subtraction.
So maybe it was not a click bait, but rather an example they one should always
check all assumptions of every theorem before using it?

Now let me comment a bit on a Julia side of the examples. I think it is amazing
that everything just worked if we defined custom types with a few extra methods.
Still, there is one thing that might worry you. What if matrix multiplication
algorithm implemented in Base Julia were changed to AlphaTensor or e.g.
Strassen variant (we all know that people behind linear algebra in
Julia are obsessed about performance)? I have a bad and a good news here. The
bad is that we would get an error (and would need to explicitly ask for a
standard multiplication algorithm). Why? Because when subtraction would be
attempted no method to perform it on Distance or Path type would be found.
So what is good news? Well, we would get an error and not an incorrect result,
so we would be explicitly informed that something went wrong.

You might ask why, by default, Julia uses the standard matrix multiplication
algorithm? The reason is what we discussed in the introduction. The faster
algorithms, like AlphaTensor or Strassen, give benefit only if
multiplication is much more expensive than addition. This in practice is
encountered for very large matrices. This is not a typical use case against
which default linear algebra algorithms in Julia were optimized.

I hope you enjoyed this post! Next week we will go back to discussion of
new features introduced in DataFrames.jl 1.4, so stay tuned.