Category Archives: Julia

What is the shortest path to AlphaTensor?

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2022/10/21/distances.html

Introduction

Recently I have talked with my friend who teaches undergraduate algebra class.
We discussed how one can show usefulness of matrix multiplication to students.
At the same time a paper about AlphaTensor has been published in nature
showing how reinforcement learning was used to discover fast matrix
multiplication algorithms.

Trying to combine these two events I thought of a problem where multiplication
is much more expensive than addition. You might ask why? Because this condition
is required for algorithms like ones discovered by AlphaTensor to be indeed
beneficial. One natural candidate is of course multiplication of large
matrices, but I wanted something where multiplication of underlying elements of
a matrix is much more expensive than their addition. At the same time I wanted
a nice example for my friend.

After some thought I decided to use some graph theory. The problem I will write
about today is finding shortest paths between all pairs of vertices in a graph.
One of the standard algorithms that can be used to perform this task is
Floyd–Warshall. Interestingly, the path finding problem can be expressed
in terms of matrix multiplication, as is mentioned here. See e.g.
chapter 5.9 in “The Design and Analysis of Computer Algorithms”
for an introductory explanation.

In this post I will not go into all details of the algorithmic discussion (to
leave something for teachers of algebra classes, as it requires giving an
introduction to both linear algebra and group theory). Instead, I want to show
you, by example, how to compute shortest paths using matrix multiplication and
check that indeed we get the expected result. This means that the post today is
a bit harder than usual to follow every detail of what I do, but I hope you will
still enjoy the narrative.

The presented code was tested under Julia 1.8.2 and Graphs.jl 1.7.4.

Setting up the scene

Let us first generate some test graph and compute distances between all
its nodes using Floyd-Warshall algorithm. In the example we will use
a well known Zachary’s karate club graph.

julia> using Graphs

julia> g = smallgraph(:karate)
{34, 78} undirected simple Int64 graph

julia> connected_components(g)
1-element Vector{Vector{Int64}}:
 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  25, 26, 27, 28, 29, 30, 31, 32, 33, 34]

julia> g_distances = floyd_warshall_shortest_paths(g).dists
34×34 Matrix{Int64}:
 0  1  1  1  1  1  1  1  1  2  …  2  3  2  2  3  2  1  2  2
 1  0  1  1  2  2  2  1  2  2     3  3  2  2  3  1  2  2  2
 1  1  0  1  2  2  2  1  1  1     3  3  1  1  2  2  2  1  2
 1  1  1  0  2  2  2  1  2  2     3  3  2  2  3  2  2  2  2
 1  2  2  2  0  2  1  2  2  3     3  4  3  3  4  3  2  3  3
 1  2  2  2  2  0  1  2  2  3  …  3  4  3  3  4  3  2  3  3
 ⋮              ⋮              ⋱  ⋮              ⋮
 2  2  1  2  3  3  3  2  2  2     2  2  2  0  2  2  1  2  1
 3  3  2  3  4  4  4  3  2  2     2  1  2  2  0  2  2  1  1
 2  1  2  2  3  3  3  2  1  2  …  3  2  2  2  2  0  2  1  1
 1  2  2  2  2  2  2  2  2  2     1  2  2  1  2  2  0  1  1
 2  2  1  2  3  3  3  2  1  2     2  2  2  2  1  1  1  0  1
 2  2  2  2  3  3  3  3  1  1     2  1  1  1  1  1  1  1  0

julia> maximum(g_distances)
5

Let us look at the plot of the graph:

Zachary's karate club graph

Indeed, we can notice that the graph has one connected component and the biggest
distance between two nodes is 5, e.g. between nodes 16 and 17.

We will want now to reproduce the same distances using matrix multiplication
and next find the shortest paths also using matrix multiplication.

The first step: distances

To calculate distances let us define Distance type (I make it as simple
as possible to concentrate on matrix multiplication). I supply this type
with custom addition and multiplication definitions that are used for
distance calculation.

julia> struct Distance
           d::Float64
       end

julia> Base.:+(a::Distance, b::Distance) = Distance(min(a.d, b.d))

julia> Base.:*(a::Distance, b::Distance) = Distance(a.d + b.d)

julia> Base.zero(::Distance) = Distance(Inf)

julia> Base.show(io::IO, d::Distance) = show(io, d.d)

julia> Base.float(d::Distance) = d.d

We will start our computations with adjacency matrix, where we put Inf
for nodes that are not connected:

julia> n = nv(g)
34

julia> dm = Distance.([has_edge(g, i, j) ? 1 : i == j ? 0 : Inf
                       for i in 1:n, j in 1:n])
34×34 Matrix{Distance}:
 0.0  1.0  1.0  1.0  1.0  1.0  …  Inf  Inf  1.0  Inf  Inf
 1.0  0.0  1.0  1.0  Inf  Inf     Inf  1.0  Inf  Inf  Inf
 1.0  1.0  0.0  1.0  Inf  Inf     Inf  Inf  Inf  1.0  Inf
 1.0  1.0  1.0  0.0  Inf  Inf     Inf  Inf  Inf  Inf  Inf
 1.0  Inf  Inf  Inf  0.0  Inf     Inf  Inf  Inf  Inf  Inf
 1.0  Inf  Inf  Inf  Inf  0.0  …  Inf  Inf  Inf  Inf  Inf
 ⋮                        ⋮    ⋱       ⋮
 Inf  Inf  1.0  Inf  Inf  Inf     Inf  Inf  1.0  Inf  1.0
 Inf  Inf  Inf  Inf  Inf  Inf     0.0  Inf  Inf  1.0  1.0
 Inf  1.0  Inf  Inf  Inf  Inf  …  Inf  0.0  Inf  1.0  1.0
 1.0  Inf  Inf  Inf  Inf  Inf     Inf  Inf  0.0  1.0  1.0
 Inf  Inf  1.0  Inf  Inf  Inf     1.0  1.0  1.0  0.0  1.0
 Inf  Inf  Inf  Inf  Inf  Inf     1.0  1.0  1.0  1.0  0.0

In the implementation Inf means that at some point we have not yet found
any path between nodes so their distance is infinite.

We are now ready to solve the problem. Since we know that maximal distance
in the graph is five, we just can do:

julia> dm^5
34×34 Matrix{Distance}:
 0.0  1.0  1.0  1.0  1.0  1.0  …  3.0  2.0  1.0  2.0  2.0
 1.0  0.0  1.0  1.0  2.0  2.0     3.0  1.0  2.0  2.0  2.0
 1.0  1.0  0.0  1.0  2.0  2.0     2.0  2.0  2.0  1.0  2.0
 1.0  1.0  1.0  0.0  2.0  2.0     3.0  2.0  2.0  2.0  2.0
 1.0  2.0  2.0  2.0  0.0  2.0     4.0  3.0  2.0  3.0  3.0
 1.0  2.0  2.0  2.0  2.0  0.0  …  4.0  3.0  2.0  3.0  3.0
 ⋮                        ⋮    ⋱       ⋮
 2.0  2.0  1.0  2.0  3.0  3.0     2.0  2.0  1.0  2.0  1.0
 3.0  3.0  2.0  3.0  4.0  4.0     0.0  2.0  2.0  1.0  1.0
 2.0  1.0  2.0  2.0  3.0  3.0  …  2.0  0.0  2.0  1.0  1.0
 1.0  2.0  2.0  2.0  2.0  2.0     2.0  2.0  0.0  1.0  1.0
 2.0  2.0  1.0  2.0  3.0  3.0     1.0  1.0  1.0  0.0  1.0
 2.0  2.0  2.0  2.0  3.0  3.0     1.0  1.0  1.0  1.0  0.0

to get the required distances. Let us check:

julia> float.(dm^5) == g_distances
true

Also note that further multiplications of dm do not change anything:

julia> dm^5 == dm^6
true

However, clearly five multiplications are needed:

julia> dm^5 == dm^4
false

In this example addition is min and multiplication is +. Can we do better
and make a bigger difference in cost of these two operations? Yes, we can.
Let us, instead of calculating distances calculate shortest paths.

The final blow: shortest paths

Let us now use the same pattern to compute shortest paths. First define
Path type:

julia> struct Path
           p::Union{Nothing, Vector{Int}}
       end

julia> function Base.:+(a::Path, b::Path)
           isnothing(a.p) && return b
           isnothing(b.p) && return a
           return length(a.p) < length(b.p) ? a : b
       end

julia> function Base.:*(a::Path, b::Path)
           isnothing(a.p) && return Path(nothing)
           isnothing(b.p) && return Path(nothing)
           @assert last(a.p) == first(b.p)
           return Path([a.p; b.p[2:end]])
       end

julia> Base.zero(::Path) = Path(nothing)

julia> Base.show(io::IO, p::Path) = show(io, p.p)

julia> Distance(p::Path) = Distance(isnothing(p.p) ? Inf : length(p.p) - 1)

julia> Base.:(==)(a::Path, b::Path) = a.p == b.p

This time the design is a bit more complex. We denote non-existent path with
nothing. The plus is that clearly multiplication, which involves vector
concatenation, is significantly more expensive than addition (I know it could
be optimized, but I wanted to keep the code simple).

Let us check if indeed still matrix multiplication gives us the desired solution:

julia> pm = [has_edge(g, i, j) ? Path([i, j]) : Path(i == j ? [i] : nothing)
             for i in 1:n, j in 1:n]
34×34 Matrix{Path}:
 [1]      [1, 2]   [1, 3]   [1, 4]   …  nothing   nothing
 [2, 1]   [2]      [2, 3]   [2, 4]      nothing   nothing
 [3, 1]   [3, 2]   [3]      [3, 4]      [3, 33]   nothing
 [4, 1]   [4, 2]   [4, 3]   [4]         nothing   nothing
 [5, 1]   nothing  nothing  nothing     nothing   nothing
 [6, 1]   nothing  nothing  nothing  …  nothing   nothing
 ⋮                                   ⋱
 nothing  nothing  [29, 3]  nothing     nothing   [29, 34]
 nothing  nothing  nothing  nothing     [30, 33]  [30, 34]
 nothing  [31, 2]  nothing  nothing  …  [31, 33]  [31, 34]
 [32, 1]  nothing  nothing  nothing     [32, 33]  [32, 34]
 nothing  nothing  [33, 3]  nothing     [33]      [33, 34]
 nothing  nothing  nothing  nothing     [34, 33]  [34]

julia> pm5 = pm^5
34×34 Matrix{Path}:
 [1]              [1, 2]           …  [1, 32, 34]
 [2, 1]           [2]                 [2, 31, 34]
 [3, 1]           [3, 2]              [3, 33, 34]
 [4, 1]           [4, 2]              [4, 14, 34]
 [5, 1]           [5, 1, 2]           [5, 1, 32, 34]
 [6, 1]           [6, 1, 2]        …  [6, 1, 32, 34]
 ⋮                                 ⋱
 [29, 32, 1]      [29, 3, 2]          [29, 34]
 [30, 34, 32, 1]  [30, 34, 31, 2]     [30, 34]
 [31, 9, 1]       [31, 2]          …  [31, 34]
 [32, 1]          [32, 1, 2]          [32, 34]
 [33, 32, 1]      [33, 31, 2]         [33, 34]
 [34, 32, 1]      [34, 31, 2]         [34]

julia> Distance.(pm5) == dm^5
true

julia> Distance.(pm^6) == dm^5
true

julia> validate(g, p) = all(i -> has_path(g, p.p[i], p.p[i+1]), 1:length(p.p)-1)
validate (generic function with 1 method)

julia> all(p -> validate(g, p), pm5)
true

The validate function checks if indeed some Path is a path in our graph g.

Conclusions

So where is the AlphaTensor stuff in this post? Well – it turns out that it
was a click bait. Why cannot we apply AlphaTensor algorithm to our problem?
The issue is that it requires the set of values we operate on to be a
ring, while the non-standard addition and multiplication we have
defined provide only a tropical semiring. Therefore, standard matrix
multiplication algorithm works here, as it requires only addition and
multiplication. However, AlphaTensor algorithm requires subtraction.
So maybe it was not a click bait, but rather an example they one should always
check all assumptions of every theorem before using it?

Now let me comment a bit on a Julia side of the examples. I think it is amazing
that everything just worked if we defined custom types with a few extra methods.
Still, there is one thing that might worry you. What if matrix multiplication
algorithm implemented in Base Julia were changed to AlphaTensor or e.g.
Strassen variant (we all know that people behind linear algebra in
Julia are obsessed about performance)? I have a bad and a good news here. The
bad is that we would get an error (and would need to explicitly ask for a
standard multiplication algorithm). Why? Because when subtraction would be
attempted no method to perform it on Distance or Path type would be found.
So what is good news? Well, we would get an error and not an incorrect result,
so we would be explicitly informed that something went wrong.

You might ask why, by default, Julia uses the standard matrix multiplication
algorithm? The reason is what we discussed in the introduction. The faster
algorithms, like AlphaTensor or Strassen, give benefit only if
multiplication is much more expensive than addition. This in practice is
encountered for very large matrices. This is not a typical use case against
which default linear algebra algorithms in Julia were optimized.

I hope you enjoyed this post! Next week we will go back to discussion of
new features introduced in DataFrames.jl 1.4, so stay tuned.

GitHub Sponsors – the good, the bad, and the ugly

By: Miguel Raz Guzmán Macedo

Re-posted from: https://miguelraz.github.io/blog/githubsponsors/index.html

Saludos ?

Hola! Me llamo Raz y este es mi granito de arena para quienes quieren empezar con el software libre (o encontrar fuentes de ingreso en este mundo). Puedes ser mi sponsor en GitHub para que escriba más cosas como esta y contribuya al software libre de Julia/Rust y sus traducciones en español.

GitHub Sponsors

with apologies to Holge and Randy and David and Logan and Hector and…

Summary:

You can get money via donations and your GitHub profile. This can take a few hours to setup and be a good side revenue, or an emergency lifeboat. Depending on where you live, your marketing skills, larger community environment and a slew of other factors, this may or may not work for you. This post is an exercise in Your Mileage May Vary from someone who spawned the game of life in Easy Mode (white cis english speaking STEM focused able male), so take what you can and dismiss what doesn't apply.

Make no mistake: living from GitHub sponsorship means charity with extra steps, and with a multinational corporation (ultimately, Microsoft, Github's owner) as your uncaring and all powerful boss. Assume they will use power to control your "wage" to align with their interests and diversify your risks accordingly.

If you live in the Global South and this program can apply for you, please consider getting started NOW. If you live in the Global North, consider sponsoring devs in the Global South with monthly payments – foreign exchange rate makes them life changing. If you are giddy for how to get started, my recommendations for a setup are farther down in this post.

Who what when where why

GitHub sponsors is a program you can sign up for (free) on github.com and receive donations from other people for (mostly) working on open source stuff. If you were already doing open source stuff, you can setup a profile and people can choose to give you one-time or monthly donations (more on that later) at tiers you are free to specify – from 1USD a month to full 666USD Bezos level patronage.

Github sponsors is attractive in comparison to other platforms because, at time of writing, they do not charge a percentage on your sponsorships.

Patreon charges (at time of writing), 5%+, and other (main stream) platforms are not far off.

This is an attractive proposition, but you should investigate before you launch yourself into the "content creator" lifestyle.

  1. GitHub has yet, to my knowledge, bound itself legally into always respecting this financial agreement. It could start changing it's terms tomorrow, lest you obey some orders from corporate, as was the case recently with Twitch streamer's revenue sharing overnight change with …Amazon. That news was the most recent one off the top of my head, but every subscription/donation based platform can, has, and will unilaterally change your ~~wage split~~donations how they see fit.

  2. This doesn't apply to every country. At time of writing, 68 regions can qualify for sponsorships of which, sadly, I only count 4 African countries, for a "global" program that has been running for almost 3 years now. Mexico initially wasn't on the beta list (for which there was promo where GitHub would match donations, to my chagrin), but I joined the waitlist and received my email after a while. Note that Iran, Venezuela, Russia and CubaN

  • who i am

    • how i got started, my presence

    • marketing

  • who i am not

    • orgs, umbrella, collectives

So. I think I've had what I would call a "moderately" succesful run with my GitHub sponsors. A lot of people have asked me about tips and tricks, and I'm not the one for

  • what sponsors is

how it works

  • how to setup your profile

  • countries, bans, availability

  • taxes

  • Dos

  • public presence: famous code, repos, books, talks, online presence (pandemic podcasts, community building)*

  • big disclaimers, aka the don'ts:

  • don't assume you won't pay taxes.

  • uni might clash – grad students/scholarships may not work. dont knmow if amazon wish lists or payments in kind can be setup

  • make this yourplan A, at least forever.