Category Archives: Julia

Element type surprises when processing collections in Julia

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2023/04/07/narrowing.html

Introduction

Today I want to write about a topic that is a quite tricky
element of design of Julia. The issue is that it is sometimes hard
to predict the element type of the output collection produced by
an operation that transforms an input collection.

The description above looks complicated, but the problem is
encountered in practice, so let me explain it by example.

The post was written under Julia 1.9.0-rc1.

A basic example

Assume you have some input collection [1, 2, 3] and you
want to compute square root of all its elements.

Let us consider three standard ways how you can do it:

julia> x = [1, 2, 3]
3-element Vector{Int64}:
 1
 2
 3

julia> sqrt.(x)
3-element Vector{Float64}:
 1.0
 1.4142135623730951
 1.7320508075688772

julia> map(sqrt, x)
3-element Vector{Float64}:
 1.0
 1.4142135623730951
 1.7320508075688772

julia> [sqrt(v) for v in x]
3-element Vector{Float64}:
 1.0
 1.4142135623730951
 1.7320508075688772

As you can see in each case a proper element type, that is
Float64, was determined for the returned collection.

This behavior is useful, as the user does not have to think
about specifying the output element type. In fact,
in combination with the transformation using the identity function,
this behavior can be used to conveniently narrow
down element type of some collection:

julia> y = Any[1, 2, 3]
3-element Vector{Any}:
 1
 2
 3

julia> identity.(y)
3-element Vector{Int64}:
 1
 2
 3

julia> map(identity, y)
3-element Vector{Int64}:
 1
 2
 3

julia> [v for v in y]
3-element Vector{Int64}:
 1
 2
 3

This pattern comes handy if we have input data that does not have a known
element type, but later we want to perform element type narrowing
when processing it (one of the major benefits of such narrowing
is that processing vectors of Any values is slow so we typically want to avoid it).

The hard case

Automatic output element type detection works nice most of the time.
Unfortunately, when we work with empty collections, it becomes hard to predict.
Here is a simple example:

julia> String.([])
Any[]

julia> map(String, [])
String[]

julia> [String(v) for v in []]
Any[]

julia> string.([])
AbstractString[]

julia> map(string, [])
Any[]

julia> [string(v) for v in []]
Any[]

As you can see from it broadcating, map, and comprehension use a different set of
rules to automatically determine the produced element type. These rules of course
exist and could be learned, but the point is that the issue is non-trivial.

The problem is that when you are writing production code
(e.g. you are developing a package) you want to be sure
what the element type of the collection you produce will be, as often
you cannot know upfront if the input collection the user is going to provide
will be empty or not.

The solution I use

In situations when it matters what the element type of the collection
produced by some transformation is going to be I use comprehensions
with output element type annotation:

julia> [string(v) for v in []]
Any[]

julia> String[string(v) for v in []]
String[]

Such annotation has an additional consequence that it is going to perform
conversion of the produced elements to the target type if needed:

julia> using Test

julia> s = ["a", GenericString("a")]
2-element Vector{AbstractString}:
 "a"
 "a"

julia> [string(v) for v in s]
2-element Vector{AbstractString}:
 "a"
 "a"

julia> typeof.([string(v) for v in s])
2-element Vector{DataType}:
 String
 GenericString

julia> String[string(v) for v in s]
2-element Vector{String}:
 "a"
 "a"

julia> typeof.(String[string(v) for v in s])
2-element Vector{DataType}:
 String
 String

Note that in the example prefixing the comprehension with
String made sure that the result of the operation has
String element type and all produced values have this type.

Element type widening

Let me comment on one common related operation. What if we
want to initialize some container with a given value but
we want its element type to be wider? This is not an artificial
case – it often happens with missing (where we initialize
some container with this value only to later replace missing with proper
values).

Using the fill function is a first thing we might try:

julia> fill(missing, 3)
3-element Vector{Missing}:
 missing
 missing
 missing

However, the produced container has Missing element type which
is not useful if we e.g. wanted to later also store integers in it.

One can use a comprehension annotated with a proper output element type
instead:

julia> Union{Int, Missing}[missing for _ in 1:3]
3-element Vector{Union{Missing, Int64}}:
 missing
 missing
 missing

The pattern with missing is needed often enough that we have
a custom function in the Missings.jl package that can be used
to get the desired result more conveniently:

julia> using Missings

julia> missings(Int, 3)
3-element Vector{Union{Missing, Int64}}:
 missing
 missing
 missing

Conclusions

Fortunately, in interactive use the problem with setting
of the proper element type for some collection does not
occur often. However, when I write production programs I make
sure to always think if I need to use comprehension with
element type specification to ensure type stability of my code.

Polyglot Exploration of Function Overloading

By: Jonathan Carroll

Re-posted from: https://jcarroll.com.au/2023/04/03/polyglot-overloading/

I’ve been working my way through Exercism exercises in a variety of
languages because I strongly believe every language you learn something about teaches
you about all the others you know, and makes for useful comparisons between what
features they offer. I was1 Learning Me a Haskell for Great Good
(there’s a guide/book by that name) and something about Pattern Matching
just seemed extremely familiar.

Pattern Matching is sort of like a case statement, but rather than just comparing literal
values against some enum, it takes into consideration how the input “looks”. A simple example
is to match against either an empty list [] (just that; an empty list) or a non-empty list denoted
(x:xs). In Haskell, : is a concatenation operator (cons in lisp) so this is the concatenation
of x and the rest of a list, xs. The wildcard pattern _ matching “whatever”.

A map function definition (from here) is then

map _ []     = []
map f (x:xs) = f x : map f xs

This is two definitions for map, depending on which pattern is provided as the two arguments. The first
takes “whatever” (doesn’t matter, is ignored) and an empty list and just returns an empty list. The
second takes some function f and a non-empty list, and concatenates (:) (f x) (the first
element of the list x provided to the function f) with map f xs (the result of providing f and the
rest of the list xs to map, recursively).

Since Haskell is strongly typed, I don’t think this can be used to define the same named function for
different types, but it can certainly do something different depending on the pattern of the data.
In this example, if the argument is an empty list, return 0; if the argument is a length-1 list (arg1
concatenated with an empty list) then return arg1 * 100, and if the argument is a longer list, return
the product of the first element and the second. This then prints out calling fun 5.0 and fun [5.0, 5.0]

fun :: [Float] -> Float
fun [] = 0.0
fun (arg1:[]) = arg1 * 100.0
fun (arg1:arg2) = arg1 * (head arg2)

main = do
  print (fun [5.0])
  print (fun [5.0, 5.0])
500.0
25.0

Woo! A different function called depending on the input. I believe it might be possible to actually
have optional arguments via the Data.Maybe package but I couldn’t get it to compile an example the way
I wanted2.

Rust has something similar but more specific to a case statement; a match expression
can take patterns as options and return whichever matches (example from here)

fn main() {
    let input = 's';

    match input {
        'q'                   => println!("Quitting"),
        'a' | 's' | 'w' | 'd' => println!("Moving around"),
        '0'..='9'             => println!("Number input"),
        _                     => println!("Something else"),
    }
}
Moving around

Another common use of match is to switch between the enums Some and None
or Ok and Err (see here).

The familiarity of the Haskell pattern matching / function definition took me back to one of the
very first programming ‘tricks’ I learned way back in the late 2000’s working on my PhD, using Fortran;
“function overloading”. I wasn’t formally taught programming at all (an oversight, given how important
it became to doing my research), so I just had to pick up bits and pieces from people who knew more.

I had a bunch of integration routines3 which were slightly different depending on whether or not
the limits were finite4, so I had to call
the right one with various if statements. The ‘trick’ I was
taught was to use INTERFACE / MODULE PROCEDURE blocks to “dispatch” depending on the function
signature, or at least the number of arguments. This meant that I could just call integrate regardless of
whether it was a signature with 4 arguments, or a signature with an additional argument if a bound was Infty.

A “small” (Fortran is hardly economical with page real-estate) example of this,
following the Haskell example, defines two functions Fun1arg and Fun2arg which
can be consolidated into fun with the INTERFACE block. Calling fun(x) or fun(x, y) is
routed to the function with the relevant signature.

MODULE exampleDispatch
  IMPLICIT NONE

  INTERFACE fun
     MODULE PROCEDURE Fun1arg, Fun2arg
  END INTERFACE fun

  CONTAINS

    ! A function that takes one argument
    ! and multiplies it by 100
    REAL FUNCTION Fun1arg(arg1)
      IMPLICIT NONE
      REAL, INTENT( IN ) :: arg1
      Fun1arg = arg1 * 100.0
    END FUNCTION Fun1arg

    ! A function that takes two arguments
    ! and multiplies them
    REAL FUNCTION Fun2arg(arg1, arg2)
      IMPLICIT NONE
      REAL, INTENT( IN ) :: arg1, arg2
      Fun2arg = arg1 * arg2
    END FUNCTION Fun2arg

END MODULE exampleDispatch

PROGRAM dispatch

  USE exampleDispatch

  IMPLICIT NONE
  REAL :: a = 5.0
  REAL :: fun

  PRINT *, fun(a)
  PRINT *, fun(a, a)

END PROGRAM dispatch
   500.000000    
   25.0000000

That takes me back! I’m going to dig out my old research code and get it into GitHub for
posterity. I’m also going to do the Fortran exercises in Exercism to
reminisce some more.

So, not quite the same as the Haskell version, but it got me thinking about dispatch. R has
several approaches. The most common is S3 in which dispatch occurs based on the class
of the first argument to a function, so you can have something different happen to a data.frame
argument and a tibble argument, but in both cases the signature has the same “shape” – only the
types vary.

Wiring that up to work differently with a list and any other value (the default case, which
would break for anything that doesn’t vectorize, but it’s a toy example) looks like

fun <- function(x) {
  UseMethod("fun")
}

fun.default <- function(x) { 
  x * 100
}

fun.list <- function(x) {
  x[[1]] * x[[2]]
}

fun(5)
fun(list(5, 5))
[1] 500
[1] 25

Another option is to use S4 which is more complicated but more powerful. Here, dispatch
can occur based on the entire signature, though (and I may be wrong) I believe that, too, still
needs to have a consistent “shape”. A fantastic guide to S4 is Stuart Lee’s post here.

A S4 version of my example could have two options for the signature; one where both
x and y are "numeric", and another where y is "missing". "ANY" would also work and
encompass a wider scope.

setGeneric("fun", function(x, y, ...) standardGeneric("fun"))

setMethod("fun", c("numeric", "missing"), function(x, y) {
  x * 100
})

setMethod("fun", c("numeric", "numeric"), function(x, y) {
  x * y
})

fun(5)
fun(5, 5)
[1] 500
[1] 25

So, can we ever do what I was originally inspired to do – write a simple definition of a
function that calculates differently depending on the number of arguments? Aha – Julia to
the rescue!! Julia has a beautifully simple syntax for defining methods on signatures:
just write it out!

fun(x) = x * 100
fun(x, y) = x * y

println(fun(5))
println(fun(5, 5))
500
25

That’s two different signatures for fun computing different things, and a lot less
boilerplate compared to the other languages, especially Fortran. What’s written above
is the entire script. You can even go further
and be specific about the types, say, mixing Int and Float64 definitions

fun(x::Int) = x * 100
fun(x::Float64) = x * 200

fun(x::Int, y::Int) = x * y
fun(x::Int, y::Float64) = x * y * 2

println(fun(5))
println(fun(5.))
println(fun(5, 5))
println(fun(5, 5.))
500
1000.0
25
50.0

It doesn’t get simpler or more powerful than that!!

I’ve added all these examples to a repo split out by language, and some
instructions for running them (assuming you have the language tooling already set up).

Do you have another example from a language that does this (well? poorly?) or similar?
Leave a comment if you have one, or find me on Mastodon

devtools::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.1.2 (2021-11-01)
##  os       Pop!_OS 22.04 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language (EN)
##  collate  en_AU.UTF-8
##  ctype    en_AU.UTF-8
##  tz       Australia/Adelaide
##  date     2023-04-03
##  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  blogdown      1.13    2022-09-24 [1] CRAN (R 4.1.2)
##  bookdown      0.29    2022-09-12 [1] CRAN (R 4.1.2)
##  bslib         0.4.1   2022-11-02 [3] CRAN (R 4.2.2)
##  cachem        1.0.6   2021-08-19 [3] CRAN (R 4.2.0)
##  callr         3.7.3   2022-11-02 [3] CRAN (R 4.2.2)
##  cli           3.4.1   2022-09-23 [3] CRAN (R 4.2.1)
##  crayon        1.5.2   2022-09-29 [3] CRAN (R 4.2.1)
##  devtools      2.4.5   2022-10-11 [1] CRAN (R 4.1.2)
##  digest        0.6.30  2022-10-18 [3] CRAN (R 4.2.1)
##  ellipsis      0.3.2   2021-04-29 [3] CRAN (R 4.1.1)
##  evaluate      0.18    2022-11-07 [3] CRAN (R 4.2.2)
##  fastmap       1.1.0   2021-01-25 [3] CRAN (R 4.2.0)
##  fs            1.5.2   2021-12-08 [3] CRAN (R 4.1.2)
##  glue          1.6.2   2022-02-24 [3] CRAN (R 4.2.0)
##  htmltools     0.5.3   2022-07-18 [3] CRAN (R 4.2.1)
##  htmlwidgets   1.5.4   2021-09-08 [1] CRAN (R 4.1.2)
##  httpuv        1.6.6   2022-09-08 [1] CRAN (R 4.1.2)
##  jquerylib     0.1.4   2021-04-26 [3] CRAN (R 4.1.2)
##  jsonlite      1.8.3   2022-10-21 [3] CRAN (R 4.2.1)
##  knitr         1.40    2022-08-24 [3] CRAN (R 4.2.1)
##  later         1.3.0   2021-08-18 [1] CRAN (R 4.1.2)
##  lifecycle     1.0.3   2022-10-07 [3] CRAN (R 4.2.1)
##  magrittr      2.0.3   2022-03-30 [3] CRAN (R 4.2.0)
##  memoise       2.0.1   2021-11-26 [3] CRAN (R 4.2.0)
##  mime          0.12    2021-09-28 [3] CRAN (R 4.2.0)
##  miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.1.2)
##  pkgbuild      1.3.1   2021-12-20 [1] CRAN (R 4.1.2)
##  pkgload       1.3.0   2022-06-27 [1] CRAN (R 4.1.2)
##  prettyunits   1.1.1   2020-01-24 [3] CRAN (R 4.0.1)
##  processx      3.8.0   2022-10-26 [3] CRAN (R 4.2.1)
##  profvis       0.3.7   2020-11-02 [1] CRAN (R 4.1.2)
##  promises      1.2.0.1 2021-02-11 [1] CRAN (R 4.1.2)
##  ps            1.7.2   2022-10-26 [3] CRAN (R 4.2.2)
##  purrr         1.0.1   2023-01-10 [1] CRAN (R 4.1.2)
##  R6            2.5.1   2021-08-19 [3] CRAN (R 4.2.0)
##  Rcpp          1.0.9   2022-07-08 [1] CRAN (R 4.1.2)
##  remotes       2.4.2   2021-11-30 [1] CRAN (R 4.1.2)
##  rlang         1.0.6   2022-09-24 [1] CRAN (R 4.1.2)
##  rmarkdown     2.18    2022-11-09 [3] CRAN (R 4.2.2)
##  rstudioapi    0.14    2022-08-22 [3] CRAN (R 4.2.1)
##  sass          0.4.2   2022-07-16 [3] CRAN (R 4.2.1)
##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.1.2)
##  shiny         1.7.2   2022-07-19 [1] CRAN (R 4.1.2)
##  stringi       1.7.8   2022-07-11 [3] CRAN (R 4.2.1)
##  stringr       1.5.0   2022-12-02 [1] CRAN (R 4.1.2)
##  urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.1.2)
##  usethis       2.1.6   2022-05-25 [1] CRAN (R 4.1.2)
##  vctrs         0.5.2   2023-01-23 [1] CRAN (R 4.1.2)
##  xfun          0.34    2022-10-18 [3] CRAN (R 4.2.1)
##  xtable        1.8-4   2019-04-21 [1] CRAN (R 4.1.2)
##  yaml          2.3.6   2022-10-18 [3] CRAN (R 4.2.1)
## 
##  [1] /home/jono/R/x86_64-pc-linux-gnu-library/4.1
##  [2] /usr/local/lib/R/site-library
##  [3] /usr/lib/R/site-library
##  [4] /usr/lib/R/library
## 
## ──────────────────────────────────────────────────────────────────────────────


  1. in part due to a strong representation of
    Haskell at my local Functional Programming Meetup↩︎

  2. I’m highly likely doing something wrong – I never wrote any Haskell before last week↩︎

  3. Numerical Recipes in Fortran 90 was about the most
    important book we had for writing code, basically nothing else was trusted – getting a digital copy
    of the code was considered a sign of true power↩︎

  4. what, you don’t have to integrate up to infinity in your code?↩︎

Broadcast fusion in Julia: all you need to know to avoid pitfalls

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2023/03/31/broadcast.html

Introduction

Broadcasting is a powerful feature of Julia and quickly becomes a tool of choice
of developers because it is convenient to use.

However, in more complicated scenarios it can be tricky. The issue is that because
of its power broadcasting has a complex design and even the Julia Manual has it covered in four
different parts to discuss different aspects of the functionality:
here, here, here, and here.

For this reason, some time ago I have written a post about @. to
clarify its usage. A recent discussion on Julia Discourse prompted me to write
another post about this topic.

I will cover two things related to broadcast fusion today:
broadcasting of containers having different shape
and aliasing in broadcasted assignment.

The presented examples were tested under Julia 1.9.0-rc1.

Broadcasting of containers having different shape

Let me start with an example:

julia> x = string.([1, 2, 3], ",", ["a" "b"], ":")
3×2 Matrix{String}:
 "1,a:"  "1,b:"
 "2,a:"  "2,b:"
 "3,a:"  "3,b:"

julia> y = rand.(Int8, 1:3)
3-element Vector{Vector{Int8}}:
 [-95]
 [65, -119]
 [-77, -78, -5]

julia> string.(x, y)
3×2 Matrix{String}:
 "1,a:Int8[-95]"           "1,b:Int8[-95]"
 "2,a:Int8[65, -119]"      "2,b:Int8[65, -119]"
 "3,a:Int8[-77, -78, -5]"  "3,b:Int8[-77, -78, -5]"

julia> string.([1, 2, 3], ",", ["a" "b"], ":", rand.(Int8, 1:3))
3×2 Matrix{String}:
 "1,a:Int8[-127]"            "1,b:Int8[-93]"
 "2,a:Int8[92, -91]"         "2,b:Int8[-88, 118]"
 "3,a:Int8[-104, -29, -38]"  "3,b:Int8[23, 109, 76]"

What we can see here is that the x = string.([1, 2, 3], ",", ["a" "b"], ":") operation creates a 3×2 matrix,
while y = rand.(Int8, 1:3) creates a 3-element vector. Since in Julia vectors are treated as columnar
the matrix and vector have matching dimensions and can be used in broadcasting.
The call string.(x, y) reuses elements of y in each row. As a result the suffix of every string
in the resulting matrix is the same for each row.

Therefore, you might be surprised that when you combine the expressions into one call
string.([1, 2, 3], ",", ["a" "b"], ":", rand.(Int8, 1:3)) you get a different result.
Now each suffix is different (I used random numbers to show you that the suffix is different indeed).

What is the reason for this behavior? As is explained in the Julia Manual entries I linked to in the introduction
Julia performs broadcast fusion. This means that it behaves as if it created a single loop over two
dimensions of the output matrix and evaluates the expression:
string(p, ",", q, ":", rand.(Int8, r)) for values of p, q and r appropriately determined
from the source data [1, 2, 3], ["a" "b"], and 1:3 without caching them when doing the expansion
of the 1:3 vector over the second dimension. This means that we get different suffix in each cell.
Sometimes it is indeed desired, in other cases it can be surprising and not wanted.

First, let me explain how to resolve this issue. You can use the identity function (not-broacasted)
to break broadcast fusion behavior. Here is how you can do it:

julia> string.([1, 2, 3], ",", ["a" "b"], ":", identity(rand.(Int8, 1:3)))
3×2 Matrix{String}:
 "1,a:Int8[45]"           "1,b:Int8[45]"
 "2,a:Int8[-121, 60]"     "2,b:Int8[-121, 60]"
 "3,a:Int8[47, -25, 42]"  "3,b:Int8[47, -25, 42]"

The part of the expression wrapped in identity gets evaluated and then is fed into the enclosing
broadcasting expression.

In our example this changes the result of the operation, because we generated random numbers.
However, even if the result would not be impacted it can affect the performance significantly.
Have a look at this example (timings are after compilation):

julia> @time sin.(1:1000) .+ cos.((1:1000)');
  0.032546 seconds (2 allocations: 7.629 MiB)

julia> @time identity(sin.(1:1000)) .+ identity(cos.((1:1000)'));
  0.005688 seconds (4 allocations: 7.645 MiB)

What is the reason of the difference? In the first case both sin and cos
are evaluated 1,000,000 times (for each cell separately).
In the second example we have only 1000 calls of sin and cos.

You might ask when the default behavior might be desirable? It is useful when for example
you want to avoid aliasing. Take a look:

julia> m1 = tuple.([1 2], vcat.(1:3, 4:6))
3×2 Matrix{Tuple{Int64, Vector{Int64}}}:
 (1, [1, 4])  (2, [1, 4])
 (1, [2, 5])  (2, [2, 5])
 (1, [3, 6])  (2, [3, 6])

julia> push!(m1[1, 1][2], 100)
3-element Vector{Int64}:
   1
   4
 100

julia> m1
3×2 Matrix{Tuple{Int64, Vector{Int64}}}:
 (1, [1, 4, 100])  (2, [1, 4])
 (1, [2, 5])       (2, [2, 5])
 (1, [3, 6])       (2, [3, 6])

julia> m2 = tuple.([1 2], identity(vcat.(1:3, 4:6)))
3×2 Matrix{Tuple{Int64, Vector{Int64}}}:
 (1, [1, 4])  (2, [1, 4])
 (1, [2, 5])  (2, [2, 5])
 (1, [3, 6])  (2, [3, 6])

julia> push!(m2[1, 1][2], 100)
3-element Vector{Int64}:
   1
   4
 100

julia> m2
3×2 Matrix{Tuple{Int64, Vector{Int64}}}:
 (1, [1, 4, 100])  (2, [1, 4, 100])
 (1, [2, 5])       (2, [2, 5])
 (1, [3, 6])       (2, [3, 6])

As you can see, in this case we typically would want vcat to be called separately for every cell.
When we broke broadcast fusion with identity(vcat.(1:3, 4:6)) we get the same vector in every cell in a single row,
which could lead to hard-to-catch bugs.

Another question is when broadcast fusion is useful from the performance perspective?
The answer is that in simple calls like (timings are after compilation):

julia> @time cot.(sin.(cos.(tan.(1:10^6))));
  0.057595 seconds (2 allocations: 7.629 MiB)

we avoid unnecessary allocation of intermediate objects. We can simulate non-fused performance
by injecting identity to see the difference:

julia> @time cot.(identity(sin.(identity(cos.(identity(tan.(1:10^6)))))));
  0.085532 seconds (8 allocations: 30.518 MiB)

Aliasing in broadcasted assignment

Another potential issue is aliasing in broadcasted assignment .=. Have a look at this example:

julia> x = [1 2; 3 4]
2×2 Matrix{Int64}:
 1  2
 3  4

julia> x .= sum.(Ref(x))
2×2 Matrix{Int64}:
 10  35
 19  68

julia> x = [1 2; 3 4]
2×2 Matrix{Int64}:
 1  2
 3  4

julia> x .= identity(sum.(Ref(x)))
2×2 Matrix{Int64}:
 10  10
 10  10

In the first case of x .= sum.(Ref(x)), as we already discussed, sum.(Ref(x))
gets executed for each cell of x matrix. Now, since we use .= broadcasted assignment
the operation happens in-place, which means that x gets updated during the process
and consecutive sum.(Ref(x)) calls use changed x. Again, breaking broadcasting
fusion with identity(sum.(Ref(x))) forces Julia to materialize the sum before
doing the outer broadcasted assignment and we get 10 in every cell.

To give another example let us have a look how we can fill a vector with consecutive
powers of 2 (of course there are better ways to do it):

julia> x = [1, 0, 0, 0, 0, 0, 0]
7-element Vector{Int64}:
 1
 0
 0
 0
 0
 0
 0

julia> x .= sum.(Ref(x))
7-element Vector{Int64}:
  1
  1
  2
  4
  8
 16
 32

Conclusions

In summary, it is important to keep in mind that Julia performs broadcast fusion when
operating on several broadcasted function calls that are chained together.

This broadcast fusion in general improves performance and reduces allocations, but in some
cases it is not desirable. The most common scenarios are:

  • when we broadcast operations over containers of different dimensions
    (when it can degrade performance, or lead to different results).
  • when we perform broadcasted assignment to a container that is also used on right hand side of
    an expression (when it can lead to unexpectedly incorrect results).

As I have shown, in such cases one of the ways to fix the problem is to break broadcast fusion
by injecting a non-broadcasted function call forcing materialization of intermediate results
of computation. The identity function can be used to achieve this effect.