Tag Archives: julialang

Working with a grouped data frame, part 2

Re-posted from: https://bkamins.github.io/julialang/2024/03/08/gdf.html

Introduction

This is a follow up to the post from last week. We will continue
discussing how one can work with GroupedDataFrame objects in DataFrames.jl.
Today we focus on indexing of grouped data frames.

The post was written under Julia 1.10.1 and DataFrames.jl 1.6.1.

Warm-up: getting group indices

First create some grouped data frame:

julia> using DataFrames

julia> df = DataFrame(int=[1, 3, 2, 1, 3, 2],
                      str=["a", "a", "c", "c", "b", "b"])
6×2 DataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     3  a
   3 │     2  c
   4 │     1  c
   5 │     3  b
   6 │     2  b

julia> gdf = groupby(df, :str, sort=true)
GroupedDataFrame with 3 groups based on key: str
First Group (2 rows): str = "a"
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     3  a
⋮
Last Group (2 rows): str = "c"
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     1  c

It is sometimes useful to learn what is a group number of each row of the source data frame df in a grouped data frame gdf.
You can easily get this information with groupindices:

julia> groupindices(gdf)
6-element Vector{Union{Missing, Int64}}:
 1
 1
 3
 3
 2
 2

Extracting a single group

A basic operation when indexing a GroupedDataFrame is to pick a group by its number. Here is an example:

julia> gdf[1]
2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     3  a

julia> gdf[2]
2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     3  b
   2 │     2  b

julia> gdf[3]
2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     1  c

Note, that gdf behaves similarly to a vector. You can even use begin and end in indexing:

julia> gdf[begin]
2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     3  a

julia> gdf[end]
2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     1  c

Often you might want to extract a group not by its position in gdf, but by the value of the grouping
variable or variables. In this case you can use GroupKey, dictionary, tuple, or named tuple to achieve this.

Let us check how it works. Start with dictionary, tuple, and named tuple:

julia> gdf[Dict("str" => "b")] # dictionary
2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     3  b
   2 │     2  b

julia> gdf[("b",)] # tuple
2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     3  b
   2 │     2  b

julia> gdf[(; str="b")] # named tuple
2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     3  b
   2 │     2  b

With GroupKey we first need to get it from keys, but everything else works the same:

julia> key = keys(gdf)[1]
GroupKey: (str = "a",)

julia> gdf[key]
2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     3  a

You might ask why we require passing grouping variable in a container (dictionary, tuple, named tuple, GroupKey)
and not directly pass the required value when indexing? The reason is that if you grouped your data by integer column
the result would be ambiguous. Here is an example showing that under the defined rules there is no such ambiguity:

julia> gdf2 = groupby(df, :int, sort=false)
GroupedDataFrame with 3 groups based on key: int
First Group (2 rows): int = 1
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     1  c
⋮
Last Group (2 rows): int = 2
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     2  b

julia> gdf2[3] # third group
2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     2  b

julia> gdf2[(3, )] # group with value of the grouping variable equal to 3
2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     3  a
   2 │     3  b

Extracting multiple groups

You now know how to pick a single group, so selecting multiple groups is a natural next step.
You can use a collection of any of the selectors we have already discussed. Here are some examples:

julia> gdf[[3, 1]] # selection by group number
GroupedDataFrame with 2 groups based on key: str
First Group (2 rows): str = "c"
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     1  c
⋮
Last Group (2 rows): str = "a"
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     3  a

julia> gdf[[("c",), ("a",)]] # selection by grouping variable value
GroupedDataFrame with 2 groups based on key: str
First Group (2 rows): str = "c"
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     1  c
⋮
Last Group (2 rows): str = "a"
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     3  a

Note that indexing allows both for reordering and for dropping groups, which often comes handy when analyzing data.
Also note that groupindices is aware of such changes:

julia> groupindices(gdf[[3, 1]])
6-element Vector{Union{Missing, Int64}}:
 2
 2
 1
 1
  missing
  missing

Here group with "c" is first, with "a" is second and with "b" is dropped, so missing is returned in the produced vector.

It is also worth to remember that subset and filter can be used with GroupedDataFrames. This topic is discussed in this post.

Key lookup

Sometimes we do not want to index into a grouped data frame, but just check if it contains some key. This is easily achievable with the haskey function:

julia> haskey(gdf, ("a",))
true

julia> haskey(gdf, ("z",))
false

Conclusions

In this post we discussed indexing of GroupedDataFrames. This concludes the basic tutorial of working with these data structures.
I hope you will find the functionalities I have covered useful in your work.

Working with a grouped data frame, part 1

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2024/03/01/gdf.html

Introduction

One of the features of DataFrames.jl that I often find useful is that when you group
a data frame by some of its columns the resulting GroupedDataFrame is an object
that gains new and useful functionalities.

Some time ago I have discussed how GroupedDataFrame can be filtered. You can find
this post here. In this post and the following one that I plan to write next
week I thought that it would be useful to review other key functionalities of
a GroupedDataFrame.

The post was written under Julia 1.10.1 and DataFrames.jl 1.6.1.

Creating a grouped data frame

You can create a GroupedDataFrame using the groupby function.

Here are some examples:

julia> using DataFrames

julia> df = DataFrame(int=[1, 3, 2, 1, 3, 2],
                      str=["a", "a", "c", "c", "b", "b"])
6×2 DataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     3  a
   3 │     2  c
   4 │     1  c
   5 │     3  b
   6 │     2  b

julia> show(groupby(df, :int), allgroups=true)
GroupedDataFrame with 3 groups based on key: int
Group 1 (2 rows): int = 1
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     1  c
Group 2 (2 rows): int = 2
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     2  b
Group 3 (2 rows): int = 3
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     3  a
   2 │     3  b
julia> show(groupby(df, :int; sort=true), allgroups=true)
GroupedDataFrame with 3 groups based on key: int
Group 1 (2 rows): int = 1
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     1  c
Group 2 (2 rows): int = 2
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     2  b
Group 3 (2 rows): int = 3
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     3  a
   2 │     3  b
julia> show(groupby(df, :int; sort=false), allgroups=true)
GroupedDataFrame with 3 groups based on key: int
Group 1 (2 rows): int = 1
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     1  c
Group 2 (2 rows): int = 3
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     3  a
   2 │     3  b
Group 3 (2 rows): int = 2
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     2  b
julia> show(groupby(df, :str), allgroups=true)
GroupedDataFrame with 3 groups based on key: str
Group 1 (2 rows): str = "a"
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     3  a
Group 2 (2 rows): str = "c"
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     1  c
Group 3 (2 rows): str = "b"
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     3  b
   2 │     2  b
julia> show(groupby(df, :str; sort=true), allgroups=true)
GroupedDataFrame with 3 groups based on key: str
Group 1 (2 rows): str = "a"
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     3  a
Group 2 (2 rows): str = "b"
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     3  b
   2 │     2  b
Group 3 (2 rows): str = "c"
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     1  c
julia> show(groupby(df, :str; sort=false), allgroups=true)
GroupedDataFrame with 3 groups based on key: str
Group 1 (2 rows): str = "a"
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     3  a
Group 2 (2 rows): str = "c"
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     1  c
Group 3 (2 rows): str = "b"
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     3  b
   2 │     2  b

What this example shows is that the key thing you need to remember
to decide about a grouped data frame is the order of groups.

There are two options here:

groups sorted by the grouping column value, when you pass sort=true;
groups sorted by the order of appearance of values in the source, when you pass sort=true.

You might ask what happens if you do not pass the sort keyword argument?
In this case either of the options is used depending on which one is faster.
Therefore, omitting sort, can be thought of as an information that the user does not
care about the order of groups but wants the grouping operation to be as fast as possible.

When does the order of groups not matter?

In some cases the order of groups is irrelevant (so you can safely skip passing it).
The most important scenario of this kind is when you use the select or transform function
with a GroupedDataFrame. The reason is that these functions anyway always keep the order of
rows from the source data frame (no matter how the groups are rearranged in a GroupedDataFrame).
However, it is not the case with combine, as it respects the order of groups in a GroupedDataFrame.

Let us see an example highlighting the difference between these cases:

julia> select(groupby(df, :int, sort=true), :str)
6×2 DataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     3  a
   3 │     2  c
   4 │     1  c
   5 │     3  b
   6 │     2  b

julia> combine(groupby(df, :int, sort=true), :str)
6×2 DataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     1  c
   3 │     2  c
   4 │     2  b
   5 │     3  a
   6 │     3  b

julia> select(groupby(df, :int, sort=false), :str)
6×2 DataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     3  a
   3 │     2  c
   4 │     1  c
   5 │     3  b
   6 │     2  b

julia> combine(groupby(df, :int, sort=false), :str)
6×2 DataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     1  c
   3 │     3  a
   4 │     3  b
   5 │     2  c
   6 │     2  b

As you can see select kept the rows in the order in which they are present in df no matter if we
passed sort=true or sort=false. On the other hand combine returns rows grouped by the groups and
the order of groups corresponds to their order in GroupedDataFrame, so passing sort=true or
sort=false in general changes.

Special operation specification syntax for working with grouped data frames

When discussing select or combine in conjunction with GroupedDataFrame it is important to mention
that there are four special cases of operation specification syntax designed specifically for working with
them. They are:

nrow to compute the number of rows in each group.
proprow to compute the proportion of rows in each group.
eachindex to return a vector holding the number of each row within each group.
groupindices to return the group number.

Each of them optionally allows you to specify the name of the target column by => syntax.
Here are some examples:

julia> combine(groupby(df, :int, sort=false), nrow)
3×2 DataFrame
 Row │ int    nrow
     │ Int64  Int64
─────┼──────────────
   1 │     1      2
   2 │     3      2
   3 │     2      2

julia> combine(groupby(df, :int, sort=false), proprow => "row %")
3×2 DataFrame
 Row │ int    row %
     │ Int64  Float64
─────┼─────────────────
   1 │     1  0.333333
   2 │     3  0.333333
   3 │     2  0.333333

julia> combine(groupby(df, :int, sort=false), eachindex)
6×2 DataFrame
 Row │ int    eachindex
     │ Int64  Int64
─────┼──────────────────
   1 │     1          1
   2 │     1          2
   3 │     3          1
   4 │     3          2
   5 │     2          1
   6 │     2          2

julia> combine(groupby(df, :int, sort=false), groupindices => "group #")
3×2 DataFrame
 Row │ int    group #
     │ Int64  Int64
─────┼────────────────
   1 │     1        1
   2 │     3        2
   3 │     2        3

Iterating a grouped data frame

Apart from using functions such as select or combine on a GroupedDataFrame it is useful to know
that it supports iteration. Therefore you can use a GroupedDataFrame in a loop or in a comprehension.
When iterated GroupedDataFrame returns data frames corresponding to the groups. Let us see:

julia> for v in groupby(df, :int, sort=false)
           println(v)
       end
2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     1  c
2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     3  a
   2 │     3  b
2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     2  b

julia> [v for v in groupby(df, :int, sort=false)]
3-element Vector{SubDataFrame{DataFrame, DataFrames.Index, Vector{Int64}}}:
 2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     1  c
 2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     3  a
   2 │     3  b
 2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     2  b

julia> collect(groupby(df, :int, sort=false))
3-element Vector{Any}:
 2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     1  c
 2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     3  a
   2 │     3  b
 2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     2  b

The last example has shown you that you can pass a GroupedDataFrame to a function expecting an iterable, in this case the collect function. The one exception to this rule is that you cannot use GroupedDataFrame with the map function directly:

julia> map(identity, groupby(df, :int, sort=false))
ERROR: ArgumentError: using map over `GroupedDataFrame`s is reserved

The reason is that it was not clear if such operation should produce a vector or a data frame, and it is easy enough to achieve both results with other means. If you want e vector use e.g. a comprehension. If you want a data frame use e.g. combine or select.

Advanced iteration

Sometimes, when iterating a GroupedDataFrame we might be interested not only in a data frame per group, but also in a value of grouping variable. This is easily achieved with the keys and pairs functions (depending on whether you only want grouping values or both grouping values and data frames):

julia> map(identity, keys(groupby(df, :int, sort=false)))
3-element Vector{DataFrames.GroupKey{GroupedDataFrame{DataFrame}}}:
 GroupKey: (int = 1,)
 GroupKey: (int = 3,)
 GroupKey: (int = 2,)

julia> map(identity, pairs(groupby(df, :int, sort=false)))
3-element Vector{Pair{DataFrames.GroupKey{GroupedDataFrame{DataFrame}}, SubDataFrame{DataFrame, DataFrames.Index, Vector{Int64}}}}:
 GroupKey: (int = 1,) => 2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     1  a
   2 │     1  c
 GroupKey: (int = 3,) => 2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     3  a
   2 │     3  b
 GroupKey: (int = 2,) => 2×2 SubDataFrame
 Row │ int    str
     │ Int64  String
─────┼───────────────
   1 │     2  c
   2 │     2  b

I used the map function to show you that it is only reserved to use it with plain GroupedDataFrame.

Working with group keys

As you can see in this example each group in a GroupedDataFrame is associated with a GroupKey. To get all
keys use the keys function:

julia> keys(groupby(df, :int, sort=false))
3-element DataFrames.GroupKeys{GroupedDataFrame{DataFrame}}:
 GroupKey: (int = 1,)
 GroupKey: (int = 3,)
 GroupKey: (int = 2,)

Let us, as an example extract the last key so see how one can work with it:

julia> key = last(keys(groupby(df, :int, sort=false)))
GroupKey: (int = 2,)

You can get a value of the key by property access or indexing:

julia> key.int
2

julia> key[1]
2

julia> key["int"]
2

julia> key[:int]
2

It is also easy co convert GroupKey to a dictionary, vector, Tuple or NamedTuple if you would need it:

julia> Dict(key)
Dict{Symbol, Int64} with 1 entry:
  :int => 2

julia> collect(key)
1-element Vector{Int64}:
 2

julia> Tuple(key)
(2,)

julia> NamedTuple(key)
(int = 2,)

Note that, in general, you can group a data frame by multiple columns so you could query value of any grouping column
in the examples above. If you needed to get a list of grouping columns use the groupcols function:

julia> groupcols(groupby(df, :int, sort=false))
1-element Vector{Symbol}:
 :int

Conclusions

In this post we have learned how one can create a grouped data frame and how to choose the order of groups in it.
As a follow-up we have shown how GroupedDataFrame interacts with functions like select or combine.
Next we discussed iterator interface support by GroupedDataFrame and how to get and use information about
values of grouping columns for each group. I hope you found these examples useful.

In the post next week we will discuss how GroupedDataFrame supports the indexing interface.

Partial function application in Julia

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2024/02/23/fix.html

Introduction

Some functions provided in Base Julia support partial application.
I often find this functionality useful.
Therefore in this post I want to give you its explanation and a summary which functions have this property.

The post was tested with Julia Version 1.12.0-DEV.53.

Explaining partial function application

We will focus on partial application of functions having two positional arguments.
Let us work by example.

Consider the in function. You can call it to check if some item is in a collection.
Here is an example:

julia> in('a', "Abracadabra")
true

julia> in('x', "Abracadabra")
false

A common pattern you might need is to perform a repeated check if various items are contained in the same collection.
For example assume you have a vector of characters and you want to filer it to keep only the elements contained in a reference collection.
You can do it like this:

julia> v = 'a':'z'
'a':1:'z'

julia> filter(x -> in(x, "Abracadabra"), v)
5-element Vector{Char}:
 'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
 'b': ASCII/Unicode U+0062 (category Ll: Letter, lowercase)
 'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)
 'd': ASCII/Unicode U+0064 (category Ll: Letter, lowercase)
 'r': ASCII/Unicode U+0072 (category Ll: Letter, lowercase)

This pattern is so commonly needed that there is a shorthand for x -> in(x, "Abracadabra").
Instead of creating this anonymous function you can just write in("Abracadabra").
The value returned by this function call behaves in the same way as x -> in(x, "Abracadabra").
Let us check:

julia> filter(in("Abracadabra"), v)
5-element Vector{Char}:
 'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
 'b': ASCII/Unicode U+0062 (category Ll: Letter, lowercase)
 'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)
 'd': ASCII/Unicode U+0064 (category Ll: Letter, lowercase)
 'r': ASCII/Unicode U+0072 (category Ll: Letter, lowercase)

You can think of this operation as if we partially applied the in function by fixing its second argument
(the collection) and leaving the first (the item we check) to be specified later.

In other words the following two operations are equivalent:

julia> in('a', "Abracadabra")
true

julia> in("Abracadabra")('a')
true

Fixing of the second argument is most common. However, sometimes it is useful to fix the first argument.
This is exactly the case of the filter function we have just used.

What if you wanted to perform the filter(in("Abracadabra"), v) test for multiple different values of v but with a fixed predicate function?
Here is an example:

julia> vv = ['a'+i:'z' for i in 0:4]
5-element Vector{StepRange{Char, Int64}}:
 'a':1:'z'
 'b':1:'z'
 'c':1:'z'
 'd':1:'z'
 'e':1:'z'

julia> map(v -> filter(in("Abracadabra"), v), vv)
5-element Vector{Vector{Char}}:
 ['a', 'b', 'c', 'd', 'r']
 ['b', 'c', 'd', 'r']
 ['c', 'd', 'r']
 ['d', 'r']
 ['r']

You probably see, where I am getting at. Instead of v -> filter(in("Abracadabra"), v) we can write filter(in("Abracadabra")) and fix
the first positional argument of filter, leaving the second to be specified later.
Let us check if this works:

julia> map(filter(in("Abracadabra")), vv)
5-element Vector{Vector{Char}}:
 ['a', 'b', 'c', 'd', 'r']
 ['b', 'c', 'd', 'r']
 ['c', 'd', 'r']
 ['d', 'r']
 ['r']

Indeed, we get what we expected. Again, for a reference note that the following two operations are equivalent:

julia> filter(in("Abracadabra"), v)
5-element Vector{Char}:
 'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
 'b': ASCII/Unicode U+0062 (category Ll: Letter, lowercase)
 'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)
 'd': ASCII/Unicode U+0064 (category Ll: Letter, lowercase)
 'r': ASCII/Unicode U+0072 (category Ll: Letter, lowercase)

julia> filter(in("Abracadabra"))(v)
5-element Vector{Char}:
 'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
 'b': ASCII/Unicode U+0062 (category Ll: Letter, lowercase)
 'c': ASCII/Unicode U+0063 (category Ll: Letter, lowercase)
 'd': ASCII/Unicode U+0064 (category Ll: Letter, lowercase)
 'r': ASCII/Unicode U+0072 (category Ll: Letter, lowercase)

Before I finish this section let me note that if you do not like writing that many parentheses you could use the |> operator.
In our example we could write:

julia> map("Abracadabra" |> in |> filter, vv)
5-element Vector{Vector{Char}}:
 ['a', 'b', 'c', 'd', 'r']
 ['b', 'c', 'd', 'r']
 ['c', 'd', 'r']
 ['d', 'r']
 ['r']

Which style you use is a matter of preference.

Catalogue of function supporting partial applications

We saw that some functions taking two arguments support partial application.
Below I give you a list of all of them that are currently supported (and this is the reason why the post is written under Julia nightly,
as there were recent changes in this list).

There is only one function in Base Julia that supports fixing its first argument and this function is filter.

However, there are many functions supporting fixing of their second argument. Here is their list:

comparisons: isequal, ==, !=, >=, <=, >, <;
inclusion testing: in, ∈, ∋, ∉, ∌;
string checking: contains, occursin, endswith, startswith;
set operations (supported since Julia 1.11; not released yet): issubset,⊆, ⊇, ⊈, ⊉, ⊊, ⊋, isdisjoint, issetequal.

Conclusions

After reading this post you know how to use partial function application in Julia and which functions from Base support it.
I hope you will find this functionality useful in your code.

juliabloggers.com

A Julia Language Blog Aggregator

Tag Archives: julialang

Working with a grouped data frame, part 2

Introduction

Warm-up: getting group indices

Extracting a single group

Extracting multiple groups

Key lookup

Conclusions

Working with a grouped data frame, part 1

Introduction

Creating a grouped data frame

When does the order of groups not matter?

Special operation specification syntax for working with grouped data frames

Iterating a grouped data frame

Advanced iteration

Working with group keys

Conclusions

Partial function application in Julia

Introduction

Explaining partial function application

Catalogue of function supporting partial applications

Conclusions