Category Archives: Julia

Basic Data Structures Explained

By: Great Lakes Consulting

Re-posted from: https://blog.glcs.io/basic-data-structures

Julia is a relatively new,
free, and open-source programming language.
It has a syntax
similar to that of other popular programming languages
such as MATLAB and Python,
but it boasts being able to achieve C-like speeds.

Julia provides several useful data structures
for storing and manipulating data.
Some of these data structures,
like arrays and dictionaries,
are ubiquitous in Julia code
because of their usefulness
and wide applicability.
Others, like sets,
have more limited uses
but are nevertheless
still useful.

In this post,
we will learn about
arrays, dictionaries, and sets
in Julia.
We will discuss how to construct them
and describe various functions
for working with and manipulating them.

This post assumes you already have Julia installed.
If you haven’t yet,
check out our earlier
post on how to install Julia.

Arrays

One of the most basic and ubiquitous data structures
is the array.
Arrays are used for storing values,
iterating through values,
and even representing mathematical vectors and matrices.

The basic array type in Julia is Array{T,N},
where T is the type of the elements in the array
(or an abstract supertype of the elements
if not all elements are of the same type),
and N is the number of array dimensions.
For example,
a list of strings would be of type Array{String,1},
while a matrix of numbers would be of type Array{Float64,2}.

Constructing Arrays

There are various ways
to construct arrays.
One common way
is to construct an array
directly from the values
it will contain:

julia> ["some", "strings"]
2-element Vector{String}:
 "some"
 "strings"

julia> [1 2; 3 4]
2×2 Matrix{Int64}:
 1  2
 3  4

(Note that Vector{T} is equivalent to Array{T,1}
and that Matrix{T} is equivalent to Array{T,2}.)
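
As a quick check of the type parameters described above, here is a minimal sketch (the variable names are illustrative only and not part of the original post) that compares the aliases directly and inspects the types of a few arrays:

```julia
# Illustrative variable names; not from the original post.
strings = ["some", "strings"]
numbers = [1.0 2.0; 3.0 4.0]

# Vector{T} and Matrix{T} are aliases for Array{T,1} and Array{T,2}.
println(Vector{String} === Array{String,1})    # true
println(Matrix{Float64} === Array{Float64,2})  # true

# isa checks an array against the Array{T,N} form.
println(strings isa Array{String,1})   # true
println(numbers isa Array{Float64,2})  # true

# Mixed element types fall back to a common supertype.
mixed = [2, "different types"]
println(eltype(mixed))  # Any
```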

Another common way
to construct arrays
is using array comprehensions.
An array comprehension
creates an array
by looping through a collection of values
and computing an array element
for each value.
For example,
the following creates an array
containing the squares
of the first five natural numbers:

julia> [x^2 for x = 1:5]
5-element Vector{Int64}:
  1
  4
  9
 16
 25

Multidimensional comprehensions also exist:

julia> [(x - 2)^2 + (y - 3)^2 <= 1 for x = 1:3, y = 1:5]
3×5 Matrix{Bool}:
 0  0  1  0  0
 0  1  1  1  0
 0  0  1  0  0

We can also create uninitialized arrays,
either by passing undef to the array constructor
or by calling similar:

julia> Array{Int,1}(undef, 1)
1-element Vector{Int64}:
 6303840

julia> similar([1.0, 2.0])
2-element Vector{Float64}:
 6.94291544947797e-310
 6.94291610129443e-310

(Note that the seemingly random numbers above
come from whatever bits happen to be set
in the memory allocated for the arrays.)

To create an array of zeros,
call zeros:

julia> zeros(2, 3)
2×3 Matrix{Float64}:
 0.0  0.0  0.0
 0.0  0.0  0.0

Inspecting Arrays

Information about an array
can be obtained using various functions.

length gives the number of elements
in an array:

julia> length([1, 2, 3])
3

julia> length(zeros(2, 3))
6

size(x) gives the size of x,
while size(x, d) gives the size
of the dth dimension:

julia> size([1, 2, 3])
(3,)

julia> size(zeros(2, 3))
(2, 3)

julia> size(zeros(2, 3), 2)
3

ndims gives the number of dimensions
of an array:

julia> ndims(zeros(1, 2, 3, 4, 5, 6))
6

And eltype gives the type
of the elements of an array:

julia> eltype(["two", "strings"])
String

julia> eltype([2, "different types"])
Any

Array Operations

Accessing array elements
is achieved using brackets:

julia> a = [10, 20, 30];

julia> a[2]
20

(Note that arrays use one-based indexing
in Julia.)

A similar syntax is used
to modify the contents of an array:

julia> a[1] = 0
0

julia> a
3-element Vector{Int64}:
  0
 20
 30

Use commas (,) to separate indexes
for different dimensions,
and use a colon (:)
to select all the values
along a dimension:

julia> m = [1 2; 3 4]
2×2 Matrix{Int64}:
 1  2
 3  4

julia> m[1,2]
2

julia> m[:,1]
2-element Vector{Int64}:
 1
 3

Multiple indexes can be provided:

julia> a[[1, 3]]
2-element Vector{Int64}:
  0
 30

To assign a single value
to multiple array locations,
use broadcasting:

julia> a[2:3] .= 0
2-element view(::Vector{Int64}, 2:3) with eltype Int64:
 0
 0

julia> a
3-element Vector{Int64}:
 0
 0
 0

Arrays are also iterable,
meaning we can loop through
the values of an array:

julia> words = ["this", "is", "a", "sentence"];

julia> for w in words
           println(w)
       end
this
is
a
sentence

Arrays as Stacks/Queues/Deques

Julia also provides some functions
that allow arrays to be used
in a way similar to
stacks, queues, and deques.
For example,
push!(array, x) inserts x
at the end of an array,
and pop!(array) removes the last element
of an array.
Similarly,
pushfirst! and popfirst!
act on the beginning of an array.
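
The following is a minimal sketch of these four functions in action. The variable names (stack, queue, top, front) are made up for illustration:

```julia
# Use a vector as a stack: push! and pop! act on the end.
stack = Int[]             # empty Vector{Int}
push!(stack, 1)
push!(stack, 2)
push!(stack, 3)
top = pop!(stack)         # removes and returns the last element

# Use a vector as a queue/deque: the *first! variants act on the beginning.
queue = ["a", "b", "c"]
front = popfirst!(queue)  # removes and returns the first element
pushfirst!(queue, "z")    # inserts "z" at the beginning

println(top)    # 3
println(stack)  # [1, 2]
println(front)  # a
println(queue)  # ["z", "b", "c"]
```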

Ranges

Ranges are another useful type of array,
often used for looping
and array indexing.

The simplest syntax
for creating a range
is a:b,
which creates a range
that starts at a
and includes all values
a + 1, a + 2, etc.,
as long as a + n <= b.
For example,
1:5 contains 1, 2, 3, 4, and 5,
while 1.0:2.5 contains 1.0 and 2.0.

A step size, s, can also be specified,
as in a:s:b.
In this case, the spacing between values
in the range
is s instead of 1.

To create a range of N points
between a and b, inclusive,
use range(a, b, N).
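
Here is a short sketch of the three ways of building ranges described above. The values are arbitrary, and the last range uses the keyword form range(a, b, length=N), which is assumed here to be equivalent to range(a, b, N) and also works on somewhat older Julia versions:

```julia
r1 = 1:5                    # unit step
r2 = 0:2:10                 # step of 2
r3 = 10:-3:1                # a negative step counts down
r4 = range(0, 1, length=5)  # 5 evenly spaced points from 0 to 1, inclusive

println(collect(r1))        # [1, 2, 3, 4, 5]
println(collect(r2))        # [0, 2, 4, 6, 8, 10]
println(collect(r3))        # [10, 7, 4, 1]
println(collect(r4))        # [0.0, 0.25, 0.5, 0.75, 1.0]
println(collect(1.0:2.5))   # [1.0, 2.0]
```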

Unlike Arrays,
ranges are immutable,
meaning their elements
can’t be modified.
If modifying an element of a range
is necessary,
it must first be converted
into an Array
by calling collect:

julia> r = 1:2
1:2

julia> r[1] = 10
ERROR: CanonicalIndexError: setindex! not defined for UnitRange{Int64}

julia> r_arr = collect(r)
2-element Vector{Int64}:
 1
 2

julia> r_arr[1] = 10; r_arr
2-element Vector{Int64}:
 10
  2

That concludes our discussion of arrays,
so now let’s move on to dictionaries.

Dictionaries

Another very common data structure
is the dictionary.
A dictionary is a mapping
from keys to values:
give a dictionary a key,
and it will return
the value associated with that key
(if present).

In Julia,
dictionaries are of type Dict{K,V},
where K is the type of the keys,
and V is the type of the values.

Dictionaries are constructed
by providing key-value pairs:

julia> d = Dict("key1" => 1, "key2" => 2, "key3" => 3)
Dict{String, Int64} with 3 entries:
  "key2" => 2
  "key3" => 3
  "key1" => 1

(Note that a => b
creates a Pair in Julia.)

Indexing a dictionary
uses the same syntax
as indexing an array,
just using keys
instead of array indexes:

julia> d["key2"]
2

Use haskey to check
for the presence of a key:

julia> haskey(d, "key3")
true

julia> haskey(d, "nope")
false

Dictionaries can also be updated:

julia> d["key1"] = 9999
9999

julia> d["newkey"] = -9999
-9999

julia> d
Dict{String, Int64} with 4 entries:
  "key2"   => 2
  "key3"   => 3
  "key1"   => 9999
  "newkey" => -9999

Use delete!(dict, key)
to delete the mapping
for the given key,
if present.
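
As a small sketch of deleting keys (using a fresh dictionary d2 so the example stands alone), note that deleting an absent key is not an error. The get function shown at the end is not covered in this post; it is included only as one way to look up a key that may be missing:

```julia
d2 = Dict("key1" => 1, "key2" => 2)

delete!(d2, "key1")     # removes the mapping for "key1"
delete!(d2, "missing")  # the key is absent, so nothing changes

println(haskey(d2, "key1"))  # false
println(length(d2))          # 1

# get returns the fallback value instead of throwing a KeyError.
println(get(d2, "key1", 0))  # 0
println(get(d2, "key2", 0))  # 2
```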

We can also iterate
through the keys and/or values
of a dictionary:

  • Iterating keys: for k in keys(dict)
  • Iterating values: for v in values(dict)
  • Iterating both: for (k, v) in dict
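
The three iteration patterns above can be sketched as follows. The inventory dictionary is hypothetical data, and the results are sorted because the iteration order of a Dict is not guaranteed:

```julia
inventory = Dict("apples" => 3, "pears" => 5)  # hypothetical data

# Iterating keys.
fruit = sort([k for k in keys(inventory)])

# Iterating values.
total = sum(v for v in values(inventory))

# Iterating both keys and values.
lines = sort(["$k: $v" for (k, v) in inventory])

println(fruit)  # ["apples", "pears"]
println(total)  # 8
println(lines)  # ["apples: 3", "pears: 5"]
```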

That wraps up our discussion of dictionaries,
so now we will move on to sets.

Sets

A set is a collection of unique elements.
In Julia,
sets are of type Set{T},
where T is the type
of the elements of the set.
Sets are useful
for their efficient set operations,
such as membership testing,
union, and intersect.

Create an empty set of Float64 values as follows:

julia> s = Set{Float64}()
Set{Float64}()

Use push! to add values
to the set,
noticing that the set changes
only if the value does not already exist
in the set:

julia> push!(s, 1.0);

julia> push!(s, 1.2);

julia> push!(s, 3.14)
Set{Float64} with 3 elements:
  1.2
  3.14
  1.0

julia> push!(s, 1.0)
Set{Float64} with 3 elements:
  1.2
  3.14
  1.0

Use union to take the union
of two sets:

julia> t = Set([1.0, 2.0])
Set{Float64} with 2 elements:
  2.0
  1.0

julia> r = s ∪ t # type \cup<tab> to get the union symbol
Set{Float64} with 4 elements:
  1.2
  2.0
  3.14
  1.0

(Note that s ∪ t == union(s, t).)

Use intersect to take the intersection
of two sets:

julia> r ∩ t # type \cap<tab> to get the intersection symbol
Set{Float64} with 2 elements:
  2.0
  1.0

(Note that s ∩ t == intersect(s, t).)

Finally,
we can check if an element
belongs to a set
with in:

julia> 1.0 ∈ r # type \in<tab> to get the "is an element of" symbol
true

(Note that ∈ and in are interchangeable here.)
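
For readers who prefer not to type the Unicode symbols, here is a minimal sketch of the same operations written with the function names. The sets are rebuilt here, under the illustrative names s2 and t2, so the example stands alone:

```julia
s2 = Set([1.0, 1.2, 3.14])
t2 = Set([1.0, 2.0])

u = union(s2, t2)      # same as s2 ∪ t2
i = intersect(s2, t2)  # same as s2 ∩ t2

println(u == Set([1.0, 1.2, 2.0, 3.14]))  # true
println(i == Set([1.0]))                  # true
println(in(1.2, s2))                      # true
println(2.0 in s2)                        # false

# Duplicates collapse when a set is built from an array.
println(length(Set([1, 1, 2, 2, 3])))     # 3
```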

And with that,
we conclude our overview
of some important Julia data structures.

Summary

In this post,
we learned about a few data structures
that Julia provides:
arrays, dictionaries, and sets.
We learned how to construct them
and how to work with and manipulate them.

What are the most useful data structures
you have used?
Let us know in the comments below!

Additional Links

Basic Data Structures Explained

By: Steven Whitaker

Re-posted from: https://glcs.hashnode.dev/basic-data-structures

Learning to create a vector in Julia

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2023/11/03/vec.html

Introduction

Last week I wrote a “Learning to zip stuff in Julia” post
that discussed the zip function. To my surprise, even though the topic was basic,
it received a lot of positive feedback. Therefore, I thought I would write about
another entry-level problem.

Often, when working with arrays in Julia, we want to flatten them to a vector.
Today, I want to discuss two ways you can do it and the differences between them.

The post was written under Julia 1.9.2.

The problem

Assume you have the following three-dimensional array:

julia> a3d = [(i, j, k) for i in 1:2, j in 1:3, k in 1:4]
2×3×4 Array{Tuple{Int64, Int64, Int64}, 3}:
[:, :, 1] =
 (1, 1, 1)  (1, 2, 1)  (1, 3, 1)
 (2, 1, 1)  (2, 2, 1)  (2, 3, 1)

[:, :, 2] =
 (1, 1, 2)  (1, 2, 2)  (1, 3, 2)
 (2, 1, 2)  (2, 2, 2)  (2, 3, 2)

[:, :, 3] =
 (1, 1, 3)  (1, 2, 3)  (1, 3, 3)
 (2, 1, 3)  (2, 2, 3)  (2, 3, 3)

[:, :, 4] =
 (1, 1, 4)  (1, 2, 4)  (1, 3, 4)
 (2, 1, 4)  (2, 2, 4)  (2, 3, 4)

In many situations you might want to transform it into a one-dimensional vector.
For example, many functions explicitly require AbstractVector as their input.
The question is how to do it.

There are two fundamental ways to perform this operation. The first one creates
a new independent vector from the source data, and the second one reuses the memory
of the source data. Let me discuss them in more detail.

Copied vector

If you want to copy the data in a3d into a new vector you can write:

julia> a3d[:]
24-element Vector{Tuple{Int64, Int64, Int64}}:
 (1, 1, 1)
 (2, 1, 1)
 (1, 2, 1)
 (2, 2, 1)
 (1, 3, 1)
 (2, 3, 1)
 (1, 1, 2)
 (2, 1, 2)
 (1, 2, 2)
 (2, 2, 2)
 (1, 3, 2)
 (2, 3, 2)
 (1, 1, 3)
 (2, 1, 3)
 (1, 2, 3)
 (2, 2, 3)
 (1, 3, 3)
 (2, 3, 3)
 (1, 1, 4)
 (2, 1, 4)
 (1, 2, 4)
 (2, 2, 4)
 (1, 3, 4)
 (2, 3, 4)

The syntax is short and easy to read. It takes advantage of the fact that every Julia
array should support linear indexing, as is explained in the Julia Manual section
on Linear Indexing.

The benefit of this approach is that the new object is freshly allocated, so modifying
it will not modify the source. The downside is that it requires memory allocation.
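
A minimal sketch of this behavior on a small matrix (the array a and its values are made up for illustration) shows both the column-major order of the result and the independence of the copy:

```julia
# A 2×3 matrix with entries 10*i + j: [11 12 13; 21 22 23]
a = [10 * i + j for i in 1:2, j in 1:3]
c = a[:]                  # freshly allocated Vector

println(c)                # [11, 21, 12, 22, 13, 23] (column-major order)

c[1] = 0                  # mutate the copy only
println(a[1, 1])          # 11, so the source is unchanged

# Linear index 5 addresses row 1, column 3.
println(a[5] == a[1, 3])  # true
```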

Aliased vector

If we want to avoid excessive memory allocation (which might be relevant for large objects)
we can create a vector from an array without copying the data. This can be achieved using
the vec function:

julia> v = vec(a3d)
24-element Vector{Tuple{Int64, Int64, Int64}}:
 (1, 1, 1)
 (2, 1, 1)
 (1, 2, 1)
 (2, 2, 1)
 (1, 3, 1)
 (2, 3, 1)
 (1, 1, 2)
 (2, 1, 2)
 (1, 2, 2)
 (2, 2, 2)
 (1, 3, 2)
 (2, 3, 2)
 (1, 1, 3)
 (2, 1, 3)
 (1, 2, 3)
 (2, 2, 3)
 (1, 3, 3)
 (2, 3, 3)
 (1, 1, 4)
 (2, 1, 4)
 (1, 2, 4)
 (2, 2, 4)
 (1, 3, 4)
 (2, 3, 4)

Now v is a vector that shares memory with a3d. We can check it using the pointer function:

julia> pointer(v)
Ptr{Tuple{Int64, Int64, Int64}} @0x000002606d121540

julia> pointer(a3d)
Ptr{Tuple{Int64, Int64, Int64}} @0x000002606d121540

or the Base.mightalias function:

julia> Base.mightalias(a3d, v)
true

The benefit is that vec(a3d) is in general much faster than a3d[:] since it does less work.
The downside is that you need to remember that mutating one of the objects will change the other.
For example:

julia> a3d[1, 1, 1] = (100, 100, 100);

julia> a3d
2×3×4 Array{Tuple{Int64, Int64, Int64}, 3}:
[:, :, 1] =
 (100, 100, 100)  (1, 2, 1)  (1, 3, 1)
 (2, 1, 1)        (2, 2, 1)  (2, 3, 1)

[:, :, 2] =
 (1, 1, 2)  (1, 2, 2)  (1, 3, 2)
 (2, 1, 2)  (2, 2, 2)  (2, 3, 2)

[:, :, 3] =
 (1, 1, 3)  (1, 2, 3)  (1, 3, 3)
 (2, 1, 3)  (2, 2, 3)  (2, 3, 3)

[:, :, 4] =
 (1, 1, 4)  (1, 2, 4)  (1, 3, 4)
 (2, 1, 4)  (2, 2, 4)  (2, 3, 4)

julia> v
24-element Vector{Tuple{Int64, Int64, Int64}}:
 (100, 100, 100)
 (2, 1, 1)
 (1, 2, 1)
 (2, 2, 1)
 (1, 3, 1)
 (2, 3, 1)
 (1, 1, 2)
 (2, 1, 2)
 (1, 2, 2)
 (2, 2, 2)
 (1, 3, 2)
 (2, 3, 2)
 (1, 1, 3)
 (2, 1, 3)
 (1, 2, 3)
 (2, 2, 3)
 (1, 3, 3)
 (2, 3, 3)
 (1, 1, 4)
 (2, 1, 4)
 (1, 2, 4)
 (2, 2, 4)
 (1, 3, 4)
 (2, 3, 4)

Conclusions

In summary, if you need a vector in your analytical workflow
but have a general array a as an input, you have two options:

  • write a[:], which creates a copy (safer, but slower and uses more memory);
  • write vec(a), which creates an alias (unsafe, but faster and uses less memory).
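
The two options can be compared side by side in a rough sketch like the one below. The array size is arbitrary, and the byte counts reported by @allocated are only indicative; they may vary between Julia versions:

```julia
a = zeros(1000, 1000)   # about 8 MB of Float64 values

copied = a[:]           # independent copy
aliased = vec(a)        # shares memory with a

aliased[1] = 1.0        # writes through to a
copied[2] = 2.0         # does not touch a

println(a[1, 1])        # 1.0
println(a[2, 1])        # 0.0

# Both calls were already compiled above, so this compares run-time allocation only.
bytes_copy = @allocated a[:]
bytes_alias = @allocated vec(a)
println(bytes_copy > bytes_alias)  # true
```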

In practice, I find both options useful depending on the circumstances, so I think it is worth being aware of them.