Understanding Variables and Functions

By: Steven Whitaker

Re-posted from: https://glcs.hashnode.dev/variables-and-functions

Variables and functions
are the building blocks
of any programmer’s code.
Variables allow computations to be reused,
and functions help keep code organized.

In this post,
we will cover some of the basics
of variables and functions
in Julia,
a relatively new,
free, and open-source programming language.
In particular,
we will discuss what a variable is,
what sorts of data
can be assigned to a variable,
and
how to define and use functions.

Variables

A variable is a label
used to refer to an object.

julia> a = 1
1

In the above code snippet,
we assigned a value of 1
to a variable called a.
Now we can use a in other expressions,
and the value of a (1 in this case)
will be used.

julia> a + 2
3

We can also reassign variables.

julia> a = 4
4

julia> a = "Hello"
"Hello"

julia> a = [1, 2, 3]
3-element Vector{Int64}:
 1
 2
 3

We can even use Unicode characters!
We can write many math symbols
by typing the corresponding LaTeX command
and then pressing <tab>.
Here,
we assign π
(a Julia constant for the mathematical constant pi,
typed with \pi<tab>)
to θ
(typed with \theta<tab>).

julia> θ = π
π = 3.1415926535897...

Variables Are Labels

One important thing to remember about variables
is that they are labels for data,
not the data itself.
Let’s illustrate what that means.
At this point,
a refers to a Vector.
We will create another variable, b,
and assign it the value of a.

julia> b = a
3-element Vector{Int64}:
 1
 2
 3

Now let’s change one of the elements of b.

julia> b[1] = 100; b
3-element Vector{Int64}:
 100
   2
   3

We didn’t change a,
so it should be the same as before, right?
Nope!

julia> a
3-element Vector{Int64}:
 100
   2
   3

What happened?
Remember, a is just a label
for some data (a Vector).
When we created b,
we created a new label
for the same data.
Both a and b refer
to the same data,
so modifying one modifies the other.

(Figure: two labels on the same box.)

If you want b
to have the same values as a
but refer to different data,
use copy.

julia> b = copy(a)
3-element Vector{Int64}:
 100
   2
   3

julia> b[1] = 1; b
3-element Vector{Int64}:
 1
 2
 3

julia> a
3-element Vector{Int64}:
 100
   2
   3

(Figure: two labels on different boxes.)
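To check whether two variables label the same data, one option is the === operator, which tests object identity, while == only compares values. The following is a minimal sketch; the variable c is introduced here purely for illustration.

```julia
a = [1, 2, 3]
b = a          # b is a second label for the same Vector
c = copy(a)    # c labels a new Vector with the same values

println(a === b)  # identity check: true
println(a === c)  # false, because c is a different object
println(a == c)   # value comparison: true

b[1] = 100        # modifies the data shared by a and b
println(a)        # [100, 2, 3]
println(c)        # [1, 2, 3]
```

Comparing with === before mutating an array is a quick way to find out whether another label will see the change.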

Now that we know
how to create variables,
let’s learn about
some basic types of data
we can assign to variables.

Basic Types

Julia has many basic data types.

integer = 9000
floating_point = 3.14
boolean = true
imaginary = 1 + 2im
rational = 4//3
char = 'x'
str = "a string"
array = [1.0, 2.0]

Integers and floating-point numbers
can be expressed
with different numbers of bits.

Int64, Int32, Int16       # and more
Float64, Float32, Float16 # and more

By default,
integer numbers
(technically, literals)
are of type Int64 on 64-bit computers
or of type Int32 on 32-bit computers.
(Note that Int is shorthand for Int64 or Int32
for 64-bit or 32-bit computers, respectively.
Therefore, integer literals that fit in an Int are of type Int;
larger literals get a wider integer type.)

On the other hand,
floating-point numbers (literals)
of the form 3.14 or 2.3e5
are of type Float64 on all computers,
while those of the form 2.3f5
are of type Float32.

To use different numbers of bits,
just use the appropriate constructor.

julia> Int16(20)
20

julia> Float16(1.2)
Float16(1.2)
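To confirm which type a literal or a constructed value has, we can ask Julia with typeof. This short sketch only restates the rules above; the comments assume a 64-bit computer where noted.

```julia
println(typeof(9000))          # Int64 on a 64-bit computer (Int32 on a 32-bit one)
println(typeof(3.14))          # Float64
println(typeof(2.3e5))         # Float64
println(typeof(2.3f5))         # Float32
println(typeof(Int16(20)))     # Int16
println(typeof(Float16(1.2)))  # Float16

# Int is an alias for the platform's default integer type.
println(Int === Int64 || Int === Int32)  # true
```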

Basic Operations

Now we will cover
some basic operations.
This is by no means an exhaustive list;
check out the Julia documentation
for more details.

# Math
addition = 1 + 2.0
subtraction = 1 - 1
multiplication = 3 * 4//3
division = 6 / 4
integer_division = 6 ÷ 4 # Type \div<tab>
power = 2^7

# Boolean
not = !false
and = true && not
or = not || and

# Comparison
equality = addition == 1
greater = division > integer_division
chained = addition < subtraction <= power

# Strings
string_concatenation = "hi " * "there"
string_interpolation = "1 - 1 = $subtraction"
string_indexing = string_interpolation[5]
substring = string_concatenation[4:end]
parsing = parse(Int, string_indexing)

# Arrays
a = [1, 2, 3]
b = [4, 5, 6]
concat_vert = [a; b]
concat_horiz = [a b]
vector_indexing = b[2]
vector_range_indexing = b[1:2]
matrix_indexing = concat_horiz[2:3,1]
elementwise1 = a .+ 1
elementwise2 = a .- b

# Displaying
print(addition)
println(string_concatenation)
@show not
@info "some variables" power a
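A few of these operations have results that may be surprising at first, so here is a small sketch that prints them. It reuses the arrays a and b from the listing above and does not introduce anything new.

```julia
a = [1, 2, 3]
b = [4, 5, 6]

println(6 / 4)         # 1.5: dividing two integers with / gives a floating-point number
println(6 ÷ 4)         # 1: integer division drops the remainder
println(3 * 4//3)      # 4//1: rational arithmetic stays exact
println(1 + 2.0)       # 3.0: mixing Int and Float64 gives a Float64

println(size([a; b]))  # (6,): vertical concatenation gives a longer Vector
println(size([a b]))   # (3, 2): horizontal concatenation gives a Matrix
```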

Function Basics

Some of the basic operations we saw above,
e.g., parse and print,
were functions.
As demonstrated above,
functions are called
using the following familiar syntax:

func()           # For no inputs
func(arg1)       # For one input
func(arg1, arg2) # For two inputs
# etc.

Note that just writing the function name
(i.e., without parentheses)
is valid syntax, but it is not a function call.
In this case,
the function name is treated essentially like a variable,
meaning, for example, it can be used as an input
to another function.

For example,
one way to compute the sum
of the absolute value
of an array of numbers
is as follows:

julia> sum(abs, [-1, 0, 1])
2

Here,
the function abs is not being called (by us)
but is used as an input to the function sum
as if it were a variable.
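The same idea works with functions we define ourselves. In this sketch, square is a hypothetical function written only for this example.

```julia
# A small function of our own; the name square is hypothetical.
square(x) = x^2

# Pass the function by name, without parentheses.
println(sum(square, [1, 2, 3]))   # 14

# A function can also be given another label and called through it.
f = abs
println(f(-3))                    # 3
```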

Function Vectorization

Often,
we have a function
that operates on a single input
that we want to apply
to every element of an array.
Julia provides a convenient syntax
to do so:
just add a dot (.).
For example,
the following takes the absolute value
of every array element:

julia> abs.([-1, 0, 1])
3-element Vector{Int64}:
 1
 0
 1

There is also a function, map,
that does the same thing
in this example:

julia> map(abs, [-1, 0, 1])
3-element Vector{Int64}:
 1
 0
 1

(Note, however,
that map and the dot syntax
are not always interchangeable.)
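One place where the two differ is how they combine several inputs. The sketch below shows a case we are confident about: the dot syntax broadcasts a scalar across an array and rejects arrays of mismatched lengths, while map pairs up elements and stops at the shortest collection.

```julia
x = [1, 2, 3]

# The dot syntax broadcasts: the scalar 10 is combined with every element.
println(x .+ 10)                              # [11, 12, 13]

# map pairs up elements and stops at the end of the shortest collection.
println(map(+, x, [10, 20, 30, 400, 5000]))   # [11, 22, 33]

# Broadcasting two Vectors of different lengths is an error instead.
mismatch = try
    x .+ [10, 20]
    false
catch err
    err isa DimensionMismatch
end
println(mismatch)                             # true
```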

Defining Functions

When writing Julia code,
it is convenient
to place code inside of functions.
There are two main syntaxes
for creating a function.

  1. Using the function keyword:
    function myfunc(x)
        return x + 1
    end
    
  2. Using the assignment form:
    myfunc2(x, y) = x + y
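Both forms create ordinary functions that are called the same way. A quick check, repeating the two definitions from the list above:

```julia
function myfunc(x)
    return x + 1
end

myfunc2(x, y) = x + y

println(myfunc(1))       # 2
println(myfunc2(1, 2))   # 3
```

The assignment form is handy for one-line functions, while the function keyword is easier to read when the body spans several lines.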
    

Optional Arguments

Sometimes we want a function
to have optional inputs.
The syntax for specifying optional arguments is

function myfunc3(required, optional = "hello")
    println(required, optional)
end

Here,
optional is optional
and has a default value of "hello"
if not provided by the caller.

julia> myfunc3("say ")
say hello

julia> myfunc3("see you ", "later")
see you later

Keyword Arguments

Another way to specify optional arguments
is to use keyword arguments.
The syntax is almost the same
as regular optional arguments,
except we use a semicolon (;) instead of a comma (,).

function myfunc4(x; y = 3)
    return x * y
end

Here,
y is optional,
but to specify it
we need to use the keyword y.

julia> myfunc4(2)
6

julia> myfunc4(2; y = 10)
20

julia> myfunc4(2, 10)
ERROR: MethodError: no method matching myfunc4(::Int64, ::Int64)

When calling myfunc4
we can also use a comma
when specifying y.

julia> myfunc4(2, y = 1)
2
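Keyword arguments are especially useful when a function has several optional inputs, because the caller can name only the ones to change, in any order. The function scale below is hypothetical and exists only to illustrate this.

```julia
# Hypothetical function with one positional and two keyword arguments.
function scale(x; factor = 2, offset = 0)
    return x * factor + offset
end

println(scale(3))                            # 6
println(scale(3; offset = 1))                # 7
println(scale(3; offset = 1, factor = 10))   # 31
```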

Returning Multiple Values

Sometimes we need a function
to return multiple values.
The way to do this in Julia
is to return a Tuple.
Here’s an example:

function plusminus1(x)
    return (x + 1, x - 1)
end

Then multiple variables can be assigned at once.

julia> (plus1, minus1) = plusminus1(1)
(2, 0)

julia> plus1
2

julia> minus1
0

Note that taking the output
of a function with multiple return values
and assigning it to a single variable
will assign that variable the whole Tuple of outputs.
The following code illustrates this
and shows how to keep just one of the outputs:

julia> both = plusminus1(1);

julia> both
(2, 0)

julia> (one,) = plusminus1(1);

julia> one
2

(Note, however, that in this last case
the second output is still computed;
it is just immediately discarded,
so there are no savings in computation.)
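Two small variations on the assignments above may be useful. The parentheses on the left-hand side are optional, and an underscore can stand in for an output we do not need. This sketch repeats the definition of plusminus1; the name only_minus is made up for the example.

```julia
function plusminus1(x)
    return (x + 1, x - 1)
end

# The parentheses on the left-hand side are optional.
plus1, minus1 = plusminus1(1)
println(plus1)        # 2
println(minus1)       # 0

# By convention, _ marks an output we do not need.
_, only_minus = plusminus1(5)
println(only_minus)   # 4
```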

Vectorizing a Function with Multiple Return Values

Vectorizing a function with multiple return values
requires a bit more work.
For this example,
we will use the sincos function
that computes the sine and cosine simultaneously.
We can still use the dot syntax,
but we might be tempted to try the following:

julia> (s, c) = sincos.([0, π/2, π]);

julia> s
(0.0, 1.0)

julia> c
(1.0, 6.123233995736766e-17)

Here, s has the value of sincos(0),
not the value of sin.([0, π/2, π])
like we might have expected.

Instead, we can do the following:

julia> sc = sincos.([0, π/2, π])
3-element Vector{Tuple{Float64, Float64}}:
 (0.0, 1.0)
 (1.0, 6.123233995736766e-17)
 (1.2246467991473532e-16, -1.0)

julia> s = first.(sc)
3-element Vector{Float64}:
 0.0
 1.0
 1.2246467991473532e-16

julia> c = last.(sc)
3-element Vector{Float64}:
  1.0
  6.123233995736766e-17
 -1.0

(Note that instead of using first or last,
we could write it this way:
output_i = getindex.(sc, i).
This way also works for functions
that return more than two values.)
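As a sketch of the getindex approach with more than two outputs, consider a hypothetical function powers that returns three values. The function and the variable names are assumptions made for this example only.

```julia
# Hypothetical function returning three values.
powers(x) = (x, x^2, x^3)

out = powers.([1, 2, 3])      # Vector of 3-element Tuples

firsts = getindex.(out, 1)    # first output of every call
squares = getindex.(out, 2)   # second output of every call
cubes = getindex.(out, 3)     # third output of every call

println(firsts)    # [1, 2, 3]
println(squares)   # [1, 4, 9]
println(cubes)     # [1, 8, 27]
```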

Summary

In this post,
we learned about what a variable is
and some basic data types.
We also learned about
how to define and use functions.

There is a lot more we could cover
about these topics,
so if you want to learn more,
check out the links below,
or write a comment below
letting us know what additional concepts or topics
you would like to see!

Understand variables and functions in Julia?
Move on to the
next post to learn how to master the Julia REPL!
Or,
feel free to take a look
at our other Julia tutorial posts!

Additional Links

Working with rows of Tables.jl tables

By: Blog by Bogumił Kamiński

Re-posted from: https://bkamins.github.io/julialang/2023/09/22/tables.html

Introduction

Three weeks ago I wrote a post about getting a schema of Tables.jl tables.
Today, to complement it, I want to discuss how to get rows of such tables.

The post was written using Julia 1.9.2, Tables.jl 1.11.0, DataAPI.jl 1.15.0, and DataFrames.jl 1.6.1.

Why is getting rows of a table needed?

Many Julia users are happy with using DataFrames.jl to work with their tables.
However, this is only one of the available options.
This means that users, and especially package creators, often prefer not to hardcode DataFrame
as the specific type that their package supports, but instead allow for generic Tables.jl tables.

An example of such a need is a function that takes a generic table and
splits it into train-validation-test subsets. To achieve this you need to be able
to take a subset of its rows.

How is row sub-setting supported in Tables.jl?

There are two functions that, in combination, can be used to generically subset a Tables.jl table:

  • the DataAPI.nrow function that returns the number of rows in a table;
  • the Tables.subset function that allows you to get a subset of rows of a table.

Before I show you how they work, let me highlight one issue. Most Tables.jl tables
support these functions. However, their support is not guaranteed. The reason is that some tables
are never materialized in memory, e.g. they are only a stream of rows that can be read only once.
In such a case we cannot know the number of rows in the table up front (as it is dynamic) and, similarly,
to get a subset of its rows you would need to scan the whole stream anyway.

Using the row sub-setting interface of Tables.jl

The DataAPI.nrow function is easy to understand. You pass it a table and in return you get the number of its rows.
Let us see it in practice:

julia> using DataAPI

julia> using Tables

julia> table = (a=1:10, b=11:20, c=21:30)
(a = 1:10, b = 11:20, c = 21:30)

julia> DataAPI.nrow(table)
10

The Tables.subset function accepts two positional arguments. The first is a table, and the second
is the 1-based row indices that should be picked. You have two options for passing indices.
You can pass a single integer index like this:

julia> Tables.subset(table, 2)
(a = 2, b = 12, c = 22)

In this case you get a single row of the table.
The other option is to pass a collection of indices, in which case you get a table (not a single row):

julia> Tables.subset(table, 2:3)
(a = 2:3, b = 12:13, c = 22:23)

To see that this indeed works for other tables, let us check a DataFrame from DataFrames.jl:

julia> using DataFrames

julia> df = DataFrame(table)
10×3 DataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1     11     21
   2 │     2     12     22
   3 │     3     13     23
   4 │     4     14     24
   5 │     5     15     25
   6 │     6     16     26
   7 │     7     17     27
   8 │     8     18     28
   9 │     9     19     29
  10 │    10     20     30

julia> nrow(df)
10

julia> Tables.subset(df, 2)
DataFrameRow
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   2 │     2     12     22

julia> Tables.subset(df, 2:3)
2×3 DataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     2     12     22
   2 │     3     13     23

Again, note that Tables.subset(df, 2) returned a DataFrameRow (a single row of a table),
while Tables.subset(df, 2:3) returned a DataFrame (a table).
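Coming back to the motivating use case, here is a minimal sketch of a generic splitting function built only from DataAPI.nrow and Tables.subset. The helper name split_table, its train_fraction and rng keyword arguments, and the choice of a two-way (not three-way) split are my assumptions for illustration. It also assumes the table supports both functions, which, as noted earlier, is not guaranteed for every table.

```julia
using DataAPI, Tables, Random

# Hypothetical helper: shuffle row indices and split them into two parts.
function split_table(table; train_fraction = 0.8, rng = Random.default_rng())
    n = DataAPI.nrow(table)                  # number of rows
    idx = randperm(rng, n)                   # shuffled 1-based row indices
    ntrain = round(Int, train_fraction * n)  # size of the first part
    train = Tables.subset(table, idx[1:ntrain])
    test = Tables.subset(table, idx[ntrain+1:end])
    return train, test
end

table = (a = 1:10, b = 11:20, c = 21:30)
train, test = split_table(table; rng = MersenneTwister(1))

println(DataAPI.nrow(train))   # 8
println(DataAPI.nrow(test))    # 2
```

Because the function never mentions a concrete table type, the same code should also accept a DataFrame or any other table that implements the two functions.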

Advanced sub-setting options

If you work with large tables, you often need to consider performance and memory consumption.
For Tables.subset, the relevant question is whether the function copies data
or just makes a view of the source table. This choice is controlled by the viewhint keyword argument.

Let us first see how it works:

julia> Tables.subset(df, 2:3, viewhint=true)
2×3 SubDataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     2     12     22
   2 │     3     13     23

julia> Tables.subset(df, 2:3, viewhint=false)
2×3 DataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     2     12     22
   2 │     3     13     23

As you can see viewhint=true returned a view (a SubDataFrame), while viewhint=false produced a copy.

Let us see another example:

julia> table2 = Tables.rowtable(df)
10-element Vector{NamedTuple{(:a, :b, :c), Tuple{Int64, Int64, Int64}}}:
 (a = 1, b = 11, c = 21)
 (a = 2, b = 12, c = 22)
 (a = 3, b = 13, c = 23)
 (a = 4, b = 14, c = 24)
 (a = 5, b = 15, c = 25)
 (a = 6, b = 16, c = 26)
 (a = 7, b = 17, c = 27)
 (a = 8, b = 18, c = 28)
 (a = 9, b = 19, c = 29)
 (a = 10, b = 20, c = 30)

julia> Tables.subset(table2, 2:3, viewhint=true)
2-element view(::Vector{NamedTuple{(:a, :b, :c), Tuple{Int64, Int64, Int64}}}, 2:3) with eltype NamedTuple{(:a, :b, :c), Tuple{Int64, Int64, Int64}}:
 (a = 2, b = 12, c = 22)
 (a = 3, b = 13, c = 23)

julia> Tables.subset(table2, 2:3, viewhint=false)
2-element Vector{NamedTuple{(:a, :b, :c), Tuple{Int64, Int64, Int64}}}:
 (a = 2, b = 12, c = 22)
 (a = 3, b = 13, c = 23)

As you can see viewhint=true produced a view of a vector, while viewhint=false made a copy of source data.

Now you might ask why the keyword argument is called viewhint. The reason is that not all Tables.jl tables allow
for the flexibility of making either a view or a copy. Therefore the rules are as follows:

  • if viewhint is not passed, then the table decides on its side whether it returns a copy or a view (depending on what is possible);
  • if viewhint=true, then the table should return a view, but if that is not possible the result can be a copy;
  • if viewhint=false, then the table should return a copy, but if that is not possible the result can be a view.

In other words, viewhint should be considered a performance hint only.
It is not guaranteed to produce what you ask for (as for some tables satisfying the request might be impossible).
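Since viewhint is only a hint, generic code may want to inspect what it actually got back instead of assuming. Below is a small sketch using a row table (a vector of named tuples), where, as shown above, the hint is honored. The variable names rows, v, and c are mine.

```julia
using Tables

rows = Tables.rowtable((a = 1:10, b = 11:20))   # Vector of NamedTuples

v = Tables.subset(rows, 2:3; viewhint = true)
c = Tables.subset(rows, 2:3; viewhint = false)

# For this table type the hint is honored, but generic code should not
# rely on that, so we check the result instead of assuming it.
println(v isa SubArray)   # true: a view into rows
println(c isa SubArray)   # false: an independent Vector
println(v == c)           # true: both contain the same two rows
```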

Conclusions

To summarize: if you want to write a generic function that subsets a Tables.jl table, you can use:

  • the DataAPI.nrow function to learn how many rows it has;
  • the Tables.subset function to get a subset of its rows using 1-based indexing.

I hope these examples are useful for your work.