Author Archives: DSB

Deep Learning with Julia

By: DSB

Re-posted from: https://medium.com/coffee-in-a-klein-bottle/deep-learning-with-julia-e7f15ad5080b?source=rss-8bd6ec95ab58------2

A brief tutorial on training a Neural Network with Flux.jl

Flux.jl is the most popular Deep Learning framework in Julia. It provides a very elegant way of programming Neural Networks. Unfortunately, since Julia is still not as popular as Python, there aren’t as many tutorial guides on how to use it. Also, Julia is improving very fast, so things can change a lot in a short amount of time.

I’ve been trying to learn Flux.jl for a while, and I realized that most tutorials out there are actually outdated. So this is a brief updated tutorial.

1. What we are going to build

So, the goal of this tutorial is to build a simple classification Neural Network. This will be enough for anyone who is interested in getting started with Flux. After learning the very basics, the rest is pretty much a matter of altering network architectures and loss functions.

2. Generating our Dataset

Instead of importing data from somewhere, let’s do everything self-contained. Hence, we write two auxiliary functions to generate our data:

# Packages used throughout this tutorial
using Flux, Plots, Statistics

# Auxiliary functions for generating our data
function generate_real_data(n)
    x1 = rand(1,n) .- 0.5
    x2 = (x1 .* x1)*3 .+ randn(1,n)*0.1
    return vcat(x1,x2)
end

function generate_fake_data(n)
    θ  = 2*π*rand(1,n)
    r  = rand(1,n)/3
    x1 = @. r*cos(θ)
    x2 = @. r*sin(θ)+0.5
    return vcat(x1,x2)
end

# Creating our data
train_size = 5000
real = generate_real_data(train_size)
fake = generate_fake_data(train_size)

# Visualizing
scatter(real[1,1:500],real[2,1:500])
scatter!(fake[1,1:500],fake[2,1:500])
Visualizing the Dataset

3. Creating the Neural Network

The creation of Neural Network architectures with Flux.jl is very direct and clean (cleaner than any other Library I know). Here is how you do it:

function NeuralNetwork()
    return Chain(
        Dense(2, 25, relu),
        Dense(25, 1, σ)
    )
end

The code is very self-explanatory. The first layer is a dense layer with input 2, output 25 and relu for activation function. The second is a dense layer with input 25, output 1 and a sigmoid activation function. The Chain ties the layers together. Yeah, it’s that simple.
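To make the two layers concrete, here is a minimal sketch of the same forward pass written in plain Julia, without Flux. The names myrelu, mysigmoid and forward, as well as the randomly drawn weights, are assumptions made for illustration only; Flux initializes and stores its own parameters inside the Dense layers.

```julia
# Activation functions written out by hand (hypothetical names, to avoid clashing with Flux)
myrelu(z) = max(zero(z), z)
mysigmoid(z) = 1 / (1 + exp(-z))

# Random weights with the same shapes as the two Dense layers: 2 -> 25 -> 1
W1, b1 = randn(25, 2), zeros(25)
W2, b2 = randn(1, 25), zeros(1)

# Each dense layer computes activation.(W * x .+ b); chaining feeds one into the next
function forward(x)
    h = myrelu.(W1 * x .+ b1)
    return mysigmoid.(W2 * h .+ b2)
end

x = rand(2, 5)   # 5 samples with 2 features each, one sample per column
ŷ = forward(x)   # 1×5 matrix of values between 0 and 1
println(size(ŷ))
```

As in the Flux model, each column of the input is one sample, and the output has one probability per column.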

4. Training our Model

Next, let’s prepare our model to be trained.

# Organizing the data in batches
X = hcat(real,fake)
Y = vcat(ones(train_size),zeros(train_size))
data = Flux.Data.DataLoader(X, Y', batchsize=100,shuffle=true);
# Defining our model, optimization algorithm and loss function
m = NeuralNetwork()
opt = Descent(0.05)
loss(x, y) = sum(Flux.Losses.binarycrossentropy(m(x), y))

In the code above, we first organize our data into one single dataset. We use the DataLoader from Flux, which helps us create the batches and shuffle our data. Then, we call our model and define the loss function and the optimization algorithm. In this example, we are using gradient descent for optimization and binary cross-entropy for the loss function.
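For intuition, the batching and the loss can be sketched in plain Julia. This is not how Flux implements them; the function names bce and minibatches, the ϵ value and the demo data are assumptions chosen for illustration.

```julia
using Random, Statistics

# Mean binary cross-entropy; ϵ guards against log(0)
function bce(ŷ, y; ϵ = 1e-7)
    return mean(@. -y * log(ŷ + ϵ) - (1 - y) * log(1 - ŷ + ϵ))
end

# Split the columns of X (and the matching columns of Y) into shuffled mini-batches
function minibatches(X, Y; batchsize = 100)
    idx = shuffle(1:size(X, 2))
    return [(X[:, i], Y[:, i]) for i in Iterators.partition(idx, batchsize)]
end

# Hypothetical demo data: 250 samples with 2 features and 0/1 labels
X_demo = rand(2, 250)
Y_demo = rand(0:1, 1, 250)
batches = minibatches(X_demo, Y_demo; batchsize = 100)
println(length(batches))
println(bce([0.5, 0.5], [1.0, 0.0]))
```

Note that the last batch is smaller whenever the number of samples is not a multiple of the batch size.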

Everything is ready, and we can start training the model. Here, I’ll show two ways of doing it.

Training Method 1

ps = Flux.params(m)
epochs = 20
for i in 1:epochs
    Flux.train!(loss, ps, data, opt)
end
println(mean(m(real)), " ", mean(m(fake))) # Print mean model predictions

In this code, first we declare what parameters are going to be trained, which is done using the Flux.params() function. The reason for this is that we can choose not to train a layer in our network, which might be useful in the case of transfer learning. Since in our example we are training the whole model, we just pass all the parameters to the training function.

Other than this, there is not much to be said. The final line of code just prints the mean predicted probability our model gives for each class.

Training Method 2

m = NeuralNetwork()
function trainModel!(m,data;epochs=20)
    for epoch = 1:epochs
        for d in data
            gs = gradient(Flux.params(m)) do
                l = loss(d...)
            end
            Flux.update!(opt, Flux.params(m), gs)
        end
    end
    @show mean(m(real)),mean(m(fake))
end
trainModel!(m,data;epochs=20)

This method is a bit more convoluted, because we are doing the training “manually”, instead of using the training function given by Flux. This is interesting since one has more control over the training, which can be useful for customized training methods. Perhaps the most confusing part of the code is this one:

gs = gradient(Flux.params(m)) do
    l = loss(d...)
end
Flux.update!(opt, Flux.params(m), gs)

The function gradient receives the parameters with respect to which the gradient will be calculated, and applies it to the loss function, which is evaluated on the batch d. The splat operator (the three dots) is just a neat way of passing x and y to the loss function. Finally, the update! function adjusts the parameters according to the gradients, which are stored in the variable gs.
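The same "compute the gradient, then update the parameters" loop can be sketched without Flux for a much simpler model. The example below is an illustration under stated assumptions: it uses a logistic regression whose gradient is written out by hand (instead of automatic differentiation), and the names logistic, logistic_loss, logistic_gradient, descent_step and fit, as well as the toy data, are all hypothetical.

```julia
using Statistics

logistic(z) = 1 / (1 + exp(-z))

# Mean binary cross-entropy of the model ŷ = logistic(w'x + b)
function logistic_loss(w, b, X, y)
    ŷ = logistic.(vec(w' * X) .+ b)
    return mean(@. -y * log(ŷ) - (1 - y) * log(1 - ŷ))
end

# Hand-derived gradient of the loss above with respect to w and b
function logistic_gradient(w, b, X, y)
    ŷ = logistic.(vec(w' * X) .+ b)
    err = ŷ .- y
    return X * err ./ length(y), mean(err)
end

# One plain gradient-descent update: parameter minus learning rate times gradient
function descent_step(w, b, X, y; η = 0.1)
    gw, gb = logistic_gradient(w, b, X, y)
    return w .- η .* gw, b - η * gb
end

# Repeat the update a fixed number of times, starting from zeros
function fit(X, y; steps = 200, η = 0.1)
    w, b = zeros(size(X, 1)), 0.0
    for _ in 1:steps
        w, b = descent_step(w, b, X, y; η = η)
    end
    return w, b
end

# Toy data: four samples (columns) with two features each
X = [-1.0 -0.5 0.5 1.0; 0.2 -0.1 0.1 -0.2]
y = [0.0, 0.0, 1.0, 1.0]

loss_before = logistic_loss(zeros(2), 0.0, X, y)
w, b = fit(X, y)
loss_after = logistic_loss(w, b, X, y)
println((loss_before, loss_after))
```

In the Flux version, gradient plays the role of logistic_gradient and update! with Descent plays the role of descent_step, for every parameter of the network at once.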

5. Visualizing the Results

Finally, the model is trained, and we can visualize its performance against the dataset.

# Color each plotted point by the model prediction for that same point
scatter(real[1,1:100],real[2,1:100],zcolor=vec(m(real[:,1:100])))
scatter!(fake[1,1:100],fake[2,1:100],zcolor=vec(m(fake[:,1:100])),legend=false)
Neural Network prediction against the training dataset

Note that our model is performing quite well. It properly classifies the points in the middle with probability close to 0, implying that they belong to the “fake data”, while the rest have probability close to 1, meaning that they belong to the “real data”.
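Besides the visual check, one could also compute a single accuracy number. A minimal sketch, assuming a 0.5 decision threshold; the function name accuracy and the demo vectors are hypothetical and are not taken from the trained model:

```julia
using Statistics

# Fraction of predictions that fall on the correct side of the 0.5 threshold
accuracy(ŷ, y) = mean((ŷ .> 0.5) .== (y .> 0.5))

# Hypothetical predicted probabilities and true labels (1 = real, 0 = fake)
ŷ_demo = [0.9, 0.8, 0.3, 0.1, 0.6]
y_demo = [1.0, 1.0, 0.0, 0.0, 0.0]
println(accuracy(ŷ_demo, y_demo))
```

With the model from this tutorial, the call would look something like accuracy(vec(m(X)), Y).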

6. Conclusion

That’s all for our brief introduction. Hopefully this is the first article in a series on how to do Machine Learning with Julia.

Note that this tutorial is focused on simplicity, and not on writing the most efficient code. To learn how to improve performance, look here.

TL;DR
Here is the code with everything put together:

# Packages used throughout this tutorial
using Flux, Plots, Statistics

# Auxiliary functions for generating our data
function generate_real_data(n)
    x1 = rand(1,n) .- 0.5
    x2 = (x1 .* x1)*3 .+ randn(1,n)*0.1
    return vcat(x1,x2)
end

function generate_fake_data(n)
    θ  = 2*π*rand(1,n)
    r  = rand(1,n)/3
    x1 = @. r*cos(θ)
    x2 = @. r*sin(θ)+0.5
    return vcat(x1,x2)
end

# Creating our data
train_size = 5000
real = generate_real_data(train_size)
fake = generate_fake_data(train_size)

# Visualizing
scatter(real[1,1:500],real[2,1:500])
scatter!(fake[1,1:500],fake[2,1:500])

function NeuralNetwork()
    return Chain(
        Dense(2, 25, relu),
        Dense(25, 1, σ)
    )
end

# Organizing the data in batches
X = hcat(real,fake)
Y = vcat(ones(train_size),zeros(train_size))
data = Flux.Data.DataLoader(X, Y', batchsize=100,shuffle=true);

# Defining our model, optimization algorithm and loss function
m = NeuralNetwork()
opt = Descent(0.05)
loss(x, y) = sum(Flux.Losses.binarycrossentropy(m(x), y))

# Training Method 1
ps = Flux.params(m)
epochs = 20
for i in 1:epochs
    Flux.train!(loss, ps, data, opt)
end
println(mean(m(real)), " ", mean(m(fake))) # Print mean model predictions

# Visualizing the model predictions
scatter(real[1,1:100],real[2,1:100],zcolor=vec(m(real[:,1:100])))
scatter!(fake[1,1:100],fake[2,1:100],zcolor=vec(m(fake[:,1:100])),legend=false)


Deep Learning with Julia was originally published in Coffee in a Klein Bottle on Medium, where people are continuing the conversation by highlighting and responding to this story.

Creating and Deploying your Julia Package Documentation

By: DSB

Re-posted from: https://medium.com/coffee-in-a-klein-bottle/creating-and-deploying-your-julia-package-documentation-1d09ddc90474?source=rss-8bd6ec95ab58------2

A tutorial on how to create and deploy your Julia Package documentation using Documenter.jl and GitHub Actions.

If you are developing a new package for Julia, you might’ve followed the steps in this article, and are now wondering how to create the documentation for your package. Well, this is what this article is for. Here, our new package is also called VegaGraphs.jl, which is a package that I’m developing at the moment.

In this tutorial I’ll be using the package Documenter.jl together with the GitHub Actions plugin. The Documenter.jl package will help us create the documentation, and the GitHub Actions plugin will create a bot for us that will publish our documentation on our GitHub page.

1. Creating Docstrings

First of all, when you write the functions in your package, above each function you should write a Docstring explaining the arguments used in the function, what the function does, etc.

# Example of function inside ./src/VegaGraphs.jl
"""
    MyFunction(x,y)

This is an example of Docstring. This function receives two
numbers x and y and returns the sum of the squares.
```math
x^2 + y^2
```
"""
function MyFunction(x,y)
    return x^2+y^2
end

Note that you should use triple quotes, and place the text right above the function you are documenting. Also, you may use LaTeX to write math equations, as shown in the lines:

```math
x^2 + y^2
```

When you generate your documentation, this equation will be properly rendered, and you will have a beautiful mathematical equation.
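Before building the documentation, it can be handy to check that the Docstring is actually attached to the function. The sketch below is self-contained and uses a shortened version of the Docstring; @doc is the programmatic counterpart of typing ?MyFunction in the REPL.

```julia
"""
    MyFunction(x,y)

Returns the sum of the squares of `x` and `y`.
"""
function MyFunction(x, y)
    return x^2 + y^2
end

# Retrieve the docstring attached to the function and print it
docs = @doc MyFunction
println(docs)
println(MyFunction(3, 4))
```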

2. Setting up Documenter.jl

Next we must set up Documenter.jl. To do this, first create a folder named docs and inside of it create a file named make.jl and another folder named src. Your package folder should look something like this:

VegaGraphs/
├── docs/
│   ├── make.jl
│   └── src/
├── src/
│   └── VegaGraphs.jl
...

Inside the make.jl file we will write the code that Documenter.jl will use to create a nice webpage for our documentation. Inside make.jl write the following (changing the name of the package from VegaGraphs to yours):

# Inside make.jl
push!(LOAD_PATH,"../src/")
using VegaGraphs
using Documenter
makedocs(
    sitename = "VegaGraphs.jl",
    modules  = [VegaGraphs],
    pages    = [
        "Home" => "index.md"
    ])
deploydocs(;
    repo="github.com/USERNAME/VegaGraphs.jl",
)

Most of the code here is self-explanatory. You are defining the name of the website for the documentation, the module which you will be documenting, and the pages your website will have. For now, our documentation will only have “Home”, and the information that will be on this page will be inside the index.md file.

Inside ./docs/src you need to create the file named index.md. This is a markdown file where you will write what the “Home” page should look like. Here is an example:

# VegaGraphs.jl
*The best summation package.*
## Package Features
- Sum the squares of two numbers
## Function Documentation
```@docs
MyFunction
```

Everything here should be familiar to you if you know markdown. The only part that looks different is the @docs block at the end. This is where our Docstring comes in. The Documenter.jl package will take the Docstring from the function MyFunction and place it where we wrote:

```@docs
MyFunction
```

As you create new functions, just add more of these blocks to your index.md, and you will rapidly create your package’s documentation.

The final step in regards to Documenter.jl is to build the whole thing:

# from your terminal, inside ./docs (where make.jl is located)
# Remember to install Documenter.jl before running this
julia make.jl

After running this command, a new folder called build will be created inside docs, and this folder will contain all the HTML files for your documentation. You may now open this folder.

3. Deploying your Documentation with GitHub Actions

Your website containing the documentation for the package is already created, and you may host the webpages using any method you want. In this section, I’ll explain how to use GitHub Actions to automatically publish the documentation using GitHub Pages.

Assuming you followed this article here on how to develop your package, you already have GitHub Actions working in the background. What you must do now is create a file named Documentation.yml inside the .github/workflows folder. Inside this file, you should have something like this:

name: Documentation
on:
  push:
    branches:
      - master
    tags: '*'
  pull_request:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: julia-actions/setup-julia@latest
        with:
          version: '1.5'
      - name: Install dependencies
        run: julia --project=docs/ -e 'using Pkg; Pkg.develop(PackageSpec(path=pwd())); Pkg.instantiate()'
      - name: Build and deploy
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # For authentication with GitHub Actions token
        run: julia --project=docs/ docs/make.jl

You can pretty much copy and paste the code above, and the next time you push new commits to your repository, the bot will run and generate the documentation. Also, note that it will create a branch named “gh-pages”. This is the branch containing the webpages for the documentation.

To use GitHub Pages for hosting our documentation, we must enable GitHub Pages on the repository containing the package. To do this, just go to the repository’s GitHub page, click on Settings, scroll down to the “GitHub Pages” section and enable it.

Example showing how to enable the hosting of your documentation

After doing all this, your documentation will be available at https://username.github.io/VegaGraphs.jl/dev.

And you now have a beautiful website for your documentation.

Example of documentation page generated with Documenter.jl


Creating and Deploying your Julia Package Documentation was originally published in Coffee in a Klein Bottle on Medium, where people are continuing the conversation by highlighting and responding to this story.

Developing your Julia package

By: DSB

Re-posted from: https://medium.com/coffee-in-a-klein-bottle/developing-your-julia-package-682c1d309507?source=rss-8bd6ec95ab58------2

A Tutorial on how to quickly and easily develop your own Julia package

We’ll show step-by-step how to develop your first package in Julia. To do this, we use the very useful PkgTemplates.jl, and we base our Tutorial on this guide by QuantEcon and this video by Chris Rackauckas.

First things first. Start by creating the folder to store your package. In our case, we’ll be creating a package named “VegaGraphs”, which is a package that I’m developing. So, inside this folder, open your Julia REPL and install PkgTemplates.jl.

Installing PkgTemplates

1. Creating your Package Template

First, we need to create a template for our package. This template will include things like Licensing, Plugins, Authors, etc. For our example, we’ll be using a simple template with the MIT License. The plugins used will be GitHub Actions and Codecov. I’ll not be diving into these plugins, but just for the sake of clarity, I’ll briefly explain what they do.

  • GitHub Actions is a plugin that automatically builds a virtual machine and tests your code. This lets you control the machine configurations needed to run your package.
  • Codecov is a plugin that analyzes your code and evaluates how much of it is covered by your tests. So, for example, suppose that you wrote 3 different functions, but forgot to write a test for one of them. Then, Codecov will point out that there are no tests for that function.

Still inside the REPL, run the following commands:

t = Template(;user="YourUserNameOnGithub", plugins = [GitHubActions(), Codecov()], manifest = true)
generate("VegaGraphs.jl",t)

The first line of code defines the template, while the second one will generate your package. Note that I didn’t specify a folder, so the package will be created in a default location, which will be ~/.julia/dev (for Linux).
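To check where the package ended up, the default location can be computed from the REPL. This is a small sketch that assumes the default depot layout (the first entry of DEPOT_PATH, usually ~/.julia) and ignores any custom development directory set through the JULIA_PKG_DEVDIR environment variable.

```julia
# The first entry of DEPOT_PATH is the user depot, typically ~/.julia
dev_dir = joinpath(first(DEPOT_PATH), "dev")

# Folder where the generated package is expected to live
pkg_dir = joinpath(dev_dir, "VegaGraphs")
println(pkg_dir, " exists: ", isdir(pkg_dir))
```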

Now, just copy the files from ~/.julia/dev/VegaGraphs to the folder where you will be working from, and then set up your git repository by running the following commands in the terminal:

# inside the VegaGraphs working folder
git init
git add -A
git commit -m "first commit"
git branch -M master
git remote add origin git@github.com:YOURUSERNAME/VegaGraphs.git
git push -u origin master

Taking a look inside the generated folder, you’ll have two folders and four files:

./src
./test
README.md
LICENSE
Manifest.toml
Project.toml
  • /src : This folder is where you will write the code for your package per se;
  • /test : This is where you store your tests;
  • Project.toml : This is where you will store information such as the author, dependencies, Julia version, etc;
  • Manifest.toml : This is a machine-generated file, and you should just leave it be.

The other files are self-explanatory.

2. Writing Code

We are ready to start coding our package. Note that inside the /src folder we already have a file named VegaGraphs.jl , which is the main file of our package. We can do all our coding directly inside VegaGraphs.jl , but as our code gets large, this might become messy.

Instead, we can write many different files for organizing our code, and then use VegaGraphs.jl to join everything together. Let’s do an example. We’ll code a very simple function, which we’ll store in another file, called graph_functions.jl , that will also be inside the ./src folder.

Here is some example code:

# code inside graph_functions.jl
function sum_values(x,y)
return x+y
end

The code above is a simple implementation of a function. Below I show how to actually make this function available to users. One just needs to “include” the graph_functions.jl file, and to export sum_values. Once exported, the function is available to anyone who imports our package.

# code inside VegaGraphs.jl
module VegaGraphs
using VegaLite
export sum_values
include("graph_functions.jl")
end

Note that, besides including our function, I’ve also imported the VegaLite package. Hence, I need to specify VegaLite.jl as a dependency. We’ll do this by using the REPL again.

Go to the root of your package and open the REPL by running the command julia in the terminal. Now, press ] . This will put you in “package mode”. Next, write activate . , which will activate the Julia environment of your current folder. Finally, write add VegaLite , and this will add VegaLite.jl to your dependencies inside the Project.toml file.

Adding dependency to your package
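After the dependency is added, the Project.toml file gains a [deps] section. The sketch below parses a hypothetical Project.toml with the TOML standard library (available from Julia 1.6) to show the structure. Both UUIDs are placeholders invented for this example; Pkg writes the real ones.

```julia
using TOML

# Hypothetical Project.toml contents after `add VegaLite`;
# both UUIDs are placeholders, Pkg fills in the real values
project_text = """
name = "VegaGraphs"
uuid = "00000000-0000-0000-0000-000000000000"
version = "0.1.0"

[deps]
VegaLite = "11111111-1111-1111-1111-111111111111"
"""

project = TOML.parse(project_text)

# List the names of the declared dependencies
println(collect(keys(project["deps"])))
```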

3. Creating and running tests

So we’ve implemented a function and added a dependency to our package. The next step is to write a test, to guarantee that our code is indeed working. The code is self-explanatory: we just write our test inside a testset. You can write as many as you like to guarantee that your function is working properly.

# code inside ./test/runtests.jl
using VegaGraphs
using Test
@testset "VegaGraphs.jl" begin
    x = 2
    y = 2
    @test VegaGraphs.sum_values(x,y) == 4
end

Once the test is written, we have to run it and see if everything is working. Again, go to the root of the project and open your REPL. Make sure that your environment is activated and run the tests as shown in the image below:

Running tests for your package

This will run your tests and see if everything passes. Once everything passes, we can trust that our code is working properly, and we can move on to more implementations.
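To get a feeling for the Test standard library without the package around it, here is a self-contained sketch. The function sum_values is redefined locally as a stand-in for the package function, and the extra test cases are assumptions added for illustration.

```julia
using Test

# Stand-in for the package function, defined locally so the sketch runs on its own
sum_values(x, y) = x + y

@testset "sum_values" begin
    @test sum_values(2, 2) == 4
    @test sum_values(-1, 1) == 0
    @test sum_values(0.5, 0.25) ≈ 0.75
    # Adding a String and a number has no method, so this should throw
    @test_throws MethodError sum_values("a", 1)
end
```

Running the file prints a summary table with the number of passed tests, which is the same kind of output you get when testing the whole package.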

4. Workflow – Text editor + Jupyter Notebook

With everything shown up until now, you are ready to develop your package in Julia. Still, you might be interested in how to develop an efficient workflow. There are many possibilities here, such as using IDEs like Juno and VS Code. I prefer to use Jupyter Notebooks with Vim (my text editor of choice), and doing everything from the terminal.

To use your developing package in your Notebook, you will need to activate your environment in a similar way that we’ve been doing up until now. As you open a new Notebook, run the following code in the very first cell.

Using Jupyter Notebook for developing packages
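A first cell along the following lines should do the job. This is a sketch under a few assumptions: the notebook was started from the root folder of the package (hence the "." path), Revise.jl is already installed, and the package is called VegaGraphs.

```julia
using Pkg

# Activate the environment of the package; "." assumes the notebook runs from the package root
Pkg.activate(".")

# Load Revise first, so that later edits to the package source are tracked
using Revise

# Only now load the package under development
using VegaGraphs
```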

Note here that besides activating the environment, we also imported a package called Revise. This package is very helpful, and it should be imported right at the beginning of your notebook, before you actually import your own package, otherwise it won’t work properly!

Every time you modify the code in your package, to load the changes to your notebook you would have to restart the kernel. But when you use Revise.jl, you don’t need to restart your kernel, just import your package again, and the modifications will be applied.

Now, my workflow is very simple. I use Vim to modify the code in my package, and use the Notebook for trying things out. Once I get everything working as it should, I write down some tests, and test the whole thing.

5. Registering/Publishing your Package

Finally, suppose that you’ve finished writing your package, and you are ready to share it with the Julia community. Like with other packages, you want users to be able to write a simple Pkg.add("MyPackage") and install it. This process is called registering.

To do this, first go to Registrator.jl and install the app to your GitHub account.

In the Registrator.jl GitHub page, click on “install app”

Next, go to your package’s GitHub page and enter the Issues tab. Create a new issue, and write @JuliaRegistrator register() , as shown in the image below.

Registering your package

Once you’ve done this, the JuliaRegistrator bot will open a pull request for your package to be registered. Unless you actually want your package to be registered, don’t actually submit the issue.

And that’s all.


Developing your Julia package was originally published in Coffee in a Klein Bottle on Medium, where people are continuing the conversation by highlighting and responding to this story.