Category Archives: Julia

Experience JuliaCon Local – Eindhoven 2023

By: Jasmine Chokshi

Re-posted from: https://info.juliahub.com/blog/juliacon-local-eindhoven-2023

JuliaCon Local Eindhoven 2023 kicks off next Friday (December 1, 2023), at the High Tech Campus Conference Center in Eindhoven, Netherlands. Co-hosted with PyData Eindhoven 2023, this conference promises a day filled with insights, innovations, and networking opportunities for Julia enthusiasts.

World Happiness Report – EDA & clustering with Julia

By: Navi

Re-posted from: https://indymnv.dev/posts/005_happines/index.html


Date: 2023-11-23

Summary: An exploration of Happiness Report using Julia

tags: #Julia #economy #clustering #EDA




Table of Contents

  1. Introduction
  2. Packages used
  3. Clustering
  4. Conclusions

Introduction

The purpose of this post is to show Julia as a language for data analysis and machine learning. Sadly, Kaggle does not support Julia kernels (hopefully, they will add them in the future), so I wanted to use this space to show a reimplementation of Python/R notebooks in Julia. For this, I took data on happiness across countries in 2021, along with some of the factors considered in this survey.

  • You can get the dataset in Kaggle

  • The full code is in my Github

Packages used

I'm using Julia version 1.8.0 in this project, and the library versions are in the Project.toml. Some installed packages did not end up being used in this analysis, but these are the important ones:

using DataFrames
using DataFramesMeta
using CSV
using Plots
using StatsPlots
using Statistics
using HypothesisTests
Plots.theme(:ggplot2)

Let's start reading the file.

df_2021 = DataFrame(CSV.File("./data/2021.csv", normalizenames=true))

You can see the dataset in the REPL.

julia> df_2021 = DataFrame(CSV.File("./data/2021.csv", normalizenames=true))
149×20 DataFrame
 Row │ Country_name    Regional_indicator            Ladder_score  Standard_error_of_ladder_score  upperwhi ⋯
     │ String31        String                        Float64       Float64                         Float64  ⋯
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ Finland         Western Europe                       7.842                           0.032         7 ⋯
   2 │ Denmark         Western Europe                       7.62                            0.035         7
   3 │ Switzerland     Western Europe                       7.571                           0.036         7
   4 │ Iceland         Western Europe                       7.554                           0.059         7
   5 │ Netherlands     Western Europe                       7.464                           0.027         7 ⋯
   6 │ Norway          Western Europe                       7.392                           0.035         7
   7 │ Sweden          Western Europe                       7.363                           0.036         7
   8 │ Luxembourg      Western Europe                       7.324                           0.037         7
   9 │ New Zealand     North America and ANZ                7.277                           0.04          7 ⋯
  10 │ Austria         Western Europe                       7.268                           0.036         7
  11 │ Australia       North America and ANZ                7.183                           0.041         7
  12 │ Israel          Middle East and North Africa         7.157                           0.034         7
  13 │ Germany         Western Europe                       7.155                           0.04          7 ⋯
  14 │ Canada          North America and ANZ                7.103                           0.042         7
  ⋮  │       ⋮                      ⋮                     ⋮                      ⋮                      ⋮   ⋱
 136 │ Togo            Sub-Saharan Africa                   4.107                           0.077         4
 137 │ Zambia          Sub-Saharan Africa                   4.073                           0.069         4
 138 │ Sierra Leone    Sub-Saharan Africa                   3.849                           0.077         4 ⋯
 139 │ India           South Asia                           3.819                           0.026         3
 140 │ Burundi         Sub-Saharan Africa                   3.775                           0.107         3
 141 │ Yemen           Middle East and North Africa         3.658                           0.07          3
 142 │ Tanzania        Sub-Saharan Africa                   3.623                           0.071         3 ⋯
 143 │ Haiti           Latin America and Caribbean          3.615                           0.173         3
 144 │ Malawi          Sub-Saharan Africa                   3.6                             0.092         3
 145 │ Lesotho         Sub-Saharan Africa                   3.512                           0.12          3
 146 │ Botswana        Sub-Saharan Africa                   3.467                           0.074         3 ⋯
 147 │ Rwanda          Sub-Saharan Africa                   3.415                           0.068         3
 148 │ Zimbabwe        Sub-Saharan Africa                   3.145                           0.058         3
 149 │ Afghanistan     South Asia                           2.523                           0.038         2

To see the column names, simply use

names(df_2021)

which returns a vector with all the column names:

julia> names(df_2021)
20-element Vector{String}:
 "Country_name"
 "Regional_indicator"
 "Ladder_score"
 "Standard_error_of_ladder_score"
 "upperwhisker"
 "lowerwhisker"
 "Logged_GDP_per_capita"
 "Social_support"
 "Healthy_life_expectancy"
 "Freedom_to_make_life_choices"
 "Generosity"
 "Perceptions_of_corruption"
 "Ladder_score_in_Dystopia"
 "Explained_by_Log_GDP_per_capita"
 "Explained_by_Social_support"
 "Explained_by_Healthy_life_expectancy"
 "Explained_by_Freedom_to_make_life_choices"
 "Explained_by_Generosity"
 "Explained_by_Perceptions_of_corruption"
 "Dystopia_residual"

To see what a regional indicator is, we can look at how the countries are grouped.

julia> unique(df_2021.Regional_indicator)
10-element Vector{String}:
 "Western Europe"
 "North America and ANZ"
 "Middle East and North Africa"
 "Latin America and Caribbean"
 "Central and Eastern Europe"
 "East Asia"
 "Southeast Asia"
 "Commonwealth of Independent States"
 "Sub-Saharan Africa"
 "South Asia"

Let's do a simple operation with the dataframe: counting the number of countries per regional indicator and sorting the result.

sort(
    combine(groupby(df_2021, :Regional_indicator), nrow), 
    :nrow
)

Getting this output

julia> sort(
           combine(groupby(df_2021, :Regional_indicator), nrow),
           :nrow
       )
10×2 DataFrame
 Row │ Regional_indicator                 nrow
     │ String                             Int64
─────┼──────────────────────────────────────────
   1 │ North America and ANZ                  4
   2 │ East Asia                              6
   3 │ South Asia                             7
   4 │ Southeast Asia                         9
   5 │ Commonwealth of Independent Stat…     12
   6 │ Middle East and North Africa          17
   7 │ Central and Eastern Europe            17
   8 │ Latin America and Caribbean           20
   9 │ Western Europe                        21
  10 │ Sub-Saharan Africa                    36

With this, we can see that Sub-Saharan Africa has the largest number of countries (36), while North America and ANZ is the smallest group (4).
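The same split-apply-combine pattern can be tried on a toy table. This is a minimal sketch, assuming DataFrames.jl is installed; the `toy` table, its `region` values and the `count` column name are made up for illustration and are not part of the happiness dataset.

```julia
using DataFrames

# Hypothetical toy data, not taken from the happiness dataset
toy = DataFrame(country = ["A", "B", "C", "D", "E"],
                region  = ["North", "South", "North", "North", "South"])

# Count rows per region, name the result column :count, and sort ascending
counts = sort(combine(groupby(toy, :region), nrow => :count), :count)
```

Passing `nrow => :count` only renames the output column; plain `nrow`, as used in the post, names it `nrow`.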

Now, let's try to slice our data. We will create a data frame called float_df that contains only the Float64 variables but excludes the "Explained_by_" variables. This new dataframe will help us with some operations later.

# Get all Float64 columns
float_df = select(df_2021, findall(col -> eltype(col) <: Float64, eachcol(df_2021)))
# Take away the Explained variables
float_df = float_df[:, Not(names(select(float_df, r"Explained")))]
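To see what this kind of column selection does, here is a small sketch on a made-up table, assuming DataFrames.jl is installed. The `toy` table and its column names are hypothetical. It uses `names(df, Float64)` and `Not(r"...")`, which should be equivalent to the `findall` approach above for this purpose.

```julia
using DataFrames

# Hypothetical toy table with mixed column types
toy = DataFrame(name = ["A", "B"],
                score = [7.1, 3.2],
                Explained_by_x = [0.5, 0.1],
                rank = [1, 2])

# names(df, T) returns the names of columns whose element type is a subtype of T
float_cols = names(toy, Float64)

# Keep the Float64 columns, then drop those whose name matches the regex
toy_float = select(toy, float_cols)
toy_raw = select(toy_float, Not(r"Explained"))
```

Here `float_cols` holds the two Float64 columns, and `toy_raw` keeps only `score`.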

Let's make our first plot.

scatter(
    df_2021.Social_support,
    df_2021.Ladder_score,
    size = (1000,800),
    label="country",
    xaxis = "Social Support",
    yaxis = "Ladder Score",
    title = "Relation between Social Support and Happiness Index Score by country"
)

![scatterplot with ladder score and social support](/assets/005_happines/scatterplot.png)

If we want a view of all the float variables as separate histograms, we can use this code with StatsPlots.

N = ncol(float_df)
numerical_cols = Symbol.(names(float_df,Real))
@df float_df Plots.histogram(cols();
                             layout=N,
                             size=(1400,800),
                             title=permutedims(numerical_cols),
                             label = false)

Histogram of all variables

And if we want to compare them with boxplots:

@df float_df boxplot(cols(), 
                     fillalpha=0.75, 
                     linewidth=2,
                     title = "Comparing distribution for all variables in dataset",
                     legend = :topleft)

Boxplot all variables

Without going into too much detail, the Ladder Score is the variable that captures the survey result on the degree of happiness in each country (our dependent variable). The Explained variables correspond to the preprocessing used to build the Ladder Score; for this reason, we removed them from the dataframe and keep only the raw data.

What are the top 5 countries and bottom 5?

# Top 5 and bottom 5 countries by ladder score
sort!(df_2021, :Ladder_score, rev=true)
plot(
    bar(
        first(df_2021.Country_name, 5 ),
        first(df_2021.Ladder_score, 5 ),
        color= "green",
        title = "Top 5 countries by Happiness score",
        legend = false,
    ),
    bar(
        last(df_2021.Country_name, 5 ),
        last(df_2021.Ladder_score, 5 ),
        color ="red",
        title = "Bottom 5 countries by Happiness score",
        legend = false,
    ),
    size = (1000,800),
    yaxis = "Happiness Score",
)

top5 and bottom 5

And the classic heatmap for correlation with the following function.

function heatmap_cor(df)
    cm = cor(Matrix(df))
    cols = Symbol.(names(df))
    (n,m) = size(cm)
    display(
    heatmap(cm, 
        fc = cgrad([:white,:dodgerblue4]),
        xticks = (1:m,cols),
        xrot= 90,
        size= (800, 800),
        yticks = (1:m,cols),
        yflip=true))
    display(
    annotate!([(j, i, text(round(cm[i,j],digits=3),
                       8,"Computer Modern",:black))
           for i in 1:n for j in 1:m])
    )
end

heatmap
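The heatmap is drawn from the matrix that `cor` returns, so it can help to look at that matrix on its own first. This is a minimal sketch using only the Statistics standard library; the matrix `X` is made-up data, not the happiness dataset.

```julia
using Statistics

# Hypothetical data: 5 observations (rows) of 3 variables (columns)
X = [1.0 2.0 9.0;
     2.0 4.1 7.5;
     3.0 5.9 6.0;
     4.0 8.2 3.5;
     5.0 9.8 2.0]

# cor on a matrix returns the pairwise Pearson correlations between columns
cm = cor(X)
n, m = size(cm)

# Rounded values, as used for the annotations in the heatmap function
labels = round.(cm, digits = 3)
```

The result is a square, symmetric matrix with ones on the diagonal, which is why the function can loop over `1:n` and `1:m` to annotate every cell.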

And now, we can build a function where we can get the mean ladder score by regional indicator and compare it with the distribution of all countries.

function distribution_plot(df)
    display(
        @df df density(:Ladder_score,
        legend = :topleft, size=(1000,800) , 
        fill=(0, .3,:yellow),
        label="Distribution" ,
        xaxis="Happiness Index Score", 
        yaxis ="Density", 
        title ="Comparison Happiness Index Score by Region 2021") 
    )
    display(
        plot!([mean(df.Ladder_score)],
        seriestype="vline",
        line = (:dash), 
        lw = 3,
        label="Mean")
    )
    for element in unique(df.Regional_indicator)
        display(
            plot!(
            [mean(filter(row -> row["Regional_indicator"] == element, df).Ladder_score)],
            seriestype="vline",
            lw = 3,
            label="$element") 
        )
    end
end

distribution region

Suppose we want to try the same idea but with countries. In that case, we can take advantage of multiple dispatch and create a function that receives a list of countries and creates a variation of the distribution with countries.

function distribution_plot(df, var_filter, list_elements)
    display(
        @df df density(:Ladder_score,
        legend = :topleft, size=(1000,800) , 
        fill=(0, .3,:yellow),
        label="Distribution" ,
        xaxis="Happiness Index Score", 
        yaxis ="Density", 
        title ="Happiness index score compare by countries 2021") 
    )
    display(
        plot!([mean(df.Ladder_score)],
        seriestype="vline",
        line = (:dash), 
        lw = 3,
        label="Mean")
    )
    for element in list_elements
        display(
            plot!(
            [mean(filter(row -> row[var_filter] == element, df).Ladder_score)],
            seriestype="vline",
            lw = 3,
            label="$element") 
        )
    end
end

Let's test our new function, comparing three countries.

distribution_plot(df_2021, "Country_name", ["Chile",
                                            "United States",
                                            "Japan",
                                           ])

distribution countries

Here we can see how the USA has the highest score, followed by Chile and Japan.
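The two `distribution_plot` definitions coexist as separate methods of one function, and Julia chooses between them from the arguments passed. Here is a minimal sketch of the same idea without any plotting; the `summarize` function and the `scores` values are hypothetical.

```julia
using Statistics

# Hypothetical helper with two methods; Julia picks one from the arguments passed
summarize(scores) = mean(scores)                                 # one argument: mean of all values
summarize(scores, k) = mean(first(sort(scores, rev = true), k))  # two arguments: mean of the top k

scores = [7.8, 6.9, 6.2, 5.9, 2.5]
overall = summarize(scores)
top2 = summarize(scores, 2)
```

Calling `methods(summarize)` lists both definitions, just as `methods(distribution_plot)` would in the post.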

To end the first part, let's apply some statistical tests. We will use an equal-variance t-test to compare the distributions of different regions. The function is as follows.

# Perform a simple test to compare distributions
# This function performs a two-sample t-test of the null hypothesis that s1 and s2 
# come from distributions with equal means and variances 
# against the alternative hypothesis that the distributions have different means 
# but equal variances.
function t_test_sample(df, var, x , y)
    x = filter(row ->row[var] == x, df).Ladder_score
    y = filter(row ->row[var] == y, df).Ladder_score
    EqualVarianceTTest(vec(x), vec(y))
end

We will have this output if we compare Western Europe and North America and ANZ.

t_test_sample(df_2021, "Regional_indicator", "Western Europe", "North America and ANZ")
julia> t_test_sample(df_2021, "Regional_indicator", "Western Europe", "North America and ANZ")
Two sample t-test (equal variance)
----------------------------------
Population details:
    parameter of interest:   Mean difference
    value under h_0:         0
    point estimate:          -0.213595
    95% confidence interval: (-0.9068, 0.4796)

Test summary:
    outcome with 95% confidence: fail to reject h_0
    two-sided p-value:           0.5301

Details:
    number of observations:   [21,4]
    t-statistic:              -0.6374218416101513
    degrees of freedom:       23
    empirical standard error: 0.3350924366753546

We don't have enough evidence to reject the hypothesis that these samples come from distributions with equal means and variances. On the other hand, if we compare South Asia with Western Europe, we see this:

julia> t_test_sample(df_2021, "Regional_indicator", "South Asia", "Western Europe")
Two sample t-test (equal variance)
----------------------------------
Population details:
    parameter of interest:   Mean difference
    value under h_0:         0
    point estimate:          -2.47305
    95% confidence interval: (-3.144, -1.802)

Test summary:
    outcome with 95% confidence: reject h_0
    two-sided p-value:           <1e-07

Details:
    number of observations:   [7,21]
    t-statistic:              -7.576776118465833
    degrees of freedom:       26
    empirical standard error: 0.32639840222022687

In this case, we can reject that hypothesis.
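For readers who want to see where the t-statistic and the degrees of freedom come from, the textbook pooled-variance formula can be sketched with the Statistics standard library alone. This is an illustration under that assumption, not the HypothesisTests.jl implementation; the function name `pooled_t` and the sample values are hypothetical.

```julia
using Statistics

# Pooled (equal-variance) two-sample t statistic and its degrees of freedom
function pooled_t(x, y)
    nx, ny = length(x), length(y)
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2)
    se = sqrt(sp2 * (1 / nx + 1 / ny))   # standard error of the mean difference
    t = (mean(x) - mean(y)) / se
    return (t = t, df = nx + ny - 2, se = se)
end

# Hypothetical samples, not the real regional scores
a = [7.1, 6.8, 7.4, 6.9]
b = [4.2, 3.9, 4.8]
res = pooled_t(a, b)
```

The degrees of freedom are the two sample sizes minus two, which matches the outputs above: 21 + 4 - 2 = 23 and 7 + 21 - 2 = 26.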

Clustering

Now we will cluster the countries using the popular k-means algorithm. My first option was to use Clustering.jl. However, to determine the ideal number of clusters, I needed the WCSS (within-cluster sum of squares) so that I could evaluate it with the elbow method, so I used the ScikitLearn.jl wrapper instead. I also include an issue about this. Let's continue with the last part. I started by adding some libraries.

using Random
using ScikitLearn
using PyCall

@sk_import preprocessing: StandardScaler
@sk_import cluster: KMeans

Let's take out from the float_df all the variables related to Ladder_score, and keep only the variables considered in the survey.

select!(float_df, Not([:Standard_error_of_ladder_score, 
                           :Ladder_score, 
                           :Ladder_score_in_Dystopia, 
                           :Dystopia_residual]))

To train our model, we need to standardize the data, and then we will create a list to store the WCSS for each number of clusters. The function is as follows:

function kmeans_train(df)
    X = fit_transform!(StandardScaler(), Matrix(df))
    wcss = []
    for n in 1:10
        Random.seed!(123)
        cluster = KMeans(n_clusters = n,
                         init = "k-means++",
                         max_iter = 20,
                         n_init = 10,
                         random_state = 0)
        cluster.fit(X)
        push!(wcss, cluster.inertia_)
    end
    return wcss
end
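The standardization step can be sketched with the Statistics standard library to show what the scaler does to each column. This is an illustration rather than the scikit-learn implementation; it assumes the population standard deviation (`corrected = false`), which is my understanding of what `StandardScaler` uses. The function name `standardize` and the matrix `X` are hypothetical.

```julia
using Statistics

# Column-wise standardization: zero mean and unit variance per column.
# corrected = false gives the population standard deviation.
function standardize(X)
    mu = mean(X, dims = 1)
    sigma = std(X, dims = 1, corrected = false)
    return (X .- mu) ./ sigma
end

# Hypothetical 4×2 matrix whose columns are on very different scales
X = [1.0 10.0; 2.0 20.0; 3.0 30.0; 4.0 40.0]
Z = standardize(X)
```

After this step both columns have the same scale, so neither dominates the Euclidean distances that k-means relies on.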

Let's invoke the function and plot the wcss.

wcss = kmeans_train(float_df)

plot(wcss, title = "wcss in each cluster",
     xaxis = "cluster",
     yaxis = "Wcss")

Elbow Method
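The quantity on the y-axis of the elbow plot can be computed by hand for a given assignment of points to clusters. Here is a minimal sketch using only the Statistics standard library; the function name `within_cluster_ss`, the points and the labels are hypothetical, and this is the textbook definition rather than the scikit-learn code behind `inertia_`.

```julia
using Statistics

# Within-cluster sum of squares: for each cluster, sum the squared Euclidean
# distances from its points (rows of X) to the cluster centroid.
function within_cluster_ss(X, labels)
    total = 0.0
    for k in unique(labels)
        members = X[labels .== k, :]
        centroid = mean(members, dims = 1)
        total += sum((members .- centroid) .^ 2)
    end
    return total
end

# Hypothetical points forming two tight groups
X = [0.0 0.0; 0.0 1.0; 10.0 10.0; 10.0 11.0]
one_cluster = within_cluster_ss(X, [1, 1, 1, 1])
two_clusters = within_cluster_ss(X, [1, 1, 2, 2])
```

With these points the value drops from 201.0 for a single cluster to 1.0 for two clusters. The elbow method looks for this kind of sharp drop followed by a flattening curve.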

In this case, I decided to go for three clusters. We can make use of multiple dispatch again, adding a second argument n for a defined number of clusters.

function kmeans_train(df, n)
    X = fit_transform!(StandardScaler(), Matrix(df))
    Random.seed!(123)
    cluster = KMeans(n_clusters = n,
                     init = "k-means++",
                     max_iter = 20,
                     n_init = 10,
                     random_state = 0)
    cluster.fit(X)
    return cluster
end

cluster = kmeans_train(float_df, 3)

If we take the first plot we did at the beginning of the post, but now we add the cluster labels, we have this plot.

scatter(df_2021.Social_support,
        df_2021.Ladder_score,
        marker_z = cluster.labels_,
        legend = false,
        size = (1000,800),
        xaxis = "Social Support",
        yaxis = "Ladder Score",
        title = "Comparison between social support and ladder score by country incorporating clustering")

Scatter with cluster

With these clusters, we have one group of developed countries with the highest happiness index scores (for example, Finland, Australia and Germany), followed by a group of emerging countries, and finally a group of countries that still have a significant debt to the well-being of their population.

# Assumption: df is a copy of df_2021 with 1-based cluster labels (scikit-learn labels start at 0)
df = copy(df_2021)
df.cluster = cluster.labels_ .+ 1
histogram(filter(row -> row.cluster == 1, df).Ladder_score, label = "cluster 1", title = "Distribution of Happiness Score by Cluster", xaxis = "Ladder Score", yaxis = "n° countries")
histogram!(filter(row -> row.cluster == 2, df).Ladder_score, label = "cluster 2")
histogram!(filter(row -> row.cluster == 3, df).Ladder_score, label = "cluster 3")

histogram happiness cluster

Finally, we can compare how the clusters differ across all the variables.

@df float_df Plots.density(cols();
                             layout=N,
                             size=(1600,1200),
                             title=permutedims(numerical_cols),
                             group = df.cluster,
                             label = false)

Distribution by variables with cluster

Conclusions

From my experience using Python for about two years in data analysis and recently dabbling with Julia, I can say that the ecosystem generally seems quite mature for this purpose. I had some questions that the community immediately answered on Julia Discourse. More content like this is needed so that the data science community can more widely adopt this technology.

JuliaCon 2024 Announcement | JuliaHub

By: JuliaHub

Re-posted from: https://info.juliahub.com/blog/newsletter-november-2023-juliacon-2024-eindhoven-netherlands-july-9-13

JuliaCon 2024: JuliaCon 2024 will take place in Eindhoven, Netherlands July 9-13. Workshops will take place on Tuesday July 9, presentations will take place Wednesday July 10-Friday July 12 and there will be a hackathon on Saturday July 13. Stay tuned for the call for proposals and other information.

JuliaCon Local Eindhoven 2023: Eindhoven will also host the first ever JuliaCon local event on December 1, 2023. Click here for tickets and more information.

JuliaHub Welcomes Brad Carman as Director of Consulting Services: JuliaHub is pleased to welcome Brad Carman as Director of Consulting Services. Brad has over two decades of experience in model-based innovation and design. Please click here for more.

JuliaHub Consulting Services: Would your organization benefit from a 100x increase in simulation speeds? JuliaSim might be the solution you need. Click here for more information about JuliaSim, and to contact us to learn how JuliaSim can help your business succeed.

Free Upcoming Webinars from JuliaHub: JuliaHub provides free Webinars covering a range of Julia topics. The Webinars are free but advance registration is required and space is limited. Please click the links below to register.

| Webinar | Presenter | Date |
| --- | --- | --- |
| Acausal Modeling for Nonlinear Control and Analysis | Dr. Fredrik Bagge Carlson, JuliaHub Senior Software Engineer | Tue Nov 21, 1-2 pm Eastern (US) |
| Accelerating Simulations Using JuliaSimCompiler | Yingbo Ma, JuliaHub Engineering Team Lead | Wed Nov 29, 2-3 pm Eastern (US) |
| Ingesting and Deploying Functional Mockup Units in JuliaSim | Dr. Ranjan Anantharaman, JuliaHub Sales Engineer | Tue Dec 5, 9:30-10:30 am Eastern (US) |
| Introduction to ModelingToolkit for Industrial Modelers: A Hands-On Training | Dr. Michael Tiller, Senior Director of JuliaSim Product Management and Brad Carman, JuliaHub Director of Consulting | Thu Dec 7, 1-2:30 pm Eastern (US) |

Free JuliaHub Webinar Archive: JuliaHub provides free access to more than 70 of our past Webinars. Recent Webinars include:

JuliaHub Policies and Private Registry Solutions: JuliaHub Policies and Private Registry Solutions is a new blog post from Bill Burdick (JuliaHub Senior Software Developer) and Deep Datta (JuliaHub Product Director). It describes new JuliaHub features including Package Analytics and Package Policies.

JuliaHub at American Conference on Pharmacometrics (ACoP): PumasAI, a JuliaHub partner, presented two workshops using JuliaHub at the American Conference on Pharmacometrics (ACoP) this month in National Harbor, MD. Click here to learn more: ACoP14 Showcases Pharmacometric Tools on JuliaHub in November Workshops.

CUDA.jl 5.1: CUDA.jl 5.1 – Unified Memory and Cooperative Groups is a new blog post in which Dr. Tim Besard (JuliaHub Software Engineer) explains new features and benefits available using CUDA.jl 5.1. Click here to learn more.

New Podcast Episode with Dr. Chris Rackauckas, JuliaHub VP Modeling and Simulation: Dr. Chris Rackauckas, JuliaHub VP Modeling and Simulation, discusses Computational Chemistry with Catalyst in a 30 minute podcast episode. “Chris break[s] down the ambitions and insights of his paper, and spill[s] the beans on how Catalyst is shaking up the status quo in chemical modeling…. Chris guide[s] us through Catalyst’s synergy with other Julia packages, crafting a comprehensive toolkit for researchers. And because we all love a good teaser, Chris share[s] a glimpse into the horizon for Catalyst’s evolution.”

Generalizing Scientific Machine Learning and Differentiable Simulation Beyond Continuous Models: Dr. Chris Rackauckas, JuliaHub VP Modeling and Simulation, presented Generalizing Scientific Machine Learning and Differentiable Simulation Beyond Continuous Models. This one-hour seminar is available for free online. This seminar is part of a data-driven physical simulation series at Lawrence Livermore National Laboratory.

JuliaHub v6.3.0: JuliaHub v6.3.0 is now available with a number of new features. Release notes are available here and below.

  • v6.3.0 adds the capability to build a sysimage to go along with your job run. The sysimage is built before the job starts. After the sysimage build completes, the sysimage is mounted to every Julia process the job utilizes (main and workers). JuliaHub users can choose to create a SYSIMG by checking the “Build SYSIMG” checkbox during job submission. More information on what a SYSIMG is can be found at the following link: https://julialang.github.io/PackageCompiler.jl/dev/sysimages.html#sysimages
  • We now have simple caching based on the pre-built SYSIMGs. This feature ensures that additional runs of a job with the same manifest will reuse the already built sysimage.
  • We’re excited to introduce a fresh and modern user interface for JuliaHub. This update brings a host of usability improvements and a cleaner design, making user interactions smoother and ensuring faster loading. UI improvements include a major overhaul of the Notifications, Registrator and Projects features.
  • We have added a new grouping for shared datasets. Using these groupings, end-users can easily distinguish between shared datasets based on who shared them, without going into details.
  • Users can now create folders or directories in the File explorer UI. This will help users organize their files efficiently.
  • We now have a new search filter for dependencies & dependents in the packages UI. Using this feature, end-users can search for direct and indirect dependencies & dependents of a particular package.
  • You can now remove a project viewer’s access by setting the resource’s general access level to “No Access”

Enterprise

  • Job time limits can now be made optional by an admin on enterprise installs. If this option is enabled, end-users on the JuliaHub instance can start jobs with no time limits.

Applications

  • Julia version has been updated to v1.9.2 in Julia IDE and batch jobs
  • We have added the following R packages to the WindowsWorkstation app: ggplot, ggPMX, xpose, xpose4 and vpc
  • End-users can now access JuliaHub Datasets through RStudio app
  • R package “units” is now part of the RStudio app
  • End-users can now install TinyTex packages in RStudio server. All newly installed TinyTex packages automatically go to persistent storage, so they can be loaded across sessions without installing them again.

JuliaSim v0.30.0: JuliaSim v0.30.0 is now available with new capabilities and features. Release notes are available here and below. If interested in running JuliaSim on juliahub.com or locally, please do contact us.

Features

  • Accept environment path for info
  • Drop PDESurrogates from JuliaSim
  • Upgrade to Julia v1.9.3
  • When creating new Pluto notebooks on JuliaHub with JuliaSim, the cells necessary to make a notebook work with JuliaSim are automatically added to the notebook

JuliaSimBatteries

  • Add support for time- and state-varying experimental control inputs
  • Reduce the package compilation times
  • Improve documentation landing page and model comparisons.

JuliaSimControl

  • Improve fixed step integrator providing additional features
  • Improve documentation structure, including video tutorials
  • Reduce complexity of various APIs

JuliaSimModelOptimizer

  • Add support for multiple models in the same InverseProblem
  • Add support for Prediction Error Method which is very useful for unstable systems
  • Improvements in performance, stability and correctness of Collocation Methods
  • Overall stability in the API for generic use cases

Julia for Differential Equations Using Graphics Processing Units (GPUs): Automated Translation and Accelerated Solving of Differential Equations on Multiple GPU Platforms is a new paper co-authored by JuliaHub’s Yingbo Ma (Modeling & Numerics Team Lead), Tim Besard (Software Engineer), Alan Edelman (Co-Founder and Chief Scientist) and Chris Rackauckas (VP Modeling & Simulation). Click here for more.

Julia for Solving Inverse Problems: A Tutorial on the Bayesian Statistical Approach to Inverse Problems is a new paper from 3 Oregon State researchers using Turing.jl’s No-U-Turn Sampler (NUTS), a Markov Chain Monte Carlo (MCMC) algorithm to solve inverse problems.

Julia for Fish: A Bayesian Inverse Approach to Identify and Quantify Organisms from Fisheries Acoustic Data is a new paper in the ICES Journal of Marine Science. Researchers from the National Oceanic and Atmospheric Administration (NOAA) Alaska Fisheries Science Center and the University of Washington Applied Physics Lab use sonar with ProbabilisticEchoInversion.jl to identify and count fish. According to co-author Sam Urmy: “This work was inspired by Celeste.jl, and would not be possible without Julia’s automatic differentiation capabilities and the Turing.jl ecosystem.”

Julia for Epidemics and Spin Dynamics: Matrix Product Belief Propagation for Reweighted Stochastic Dynamics Over Graphs is a new paper in the Proceedings of the National Academy of Sciences using Julia for two applications: “inference of infection probabilities from sparse observations within the SIRS epidemic model and the computation of both typical observables and large deviations of several kinetic Ising models.” Click here for the paper and source code.

Julia for Net-Zero Hydrogen Production: A Cost Comparison of Various Hourly-Reliable and Net-Zero Hydrogen Production Pathways in the United States is a new paper in Nature Communications. The authors “build a model that enables direct comparison of the cost of producing net-zero, hourly-reliable hydrogen from various pathways … For the United States (California, Texas, and New York), model results indicate next-decade hybrid electricity-based solutions are lower cost ($2.02-$2.88/kg) than fossil-based pathways with natural gas leakage greater than 4% ($2.73-$5.94/kg). These results also apply to regions outside of the U.S. with a similar climate and electric grid.” Furthermore, the authors “use the Gurobi Linear Optimizer in the Julia mathematical programming tool (JuMP) to size system components such that we minimize the LCOH from electricity-based pathways.”

Why Jahan.ai Uses Julia: Why We Use Julia in Our AI Startup is a new blog post from Jahan.ai, an AI startup. According to Jahan.ai, “Handling enormous data for retail giants, including forecasts for over 20 million series in near real-time, demands a robust, efficient, and cost-effective language. Julia fits this bill perfectly with its minimal latency and high-performance capabilities. It allows us to scale effortlessly, processing vast datasets quickly thanks to efficient memory usage and parallel processing. This speed is crucial in an industry where trends shift rapidly. Julia’s just-in-time (JIT) compilation and performance, rivaling that of C, enables us to update forecasts multiple times a day. Its unique blend of speed and interpretability, without sacrificing power, helps our clients to react proactively to the ever-changing market demands.”

Oak Ridge National Laboratory (ORNL) Researchers Win SC23 Best Paper Award Using Julia for Exascale Supercomputing: Julia as a Unifying End-to-End Workflow Language on the Frontier Exascale System is a new paper from Oak Ridge National Laboratory (ORNL) scientists using Frontier, the first exascale supercomputer. The authors received the Best Paper award at the SC23 WORKS Workshop. Click here to learn more.

Great Lakes Consulting: Great Lakes Consulting (GLC) is a JuliaHub partner. They use Julia to help solve their customers’ problems. Their Julia blog describes a number of Julia features and capabilities.

Free Compute on JuliaHub (20 hours): In addition to the features JuliaHub has always offered for free – Julia ecosystem search, package registration tools, a dedicated package server – the platform now also gives every user 20 hours of free compute. This allows people to seamlessly share Pluto notebooks and IDE projects with others and let them get their feet wet with computing without having to open up their wallets. Click here to get started or check out Deep Datta’s introductory video, “JuliaHub Is a Free Platform to Start Your Technical Computing Journey”, where he explains how and why to start using JuliaHub for cloud computing.

Converting from Proprietary Software to Julia: Are you looking to leverage Julia’s superior speed and ease of use, but limited due to legacy software and code? JuliaHub and our partners can help accelerate replacing your existing proprietary applications, improve performance, reduce development time, augment or replace existing systems and provide an extended trusted team to deliver Julia solutions. Leverage experienced resources from JuliaHub and our partners to get your team up and running quickly. For more information, please contact us.

Careers at JuliaHub: JuliaHub is a fast-growing tech company with fully remote employees in 20 countries on 6 continents. Click here to learn more about exciting careers and internships with JuliaHub.

Julia and JuliaHub in the News

  • Automation: The Last Word – The Evolving Language of Automation Engineering
  • AIP Publishing: A Tutorial on the Bayesian Statistical Approach to Inverse Problems
  • Onrec: Top 10 AI Skills You Need to Land Your Dream Job in 2024
  • I-Programmer: Practical Julia: A Hands-On Introduction for Scientific Minds (No Starch)
  • Finextra: From RAG to Riches in a GenAI World: Some Jargon Explainers & Current Trends
  • Jax: Programming with Chapel: Making the Power of Parallelism and Supercomputers More Accessible
  • The Global Recruiter: AI Skills Required
  • Proceedings of the National Academy of Sciences (PNAS): Matrix Product Belief Propagation for Reweighted Stochastic Dynamics Over Graphs
  • ICES Journal of Marine Science: A Bayesian Inverse Approach to Identify and Quantify Organisms from Fisheries Acoustic Data
  • HGPU: Julia as a Unifying End-to-End Workflow Language on the Frontier Exascale System
  • Nature Communications: A Cost Comparison of Various Hourly-Reliable and Net-Zero Hydrogen Production Pathways in the United States
  • Science Direct: Automated Translation and Accelerated Solving of Differential Equations on Multiple GPU Platforms

Julia Blog Posts

Upcoming Julia and JuliaHub Events

Recent Julia and JuliaHub Events

Contact Us: Please contact us if you want to:

  • Learn more about JuliaHub, JuliaSim, Pumas, PumasQSP or CedarEDA
  • Obtain pricing for Julia consulting projects for your organization
  • Schedule Julia training for your organization
  • Share information about exciting new Julia case studies or use cases
  • Spread the word about an upcoming online or offline event involving Julia
  • Partner with JuliaHub to organize a Julia event online or offline
  • Submit a Julia internship, fellowship or job posting

About JuliaHub and Julia

JuliaHub is a fast and easy-to-use code-to-cloud platform that accelerates the development and deployment of Julia programs. JuliaHub users include some of the most innovative companies in a range of industries including pharmaceuticals, automotive, energy, manufacturing, and semiconductor design and manufacture.

Julia is a high performance open source programming language that powers computationally demanding applications in modeling and simulation, drug development, design of multi-physical systems, electronic design automation, big data analytics, scientific machine learning and artificial intelligence. Julia solves the two language problem by combining the ease of use of Python and R with the speed of C++. Julia provides parallel computing capabilities out of the box and unlimited scalability with minimal effort. Julia has been downloaded by users at more than 10,000 companies and is used at more than 1,500 universities. Julia co-creators are the winners of the prestigious James H. Wilkinson Prize for Numerical Software and the Sidney Fernbach Award.