Author Archives: Julia Computing, Inc.

Newsletter – January 2018

By: Julia Computing, Inc.

Re-posted from: http://juliacomputing.com/blog/2018/01/04/january-newsletter.html

We hope you had a wonderful holiday season, and we wish you a productive and prosperous 2018!

  1. Julia 2017 Growth Statistics
  2. Number of News Mentions of Julia or Julia Computing
  3. Julia Computing on DM Radio
  4. Distributed Computing Available on JuliaBox
  5. JuliaCon 2018 Call for Corporate Sponsors
  6. Julia Computing Co-Founder Alan Edelman Selected 2018 IEEE Fellow
  7. Julia on Coursera
  8. Learn Julia
  9. Julia Blog Posts
  10. Julia and Julia Computing in the News
  11. Upcoming Events Featuring Julia
  12. Recent Events Featuring Julia
  13. Julia Jobs and Internships
  14. Contact Us


1. Julia 2017 Growth Statistics

2. Number of News Mentions of Julia or Julia Computing

The number of news articles mentioning Julia or Julia Computing increased dramatically in 2016 and in 2017.

3. Julia Computing on DM Radio

Alan Edelman, co-founder of Julia Computing and MIT Professor of Applied Mathematics, participated in DM Radio, a broadcast radio show and podcast hosted by Eric Kavanagh. Click here to listen to the show.

4. Distributed Computing Available on JuliaBox

JuliaRun facilitates running distributed computing jobs in Julia across a cluster of computers. Try it out using our hosted service at https://www.juliabox.com. Free tier users have access to a 3-CPU cluster, which is enough to run one master and two child processes. Please write to us at juliabox@juliacomputing.com if you require a larger allocation for a temporary trial. For more information, log into JuliaBox to access the JuliaBox documentation and the JuliaBox tutorial.

5. JuliaCon 2018 Call for Corporate Sponsors

JuliaCon 2018 has corporate sponsorship opportunities available. JuliaCon 2018 will be held Aug 7-11, 2018 at University College London.

6. Julia Computing Co-Founder Alan Edelman Selected 2018 IEEE Fellow

Alan Edelman, co-founder of Julia Computing and MIT Professor of Applied Mathematics, was selected as a 2018 Fellow by the Institute of Electrical and Electronics Engineers (IEEE) for his “extraordinary accomplishments” and “contributions to the development of technical computing languages, namely the Julia language for numerical and scientific computing.”

7. Julia on Coursera

The University of Cape Town has created a four-module introductory Julia course. Julia Scientific Programming is taught by Dr. Juan Klopper, Department of Surgery, and Dr. Henri Laurie, Department of Mathematics and Applied Mathematics.

8. Learn Julia

Jane Herriman, Julia Computing’s Director of Diversity and Outreach, hosted a free online introductory Julia tutorial on YouTube on December 19, to be repeated monthly. Julia Computing will also be hosting tutorials with deeper dives into Julia. You can subscribe to our YouTube channel by clicking here. If you are an author or experienced user of a Julia package and would like to help run a tutorial on that package, please let us know. We would love to partner with you and support you in creating teaching materials and in running the event.

9. Julia Blog Posts

Do you want to share photos, videos or details of your most recent conference, meetup, training, hackathon, talk, presentation or workshop involving Julia? Please send us an email with details and links.

10. Julia and Julia Computing in the News

11. Upcoming Events Featuring Julia

Do you know of any upcoming conferences, meetups, trainings, hackathons, talks, presentations or workshops involving Julia? Would you like to organize a Julia event on your own, or in partnership with your company, university or other organization? Let us help you spread the word and support your event by sending us an email with details. Here are some upcoming events:

12. Recent Events Featuring Julia

Do you want to share photos, videos or details of your most recent conference, meetup, training, hackathon, talk, presentation or workshop involving Julia? Please send us an email with details and links.

13. Julia Jobs and Internships

Do you work at or know of an institution looking to hire Julia programmers as staff, research fellows or interns? Would your employer be interested in hiring interns to work on open source packages that are useful to their business? Help us connect members of our community to great opportunities by sending us an email, and we’ll get the word out!

There are more than 200 Julia jobs currently listed on Indeed.com, including jobs at Google, Facebook, IBM, KPMG, Ernst & Young, Booz Allen Hamilton, Comcast, Zulily, National Renewable Energy Laboratory, Los Alamos National Laboratory, Brown, Princeton, Columbia, Notre Dame, MIT, University of Chicago and many more.

14. Contact Us

Please contact us if you wish to:

  • Purchase or obtain license information for Julia products such as JuliaPro, JuliaPro Enterprise, JuliaRun, JuliaDB, JuliaFin or JuliaBox
  • Obtain pricing for Julia consulting projects for your organization
  • Schedule Julia training for your organization
  • Share information about exciting new Julia case studies or use cases
  • Spread the word about an upcoming conference, workshop, training, hackathon, meetup, talk or presentation involving Julia
  • Partner with Julia Computing to organize a Julia meetup, conference, workshop, training, hackathon, talk or presentation involving Julia
  • Submit a Julia internship or job posting

About Julia and Julia Computing

Julia is the fastest high-performance open source computing language for data, analytics, algorithmic trading, machine learning, artificial intelligence, and many other domains. Julia solves the two-language problem by combining the ease of use of Python and R with the speed of C++. Julia provides parallel computing capabilities out of the box and unlimited scalability with minimal effort. For example, Julia has run at petascale on 650,000 cores with 1.3 million threads to analyze over 56 terabytes of data using Cori, the world’s sixth-largest supercomputer. With more than 1.8 million downloads and 101% annual growth, Julia is one of the top programming languages developed on GitHub. Julia adoption is growing rapidly in finance, insurance, machine learning, energy, robotics, genomics, aerospace, medicine and many other fields.

Julia Computing was founded in 2015 by all the creators of Julia to develop products and provide professional services to businesses and researchers using Julia. Julia Computing offers the following products:

  • JuliaPro for data science professionals and researchers to install and run Julia with more than one hundred carefully curated popular Julia packages on a laptop or desktop computer.
  • JuliaRun for deploying Julia at scale on dozens, hundreds or thousands of nodes in the public or private cloud, including AWS and Microsoft Azure.
  • JuliaFin for financial modeling, algorithmic trading and risk analysis, including Bloomberg and Excel integration, Miletus for designing and executing trading strategies, and advanced time-series analytics.
  • JuliaDB for in-database in-memory analytics and advanced time-series analysis.
  • JuliaBox for students or new Julia users to experience Julia in a Jupyter notebook right from a Web browser with no download or installation required.

To learn more about how Julia users deploy these products to solve problems using Julia, please visit the Case Studies section on the Julia Computing Website.

Julia users, partners and employers hiring Julia programmers in 2018 include Amazon, Apple, BlackRock, Booz Allen Hamilton, Capital One, Comcast, Disney, Ernst & Young, Facebook, Ford, Google, IBM, Intel, KPMG, Microsoft, NASA, Oracle, PwC, Uber, and many more.

Strategies Learned at the DISC Unconference for Measuring Diversity in the Julia Community

By: Julia Computing, Inc.

Re-posted from: http://juliacomputing.com/blog/2017/12/21/DISC-unconference.html

This November, NumFOCUS hosted the Diversity and Inclusion in Scientific Computing (DISC) Unconference alongside the 2017 NYC PyData Conference.

The Unconference was a fantastic learning opportunity and a great place to meet individuals whose interests in diversity align well with ours at Julia Computing. The unconference attendees proved to be a valuable source of ideas for our upcoming diversity initiatives since many of them have already encountered and considered some of the challenges we face.


Unconference participants sort their ideas to choose topics to cover during the event.

One of the topics covered at the unconference that was particularly helpful for me was about metrics to quantify diversity. Julia Computing and the broader Julia community want to make the Julia user community more diverse, but there are many ways we might attempt to do this. How can we determine what works and what does not? How can we know if our efforts are paying off?

One way to evaluate our success might be to use the results of surveys given to attendees at Julia events to measure how diverse our audience is and how diversity within our audiences changes over time. Prior to the unconference, I had not thought about the trade-off between accurately and responsibly collecting personal information for diversity statistics. Mandatory self-reporting of, for example, gender, race, and age may allow conference/workshop organizers to generate more representative statistics than would optional self-reporting, but mandating that attendees provide this information may make them feel uncomfortable. Furthermore, asking an attendee to identify as a member of a set of pre-defined categories may leave some attendees feeling excluded.

DISC Unconference participants discussed strategies to quantify diversity responsibly. For example, event organizers may choose to give attendees a mandatory survey for collecting diversity statistics that gives the attendee the option to decline to respond to each question. By requiring survey participation without requiring self-reporting, this approach should increase self-reporting among those who are comfortable relaying how they identify, but who may not have taken the time to do so in an optional survey. Inclusive wording on such a survey might mean merely asking event attendees whether they identify as under-represented in their field because of their gender, for example, rather than asking them to specify their gender. Moreover, surveys asking for personal information should tell attendees what will be done with their information and why it is being requested. Thanks to these discussions, I feel more prepared to measure the effectiveness of our diversity efforts at Julia Computing.

Jane Herriman is Director of Diversity and Outreach at Julia Computing and a PhD candidate in applied physics and materials science at Caltech.

Julia and Spark, Better Together

By: Julia Computing, Inc.

Re-posted from: http://juliacomputing.com/blog/2017/12/12/julia-and-spark.html

The use of Apache Spark as a distributed data and computation engine has grown rapidly in recent times. Leveraging the Hadoop ecosystem, enterprise workloads have swiftly migrated to Spark. Hosted Spark instances from AWS and Azure have made it even easier to get started, and to run large, on-demand clusters for dynamic workloads.

Scala, the primary language of Spark, is not everyone’s cup of tea when it comes to numeric computing problems. The problems, which mostly arise out of the JVM, include floating point inaccuracy, lack of performance on user-defined mathematical constructs, and limited library support for complex optimisation or linear algebra routines.

Julia, having been built for numerical computing, is well suited to creating fast and accurate numerical applications while leveraging the large-scale data handling capabilities of the Spark platform.

Spark.jl

The Spark.jl package, created by Andrei Zhabinsky, with subsequent contributions by a larger worldwide group of developers, enables the use of Julia programs on Spark. It allows you to connect to a Spark cluster from the Julia REPL, load data and submit jobs. The typical operating model involves creating a Spark RDD by loading a file, or from any Julia iterator. Julia functions can then be applied to the RDD using the standard Spark verbs, all from within Julia. This first-class integration is enabled by the JavaCall Julia package, which allows Julia and Java codebases to interoperate.

As an example, a typical session to compute a distributed wordcount (the “Hello World” of distributed computing) from Julia would look like this (all code typed in the Julia REPL):

using Spark
Spark.init()
sc = SparkContext(master="local")
text = parallelize(sc, ["hello world", "the world is one", "we are the world"])
words = flat_map(text, split)
words_tuple = cartesian(words, parallelize(sc, [1]))
counts = reduce_by_key(words_tuple, +)
result = collect(counts)

   7-element Array{Any,1}:
   ("are", 1)
   ("is", 1)
   ("one", 1)
   ("we", 1)
   ("hello", 1)
   ("world", 3)
   ("the", 2)
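For readers without a Spark installation at hand, the same counting logic can be sketched in plain, single-process Julia. This is only an illustration of what the flat_map and reduce_by_key steps compute; the function name wordcount is hypothetical and is not part of Spark.jl, and nothing here is distributed.

```julia
# Single-process sketch of the word count; `wordcount` is a hypothetical
# helper for illustration and is not part of Spark.jl.
function wordcount(lines)
    counts = Dict{String,Int}()
    for line in lines
        # Equivalent of the flat_map step: split each line into words.
        for word in split(line)
            key = String(word)
            # Equivalent of the reduce_by_key step: add 1 per occurrence.
            counts[key] = get(counts, key, 0) + 1
        end
    end
    return counts
end

counts = wordcount(["hello world", "the world is one", "we are the world"])
```

The resulting dictionary holds the same seven word/count pairs as the Spark output, without any guarantee on ordering.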

A second example shows the code to calculate π using a simple Monte Carlo method.

NUM_SAMPLES = 10000
samples = parallelize(sc, 1:NUM_SAMPLES)
# Keep a sample only if a random point in the unit square falls inside the unit circle
c = filter(samples, _ -> begin x = rand(2); x[1]^2 + x[2]^2 < 1 end) |> count
print(4 * c / NUM_SAMPLES)
    3.1432
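The method works because a point drawn uniformly from the unit square lands inside the quarter circle of radius 1 with probability π/4, so four times the observed fraction approximates π. A minimal local sketch of the same estimate, without Spark, might look like the following; the function name estimate_pi, the sample count and the fixed seed are assumptions made for illustration.

```julia
using Random

# Local sketch of the Monte Carlo estimate; `estimate_pi` is a hypothetical
# helper and is not part of Spark.jl.
function estimate_pi(num_samples::Integer, rng::AbstractRNG)
    inside = 0
    for _ in 1:num_samples
        # Draw a point uniformly from the unit square.
        x, y = rand(rng), rand(rng)
        # Count it if it falls inside the quarter circle of radius 1.
        if x^2 + y^2 < 1
            inside += 1
        end
    end
    # The fraction inside approximates pi/4.
    return 4 * inside / num_samples
end

# A fixed seed is used here only to make the run repeatable.
estimate = estimate_pi(100_000, MersenneTwister(42))
```

In the Spark version, the loop body is the anonymous function passed to filter, and count plays the role of the inside counter across the cluster.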

It is important to note that in these examples, the core domain calculations are being done in Julia code: in the split and + functions of the first example, and in the anonymous function of the second example. In addition, familiar Spark API function names such as parallelize, map, reduce and reduce_by_key are being used to distribute the code and the data to the various Spark nodes that make up the cluster.

A large proportion of the Spark RDD API is accessible from Julia, as well as the beginnings of support for the DataFrames and Spark SQL APIs. Detailed documentation can be perused at http://dfdx.github.io/Spark.jl/.

Installing and Running

Installing the Julia Spark bindings is as simple as adding the package via the Pkg.add("Spark") command from the Julia REPL. This will install a local standalone Spark environment for testing, in addition to the Julia bindings. Java and Maven are prerequisites, and the latter should be present in the system path.

When running this in a production setting, a Julia process is used as a driver, and it connects to an existing Spark cluster in client mode. Standalone, Mesos and YARN clusters are supported. On the cluster, Julia and its dependencies need to be installed on all nodes. This should be automated, and pre-built scripts are available for the major cloud providers. This makes the cloud-hosted Spark clusters provided by Amazon EMR and Azure HDInsight the easiest environments to run this on.

Julia on Azure HDInsight

Creating an HDInsight cluster on Azure is a matter of following the online wizard on the Azure portal. Choose Spark 2.1 on Linux (HDI 3.6) as the cluster type. Default settings can be used for everything else.

Create an Azure Data Lake Store principal if you intend to load data out of ADL Store. Choose a cluster size based on your requirements. By default, HDInsight creates a cluster with 2 master nodes and 4 workers. Once the basic settings are provided, choose to edit the Advanced Settings and configure a script action. You can use the example supplied with the package to create a basic Julia installation on the cluster. For production use, you will want to edit the script to satisfy your requirements, for example by adding packages or installing JuliaPro.

Finally, once the cluster has been created, SSH to the master node, where you will find Julia available on the PATH. The cluster is running using the YARN cluster manager, where all endpoints are configured using property files. As a result, connecting to the cluster from Julia is simply a matter of specifying YARN as the cluster mode.

This post was formatted for the Julia Computing blog by Rajshekar Behar