Author Archives: Julia Computing, Inc.

Julia Computing Raises $4.6M in Seed Funding

Berkeley, California – Julia Computing is pleased to announce seed funding of $4.6M from investors General Catalyst and Founder Collective.

Julia Computing CEO Viral Shah says, “We selected General Catalyst and Founder Collective as our initial investors because of their success backing entrepreneurs with business models based on open source software. This investment helps us accelerate product development and continue delivering outstanding support to our customers, while the entire Julia community benefits from Julia Computing’s contributions to the Julia open source programming language.”

The General Catalyst team was led by Donald Fischer, who was an early product manager for Red Hat Enterprise Linux, and the Founder Collective team was led by David Frankel.

Julia is the fastest modern high performance open source computing language for data, analytics, algorithmic trading, machine learning and artificial intelligence. Julia, a product of the open source community, MIT CSAIL and MIT Mathematics, combines the functionality and ease of use of Python, R, Matlab, SAS and Stata with the speed of C++ and Java. Julia delivers dramatic improvements in simplicity, speed, capacity and productivity. Julia provides parallel computing capabilities out of the box and unlimited scalability with minimal effort. With more than 1 million downloads and +161% annual growth, Julia is one of the top 10 programming languages developed on GitHub and adoption is growing rapidly in finance, insurance, energy, robotics, genomics, aerospace and many other fields.

According to Tim Thornham, Director of Financial Solutions Modeling at Aviva, Britain’s second-largest insurer, “Solvency II compliant models in Julia are 1,000x faster than Algorithmics, use 93% fewer lines of code and took one-tenth the time to implement.”

Julia users, partners and employers hiring Julia programmers in 2017 include Amazon, Apple, BlackRock, Capital One, Comcast, Disney, Facebook, Ford, Google, Grindr, IBM, Intel, KPMG, Microsoft, NASA, Oracle, PwC, Raytheon and Uber.

  1. Julia is lightning fast. Julia provides speed improvements up to
    1,000x for insurance model estimation, 225x for parallel
    supercomputing image analysis and 11x for macroeconomic modeling.

  2. Julia is easy to learn. Julia’s flexible syntax is familiar and
    comfortable for users of Python, R and Matlab.

  3. Julia integrates well with existing code and platforms. Users of
    Python, R, Matlab and other languages can easily integrate their
    existing code into Julia.

  4. Elegant code. Julia was built from the ground up for
    mathematical, scientific and statistical computing, and has advanced
    libraries that make coding simple and fast, and dramatically reduce
    the number of lines of code required – in some cases, by 90%
    or more.

  5. Julia solves the two language problem. Because Julia combines
    the ease of use and familiar syntax of Python, R and Matlab with the
    speed of C, C++ or Java, programmers no longer need to estimate
    models in one language and reproduce them in a faster
    production language. This saves time and reduces error and cost.

Julia Computing was founded in 2015 by the creators of the open source Julia language to develop products and provide support for businesses and researchers who use Julia. Julia Computing’s founders are Viral Shah, Alan Edelman, Jeff Bezanson, Stefan Karpinski, Keno Fischer and Deepak Vinchhi.

A few examples of how Julia is being used today include:

BlackRock, the world’s largest asset manager, is using Julia to power their trademark Aladdin analytics platform.

Aviva, Britain’s second-largest insurer, is using Julia to make Solvency II compliance models run 1,000x faster using just 7% as much code as the legacy program it replaced.

“Solvency II compliant models in Julia are 1,000x faster than Algorithmics, use 93% fewer lines of code and took one-tenth the time to implement.” Tim Thornham, Director of Financial Solutions Modeling

Berkery Noyes is using Julia for mergers and acquisitions analysis.

“Julia is 20 times faster than Python, 100 times faster than R, 93 times faster than Matlab and 1.3 times faster than Fortran. What really excites us is that it’s interesting that you can write high-level, scientific and numerical computing but without having to re-translate that. Usually, if you have something in R or Matlab and you want to make it go faster, you have to re-translate it to C++, or some other faster language; with Julia, you don’t—it sits right on top.” Keith Lubell, CTO

UC Berkeley Autonomous Race Car (BARC) is using Julia for self-driving vehicle navigation.

“Julia has some amazing new features for our research. The port to ARM has made it easy for us to translate our research codes into real world applications.” Francesco Borrelli, Professor of Mechanical Engineering and co-director of the Hyundai Center of Excellence in Integrated Vehicle Safety Systems and Control at UC Berkeley

Federal Aviation Administration (FAA) and MIT Lincoln Labs are using Julia for the Next-Generation Aircraft Collision Avoidance System.

“The previous way of doing things was very costly. Julia is very easy to understand. It’s a very familiar syntax, which helps the reader understand the document with clarity, and it helps the writer develop algorithms that are concise. Julia resolves many of our conflicts, reduces cost during technology transfer, and because Julia is fast, it enables us to run the entire system and allows the specification to be executed directly. We continue to push Julia as a standard for specifications in the avionics industry. Julia is the right answer for us and exceeds all our needs.” Robert Moss, MIT Lincoln Labs

Augmedics is using Julia to give surgeons ‘x-ray vision’ via augmented reality.

“I stumbled upon Julia and gave it a try for a few days. I fell in love with the syntax, which is in so many ways exactly how I wanted it to be. The Julia community is helpful, Juno (the interactive development environment for Julia) is super-helpful. I don’t know how one can write without it. As a result, we are achieving much more and far more efficiently using Julia.” Tsur Herman, Senior Algorithms Developer

Path BioAnalytics is using Julia for personalized medicine.

“We were initially attracted to Julia because of the innovation we saw going on in the community. The computational efficiency and interactivity of the data visualization packages were exactly what we needed in order to process our data quickly and present results in a compelling fashion. Julia is instrumental to the efficient execution of multiple workflows, and with the dynamic development of the language, we expect Julia will continue to be a key part of our business going forward.” Katerina Kucera, Lead Scientist

Voxel8 is using Julia for 3D printing and drone manufacture.

“The expressiveness of a language matters. Being high level and having an ability to iterate quickly makes a major difference in a fast-paced innovative environment like at Voxel8. The speed at which we’ve been able to develop this has been incredible. If we were doing this in a more traditional language like C or C++, we wouldn’t be nearly as far as we are today with the number of developers we have, and we wouldn’t be able to respond nearly as quickly to customer feedback regarding what features they want. There is a large number of packages for Julia that we find useful. Julia is very stable – the core language is stable and fast and most packages are very stable.” Jack Minardi, Co-Founder and Software Lead

Federal Reserve Bank of New York and Nobel Laureate Thomas J. Sargent are using Julia to solve macroeconomic models 10x faster.

“We tested our code and found that the model estimation is about ten times faster with Julia than before, a very large improvement. Our ports (computer lingo for “translations”) of certain algorithms, such as Chris Sims’s gensys (which computes the model solution), also ran about six times faster in Julia than the … versions we had previously used.” Marco Del Negro, Marc Giannoni, Pearl Li, Erica Moszkowski and Micah Smith, Federal Reserve Bank of New York

“Julia is a great tool. We like Julia. We are very excited about Julia because our models are complicated. It’s easy to write the problem down, but it’s hard to solve it – especially if our model is high dimensional. That’s why we need Julia. Figuring out how to solve these problems requires some creativity. This is a walking advertisement for Julia.” Thomas J. Sargent, Nobel Laureate

Intel, Lawrence Berkeley National Laboratory, UC Berkeley and the National Energy Research Scientific Computing Center are using Julia for parallel supercomputing to increase the speed of astronomical image analysis 225x.

Barts Cancer Institute, Institute of Cancer Research, University College London and Queen Mary University of London are using Julia to model cancer genomes.

“Coming from using Matlab, Julia was pretty easy, and I was surprised by how easy it was to write pretty fast code. Obviously the speed, conciseness and dynamic nature of Julia is a big plus and the initial draw, but there are other perhaps unexpected benefits. For example, I’ve learned a lot about programming through using Julia. Learning Julia has helped me reason about how to write better and faster code. I think this is primarily because Julia is very upfront about why it can be fast and nothing is hidden away or “under the hood”. Also as most of the base language and packages are written in Julia, it’s great to be able to delve into what’s going on without running into a wall of C code, as might be the case in other languages. I think this is a big plus for its use in scientific research too, where we hope that our methods and conclusions are reproducible. Having a language that’s both fast enough to implement potentially sophisticated algorithms at a big scale but also be readable by most people is a great resource. Also, I find the code to be very clean looking, which multiple dispatch helps with a lot, and I like the ability to write in a functional style.” Marc Williams, Barts Cancer Institute, Queen Mary University of London and University College London

Julia Ranks Among Top 10 Programming Languages Developed on GitHub

Cambridge, MA – Julia ranks as one of the top 10 programming languages developed on GitHub as measured by the number of stars and forks.

GitHub users star a repository in order to show appreciation and create a bookmark for easy access. Programmers fork a repository in order to add features or fix bugs and contribute to the project.

Julia ranks #10 in GitHub stars and #8 in forks among programming languages developed on GitHub.

Top GitHub Programming Languages

Rank  Language      GitHub Stars  Number of Repositories
1     Swift         38,513        5,755
2     Go            28,230        3,770
3     TypeScript    22,248        3,230
4     Rust          21,938        4,072
5     CoffeeScript  13,972        1,909
6     Kotlin        13,074        1,219
7     Ruby          12,212        3,401
8     PHP           11,989        4,037
9     Elixir        10,103        1,399
10    Julia         8,651         1,982
11    Scala         8,254         2,092
12    Crystal       8,161         656
13    Python        7,835         1,294
14    Roslyn        7,538         1,851
15    PowerShell    7,010         913

Ranked by Number of GitHub Stars

About Julia Computing and Julia

Julia Computing (JuliaComputing.com) was founded in 2015 by the co-creators of the open source Julia language to develop products and provide support for businesses and researchers who use Julia.

Julia is the fastest modern high performance open source computing language for data, analytics, algorithmic trading, machine learning and artificial intelligence. Julia combines the functionality and ease of use of Python, R, Matlab, SAS and Stata with the speed of Java and C++. Julia delivers dramatic improvements in simplicity, speed, capacity and productivity. With more than 1 million downloads and +161% annual growth, Julia adoption is growing rapidly in finance, energy, robotics, genomics and many other fields.

  1. Julia is lightning fast. Julia provides speed improvements up to
    1,000x for insurance model estimation, 225x for parallel
    supercomputing image analysis and 11x for macroeconomic modeling.

  2. Julia is easy to learn. Julia’s flexible syntax is familiar and
    comfortable for users of Python, R and Matlab.

  3. Julia integrates well with existing code and platforms. Users of
    Python, R, Matlab and other languages can easily integrate their
    existing code into Julia.

  4. Elegant code. Julia was built from the ground up for
    mathematical, scientific and statistical computing, and has advanced
    libraries that make coding simple and fast, and dramatically reduce
    the number of lines of code required – in some cases, by 90%
    or more.

  5. Julia solves the two language problem. Because Julia combines
    the ease of use and familiar syntax of Python, R and Matlab with the
    speed of C, C++ or Java, programmers no longer need to estimate
    models in one language and reproduce them in a faster
    production language. This saves time and reduces error and cost.

Julia users, partners and employers looking to hire Julia programmers in 2017 include: Google, Apple, Amazon, Facebook, IBM, Intel, Microsoft, BlackRock, Capital One, PwC, Ford, Oracle, Comcast, DARPA, Moore Foundation, Federal Reserve Bank of New York (FRBNY), UC Berkeley Autonomous Race Car (BARC), Federal Aviation Administration (FAA), MIT Lincoln Labs, Nobel Laureate Thomas J. Sargent, Brazilian National Development Bank (BNDES), Conning, Berkery Noyes, BestX, Path BioAnalytics, Invenia, AOT Energy, AlgoCircle, Trinity Health, Gambit, Augmedics, Tangent Works, Voxel8, Massachusetts General Hospital, NaviHealth, Farmers Insurance, Pilot Flying J, Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center (NERSC), Oak Ridge National Laboratory, Los Alamos National Laboratory, Lawrence Livermore National Laboratory, National Renewable Energy Laboratory, MIT, Caltech, Stanford, UC Berkeley, Harvard, Columbia, NYU, Oxford, NUS, UCL, Nantes, Alan Turing Institute, University of Chicago, Cornell, Max Planck Institute, Australian National University, University of Warwick, University of Colorado, Queen Mary University of London, London Institute of Cancer Research, UC Irvine, University of Kaiserslautern, University of Queensland.

Julia is being used to: analyze images of the universe and research dark matter, drive parallel supercomputing, diagnose medical conditions, provide surgeons with real-time imagery using augmented reality, analyze cancer genomes, manage 3D printers, pilot self-driving racecars, build drones, improve air safety, manage the electric grid, provide analytics for foreign exchange trading, energy trading, insurance, regulatory compliance, macroeconomic modeling, sports analytics, manufacturing, and much, much more.

Inference Convergence algorithm in Julia – revisited

Adventures in Type Inference Convergence: 2017 edition

Corrected Convergence

In my last post on type inference convergence,
I described a correct inference algorithm.
However, while this was a tremendous improvement over being wrong, I was unhappy with it.
Indeed, I wrote that “a full discussion of these heuristics will have to wait for a future blog post”.
But what I didn’t say is that I had already written many notes for that post,
and had reached a point where I understood that the current algorithm wouldn’t actually permit
reliable, well-tuned heuristics.

Plus, the algorithm required lots of global state (to manage the work-queue),
which required tricky locks and coordination to ensure it stayed consistent.

Convergence Algorithm 2.0

The improved convergence algorithm in PR #21677
maintains all of the correctness guarantees of the existing algorithm,
but uses a completely revised cycle-detection algorithm that provides much stronger guarantees
about the order in which work will be done.

Compared to the current algorithm, the revised one has half as many states.
They are:

  1. processing: any leaf function of the call-stack (usually there’s just one at a time)
  2. on the call-stack: the tree, a DAG (directed acyclic graph), of functions with edges representing inferred potential calls
  3. in a cycle: groups of function nodes discovered not to form a DAG (represented instead as an unsorted set)
  4. finished: done

The outline of the algorithm that manages these states is that the call-stack is always maintained as a simple tree,
only permitting the existence of nodes with forward edges.
During the course of running inference, if adding an edge would cause a cycle in this graph,
the algorithm instead replaces all nodes that are participating in the cycle
with a single node that represents that cyclic set.

Conceptually, this set is similar to either the “fixed-point” set in the current algorithm,
or the “active lines” set.
It differs from the “fixed-point” set because it can also guarantee that only nodes in the cycle
will be in the set.
The “active lines” set only allows describing the active set within a single method –
the new representation expands that to iterate convergence of an entire set of methods.

Because the algorithm maintains an acyclic call-stack,
it now becomes possible to depend on the inspection of an edge in inference to
result in exactly one of two results: an inferred return type,
or a cycle detection that replaces the current node with a convergence set.
This greatly increases the options available for integrating new inference heuristics!
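
The cycle-collapsing step can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the code from the PR: the call-stack is modeled as a single path rather than a tree, each entry is the set of functions it stands for, and the name add_edge! and its return values are hypothetical.

```julia
# Hypothetical sketch: each stack entry is the set of functions it represents.
# A singleton set is an ordinary frame; a larger set is a collapsed cycle.

# Record that the function at the top of `stack` calls `callee`.
# Returns :pushed if `callee` became a new leaf frame, or :cycle if the call
# would have closed a cycle and the participating frames were merged instead.
function add_edge!(stack::Vector{Set{Symbol}}, callee::Symbol)
    idx = findfirst(frame -> callee in frame, stack)
    if idx === nothing
        push!(stack, Set([callee]))
        return :pushed
    end
    # Every frame from `idx` to the top of the stack participates in the cycle.
    merged = union(stack[idx:end]...)
    resize!(stack, idx - 1)
    push!(stack, merged)
    return :cycle
end

stack = [Set([:f])]
add_edge!(stack, :g)   # f calls g: g becomes the new leaf
add_edge!(stack, :h)   # g calls h: h becomes the new leaf
add_edge!(stack, :g)   # h calls g: g and h are merged into one cyclic set
```

After the last call the stack holds two entries, the frame for f and a single node standing for the set of g and h, so the stack itself never contains a back edge.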

There’s one additional nice benefit to this new representation:
improved inlining heuristics!
It may not be obvious how improved cycle-detection during inference convergence
would impact the optimization afterwards,
but one of the unintuitive products of inference is the inlining order.
Deciding on the profitability of inlining is dependent upon knowing the
precise structure (ordering) of the call-stack and any cycles in it.
Since the new inference structure provides precise information on which
functions potentially participate in cycles,
the inliner can take advantage of that information to decide which functions to inline,
and where to terminate the inlining due to recursion.

Undecidability Heuristics

Without heuristics imposed on top of the existing inference algorithm,
it would already be Turing-complete[^turing],
with just the current support for recursively inspecting dispatch over computed types.

This means that (without some additional constraints added) inference might attempt to
compute anything (including the halting problem, busy beaver, infinite recursion).
In some cases, code is written that explicitly tries to make use of this
using “lisp-y style” function application.
However, code written in that manner often quickly runs into limits
that are inherently required by the compiler to avoid other very similar
code patterns where the inferred recursion is either unintentional
(it won’t affect runtime performance / behavior)
or unreachable (solving for reachability is commonly referred to as the halting problem).

It is suboptimal for this inference system to be undecidable (or even slow),
since its sole purpose is to find optimizations to make your program run faster.
And inference is going to be painfully slow when it is being used as an interpreter
(rather than just running the program unspecialized).
So, there needs to be some mechanism or heuristics for preventing this from happening.

It’s also important to realize that these heuristics need to be tuned precisely
to the capabilities of inference.

Turing-completeness requires several features:

  • loops
  • conditionals
  • arbitrarily large (but finite) memory

Removing any one of these would be sufficient to ensure the system is decidable.
But loops (recursion), conditionals (dispatch), and memory (apply-type)
are all essential to getting useful results from inference
and thus it is generally not desirable to simply remove them.
Additionally, even if it were possible now,
as inference becomes more capable
(e.g. inter-procedural constant propagation, effect-free computation,
speculative evaluation of constant arithmetic),
new forms of loops and memory would appear that would also need to be addressed.
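
To make the three ingredients concrete, here is a minimal sketch of Peano-style addition carried out entirely on types. The names Zero, Succ and add are made up for this illustration and are not part of Julia's Base or of the compiler. Recursion plays the role of the loop, dispatch on the first argument is the conditional, and applying Succ to a type is the memory.

```julia
# Hypothetical type-level naturals: Zero is 0, Succ{N} is N + 1.
struct Zero end
struct Succ{N} end

# Conditional: dispatch selects the base case or the recursive case.
add(::Type{Zero}, ::Type{M}) where {M} = M
# Loop and memory: recurse on N while building the larger type Succ{M}.
add(::Type{Succ{N}}, ::Type{M}) where {N, M} = add(N, Succ{M})

two = Succ{Succ{Zero}}
one = Succ{Zero}
add(two, one)   # Succ{Succ{Succ{Zero}}}
```

To infer the return type of a call like this, inference has to walk through the same chain of dispatches, which is why an unbounded version of this pattern needs to be limited.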

Current limits

Since Julia’s inference algorithm currently only operates on Julia’s types
(and does not support inference of constant expressions like 1 + 1),
we only need to consider how these can be used to express an arbitrarily large memory.
Arbitrarily large memory implies both the ability to form arbitrarily large constants and the ability to do arithmetic (conditional loops) on them.

There are a finite number of type names.
But there are two primary mechanisms by which a type can be used to express an arbitrary value.
For any type, the parameters can be arbitrarily nested in depth.
Additionally, there are tuples and unions, which (in addition to depth), can have arbitrary length.
We can express this complexity as:

depth(type) = 1 + maximum(depth, type.parameters; init=0)
length(type) = length(type.parameters)

It is possible, using either property (or a combination of both), to represent any number.
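
A runnable version of these two measures might look like the following. It is a sketch under assumptions: the names typedepth and typelength are hypothetical (chosen to avoid clashing with Base.length), only DataType is handled, and anything else, such as a Union or a non-type parameter like the 1 in Array{Int, 1}, is counted as depth 0.

```julia
# Nesting depth of a type's parameters.
function typedepth(t)
    t isa DataType || return 0      # assumption: non-DataType values count as 0
    d = 0
    for p in t.parameters
        d = max(d, typedepth(p))
    end
    return 1 + d
end

# Number of top-level parameters.
typelength(t::DataType) = length(t.parameters)

typedepth(Int)                                # 1
typedepth(Vector{Int})                        # 2
typedepth(Tuple{Tuple{Tuple{}}})              # 3: a number encoded as nesting depth
typelength(Tuple{Nothing, Nothing, Nothing})  # 3: a number encoded as tuple length
```

The last two lines show the two encodings side by side: the same number can be stored either in how deeply a type is nested or in how many parameters it has.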

The current heuristics in inference prohibit recursion which increases either depth or length.
Without this ability, the program cannot express any number larger than the input,
forcing it to (eventually) terminate.
Additionally, there are various (semi-arbitrary) limits placed on the lengths and depths of
the types that are flowing through the system.
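
As a rough illustration of the idea, and not the compiler's actual check, a limit of this kind could compare the argument signature of a recursive call against the signature already on the call-stack. The helper names typedepth and grows are hypothetical, and non-DataType parameters are assumed to count as depth 0.

```julia
# Nesting depth of a type; anything that is not a DataType counts as 0 here.
function typedepth(t)
    t isa DataType || return 0
    d = 0
    for p in t.parameters
        d = max(d, typedepth(p))
    end
    return 1 + d
end

# Hypothetical check: does the recursive call's argument signature `next`
# grow in depth or length relative to the earlier signature `prev`?
function grows(prev::DataType, next::DataType)
    deeper = typedepth(next) > typedepth(prev)
    longer = length(next.parameters) > length(prev.parameters)
    return deeper || longer
end

grows(Tuple{Int}, Tuple{Tuple{Int}})  # true:  f(x) = f((x,)) nests one level deeper per call
grows(Tuple{Int}, Tuple{Int, Int})    # true:  g(x...) = g(x..., 0) adds an argument per call
grows(Tuple{Int, Int}, Tuple{Int})    # false: h(x, rest...) = h(rest...) shrinks
```

A recursion for which such a check returns true is the kind that would be widened or cut off, while the shrinking case can be followed to its end.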

Independence of the cached result

For predictability, it is preferable that the heuristic limits imposed should only be local properties,
computable from the function itself or its forward edges (the functions that it calls).
This remains an unsolved problem, since the heuristics are based on examining the call-stack (following back-edges).
This means that the order in which functions enter inference can affect how precisely they are permitted to compute their results.


(A diagram showing how the context-sensitivity of the specialization type signature can create repeating patterns instead of simple recursion)

To illustrate what I mean by wanting to only use “local properties”, consider what happens if fuchsia enters inference first:
inference will see the widening recursion at fuchsia´,
and green´ will not be encountered.
Thus, the result of green will be different than if fuchsia was not being inferred
(recursion would have been detected at green´, changing the results assigned and cached for both –
including fuchsia when it is later inferred, and uses the cached return type for green).

An optimal limit would notice the recursion at fuchsia´, but change the topmost function (fuchsia)
to use an approximation of green while canceling type-inference on the original attempted edges.
This would preserve the characteristic that the cached result of type-inferring green shouldn’t
depend on having been called from fuchsia.

Looking forward

A downside of the current arbitrary limits is that they penalize valid code which uses complex types,
even if they are not changing under recursion.

Another downside is that they only take into account the possibility of expressing memory
via the type-system “length” and “depth” definitions above.

These challenges remain, and will need to be addressed in a future update to this blog
(and to the implementation!).