juliabloggers.com http://www.juliabloggers.com A Julia Language Blog Aggregator Julia Computing Awarded $910,000 Grant by Alfred P. Sloan Foundation, Including $160,000 for STEM Diversity http://www.juliabloggers.com/julia-computing-awarded-910000-grant-by-alfred-p-sloan-foundation-including-160000-for-stem-diversity/ Mon, 26 Jun 2017 00:00:00 +0000 http://juliacomputing.com/press/2017/06/26/grant Cambridge, MA – Julia Computing has been granted $910,000 by the Alfred P. Sloan Foundation to support open-source Julia development, including $160,000 to promote diversity in the Julia community.

The grant will support Julia training, adoption, usability, compilation, package development, tooling and documentation.

The diversity portion of the grant will fund a new full-time Director of Diversity Initiatives plus travel, scholarships, training sessions, workshops, hackathons and webinars. Further information about the new Director of Diversity Initiatives position is below for interested applicants.

Julia Computing CEO Viral Shah says, “Diversity of backgrounds increases diversity of ideas. With this grant, the Sloan Foundation is setting a new standard of support for diversity which we hope will be emulated throughout STEM.”

Diversity efforts in the Julia community have been led by JuliaCon Diversity Chair, Erica Moszkowski. According to Moszkowski, “This year, we awarded $12,600 in diversity grants to help 16 participants travel to, attend and present at JuliaCon 2017. Those awards, combined with anonymous talk review, directed outreach, and other efforts have paid off. To give one example, there are many more women attending and presenting than in previous years, but there is a lot more we can do to expand participation from underrepresented groups in the Julia community. This support from the Sloan Foundation will allow us to scale up these efforts and apply them not just at JuliaCon, but much more broadly through Julia workshops and recruitment.”

Julia Computing seeks job applicants for Director of Diversity Initiatives. This is a full-time salaried position. The ideal candidate would have the following characteristics:

  • Familiarity with Julia
  • Strong scientific, mathematical or numeric programming skills required – e.g. Julia, Python, R
  • Eager to travel, organize and conduct Julia trainings, conferences, workshops and hackathons
  • Enthusiastic about outreach, developing and leveraging relationships with universities and STEM diversity organizations such as YesWeCode, Girls Who Code, Code Latino and Black Girls Code
  • Strong organizational, communication, public speaking and training skills required
  • Passionate evangelist for Julia, open source computing, scientific computing and increasing diversity in the Julia community and STEM
  • This position is based in Cambridge, MA

Interested applicants should send a resume and statement of interest to jobs@juliacomputing.com.

Julia is the fastest modern high performance open source computing language for data, analytics, algorithmic trading, machine learning and artificial intelligence. Julia, a product of the open source community, MIT CSAIL and MIT Mathematics, combines the functionality and ease of use of Python, R, Matlab, SAS and Stata with the speed of C++ and Java. Julia delivers dramatic improvements in simplicity, speed, capacity and productivity. Julia provides parallel computing capabilities out of the box and unlimited scalability with minimal effort. With more than 1 million downloads and +161% annual growth, Julia is one of the top 10 programming languages developed on GitHub and adoption is growing rapidly in finance, insurance, energy, robotics, genomics, aerospace and many other fields.

Julia users, partners and employers hiring Julia programmers in 2017 include Amazon, Apple, BlackRock, Capital One, Comcast, Disney, Facebook, Ford, Google, Grindr, IBM, Intel, KPMG, Microsoft, NASA, Oracle, PwC, Raytheon and Uber.

  1. Julia is lightning fast. Julia provides speed improvements up to 1,000x for insurance model estimation, 225x for parallel supercomputing image analysis and 11x for macroeconomic modeling.

  2. Julia is easy to learn. Julia’s flexible syntax is familiar and comfortable for users of Python, R and Matlab.

  3. Julia integrates well with existing code and platforms. Users of Python, R, Matlab and other languages can easily integrate their existing code into Julia.

  4. Elegant code. Julia was built from the ground up for mathematical, scientific and statistical computing, and has advanced libraries that make coding simple and fast, and dramatically reduce the number of lines of code required – in some cases, by 90% or more.

  5. Julia solves the two language problem. Because Julia combines the ease of use and familiar syntax of Python, R and Matlab with the speed of C, C++ or Java, programmers no longer need to estimate models in one language and reproduce them in a faster production language. This saves time and reduces error and cost.

Julia Computing was founded in 2015 by the creators of the open source Julia language to develop products and provide support for businesses and researchers who use Julia.

The Alfred P. Sloan Foundation is a not-for-profit grantmaking institution based in New York City. Founded by industrialist Alfred P. Sloan Jr., the Foundation makes grants in support of basic research and education in science, technology, engineering, mathematics, and economics. This grant was provided through the Foundation’s Data and Computational Research program, which makes grants that seek to leverage developments in digital information technology to maximize the efficiency and trustedness of research. sloan.org

A few examples of how Julia is being used today include:

BlackRock, the world’s largest asset manager, is using Julia to power their trademark Aladdin analytics platform.

Aviva, Britain’s second-largest insurer, is using Julia to make Solvency II compliance models run 1,000x faster using just 7% as much code as the legacy program it replaced.

“Solvency II compliant models in Julia are 1,000x faster than Algorithmics, use 93% fewer lines of code and took one-tenth the time to implement.” Tim Thornham, Director of Financial Solutions Modeling

Berkery Noyes is using Julia for mergers and acquisitions analysis.

“Julia is 20 times faster than Python, 100 times faster than R, 93 times faster than Matlab and 1.3 times faster than Fortran. What really excites us is that it’s interesting that you can write high-level, scientific and numerical computing but without having to re-translate that. Usually, if you have something in R or Matlab and you want to make it go faster, you have to re-translate it to C++, or some other faster language; with Julia, you don’t—it sits right on top.” Keith Lubell, CTO

UC Berkeley Autonomous Race Car (BARC) is using Julia for self-driving vehicle navigation.

“Julia has some amazing new features for our research. The port to ARM has made it easy for us to translate our research codes into real world applications.” Francesco Borrelli, Professor of Mechanical Engineering and co-director of the Hyundai Center of Excellence in Integrated Vehicle Safety Systems and Control at UC Berkeley

Federal Aviation Administration (FAA) and MIT Lincoln Labs are using Julia for the Next-Generation Aircraft Collision Avoidance System.

“The previous way of doing things was very costly. Julia is very easy to understand. It’s a very familiar syntax, which helps the reader understand the document with clarity, and it helps the writer develop algorithms that are concise. Julia resolves many of our conflicts, reduces cost during technology transfer, and because Julia is fast, it enables us to run the entire system and allows the specification to be executed directly. We continue to push Julia as a standard for specifications in the avionics industry. Julia is the right answer for us and exceeds all our needs.” Robert Moss, MIT Lincoln Labs

Augmedics is using Julia to give surgeons ‘x-ray vision’ via augmented reality.

“I stumbled upon Julia and gave it a try for a few days. I fell in love with the syntax, which is in so many ways exactly how I wanted it to be. The Julia community is helpful, Juno (the interactive development environment for Julia) is super-helpful. I don’t know how one can write without it. As a result, we are achieving much more and far more efficiently using Julia.” Tsur Herman, Senior Algorithms Developer

Path BioAnalytics is using Julia for personalized medicine.

“We were initially attracted to Julia because of the innovation we saw going on in the community. The computational efficiency and interactivity of the data visualization packages were exactly what we needed in order to process our data quickly and present results in a compelling fashion. Julia is instrumental to the efficient execution of multiple workflows, and with the dynamic development of the language, we expect Julia will continue to be a key part of our business going forward.” Katerina Kucera, Lead Scientist

Voxel8 is using Julia for 3D printing and drone manufacture.

“The expressiveness of a language matters. Being high level and having an ability to iterate quickly makes a major difference in a fast-paced innovative environment like at Voxel8. The speed at which we’ve been able to develop this has been incredible. If we were doing this in a more traditional language like C or C++, we wouldn’t be nearly as far as we are today with the number of developers we have, and we wouldn’t be able to respond nearly as quickly to customer feedback regarding what features they want. There is a large number of packages for Julia that we find useful. Julia is very stable – the core language is stable and fast and most packages are very stable.” Jack Minardi, Co-Founder and Software Lead

Federal Reserve Bank of New York and Nobel Laureate Thomas J. Sargent are using Julia to solve macroeconomic models 10x faster.

“We tested our code and found that the model estimation is about ten times faster with Julia than before, a very large improvement. Our ports (computer lingo for “translations”) of certain algorithms, such as Chris Sims’s gensys (which computes the model solution), also ran about six times faster in Julia than the … versions we had previously used.” Marco Del Negro, Marc Giannoni, Pearl Li, Erica Moszkowski and Micah Smith, Federal Reserve Bank of New York

“Julia is a great tool. We like Julia. We are very excited about Julia because our models are complicated. It’s easy to write the problem down, but it’s hard to solve it – especially if our model is high dimensional. That’s why we need Julia. Figuring out how to solve these problems requires some creativity. This is a walking advertisement for Julia.” Thomas J. Sargent, Nobel Laureate

Intel, Lawrence Berkeley National Laboratory, UC Berkeley and the National Energy Research Scientific Computing Center are using Julia for parallel supercomputing to increase the speed of astronomical image analysis 225x.

Barts Cancer Institute, Institute of Cancer Research, University College London and Queen Mary University of London are using Julia to model cancer genomes.

“Coming from using Matlab, Julia was pretty easy, and I was surprised by how easy it was to write pretty fast code. Obviously the speed, conciseness and dynamic nature of Julia is a big plus and the initial draw, but there are other perhaps unexpected benefits. For example, I’ve learned a lot about programming through using Julia. Learning Julia has helped me reason about how to write better and faster code. I think this is primarily because Julia is very upfront about why it can be fast and nothing is hidden away or “under the hood”. Also as most of the base language and packages are written in Julia, it’s great to be able to delve into what’s going on without running into a wall of C code, as might be the case in other languages. I think this is a big plus for its use in scientific research too, where we hope that our methods and conclusions are reproducible. Having a language that’s both fast enough to implement potentially sophisticated algorithms at a big scale but also be readable by most people is a great resource. Also, I find the code to be very clean looking, which multiple dispatch helps with a lot, and I like the ability to write in a functional style.” Marc Williams, Barts Cancer Institute, Queen Mary University of London and University College London

By: Julia Computing, Inc.

Re-posted from: http://juliacomputing.com/press/2017/06/26/grant.html

Julia Computing Raises $4.6M in Seed Funding http://www.juliabloggers.com/julia-computing-raises-4-6m-in-seed-funding/ Mon, 19 Jun 2017 00:00:00 +0000 http://juliacomputing.com/press/2017/06/19/funding Berkeley, California – Julia Computing is pleased to announce seed funding of $4.6M from investors General Catalyst and Founder Collective.

Julia Computing CEO Viral Shah says, “We selected General Catalyst and Founder Collective as our initial investors because of their success backing entrepreneurs with business models based on open source software. This investment helps us accelerate product development and continue delivering outstanding support to our customers, while the entire Julia community benefits from Julia Computing’s contributions to the Julia open source programming language.”

The General Catalyst team was led by Donald Fischer, who was an early product manager for Red Hat Enterprise Linux, and the Founder Collective team was led by David Frankel.

Julia is the fastest modern high performance open source computing language for data, analytics, algorithmic trading, machine learning and artificial intelligence. Julia combines the functionality and ease of use of Python, R, Matlab, SAS and Stata with the speed of C++ and Java. Julia delivers dramatic improvements in simplicity, speed, capacity and productivity. Julia provides parallel computing capabilities out of the box and unlimited scalability with minimal effort. With more than 1 million downloads and +161% annual growth, Julia is one of the top 10 programming languages developed on GitHub and adoption is growing rapidly in finance, insurance, energy, robotics, genomics, aerospace and many other fields.

According to Tim Thornham, Director of Financial Solutions Modeling at Aviva, Britain’s second-largest insurer, “Solvency II compliant models in Julia are 1,000x faster than Algorithmics, use 93% fewer lines of code and took one-tenth the time to implement.”

Julia users, partners and employers hiring Julia programmers in 2017 include Amazon, Apple, BlackRock, Capital One, Comcast, Disney, Facebook, Ford, Google, Grindr, IBM, Intel, KPMG, Microsoft, NASA, Oracle, PwC, Raytheon and Uber.

  1. Julia is lightning fast. Julia provides speed improvements up to 1,000x for insurance model estimation, 225x for parallel supercomputing image analysis and 11x for macroeconomic modeling.

  2. Julia is easy to learn. Julia’s flexible syntax is familiar and comfortable for users of Python, R and Matlab.

  3. Julia integrates well with existing code and platforms. Users of Python, R, Matlab and other languages can easily integrate their existing code into Julia.

  4. Elegant code. Julia was built from the ground up for mathematical, scientific and statistical computing, and has advanced libraries that make coding simple and fast, and dramatically reduce the number of lines of code required – in some cases, by 90% or more.

  5. Julia solves the two language problem. Because Julia combines the ease of use and familiar syntax of Python, R and Matlab with the speed of C, C++ or Java, programmers no longer need to estimate models in one language and reproduce them in a faster production language. This saves time and reduces error and cost.

Julia Computing was founded in 2015 by the creators of the open source Julia language to develop products and provide support for businesses and researchers who use Julia. Julia Computing’s founders are Viral Shah, Alan Edelman, Jeff Bezanson, Stefan Karpinski, Keno Fischer and Deepak Vinchhi.

A few examples of how Julia is being used today include:

BlackRock, the world’s largest asset manager, is using Julia to power their trademark Aladdin analytics platform.

Aviva, Britain’s second-largest insurer, is using Julia to make Solvency II compliance models run 1,000x faster using just 7% as much code as the legacy program it replaced.

“Solvency II compliant models in Julia are 1,000x faster than Algorithmics, use 93% fewer lines of code and took one-tenth the time to implement.” Tim Thornham, Director of Financial Solutions Modeling

Berkery Noyes is using Julia for mergers and acquisitions analysis.

“Julia is 20 times faster than Python, 100 times faster than R, 93 times faster than Matlab and 1.3 times faster than Fortran. What really excites us is that it’s interesting that you can write high-level, scientific and numerical computing but without having to re-translate that. Usually, if you have something in R or Matlab and you want to make it go faster, you have to re-translate it to C++, or some other faster language; with Julia, you don’t—it sits right on top.” Keith Lubell, CTO

UC Berkeley Autonomous Race Car (BARC) is using Julia for self-driving vehicle navigation.

“Julia has some amazing new features for our research. The port to ARM has made it easy for us to translate our research codes into real world applications.” Francesco Borrelli, Professor of Mechanical Engineering and co-director of the Hyundai Center of Excellence in Integrated Vehicle Safety Systems and Control at UC Berkeley

Federal Aviation Administration (FAA) and MIT Lincoln Labs are using Julia for the Next-Generation Aircraft Collision Avoidance System.

“The previous way of doing things was very costly. Julia is very easy to understand. It’s a very familiar syntax, which helps the reader understand the document with clarity, and it helps the writer develop algorithms that are concise. Julia resolves many of our conflicts, reduces cost during technology transfer, and because Julia is fast, it enables us to run the entire system and allows the specification to be executed directly. We continue to push Julia as a standard for specifications in the avionics industry. Julia is the right answer for us and exceeds all our needs.” Robert Moss, MIT Lincoln Labs

Augmedics is using Julia to give surgeons ‘x-ray vision’ via augmented reality.

“I stumbled upon Julia and gave it a try for a few days. I fell in love with the syntax, which is in so many ways exactly how I wanted it to be. The Julia community is helpful, Juno (the interactive development environment for Julia) is super-helpful. I don’t know how one can write without it. As a result, we are achieving much more and far more efficiently using Julia.” Tsur Herman, Senior Algorithms Developer

Path BioAnalytics is using Julia for personalized medicine.

“We were initially attracted to Julia because of the innovation we saw going on in the community. The computational efficiency and interactivity of the data visualization packages were exactly what we needed in order to process our data quickly and present results in a compelling fashion. Julia is instrumental to the efficient execution of multiple workflows, and with the dynamic development of the language, we expect Julia will continue to be a key part of our business going forward.” Katerina Kucera, Lead Scientist

Voxel8 is using Julia for 3D printing and drone manufacture.

“The expressiveness of a language matters. Being high level and having an ability to iterate quickly makes a major difference in a fast-paced innovative environment like at Voxel8. The speed at which we’ve been able to develop this has been incredible. If we were doing this in a more traditional language like C or C++, we wouldn’t be nearly as far as we are today with the number of developers we have, and we wouldn’t be able to respond nearly as quickly to customer feedback regarding what features they want. There is a large number of packages for Julia that we find useful. Julia is very stable – the core language is stable and fast and most packages are very stable.” Jack Minardi, Co-Founder and Software Lead

Federal Reserve Bank of New York and Nobel Laureate Thomas J. Sargent are using Julia to solve macroeconomic models 10x faster.

“We tested our code and found that the model estimation is about ten times faster with Julia than before, a very large improvement. Our ports (computer lingo for “translations”) of certain algorithms, such as Chris Sims’s gensys (which computes the model solution), also ran about six times faster in Julia than the … versions we had previously used.” Marco Del Negro, Marc Giannoni, Pearl Li, Erica Moszkowski and Micah Smith, Federal Reserve Bank of New York

“Julia is a great tool. We like Julia. We are very excited about Julia because our models are complicated. It’s easy to write the problem down, but it’s hard to solve it – especially if our model is high dimensional. That’s why we need Julia. Figuring out how to solve these problems requires some creativity. This is a walking advertisement for Julia.” Thomas J. Sargent, Nobel Laureate

Intel, Lawrence Berkeley National Laboratory, UC Berkeley and the National Energy Research Scientific Computing Center are using Julia for parallel supercomputing to increase the speed of astronomical image analysis 225x.

Barts Cancer Institute, Institute of Cancer Research, University College London and Queen Mary University of London are using Julia to model cancer genomes.

“Coming from using Matlab, Julia was pretty easy, and I was surprised by how easy it was to write pretty fast code. Obviously the speed, conciseness and dynamic nature of Julia is a big plus and the initial draw, but there are other perhaps unexpected benefits. For example, I’ve learned a lot about programming through using Julia. Learning Julia has helped me reason about how to write better and faster code. I think this is primarily because Julia is very upfront about why it can be fast and nothing is hidden away or “under the hood”. Also as most of the base language and packages are written in Julia, it’s great to be able to delve into what’s going on without running into a wall of C code, as might be the case in other languages. I think this is a big plus for its use in scientific research too, where we hope that our methods and conclusions are reproducible. Having a language that’s both fast enough to implement potentially sophisticated algorithms at a big scale but also be readable by most people is a great resource. Also, I find the code to be very clean looking, which multiple dispatch helps with a lot, and I like the ability to write in a functional style.” Marc Williams, Barts Cancer Institute, Queen Mary University of London and University College London

By: Julia Computing, Inc.

Re-posted from: http://juliacomputing.com/press/2017/06/19/funding.html

Reading DataFrames with non-UTF8 encoding in Julia http://www.juliabloggers.com/reading-dataframes-with-non-utf8-encoding-in-julia/ Mon, 12 Jun 2017 15:51:55 +0000 http://perfectionatic.org/?p=414 By: perfectionatic

Re-posted from: http://perfectionatic.org/?p=414

Recently I ran into a problem where I was trying to read a CSV file from a Scandinavian friend into a DataFrame. I was getting errors because it could not properly parse the latin1-encoded names.

I tried running

using DataFrames
dataT=readtable("example.csv", encoding=:latin1)

but got this error

ArgumentError: Argument 'encoding' only supports ':utf8' currently.

The solution makes use of StringEncodings.jl (https://github.com/nalimilan/StringEncodings.jl) to wrap the file data stream before presenting it to the readtable function.

using StringEncodings

f = open("example.csv", "r")
s = StringDecoder(f, "LATIN1", "UTF-8")   # decode the latin1 bytes to UTF-8 on the fly
dataT = readtable(s)
close(s)
close(f)

The StringDecoder generates an IO stream that appears to be UTF-8 to the readtable function.
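For completeness, the same idea can be wrapped in a do block so the streams are closed even if readtable throws. This is only a sketch, assuming the same readtable and StringDecoder APIs shown above:

using DataFrames, StringEncodings

dataT = open("example.csv", "r") do f
    s = StringDecoder(f, "LATIN1", "UTF-8")
    try
        readtable(s)
    finally
        close(s)   # flush and release the decoder; open/do closes f itself
    end
end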

Tupper’s self-referential formula in Julia http://www.juliabloggers.com/tuppers-self-referential-formula-in-julia/ Mon, 12 Jun 2017 15:15:29 +0000 http://perfectionatic.org/?p=399 By: perfectionatic

Re-posted from: http://perfectionatic.org/?p=399

I was surprised when I came across Tupper’s formula on Twitter. I felt the compulsion to implement it in Julia.

The formula is expressed as

{1\over 2} < \left\lfloor \mathrm{mod}\left(\left\lfloor {y \over 17} \right\rfloor 2^{-17 \lfloor x \rfloor - \mathrm{mod}(\lfloor y\rfloor, 17)},2\right)\right\rfloor

and yields a bitmap facsimile of itself.

In [1]:
k=big"960 939 379 918 958 884 971 672 962 127 852 754 715 004 339 660 129 306 651 505 519 271 702 802 395 266 424 689 642 842 174 350 718 121 267 153 782 770 623 355 993 237 280 874 144 307 891 325 963 941 337 723 487 857 735 749 823 926 629 715 517 173 716 995 165 232 890 538 221 612 403 238 855 866 184 013 235 585 136 048 828 693 337 902 491 454 229 288 667 081 096 184 496 091 705 183 454 067 827 731 551 705 405 381 627 380 967 602 565 625 016 981 482 083 418 783 163 849 115 590 225 610 003 652 351 370 343 874 461 848 378 737 238 198 224 849 863 465 033 159 410 054 974 700 593 138 339 226 497 249 461 751 545 728 366 702 369 745 461 014 655 997 933 798 537 483 143 786 841 806 593 422 227 898 388 722 980 000 748 404 719"
setprecision(BigFloat,10000);

In the above, the big integer is the magic number that lets us generate the image of the formula. I also need to set the precision of BigFloat very high, as rounding errors at the default precision do not give the desired results. The implementation was inspired by the one in Python, but I find the Julia version a great deal more concise and clear.

In [2]:
function tupper_field(k)
    field = Array{Bool}(17, 106)   # on Julia ≥ 1.0 this would be Array{Bool}(undef, 17, 106)
    for (ix, x) in enumerate(0.0:1:105.0), (iy, y) in enumerate(k:k+16)
        field[iy, 107-ix] = 1/2 < floor(mod(floor(y/17) * 2^(-17 * floor(x) - mod(floor(y), 17)), 2))
    end
    field
end
In [3]:
f=tupper_field(k);
using Images
img = colorview(Gray,.!f)
Out[3]: [the 17×106 grayscale bitmap rendering of Tupper’s formula; image not reproduced in this text version]

I just inverted the boolean array here to get the desired bitmap output.
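If you want to keep the result, the bitmap can also be written to disk. This is a sketch assuming FileIO.jl and an image backend such as ImageMagick.jl are installed (Gray comes from the Images package loaded above):

using FileIO

# convert the Bool mask to a grayscale image and save it as a PNG
save("tupper.png", Gray.(float.(.!f)))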

 

Sampling variation in effective sample size estimates (MCMC) http://www.juliabloggers.com/sampling-variation-in-effective-sample-size-estimates-mcmc/ Mon, 12 Jun 2017 14:25:57 +0000 http://tpapp.github.io/post/ess-sampling/ By: Julia on Tamás K. Papp's website

Re-posted from: http://tpapp.github.io/post/ess-sampling/

Introduction MCMC samples, used in Bayesian statistics, are not independent — in fact, unless one uses specialized methods or modern HMC, posterior draws are usually highly autocorrelated. For independent draws, $\text{variance of simulation mean} \propto \frac{1}{N}$, where $N$ is the sample size, but for correlated draws, one has to scale the sample size with a factor $\tau = \frac{1}{1+2\sum_{k=1}^\infty \rho_k}$, where $\rho_k$ is the lag-$k$ autocorrelation.
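As a rough illustration of that scaling, an effective sample size can be estimated from the empirical autocorrelations of a chain. The sketch below assumes StatsBase.jl and truncates the sum at the first non-positive autocorrelation, which is one common heuristic rather than the method used in the post:

using StatsBase

function ess(chain::AbstractVector; maxlag = 200)
    rho = autocor(chain, 1:maxlag)                      # lag-1 ... lag-maxlag autocorrelations
    cut = findfirst(r -> r <= 0, rho)                   # truncate the (formally infinite) sum
    s = sum(rho[1:(cut === nothing ? maxlag : cut - 1)])
    tau = 1 / (1 + 2s)                                  # the scaling factor from the formula above
    return tau * length(chain)                          # effective sample size
end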

Optim.jl v0.9.0 http://www.juliabloggers.com/optim-jl-v0-9-0/ Fri, 02 Jun 2017 20:47:33 +0000 http://www.pkofod.com/?p=265 By: pkofod

Re-posted from: http://www.pkofod.com/2017/06/02/optim-jl-v0-9-0/

I am very happy to say that we can finally announce that Optim.jl v0.9.0 is out. This version has quite a few user facing changes. Please read about the changes below if you use Optim.jl in a package, a script, or anything else, as you will quite likely have to make some changes to your code.

As always, I have to thank my two partners in crime: Asbjørn Nilsen Riseth (@anriseth) and Christoph Ortner (@cortner) for their help in making the changes, transitions, and tests that are included in v0.9.0.

The last update (from v0.6.0 to v0.7.0) had some changes that were a long time coming, and so does v0.9.0. Hopefully, these fixes to old design problems will greatly improve the user experience and performance of Optim.jl, and pave the way for more exciting features in the future.

We’ve tried to make the transition as smooth as possible, although we do have breaking changes in this update. Please consult the documentation if you face problems, join us on gitter or ask the community at discourse!

Okay, now to the changes.

Why not v0.8.0?
First of all, why v0.9.0? The last version was v0.7.8! This is because we are dropping support for Julia v0.4 and v0.5 simultaneously, so we are reserving v0.8.0 for backporting serious fixes to Julia v0.5. However, Julia v0.6 should be just around the corner. With Julia v0.7 and v1.0.0 not too far out on the horizon either, I’ve decided it’s more important to move forward than to keep v0.4 and v0.5 up to speed. The dev time is constrained, so currently it’s one or the other. Of course, users of Julia v0.5 can simply continue to use Optim.jl v0.7.8. Post Julia’s proper release, backwards compatibility and continuity will be more important, even if it comes at the expense of development speed.

Another note about the version number: The next version of Optim.jl will be v1.0.0, and we will follow SEMVER 2.0 fully.

Change order of evaluation point and storage arguments
This one is very breaking, although we have set up a system such that all gradients and Hessians will be checked before proceeding. This check will be removed shortly in a v1.0.0 version bump, so please correct your code now. Basically, we closed a very old issue (#156) concerning the input argument order in gradients and Hessians. In Julia, an in-place function typically has an exclamation mark at the end of its name, and the cache as the first argument. In Optim.jl it has been the other way around for the argument order. We’ve changed that, and this means that you now have to provide "g" or "H" as the first argument, and "x" as the second. The old version

function g!(x, g)
    ... do something ...
end

is now

function g!(g, x)
    ... do something ...
end
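For a concrete example, the gradient of the Rosenbrock function used later in this post would be written with the new argument order roughly like this (a sketch, not code taken from the package):

function g!(g, x)
    # gradient of (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2, filled in place
    g[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
    g[2] = 200.0 * (x[2] - x[1]^2)
    return g
end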

NLSolversBase.jl
Since v0.7.0, we’ve moved some of the basic infrastructure of Optim.jl to NLSolversBase.jl. This is currently the Non-, Once-, and TwiceDifferentiable types and constructors. This is done to, as a first step, share code between Optim.jl and LineSearches.jl, but also NLsolve.jl in the future. At the same time, we’ve made the code a little smarter, such that superfluous calls to the objective function, gradient, and Hessian are now avoided. As an example, compare the objective and gradient calls in the example in our readme. Here, we optimize the Rosenbrock “banana” function using BFGS. Since the last version of Optim we had to change the output shown there, as it has gone from 157 calls to 53. Much of this comes from this refactoring, but some of it also comes from better choices for initial line search steps for BFGS and Newton introduced in #328.

As mentioned, we’ve made the *Differentiable types a bit smarter, including moving the gradient and Hessian caches into the respective types. This also means that a OnceDifferentiable type instance needs to know what the return type of the gradient is. This is done by providing an x seed in the constructor

rosenbrock = Optim.UnconstrainedProblems.examples["Rosenbrock"]
f = rosenbrock.f
g! = rosenbrock.g!
x_seed = rosenbrock.initial_x
od = OnceDifferentiable(f, g!, x_seed)

If the seed also happens to be the initial x, then you do not have to provide an x when calling optimize

julia> optimize(od, BFGS(), Optim.Options(g_tol=0.1))
Results of Optimization Algorithm
 * Algorithm: BFGS
 * Starting Point: [1.0005999613152214,1.001138415164852]
 * Minimizer: [1.0005999613152214,1.001138415164852]
 * Minimum: 7.427113e-07
 * Iterations: 13
 * Convergence: true
   * |x - x'| < 1.0e-32: false 
     |x - x'| = 1.08e-02 
   * |f(x) - f(x')| / |f(x)| < 1.0e-32: false
     |f(x) - f(x')| / |f(x)| = NaN 
   * |g(x)| < 1.0e-01: true 
     |g(x)| = 2.60e-02 
   * stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 45
 * Gradient Calls: 45

If you’ve used Optim.jl before, you’ll notice that the output carries a bit more information about the convergence criteria.

LineSearches.jl turned Julian
Line searches used to be chosen by passing a symbol to the linesearch keyword in the constructors of line search based methods such as GradientDescent, BFGS, and Newton. The new version of LineSearches.jl uses types and dispatch exactly like Optim.jl does for solvers. This means that you now have to pass a type instance instead of a symbol, and it also means that we can open up for easy tweaking of line search parameters through fields in the line search types.

Let us illustrate by the following example how the new syntax works. First, we construct a BFGS instance without specifying the linesearch. This defaults to HagerZhang.

julia> rosenbrock(x) =  (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
rosenbrock (generic function with 1 method)
 
julia> result = optimize(rosenbrock, zeros(2), BFGS())
Results of Optimization Algorithm
 * Algorithm: BFGS
 * Starting Point: [0.0,0.0]
 * Minimizer: [0.9999999926033423,0.9999999852005353]
 * Minimum: 5.471433e-17
 * Iterations: 16
 * Convergence: true
   * |x - x'| < 1.0e-32: false
     |x - x'| = 3.47e-07
   * |f(x) - f(x')| / |f(x)| < 1.0e-32: false
     |f(x) - f(x')| / |f(x)| = NaN
   * |g(x)| < 1.0e-08: true
     |g(x)| = 2.33e-09
   * stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 53
 * Gradient Calls: 53

or we could choose a backtracking line search instead

 
julia> optimize(rosenbrock, zeros(2), BFGS(linesearch = LineSearches.BackTracking()))
Results of Optimization Algorithm
 * Algorithm: BFGS
 * Starting Point: [0.0,0.0]
 * Minimizer: [0.9999999926655744,0.9999999853309254]
 * Minimum: 5.379380e-17
 * Iterations: 23
 * Convergence: true
   * |x - x'| < 1.0e-32: false
     |x - x'| = 1.13e-09
   * |f(x) - f(x')| / |f(x)| < 1.0e-32: false
     |f(x) - f(x')| / |f(x)| = NaN
   * |g(x)| < 1.0e-08: true
     |g(x)| = 8.79e-11
   * stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 31
 * Gradient Calls: 24

This defaults to cubic backtracking, but quadratic backtracking can be chosen using the order keyword:

julia> optimize(rosenbrock, zeros(2), BFGS(linesearch = LineSearches.BackTracking(order = 2)))
Results of Optimization Algorithm
 * Algorithm: BFGS
 * Starting Point: [0.0,0.0]
 * Minimizer: [0.9999999926644578,0.9999999853284671]
 * Minimum: 5.381020e-17
 * Iterations: 23
 * Convergence: true
   * |x - x'| < 1.0e-32: false
     |x - x'| = 4.73e-09
   * |f(x) - f(x')| / |f(x)| < 1.0e-32: false
     |f(x) - f(x')| / |f(x)| = NaN
   * |g(x)| < 1.0e-08: true
     |g(x)| = 1.76e-10
   * stopped by an increasing objective: false
   * Reached Maximum Number of Iterations: false
 * Objective Calls: 29
 * Gradient Calls: 24

LineSearches.jl should have better documentation coming soon, but the code is quite self-explanatory for those who want to twiddle around with these parameters.

The method state is now an argument to optimize
Although it is not always that useful for users to know about, we use method states internally to hold all the pre-allocated cache variables that are needed. In the new version of Optim.jl, the state can be constructed and passed in explicitly, so that you can retrieve various diagnostics after the optimization routine is done. One such example is the inverse Hessian estimate that BFGS produces.

method = BFGS()
options = Optim.Options()
initial_x = rand(2)
d = OnceDifferentiable(f, g!, initial_x)   # f and g! as in the Rosenbrock example above
my_state = Optim.initial_state(method, options, d, initial_x)
optimize(d, method, options, my_state)
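For example, the inverse Hessian estimate can then be read straight off the state. A small sketch (the field name invH is an assumption based on how the BFGS state is currently laid out, so check the source if it differs in your version):

my_state.invH   # inverse Hessian approximation at the final iterate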

The future
We have more changes coming in the near future. There’s PR #356 for a Trust Region solver for cases where you can explicitly calculate Hessian-vector products without forming the Hessian (from @jeff-regier from the Celeste.jl project), the interior point replacement for our current barrier function approach to box constrained optimization in PR #303, and more.

TensorFlow’s SVD is significantly worse than LAPACK’s, but still very good http://www.juliabloggers.com/tensorflows-svd-is-significantly-worse-than-lapacks-but-still-very-good/ Fri, 02 Jun 2017 00:00:00 +0000 http://www.juliabloggers.com/?guid=a307999ea2209ec167320442c9fa7d5f TensorFlow’s SVD is significantly less accurate than LAPACK’s (i.e. julia’s and numpy/SciPy’s backing library for linear algebra). But still incredibly accurate, so probably don’t panic. Unless your matrices have very large ($>10^6$) values, then the accuracy difference might be relevant for you (but probably isn’t). However, both LAPACK and TensorFlow are not great then – LAPACK is still much better.

By: A Technical Blog -- julia

Re-posted from: http://white.ucc.asn.au/2017/06/02/TensorFlow's-SVD-is-significantly-worse-than-LAPACK's,-but-still-very-good.html

TensorFlow’s SVD is significantly less accurate than LAPACK’s (i.e. julia’s and numpy/SciPy’s backing library for linear algebra).
But still incredibly accurate, so probably don’t panic.
If your matrices have very large ($>10^6$) values, then the accuracy difference might be relevant for you (but probably isn’t).
In that regime neither LAPACK nor TensorFlow is great, though LAPACK is still much better.

TensorFlow.jl recently gained bindings for the SVD operation.
This came as a result of @malmaud’s great work to automatically create bindings for all the ops defined by the TensorFlow backend (it is pretty awesome).
Some may be surprised to find that TensorFlow supports singular value decomposition (SVD) at all.
After all, isn’t it a neural network library?
My response to that, which I have said before and will say again,
is that TensorFlow is (effectively) a linear algebra library with automatic differentiation and GPU support.
Having those features makes it great for implementing neural networks.
But it has more general functionality than you would ever expect.
SVD is one of those features; though I am sure it does have a collection of uses for neural networks – using it to implement PCA for dimensionality reduction as a preprocessing step comes to mind.

After I had added the binding (reordering the return values to match julia’s Base.svd),
I wanted to add a test to make sure it was working correctly.
As there are multiple correct SVD solutions for a given input $M$, I can’t directly check the returned $U,S,V$ against those returned by julia’s svd.
So instead we use $U,S,V$ to reconstruct $M$ and test that the reconstruction is close enough.

Then what is close enough?
Being as close as julia’s SVD gets makes sense.
But when I tested that, it was failing,
so I thought I would give it some slack: allowing 2 times the error.
On testing that, it still wasn’t enough slack and the tests failed, so I gave it more (after checking that the results did at least make sense).
I ended up allowing 100 times as much reconstruction error, though this may have been a bit much.
Based on this, I thought I would investigate closer.
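A tolerance check of that form looks roughly like the sketch below (illustrative only, not the actual TensorFlow.jl test; the recon_err_* helpers are defined later in this post):

using Base.Test
m = randn(100, 100)
# accept the TensorFlow result if its reconstruction error is within
# 100 times what LAPACK (via julia's svd) achieves on the same matrix
@test recon_err_tf(m) <= 100 * recon_err_jl(m)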

These observations are based on TensorFlow.jl and Julia, but they really apply to any TensorFlow library, and to almost any scientific computing library.
All the language-specific TensorFlow libraries delegate their operations to the same C/C++ backend.
Most scientific computing software delegates its linear algebra routines to some variant of LAPACK; not just julia and SciPy/numpy, but also commercial products like MATLAB and Mathematica.
I’m using TensorFlow.jl and julia because that is what I am most familiar with.

There are of course a variety of algorithms, and variations on those algorithms, for calculating the SVD.
It will become obvious that TensorFlow and LAPACK are using different ones.
I’ll also point out that there is another implementation in IterativeSolvers.jl.
I am not going to go into any detail on the differences – I am no serious numerical linear algebraist; I go and bug applied mathematicians when I need to know that kind of stuff.

Here we are just looking at the implementations from the outside.

I am not looking at speed here at all.
I don’t know if TensorFlow is faster or slower than LAPACK.
In general this depends significantly on your system setup, and how TensorFlow was compiled.
It has been reported that it is hugely faster than numpy’s, but I’ve only seen the one report and few details.

If you want to look into TensorFlow’s accuracy checks, some of its tests can be found on their github. They check 32-bit floats with a tolerance of $10^{-5}$ and 64-bit floats with a tolerance of $10^{-14}$; I think that is with sum of errors.

LAPACK tests are here. However, LAPACK has its own Domain Specific Language for testing, and I don’t speak it at all.

On to our own tests:

Input:

using TensorFlow
using Plots
using DataFrames

To be clear, since these can change with different LAPACKs, and different TensorFlow releases, this is what I am running on:

Input:

versioninfo()

Output:

Julia Version 0.5.1
Commit 6445c82 (2017-03-05 13:25 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Nehalem)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.7.1 (ORCJIT, nehalem)

Input:

TensorFlow.tf_version()

Output:

v"1.0.1"

Also, it is worth looking at these errors in the context of the machine epsilon.
Most of these errors are far below it, and so don’t matter at all.

Input:

eps(Float32)

Output:

1.1920929f-7

Input:

eps(Float64)

Output:

2.220446049250313e-16

First we define a function to conveniently call the TensorFlow SVD on a julia matrix.
It works by adding a constant to the graph.
This leaks memory like crazy, since it adds a new node every time it is run,
but that does not matter for the purposes of our test.
(It probably should have been done with a placeholder and a feed dictionary.)

Input:

sess=Session(Graph())
svd_tf(m) = run(sess, svd(constant(m)))
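As an aside, a placeholder-based variant would avoid growing the graph. A rough sketch (this assumes TensorFlow.jl’s placeholder and feed-dictionary syntax; depending on the version you may also need to give the placeholder an explicit shape):

x_ph = placeholder(Float64)                  # matrix fed in at run time
svd_op = svd(x_ph)                           # the SVD node is built only once
svd_tf_fed(m) = run(sess, svd_op, Dict(x_ph => m))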

Now we define the reconstruction error, which is how we will evaluate each method.
We are using the sum of squared errors:
we get one error result per matrix per method
(so not mean squared error, since we want to look at the error distribution).
Note that the evaluation happens entirely in julia, except for the SVD itself.

The choice of sum of squared errors here, rather than sum of errors, is perhaps not ideal.
I’m honestly not sure.
Sum of errors would give a much larger result – in fact almost all the errors would be above the machine epsilon.
The few papers I have seen evaluating SVD seem to mostly use sum of squared errors; but this is not my field.

Input:

recon_err(m, u,s,v) = sum(abs2, m-u*diagm(s)*v') # squared Frobenius norm of the reconstruction residual
recon_err_jl(m) = recon_err(m, svd(m)...)
recon_err_tf(m) = recon_err(m, svd_tf(m)...)
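For a single draw the two can be compared directly (illustrative only; the exact numbers vary from run to run):

m = randn(100, 100)
recon_err_tf(m) / recon_err_jl(m)   # ratio of TensorFlow to LAPACK reconstruction error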

We define a function to run our trials and collect the results.
Note that it takes a function matrix_dist_fun(T, size) that is used to generate the data.
By changing this function we can change the distribution of values in the trial matrices.

Input:

function generate_data(n_samples, matrix_dist_fun, T, size)
    df = DataFrame(err_jl=T[], err_tf=T[])
    for ii in 1:n_samples
        m = matrix_dist_fun(T, size)
        push!(df, Dict(:err_jl => recon_err_jl(m), :err_tf => recon_err_tf(m)))
    end
    df
end

Here we define the functions to perform our analytics/visualisation.
I think a histogram showing the distribution of $err_{tf}/err_{jl}$ is informative.
An absolute-value histogram would also be informative, but when the values are this low it becomes hard to read.
The quartile values (minimum, Q1, median, Q3, maximum) are also informative about the absolute values of the error, since they tell us that, say, three quarters of all trials showed an error less than the given value.

Input:

function plot_relative_error_hist(df)
    histogram(df[:err_tf]./df[:err_jl];
        xlabel="factor by which Tensorflow error is greater than Julia (LAPACK) error",
        ylabel="number of trials with this error",
        title="Histogram of relative error values for SVD reconstruction"
    )
end

Input:

function  quartile_summary(df, field)
    q0 = minimum(df[field])
    q1 = quantile(df[field], 0.25)
    q2 = median(df[field])
    q3 = quantile(df[field], 0.75)
    q4 = maximum(df[field])
    print("$field:\t")
    @printf("Q0=%0.2e\t Q1=%0.2e\t Q2=%0.2e\t Q3=%0.2e\t Q4=%0.2e", q0, q1, q2, q3, q4)
    println()
    (q0, q1, q2, q3, q4)
end

Input:

function display_evaluation_figures(df)
    quartile_summary(df, :err_jl)
    quartile_summary(df, :err_tf)
    plot_relative_error_hist(df)
end

So now onward to the results.
In the results that follow, it can be seen that all the absolute errors (even for the maximum/Q4) are well below the machine epsilon for the type evaluated. (But see near the bottom where this does not hold.)
It can also be seen that it is very rare for TensorFlow to have a lower error than Julia.
Such results would show up as bars in the histogram at $x<1$, of which there are some, but vanishingly few.
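If you want the exact count of those rare wins, it is a one-liner on any of the result DataFrames (a small sketch, using the normal100double trial defined just below):

sum(normal100double[:err_tf] .< normal100double[:err_jl])   # number of trials where TensorFlow beat LAPACK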

Input:

normal100double = generate_data(1000, randn, Float64, (100,100))
display_evaluation_figures(normal100double)

Output:

err_jl:	Q0=3.99e-26	 Q1=4.84e-26	 Q2=5.33e-26	 Q3=6.22e-26	 Q4=1.27e-25
err_tf:	Q0=7.73e-26	 Q1=1.16e-25	 Q2=1.30e-25	 Q3=1.46e-25	 Q4=5.47e-25
INFO: binning = auto

[Figure: histogram of relative error values for SVD reconstruction; x axis: factor by which TensorFlow error is greater than Julia (LAPACK) error; y axis: number of trials with this error]

Input:

normal100float = generate_data(1000, randn, Float32, (100,100))
display_evaluation_figures(normal100float)

Output:

err_jl:	Q0=9.65e-09	 Q1=1.13e-08	 Q2=1.19e-08	 Q3=1.25e-08	 Q4=1.62e-08
err_tf:	Q0=2.38e-08	 Q1=3.63e-08	 Q2=4.02e-08	 Q3=4.49e-08	 Q4=7.15e-08
INFO: binning = auto

[Figure: histogram of relative error values for SVD reconstruction; x axis: factor by which TensorFlow error is greater than Julia (LAPACK) error; y axis: number of trials with this error]

Input:

uniform100double = generate_data(1000, rand, Float64, (100,100))
display_evaluation_figures(uniform100double)

Output:

err_jl:	Q0=4.57e-27	 Q1=6.39e-27	 Q2=7.46e-27	 Q3=8.99e-27	 Q4=2.23e-26
err_tf:	Q0=1.27e-26	 Q1=3.95e-26	 Q2=6.08e-26	 Q3=8.84e-26	 Q4=2.10e-25
INFO: binning = auto

[Figure: histogram of relative error values for SVD reconstruction; x axis: factor by which TensorFlow error is greater than Julia (LAPACK) error; y axis: number of trials with this error]

Input:

uniform100float = generate_data(1000, rand, Float32, (100,100))
display_evaluation_figures(uniform100float)

Output:

err_jl:	Q0=1.07e-09	 Q1=1.31e-09	 Q2=1.47e-09	 Q3=1.69e-09	 Q4=2.95e-09
err_tf:	Q0=2.98e-09	 Q1=4.29e-09	 Q2=4.66e-09	 Q3=5.18e-09	 Q4=7.58e-09
INFO: binning = auto

[Figure: histogram of relative error values for SVD reconstruction; x axis: factor by which TensorFlow error is greater than Julia (LAPACK) error; y axis: number of trials with this error]

Input:

normal10double = generate_data(1000, randn, Float64, (10,10))
display_evaluation_figures(normal10double)

Output:

err_jl:	Q0=3.69e-29	 Q1=9.58e-29	 Q2=1.38e-28	 Q3=2.24e-28	 Q4=3.18e-27
err_tf:	Q0=1.42e-28	 Q1=4.83e-28	 Q2=7.33e-28	 Q3=1.10e-27	 Q4=5.29e-27
INFO: binning = auto

[Figure: histogram of relative error values for SVD reconstruction; x axis: factor by which TensorFlow error is greater than Julia (LAPACK) error; y axis: number of trials with this error]

Input:

normal10float = generate_data(1000, randn, Float32, (10,10))
display_evaluation_figures(normal10float)

Output:

err_jl:	Q0=8.95e-12	 Q1=2.14e-11	 Q2=2.80e-11	 Q3=3.74e-11	 Q4=1.11e-10
err_tf:	Q0=3.56e-11	 Q1=1.52e-10	 Q2=2.36e-10	 Q3=3.52e-10	 Q4=1.19e-09
INFO: binning = auto

[Figure: histogram of relative error values for SVD reconstruction; x axis: factor by which TensorFlow error is greater than Julia (LAPACK) error; y axis: number of trials with this error]

In the prior tests all the matrix elements have been small:
either normally distributed, with mean 0 and variance 1,
or uniformly distributed between 0 and 1.
But what happens when we look at matrices with larger element values?
To do this, we crank up the variance on the randn.
That is to say, we generate our trial matrices using
variance*randn(T,size).
Results follow for variance 10 thousand, 10 million and 10 billion.

Input:

var10Knormal100double = generate_data(1000, (args...)->10_000*randn(args...), Float64, (100,100))
display_evaluation_figures(var10Knormal100double)

Output:

err_jl:	Q0=3.83e-18	 Q1=4.83e-18	 Q2=5.32e-18	 Q3=6.06e-18	 Q4=1.18e-17
err_tf:	Q0=7.46e-18	 Q1=1.16e-17	 Q2=1.29e-17	 Q3=1.46e-17	 Q4=2.15e-17
INFO: binning = auto

[Figure: histogram of relative error values for SVD reconstruction; x axis: factor by which TensorFlow error is greater than Julia (LAPACK) error; y axis: number of trials with this error]

Input:

var10Mnormal100double = generate_data(1000, (args...)->10_000_000*randn(args...), Float64, (100,100))
display_evaluation_figures(var10Mnormal100double)

Output:

err_jl:	Q0=3.74e-12	 Q1=4.85e-12	 Q2=5.37e-12	 Q3=6.15e-12	 Q4=1.10e-11
err_tf:	Q0=7.98e-12	 Q1=1.17e-11	 Q2=1.32e-11	 Q3=1.48e-11	 Q4=2.38e-11
INFO: binning = auto

[Figure: histogram of relative error values for SVD reconstruction; x axis: factor by which TensorFlow error is greater than Julia (LAPACK) error; y axis: number of trials with this error]

Input:

var10Gnormal100double = generate_data(1000, (args...)->10_000_000_000*randn(args...), Float64, (100,100))
display_evaluation_figures(var10Gnormal100double)

Output:

err_jl:	Q0=3.80e-06	 Q1=4.91e-06	 Q2=5.40e-06	 Q3=6.22e-06	 Q4=1.07e-05
err_tf:	Q0=7.85e-06	 Q1=1.16e-05	 Q2=1.30e-05	 Q3=1.46e-05	 Q4=2.20e-05
INFO: binning = auto

[Figure: histogram of relative error values for SVD reconstruction; x axis: factor by which TensorFlow error is greater than Julia (LAPACK) error; y axis: number of trials with this error]

What we see here is that the distribution of relative errors remains the same, but the absolute errors increase.
That is, TensorFlow still generally has around 2.5 times the error of Julia.
Further, for both TensorFlow and Julia, those absolute errors are increasing quadratically with the variance.
This is due to the use of sum of squared errors; if we used sum of errors, the increase would be linear.
So at high variance, this difference in accuracy could matter,
since we are now looking at differences of around $10^{-6}$, for example.
However, these differences remain small compared to the values in the matrix, e.g. $10^7$.
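A quick back-of-the-envelope argument for the quadratic growth (not a formal error analysis): if every element of $M$ is scaled by a factor $s$, the reconstruction residual scales roughly linearly in $s$, so the summed squared error scales like

$\mathrm{err}(sM) = \sum_{ij}\left(sM - \widehat{sM}\right)_{ij}^2 \approx s^2 \sum_{ij}\left(M - \widehat{M}\right)_{ij}^2 = s^2\,\mathrm{err}(M),$

where $\widehat{M} = U\,\mathrm{diag}(S)\,V'$ denotes the reconstruction.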

In the end, the differences are not relevant to most people (potentially not relevant to anyone).
It is merely a curiosity.
LAPACK is consistently better at SVD than TensorFlow.
Really, one should not be too surprised, given that excellent matrix factorisation is what LAPACK is all about.

Solving a simple discrete choice model using Gaussian quadrature http://www.juliabloggers.com/solving-a-simple-discrete-choice-model-using-gaussian-quadrature/ Tue, 30 May 2017 11:16:55 +0000 http://www.pkofod.com/?p=217 Continue reading Solving a simple discrete choice model using Gaussian quadrature ]]> By: pkofod

Re-posted from: http://www.pkofod.com/2017/05/30/solving-a-simple-discrete-choice-model-using-gaussian-quadrature/

In the style of some of the earlier posts, I present a simple economic problem that uses a numerical method as part of its solution. Of course, we use Julia to do so. However, this time we’re actually relying a bit on R, but don’t tell anyone.

Rust models

In the empirical discrete choice literature in economics, a relatively simple and popular framework is the one that matured in Rust (1987, 1988) and was later named Rust models in Aguirregabiria and Mira (2010). Basically, we consider an agent who has to choose a policy (a sequence of actions; stationary in the infinite horizon case) to solve the following problem

\(\max_{a}E\left\{\sum_{t=0}^{T} \beta^t U(a_t, s_t)|s_0\right\}\)

where \(a=(a_0, a_1, \ldots, a_T)\), and \(s_t\) denotes the states. For simplicity, we consider binary decision problems such that \(a_t\in\{1,2\}\). Assume that there is an additive shock, \(\varepsilon_t\), to utility such that

\(U(a_t,s_t)=U(a_t, x_t, \varepsilon_t) = u(a_t, x_t)+\varepsilon_t\)

where \(s_t=(x_t,\varepsilon_t)\) and \(x_t\) is usually called the observed states.

The additive and time separable nature of the problem allows us to consider a set of simpler problems instead. We reformulate the problem according to the principle of optimality, and write the problem in its dynamic programming formulation

\( V_t(x_t, \varepsilon_t) = \max_{a_t}\left[u(a_t, x_t)+\varepsilon_t + \beta E_{s_{t+1}|s_t}\left(V_{t+1}(x_{t+1}, \varepsilon_{t+1})\right)\right], \quad \forall t\in\{0,1,\ldots,T\} \)

The object \(V_t\) is called the value function, as it summarizes the optimal value we can obtain. If we assume conditional independence between the observed states and the shocks along with the assumptions explained in the articles above, we can instead consider this simpler problem

\( W_t(x_t) = E_{\varepsilon_{t+1}|\varepsilon_t}\left\{\max_{a_t}\left(u(a_t, x_t)+\varepsilon_t + \beta E_{x_{t+1}|x_t}\left(W_{t+1}(x_{t+1})\right)\right)\right\}, \quad \forall t\in\{0,1,\ldots,T\} \)

where \(W_{t+1}(x_{t+1})\equiv E_{\varepsilon_{t+1}}\left\{V_{t+1}(x_{t+1},\varepsilon_{t+1})\right\}\). This object is often called the ex-ante or integrated value function. Now, if we assume that the shocks are mean 0 extreme value type I, we get the following

\( W_t(x_t) = \log\left\{\sum_{a_t\in\mathcal{A}} \exp\left[u(a_t,x_t)+\beta E_{x_{t+1}|x_t}\left(W_{t+1}(x_{t+1})\right)\right]\right\}, \quad \forall t\in\{0,1,\ldots,T\} \)

At this point we’re very close to something we can calculate. Either we recursively apply the above to find the finite horizon solution, or, if we’re in the infinite horizon case, we can come up with a guess for \(W\) and apply value function iterations (successive application of the right-hand side in the equation above) to find a solution, as sketched below. We just need to be able to handle the evaluation of the expected value inside the \(\exp\)‘s.
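A schematic of the infinite horizon fixed point iteration might look like this (a sketch only: xgrid, u, EW, and β are placeholders for the state grid, the utility function, the conditional expectation operator, and the discount factor):

# repeatedly apply the right-hand side of the equation above until W stops changing
W = zeros(length(xgrid))              # initial guess for the integrated value function
W_new = similar(W)
tol, maxiter = 1e-8, 10_000
for it in 1:maxiter
    for (i, x) in enumerate(xgrid)
        W_new[i] = log(sum(exp(u(a, x) + β*EW(W, x, a)) for a in (1, 2)))
    end
    maximum(abs, W_new - W) < tol && break
    copy!(W, W_new)
end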

Solving for a continuous function

In the application below, we’re going to have a continuous state. Rust originally solved the problem of handling a continuous function on a computer by discretizing the continuous state in 90 or 175 bins, but we’re going to approach it a bit differently. We’re going to create a type that allows us to construct piecewise linear functions. This means that we’re going to have some nodes where we calculate the function value, and in between these, we simply use linear interpolation. Outside of the first and last nodes we simply set the value to the value at these nodes. We’re not going to extrapolate below, so this won’t be a problem.

Let us have a look at a type that can hold this information.

type PiecewiseLinear
    nodes   # grid points
    values  # function values at the nodes
    slopes  # slope of each interval between consecutive nodes
end

To construct an instance of this type from a set of nodes and a function, we’re going to use the following constructor

function PiecewiseLinear(nodes, f)
    slopes = Float64[]
    fn = f.(nodes)
    for i = 1:length(nodes)-1
        # node i and node i+1
        ni, nip1 = nodes[i], nodes[i+1]
        # f evaluated at the nodes
        fi, fip1 = fn[i], fn[i+1]
        # store slopes in each interval, so we don't have to recalculate them every time
        push!(slopes, (fip1-fi)/(nip1-ni))
    end
    # Construct the type
    PiecewiseLinear(nodes, fn, slopes)
end

Using an instance of \(PiecewiseLinear\) we can now evaluate the function at all input values between the first and last nodes. However, we’re going to have some fun with types in Julia. In Julia, we call a function using parentheses \(f(x)\), but we generally cannot call a type instance.

julia> pwl = PiecewiseLinear(1:10, sqrt)
PiecewiseLinear(1:10,[1.0,1.41421,1.73205,2.0,2.23607,2.44949,2.64575,2.82843,3.0,3.16228],[0.414214,0.317837,0.267949,0.236068,0.213422,0.196262,0.182676,0.171573,0.162278])
 
julia> pwl(3.5)
ERROR: MethodError: objects of type PiecewiseLinear are not callable

… but wouldn’t it be great if we could simply evaluate the interpolated function value at 3.5 that easily? We can, and it’s cool, fun, and extremely handy. The name of the concept is: call overloading. We simply need to define the behavior of “calling” (using parentheses with some input) an instance of a type.

function (p::PiecewiseLinear)(x)
    index_low = searchsortedlast(p.nodes, x)   # index of the last node <= x
    n = length(p.nodes)
    if 0 < index_low < n
        # interpolate within the interval, anchored at the right-hand node
        return p.values[index_low+1]+(x-p.nodes[index_low+1])*p.slopes[index_low]
    elseif index_low == n
        return p.values[end]   # at or beyond the last node
    elseif index_low == 0
        return p.values[1]     # below the first node
    end
end

Basically, we find out which interval we’re in, and then we interpolate appropriately in said interval. Let me say something upfront, or… almost upfront. This post is not about optimal performance. Julia is often sold as “the fast language”, but to me Julia is much more about productivity. Sure, it’s great to be able to optimize your code, but it’s also great to simply be able to do *what you want to do* – without too much hassle. Now, we can do what we couldn’t before.

julia> pwl(3.5)
1.8660254037844386

We can even plot it together with the actual sqrt function on the interval (2,4).

julia> using Plots
 
julia> plot(x->pwl(x), 2, 4, label="Piecewise Linear sqrt")
 
julia> plot!(sqrt, 2, 4, label="sqrt")
 
julia> savefig("sqrtfig")

and we get

which seems to show a pretty good approximation to the square root function considering the very few nodes.

Numerical integrals using Gaussian quadrature

So we can now create continuous function approximations based on finitely many evaluation points (nodes). This is great, because it allows us to work with \(W\) in our code. The only remaining problem is that we need to evaluate the expected value of \(W\). This can be expressed as

\( E_x(f) = \int_a^b f(x)w(x)dx\)

where \(w(x)\) is going to be a probability density function, and \(a\) and \(b\), which can be finite or infinite, represent lower and upper bounds on the values the random variable (state) can take on. In the world of Gaussian quadrature, the \(w\)‘s are called weight functions. Gaussian quadrature is basically a method for finding good evaluation points (nodes) and associated weights such that the following approximation is good

\(\int_a^b f(x)w(x)dx\approx \sum_{i=1}^N f(x_i)w_i\)

where \(N\) is the number of nodes. We’re not going to provide a long description of the methods involved, but we will note that the package DistQuads.jl allows us to easily obtain nodes and weights for a handful of useful distributions. To install this package write

Pkg.clone("https://github.com/pkofod/DistQuads.jl.git")

as it is not currently tagged in METADATA.jl. This is currently calling out to R’s statmod package. The syntax is quite simple. Define a distribution instance, create nodes and weights, and calculate the expected value of the function in three simple steps:

julia> using Distributions, DistQuads
 
julia> bd = Beta(1.5, 50.0)
Distributions.Beta{Float64}(α=1.5, β=50.0)
 
julia> dq = DistQuad(bd, N = 64)
DistQuads.DistQuad([0.000334965,0.00133945,0.00301221,0.0053512,0.00835354,0.0120155,0.0163327,0.0212996,0.0269103,0.0331578  …  0.738325,0.756581,0.774681,0.792633,0.810457,0.828194,0.845925,0.863807,0.882192,0.902105],[0.00484732,0.0184431,0.0381754,0.0603806,0.0811675,0.097227,0.106423,0.108048,0.102724,0.0920435  …  1.87035e-28,5.42631e-30,1.23487e-31,2.11992e-33,2.60541e-35,2.13019e-37,1.03855e-39,2.52575e-42,2.18831e-45,2.90458e-49],Distributions.Beta{Float64}(α=1.5, β=50.0))
 
julia> E(sqrt, dq)
0.15761865929803381

We can then try Monte Carlo integration with many nodes to see how close they are

julia> mean(sqrt.(rand(bd, 100000000)))
0.1576136243615477

and they appear to be in the same ballpark.
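Under the hood, the quadrature approximation is nothing more than a weighted sum over the nodes; if you ever need it without the package, it is a one-liner (a sketch with hypothetical node and weight vectors xs and ws):

# xs, ws: quadrature nodes and weights for the weight function w(x)
approx_expectation(f, xs, ws) = dot(f.(xs), ws)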

To replace, or not to replace

The model we’re considering here is a simple one. It’s a binary choice model very close to the model in Rust (1987). An agent is in charge of maintaining a bus fleet and has a binary choice each month when the buses come in for maintenance: replace the engine (effectively renewing the bus) or maintain it. Replacement costs a fixed price RC, and regular maintenance has a cost that is a linear function of the odometer reading since the last replacement (or purchase if replacement has never occurred). We can use the expressions above to solve this model, but first we need to specify how the odometer reading changes from month to month conditional on the choices made. We assume that the odometer reading (mileage) changes according to the following

\(x_{t+1}=\tilde{a}x_{t}+(1-\tilde{a}x_t)\Delta x, \quad\text{where }\Delta x \sim \text{Beta}(1.5, 50.0)\)

where \(\tilde{a}=2-a\), and as we remember \(a\in\{1,2\}\). As we see, a replacement returns the state to 0 plus whatever mileage might accumulate that month, and regular maintenance means that the bus will end up with an end-of-period mileage between \(x_t\) and 1. To give an idea about the state process, we see the pdf for the distribution of \(\Delta x\) below.
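In code, a single mileage transition is simply the following (a small sketch mirroring the simulation code further down; x and a are assumed to hold the current mileage and the current choice):

using Distributions
Δx = rand(Beta(1.5, 50.0))                 # monthly mileage increment
x_next = (2 - a)*x + (1 - (2 - a)*x)*Δx    # a == 2 (replace) resets the odometer to Δx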

Solving the model

We are now ready to solve the model. Let us say that the planning horizon is 96 months and the monthly discount factor is 0.9. After the 96 months, the bus is scrapped at a value of 2 units of currency, such that

\(W_T(x)=2.0\)

and from here on, we use the recursion from above. First, set up the state space.

using Distributions, DistQuads, Plots
# State space
bd = Beta(1.5, 50.0)
dq = DistQuad(bd, N = 64)
 
Sˡ = 0
Sʰ = 1
 
n_nodes = 100 # arbitrary, but could be varied
nodes = linspace(Sˡ, Sʰ, n_nodes) # doesn't have to be uniformly distributed
 
RC = 11.5 # Replacement cost
c = 9.0 # parameter in linear maintenance cost c*x
β = 0.9 # discount factor

Then, define utility function and expectation operator

u1 = PiecewiseLinear(nodes, x->-c*x) # "continuous" cost of maintenance
u2 = PiecewiseLinear(nodes, x->-RC) # "continuous" cost of replacement (really just a number, but...)
 
# Expected value of f at x today given a where x′ is a possible state next period
Ex(f, x, a, dq) = E(x′->f((2-a)*x.+(Sʰ-(2-a)*x).*x′), dq)

Then, we simply

#### SOLVE
V = Array{PiecewiseLinear,1}(70)
V[70] = PiecewiseLinear(nodes, x->2)
for i = 69:-1:1
    EV1 = PiecewiseLinear(nodes, x->Ex(V[i+1], x, 1, dq))
    EV2 = PiecewiseLinear(nodes, x->Ex(V[i+1], x, 2, dq))
    # W_i(x) = log( exp(u(1,x) + β E[W_{i+1}]) + exp(u(2,x) + β E[W_{i+1}]) )
    V[i] = PiecewiseLinear(nodes, x->log(exp(u1(x)+β*EV1(x))+exp(u2(x)+β*EV2(x))))
end

to get our solution. We can then plot either the integrated value functions or the policies (choice probabilities). We calculate the policies using the following function:

function CCP(x, i)
    EV1 = PiecewiseLinear(nodes, x->Ex(V[i], x, 1, dq))
    EV2 = PiecewiseLinear(nodes, x->Ex(V[i], x, 2, dq))
    # logit probability of maintaining (a = 1) rather than replacing (a = 2)
    1/(1+exp(u2(x)+β*EV2(x)-u1(x)-β*EV1(x)))
end
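The curves mentioned below can be produced along the lines of the following sketch (mirroring the earlier Plots.jl call for the piecewise linear sqrt):

plt = plot()
for i = 1:69
    plot!(plt, x->CCP(x, i), Sˡ, Sʰ, label = "")   # choice probability in period i
end
plt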


We see that there are not 69/70 distinct curves in the plots. This is because we eventually approach the “infinite horizon”/stationary policy and solution.

Simulation

Given the CCPs from above, it is very straightforward to simulate an agent, say from period 1 to period 69.

#### SIMULATE
x0 = 0.0
x = [x0]
a0 = 0
a = [a0]
T = 69
for i = 2:T
   _a = rand()<CCP(x[end], i) ? 1 : 2                     # draw the action: 1 = maintain, 2 = replace
   push!(a, _a)
   push!(x, (2-_a)*x[end]+(Sʰ-(2-_a)*x[end])*rand(bd))    # mileage transition
end
plot(1:T, x, label="mileage")
Is = []
for i in eachindex(a)
    if a[i] == 2
        push!(Is, i)
    end
end
vline!(Is, label="replacement")


Conclusion
This blog post had a look at simple quadrature, creating custom types with call overloading in Julia, and how this can be used to solve a very simple discrete choice model. Interesting extensions are of course to allow for more states, more choices, other shock distributions than extreme value type I, and so on. Let me know if you try to extend the model in any of those directions, and I would love to have a look!

References

  • Aguirregabiria, Victor, and Pedro Mira. “Dynamic discrete choice structural models: A survey.” Journal of Econometrics 156.1 (2010): 38-67.
  • Rust, John. “Optimal replacement of GMC bus engines: An empirical model of Harold Zurcher.” Econometrica: Journal of the Econometric Society (1987): 999-1033.
  • Rust, John. “Maximum likelihood estimation of discrete control processes.” SIAM Journal on Control and Optimization 26.5 (1988): 1006-1024.
    Type-Dispatch Design: Post Object-Oriented Programming for Julia http://www.juliabloggers.com/type-dispatch-design-post-object-oriented-programming-for-julia/ Mon, 29 May 2017 12:57:30 +0000 http://www.stochasticlifestyle.com/?p=668 In this post I am going to try to explain in detail the type-dispatch design which is used in Julian software architectures. It's modeled after the design of many different packages and Julia Base, and has been discussed in parts elsewhere. This is actually just a blog post translation from my "A Deep Introduction to Julia for Data Science and Scientific Computing" workshop notes. I think it's an important enough topic to share more broadly.

    By: Christopher Rackauckas

    Re-posted from: http://www.stochasticlifestyle.com/type-dispatch-design-post-object-oriented-programming-julia/

    In this post I am going to try to explain in detail the type-dispatch design which is used in Julian software architectures. It’s modeled after the design of many different packages and Julia Base, and has been discussed in parts elsewhere. This is actually just a blog post translation from my “A Deep Introduction to Julia for Data Science and Scientific Computing” workshop notes. I think it’s an important enough topic to share more broadly.

    The tl;dr: Julia is built around types. Software architectures in Julia are built around good use of the type system. This makes it easy to build generic code which works over a large range of types and gets good performance. The result is high-performance code that has many features. In fact, with generic typing, your code may have more features than you know of! The purpose of this tutorial is to introduce the multiple dispatch designs that allow this to happen.

    Now let’s discuss the main components of this design!

    Duck Typing

    If it quacks like a duck, it might as well be a duck. This is the idea of defining an object by the way that it acts. This idea is central to type-based designs: abstract types are defined by how they act. For example, a `Number` is some type that can do things like +,-,*, and /. In this category we have things like Float64 and Int32. An AbstractFloat is some floating point number, and so it should have a dispatch of eps(T) that gives its machine epsilon. An AbstractArray is a type that can be indexed like `A[i]`. An AbstractArray may be mutable, meaning it can be “set”: A[i]=v.

    These abstract types then have actions which abstract away from their underlying implementation. A.*B does element-wise multiplication, and in many cases it does not matter what kind of array this is done on. The default is Array which is a contiguous array on the CPU, but this action is common amongst AbstractArray types. If a user has a DistributedArray (DArray), then A.*B will work on multiple nodes of a cluster. If the user uses a `GPUArray`, then A.*B will be performed on the GPU. Thus, if you don’t restrict the usage of your algorithm to Array, then your algorithm actually “just works” as many different algorithms.

    This is all well and good, but this would not be worthwhile if it were not performant. Thankfully, Julia has an answer to this. Every function auto-specializes on the types which it is given. Thus if you look at something like:

    my_square(x) = x^2

    then we see that this function will be efficient for the types that we give it. Looking at the generated code:

    @code_llvm my_square(1)
    define i64 @julia_my_square_72669(i64) #0 {
    top:
      %1 = mul i64 %0, %0
      ret i64 %1
    }
    @code_llvm my_square(1.0)
    define double @julia_my_square_72684(double) #0 {
    top:
      %1 = fmul double %0, %0
      ret double %1
    }

    See that the function which is generated by the compiler is different in each case. The first specifically is an integer multiplication x*x of the input x. The other is a floating point multiplication x*x of the input x. But this means that it does not matter what kind of Number we put in here: this function will work as long as * is defined, and it will be efficient by Julia’s multiple dispatch design.

    Thus we don’t need to restrict the types we allow in functions in order to get performance. That means that

    my_restricted_square(x::Int) = x^2

    is no more efficient than the version above, and actually generates the same exact compiled code:

    @code_llvm my_restricted_square(1)
     
    define i64 @julia_my_restricted_square_72686(i64) #0 {
    top:
      %1 = mul i64 %0, %0
      ret i64 %1
    }

    Thus we can write generic and efficient code by leaving our functions unrestricted. This is the practice of duck-typing functions. We just let them work on any input types. If the type has the correct actions, the function will “just work”. If it does not have the correct actions (say * is undefined, as in our example above), then a MethodError saying the action is not defined will be thrown.

    We can be slightly more conservative by restricting to abstract types. For example:

    my_number_restricted_square(x::Number) = x^2

    will allow any Number. There are things which can square which aren’t Numbers for which this will now throw an error (a matrix is a simple example). But, this can let us clearly define the interface for our package/script/code. Using these assertions, we can then dispatch differently for different type classes. For example:

    my_number_restricted_square(x::AbstractArray) = (println(x);x.^2)

    Now, my_number_restricted_square calculates x^2 on a Number, and for an array it will print the array and calculate x^2 element-wise. Thus we are controlling behavior with broad strokes using classes of types and their associated actions.
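    To make the two dispatches concrete (illustrative calls; the printed output is abbreviated):

    my_number_restricted_square(3)          # -> 9
    my_number_restricted_square(2.0 + 1im)  # -> 3.0 + 4.0im, still a Number
    my_number_restricted_square([1, 2, 3])  # prints the array, then returns [1, 4, 9]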

    Type Hierarchies

    This idea of control leads to type hierarchies. In object-oriented programming languages, you sort objects by their implementation. Fields, the pieces of data that an object holds, are what is inherited.

    There is an inherent limitation to that kind of thinking when looking to achieve good performance. In many cases, you don’t need as much data to do an action. A good example of this is the range type, for example 1:10.

    a = 1:10

    This type is an abstract array:

    typeof(a) <: AbstractArray
    #true

    It has actions like an Array

    fieldnames(a)
     
    # Output
    2-element Array{Symbol,1}:
     :start
     :stop

    It is an immutable type which just holds the start and stop values. This means that its indexing, A[i], is just a function. What’s nice about this is that it means no array is ever created. Creating large arrays can be a costly action:

    @time collect(1:10000000)
    0.038615 seconds (308 allocations: 76.312 MB, 45.16% gc time)

    But creating an immutable type of two numbers is essentially free, no matter what those two numbers are:

    @time 1:10000000
    0.000001 seconds (5 allocations: 192 bytes)

    The array takes \(\mathcal{O}(n)\) memory to store its values while this type is \(\mathcal{O}(1)\), using a constant 192 bytes (if the start and stop are Int64). Yet, in cases where we just want to index values, they act exactly the same.

    Another nice example is the UniformScaling operator, which acts like an identity matrix without forming an identity matrix.

    println(I[10,10]) # prints 1
    println(I[10,2]) # prints 0

    This can calculate expressions like A-b*I without ever forming the matrix (eye(n)), which would take \(\mathcal{O}(n^2)\) memory.

    This means that a lot of efficiency can be gained by generalizing our algorithms to allow for generic typing and organization around actions. This means that, while in an object-oriented programming language you group by implementation details, in typed-dispatch programming you group by actions. Number is an abstract type for “things which act like numbers, i.e. do things like *”, while AbstractArray is for “things which index and sometimes set”.

    This is the key idea to keep in mind when building type hierarchies: things which subtype are inheriting behavior. You should setup your abstract types to mean the existence or non-existence of some behavior. For example:

    abstract AbstractPerson
    abstract AbstractStudent <: AbstractPerson
    abstract AbstractTeacher <: AbstractPerson
     
    type Person <: AbstractPerson
      name::String    
    end
     
    type Student <: AbstractStudent
      name::String  
      grade::Int
      hobby::String
    end
     
    type MusicStudent <: AbstractStudent
      grade::Int
    end
     
    type Teacher <: AbstractTeacher
      name::String
      grade::Int
    end

    This can be interpreted as follows. At the top we have AbstractPerson. Our interface here is “a Person is someone who has a name which can be gotten by get_name”.

    get_name(x::AbstractPerson) = x.name

    Thus codes which are written for an AbstractPerson can “know” (by our informal declaration of the interface) that get_name will “just work” for its subtypes. However, notice that MusicStudent doesn’t have a name field. This is because MusicStudents just want to be named whatever the trendiest band is, so we can just replace the usage of the field by the action:

    get_name(x::MusicStudent) = "Justin Bieber"

    In this way, we can use get_name to get the name, and how it was implemented (whether it’s pulling something that had to be stored from memory, or something magically known in advance) does not matter. We can keep refining this: an AbstractStudent has a get_hobby, but a MusicStudent’s hobby is always music, so there’s no reason to store that data in the type; instead we just have its actions implicitly “know” this. In non-trivial examples (like the range and UniformScaling above), this distinction by action and abstraction away from the actual implementation of the types allows for full optimization of generic codes.
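    To make that concrete:

    get_name(Student("Jane", 10, "chess"))   # -> "Jane", read from the name field
    get_name(MusicStudent(10))               # -> "Justin Bieber", no field needed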

    Small Functions and Constant Propagation

    The next question to ask is, does storing information in functions and actions affect performance? The answer is yes, and in favor of the function approach! To see this, let’s see what happens when we use these functions. To make it simpler, let’s use a boolean function. Teachers are old and don’t like music, while students do like music. But generally people like music. This means that:

    likes_music(x::AbstractTeacher) = false
    likes_music(x::AbstractStudent) = true
    likes_music(x::AbstractPerson) = true

    Now how many records would these people buy at a record store? If they don’t like music, they will buy zero records. If they like music, then they will pick up a random number between 1 and 10. If they are a student, they will then double that (impulsive Millennials!).

    function number_of_records(x::AbstractPerson)
        if !likes_music(x) 
          return 0
        end
        num_records = rand(10)
        if typeof(x) <: AbstractStudent
          return 2num_records
        else 
          return num_records
        end
    end

    Let’s check the code that is created:

    x = Teacher("Randy",11)
    println(number_of_records(x))
    @code_llvm number_of_records(x)

    on v0.6, we get:

    ; Function Attrs: uwtable
    define i64 @julia_number_of_records_63848(i8** dereferenceable(16)) #0 !dbg !5 {
    top:
      ret i64 0
    }

    Notice that the entire function compiled away, and the resulting compiled code is “return 0”! Then for a music student:

    x = MusicStudent(10)
    @code_typed number_of_records(x)
     
    # Output
    CodeInfo(:(begin 
            NewvarNode(:(num_records))
            goto 4 # line 30:
            4:  # line 32:
            SSAValue(1) = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Array{Float64,1}, svec(Any, Int64), Array{Float64,1}, 0, 10, 0))
            # meta: pop location
            # meta: pop location
            # meta: pop location
            # meta: pop location
            num_records = $(Expr(:invoke, MethodInstance for rand!(::MersenneTwister, ::Array{Float64,1}, ::Int64, ::Type{Base.Random.CloseOpen}), :(Base.Random.rand!), :(Base.Random.GLOBAL_RNG), SSAValue(1), :((Base.arraylen)(SSAValue(1))::Int64), :(Base.Random.CloseOpen))) # line 33:
            (MusicStudent <: Main.AbstractStudent)::Bool # line 34:
            return $(Expr(:invoke, MethodInstance for *(::Int64, ::Array{Float64,1}), :(Main.*), 2, :(num_records))) # line 36:
        end))=>Array{Float64,1}

    we get a multiplication by 2, while for a regular person,

    x = Person("Miguel")
    @code_typed number_of_records(x)
     
    # Output
    CodeInfo(:(begin 
            NewvarNode(:(num_records))
            goto 4 # line 30:
            4:  # line 32:
            SSAValue(1) = $(Expr(:foreigncall, :(:jl_alloc_array_1d), Array{Float64,1}, svec(Any, Int64), Array{Float64,1}, 0, 10, 0))
            # meta: pop location
            # meta: pop location
            # meta: pop location
            # meta: pop location
            num_records = $(Expr(:invoke, MethodInstance for rand!(::MersenneTwister, ::Array{Float64,1}, ::Int64, ::Type{Base.Random.CloseOpen}), :(Base.Random.rand!), :(Base.Random.GLOBAL_RNG), SSAValue(1), :((Base.arraylen)(SSAValue(1))::Int64), :(Base.Random.CloseOpen))) # line 33:
            (Person <: Main.AbstractStudent)::Bool
            goto 22 # line 34:
            22:  # line 36:
            return num_records
        end))=>Array{Float64,1}

    we do not get a multiplication by 2. This is all in the compiled code, which means that in one case the *2 simply doesn’t exist at runtime, not even as a check for whether to do it.

    The key thing to see from the typed code is that the “branches” (the if statements) all compiled away. Since types are known at compile time (remember, functions specialize on types), the dispatch of likes_music is known at compile-time. But this means, since the result is directly inferred from the dispatch, the boolean value true/false is known at compile time. This means that the compiler can directly infer the answer to all of these checks, and will use this information to skip them at runtime.

    This is the distinction between compile-time information and runtime information. At compile-time, what is known is:

    1. The types of the inputs
    2. Any types which can be inferred from the input types (via type-stability)
    3. The function dispatches that will be internally called (from types which have been inferred)

    Note that what cannot be inferred by the compiler is the information in fields. Information in fields is strictly runtime information. This is easy to see since there is no way for the compiler to know that person’s name was “Miguel”: that’s ephemeral and part of the type instance we just created.

    Thus by putting our information into our functions and dispatches, we are actually giving the compiler more information to perform more optimizations. Using this “action-based design”, we give the compiler leeway to perform many extra optimizations on our code, as long as we define our interfaces by the actions that are used. Of course, at the “very bottom” our algorithms have to use the fields of the types, but the full interface can then be built up using a simple set of functions which in many cases will replace runtime data with constants.
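    To see the contrast, consider a hypothetical version where the trait is stored as runtime data instead of being encoded in a method (PersonWithFlag is not part of the example above, just an illustration):

    type PersonWithFlag <: AbstractPerson
        name::String
        likes_music::Bool   # runtime data: the compiler cannot fold branches on this
    end
     
    likes_music(x::PersonWithFlag) = x.likes_music   # value only known at runtime, so the checks stay in the compiled code

    With this layout the branch in number_of_records has to be executed every time, because the answer lives in the instance rather than in the type.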

    Traits and THTT

    What we just saw is a “trait”. Traits are compile-time designations about types which are distinct from their abstract hierarchy. likes_music is a trait which designates which people like music, and in some cases it need not follow the abstract types at all. For example, using dispatch we can create a WeirdStudent which does not like music, and that will still be compile-time information which is fully optimized. This means that these small functions with constant return values allow for compile-time inheritance of behavior, and these traits don’t have to be tied to abstract types (all of our examples were on AbstractPerson, but we could’ve said a GPUArray likes music if we felt like it!). Traits are multiple inheritance for type systems.
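    For instance, a sketch of that hypothetical WeirdStudent (not a type defined earlier in this post):

    type WeirdStudent <: AbstractStudent
      name::String
    end
     
    likes_music(x::WeirdStudent) = false   # overrides the AbstractStudent default, still resolved at compile time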

    Traits can be more refined than just true/false. This can be done by having the return be a type itself. For example, we can create music genre types:

    abstract MusicGenres
    abstract RockGenre <: MusicGenres
    immutable ClassicRock <: RockGenre end
    immutable AltRock <: RockGenre end
    immutable Classical <: MusicGenres end

    These “simple types” are known as singleton types. This means that we can have traits like:

    favorite_genre(x::AbstractPerson) = ClassicRock()
    favorite_genre(x::MusicStudent) = Classical()
    favorite_genre(x::AbstractTeacher) = AltRock()
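    As a small illustration of how such a trait can be consumed, a function can dispatch on the trait’s return value (suggest_album and the album titles are hypothetical, not part of the original example):

    suggest_album(x::AbstractPerson) = suggest_album(favorite_genre(x), x)   # dispatch on the trait value
    suggest_album(::ClassicRock, x) = "Led Zeppelin IV"
    suggest_album(::AltRock, x) = "OK Computer"
    suggest_album(::Classical, x) = "The Four Seasons"

    Because favorite_genre returns a singleton whose type is known at compile time, the inner dispatch resolves statically, just like likes_music did above.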

    This gives us all of the tools we need to compile the most efficient code, and structure our code around types/actions/dispatch to get high performance out. The last thing we need is syntactic sugar. Since traits are compile-time information, the compiler could in theory dispatch on them. While this is currently not part of Julia, it’s scheduled to be part of a future version of Julia (2.0?). The design for this (since Julia is written in Julia!) is known as the Tim Holy Trait Trick (THTT), named after its inventor. It’s described in detail on this page. But in the end, macros can make this easier. A package which implements trait-dispatch is SimpleTraits.jl, which allows you to dispatch on a trait IsNice like:

    @traitfn ft(x::::IsNice) = "Very nice!"
    @traitfn ft(x::::(!IsNice)) = "Not so nice!"

    Composition vs Inheritance

    The last remark that is needed is a discussion of composition vs inheritance. The previous discussions have all explained why “information not in fields” makes structural relations compile-time information and increases efficiency. However, there are cases where we want to share runtime structure, and thus the great debate of composition vs inheritance comes up.

    Composition vs inheritance isn’t a Julia issue; it’s a long-running debate in object-oriented programming. The idea is that inheritance is inherently (pun intended) inflexible. It forces an “is a” relation: A inherits from B means A is a B, and adds a few things. It copies behavior from something defined elsewhere. This is a recipe for havoc. Here are a few links which discuss this in more detail:

    https://softwareengineering.stackexchange.com/questions/134097/why-should-i-prefer-composition-over-inheritance

    https://en.wikipedia.org/wiki/Composition_over_inheritance

    https://www.thoughtworks.com/insights/blog/composition-vs-inheritance-how-choose

    So if possible, give composition a try. Say you have MyType, and it has some function f defined on it. This means that you can extend MyType by making it a field in another type:

    type MyType2
        mt::MyType
        ... # Other stuff
    end 
     
    f(mt2::MyType2) = f(mt2.mt)

    The pro here is that it’s explicit: you’ve made the choice for each extension. The con is that this can require some extra code, though this can be automated by metaprogramming.
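    For example, a sketch of how that automation might look (g and h are stand-ins for additional functions already defined on MyType):

    for op in (:f, :g, :h)
        @eval $op(mt2::MyType2) = $op(mt2.mt)   # generate one forwarding method per function
    end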

    What if you really really really want inheritance of fields? There are solutions via metaprogramming. One simple solution is the @def macro.

      macro def(name, definition)
          return quote
              macro $name()
                  esc($(Expr(:quote, definition)))
              end
          end
      end

    This macro is very simple. What it does is compile-time copy/paste. For example:

    @def give_it_a_name begin
      a = 2
      println(a)
    end

    defines a macro @give_it_a_name that will paste in those two lines of code wherever it is used. As another example, the reused fields of Optim.jl’s solvers could be put into an @def:

    @def add_generic_fields begin
            method_string::String
            n::Int64
            x::Array{T}
            f_x::T
            f_calls::Int64
            g_calls::Int64
            h_calls::Int64
    end

    and those fields can be copied around with

    type LBFGSState{T}
        @add_generic_fields
        x_previous::Array{T}
        g::Array{T}
        g_previous::Array{T}
        rho::Array{T}
        # ... more fields ... 
    end

    Because @def works at compile-time, there is no cost associated with this. Similar metaprogramming can be used to build an “inheritance feature” for Julia. One package which does this is ConcreteAbstractions.jl which allows you to add fields to abstract types and make the child types inherit the fields:

    # The abstract type
    @base type AbstractFoo{T}
        a
        b::Int
        c::T
        d::Vector{T}
    end
     
    # Inheritance
    @extend type Foo <: AbstractFoo
        e::T
    end

    where the @extend macro generates the type-definition:

    type Foo{T} <: AbstractFoo
        a
        b::Int
        c::T
        d::Vector{T}
        e::T
    end

    But it’s just a package? Well, that’s the beauty of Julia. Most of Julia is written in Julia, and Julia code is first class and performant (here, this is all at compile-time, so again runtime is not affected at all). Honestly, if something ever gets added to Julia’s Base library for this, it will likely look very similar, and the only real difference to the user will be that the compiler will directly recognize the keywords, meaning you would use base and extend instead of @base and @extend. So if you have something that really really really needs inheritance, go for it: there are no downsides to using a package + macro for this. But you should really try other means to reduce the runtime information and build a more performant and more Julian architecture first.

    Conclusion

    Programming for type systems has a different architecture than object-oriented systems. Instead of being oriented around the objects and their fields, type-dispatch systems are oriented around the actions of types. Using traits, multiple inheritance behavior can be given. Using this structure, the compiler can have maximal information, and use this to optimize the code. But also, this directly generalizes the vast majority of the code to not be “implementation-dependent”, allowing for duck-typed code to be fully performant, with all of the details handled by dispatch/traits/abstract types. The end result is flexible, generic, and high performance code.

    The post Type-Dispatch Design: Post Object-Oriented Programming for Julia appeared first on Stochastic Lifestyle.

    ]]> 3669 Julia has an issue… http://www.juliabloggers.com/julia-has-an-issue/ Sun, 28 May 2017 15:00:00 +0000 http://kristofferc.github.io/post/julia_issue/ As most of us in the Julia community know, Julia has an issue. In fact, looking at the Julia repository we see that there are (at time of writing) 11865 issues, where 1872 are open and 9993 are closed. An interesting question to ask is:

    • How has the ratio between open and closed issues varied over the development of Julia? And how about for pull requests (PRs)?

    In this post, the aim is to answer the question above using the data that can be scraped from the GitHub repo.

    Getting the data

    A large amount of data for a repository hosted on GitHub can be found via the GitHub API. The GitHub.jl package provides a convenient interface to communicate with this API from Julia.

    Getting the data for the issues (and PRs) is as simple as:

    import GitHub
    myauth = GitHub.authenticate(ENV["GITHUB_AUTH"])    
    repo = GitHub.repo("JuliaLang/julia")
    myparams = Dict("state" => "all", "per_page" => 100, "page" => 1);
    issues, page_data = GitHub.issues(repo; params = myparams, auth = myauth)
    

    where I have created a “GitHub access token” and saved it in the environment variable GITHUB_AUTH. Note here that the function name GitHub.issues is a bit of a misnomer. What is returned is actually both issues and PRs. The variable issues now contains a Vector of all the issues and PRs made to the Julia repo. Each element in the vector is an Issue, which is a struct containing fields corresponding to the keys of the returned JSON from the GitHub REST API. To avoid having to rescrape the data every time Julia is started, it would be nice to store it to disk. The standard way of storing Julia data is by using the JLD.jl package. Unfortunately, JLD.jl has some problems handling Nullable’s (see this issue). However, there is an unregistered package called JLD2.jl that does support Nullable’s. The code below uses JLD2 to save the issues to a .jld file and then reload them again:

    using JLD2
    
    function save_issues(issues)
        f = jldopen("issues.jld", "w")
        write(f, "issues", issues)
        close(f)
    end
    
    function load_issues()
        f = jldopen("issues.jld", "r")
        issues = read(f, "issues")
        close(f)
        return issues
    end
    

    I put up the resulting .jld file here if you don’t feel like doing the scraping yourself.
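    With those helpers, a typical session can persist and restore the scraped data like this (just a usage sketch):

    save_issues(issues)      # writes issues.jld to the working directory
    issues = load_issues()   # reads it back in a later session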

    Digression: a DateVector

    In this post we will deal with Vector’s that are naturally indexed by Date’s instead of standard integers starting at 1. Therefore, using the powerful AbstractVector interface we can easily create a Vector that can be indexed with single Date’s, ranges of Date’s, etc.

    The implementation below should be fairly self-explanatory. We create the DateVector struct wrapping a Vector and two Date’s (the start and end date) and define the minimal set of methods needed. Vector’s with non-conventional indices are still quite new in Julia, so not everything works perfectly with them. Here, we just implement the functionality needed to set and retrieve data using indexing with Dates.

    struct DateVector{T} <: AbstractVector{T}
        v::Vector{T}
        startdate::Date
        enddate::Date
    
        function DateVector(v::Vector{T}, startdate::Date, enddate::Date) where T
            len = (enddate - startdate) ÷ Dates.Day(1) + 1
            if length(v) != len
                throw(ArgumentError("length of vector v $(length(v)) not equal to date range $len"))
            end
            new{T}(v, startdate, enddate)
        end
    end
    
    Base.endof(dv::DateVector) = dv.enddate
    Base.indices(dv::DateVector) = (dv.startdate:dv.enddate,)
    Base.getindex(dv::DateVector, date::Date) = dv.v[(date - dv.startdate) ÷ Dates.Day(1) + 1]
    Base.setindex!(dv::DateVector, v, date::Date) = dv.v[(date - dv.startdate) ÷ Dates.Day(1) + 1] = v
    Base.checkindex(::Type{Bool}, d::Range{Date}, v::Range{Date}) = length(v) == 0 || (first(v) in d && last(v) in d)
    Base.Array(dv::DateVector) = dv.v
    

    The DateVector can now be seen in action by, for example, indexing into it with a Range of Date’s:

    julia> v = rand(10);
    
    julia> dv = DateVector(v, Date("2015-01-01"), Date("2015-01-10"));
    
    julia> dv[Date("2015-01-02"):Date("2015-01-05")]
    4-element Array{Float64,1}:
     0.299136
     0.898991
     0.0626245
     0.585839
    
    julia> v[2:5]
    4-element Array{Float64,1}:
     0.299136
     0.898991
     0.0626245
     0.585839
    

    Total number of opened and closed issues / PRs over time

    For a given issue we can check the time it was created, what state it is in (open / closed), what time it was eventually closed and if it is, in fact, a pull request and not an issue:

    julia> get(issues[1].created_at)
    2017-05-23T18:09:45
    
    julia> get(issues[1].state)
    "open"
    
    julia> isnull(issues[1].closed_at)
    true
    
    julia> isnull(issues[1].pull_request) # pull_request is null so this is an issue
    true
    

    It is now quite easy to write a function that takes an issue and returns two Date-intervals, the first for when the issue was opened, and the second for when it was closed up until today. If the issue is still open, we make sure to return an empty interval for the closed period.

    function open_closed_range(issue)
        opened = Date(get(issue.created_at))
        if get(issue.state) == "closed"
            closed = Date(get(issue.closed_at))
        else
            closed = Date(now())
        end
        return opened:closed, (closed + Dates.Day(1)):Date(now())
    end
    

    As an example, we can test this function on an issue:

    julia> open_closed_range(issues[200])
    (2017-05-13:1 day:2017-05-14, 2017-05-15:1 day:2017-05-29)
    

    So, here we see that the issue was opened between 2017-05-13 and 2017-05-14 (and then got closed). Now, we can simply create two DateVector’s: one that will contain the total number of opened issues / PRs and the other the total number of closed issues / PRs for a given date:

    function count_closed_opened(PRs::Bool, issues)
        min_date = Date(get(minimum(issue.created_at for issue in issues)))
        days_since_min_date = (Date(now()) - min_date) ÷ Dates.Day(1) + 1
        closed_counter = DateVector(zeros(Int, days_since_min_date), min_date, Date(now()))
        open_counter = DateVector(zeros(Int, days_since_min_date), min_date, Date(now()))
    
        for issue in filter(x -> !isnull(x.pull_request) == PRs, issues)
            open_range, closed_range = open_closed_range(issue)
            closed_counter[closed_range] += 1
            open_counter[open_range] += 1
        end
    
        return open_counter, closed_counter
    end
    

    For a given date, the number of opened and closed issues are now readily available:

    julia> opened, closed = count_closed_opened(false, issues);
    
    julia> opened[Date("2016-01-1")]
    288
    
    julia> closed[Date("2016-01-1")]
    5713
    

    We could plot these now using one of our favorite plotting packages. I am personally a LaTeX fan and one of the popular plotting packages for LaTeX is pgfplots. PGFPlotsX.jl is a Julia package that makes it quite easy to interface with pgfplots so this is what I used here. The total number of open and closed issues for different dates is shown below.
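    The series behind the plots can be assembled directly from the DateVector’s above; a sketch (the variable names here are mine):

    dates = opened.startdate:Dates.Day(1):opened.enddate
    open_counts = [opened[d] for d in dates]     # cumulative open count per day
    closed_counts = [closed[d] for d in dates]   # cumulative closed count per day
    ratios = open_counts ./ closed_counts        # open-to-closed ratio, only meaningful once closed_counts > 0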

    We can see that in the early development of Julia, PRs were not really used. Also, the number of closed issues and the number of closed PRs seem to grow at approximately the same rate. A noticeable difference is in the number of open issues and PRs: open PRs accumulate significantly more slowly than open issues. This is likely because an open PR becomes stale quite quickly, while an issue can take a long time to fix or may not really have a clear actionable purpose.

    Let’s plot the ratio between open and closed issues / PRs:

    To reduce some of the noise, I started the plot at 2013-06-01. The ratio of open to closed issues seems to slowly increase, while for PRs the ratio has stabilized at around 0.05.

    Conclusions

    Using the GitHub API it is quite easy to do some data analysis on a repo. It is hard to say how much actual usefulness can be extracted from the data here, but sometimes it is fun to just hack on data. Possible future work could be to do the same analysis for other programming language repos. Right now, looking at e.g. the Rust repo, they have an open to closed PR ratio of 63 / 21200 ≈ 0.003, which is roughly 20 times lower than Julia’s. Does this mean that the Julia community needs to be better at reviewing PRs to make sure they eventually get merged / closed? Or is the barrier to opening a PR to Rust higher, so that only PRs with a high chance of being merged get opened?

    ]]>
    By: Kristoffer Carlsson on Kristoffer Carlsson

    Re-posted from: http://kristofferc.github.io/post/julia_issue/


    ]]>
    3667
    Julia Ranks Among Top 10 Programming Languages on GitHub http://www.juliabloggers.com/julia-ranks-among-top-10-programming-languages-on-github/ Thu, 25 May 2017 00:00:00 +0000 http://juliacomputing.com/press/2017/05/25/github Cambridge, MA – Julia ranks as one of the top 10 programming languages on GitHub as measured by the number of GitHub stars and the number of GitHub repositories.

    GitHub users ‘star’ a repository in order to show appreciation and create a bookmark for easy access.

    Julia ranks #10 in GitHub stars and #8 in number of repositories

    Top GitHub Programming Languages

    Rank Language GitHub Stars Number of Repositories
    1 Swift 38,513 5,755
    2 Go 28,230 3,770
    3 TypeScript 22,248 3,230
    4 Rust 21,938 4,072
    5 CoffeeScript 13,972 1,909
    6 Kotlin 13,074 1,219
    7 Ruby 12,212 3,401
    8 PHP 11,989 4,037
    9 Elixir 10,103 1,399
    10 Julia 8,651 1,982
    11 Scala 8,254 2,092
    12 Crystal 8,161 656
    13 Python 7,835 1,294
    14 Roslyn 7,538 1,851
    15 PowerShell 7,010 913

    Ranked by Number of GitHub Stars

    About Julia Computing and Julia

    Julia Computing (JuliaComputing.com) was founded in 2015 by the co-creators of the open source Julia language to develop products and provide support for businesses and researchers who use Julia.

    Julia is the fastest modern high performance open source computing language for data, analytics, algorithmic trading, machine learning and artificial intelligence. Julia combines the functionality and ease of use of Python, R, Matlab, SAS and Stata with the speed of Java and C++. Julia delivers dramatic improvements in simplicity, speed, capacity and productivity. With more than 1 million downloads and +161% annual growth, Julia adoption is growing rapidly in finance, energy, robotics, genomics and many other fields.

    1. Julia is lightning fast. Julia provides speed improvements up to 1,000x for insurance model estimation, 225x for parallel supercomputing image analysis and 11x for macroeconomic modeling.

    2. Julia is easy to learn. Julia’s flexible syntax is familiar and comfortable for users of Python, R and Matlab.

    3. Julia integrates well with existing code and platforms. Users of Python, R, Matlab and other languages can easily integrate their existing code into Julia.

    4. Elegant code. Julia was built from the ground up for mathematical, scientific and statistical computing, and has advanced libraries that make coding simple and fast, and dramatically reduce the number of lines of code required – in some cases, by 90% or more.

    5. Julia solves the two language problem. Because Julia combines the ease of use and familiar syntax of Python, R and Matlab with the speed of C, C++ or Java, programmers no longer need to estimate models in one language and reproduce them in a faster production language. This saves time and reduces error and cost.

    Julia users, partners and employers looking to hire Julia programmers in 2017 include: Google, Apple, Amazon, Facebook, IBM, Intel, Microsoft, BlackRock, Capital One, PwC, Ford, Oracle, Comcast, DARPA, Moore Foundation, Federal Reserve Bank of New York (FRBNY), UC Berkeley Autonomous Race Car (BARC), Federal Aviation Administration (FAA), MIT Lincoln Labs, Nobel Laureate Thomas J. Sargent, Brazilian National Development Bank (BNDES), Conning, Berkery Noyes, BestX, Path BioAnalytics, Invenia, AOT Energy, AlgoCircle, Trinity Health, Gambit, Augmedics, Tangent Works, Voxel8, Massachusetts General Hospital, NaviHealth, Farmers Insurance, Pilot Flying J, Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center (NERSC), Oak Ridge National Laboratory, Los Alamos National Laboratory, Lawrence Livermore National Laboratory, National Renewable Energy Laboratory, MIT, Caltech, Stanford, UC Berkeley, Harvard, Columbia, NYU, Oxford, NUS, UCL, Nantes, Alan Turing Institute, University of Chicago, Cornell, Max Planck Institute, Australian National University, University of Warwick, University of Colorado, Queen Mary University of London, London Institute of Cancer Research, UC Irvine, University of Kaiserslautern, University of Queensland.

    Julia is being used to: analyze images of the universe and research dark matter, drive parallel supercomputing, diagnose medical conditions, provide surgeons with real-time imagery using augmented reality, analyze cancer genomes, manage 3D printers, pilot self-driving racecars, build drones, improve air safety, manage the electric grid, provide analytics for foreign exchange trading, energy trading, insurance, regulatory compliance, macroeconomic modeling, sports analytics, manufacturing, and much, much more.

    ]]>
    By: Julia Computing, Inc.

    Re-posted from: http://juliacomputing.com/press/2017/05/25/github.html


    ]]>
    3660
    Filling In The Interop Packages and Rosenbrock http://www.juliabloggers.com/filling-in-the-interop-packages-and-rosenbrock/ Thu, 18 May 2017 01:30:00 +0000 http://juliadiffeq.org/2017/05/18/Filling_in.html ]]> By: JuliaDiffEq

    Re-posted from: http://juliadiffeq.org/2017/05/18/Filling_in.html

    In the 2.0 state of the ecosystem post it was noted that, now that we have a clearly laid out and expansive common API, the next goal is to fill it in. This set of releases tackles the lowest-hanging fruit in that battle. Specifically, the interop packages were set up to be as complete in their interfaces as possible, and the existing methods which could expand were expanded. Time for specifics.

    ]]>
    3656
    Exploring Fibonacci Fractions with Julia http://www.juliabloggers.com/exploring-fibonacci-fractions-with-julia/ Wed, 17 May 2017 11:48:48 +0000 http://perfectionatic.org/?p=367 ]]> By: perfectionatic

    Re-posted from: http://perfectionatic.org/?p=367

    Recently, I came across a fascinating blog and video from Mind you Decisions. It is about how a fraction
    \frac{1}{999{,}999{,}999{,}999{,}999{,}999{,}999{,}998{,}999{,}999{,}999{,}999{,}999{,}999{,}999{,}999}
    would show the Fibonacci numbers in order when looking at its decimal output.

    On a spreadsheet and in most standard programming languages, such output cannot be attained due to the limited precision of floating point numbers. If you try this in R or Python, you will get an output of 1e-48.
    Wolfram Alpha, however, allows arbitrary precision.

    In Julia, by default, we get a little better than R and Python:

    julia> 1/999999999999999999999998999999999999999999999999
    1.000000000000000000000001000000000000000000000002000000000000000000000003000002e-48
     
    julia> typeof(ans)
    BigFloat

    We observe here that we are getting the first few Fibonacci numbers 1, 1, 2, 3. We need more precision to get more numbers. Julia has arbitrary precision arithmetic baked into the language. We can crank up the precision of the BigFloat type on demand. Of course, the higher the precision, the slower the computation and the greater the memory we use. We do that by setprecision.

    julia> setprecision(BigFloat,10000)
    10000

    Reevaluating, we get

    julia> 1/999999999999999999999998999999999999999999999999
    1.00000000000000000000000100000000000000000000000200000000000000000000000300000000000000000000000500000000000000000000000800000000000000000000001300000000000000000000002100000000000000000000003400000000000000000000005500000000000000000000008900000000000000000000014400000000000000000000023300000000000000000000037700000000000000000000061000000000000000000000098700000000000000000000159700000000000000000000258400000000000000000000418100000000000000000000676500000000000000000001094600000000000000000001771100000000000000000002865700000000000000000004636800000000000000000007502500000000000000000012139300000000000000000019641800000000000000000031781100000000000000000051422900000000000000000083204000000000000000000134626900000000000000000217830900000000000000000352457800000000000000000570288700000000000000000922746500000000000000001493035200000000000000002415781700000000000000003908816900000000000000006324598600000000000000010233415500000000000000016558014100000000000000026791429600000000000000043349443700000000000000070140873300000000000000113490317000000000000000183631190300000000000000297121507300000000000000480752697600000000000000777874204900000000000001258626902500000000000002036501107400000000000003295128009900000000000005331629117300000000000008626757127200000000000013958386244500000000000022585143371700000000000036543529616200000000000059128672987900000000000095672202604100000000000154800875592000000000000250473078196100000000000405273953788100000000000655747031984200000000001061020985772300000000001716768017756500000000002777789003528800000000004494557021285300000000007272346024814100000000011766903046099400000000019039249070913500000000030806152117012900000000049845401187926400000000080651553304939300000000130496954492865700000000211148507797805000000000341645462290670700000000552793970088475700000000894439432379146400000001447233402467622100000002341672834846768500000003788906237314390600000006130579072161159100000009919485309475549700000016050064381636708800000025969549691112258500000042019614072748967300000067989163763861225800000110008777836610193100000177997941600471418900000288006719437081612000000466004661037553030900000754011380474634642900001220016041512187673800001974027421986822316700003194043463499009990500005168070885485832307200008362114348984842297700013530185234470674604900021892299583455516902600035422484817926191507500057314784401381708410100092737269219307899917600150052053620689608327700242789322839997508245300392841376460687116573000635630699300684624818301028472075761371741391301664102775062056366209602692574850823428107600904356677625885484473810507049252476708912581411411405930102594397055221918455182579303309636633329861112681897706691855248316295261201016328488578177407943098723020343826493703204299739348832404671111147398462369176231164814351698201718008635835925499096664087184867000739850794865805193502836665349891529892378369837405200686395697571872674070550577925589950242511475751264321287522115185546301842246877472357697022053e-48

    That is looking much better. However, it would be nice if we could extract the Fibonacci numbers that are buried in that long decimal. Using the approach in the original blog, we define a function

    y(x)=one(x)-x-x^2

    and calculate the decimal

    a=1/y(big"1e-24")

    Here we use the non-standard string literal big"..." to ensure proper interpretation of our input. Using BigFloat(1e-24) would first construct a floating point number with limited precision and then do the conversion. The initial loss of precision would not be recovered in the conversion, hence the use of big. Now we extract our Fibonacci numbers with the following function:

    function extract_fib(a)
       x=string(a)
       l=2
       fi=BigInt[]
       push!(fi,1)
       for i=1:div(length(x)-24,24)
            j=parse(BigInt,x[l+1+(i-1)*24:l+i*24])
            push!(fi,j)
       end
       fi
    end

    Here we first convert our very long decimal number to a string, and then we exploit the fact that the Fibonacci numbers occur in blocks that are 24 digits in length. We get our output in an array of BigInt. To compare the output with exact Fibonacci numbers, we do a quick non-recursive implementation.

    function fib(n)
        f=Vector{typeof(n)}(n+1)
        f[1]=f[2]=1;
        for i=3:n+1
           f[i]=f[i-1]+f[i-2]
        end
        f
    end

    Now we compare…

    fib_exact=fib(200);
    fib_frac=extract_fib(a);
    for i in eachindex(fib_frac)
         println(fib_exact[i], " ", fib_exact[i]-fib_frac[i])
    end

    We get a long sequence, we just focused here on when the discrepancy happens.

    ...
    184551825793033096366333 0
    298611126818977066918552 0
    483162952612010163284885 0
    781774079430987230203437 -1
    1264937032042997393488322 999999999999999999999998
    2046711111473984623691759 1999999999999999999999997
    ...

    The output shows that just before the extracted Fibonacci number exceeds 24 digits, a discrepancy occurs. I am not quite sure why, but this was a fun exploration. Julia allows me to do mathematical explorations that would take one or even two orders of magnitude more effort in any other language.

    ]]>
    3653
    Where are the Julians? http://www.juliabloggers.com/where-are-the-julians/ Wed, 17 May 2017 00:00:00 +0000 https://juliohm.github.io/science/where-are-the-julians/

    JuliaCon 2017 is approaching and I thought it would be interesting to map the Julians across the globe. (Interactive map not reproduced here.)
    By: Júlio Hoffimann

    Re-posted from: https://juliohm.github.io/science/where-are-the-julians/



    Instructions

    • Pan and zoom with the mouse.
    • Click on a bubble to open profiles on GitHub.
    • Alt + click to remove a bubble.

    You may need to unblock popups in your browser to have multiple profiles opening as tabs. Removing a bubble can be useful for revealing other bubbles.

    Julians are presented in decreasing order of contributions. An arc is drawn between locations X and Y in the map whenever a Julian in X and a Julian in Y have contributed to a common package.

    Want to be on the map?

    If your nickname is listed below and you want to appear on the map, please consider typing your address on GitHub:

    Data

    The data was extracted from METADATA. It only includes members
    of the community that have contributed to a registered Julia package (e.g. issues, pull requests) up
    until 16-May-2017.

    The Jupyter notebook used for data extraction is available in our
    JuliaGraphsTutorials repository.

    Facts

    • Russia and China have an unexpectedly low number of bubbles.
    • The number of outgoing arcs from India is great.
    • Less developed countries are slowly adopting the language.

    Say hello to a Julian near you. #JuliansInTheGlobe

    ]]>
    3651
    Video Blog: Developing and Editing Julia Packages http://www.juliabloggers.com/video-blog-developing-and-editing-julia-packages/ Tue, 16 May 2017 20:37:57 +0000 http://www.stochasticlifestyle.com/?p=660 ]]> By: Christopher Rackauckas

    Re-posted from: http://www.stochasticlifestyle.com/video-blog-developing-editing-julia-packages/

    Google Summer of Code is starting up, so I thought it would be a good time to share my workflow for developing my own Julia packages, as well as my workflow for contributing to other Julia packages. This does not assume familiarity with command-line Git, and instead shows you how to use a GUI (GitKraken) to make branches and PRs, as well as how to review and merge code. You can think of it as an update to my old blog post on package development in Julia. However, this is not only updated but also improved, since I am now able to walk through the “non-code” parts of package development (such as setting up AppVeyor and code coverage).

    Enjoy! (I quite like this video blog format: it was a lot less work)

    The post Video Blog: Developing and Editing Julia Packages appeared first on Stochastic Lifestyle.

    ]]>
    3646
    Inference Convergence algorithm in Julia – revisited http://www.juliabloggers.com/inference-convergence-algorithm-in-julia-revisited/ Mon, 15 May 2017 00:00:00 +0000 http://juliacomputing.com/blog/2017/05/15/inference-converage2 Adventures in Type Inference Convergence: 2017 edition

    Corrected Convergence

    In my last post on type inference convergence, I described a correct inference algorithm. However, while this was a tremendous improvement over being wrong, I was unhappy with it. Indeed, I wrote that “a full discussion of these heuristics will have to wait for a future blog post”. But what I didn’t say is that I had already written many notes for that post, and had reached a point where I understood that the current algorithm wouldn’t actually permit reliable, well-tuned heuristics.

    Plus, the algorithm required lots of global state (to manage the work-queue), which required tricky locks and coordination to ensure it stayed consistent.

    Convergence Algorithm 2.0

    The improved convergence algorithm in PR #21677 maintains all of the correctness guarantees of the existing algorithm, but uses a completely revised cycle-detection algorithm that provides much stronger guarantees about the order in which work will be done.

    By comparison to the current algorithm, the revised one has half as many states. They are:

    1. processing: any leaf function of the call-stack (usually there’s just one at a time)
    2. on the call-stack: the tree DAG (directed-acyclic-graph) of functions with edges representing inferred potential calls
    3. in a cycle: groups of function nodes discovered not to form a DAG (represented instead as an unsorted set)
    4. finished: done

    The outline of the algorithm that manages these states is that the call-stack is always maintained as a simple tree, only permitting the existence of nodes with forward edges. During the course of running inference, if adding an edge would cause a cycle in this graph, the algorithm instead replaces all nodes that are participating in the cycle with a single node that represents that cyclic set.
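    A toy sketch of that cycle-collapsing bookkeeping in plain Julia (this is not the compiler’s actual data structure, just an illustration of the idea):

    # Each frame on the call-stack represents either a single function or a merged cycle set.
    struct Frame
        name::Symbol
        cycle::Set{Symbol}
    end
     
    # Adding an edge to `callee`: if it is not on the stack, push a new leaf (the stack stays a tree);
    # otherwise everything from that frame upward forms a cycle and is collapsed into one node.
    function add_edge!(stack::Vector{Frame}, callee::Symbol)
        i = findfirst(fr -> callee in fr.cycle, stack)
        if i == 0    # not found (0.6-era findfirst convention)
            push!(stack, Frame(callee, Set([callee])))
        else
            merged = Frame(stack[i].name, union((fr.cycle for fr in stack[i:end])...))
            resize!(stack, i - 1)
            push!(stack, merged)
        end
        return stack
    end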

    Conceptually, this set is similar to either the “fixed-point” set in the current algorithm, or the “active lines” set. It differs from the “fixed-point” set because it can also guarantee that only nodes in the cycle will be in the set. The “active lines” set only allows describing the active set within a single method – the new representation expands that to iterate convergence of an entire set of methods.

    Because the algorithm maintains an acyclic call-stack, it now becomes possible to rely on the inspection of an edge in inference producing exactly one of two results: an inferred return type, or a cycle detection that replaces the current node with a convergence set. This greatly increases the options available for integrating new inference heuristics!

    There’s one additional nice benefit to this new representation: improved inlining heuristics! It may not be obvious how improved cycle-detection during inference convergence would impact the optimization afterwards, but one of the unintuitive products of inference is the inlining order. Deciding on the profitability of inlining is dependent upon knowing the precise structure (ordering) of the call-stack and any cycles in it. Since the new inference structure provides precise information on which functions potentially participate in cycles, the inliner can take advantage of that information to decide which functions to inline, and where to terminate inlining due to recursion.

    Undecidability Heuristics

    Without heuristics imposed on top of the existing inference algorithm, it would already be Turing-complete, with just the current support for recursively inspecting dispatch over computed types.

    This means that (without some additional constraints added) inference might attempt to compute anything (including the halting problem, busy beaver, infinite recursion). In some cases, code is written that explicitly tries to make use of this using “lisp-y style” function application. However, code written in that manner often quickly runs into limits that are inherently required by the compiler to avoid other very similar code patterns where the inferred recursion is either unintentional (it won’t affect runtime performance / behavior) or unreachable (solving for reachability is commonly referred to as the halting problem).

    It is suboptimal for this inference system to be undecidable (or even slow), since its sole purpose is to find optimizations to make your program run faster. And inference is going to be painfully slow when it is being used as an interpreter (rather than just running the program unspecialized). So, there needs to be some mechanism or heuristics for preventing this from happening.

    It’s also important to realize that these heuristics need to be tuned precisely to the capabilities of inference.

    Turing-completeness requires several features:

    • loops
    • conditionals
    • arbitrarily large (but finite) memory

    Removing any one of these would be sufficient to ensure the system is computable. But loops (recursion), conditionals (dispatch), and memory (apply-type) are all essential to getting useful results from inference and thus it is generally not desirable to simply remove them. Additionally, even if it was possible now, as inference becomes more capable (e.g. inter-procedural constant propagation, effect-free computation, speculative evaluation of constant arithmetic), new forms of loops and memory would appear that would also need to be addressed.

    Current limits

    Since Julia’s inference algorithm currently only operates on Julia’s Type’s, (and does not support inference of constant expressions like 1 + 1), we only need to consider how these can be used to express an arbitrarily large memory. Imply both that it can form arbitrarily large constants, and do arithmetic (conditional loops) on them.

    There are a finite number of type names. But there are two primary mechanisms by which a type can be used to express an arbitrary value. For any type, the parameters can be arbitrarily nested in depth. Additionally, there are tuples and unions, which (in addition to depth), can have arbitrary length. We can express this complexity as:

    depth(type) = 1 + maximum(depth, type.parameters)
    length(type) = length(type.parameters)
    

    It is possible, using either property (or a combination of both), to represent any number.

    The current heuristics in inference prohibit recursion which increases either depth or length. without this ability, the program cannot express any number larger than the input, forcing it to (eventually) terminate. Additionally, there are various (semi-arbitrary) limits places on the lengths and depths of the types that are flowing through the system.

    Independence of the cached result

    For predictability, it is preferable that the heuristic limits imposed should only be local properties, computable from the function itself or its forward edges (the functions that it calls). This remains an unsolved problem, since the heuristics are based on examining the call-stack (following back-edges). This means that the order in which functions enter inference can affect how precisely they are permitted to compute their results.

    heuristics

    (A diagram showing how the context-sensativity of the specialization type signature can create repeating patterns instead of simple recursion)

    To illustrate what I mean by wanting to only use “local properties”, consider if fuchsia entered inference first, then inference will see the widening recursion at fuchsia´. And green´ will not be encountered. Thus, the result of green will be different than if fuchsia was not being inferred (recursion would have been detected at green´, changing the results assigned and cached for both – including fuchsia when it is later inferred, and uses the cached return type for green).

    An optimal limit would notice the recursion at fuchsia´, but change the topmost function (fuchsia) to use an approximation of green while canceling type-inference on the original attempted edges. This would preserve the characteristic that the cached result of type-inferring green shouldn’t depend on having been called from fuschia.

    Looking forward

    A downside of the current arbitrary limits is that they penalize valid code which uses complex types, even if they are not changing under recursion.

    Another downside is that it only takes into account the possibility of expressing memory via the type-system “length” and “depth” definitions above.

    These challenges remain, and will need to be addressed in a future update to this blog (and to the implementation!).

    ]]>
    By: Julia Computing, Inc.

    Re-posted from: http://juliacomputing.com/blog/2017/05/15/inference-converage2.html

    Adventures in Type Inference Convergence: 2017 edition

    Corrected Convergence

    In my last post on type inference convergence,
    I described a correct inference algorithm.
    However, while this was a tremendous improvement over being wrong, I was unhappy with it.
    Indeed, I wrote that “a full discussion of these heuristics will have to wait for a future blog post”.
    But what I didn’t say is that I had already written many notes for that post,
    and had reached a point where I understood that the current algorithm wouldn’t actually permit
    reliable, well-tuned heuristics.

    Plus, the algorithm required lots of global state (to manage the work-queue),
    which required tricky locks and coordination to ensure it stayed consistent.

    Convergence Algorithm 2.0

    The improved convergence algorithm in PR #21677
    maintains all of the correctness guarantees of the existing algorithm,
    but uses a completely revised cycle-detection algorithm that provides much stronger guarantees
    about the order in which work will be done.

    By comparison to be current algorithm, the revised one has half as many states.
    They are:

    1. processing: any leaf function of the call-stack (usually there’s just one at a time)
    2. on the call-stack: the tree DAG (directed-acyclic-graph) of functions with edges representing inferred potential calls
    3. in a cycle: groups of functions nodes discovered not to form a DAG (represented instead as an unsorted set)
    4. finished: done

    The outline of the algorithm that manages these states is that the call-stack is always maintained as a simple tree,
    only permitting the existence of nodes with forward edges.
    During the course of running inference, if adding an edge would cause a cycle in this graph,
    the algorithm instead replaces all nodes that are participating in the cycle
    with a single node that represents that cyclic set.

    Conceptually, this set is similar to either the “fixed-point” set in the current algorithm,
    or the “active lines” set.
    It differs from the “fixed-point” set because it can also guarantee that only nodes in the cycle
    will be in the set.
    The “active lines” set only allows describing the active set within a single method –
    the new representation expands that to iterate convergence of an entire set of methods.

    Because the algorithm maintains a acyclic call-stack,
    it becomes now possible to depend on the inspection of an edge in inference to
    result in exactly one of two results: an inferred return type,
    or a cycle detection that replaces the current node with a convergence set.
    This greatly increases the options available for integrating new inference heuristics!

    There’s one additional nice benefit to this new representation:
    improved inlining heuristics!
    It may not be obvious how improved cycle-detection during inference convergence
    would impact the optimization afterwards,
    but one of the unintuitive products of inference is the inlining order.
    Deciding on the profitability of inlining is dependent upon knowing the
    precise structure (ordering) of the call-stack and any cycles in it.
    Since the new inference structure provides precise information on which
    functions potentially participate in cycles,
    the inliner order can take advantage of that information to decide which functions to inline,
    and where to terminate the inlining due to recursion.

    Undecidability Heuristics

    Without heuristics imposed on top of the existing inference algorithm,
    it would already be Turing-complete[^turing],
    with just the current support for recursively inspecting dispatch over computed types.

    This means that (without some additional constraints added) inference might attempt to
    compute anything (including the halting problem, busy beaver, infinite recursion).
    In some cases, code is written that explicitly tries to make use of this
    using “lisp-y style” function application.
    However, code written in that manner often quickly runs into limits
    that are inherently required by the compiler to avoid other very similar
    code patterns where the inferred recursion is either unintentional
    (it won’t affect runtime performance / behavior)
    or unreachable (solving for reachability is commonly referred to as the halting problem).

    It is suboptimal for this inference system to be undecidable (or even slow),
    since its sole purpose is to find optimizations to make your program run faster.
    And inference is going to be painfully slow when it is being used as an interpreter
    (rather than just running the program unspecialized).
    So, there needs to be some mechanism or heuristics for preventing this from happening.

    It’s also important to realize that these heuristics need to be tuned precisely
    to the capabilities of inference.

    Turing-completeness requires several features:

    • loops
    • conditionals
    • arbitrarily large (but finite) memory

    Removing any one of these would be sufficient to ensure the system is computable.
    But loops (recursion), conditionals (dispatch), and memory (apply-type)
    are all essential to getting useful results from inference
    and thus it is generally not desirable to simply remove them.
    Additionally, even if it was possible now,
    as inference becomes more capable
    (e.g. inter-procedural constant propagation, effect-free computation,
    speculative evaluation of constant arithmetic),
    new forms of loops and memory would appear that would also need to be addressed.

    Current limits

    Since Julia’s inference algorithm currently only operates on Julia’s Type’s,
    (and does not support inference of constant expressions like 1 + 1),
    we only need to consider how these can be used to express an arbitrarily large memory.
    Imply both that it can form arbitrarily large constants, and do arithmetic (conditional loops) on them.

    There are a finite number of type names.
    But there are two primary mechanisms by which a type can be used to express an arbitrary value.
    For any type, the parameters can be arbitrarily nested in depth.
    Additionally, there are tuples and unions, which (in addition to depth), can have arbitrary length.
    We can express this complexity as:

    depth(type) = 1 + maximum(depth, type.parameters)
    length(type) = length(type.parameters)
    

    It is possible, using either property (or a combination of both), to represent any number.

    The current heuristics in inference prohibit recursion which increases either depth or length.
    without this ability, the program cannot express any number larger than the input,
    forcing it to (eventually) terminate.
    Additionally, there are various (semi-arbitrary) limits places on the lengths and depths of
    the types that are flowing through the system.

    Independence of the cached result

    For predictability, it is preferable that the heuristic limits imposed should only be local properties,
    computable from the function itself or its forward edges (the functions that it calls).
    This remains an unsolved problem, since the heuristics are based on examining the call-stack (following back-edges).
    This means that the order in which functions enter inference can affect how precisely they are permitted to compute their results.

    heuristics

    (A diagram showing how the context-sensativity of the specialization type signature can create repeating patterns instead of simple recursion)

    To illustrate what I mean by wanting to only use “local properties”, consider if fuchsia entered inference first,
    then inference will see the widening recursion at fuchsia´.
    And green´ will not be encountered.
    Thus, the result of green will be different than if fuchsia was not being inferred
    (recursion would have been detected at green´, changing the results assigned and cached for both –
    including fuchsia when it is later inferred, and uses the cached return type for green).

    An optimal limit would notice the recursion at fuchsia´, but change the topmost function (fuchsia)
    to use an approximation of green while canceling type-inference on the original attempted edges.
    This would preserve the characteristic that the cached result of type-inferring green shouldn’t
    depend on having been called from fuschia.

    Looking forward

    A downside of the current arbitrary limits is that they penalize valid code which uses complex types,
    even if they are not changing under recursion.

    Another downside is that it only takes into account the possibility of expressing memory
    via the type-system “length” and “depth” definitions above.

    These challenges remain, and will need to be addressed in a future update to this blog
    (and to the implementation!).

    ]]>
    3643
    DifferentialEquations.jl 2.0: State of the Ecosystem http://www.juliabloggers.com/differentialequations-jl-2-0-state-of-the-ecosystem/ Mon, 08 May 2017 12:13:30 +0000 http://www.stochasticlifestyle.com/?p=613 In this blog post I want to summarize what we have accomplished with DifferentialEquations' 2.0 release and detail where we are going next. I want to put the design changes and development work into a larger context so that way everyone can better understand what has been achieved, and better understand how we are planning to tackle our next challenges.

    If you find this project interesting and would like to support our work, please star our Github repository. Thanks!

    Now let's get started.

    DifferentialEquations.jl 1.0: The Core

    Before we start talking about 2.0, let's understand first what 1.0 was all about. DifferentialEquations.jl 1.0 was about answering a single question: how can we put the wide array of differential equations into one simple and efficient interface. The result of this was the common interface explained in the first blog post. Essentially, we created ... READ MORE

    The post DifferentialEquations.jl 2.0: State of the Ecosystem appeared first on Stochastic Lifestyle.

    ]]>
    By: Christopher Rackauckas

    Re-posted from: http://www.stochasticlifestyle.com/differentialequations-jl-2-0-state-ecosystem/

    In this blog post I want to summarize what we have accomplished with DifferentialEquations’ 2.0 release and detail where we are going next. I want to put the design changes and development work into a larger context so that way everyone can better understand what has been achieved, and better understand how we are planning to tackle our next challenges.

    If you find this project interesting and would like to support our work, please star our Github repository. Thanks!

    Now let’s get started.

    DifferentialEquations.jl 1.0: The Core

    Before we start talking about 2.0, let’s understand first what 1.0 was all about. DifferentialEquations.jl 1.0 was about answering a single question: how can we put the wide array of differential equations into one simple and efficient interface. The result of this was the common interface explained in the first blog post. Essentially, we created one interface that could:

    1. Specify a differential equation problem
    2. Solve a differential equation problem
    3. Analyze a differential equation problem

    The problem types, solve command, and solution interface were all introduced here as part of the unification of differential equations. Here, most of the work was on developing the core. DifferentialEquations.jl 1.0 was about having the core methods for solving ordinary differential equations, stochastic differential equations, and differential algebraic equations. There were some nice benchmarks to show that our core native solvers were on the right track, even besting well-known Fortran methods in terms of efficiency, but the key of 1.0 was the establishment of this high level unified interface and the core libraries for solving the problems.

    DifferentialEquations.jl 2.0: Extended Capabilities

    DifferentialEquations.jl 2.0 asked a very unique question for differential equations libraries. Namely, “how flexible can a differential equations solver be?”. This was motivated by an off-putting remark where someone noted that standard differential equations solvers were limited in their usefulness because many of the higher level analyses that people need to do cannot be done with a standard differential equations solver.

    So okay, then we won’t be a standard differential equations solver. But what do we need to do to make all of this possible? I gathered a list of things which were considered impossible to do with “blackbox” differential equations solvers. People want to model continuous equations for protein concentrations inside of each cell, but allow the number of cells (and thus the number of differential equations) to change stochastically over time. People want to model multiscale phenomena, and have discontinuities. Some “differential equations” may only be discontinuous changes of discrete values (like in Gillespie models). People want to solve equations with colored noise, and re-use the same noise process in other calculations. People want to solve the same ODE efficiently hundreds of times, and estimate parameters. People want to quantify the uncertainty and the sensitivity of their model. People want their solutions conserve properties like energy.

    People want to make simulations of reality moreso than solve equations.

    And this became the goal for DifferentialEquations.jl 2.0. But the sights were actually set a little higher. The underlying question was:

    How do you design a differential equations suite such that it can have this “simulation engine” functionality, but also such that adding new methods automatically makes the method compatible with all of these features?

    That is DifferentialEquations.jl 2.0. the previous DifferentialEquations.jl ecosystem blog post details the strategies we were going to employ to achieve this goal, but let me take a little bit of time to explain the solution that eventually resulted.

    The Integrator Interface

    The core of the solution is the integrator interface. Instead of just having an interface on the high-level solve command, the integrator interface is the interface on the core type. Everything inside of the OrdinaryDiffEq.jl, StochasticDiffEq.jl, DelayDiffEq.jl packages (will be referred to as the *DiffEq solvers) is actually just a function on the integrator type. This means that anything that the solver can do, you can do by simply having access to the integrator type. Then, everything can be unified by documenting this interface.

    This is a powerful idea. It makes development easy, since the devdocs just explain what is done internally to the integrator. Adding new differential equations algorithms is now simply adding a new perform_step dispatch. But this isn’t just useful for development, this is useful for users too. Using the integrator, you can step one at a time if you wanted, and do anything you want between steps. Resizing the differential equation is now just a function on the integrator type since this type holds all of the cache variables. Adding discontinuities is just changing integrator.u.

    But the key that makes this all work is Julia. In my dark past, I wrote some algorithms which used R’s S3 objects, and I used objects in numerical Python codes. Needless to say, these got in the way of performance. However, the process of implementing the integrator type was a full refactor from straight loops to the type format. The result was no performance loss (actually, there was a very slight performance gain!). The abstraction that I wanted to use did not have a performance tradeoff because Julia’s type system optimized its usage. I find that fact incredible.

    But back to the main story, the event handling framework was re-built in order to take full advantage of the integrator interface, allowing the user to directly affect the integrator. This means that doubling the size of your differential equation the moment some value hits 1 is now a possibility. It also means you can cause your integration to terminate when “all of the bunnies” die. But this became useful enough that you might not want to just use it for traditional event handling (aka cause some effect when some function hits zero, which we call the ContinuousCallback), but you may just want to apply some affect after steps. The DiscreteCallback allows one to check a boolean function for true/false, and if true apply some function to the integrator. For example, we can use this to always apply a projection to a manifold at the end of each step, effectively preserving the order of the integration while also conserving model properties like energy or angular momentum.

    The integrator interface and thus its usage in callbacks then became a way that users could add arbitrary functionality. It’s useful enough that a DiscreteProblem (an ODE problem with no ODE!) is now a thing. All that is done is the discrete problem walks through the ODE solver without solving a differential equation, just hitting callbacks.

    But entirely new sets of equations could be added through callbacks. For example, discrete stochastic equations (or Gillespie simulations) are models where rate equations determine the time to the next discontinuity or “jump”. The JumpProblem types simply add callbacks to a differential (or discrete) equation that perform these jumps at specific rates. This effectively turns the “blank ODE solver” into an equation which can solve these models of discrete proteins stochastically changing their levels over time. In addition, since it’s built directly onto the differential equations solvers, mixing these types of models is an instant side effect. These models which mix jumps and differential equations, such as jump diffusions, were an immediate consequence of this design.

    The design of the integrator interface meant that dynamicness of the differential equation (changing the size, the solver options, or any other property in the middle of solving) was successfully implemented, and handling of equations with discontinuities directly followed. This turned a lot of “not differential equations” into “models and simulations which can be handled by the same DifferentialEquations.jl interface”.

    Generic algorithms over abstract types

    However, the next big problem was being able to represent a wider array of models. “Models and simulations which do weird non-differential equation things over time” are handled by the integrator interface, but “weird things which aren’t just a system of equations which do weird non-differential equation things over time” were still out of reach.

    The solution here is abstract typing. The *DiffEq solvers accept two basic formats. Let’s stick to ODEs for the explanation. For ODEs, there is the out-of-place format

    du = f(t,u)

    where the derivative/change is returned by the function, and there is the in-place format

    f(t,u,du)

    where the function modifies the object du which stores the derivative/change. Both of these formats were generalized to the extreme. In the end, the requirements for a type to work in the out-of-place format can be described as the ability to do basic arithmetic (+,-,/,*), and you add the requirement of having a linear index (or simply having a broadcast! function defined) in order to satisfy the in-place format. If the method is using adaptivity, the user can pass an appropriate norm function to be used for calculating the norm of the error estimate.

    This means that wild things work in the ODE solvers. I have already demonstrated arbitrary precision numbers, and unit-checked arithmetic.

    But now there’s even crazier. Now different parts of your equation can have different units using the ArrayPartition. You can store and update discrete values along with your differential equation using the DEDataArray type. Just the other day I showed this can be used to solve problems where the variable is actually a symbolic mathematical expression. We are in the late stages of getting a type which represents a spectral discretization of a function compatible with the *DiffEq solvers.

    But what about those “purely scientific non-differential equations” applications? A multiscale model of an embryo which has tissues, each with different populations of cells, and modeling the proteins in each cell? That’s just a standard application of the AbstractMultiScaleArray.

    Thus using the abstract typing, even simulations which don’t look like systems of equations can now be directly handled by DifferentialEquations.jl. But not only that, since this is done simply via Julia’s generic programming, this compatibility is true for any of the new methods which are added (one caveat: if they use an external library like ForwardDiff.jl, their compatibility is limited by the compatibility of that external library).

    Refinement of the problem types

    The last two big ideas made it possible for a very large set of problems to be written down as a “differential equation on an array” in a much expanded sense of the term. However, there was another design problem to solve: not every algorithm could be implemented with “the information” we had! What I mean by “information”, I mean the information we could get from the user. The ODEProblem type specified an ODE as

     \frac{du}{dt} = f(t,u)

    but some algorithms do special things. For example, for the ODE

     \frac{du}{dt} = f(t,u) = A + g(t,u)

    the Lawson-Euler algorithm for solving the differential equation is

     u_{n+1} = \exp(A \Delta t)(u_n + g(t,u_n)\Delta t)

    This method exploits the fact that it knows that the first part of the equation is A for some matrix, and uses it directly to improve the stability of the algorithm. However, if all we know is f, we could never implement this algorithm. This would violate our goal of “full flexibility at full performance” if this algorithm was the most efficient for the problem!

    The solution is to have a more refined set of problem types. I discussed this a bit at the end of the previous blog post that we could define things like splitting problems. The solution is quite general, where

     M \frac{du}{dt} = f_1(t,u) + f_2(t,u) + ... + f_n(t,u)

    can be defined using the SplitODEProblem (M being a mass matrix). Then specific methods can request specific forms, like here the linear-nonlinear ODE. Together, the ODE solver can implement this algorithm for the ODE, and that implementation, being part of a *DiffEq solver, will have interpolations, the integrator interface, event handling, abstract type compatibility, etc. all for free. Check out the other “refined problem types”: these are capable of covering wild things like staggered grid PDE methods and symplectic integrators.

    In addition to specifying the same equations in new ways, we created avenues for common analyses of differential equations which are not related to simulating them over time. For example, one common problem is to try to find steady states, or points where the differential equation satisfies f(u)=0. This can now easily be done by defining a SteadyStateProblem from an ODE, and then using the steady state solver. This new library will also lead to the implementation of accelerated methods for finding steady states, and the development of new accelerated methods. The steady state behavior can now also be analyzed using the bifurcation analysis tools provided by the wrapper to PyDSTool.

    Lastly, the problem types themselves have become much more expressive. In addition to solving the standard ODE, one can specify mass matrices in any appropriate DE type, to instead solve the equation

     M \frac{du}{dt} = f(t,u)

    where M is some linear operator (similarly in DDEs and SDEs). While the vast majority of solvers are not able to use M right now, this infrastructure is there for algorithms to support it. In addition, one can now specify the noise process used in random and stochastic equations, allowing the user to solve problems with colored noise. Using the optional fields, a user can define non-diagonal noise problems, and specify sparse noise problems using sparse matrices.

    As of now, only some very basic methods using all of this infrastructure have been made for the most extreme examples for testing purposes, but these show that the infrastructure works and this is ready for implementing new methods.

    Common solve extensions

    Okay, so once we can specify efficient methods for weird models which evolve over time in weird ways, we can simulate and get what solutions look like. Great! We have a tool that can be used to get solutions! But… that’s only the beginning of most analyses!

    Most of the time, we are simulating solutions to learn more about the model. If we are modeling chemical reactions, what is the reaction rate that makes the model match the data? How sensitive is our climate model to our choice of the albedo coefficient?

    To back out information about the model, we rely on analysis algorithms like parameter estimation and sensitivity analysis. However, the common solve interface acts as the perfect level for implementing these algorithms because they can be done problem and algorithm agnostic. I discuss this in more detail in a previous blog post, but the general idea is that most of these algorithms can be written with a term y(t) which is the solution of a differential equation. Thus we can write the analysis algorithms at a very high level and allow the user to pass in the arguments for a solve command use that to generate the y(t). The result is an implementation of the analysis algorithm which works with any of the problems and methods which use the common interface. Again, chaining all of the work together to get one more complete product. You can see this in full force by looking at the parameter estimation docs.

    Modeling Tools

    In many cases one is solving differential equations not for their own sake, but to solve scientific questions. To this end, we created a framework for modeling packages which make this easier. The financial models make it easy to specify common financial equations, and the biological models make it easy to specify chemical reaction networks. This functionality all works on the common solver / integrator interface, meaning that models specified in these forms can be used with the full stack and analysis tools. Also, I would like to highlight BioEnergeticFoodWebs.jl as a great modeling package for bio-energetic food web models.

    Over time, we hope to continue to grow these modeling tools. The financial tools I hope to link with Julia Computing’s JuliaFin tools (Miletus.jl) in order to make it easy to efficiently solve the SDE and PDE models which result from their financial DSL. In addition, DiffEqPhysics.jl is planned to make it easy to specify the equations of motion just by giving a Hamiltonian or Lagrangian, or by giving the the particles + masses and automatically developing a differential equation. I hope that we can also tackle domains like mechanical systems and pharmacokinetics/pharmacodynamics to continually expand what is easily able to be solved using this infrastructure.

    DifferentialEquations 2.0 Conclusion

    In the end, DifferentialEquations 2.0 was about finding the right infrastructure such that pretty much anything CAN be specified and solved efficiently. While there were some bumps along the road (that caused breaking API changes), I believe we came up with a very good solution. The result is a foundation which feeds back onto itself, allowing projects like parameter estimation of multiscale models which change size due to events to be standard uses of the ODE solver.

    And one of the key things to note is that this follows by design. None of the algorithms were specifically written to make this work. The design of the *DiffEq packages gives interpolation, event handling, compatibility with analysis tools, etc. for free for any algorithm that is implemented in it. One contributor, @ranocha, came to chat in the chatroom and on a few hours later had implemented 3 strong stability preserving Runge-Kutta methods (methods which are efficient for hyperbolic PDEs) in the *DiffEq solvers. All of this extra compatibility followed for free, making it a simple exercise. And that leads me to DifferentialEquations 3.0.

    DifferentialEquations 3.0: Stiff solvers, parallel solvers, PDEs, and improved analysis tools

    1.0 was about building the core. 2.0 was about making sure that the core packages were built in a way that could be compatible with a wide array of problems, algorithms, and analysis tools. However, in many cases, only the simplest of each type of algorithm was implemented since this was more about building out the capabilities than it was to have completeness in each aspect. But now that we have expanded our capabilities, we need to fill in the details. These details are efficient algorithms in the common problem domains.

    Stiff solvers

    Let’s start by talking about stiff solvers. As of right now, we have the all of the standard solvers (CVODE, LSODA, radau) wrapped in the packages Sundials.jl, LSODA.jl, and ODEInterface.jl respectively. These can all be used in the DifferentialEquations.jl common interface, meaning that it’s mostly abstracted away from the user that these aren’t actually Julia codes. However, these lower level implementations will never be able to reach the full flexibility of the native Julia solvers simply because they are restricted in the types they use and don’t fully expose their internals. This is fine, since our benchmarks against the standard Runge-Kutta implementations (dopri5, dop853) showed that the native Julia solvers, being more modern implementations, can actually have performance gains over these older methods. But, we need to get our own implementations of these high order stiff solvers.

    Currently there exists the Rosenbrock23 method. This method is similar to the MATLAB ode23s method (it is the Order 2/3 Shampine-Rosenbrock method). This method is A and L stable, meaning it’s great for stiff equations. This was thus used for testing event handling, parameter estimation, etc.’s capabilities and restrictions with the coming set of stiff solvers. However, where it lacks is order. As an order 2 method, this method is only efficient at higher error tolerances, and thus for “standard tolerances” it tends not to be competitive with the other methods mentioned before. That is why one of our main goals in DiffEq 3.0 will be the creation of higher order methods for stiff equations.

    The main obstacle here will be the creation of a library for making the differentiation easier. There are lots of details involved here. Since a function defined using the macros of ParameterizedFunctions can symbolically differentiate the users function, in some cases a pre-computed function for the inverted or factorized Jacobian can be used to make a stiff method explicit. In other cases, we need autodifferentiation, and in some we need to use numerical differentiation. This is all governed by a system of traits setup behind the scenes, and thus proper logic for determining and using Jacobians can immensely speed up our calculations.

    The Rosenbrock23 method did some of this ad-hocly within its own method, but it was determined that the method would be greatly simplified if there was some library that could handle this. In fact, if there was a library to handle this, then the Rosenbrock23 code for defining steps would be as simple as defining steps for explicit RK methods. The same would be true for implicit RK methods like radau. Thus we will be doing that: building a library which handles all of the differentiation logic. The development of this library, DiffEqDiffTools.jl, is @miguelraz ‘s Google Summer of Code project. Thus with the completion of this project (hopefully summer?), efficient and fully compatible high order Rosenbrock methods and implicit RK methods will easily follow. Also included will be additive Runge-Kutta methods (IMEX RK methods) for SplitODEProblems. Since these methods double as native Julia DAE solvers and this code will make the development of stiff solvers for SDEs, this will be a major win to the ecosystem on many fronts.

    Stiffness Detection and Switching

    In many cases, the user may not know if a problem is stiff. In many cases, especially in stochastic equations, the problem may be switching between being stiff and non-stiff. In these cases, we want to change the method of integration as we go along. The general setup for implementing switching methods has already been implemented by the CompositeAlgorithm. However, current usage of the CompositeAlgorithm requires that the user define the switching behavior. This makes it quite difficult to use.

    Instead, we will be building methods which make use of this infrastructure. Stiffness detection estimates can be added to the existing methods (in a very efficient manner), and could be toggled on/off. Then standard switching strategies can be introduced such that the user can just give two algorithms, a stiff and a non-stiff solvers, and basic switching can then occur. What is deemed as the most optimal strategies can then be implemented as standard algorithm choices. Then at the very top, these methods can be added as defaults for solve(prob), making the fully automated solver efficiently handle difficult problems. This will be quite a unique feature and is borderline a new research project. I hope to see some really cool results.

    Parallel-in-time ODE/BVP solvers

    While traditional methods (Runge-Kutta, multistep) all step one time point at a time, in many cases we want to use parallelism to speed up our problem. It’s not hard to buy an expensive GPU, and a lot of us already have one for optimization, so why not use it?

    Well, parallelism for solving differential equations is very difficult. Let’s take a look at some quick examples. In the Euler method, the discretization calculates the next time step u_{n+1} from the previous time step u_n using the equation

    u_{n+1} = u_n + \Delta t f(t,u)

    In code, this is the update step

    u .= uprev .+ dt.*f(t,uprev)

    I threw in the .’s to show this is broadcasted over some arrays, i.e. for systems of equations u is a vector. And that’s it, that’s what the inner loop is. The most you can parallelize are the loops internal to the broadcasts. This means that for very large problems, you can parallelize this method efficiently (this form is called parallelism within the method). Also, if your input vector was a GPUArray, this will broadcast using CUDA or OpenCL. However, if your problem is not a sufficiently large vector, this parallelism will not be very efficient.

    Similarly for implicit equations, we need to repeatedly solve (I-\Delta tJ)u = b where J is the Jacobian matrix. This linear solve will only parallelize well if the Jacobian matrix is sufficiently large. But many small differential equations problems can still be very difficult. For example, this about solving a very stiff ODE with a few hundred variables. Instead, the issue is that we are stepping serially over time, and we need to use completely different algorithms which parallelize over time.

    One of these approaches is a collocation method. Collocation methods build a very large nonlinear equation F(X)=0 which describes a numerical method over all time points at once, and simultaneously solves for all of the time points using a nonlinear solver. Internally, a nonlinear solver is just a linear solver, Ax=b, with a very large A. Thus, if the user passes in a custom linear solver method, say one using PETSc.jl or CUSOLVER, this is parallelize efficiently over many nodes of an HPC or over a GPU. In fact, these methods are the standard methods for Boundary Value Problems (BVPs). The development of these methods is the topic of @YingboMa’s Google Summer of Code project. While written for BVPs, these same methods can then solve IVPs with a small modification (including stochastic differential equations).

    By specifying an appropriate preconditioner with the linear solver, these can be some of the most efficient parallel methods. When no good preconditioner is found, these methods can be less efficient. One may wonder then if there’s exists a different approach, one which may sacrifice some “theoretical top performance” in order to be better in the “low user input” case (purely automatic). There is! Another approach to solving the parallelism over time issue is to use a neural network. This is the topic of @akaysh’s Google Summer of Code project. Essentially, you can define a cost function which is the difference between the numerical derivative and f(t_i,u_i) at each time point. This then gives an optimization problem: find the u_i at each time point such that the difference between the numerical and the desired derivatives is minimized. Then you solve that cost function minimization using a neural network. The neat thing here is that neural nets are very efficient to parallelize over GPUs, meaning that even for somewhat small problems we can get out very good parallelism. These neural nets can use very advanced methods from packages like Knet.jl to optimize efficiently and parallel with very little required user input (no preconditioner to set). There really isn’t another standard differential equations solver package which has a method like this, so it’s hard to guess how efficient it will be in advance. But given the properties of this setup, I suspect this should be a very good “automatic” method for medium-sized (100’s of variables) differential equations.

    The really great thing about these parallel-in-time methods is that they are inherently implicit, meaning that they can be used even on extremely stiff equations. There are also simple extensions to make these solver SDEs and DDEs. So add this to the bank of efficient methods for stiff diffeqs!

    Improved methods for stochastic differential equations

    As part of 3.0, the hope is to release brand new methods for stochastic differential equations. These methods will be high order and highly stable, some explicit and some implicit, and will have adaptive timestepping. This is all of the details that I am giving until these methods are published, but I do want to tease that many types of SDEs will become much more efficient to solve.

    Improved methods for jump equations

    For jump equations, in order to show that everything is complete and can work, we have only implemented the Gillespie method. However, we hope to add many different forms of tau-leaping and Poisson(/Binomial) Runge-Kutta methods for these discrete stochastic equations. Our roadmap is here and it seems there may be a great deal of participation to complete this task. Additionally, we plan on having a specialized DSL for defining chemical reaction networks and automatically turn them into jump problems or ODE/SDE systems.

    Geometric and symplectic integrators

    In DifferentialEquations.jl 2.0, the ability to Partitioned ODEs for dynamical problems (or directly specify a second order ODE problem) was added. However, only a symplectic Euler method has been added to solve this equations so far. This was used to make sure the *DiffEq solvers were compatible with this infrastructure, and showed that event handling, resizing, parameter estimation, etc. all works together on these new problem types. But, obviously we need more algorithms. Velocity varlet and higher order Nystrom methods are asking to be implemented. This isn’t difficult for the reasons described above, and will be a very nice boost to DifferentialEquations.jl 3.0.

    (Stochastic/Delay) Partial differential equations

    Oh boy, here’s a big one. Everyone since the dawn of time has wanted me to focus on building a method that makes solving the PDE that they’re interested dead simple to do. We have a plan for how to get there. The design is key: instead of one-by-one implementing numerical methods for each PDE, we needed a way to pool the efforts together and make implementations on one PDE matter for other PDEs.

    Let’s take a look at how we can do this for efficient methods for reaction-diffusion equations. In this case, we want to solve the equation

     u_t = \Delta u + f(t,u)

    The first step is always to discretize this over space. Each of the different spatial discretization methods (finite difference, finite volume, finite element, spectral), end up with some equation

     U_t = AU + f(t,U)

    where now U is a vector of points in space (or discretization of some basis). At this point, a specialized numerical method for stepping this equation efficiently in the time can be written. For example, if diffusion is fast and f is stiff, one really efficient method is the implicit integrating factor method. This would solve the equation by updating time points like:

    U_{n+1} = e^{-A\Delta t}U_n + \Delta t U_{n+1}

    where we have to solve this implicit equation each time step. The nice thing is that the implicit equation decouples in space, and so we actually only need to solve a bunch of very small implicit equations.

    How can we do this in a way that is not “specific to the heat equation”? There were two steps here, the first is discretizing in space, the second is writing an efficient method specific to the PDE. The second part we already have an answer for: this numerical method can be written as one of the methods for linear/nonlinear SplitODEProblems that we defined before. Thus if we just write a SplitODEProblem algorithm that does this form of updating, it can be applied to any ODE (and any PDE discretization) which splits out a linear part. Again, because it’s now using the ODE solver, all of the extra capabilities (event handling, integrator interface, parameter estimation tools, interpolations, etc.) all come for free as well. The development of ODE/SDE/DDE solvers for handling this split, like implicit integrating factor methods and exponential Runge-Kutta methods, is part of DifferentialEquations.jl 3.0’s push for efficient (S/D)PDE solvers.

    So with that together, we just need to solve the discretization problem. First let’s talk about finite difference. For the Heat Equation with a fixed grid-size \Delta x, many people know what the second-order discretization matrix A is in advance. However, what if you have variable grid sizes, and want different order discretizations of different derivatives (say a third derivative)? In this case the Fornburg algorithm can be used to construct this A. And instead of making this an actual matrix, because this is sparse we can make this very efficient by creating a “matrix-free type” where AU acts like the appropriate matrix multiplication, but in reality no matrix is ever created. This can save a lot of memory and make the multiplication a lot more efficient by ignoring the zeros. In addition, because of the reduced memory requirement, we easily distribute this operator to the GPU or across the cluster, and make the AU function utilize this parallelism.

    The development of these efficient linear operators is the goal of @shivin9’s Google Summer of Code project. The goal is to get a function where the user can simply specify the order of the derivative and the order of the discretization, and it will spit out this highly efficient A to be used in the discretization, turning any PDE into a system of ODEs. In addition, other operators which show up in finite difference discretizations, such as the upwind scheme, can be encapsulated in such an A. Thus this project would make turning these types of PDEs into efficient ODE discretizations much easier!

    The other very popular form of spatial discretization is the finite element method. For this form, you chose some basis function over space and discretize the basis function. The definition of this basis function’s discretization then derives what the A discretization of the derivative operators should be. However, there is a vast array of different choices for basis and the discretization. If we wanted to create a toolbox which would implement all of what’s possible, we wouldn’t get anything else done. Thus we will instead, at least for now, piggyback off of the efforts of FEniCS. FEniCS is a toolbox for the finite element element method. Using PyCall, we can build an interface to FEniCS that makes it easy to receive the appropriate A linear operator (usually sparse matrix) that arises from the desired discretization. This, the development of a FEniCS.jl, is the topic of @ysimillides’s Google Summer of Code. The goal is to make this integration seamless, so that way getting ODEs out for finite element discretizations is a painless process, once again reducing the problem to solving ODEs.

    The last form of spatial discretization is spectral discretizations. These can be handled very well using the Julia library ApproxFun.jl. All that is required is to make it possible to step in time the equations which can be defined using the ApproxFun types. This is the goal of DiffEqApproxFun.jl. We already have large portions of this working, and for fixed basis lengths the ApproxFunProblems can actually be solved using any ODE solver (not just native Julia solvers, but also methods from Sundials and ODEInterface work!). This will get touched up soon and will be another type of PDE discretization which will be efficient and readily available.

    Improved Analysis Tools

    What was described above is how we are planning to solve very common difficult problems with high efficiency and simplify the problems for the user, all without losing functionality. However, the tools at the very top of the stack, the analysis tools, can also become much more efficient as well. This is the other focus of DifferentialEquations.jl 3.0.

    Local sensitivity analysis is nice because it not only tells you how sensitive your model is to the choice of parameters, but it gives this information at every time point. However, in many cases this is overkill. Also, this makes the problem much more computationally difficult. If we wanted to classify parameter space, like to answer the question “where is the model least sensitive to parameters?”, we would have to solve this equation many times. When this is the question we wish to answer, global sensitivity analysis methods are much more efficient. We plan on adding methods like the Morris method in order for sensitives to be globally classified.

    In addition, we really need better parameter estimation functionality. What we have is very good: you can build an objective function for your parameter estimation problem to then use Optim.jl, BlackBoxOptim.jl or any MathProgBase/JuMP solver (example: NLopt.jl) to optimize the parameters. This is great, but it’s very basic. In many cases, more advanced methods are necessary in order to get convergence. Using likelihood functions instead of directly solving the nonlinear regression can often times be more efficient. Also, in many cases statistical methods (the two-stage method) can be used to approximately optimize parameters without solving the differential equations repeatedly, a huge win for performance. Additionally, Bayesian methods will not only give optimal parameters, but distributions for the parameters which the user can use to quantify how certain they are about estimation results. The development of these methods is the focus of @Ayush-iitkgp’s Google Summer of Code project.

    DifferentialEquations.jl 3.0 Conclusion

    2.0 was about building infrastructure. 3.0 is about filling out that infrastructure and giving you the most efficient methods in each of the common problem domains.

    DifferentialEquations.jl 4.0 and beyond

    I think 2.0 puts us in a really great position. We have a lot, and the infrastructure allows us to be able to keep expanding and adding more and more algorithms to handle different problem types more efficiently. But there are some things which are not slated in the 3.0 docket. One thing that keeps getting pushed back is the automatic promotion of problem types. For example, if you specified a SplitODEProblem and you want to use an algorithm which wants an ODEProblem (say, a standard Runge-Kutta algorithm like Tsit5), it’s conceivable that this conversion can be handled automatically. Also, it’s conceivable that since you can directly convert an ODEProblem into a SteadyStateProblem, that the steady state solvers should work directly on an ODEProblem as well. However, with development time always being a constraint, I am planning on spending more time developing new efficient methods for these new domain rather than the automatic conversions. However, if someone else is interested in tackling this problem, this could probably be addressed much sooner!

    Additionally, there is one large set of algorithms which have not been addressed in the 3.0 plan: multistep methods. I have been holding off on these for a few reasons. For one, we have wrappers to Sundials, DASKR, and LSODA which supply well-known and highly efficient multistep methods. However, these wrappers, having the internals not implemented in Julia, are limited in their functionality. They will never be able to support arbitrary Julia types and will be constrained to equations on contiguous arrays of Float64s. Additionally, the interaction with Julia’s GC makes it extremely difficult to implement the integrator interface and thus event handling (custom allocators would be needed). Also, given what we have seen with other wrappers, we know we can actually improve the efficiency of these methods.

    But lastly, and something I think is important, these methods are actually only efficient in a small but important problem domain. When the ODE f is not “expensive enough”, implicit Runge-Kutta and Rosenbrock methods are more efficient. When it’s a discretization of a PDE and there exists a linear operator, exponential Runge-Kutta and implicit integrating factor methods are more efficient. Also, if there are lots of events or other dynamic behavior, multistep methods have to “restart”. This is an expensive process, and so in most cases using a one-step method (any of the previously mentioned methods) is more efficient. This means that multistep methods like Adams and BDF (/NDF) methods are really only the most efficient when you’re talking about a large spatial discretization of a PDE which doesn’t have a linear operator that you can take advantage of and events are non-existent/rare. Don’t get me wrong, this is still a very important case! But, given the large amount of wrappers which handle this quite well already, I am not planning on tackling these until the other parts are completed. Expect the *DiffEq infrastructure to support multistep methods in the future (actually, there’s already quite a bit of support in there, just the adaptive order and adaptive time step needs to be made possible), but don’t count on it in DifferentialEquations 3.0.

    Also not part of 3.0 but still of importance is stochastic delay differential equations. The development of a library for handling these equations can follow in the same manner as DelayDiffEq.jl, but likely won’t make it into 3.0 as there are more pressing (more commonly used) topics to address first. In addition, methods for delay equations with non-constant lags (and neutral delay equations) also will likely have to wait for 4.0.

    In the planning stages right now is a new domain-specific language for the specification of differential equations. The current DSL, the @ode_def macro in ParameterizedFunctions.jl does great for the problem that it can handle (ODEs and diagonal SDEs). However, there are many good reasons to want to expand the capabilities here. For example, equations defined by this DSL can be symbolically differentiated, leading to extremely efficient code even for stiff problems. In addition, one could theoretically “split the ODE function” to automatically turn the problem in a SplitODEProblem with a stiff and nonstiff part suitable for IMEX solvers. If PDEs also can be written in the same syntax, then the PDEs can be automatically discretized and solved using the tools from 2.0. Additionally, one can think about automatically reducing the index of DAEs, and specifying DDEs.

    This all sounds amazing, but it will need a new DSL infrastructure. After a discussion to find out what kind of syntax would be necessary, it seems that a major overhaul of the @ode_def macro would be needed in order to support all of this. The next step will be to provide a new library, DiffEqDSL.jl, for this enhanced DSL. As described in the previously linked roadmap discussion, this DSL will likely take a form closer in style to JuMP, and the specs seem compatible with reaching the above-mentioned goals. Importantly, this will be developed as a new DSL, and thus the current @ode_def macro will be unchanged. This is a large development which will most likely not be in 3.0, but please feel free to contribute to the roadmap discussion, which is now being continued at the new repository.

    Conclusion

    DifferentialEquations.jl 1.0 was about establishing a core that could unify differential equations. 2.0 was about developing the infrastructure to tackle a vast domain of scientific simulations which are not easily or efficiently written as differential equations. 3.0 will be about producing efficient methods for the most common sets of problems which haven’t been adequately addressed yet. This will put the ecosystem in a very good state and hopefully make it a valuable tool for many people. After this, 4.0+ will keep adding algorithms, expand the problem domain some more, and provide a new DSL.

    The post DifferentialEquations.jl 2.0: State of the Ecosystem appeared first on Stochastic Lifestyle.

    ]]>
    3634
    Some Fun With Julia Types: Symbolic Expressions in the ODE Solver http://www.juliabloggers.com/some-fun-with-julia-types-symbolic-expressions-in-the-ode-solver/ Thu, 04 May 2017 05:57:04 +0000 http://www.stochasticlifestyle.com/?p=573 In Julia, you can naturally write generic algorithms which work on any type which has specific "actions". For example, an "AbstractArray" is a type which has a specific set of functions implemented. This means that in any generically-written algorithm that wants an array, you can give it an AbstractArray and it will "just work". This kind of abstraction makes it easy to write a simple algorithm and then use that same exact code for other purposes. For example, distributed computing can be done by just passing in a DistributedArray, and the algorithm can be accomplished on the GPU by using a GPUArrays. Because Julia's functions will auto-specialize on the types you give it, Julia automatically makes efficient versions specifically for the types you pass in which, at compile-time, strips away the costs of the abstraction.

    This means ... READ MORE

    The post Some Fun With Julia Types: Symbolic Expressions in the ODE Solver appeared first on Stochastic Lifestyle.

    ]]>
    By: Christopher Rackauckas

    Re-posted from: http://www.stochasticlifestyle.com/fun-julia-types-symbolic-expressions-ode-solver/

    In Julia, you can naturally write generic algorithms which work on any type which has specific “actions”. For example, an “AbstractArray” is a type which has a specific set of functions implemented. This means that in any generically-written algorithm that wants an array, you can give it an AbstractArray and it will “just work”. This kind of abstraction makes it easy to write a simple algorithm and then use that same exact code for other purposes. For example, distributed computing can be done by just passing in a DistributedArray, and the algorithm can be accomplished on the GPU by using GPUArrays. Because Julia’s functions auto-specialize on the types you give them, Julia automatically makes efficient versions specifically for the types you pass in, which, at compile time, strips away the cost of the abstraction.

    This means that one way to come up with new efficient algorithms in Julia is to simply pass a new type to an existing algorithm. Does this kind of stuff really work? Well, I wanted to show that you can push this to absurd places by showing one of my latest experiments.

    Note

    The ODE solver part of this post won’t work on release until some tags go through (I’d say at least a week?). The fix I had to make was to have the “internal instability checks”, which normally default to checking whether the number is NaN, instead always return false for numbers which aren’t AbstractFloats (meaning: never halt integration due to instability). Do you understand why that would be a problem when trying to use a symbolic expression? This is a good question to keep in mind while reading.
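
    A hedged sketch of the dispatch pattern being described (the function name is hypothetical, not the actual DiffEq internals): floats get the usual NaN check, while everything else, including symbolic expressions, is never flagged as unstable.

    default_unstable_check(u::AbstractFloat) = isnan(u)  # halt if a float goes NaN
    default_unstable_check(u) = false                    # never halt for any other type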

    SymEngine.jl

    Let me first give a little bit of an introduction to SymEngine.jl. SymEngine is a re-write of SymPy into C++ for increased efficiency, and SymEngine.jl is a wrapper for Julia. The only part of the package I will use is its symbolic expression type, the SymEngine.Basic.

    In many cases, this is all you need out of the package. This is because dispatch and generic algorithms tend to handle everything else you need. For example, what if you want the symbolic expression for the inverse of a matrix? Just make a matrix of expressions, and call the Julia inverse function:

    using SymEngine

    y1,y2,y3,y4 = @vars y1 y2 y3 y4
    A = [y1 y1+y2
         y3+y2 y4]

    println(inv(A))

    #SymEngine.Basic[(1 + y2*y3/(y1*(y4 - y2*y3/y1)))/y1 -y2/(y1*(y4 - y2*y3/y1)); -y3/(y1*(y4 - y2*y3/y1)) (y4 - y2*y3/y1)^(-1)]

    Notice that the only thing that’s directly used here from SymEngine is the declaration of the variables for the symbolic expressions (the @vars line). The next line just creates a Julia matrix of these expressions. The last line prints the result of the inv function from Julia’s Base library applied to this matrix of expressions.

    Julia’s inverse function is generically written, and thus it just needs the elements to satisfy some properties of numbers. For example, they need to be able to add, subtract, multiply, and divide. Since our “number” type can do all of this, the inverse function “just works”. But also, by Julia’s magic, a separate function is created and compiled just for doing this, which makes it highly efficient. So easy + efficient: the true promise of Julia is satisfied.
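
    The same pattern works with code of your own. Here is a tiny example of mine (not from the post): a generic Horner-scheme polynomial evaluator, written with plain numbers in mind, happily builds symbolic expressions instead.

    using SymEngine

    horner(coeffs, x) = foldr((c, acc) -> c + x*acc, coeffs)  # knows nothing about SymEngine

    horner([1, 2, 3], 2.0)  # 17.0 -- ordinary floating point arithmetic
    @vars a                 # declare a symbolic variable
    horner([1, 2, 3], a)    # the same code now returns a symbolic expression in a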

    Going Deep: The ODE Solver

    Now let’s push this. Let’s say we had a problem where we wanted to find out which initial condition is required in order to get out a specific value. One way to calculate this is to use Boundary Value Problem (BVP) solvers, but let’s say we want to do this at a bunch of different timepoints.
    How about using a numerical ODE solver, except, instead of numbers, we use symbolic expressions for our initial condition so that we can calculate approximate functions of it at the timepoints? Sounds fun and potentially useful, so let’s give it a try.

    The ODE solvers for Julia are in the package DifferentialEquations.jl. Let’s solve the linear ODE:

     \frac{dy}{dt} = 2y

    with an initial condition which is a symbolic variable. Following the tutorial, let’s swap out the numbers for symbolic expressions. To do this, we simply make the problem type and solve it:

    using DifferentialEquations, SymEngine
    y0 = symbols(:y0)
    u0 = y0
    f = (t,y) -> 2y
    prob = ODEProblem(f,u0,(0.0,1.0))
    sol = solve(prob,RK4(),dt=1/10)
     
    println(sol)
    # SymEngine.Basic[y0,1.2214*y0,1.49181796*y0,1.822106456344*y0,2.22552082577856*y0,2.71825113660594*y0,3.32007193825049*y0,4.05513586537915*y0,4.95294294597409*y0,6.04952451421275*y0,7.38888924165946*y0,7.38888924165946*y0]

    The solution is an array of symbolic expressions for what the RK4 method (the order 4 Runge-Kutta method) gives at each timepoint, starting at 0.0 and stepping 0.1 units at a time up to 1.0. We can then use the SymEngine function lambdify to turn the solution at the end into a function, and check it. For our reference, I will re-solve the differential equation at a much higher accuracy. We can test this against the true solution, which for this linear ODE we know is:

     y(t) = y_0 \exp(2t)

    sol = solve(prob,RK4(),dt=1/1000)
    end_solution  = lambdify(sol[end])
     
    println(end_solution(2)) # 14.778112197857341
    println(2exp(2)) # 14.7781121978613
    println(end_solution(3)) # 22.167168296786013
    println(3exp(2)) # 22.16716829679195

    We have successfully computed a really high accuracy approximation to our solution which is a function of the initial condition! We can even do this for systems of ODEs. For example, let’s get a function which approximates a solution to the Lotka-Volterra equation:

    using SymEngine, DifferentialEquations
    # Make our initial condition be symbolic expressions from SymEngine
    x0,y0 = @vars x0 y0
    u0 = [x0;y0]
    f = function (t,y,dy)
      dy[1] = 1.5y[1] - y[1]*y[2]
      dy[2] = -3y[2] + y[1]*y[2]
    end
    prob = ODEProblem(f,u0,(0.0,1.0))
    sol = solve(prob,RK4(),dt=1/2)

    The result is a stupidly large expression because it grows exponentially and SymEngine doesn’t have a simplification function yet (it’s still pretty new!), but hey this is super cool anyways.

    Going Past The Edge: What Happens With Incompatibility?

    There are some caveats here. You do need to work through MethodErrors. For example, if you want to use Tsit5, the more efficient version of ode45/dopri5, you’ll get an error:

    sol = solve(prob,Tsit5())
     
     
    MethodError: no method matching abs2(::SymEngine.Basic)
    Closest candidates are:
      abs2(!Matched::Bool) at bool.jl:38
      abs2(!Matched::ForwardDiff.Dual{N,T<:Real}) at C:\Users\Chris\.julia\v0.5\ForwardDiff\src\dual.jl:325
      abs2(!Matched::Real) at number.jl:41
      ...

    What this is saying is that there is no abs2 function for symbolic expressions. The reason abs2 is needed is that the adaptive algorithm uses normed errors in order to decide how to change the stepsize. However, in order to get a normed error, we’d actually have to know the initial condition… so this just isn’t going to work.
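
    As an aside, one might be tempted to simply supply the missing method. A minimal sketch of why that only moves the problem (the method definition is mine, and the exact failure point may vary):

    using SymEngine
    Base.abs2(x::SymEngine.Basic) = x*x  # the error estimate now becomes a symbolic expression
    # ...but the step-size controller then has to compare that symbolic error estimate against
    # a numeric tolerance, which cannot be decided without knowing y0, so a similar MethodError
    # is to be expected further down the line.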

    Conclusion

    This is kind of a silly example just for fun, but there’s a more general statement here. In Julia, new algorithms can come from just passing new types to pre-written functionality. This vastly decreases the amount of work that you actually have to do, without a loss of efficiency! You can add this to your Julia bag of tricks.

    The post Some Fun With Julia Types: Symbolic Expressions in the ODE Solver appeared first on Stochastic Lifestyle.

    ]]>
    3617
    Plain Functions that Just Work with TensorFlow.jl http://www.juliabloggers.com/plain-functions-that-just-work-with-tensorflow-jl/ Thu, 04 May 2017 00:00:00 +0000 http://white.ucc.asn.au/2017/05/04/Plain-Functions-that-Just-Work-with-TensorFlow.jl.html Anyone who has been stalking me may know that I have been making a fairly significant number of PR’s against TensorFlow.jl. One thing I am particularly keen on is making the interface really Julian. Taking advantage of the ability to overload julia’s great syntax for matrix indexing and operations. I will make another post going into those enhancements sometime in the future; and how great julia’s ability to overload things is. Probably after #209 is merged. This post is not directly about those enhancements, but rather about a emergant feature I noticed today. I wrote some code to run in base julia, but just by changing the types to Tensors it now runs inside TensorFlow, and on my GPU (potentially).

    ]]>
    By: A Technical Blog -- julia

    Re-posted from: http://white.ucc.asn.au/2017/05/04/Plain-Functions-that-Just-Work-with-TensorFlow.jl.html

    Anyone who has been stalking me may know that I have been making a fairly significant number of PR’s against TensorFlow.jl.
    One thing I am particularly keen on is making the interface really Julian, taking advantage of the ability to overload Julia’s great syntax for matrix indexing and operations.
    I will make another post going into those enhancements, and into how great Julia’s ability to overload things is, sometime in the future. Probably after #209 is merged.
    This post is not directly about those enhancements, but rather about an emergent feature I noticed today.
    I wrote some code to run in base Julia, but just by changing the types to Tensors it now runs inside TensorFlow, and (potentially) on my GPU.

    Technically this did require one little PR, but it was just adding in the linking code for an operator.

    In [1]:

    using TensorFlow
    using Base.Test

    I have defined a function to determine which bin-index a continuous value belongs in.
    This is useful if one has discretized a continuous range of values, as is done in a histogram.
    This code lets you know which bin a given input lies within.

    It comes from my current research interest in using machine learning around the language of colors.

    In [2]:

    "Determine which bin a continuous value belongs in"
    function find_bin(value, nbins, range_min=0.0, range_max=1.0)
        portion = nbins * (value / (range_max - range_min))

        clamp(round(Int, portion), 1, nbins)
    end

    In [3]:

    @testset "Find_bin" begin
        @test find_bin(0.0, 64) == 1
        @test find_bin(1.0, 64) == 64
        @test find_bin(0.5, 64) == 32
        @test find_bin(0.4999, 64) == 32
        @test find_bin(0.5001, 64) == 32
    
        n_bins = 20
        for ii in 1.0:n_bins
            @test find_bin(ii, 20, 0.0, n_bins) == Int(ii)
        end
        
        @test [10, 11, 19, 2] == find_bin([0.5, 0.51, 0.9, 0.1], 21)
    end
    Test Summary: | Pass  Total
      Find_bin    |   26     26

    It is perfectly nice Julia code that runs happily with the types from Base,
    both on scalars and on Arrays, via broadcasting.

    Turns out, it will also run perfectly fine on TensorFlow Tensors.
    This time it will generate a computational graph which can be evaluated.

    In [4]:

    sess = Session(Graph())
    
    obs = placeholder(Float32)
    bins = find_bin(obs, 100)
    2017-05-04 15:34:12.893787: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties: 
    name: Tesla K40c
    major: 3 minor: 5 memoryClockRate (GHz) 0.745
    pciBusID 0000:02:00.0
    Total memory: 11.17GiB
    Free memory: 11.10GiB
    2017-05-04 15:34:12.893829: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0 
    2017-05-04 15:34:12.893835: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0:   Y 
    2017-05-04 15:34:12.893845: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40c, pci bus id: 0000:02:00.0)
    WARNING: You are using an old version version of the TensorFlow binary library. It is recommened that you upgrade with Pkg.build("TensorFlow") or various
                errors may be encountered.
     You have 1.0.0 and the new version is 1.0.1.

    In [5]:

    run(sess, bins, Dict(obs=>0.1f0))

    In [6]:

    run(sess, bins, Dict(obs=>[0.1, 0.2, 0.25, 0.261]))

    We can quite happily run the whole testset from before,
    using constant to change the inputs into constant Tensors,
    then running the operations to get back the result.

    In [7]:

    @testset "Find_bin_tensors" begin
        sess = Session(Graph()) #New graph
        
        
        @test run(sess, find_bin(constant(0.0), 64)) == 1
        @test run(sess, find_bin(constant(1.0), 64)) == 64
        @test run(sess, find_bin(constant(0.5), 64)) == 32
        @test run(sess, find_bin(constant(0.4999), 64)) == 32
        @test run(sess, find_bin(constant(0.5001), 64)) == 32
    
        n_bins = 20
        for ii in 1.0:n_bins
            @test run(sess, find_bin(constant(ii), 20, 0.0, n_bins)) == Int(ii)
        end
        
        @test [10, 11, 19, 2] ==  run(sess, find_bin(constant([0.5, 0.51, 0.9, 0.1]), 21))
    end
    2017-05-04 15:34:16.021916: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40c, pci bus id: 0000:02:00.0)
    Test Summary:      | Pass  Total
      Find_bin_tensors |   26     26

    It just works.
    In general that is a great thing to say about any piece of technology,
    be it a library, a programming language, or an electronic device.

    Whether or not it is particularly useful to be running integer clipping and rounding operations on the GPU is another question.
    It is certainly nice to be able to include this operation as part of a larger network definition.

    The really great thing about this is that the library maker does not need to know anything about TensorFlow, at all.
    I certainly didn’t have it in mind when I wrote the function.
    The function just works on any type, so long as the user provides suitable methods for the functions it uses via multiple dispatch.
    This is basically duck-typing:
    if it provides methods for quack and for waddle,
    then I can treat it like a Duck, even if it is a Goose.
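
    A tiny illustration of that point with types from Base (my example, not from the post): the generic find_bin also runs on BigFloats and Rationals, because they provide the handful of methods the function actually uses.

    # find_bin as defined above, repeated here so the snippet is self-contained
    find_bin(value, nbins, range_min=0.0, range_max=1.0) =
        clamp(round(Int, nbins * (value / (range_max - range_min))), 1, nbins)

    find_bin(big"0.4999", 64)        # 32 -- a BigFloat quacks like a Float64 here
    find_bin(3//10, 10, 0//1, 1//1)  # 3  -- exact Rational arithmetic throughout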

    It would not work if I had written, say:

    In [8]:

    function find_bin_strictly_typed(value::Float64, nbins::Int, range_min::Float64=0.0, range_max::Float64=1.0)
        portion = nbins * (value / (range_max - range_min))
    
        clamp(round(Int, portion), 1, nbins)
    end

    In [9]:

    run(sess, find_bin_strictly_typed(constant(0.4999), 64)) == 32

    The moral of the story is: don’t over-constrain your function parameters.
    Leave your functions loosely typed, and you may get free functionality later.

    ]]>
    3621
    Rhisco Group partners with Julia Computing http://www.juliabloggers.com/rhisco-group-partners-with-julia-computing/ Tue, 02 May 2017 00:00:00 +0000 http://juliacomputing.com/blog/2017/05/02/Rhisco-partnership Rhisco Group and Julia Computing are delighted to announce a new partnership to collaborate in providing solutions in the regulatory risk and capital space.

    ]]>
    By: Julia Computing, Inc.

    Re-posted from: http://juliacomputing.com/blog/2017/05/02/Rhisco-partnership.html

    Rhisco Group and Julia Computing are delighted to announce a new partnership to collaborate in providing solutions in the regulatory risk and capital space.

    As a highly-specialised boutique firm with significant expertise in risk analysis and various technologies, Rhisco provides services and solutions on risk capital technology implementations for banks, insurance and other financial companies. Julia Computing builds products focused on productivity, performance and scalability for multiple areas of data science and numerical computing leveraging the Julia programming language. Julia Computing’s products – JuliaPro, JuliaRun, and JuliaFin – are used for modelling financial contracts, asset management, risk management, algorithmic trading, backtesting, and many other areas of computational finance.

    By integrating JuliaFin and other products of Julia Computing, Rhisco will significantly enhance its integration platform, Tegra, improving the technology offered to clients that need high performance solutions for risk management and regulatory requirements.

    About Julia Computing and Julia

    Julia Computing was founded in 2015 by the co-creators of the Julia language to provide support to businesses and researchers who use Julia.

    Julia is the fastest modern high performance open source computing language for data and analytics. It combines the functionality and ease of use of Python, R, Matlab, SAS and Stata with the speed of Java and C++. Julia delivers dramatic improvements in simplicity, speed, scalability, capacity and productivity. Julia provides parallel computing capabilities out of the box and virtually unlimited scalability with minimal effort. With more than 1 million downloads and +161% annual growth, Julia adoption is growing rapidly in finance, energy, robotics, genomics and many other fields.

    About Rhisco Group and TEGRA

    Rhisco was founded in 2010 by expert professionals, supporting clients internationally through its head office in London, a subsidiary in Mexico, and a network of associates and consulting partners in Europe, Latin America, Middle East and Africa. The people behind Rhisco have professional, quantitative and technical acumen acquired through several years of industry and consulting practice internationally.

    Rhisco has developed TEGRA, a modular integration platform to enhance clients’ existing risk infrastructure and accelerate the implementation of new-age risk technology, including cloud computing. TEGRA software components provide a significant edge to pricing/risk engines developed by third parties or by the client. This is done through innovative technology for data intelligence, distributed computing, and advanced aggregation.

    ]]>
    3613
    DifferentialEquations.jl 2.0 http://www.juliabloggers.com/differentialequations-jl-2-0/ Sun, 30 Apr 2017 01:30:00 +0000 http://juliadiffeq.org/2017/04/30/API_changes.html ]]> By: JuliaDiffEq

    Re-posted from: http://juliadiffeq.org/2017/04/30/API_changes.html

    This marks the release of ecosystem version 2.0. All of the issues got looked
    over. All (yes all!) of the API suggestions that were recorded in issues in
    JuliaDiffEq packages have been addressed! Below are the API changes that have occurred.
    This marks a really good moment for the JuliaDiffEq ecosystem because it means all
    of the long-standing planned API changes are complete. Of course new things may come
    up, but there are no more planned changes to core functionality. This means that we can simply
    work on new features in the future (and of course field bug reports as they come).
    A blog post detailing our full 2.0 achievements plus our 3.0 goals will come out at
    our one year anniversary. But for now I want to address what the API changes are,
    and the new features of this latest update.

    ]]>
    3627
    Wilmott – Automatic Differentiation for the Greeks http://www.juliabloggers.com/wilmott-automatic-differentiation-for-the-greeks/ Wed, 26 Apr 2017 00:00:00 +0000 http://juliacomputing.com/blog/2017/04/26/Wilmott-AD-Greeks Online quantitative finance magazine Wilmott featured Julia yet again.

    ]]>
    By: Julia Computing, Inc.

    Re-posted from: http://juliacomputing.com/blog/2017/04/26/Wilmott-AD-Greeks.html

    Online quantitative finance magazine Wilmott featured Julia yet again.

    Julia Computing’s Dr. Simon Byrne and Dr. Andrew Greenwell engage the magazine readers in a solution they built in Julia, that uses Automatic Differentiation (AD) to calculate price sensitivities, also known as the Greeks.

    Fast and accurate calculation of these price sensitivities is crucial in understanding the risk of an option position, and using AD in Julia achieves precisely that.

    Traditionally, the world is familiar with using finite-difference approximation for the same calculations. Simon and Andrew go on to argue how that approach is numerically unstable, and how their solution will not only improve numerical accuracy, but will also eliminate computational overhead.

    To put that in context, there are C++ libraries that assist in these calculations too, QuantLib being one of them. However, a simple implementation of a Cox–Ross–Rubinstein tree (for pricing an American put) with AD in Julia fared 3x faster than with the C++ library. The code for this example is available here. You can also read the article to learn more.
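
    For readers who want a feel for the approach, here is a minimal sketch of mine (not the article’s code or benchmark): write a Cox–Ross–Rubinstein tree generically and let ForwardDiff.jl push dual numbers through it to obtain a Greek such as delta.

    using ForwardDiff

    # Generic CRR binomial tree for an American put; works for any number-like S0
    function crr_american_put(S0, K, r, sigma, T, N)
        dt = T/N
        u  = exp(sigma*sqrt(dt)); d = 1/u
        p  = (exp(r*dt) - d)/(u - d)          # risk-neutral up probability
        disc = exp(-r*dt)
        V = [max(K - S0*u^j*d^(N-j), zero(S0)) for j in 0:N]  # terminal payoffs
        for n in N-1:-1:0, j in 0:n
            cont = disc*(p*V[j+2] + (1-p)*V[j+1])             # continuation value
            V[j+1] = max(cont, K - S0*u^j*d^(n-j))            # allow early exercise
        end
        V[1]
    end

    price = crr_american_put(100.0, 100.0, 0.05, 0.2, 1.0, 500)
    # delta = d(price)/d(S0), computed by running dual numbers through the same tree
    delta = ForwardDiff.derivative(S -> crr_american_put(S, 100.0, 0.05, 0.2, 1.0, 500), 100.0)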

    At Julia Computing, we curate all this and much more as part of JuliaFin, a suite of Julia packages that simplify the workflow for quantitative finance, including storage, retrieval, analysis and action.

    Julia is already solving a variety of use cases. BlackRock, the Federal Reserve Bank of New York, Nobel Laureate Thomas J. Sargent, and the world’s largest investment banks, insurers, risk managers, fund managers, asset managers, foreign exchange analysts, energy traders, commodity traders and others are all using Julia to solve some of their very complex and challenging quantitative computational problems.

    ]]>
    3610
    Timing in Julia http://www.juliabloggers.com/timing-in-julia/ Mon, 24 Apr 2017 12:10:48 +0000 http://www.pkofod.com/?p=181 Continue reading Timing in Julia ]]> By: pkofod

    Re-posted from: http://www.pkofod.com/2017/04/24/timing-in-julia/

    Timing code is important when you want to benchmark or profile your code. Is it the solution of a linear system or the Monte Carlo integration scheme that takes up most of the time? Is version A or version B of a function faster? Questions like that show up all the time. Let us have a look at a few of the possible ways of timing things in Julia.

    The basics

    The most basic timing functionalities in Julia are the ones included in the Base language. The standard way of timing things in Julia is by use of the @time macro.

    julia> function test(n)
               A = rand(n, n)
               b = rand(n)
               @time A\b
           end
    test (generic function with 1 method)

    Do note that the code we want to time is put in a function. This is because everything we do at the top level in the REPL is in global scope. It’s a mistake a lot of people make all the time, but currently it is a very bad idea performance-wise. Anyway, let’s see what happens for n = 1, 10, 100, and 1000.

    julia> test(1);
      0.000002 seconds (7 allocations: 320 bytes)
     
    julia> test(10);
      0.000057 seconds (9 allocations: 1.313 KB)
     
    julia> test(100);
      0.001425 seconds (10 allocations: 80.078 KB)
     
    julia> test(1000);
      0.033573 seconds (10 allocations: 7.645 MB, 27.81% gc time)
     
    julia> test(1000);
      0.045214 seconds (10 allocations: 7.645 MB, 47.66% gc time)

    The first run is there to compile test, and then we have a look at what happens when the dimension of our problem increases. Elapsed time seems to increase, and we also see that the number of allocations and the amount of memory that was allocated increase. For the runs with dimension 1000 we see something else in the output: 30-50% of the time was spent in “gc”. What is this? Julia is a garbage-collected language. This means that Julia keeps track of current allocations, and frees the memory if it isn’t needed anymore. It doesn’t do this all the time, though. Running the 1000-dimensional problem once more gives us

    julia> test(1000)
      0.029277 seconds (10 allocations: 7.645 MB)

    We see it runs slightly faster, and there is no GC time this time around. Of course, these things will look slightly different if you try to replicate them.

    So now we can time. But what if we want to store this number? We could be tempted to try

    t = @time 3+3

    but we will realize that what is returned is the return value of the expression, not the elapsed time. To save the time, we can either use @timed or @elapsed. Let us try to change @time to @timed and look at the output when we have our new test2 function return the result.

    julia> function test2(n)
               A = rand(n, n)
               b = rand(n)
               @timed A\b
           end
    test2 (generic function with 1 method)
     
    julia> test2(3)
    ([0.700921,-0.120765,0.683945],2.7889e-5,512,0.0,Base.GC_Diff(512,0,0,9,0,0,0,0,0))

    We see that it returns a tuple with: the return value of A\b followed by the elapsed time, then the bytes allocated, time spent in garbage collection, and lastly some further memory counters. This is great, as we can now work with the information @time printed while still having access to the results of our calculations. Of course, it is a bit involved to do it this way. If we simply wanted to see the elapsed time and act on that, then we would just use @time as we did above.

    Before we move on to some simpler macros, let us consider the last “time*-family” macro: @timev. As we saw above, @timed contained more information about memory allocation than @time printed. If we want the “verbose” version, we use @timev (v for verbose):

    julia> function test3(n)
               A = rand(n, n)
               b = rand(n)
               @timev A\b
           end
    test3 (generic function with 1 method)

    Running test3 on a kinda large problem, we see that it does indeed print the contents of Base.GC_Diff

    julia> test3(5000);
      1.923164 seconds (12 allocations: 190.812 MB, 4.67% gc time)
    elapsed time (ns): 1923164359
    gc time (ns):      89733440
    bytes allocated:   200080368
    pool allocs:       9
    malloc() calls:    3
    GC pauses:         1
    full collections:  1

    If any of the entries are zero, the corresponding lines are omitted.

    julia> test3(50);
      0.001803 seconds (10 allocations: 20.828 KB)
    elapsed time (ns): 1802811
    bytes allocated:   21328
    pool allocs:       9
    malloc() calls:    1

    Of the three macros, you’ll probably not use @timev a lot.

    Simpler versions

    If we only want the elapsed time or only want the allocations, then we use either the @elapsed or @allocated macro. However, these do not return the results of our calculations, so in many cases it may be easier to just use @timed, so we can grab the results, the elapsed time, and the allocation information. “MATLAB”-style tic();toc() timing is also available: toc() prints the time, while toq() is used if we want only the returned time without the printing. It is also possible to use time_ns() to do what time.time() would do in Python, although for practically all purposes, the above macros are recommended.
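
    A small sketch of these simpler helpers on the example from before (Julia 0.5/0.6-era syntax, values indicative only):

    A = rand(1000, 1000); b = rand(1000)

    t = @elapsed A\b       # elapsed seconds as a Float64; the result of A\b is discarded
    m = @allocated A\b     # bytes allocated by the expression

    tic()                  # MATLAB-style timing
    A\b
    t2 = toq()             # like toc(), but returns the time instead of printing it

    t0 = time_ns()
    A\b
    elapsed_ns = time_ns() - t0   # raw nanosecond timestamps, similar in spirit to time.time()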

    More advanced functionality

    Moving on to more advanced features, we venture into the package ecosystem.

    Nested timings

    The first package I will present is the nifty TimerOutputs.jl by Kristoffer Carlsson. This package essentially allows you to nest @time calls. The simplest way to show how it works is to use the example posted at the announcement (so credit to Kristoffer for the example).

    using TimerOutputs
     
    # Create the timer object
    to = TimerOutput()
     
    # Time something with an assigned label
    @timeit to "sleep" sleep(0.3)
     
    # Data is accumulated for multiple calls
    for i in 1:100
        @timeit to "loop" 1+1
    end
     
    # Nested sections are possible
    @timeit to "nest 1" begin
        @timeit to "nest 2" begin
            @timeit to "nest 3.1" rand(10^3)
            @timeit to "nest 3.2" rand(10^4)
            @timeit to "nest 3.3" rand(10^5)
        end
        rand(10^6)
    end

    Basically we’re timing the sleep call in one time counter, all the additions in the loop in another counter, and then we do some nested generation of random numbers. Displaying the to instance gives us something like the following

     ───────────────────────────────────────────────────────────────────────
                                    Time                   Allocations      
                            ──────────────────────   ───────────────────────
        Tot / % measured:        6.48s / 5.60%           77.4MiB / 12.0%    
     
     Section        ncalls     time   %tot     avg     alloc   %tot      avg
     ───────────────────────────────────────────────────────────────────────
     sleep               1    338ms  93.2%   338ms    804KiB  8.43%   804KiB
     nest 1              1   24.7ms  6.80%  24.7ms   8.52MiB  91.5%  8.52MiB
       nest 2            1   9.10ms  2.51%  9.10ms    899KiB  9.43%   899KiB
         nest 3.1        1   3.27ms  0.90%  3.27ms   8.67KiB  0.09%  8.67KiB
         nest 3.3        1   3.05ms  0.84%  3.05ms    796KiB  8.34%   796KiB
         nest 3.2        1   2.68ms  0.74%  2.68ms   92.4KiB  0.97%  92.4KiB
     loop              100   6.97μs  0.00%  69.7ns   6.08KiB  0.06%      62B
     ───────────────────────────────────────────────────────────────────────

    which nicely summarizes absolute and relative time and memory allocation of the individual @timeit calls. A real use case could be to see what the effect is of using finite differencing to construct the gradient for the Generalized Rosenbrock (GENROSEN) problem from CUTEst.jl using a conjugate gradient solver in Optim.jl.

    using CUTEst, Optim, TimerOutputs
     
    nlp = CUTEstModel("GENROSE")
     
    const to = TimerOutput()
     
    f(x    ) =  @timeit to "f"  obj(nlp, x)
    g!(g, x) =  @timeit to "g!" grad!(nlp, x, g)
     
    begin
    reset_timer!(to)
    @timeit to "Conjugate Gradient" begin
        res = optimize(f, g!, nlp.meta.x0, ConjugateGradient(), Optim.Options(iterations=5*10^10));
        println(Optim.minimum(res))
    end
    @timeit to "Conjugate Gradient (FiniteDiff)" begin
        res = optimize(f, nlp.meta.x0, ConjugateGradient(), Optim.Options(iterations=5*10^10));
        println(Optim.minimum(res))
    end
    show(to; allocations = false)
    end

    the output is a table as before, this time without the allocations (notice the use of the allocations keyword in the show method)

     ────────────────────────────────────────────────────────────────
                                                       Time          
                                               ──────────────────────
                 Tot / % measured:                  33.3s / 100%     
     
     Section                           ncalls     time   %tot     avg
     ────────────────────────────────────────────────────────────────
     Conjugate Gradient (FiniteDiff)        1    33.2s  99.5%   33.2s
       f                                1.67M    32.6s  97.9%  19.5μs
     Conjugate Gradient                     1    166ms  0.50%   166ms
       g!                               1.72k   90.3ms  0.27%  52.6μs
       f                                2.80k   59.1ms  0.18%  21.1μs
     ────────────────────────────────────────────────────────────────

    And we conclude: finite differencing is very slow when you’re solving a 500 dimensional unconstrained optimization problem, and you really want to use the analytical gradient if possible.

    Benchmarking

    Timing individual pieces of code can be very helpful, but when we’re timing small function calls, this way of measuring performance can be heavily influenced by noise. To remedy that, we use proper benchmarking tools. The package for that, well, it’s called BenchmarkTools.jl and is mainly written by Jarrett Revels. The package is quite advanced in its feature set, but its basic functionality is straightforward to use. Please see the manual for more details than we provide here.

    Up until now, we’ve asked Julia to tell us how much time some code took to run. Unfortunately for us, the computer is doing lots of stuff besides the raw calculations we’re trying to time. From the example earlier, this means that we have a lot of noise in our measure of the time it takes to solve A\b. Let us try to run test(1000) a few times

    julia> test(1000);
      0.029859 seconds (10 allocations: 7.645 MB)
     
    julia> test(1000);
      0.033381 seconds (10 allocations: 7.645 MB, 6.41% gc time)
     
    julia> test(1000);
      0.024345 seconds (10 allocations: 7.645 MB)
     
    julia> test(1000);
      0.039585 seconds (10 allocations: 7.645 MB)
     
    julia> test(1000);
      0.037154 seconds (10 allocations: 7.645 MB, 2.82% gc time)
     
    julia> test(1000);
      0.024574 seconds (10 allocations: 7.645 MB)
     
    julia> test(1000);
      0.022185 seconds (10 allocations: 7.645 MB)

    There’s a lot of variance here! Let’s benchmark instead. The @benchmark macro won’t work inside a function as above. This means that we have to be a bit careful (thanks to Fengyang Wang for clarifying this). Consider the following

    julia> n = 200;
     
    julia> A = rand(n,n);
     
    julia> b = rand(n);
     
    julia> @benchmark A\b
    BenchmarkTools.Trial: 
      memory estimate:  316.23 KiB
      allocs estimate:  10
      --------------
      minimum time:     531.227 μs (0.00% GC)
      median time:      718.527 μs (0.00% GC)
      mean time:        874.044 μs (3.12% GC)
      maximum time:     95.515 ms (0.00% GC)
      --------------
      samples:          5602
      evals/sample:     1

    This is fine, but since A and b are globals (remember, if it ain’t wrapped in a function, it’s a global when you’re working from the REPL), we’re also measuring the time dynamic dispatch takes. Dynamic dispatch happens here, because “Julia” cannot be sure what the types of A and b are when we invoke A\b since they’re globals. Instead, we should use interpolation of the non-constant variables, or mark them as constants using const A = rand(n,n) and const b = rand(n). Let us use interpolation.

    julia> @benchmark $A\$b
    BenchmarkTools.Trial: 
      memory estimate:  316.23 KiB
      allocs estimate:  10
      --------------
      minimum time:     531.746 μs (0.00% GC)
      median time:      717.269 μs (0.00% GC)
      mean time:        786.240 μs (3.22% GC)
      maximum time:     12.463 ms (0.00% GC)
      --------------
      samples:          6230
      evals/sample:     1

    We see that the memory information is identical to the information we got from the other macros, but we now get a much more robust estimate of the time it takes to solve our A\b problem. We also see that dynamic dispatch was negligible here, as the solution takes much longer to compute than for Julia to figure out which method to call. The @benchmark macro will do various things automatically, to try to give as accurate results as possible. It is also possible to provide custom tuning parameters, say if you’re running these benchmarks over an extended period of time and want to track performance regressions, but that is beyond this blog post.
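
    As a taste of those tuning parameters, here is a minimal sketch (see the BenchmarkTools.jl manual for the full set):

    using BenchmarkTools

    x = rand(1000)

    # Per-call tuning: cap the number of samples, evaluations per sample, and the time budget
    @benchmark sum($x) samples=1000 evals=10 seconds=2

    # Or change the defaults for the whole session
    BenchmarkTools.DEFAULT_PARAMETERS.seconds = 10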

    Dynamic dispatch

    Before we conclude, let’s have a closer look at the significance of dynamic dispatch. When using globals, it has to be determined at run time which method to call. If there are only a few methods, this may not be a problem, but the problem begins to show itself when a function has a lot of methods. For example, on Julia v0.5.0, identity has one method, but + has 291 methods. Can we measure the significance of dynamic dispatch then? Sure. Just benchmark with and without interpolation (thanks again to Fengyang Wang for cooking up this example). To keep output from being too verbose, we’ll use the @btime macro – again from BenchmarkTools.jl.

    julia> x = 0
    0
     
    julia> @btime identity(x)
      1.540 ns (0 allocations: 0 bytes)
    0
     
    julia> @btime +x
      15.837 ns (0 allocations: 0 bytes)
    0
     
    julia> @btime identity($x)
      1.540 ns (0 allocations: 0 bytes)
    0
     
    julia> @btime +$x
      1.548 ns (0 allocations: 0 bytes)
    0

    As we can see, calling + on the global x takes around 10 times as long as the single-method function identity. To show that declaring the input a const and interpolating the variable give the same result, consider the example below.

    julia> const y = 0
    0
     
    julia> @btime identity(y)
      1.539 ns (0 allocations: 0 bytes)
    0
     
    julia> @btime +y
      1.540 ns (0 allocations: 0 bytes)
    0
     
    julia> @btime identity($y)
      1.540 ns (0 allocations: 0 bytes)
    0
     
    julia> @btime +$y
      1.540 ns (0 allocations: 0 bytes)
    0

    We see that interpolation is not needed, as long as we remember to use constants.

    Conclusion

    There are quite a few ways of measuring performance in Julia. I’ve presented some of them here, and hopefully you’ll be able to put the tools to good use. The functionality from Base is good for many purposes, but I really like the nested time measuring in TimerOutputs.jl a lot, and for serious benchmarking it is impossible to ignore BenchmarkTools.jl.

    ]]>
    3608
    Julia at the Intel AI Day, Bangalore 2017 http://www.juliabloggers.com/julia-at-the-intel-ai-day-bangalore-2017/ Mon, 24 Apr 2017 00:00:00 +0000 http://juliacomputing.com/blog/2017/04/24/Intel-AI-Day Bengaluru, India - Julia Computing featured at one of India’s most prominent AI conferences, demonstrating two very powerful Deep Learning use cases the company is trying to solve using Julia on Intel’s hardware.

    ]]>
    By: Julia Computing, Inc.

    Re-posted from: http://juliacomputing.com/blog/2017/04/24/Intel-AI-Day.html

    Bengaluru, India – Julia Computing featured at one of India’s most prominent AI conferences, demonstrating two very powerful Deep Learning use cases the company is trying to solve using Julia on Intel’s hardware.

    The two-day event was organized and crafted to showcase robust AI-supporting hardware and software solutions from Intel and its partners.

    Julia Computing Inc, one of Intel’s partners from India, took centre stage on day two for their demonstrations. The first of the two demos, namely Neural Styles, caught the audience’s fancy – building a neural network that imposes the style of one image onto another. Our very own Ranjan Anantharaman took a live picture of the audience and applied transforms to it in real time.

    The second demo targeted the serious problem of identifying whether a person has symptoms of Diabetic Retinopathy using only an image of the retina as input, without any human intervention.

    The following video holds a glimpse of what the two firms envision as the future of AI.

    Julia Computing and Intel – Accelerating the AI revolution

    Also see this whitepaper on Intel and Julia Computing working together on an AI stack.

    About Julia

    Julia is the simplest, fastest and most powerful numerical computing language available today. Julia combines the functionality of quantitative environments such as Python and R, with the speed of production programming languages like Java and C++ to solve big data and analytics problems. Julia delivers dramatic improvements in simplicity, speed, capacity, and productivity for data scientists, algorithmic traders, quants, scientists, and engineers who need to solve massive computational problems quickly and accurately.

    Julia offers an unbeatable combination of simplicity and productivity with speed that is thousands of times faster than other mathematical, scientific and statistical computing languages.

    Partners and users include: Intel, The Federal Reserve Bank of New York, Lincoln Laboratory (MIT), The Moore Foundation and a number of private sector finance and industry leaders, including several of the world’s leading hedge funds, investment banks, asset managers and insurers.

    About Julia Computing, Inc.

    Julia Computing, Inc. was founded in 2015 to develop products around Julia such as JuliaFin. These products help financial firms leverage the 1,000x improvement in speed and productivity that Julia provides for trading, risk analytics, asset management, macroeconomic modeling and other areas. Products of Julia Computing make Julia easy to develop, easy to deploy and easy to scale.

    ]]>
    3600