By: Simon Danisch
There are currently several problem areas in visualization and high-performance computing that make it hard to create a pleasant and fast workflow.
- There is no seamless path from data to high-performance visualization libraries. Even if your data processing runs on the GPU, it is common practice to serialize the results to disk and load them into an external visualization program, which then uploads the data to the GPU again.
- You need to know several low-level languages and APIs to get full control over the data processing/simulation and visualization pipeline; if you're not an expert, you won't get far.
- It is often hard to set up a fast pipeline from different libraries that all need to work together (the worst setup took me a week before I could even get started).
- Many problems in scientific computing need custom solutions. If your technology stack is fragmented across many languages, it is very hard to create a custom solution without starting from scratch.
- Once your research is done, it might only run on a handful of platforms.
Vulkan and Julia can form the next framework for high performance scientific computing with a new level of native support for interactive 2D/3D visualizations across different platforms.
Short introduction to Vulkan:
Vulkan is the newest industry standard for general-purpose computing and graphics on the GPU, released by the Khronos Group on February 16, 2016. It can be thought of as the successor to Khronos's OpenGL (graphics) and OpenCL (GPGPU), but it is designed almost from scratch.
The design offers a new kind of flexibility and performance when programming for the GPU.
One of the biggest changes is that graphics shaders and compute kernels now share the same intermediate representation (SPIR-V) and execution model. This binary intermediate representation is very similar to LLVM-IR, and bidirectional translators exist between SPIR-V and LLVM-IR.
SPIR-V is easy to target from different languages, finally opening up the world of GPU-accelerated graphics and computing to more languages.
Another improvement is the elimination of most driver overhead and better support for multithreading, which makes it easier to keep the CPU busy while work is running on the GPU.
Finally, Vulkan is expected to run on very different hardware setups, from multi-node clusters with thousands of GPUs down to a smartphone, all while squeezing out the maximum performance.
All this comes at the cost of a complicated runtime, in which you have to manage memory allocations yourself and schedule programs via command buffers through a fairly complex C API.
Short introduction to Julia:
Julia is a young programming language promising to be the greediest of all!
It is easy to use and offers the kind of performance you would expect from C, while providing the freedom and usability known from dynamic languages like Python.
The main language features are multiple dispatch and a rather functional style of programming; together they make Julia a good fit for mathematical code and parallel programming. As a dynamically typed language with little boilerplate, a large standard library, and many scientific libraries, it makes it very easy to work interactively on powerful scripts, ranging from data analysis to generating HTML for a website.
High speed in this setting is made possible by clever runtime type inference and just-in-time compilation of the then (mostly) fully typed multimethods.
The just-in-time compilation is done via LLVM, which means Julia already targets LLVM-IR as an intermediate representation. Since Julia's memory layout and binary format are compatible with C's, Julia can call C functions with almost no overhead. Finally, Julia runs on most platforms supported by LLVM and is designed to scale to large clusters.
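To make the last two points concrete, here is a minimal sketch (assuming a recent Julia) of multiple dispatch and of calling a C function directly, using libc's `strlen` as a stand-in for a real Vulkan entry point:

```julia
# Multiple dispatch: one generic function, methods selected by argument types.
area(r::Real) = pi * r^2            # area of a circle with radius r
area(w::Real, h::Real) = w * h      # area of a w-by-h rectangle

area(1.0)       # dispatches to the one-argument method
area(3.0, 2.0)  # dispatches to the two-argument method

# Zero-glue C interop: call libc's strlen directly, no wrapper library needed.
n = ccall(:strlen, Csize_t, (Cstring,), "Vulkan")  # → 6
```

Each `ccall` compiles down to an ordinary native call, which is what makes driving a C API like Vulkan from Julia cheap.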
You might already see where this is going.
We have a language aimed at number crunching and scientific computing, with a huge demand for processing power, and an API that lets us squeeze the last bit of performance out of heterogeneous hardware.
We have a hard to control Vulkan C-API and a scripting language which excels at calling C code.
We have a language that compiles to LLVM-IR and translators that turn LLVM-IR into SPIR-V, which could allow the language to run natively on the GPU.
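As a quick illustration (a sketch assuming Julia's standard tooling), you can inspect the LLVM IR Julia emits for any function — the same kind of IR those SPIR-V translators operate on:

```julia
using InteractiveUtils  # provides code_llvm (part of Base in older Julia versions)

double(x) = 2x

# Capture the LLVM IR generated for double(::Float64) as a string.
ir = sprint(code_llvm, double, (Float64,))
print(ir)  # the typed, optimized LLVM-IR for this one method
```

The IR you see here is fully typed and specialized, which is exactly the form a SPIR-V backend would need.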
Everything is there for a great support of high performance computing and visualizations from within one easy to use language.
All these goodies are very close, but there is still much work to be done. We need a wrapper for the Vulkan API, and Julia's LLVM-IR needs some tweaking to conform to the SPIR-V standard, which is a fairly involved task.
To bring Julia and Vulkan together step by step, a two-stage approach seems advisable:
- Use Julia as a scripting language for Vulkan, providing a nice interface to memory and command buffers while still relying on C/C++ or GLSL kernels to produce SPIR-V executables.
- Make Julia compile to SPIR-V itself, completely eliminating the need for other languages.
In the first stage, Julia already offers great advantages.
From my experience with the Julia OpenCL and OpenGL wrappers, decorating a low-level API like Vulkan with a higher-level Julia interface offers a great improvement in productivity and safety while losing almost no performance.
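As a sketch of what such a decoration could look like (hypothetical type name, with libc's malloc/free standing in for Vulkan's allocation calls), a raw C handle can be wrapped in a type that releases itself automatically:

```julia
# Hypothetical sketch: wrap a raw C resource in a Julia type so it cannot
# leak — the finalizer frees it when the object is garbage collected.
mutable struct RawBuffer
    ptr::Ptr{Cvoid}
    function RawBuffer(nbytes::Integer)
        ptr = ccall(:malloc, Ptr{Cvoid}, (Csize_t,), nbytes)
        ptr == C_NULL && error("allocation of $nbytes bytes failed")
        buf = new(ptr)
        finalizer(b -> ccall(:free, Cvoid, (Ptr{Cvoid},), b.ptr), buf)
        return buf
    end
end

buf = RawBuffer(1024)  # freed automatically, no manual cleanup to forget
```

A real Vulkan wrapper would do the same around `vkAllocateMemory`/`vkFreeMemory` handles, plus argument checking, which is where most of the productivity and safety gains come from.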
Then in the final stage, we could even choose freely what to run on the CPU and what to run on the GPU, since the Julia code can run on both.
You could create software that works on the desktop and on tablets and mobile phones without any code duplication.
We could simulate the gravity of large galaxies on a cluster of GPUs while immediately visualizing the result, all while staying in a nice high-level programming language and having full control over all the libraries involved.
This could be done in an interactive way, refining functions and algorithms while directly getting feedback.
We could finally build the interactive tools that stream big data in parallel, while seamlessly viewing and editing the data.
If you have a powerful GPU, you could even dive into virtual reality to get a good look at the data. If you know a bit about VR, you also know that it is very sensitive to latency, making it a must to rely on tools with the highest possible performance.
I think the motivation is clear, now we only need to implement the missing bits!