
ChatGPT performs better on Julia than Python (and R) for Large Language Model (LLM) Code Generation. Why?

By: Christopher Rackauckas

Re-posted from: http://www.stochasticlifestyle.com/chatgpt-performs-better-on-julia-than-python-and-r-for-large-language-model-llm-code-generation-why/

Machine learning is all about examples. The more data you have, the better it should perform, right? With the rise of ChatGPT and Large Language Models (LLMs) as code-helping tools, the natural assumption was that the most popular languages, like Python, would work best with LLMs. But because of the productivity gains, I tend to use a lot of Julia, a language with an estimated user base of around a million programmers. For this reason, people have often asked me how it fares with ChatGPT, GitHub Copilot, etc., so I tried those tools out and… was stunned. It’s really good. It seemed better than Python, actually?

The data is in: Julia does well with ChatGPT

This question was recently put to the test by a researcher named Alessio Buscemi in A Comparative Study of Code Generation using ChatGPT 3.5 across 10 Programming Languages. Basically, he set up a bunch of example problems, asked ChatGPT for the code, executed it, and documented the results. The main result of the paper, in the words of the author:

“Overall, 1833 runs, or 45.8% of the total number, lead to executable code. However, this percentage varies greatly according to the tested language. ChatGTP performs the best on Julia, with a 81.5% of generated code being successfully executed, and performs the worst on C++, with only 7.3% of the executions being successful. Specifically, the model seems to perform better on high-level dynamically typed languages (Javascript, Julia, Perl, Python, R, Ruby, Smalltalk) rather than lower level statically typed languages (C, C++, Go).”

That is right: of all languages, “ChatGTP performs the best on Julia”. In fact, ChatGPT generally only performed well on slow languages: Julia is the only fast language it did well on. The author went on to discuss the study on the DataSkeptic podcast, where at 25:20 he addresses this result and is unable to explain why ChatGPT made him more successful in Julia than in Python, even though he himself had only used Python before.

Is it unexpected that Julia would outperform Python in generated code execution and correctness?

But the real question is: is this really unexpected? I don’t think so; this aligns with what I have seen. I am an educator (currently at MIT researching/teaching in machine learning) who has run many workshops over the last 10 years, with the languages mainly being Julia, Python, MATLAB, R, and C. I have also recently been a power user of this kind of translation, because I have been updating the bindings and documentation of diffeqpy, a fast differential equation solver package in Python which uses Julia as the backend. This has required a lot of translating Julia DifferentialEquations.jl code into a Python format for diffeqpy, and thus there was some ChatGPT involved.

[A quick intro to the “why of diffeqpy”: we recently had a peer-reviewed study demonstrating that the GPU kernels generated from Julia for ODE solvers were about 20x-100x faster than the leading PyTorch and Jax libraries. Given these results, we wanted to create a Julia backend for scientists who use Python, and could demonstrate in Google Colab notebooks that even when used through Python with automated language translation, it’s still about 10x faster than Jax. The future is working together: if you happen to be at PyData Eindhoven or PyData Global, come chat with me about building bridges between languages!]

In both of these scenarios, the development of diffeqpy with the help of ChatGPT and teaching new students Julia vs Python, one of the things that really stuck out was that new developers and AI get caught on the API inconsistencies of Python. “But Python is so Pythonic, it doesn’t have inconsistencies?”… ehh, have you tried to teach it? I think it’s an interesting thing in the psyche of a programmer. If you have used a programming language for 10 years, then everything the programming language does feels natural. It’s like an English speaker using the word “do”: do-support is very difficult for new language learners, but someone who natively learned English may have never thought twice about how weird the word is.

There is a lot of this “it’s always been like this and therefore it makes sense” in Python. In the workshops, it always got highlighted best when using a cheatsheet that shows Julia vs MATLAB vs Python syntax side-by-side. One of the examples that comes to mind is the syntax for the simplest things, like “let’s create an array of ones, zeros, or random numbers”. What does that look like side-by-side?
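
Since the cheatsheet image is not reproduced here, a rough sketch of the kind of entry it shows (MATLAB and Python equivalents as comments; the exact cheatsheet wording is not preserved):

# MATLAB:  zeros(2,2)          ones(2,2)
# Python:  np.zeros((2, 2))    np.ones((2, 2))    (needs "import numpy as np")
# Julia:
zeros(2, 2)
ones(2, 2)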

The Python form in the middle is undoubtedly a bit weird, requires extra packages (“what is np?”), etc., but it’s not “awful”. However, it’s then the rand part that gets students:
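
Again as a sketch rather than the original image, the random-number entry looks roughly like:

# MATLAB:  rand(2,2)
# Python:  np.random.rand(2, 2)    (note: no double parentheses, unlike np.zeros((2, 2)))
# Julia:
rand(2, 2)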

And that’s when it all becomes clear. Going from np.zeros((2, 2)) to np.random.rand(2, 2), students always ask “why the ((?”, and the answer is “well, there’s a tuple and …”, but no, no matter how much we as educators try to explain it, it’s just an inconsistency. It’s just harder. And it happens over and over in the teaching. Sometimes it’s that the syntax is more complex or opaque:
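
As a reconstructed example of that kind of comparison (not the original image), consider the sparse-matrix construction that also comes up later in this post:

# Python (SciPy):
#   from scipy.sparse import coo_matrix
#   A = coo_matrix(([4, 1], ([0, 1], [1, 1])), shape=(2, 2))
# Julia:
using SparseArrays
A = sparse([1, 2], [2, 2], [4, 1], 2, 2)   # values 4 and 1 at positions (1,2) and (2,2)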

while in other cases it’s inconsistency, unconventional wording, or something else.

So what happened with ChatGPT? It tripped up on exactly the same points that new learners commonly trip up on. Common issues I noticed when developing diffeqpy were:

  • Results that were “right” but contextually wrong because of standard library inconsistencies or the lack of a standard library. For example, “create an array of random numbers in Python” does “import random”, then “random_numbers = [random.randint(1, 100) for _ in range(5)]”. Okay, that’s not wrong, but “obviously” the right thing to do is to create a numpy array in any context where I am using a differential equation solver. But ChatGPT doesn’t know these contextual differences, it does not guess well, and if a user is not already familiar with Python they will get tripped up. The solution is to be more specific: “create an array of random numbers in Python with numpy”, i.e. prompt engineering. But notably, this prompt engineering was not required with the same examples in Julia, and the need for it is generally a result of Python’s lack of core standardization, i.e. having many different ways to do basic things (numpy vs PyTorch vs standard libraries and methods built into CPython).
  • Results that were “close” but tripped up by extra complexity. “A = coo_matrix(([4, 1],([0, 1], [1, 1])),shape=(2, 2))” is undeniably hard syntax to guess, and ChatGPT messed up quite a bit on examples like these sparse matrix constructions and forced vectorizations. Interestingly, it also had some difficulties translating code to zero-based indexing in a few cases. Mathematical models in texts tend to be written with one-based indexing ($x_1$), and converting from one-based to zero-based gave ChatGPT an extra step, which I found tripped it up in a few cases.
  • Results that were too inefficient to be used in practice. This is something that tag-teamed with the last one. Sometimes the code that was correct would be a for loop (which would oftentimes have difficulty compiling in numba), but that code would inhibit performance enough that I would ask for a vectorized version, in which case it would create a weird matrix with some syntax errors (and which is also not fast even after fixing the syntax errors).
  • Results that were clearly pulling from “bad data”. More data is not always better. One other interesting point to note is that, in the case of differential equations, it was clear that ChatGPT was relying on data from tutorials and web responses I had written, because its examples match my favorite examples (Lotka-Volterra, ROBER, Bruss, etc.) and match a lot of the coding styles I prefer (for reference, I wrote and maintain the differential equation libraries in Julia). In the Python examples I could find some oddities, such as sometimes inappropriately chosen solvers (stiff vs non-stiff) and some odd statements that you wouldn’t expect an expert to make. What I mean is, the Python code clearly had a larger set of training data, but not everyone in that training data seemed to know the ins and outs of numerical differential equations well enough to be trustworthy data. This is likely one of the major factors impacting the results.

This is one of the reasons why people tend to say you need to double-check your code when it is generated by LLMs (or that they aren’t quite ready): these errors tend to happen (often). However, what I found is that these classes of errors were dramatically more frequent with Python than with Julia, and where it tripped up is exactly where I would have expected new students to get tripped up. That’s not to say it does perfectly (or even well) on Julia, but it clearly did better, and thus after trying hard I tended to only use ChatGPT to convert Python examples to Julia and not Julia examples to Python in the end, simply because the Python generated examples required too much work to fix.

Conclusion

It’s interesting what you can get used to in life. Discomfort can become the norm if it’s what you experience every day. I had back pain that I thought was just normal because I had gotten older, until I got a new office chair and realized it went away. In the same way, programming languages add their own discomforts to your programming life. You get used to them, but it becomes very obvious where the pain points are when you have to teach somebody new, since they have not become accustomed to the pain. When I first started using Julia, I came because I needed a high-level language that generated fast code. When I started moving workshops from MATLAB and Python to Julia, I was shocked at how much easier it was to train newcomers to the language because of the syntactic simplicity. But now when using LLMs, it’s interesting to see how that syntactic simplicity and overall uniformity also reduce errors in AI-generated code in precisely the same spots. My “eye check” from the diffeqpy project has now been confirmed by a larger study showing that Julia indeed works well with LLMs.

Are LLMs ready for doing all things code generation? No. But they do surprisingly well with Julia, and it will be interesting to watch how that evolves as we get more and more Julia code as training data.


Writing type-stable Julia code

By: Philippe Mainçon

Re-posted from: https://blog.sintef.com/industry-en/writing-type-stable-julia-code/

This text presents type stability, which is one of the important concepts that one needs to understand in order to write high-performance Julia code.

This text is aimed at Julia users who are familiar with composite types, abstract types, functions, methods, and multiple dispatch. At the same time, as little advanced Julia syntax as possible is used, to keep the text accessible.

To type, or not to type

The developers of Julia wanted to solve the two-language problem. They have achieved this, producing a language that “walks like Python and runs like C”. Julia “walks like Python” because it is not necessary to systematically define the type of every variable that appears in the code. It “runs like C” because it is a compiled language that produces (or rather, can produce) highly efficient machine code.

Python and MATLAB are examples of interpreted languages. In a pure interpreted language, the types of the variables are computed at run time, at the same time as their values. As long as the values of the inputs to the code are known at the top level (in the REPL or the top script), the interpreter infers, step by step, the types of the variables, all the way down the call stack. This makes it possible to write functions without specifying types, which in turn makes it possible to write generic code (for example an iterative solver that works just as well with Float64 and Float32 variables). The disadvantage is that inferring the type of variables on the fly introduces significant overhead at run time.

At the other end of the scale, C and Fortran are examples of strictly typed compiled languages. Because the source code specifies the type of every variable in a function (both the variables in the function interface and the local variables), the compiler can create efficient machine code for each function just by considering the code of that function alone. The disadvantage is that type declarations take time to write and clutter the source code, and (unless the language offers “templates”, as C++ does) it may be necessary to write several methods, identical in all but the types of their variables, to make an algorithm available to various data types.

Julia’s approach to type specifications

Julia takes the sweet spot in between, not requiring the type of each variable to be specified, yet producing fast machine code. The trick is as follows: every time a method is called (so, at run time) with a combination of concrete types of arguments that has not yet been encountered for this method, the compiler kicks in. A “concrete type” is the information returned by typeof() when called on a variable. One example is Float64. This is as opposed to an abstract type, like Real, which is a set of concrete types, and includes Float64 and Float32. In the rest of this text, “type” will refer to “concrete type”.
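
As a small illustration of this distinction:

typeof(1.0)              # Float64 - a concrete type
Float64 <: Real          # true    - Real is an abstract type that includes Float64, Float32, Int64, ...
isconcretetype(Real)     # false
isconcretetype(Float64)  # true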

The compiler now has the source code of the method and the types of all the arguments. The compiler will produce a method instance (or instance, for short), which is machine code for this combination. One interesting implication is that writing strictly typed method interfaces in Julia does not improve machine code performance: the compiler takes the types of the arguments from the calling context anyway. A strictly typed interface has the disadvantage of offering no flexibility. A method that only accepts a Vector will not accept other vector-like things such as a SubArray (an array view), an Adjoint (a transposed array), a SparseMatrixCSC or a StaticArray, even though the method probably implements an algorithm that would compile perfectly well for all of these.

However, providing partial specification of the type of the arguments of a method serves important purposes in Julia:

  1. If a function has several methods, it lets you specify which method should be executed (multiple dispatch). This is where abstract types like Real, AbstractVector and AbstractVector{<:Real} come into their own.
  2. It improves code readability, stating for example “this method expects some vector of some real numbers – but not a string”.
  3. It provides more graceful failures: “function foo has no method that takes in a string” is more informative than some esoteric failure down the line when attempting to add two strings.
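
As a minimal sketch of these three points (foo is a hypothetical function, not from any package):

foo(v::AbstractVector{<:Real}) = sum(v)     # one method, for vectors of real numbers
foo(s::AbstractString)         = length(s)  # another method, selected by multiple dispatch

foo([1.0, 2.0, 3.0])   # 6.0 - also works for views, transposes, static arrays, ...
foo("hello")           # 5
# foo(Dict())          # MethodError - fails early, with a readable message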

What is type stability?

If the source code of the method is well written, the source code and the concrete types of all the arguments are enough information for the compiler to infer the concrete type of every variable and expression within the method. The method is then said to be “typestable”, and the Julia compiler will produce efficient code.

If, for a variety of reasons that will be studied in the following, the type of a local variable cannot be inferred from the types of the arguments, the compiler will produce machine code full of “if”s, covering all the options of what the type of each variable could be. The loss in performance is often significant, easily a factor of 10.

If you are yourself able to infer the type of every local variable and every expression in a method (or script) from the types (not the values) of the arguments, or from constants in the code, then the method will be typestable. Actually, as will be seen below, this inference of types also has access to struct declarations and to the types of the return values of the functions called by the method you are studying.

The rest of this text will examine a variety of situations, ranging from obvious to more tricky, in which it is not possible to infer the types of local variables from the types of the arguments, resulting in type instability.

For this purpose, it will be useful to write down the information available to the compiler. So for example, if the method

function add(a::Number,b::Number)
    c = a+b
    return c
end

is called with a of type Float64 and b of type Int32, then we will write the information available to the compiler to create an instance as

instance add(a::Float64,b::Int32)
    c = a+b
    return c
end

instance is not Julia syntax; it is just a notation introduced in this text to describe an instance. In such an instance description, a concrete type must be associated with every argument.


If, then

Consider the following method instance

instance largest(a::Float64,b::Int64)
    if a > b
        c = a
    else
        c = b    
    end
    return c
end

The variable c will be set to either a or b, taking both the value and the type of that variable. The type of c thus depends on the outcome of the comparison a > b, i.e. on the values of a and b: the type of c cannot be inferred from the types of the arguments alone, and this code is not typestable.

Several approaches might be relevant to prevent type instability. The simplest is to code largest so that it only accepts two arguments of the same type.

function largest(a::R,b::R) where{R<:Real}
    if a > b
        c = a
    else
        c = b    
    end
    return c
end

The method is general: it can result in the generation of method instances like instance largest(a::Float64,b::Float64), instance largest(a::Int64,b::Int64) and many others. It cannot result in the generation of machine code for instance largest(a::Float64,b::Int64) (because R cannot be both Int64 and Float64). If we need to be able to handle variables of different types, yet want type stability, a solution is to use promotion to ensure that c is always of the same type.

function largest(a,b)
    pa,pb = promote(a,b)
    if a > b
        c = pa
    else
        c = pb  
    end
    return c
end

promote is defined so that pa and pb have the same type, and this type is inferred from the types of a and b. For example, for a call instance largest(a::Float64,b::Int64), the types of pa, pb and c will be Float64, to which one can convert an Int64 variable without loss of information (well, mostly).
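
As a quick sketch of what promote does here:

pa, pb = promote(1.0, 2)   # (1.0, 2.0)
typeof(pa), typeof(pb)     # (Float64, Float64) - the common type, inferred from the argument types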

Do not allow an if-then construct to return a variable whose type depends on the branch taken.

Method return value

A method foo that calls the first, non-typestable version of the instance largest above would receive as output a variable whose type depends on the values of the arguments: foo itself would then not be typestable. The workaround here is to create typestable methods for largest, as suggested above.

One example is the method Base.findfirst(A) which, given a Vector{Bool}, returns the index of the first true element of the vector. The catch is that if all the vector’s elements are false, the method returns nothing. nothing is of type Nothing, while the index is of type Int64. Using this method will make the calling method not typestable.
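
For example:

findfirst([false, true, false])    # 2       - an Int64
findfirst([false, false, false])   # nothing - of type Nothing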

Avoid methods that return variables of value-dependent types.

Array of abstract element type

Consider the following code

v = [3.,1,"Hello world!"]
function showall(v)
    for e ∈ v
        @show e
    end
end
showall(v)

The above call showall(v) generates a method instance

instance showall(v::Array{Any,1})
    for e ∈ v
        @show e
    end
end

The concrete type of e cannot be inferred from Array{Any,1}, because Any is not a concrete type. More specifically, the type of e changes from one iteration to the next: the code is not typestable. If v is of type Array{Any,1}, it does not help even if all the elements of v happen to have the same type:

v = Vector{Any}(undef,3)
v[1] = 3.
v[2] = 1.
v[3] = 3.14
showall(v)

e may have the same type at each iteration, but this type still cannot be inferred from the type Array{Any,1} of the argument.

If we define w = randn(3), w has type Array{Float64,1}. This is much more informative: every element of w is known to have the same concrete type Float64. Hence the call showall(w) generates a method instance

instance showall(v::Array{Float64,1})
    for e ∈ v
        @show e
    end
end

and the compiler can infer that e is a Float64.

Wherever possible use arrays with a concrete element type.

Sometimes, the use of an array with an abstract element type is deliberate. One may really wish to iterate over a heterogeneous collection of elements and apply various methods of the same function to them: we design for dynamic dispatch, and must accept that the process of deciding which method to call takes time. Two techniques can be used to limit the performance penalty.

The first is the use of a “function barrier”: the loop over the heterogeneous array should contain as little code as possible, ideally only the access to the array’s elements and the call to a method.

for e ∈ v
    foo(e)
end

If v contains elements of different types, the loop is not typestable and hence slow. Yet each value of e at each iteration has its own concrete type, for which an instance of foo will be generated: foo can be made typestable and fast.
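
A minimal sketch of the function barrier, again with a hypothetical foo:

foo(x) = 2x                # generic: the compiler creates a typestable instance per concrete type
v = Any[3.0, 1, 3.14]
for e ∈ v                  # this loop is not typestable (dynamic dispatch at each iteration) ...
    foo(e)                 # ... but each call runs a fast, specialized instance of foo
end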

The second, a further improvement on the first, is to group elements by concrete type, for example using a heterogeneous array of homogeneous arrays.

vv = Any[[1.,2.,3.],[1,2]]   # explicitly a Vector{Any}, holding two homogeneous vectors
function innerloop(v)
    for e ∈ v
        foo(e)
    end
end
for v ∈ vv  # outer loop
    innerloop(v)
end

Here vv is an Array{Any,1} (the Any[…] literal keeps the element type abstract), containing two vectors of different types: vv[1] is an Array{Float64,1} and vv[2] is an Array{Int64,1}. Function innerloop is called twice and two instances are generated

instance innerloop(v::Array{Float64,1})
    for e ∈ v  # e is Float64
        foo(e)
    end
end
instance innerloop(v::Array{Int64,1})
    for e ∈ v  # e is Int64
        foo(e)
    end
end

and in both instances, the type of e is clearly defined: the instances are typestable.

The advantage of this second approach is that only the outer loop for v ∈ vv is not typestable, and it has few iterations (as long as the number of types is small compared to the number of elements of each type).

Structure of abstract field type

A similar loss of type stability arises when reading data from structures that have a field of abstract type:

struct SlowType
    a
end
struct JustAsBad
    a::Real
end
struct MuchBetter
    a::Float64
end
function show_a(s)
    @show s.a
end
show_a(SlowType(3.))
show_a(JustAsBad(3.))
show_a(MuchBetter(3.))

The first call to show_a generates

instance show_a(s::SlowType)
    @show s.a # The concrete type of field a of type SlowType cannot be inferred from the definition of SlowType
end

The second call to show_a has the same problem. The third call generates a typestable instance

instance show_a(s::MuchBetter)
    @show s.a # That's a Float64
end

It is often interesting to create structures with fields that can have various types. A classic example is Julia’s Complex type, whose real and imaginary components can be both Float64, both Float32, or other more exotic choices. This can be done without losing type stability by using parametric types:

struct FlexibleAndFast{R}
    a::R
end
show_a(FlexibleAndFast(3.))
show_a(FlexibleAndFast(3 ))

The above calls generate two typestable instances of show_a

instance show_a(s::FlexibleAndFast{Float64})
    @show s.a # That's a Float64
end
instance show_a(s::FlexibleAndFast{Int64})
    @show s.a # That's an Int64
end

Always use structs with fields of concrete types. Use parametric structs where necessary.

A note on constructors for parametric types

Consider a struct definition without inner constructor:

struct MyType{A,B}
    a::A
    b::B
end

Julia will automatically generate a constructor method with signature

MyType{A,B}(a::A,b::B)

Julia will also produce another method with signature

MyType(a::A,b::B)

because for MyType, it is possible to infer all type parameters from the types of the inputs to the constructor. Other constructors like

MyType{A}(a::A,b::B)

have to be defined explicitly (how should the compiler decide whether to interpret a single type-parameter input as A or B…).
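
Such a constructor could be written explicitly, as a sketch, like this:

# explicit outer constructor (not generated automatically): A is given, B is inferred
MyType{A}(a::A, b::B) where {A,B} = MyType{A,B}(a, b)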

Consider another example:

struct MyType{A,B,C}
    a::A
    b::B
end

Julia will automatically generate a constructor method with signature

MyType{A,B,C}(a::A,b::B)

but will not generate other methods. A method like

MyType{C}(a::A,b::B)

would have to be defined explicitly.

StaticArray

Julia Arrays are an example of a parametric type, where the parameters are the element type and the dimension (the number of indices). Importantly, the size of the array is not part of the type; it is part of the value of the array.

The package StaticArrays.jl provides the type StaticArray, useful for avoiding another performance problem: the garbage collection that follows the allocation of Arrays on the heap. This is because StaticArrays are allocated on the stack, simplifying runtime memory management.

using StaticArrays
SA = SVector{3,Float64}([1.,2.,3.])
SA = SVector(1.,2.,3.)
SA = SVector([1.,2.,3.])

The first call to SVector is typestable: all the information needed to infer the type of SA is provided in the curly braces. The second call is typestable too, because the compiler can deduce the same information from the type and number of inputs. The third call is problematic: while the type of the elements of SA can be inferred by the compiler, the length of [1.,2.,3.] is part of this array’s value, not of its type. The type of SA thus has a parameter that depends on the value (the size) of the argument passed to the constructor. Not only does this generate an instance of the constructor that is not typestable, but the non-inferable type of SA “contaminates” the calling code with type instability.

Val

What if we want to write a function that takes a Vector as input, processes it (for example, just keeps it as it is), and returns an SVector of the same shape? Of course we want this function to be general (not limited to a given array size), and we want it to be typestable, for good performance.

First attempt:

function static(v::Vector)
    return SVector{length(v),eltype(v)}(v)
end

This function is not typestable. It constructs a variable of type SVector{3,Float64}, where 3 is obtained as the length of v, and the length is part of the value of an Array, not of its type. Value-to-type alarm!

One possible solution is to use Val. Let us say that static is called by a function foo within which the length of v can be inferred at compile time. We could create the following code

function static(v,::Val{L}) where{L}
    return SVector{L,Float64}(v)
end
function foo()
    Val3 = Val(3)
    Val4 = Val(4)
    @show static([1.,2.,3.]   ,Val3)
    @show static([1.,2.,3.,4.],Val4)
end

The call Val(3) generates a variable of type Val{3}. Clearly, Val as a function is not typestable, since it creates a variable whose type depends on the value of its argument.
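
As a sketch:

x = Val(3)
typeof(x)   # Val{3} - the value 3 has moved into the type; the variable itself carries no data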

However, function foo is typestable. This may come as a surprise, but two things conspire to allow this:

  1. The source code of foo explicitly mentions the constants 3 and 4, and the compiler has access to it.
  2. The compiler is greedy – it evaluates at compile time whenever possible. Hence the call Val(3) is evaluated during compilation, and Val3 is known to the compiler to be a value-empty variable of type Val{3}.

In foo, the method static is called twice, leading to the generation of two typestable instances

instance static(v,::Val{3})  
    return SVector{3,Float64}(v)
end
instance static(v,::Val{4})
    return SVector{4,Float64}(v)
end

What if the length of the vectors is not defined as a constant in foo? If this length is the result of some computation, the call to Val will not be typestable. If foo is high enough in the call hierarchy, and outside any time-critical loop, this is not an issue: only foo will not be typestable, and the functions that it calls can still be typestable (cf. the function barrier pattern).

Val makes it possible to move type instability up the call hierarchy, or to eliminate it altogether.

Anonymous function

Before Julia 1.6.0, the output type of an anonymous function was known to the compiler as Any. As a consequence, the following code was not typestable:

function foo(f)
    @show f(randn())
    return
end
foo(x->0)

However, the following code was:

function f(x)
    return 0
end
foo(f)

This is no longer an issue as of Julia 1.6, and anonymous functions can be used.

@code_warntype

One important tool for checking whether an instance is typestable is the macro @code_warntype. For example

v = randn(3)
@code_warntype Val(length(v))
@code_warntype static(v,Val(length(v)))

The first invocation of @code_warntype outputs semi-compiled code and highlights some of the types in red: the call Val(3) is not typestable. The second invocation produces an output in which all types are highlighted in blue: the call to static is typestable. Note that @code_warntype only analyses the compilation of the outermost function static, given the arguments v and Val(length(v)).

ProfileView.jl

ProfileView.jl provides a graphical representation (a flame graph) of where processor time goes, in which code that is not typestable is highlighted. Output from the profiler often shows how type instability propagates: a single variable that is not typestable makes “anything it touches” type-unstable.

Welcome!

By: Sören Dobberschütz

Re-posted from: https://tensorflowjulia.blogspot.com/2018/08/welcome.html

Welcome! With this blog, we are going to explore the TensorFlow.jl package for the Julia programming language.

I am new to both TensorFlow and Julia, so please don’t expect any in-depth programming tricks or advanced API explanations. I will use Google’s newly released Machine Learning Crash Course and implement most of the exercises in a Jupyter notebook.

Why Julia?


Julia is a rather new programming language, initially released in 2012. The things that attracted me personally are

  • Most of my professional life, I have developed code in Matlab. Julia’s syntax is very close to that of Matlab, which makes the transition very easy. 
  • Julia is open-source. So no hassles with obtaining an expensive license.
  • Julia is fast (compared to Matlab or Python). Its just-in-time compiler compiles the code to native machine code before execution. One of the first bits of code I wrote with Julia contained a part that enlarged and appended to a big array over thousands of iterations. Of course this is very bad programming practice – but it still performed surprisingly well.

Why TensorFlow?


  • TensorFlow is open-source and used by Google and many other big companies for production machine learning tasks.

Disclaimer

I am not affiliated with Google (who runs the Machine Learning Crash Course), the Julia project, or the creators of TensorFlow.jl. This is a pure just-for-fun hobby project.