
Revisiting emulated OOP behaviour and multiple dispatch in Julia

By: Terence Copestake

Re-posted from: https://thenewphalls.wordpress.com/2014/06/02/revisiting-emulated-oop-behaviour-and-multiple-dispatch-in-julia/

In an earlier post, I explored one approach to bundling functionality with the data on which it operates, emulating object methods in OOP languages such as C# and PHP. A comment posted by Matthew Browne questioned whether this approach was compatible with Julia’s multiple dispatch.

I had thought about this when writing the original article, but assumed it wouldn’t be possible because of the way anonymous functions are assigned to variables: assigning one definition would overwrite the previous one. However, Matthew’s question prompted me to reconsider and, after some brief experimentation and a few small alterations, I found that there is indeed a way to maintain compatibility with multiple dispatch.

Below is an updated example type definition:

type MDTest
    method::Function

    function MDTest()
        this = new()

        function TestFunction(input::String)
            println(input)
        end

        function TestFunction(input::Int64)
            println(input * 10)
        end

        this.method = TestFunction

        return this
    end
end

The theory is basically the same, with the constructor assigning the methods to their respective fields within the type. The difference is in how the functions are defined and assigned.

On lines 7 and 11, methods are defined with different argument types. These methods could be defined outside of the type definition without error, but defining them within the constructor has the advantage of not polluting the global scope.

On line 15, the function is assigned to its field by name rather than as an anonymous function. The name refers to the generic function that holds both methods, so either one can be called through the field.

With this, the example code below:

test = MDTest()

test.method("String")

test.method(5)

Produces the output:

String
50

Another advantage of this approach is the absence of anonymous functions which, according to benchmarks and GitHub issues at the time of writing, perform significantly worse than named functions.
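For readers on a current Julia release, the same pattern might look like the sketch below. It assumes Julia 1.x, where `type` is spelled `mutable struct` and `String` is a concrete type (hence `AbstractString` in the signature). The lowercase name `test_function` is illustrative, and the methods return their results rather than printing them so that they are easy to check.

```julia
mutable struct MDTest
    method::Function

    function MDTest()
        this = new()

        # Two methods of one local generic function, selected by argument type.
        function test_function(input::AbstractString)
            return input
        end

        function test_function(input::Int)
            return input * 10
        end

        # Assigning the function by name keeps both methods reachable.
        this.method = test_function

        return this
    end
end

test = MDTest()
string_result = test.method("String")   # dispatches to the AbstractString method
int_result = test.method(5)             # dispatches to the Int method
```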

Julia variable gotchas

By: Terence Copestake

Re-posted from: http://thenewphalls.wordpress.com/2014/04/07/julia-variable-gotchas/

As is typical for many languages, assigning one variable to another in Julia does not create a copy of the variable data, but rather a reference to the existing data. However, I learned the hard way whilst working on the CGI module* that Julia does not currently support a copy-on-write mechanism for collections.

Take the example code below:

n = [ 1, 2, 3 ]

m = n

As expected, m becomes a reference to the collection referenced by n. Coming from a language whose arrays are copied on write, one might expect a copy of the data to be made as soon as either n or m is modified, for example:

n = [ 1, 2, 3 ]

m = n

push!(n, 4)

# Expect n = [ 1, 2, 3, 4 ] and m = [ 1, 2, 3 ]

This is not the case for Julia. When the array pointed to by n is modified, m maintains its reference to that same array, giving both a value of [ 1, 2, 3, 4 ].
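The aliasing can be checked directly. A minimal sketch, assuming a recent Julia version, where `===` tests whether two names refer to the very same object:

```julia
n = [1, 2, 3]
m = n                    # m is another name for the same array, not a copy

push!(n, 4)              # mutate the array through n

same_object = m === n    # both names still refer to one array
m_length = length(m)     # the push is visible through m as well
```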

Problems in the wild

I encountered this quirk when working with binary data and UTF-8 strings.

n = Uint8[ 0x32, 0x33, 0x34, 0x61 ]

m = utf8(n)

empty!(n)

Having created a string using the utf8 function, I wanted to empty the original byte array to free those resources. After a few minutes of trying to figure out how a bounds error had crept into my app, I narrowed it down to this emptying of the byte array.

Digging deeper into the Julia source, I found that the utf8 function is just an alias for a conversion function.

utf8(x) = convert(UTF8String, x)
...
convert(::Type{UTF8String}, a::Array{Uint8,1}) = is_valid_utf8(a) ? UTF8String(a) : ...

You can see here that passing an array of Uint8 bytes to utf8() creates an instance of UTF8String with the Uint8 array as its data. The type definition for UTF8String is:

immutable UTF8String <: String
    data::Array{Uint8,1}
end

As covered above, the UTF8String’s data field is only a reference to the collection passed to the utf8 function. If that collection is modified at any point during the program’s runtime, the returned string changes with it.
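The effect is not specific to strings. The sketch below uses a hypothetical `ByteWrapper` type as a stand-in for the `UTF8String` definition above, written in Julia 1.x syntax (`struct`, `UInt8`). It only illustrates the aliasing and is not the library's code.

```julia
# Hypothetical stand-in for an immutable type that stores a byte array.
struct ByteWrapper
    data::Vector{UInt8}
end

bytes = UInt8[0x32, 0x33, 0x34, 0x61]

aliased = ByteWrapper(bytes)           # stores a reference to `bytes`
isolated = ByteWrapper(copy(bytes))    # stores its own copy of the bytes

empty!(bytes)                          # empty the original array

aliased_length = length(aliased.data)      # the wrapper sees the emptied array
isolated_length = length(isolated.data)    # the copy is unaffected
```

The type being immutable does not help here: immutability prevents the field from being rebound, not the array it points to from being changed.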

In closing

For now, the solution seems to be to call the copy or deepcopy functions explicitly wherever the program logic requires an independent copy of the data.
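The two functions differ in how far they copy. A minimal sketch in current Julia, using a nested array chosen purely for illustration:

```julia
nested = [[1, 2], [3, 4]]

shallow = copy(nested)       # new outer array, same inner arrays
deep = deepcopy(nested)      # new outer array and new inner arrays

push!(nested[1], 99)         # mutate an inner array
push!(nested, [5, 6])        # mutate the outer array
```

Afterwards `shallow` still has two elements, but its first element shows the 99 because the inner arrays are shared, whereas `deep` is unchanged.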

The issue is explored in this Google Groups thread. If I’ve understood correctly, the gist of it is that Julia makes this sacrifice for the sake of performance. As this is a feature wanted by many, there’s a possibility of it being implemented in a later version of the language.

* Write-up to follow at a later date

Capturing output in Julia

By: Terence Copestake

Re-posted from: http://thenewphalls.wordpress.com/2014/03/21/capturing-output-in-julia/

In a previous blog post I pondered whether it may be possible to redirect STDERR to an IOBuffer so that the output can be handled in a controlled way e.g. written to a log file. It turns out that it’s not quite that simple, but capturing output can still be achieved easily with a few more lines of code.

The noteworthy functions here are the redirect_std* family. Each redirects its respective handle to a new pipe and returns a read handle and a write handle for that pipe.

Capturing output

Below is an example of capturing output to STDOUT.

(outRead, outWrite) = redirect_stdout()

print("Test")
print("ing")

close(outWrite)

data = readavailable(outRead)

close(outRead)

At the end of execution, the variable “data” will contain the string ‘Testing’.

(As seen on line 6, it’s advisable to close the write handle before trying to read from the pipe, as this will ensure that any buffered writes are flushed and available for reading.)

If you need to write to the original output stream after redirecting and capturing data, you’ll first need to create a copy of the original handle and restore it later. For example:

originalSTDOUT = STDOUT

(outRead, outWrite) = redirect_stdout()

print("Test")
print("ing")

close(outWrite)

data = readavailable(outRead)

close(outRead)

redirect_stdout(originalSTDOUT)

print(data)

Line 1 is where the original handle is copied; line 14 is where it’s restored. The print on line 16 will therefore write to the original STDOUT (e.g. console window, browser, etc) instead of the outWrite pipe.
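The save, redirect, read and restore steps can also be wrapped in a helper. The sketch below assumes Julia 1.x, where the handle is the lowercase `stdout` and `read(pipe, String)` reads until the pipe is closed. The name `capture_stdout` is hypothetical and not part of Base.

```julia
# Hypothetical helper: run f() and return whatever it printed to stdout.
function capture_stdout(f::Function)
    original = stdout                         # keep the original handle
    out_read, out_write = redirect_stdout()   # stdout now writes to a pipe
    try
        f()
    finally
        redirect_stdout(original)             # restore even if f() throws
        close(out_write)                      # lets the read below reach end-of-file
    end
    data = read(out_read, String)
    close(out_read)
    return data
end

captured = capture_stdout() do
    print("Test")
    print("ing")
end
```

Because the pipe is only read after `f` returns, output larger than the pipe's buffer could block, so this sketch suits short output only.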

Capturing errors

This technique is particularly useful when applied to STDERR, as it can be used to write errors to a log file. The process is the same, but uses the STDERR equivalents of the functions. Below is a slightly different example:

(errorRead, errorWrite) = redirect_stderr()

atexit(function ()
    close(errorWrite)

    errors = readavailable(errorRead)

    close(errorRead)

    logfile = open("errors.log", "a")
    write(logfile, errors)
    close(logfile)
end)

atexit registers a function to be called when program execution ends for whatever reason (fatal error, user called the quit function, etc). The code on lines 4 to 8 is similar to the STDOUT example, and code has been added on lines 10 to 12 to log the captured error(s) to a file.
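A rough equivalent for Julia 1.x is sketched below. The helper name `drain_to_log` and the temporary log path are assumptions made for illustration, and the helper is called directly here so that its effect can be observed straight away.

```julia
# Hypothetical helper: restore stderr, then append everything captured to a log file.
function drain_to_log(err_read, err_write, original, path)
    redirect_stderr(original)          # put the original handle back
    close(err_write)                   # lets the read below reach end-of-file
    errors = read(err_read, String)
    close(err_read)
    open(path, "a") do logfile         # the file is closed when the block ends
        write(logfile, errors)
    end
    return errors
end

log_path = joinpath(mktempdir(), "errors.log")   # illustrative location

original_stderr = stderr
err_read, err_write = redirect_stderr()

println(stderr, "something went wrong")          # goes into the pipe

logged = drain_to_log(err_read, err_write, original_stderr, log_path)
```

In a real program the final call would instead be registered once, along the lines of `atexit(() -> drain_to_log(err_read, err_write, original_stderr, log_path))`, so that it runs when the process ends.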