Finding Correlations in Data with Julia and NAG

The Numerical Algorithms Group (NAG) and the NAG Library Introduction

The Numerical Algorithms Group (NAG) provides expertise in numerical engineering, by delivering high-quality computational software,
consulting services and high performance computing services. For over four decades NAG have collaborated with world-leading researchers
in academia and industry to create powerful, reliable and flexible software which is relied on by tens of thousands of individual users,
as well as numerous independent software vendors. As a not-for-profit company,
NAG reinvests surpluses into the research and development of its products, services, staff and its collaborations.

The NAG Library is the world’s largest collection of robust, documented, tested and maintained numerical and statistical algorithms.
It is callable from many different programming languages and environments such as .NET, C++, Java®, MATLAB®, Python® and Visual Basic®.
Some users of the Julia language already call the NAG Library into their applications.
This blog demonstrates that the NAG Library is also callable directly from Julia, using one of the routines from the NAG Library as an example.

This blog post is also available on the NAG website.

Introduction to Correlation Matrices

Correlations provide helpful information when working with possibly mutually related data sources, such as the value of stocks, options, precious metals or others. In the case of three or more data sources this produces a matrix known as a “correlation matrix” which has a special structure, shown in Julia code below:

# The input Y holds m observations of n random variables
C = cor(Y)
# Output C contains pairwise correlations of Y's columns

Correlation (or, to give its full name, “Pearson’s product-moment correlation coefficient”),
is a measure of the linear relationship between two random variables. One of the features
that makes it particularly useful is that it is invariant to location-scale
transformations of the variables, giving values on the interval [-1,1]: 1 being perfect
positive linear relationship, 0 being no linear relationship (though, it should be
emphasised, not necessarily independent), and -1 being a perfect negative relationship.

A correlation matrix is then the set of pairwise correlations between multiple random
variables, where the (i,j)th element is the correlation between the ith and the
jth variable. Mathematically, a correlation matrix must satsify the following
properties:

  • it must be symmetric,

  • the diagonal elements must be 1, and

  • it must be positive semidefinite, which is equivalent to all eigenvalues being non-negative.

Estimating Correlation Matrices

Of course, we never get to observe the correlation matrix directly, instead it must be
estimated from data. Given a matrix Y, where each row is a joint observation of the
variables, the empirical correlation matrix can be computed using the
cor function in Julia.

The resulting matrix should satisfy the three properties above, though some small
deviations may arise due to numerical error in extreme cases:

julia> m = 100; ϵ = 1e-2;

julia> Y = hcat(
    linspace(-1.0,1.0,m) + rand(m)*ϵ,
    linspace(-1.0,1.0,m),
    rand(m),
    linspace(1.0,-1.0,m));   # intentionally ill-conditioned matrix

julia> C = cor(Y)
4x4 Array{Float64,2}:
     1.0        0.999989  -0.22264   -0.999989
     0.999989   1.0       -0.222371  -1.0     
    -0.22264   -0.222371   1.0        0.222371
    -0.999989  -1.0        0.222371   1.0     

julia> issym(C)            # is the matrix symmetric?
true

julia> diag(C) .== 1       # are the diagonal elements 1?
4-element BitArray{1}:
    true
    true
    true
    true

julia> eigvals(C)          # Matrix is numerically positive semidefinite
4-element Array{Float64,1}:
    -4.6987e-16
     1.51587e-5
     0.928334  
     3.07165 

The structure of correlations produces a matrix that should be symmetric, have unit diagonal and be
positive semidefinite. In
applications however it can happen that approximate correlations lead
to accordingly approximate correlation matrices which are symmetric and
have unit diagonal, but also have nontrivial negative eigenvalues, meaning
it is not positive semidefinite.

Larger deviations from positive definiteness may arise when the correlations are computed
from incomplete data: for example, two stock symbols may not have price data on the exact
same days.

The available approaches to handling the situation of missing data can be broken down into two broad categories;
model the missing values and use that model to generate a full set of data,
hence guaranteeing a semi-definitive correlation matrix; or calculate an approximate correlation matrix from the available data and adjust it to ensure that it is definite.

The first of these approaches is beyond the scope of this article and we only deal with the second:
calculating an approximate correlation matrix from existing data, adjusting if required.
This approach has the advantage of being applicable when an approximate correlation matrix is indefinite for a reason other than missing data,
for example, due to rounding during its calculation.

In the example given below, based on real financial data, it was not possible to calculate a full correlation matrix due to missing data.
Pairwise correlations between the stocks were calculated using as much data as possible, which yielded a matrix with a negative eigenvalue.

julia> using DataFrames

julia> data = "../data";

julia> alum = readtable("$data/GOOG-LON_ALUM.csv");   alum = alum[[:Date, :Close]];   # Aluminum

julia> gold = readtable("$data/LBMA-GOLD.csv");       gold = gold[[:Date, :USD_PM_]]; # Gold

julia> aapl = readtable("$data/WIKI-AAPL.csv");       aapl = aapl[[:Date, :Close]];   # Apple stock

julia> efx  = readtable("$data/WIKI-EFX.csv");        efx  = efx[[:Date, :Close]];    # Equifax

julia> oil  = readtable("$data/ODA-POILWTI_USD.csv"); oil  = oil[[:Date, :Value]];    # Oil

julia> dfs = [alum, gold, aapl, efx, oil];

julia> function cor_join(df1,df2) # compute correlation by merging on Date and dropping NAs           
            df = join(df1, df2, on=:Date)
            mask = !(isna(df[2]) | isna(df[3]))
            cor(df[mask,2], df[mask,3])
        end
cor_join (generic function with 1 method)

julia> C = Float64[cor_join(df1,df2) for df1 in dfs, df2 in dfs] # correlation matrix
5x5 Array{Float64,2}:
     1.0       -0.345452  -0.136751  -0.67236   0.347417
    -0.345452   1.0        0.865021   0.592018  0.845236
    -0.136751   0.865021   1.0        0.358125  0.775634
    -0.67236    0.592018   0.358125   1.0       0.427315
     0.347417   0.845236   0.775634   0.427315  1.0     

julia> issym(C)          # is symmetric?
true

julia> norm(diag(C)-1)
4.002966042486721e-16    # diagonals are approximately 1

julia> eigvals(C)        # 1 negative eigenvalue
5-element Array{Float64,1}:
    -0.130276 
     0.0985574
     0.442257 
     1.54673  
     3.04273  

It is useful therefore to correlate incomplete data and then clean the resulting
correlation matrix by finding the “nearest” correlation matrix, and then use the nearest correlation
matrix as a proxy for the original correlation matrix in workflows which require those properties.

Nearest Correlation Matrix

To fix the problem of indefiniteness in an approximation correlation matrix, one
can define an optimization problem which fixes it appropriately without reducing
the original value of the true correlations. One could for example simply find the closest matrix in the Frobenius norm to an input
matrix, but subject to the constraints that the output is a correlation matrix.
In practice this may not be an acceptable solution, so there are alternative
formulations which enforce important properties
of the approximate correlation matrix input. NAG has implemented many of these different
approaches and through a partnership with Julia Computing a Julia
interface has been developed to access these nearest correlation routines in the NAG Library. The next
section will first detail the routines and following this there is
an explanation of how they can be accessed through Julia.

NAG Routines

The first and simplest routine in the NAG Library for nearest correlation matrices
solves an optimization problem and accepts an arbitrary matrix as input, this is
g02aa, which solves the following optimization problem

where X is a valid correlation matrix. An algorithm devised in 1 uses alternating projection to solve this optimization problem,
which has the virtue of being simple to implement, but tends to converge slowly. Newton’s algorithm
can have rapid convergence and was investigated in 2 along with suitable preconditioners
for the linear systems it yields. This algorithm has been implemented by NAG

Some common situations may make this simple approach less
appealing however. For example suppose only a few data pairs have problematic missing data,
then only a small sub-block of the approximate correlation matrix needs to be fixed and
a likely-small change added to the rest of the matrix to preserve the properties it had
to begin with. Thus targeting of a sub-block can be formulated as follows:

this was solved by Higham et al. in 3 and is implemented by NAG as routine g02an. A similar routine “g02ap” which will be released with the NAG Library Mark 26 can keep arbitrary entries of a matrix fixed, rather than a full sub-block, which affords the user more flexibility.

If fixing a sub-block or fixed entries of the input matrix does not model well the additional information
a user may have, then one can also weight fixed rows and columns in the optimization, formulated as follows:

where W is an input diagonal matrix of weights.
The NAG routine g02ab incorporates this weighted formulation.

Alternatively if weighting whole rows and columns is not adequate one can optionally weight
only specified entries with the routine g02aj,
(note this is a very expensive formulation).

Finally one may know that the output should have a k-factor structure, which means
that the off-diagonal entries of the output can be low-rank approximated.
The NAG Library also incorporates this additional information through the
reformulation

This optimization problem is solved in the NAG routine g02ae and produces a nearest correlation matrix which respects k-factor structure specified by one.

In summary, the NAG Library contains the following nearest correlation routines:

  • g02aa Basic problem using Newton’s method.
  • g02ab Incorporates row/column weights.
  • g02aj Incorporates entrywise weights.
  • g02an Incorporates fixed submatrix.
  • g02ap Incorporates fixed entries.
  • g02ae Incorporates k-factor structure specified by user.

As a final remark it is worth noting that g02ab, g02aj, and g02ap can also optionally include a bound on the lowest eigenvalue. This might be desirable for example
to clamp the smallest eigenvalue to zero rather than allowing a possible roundoff to be negative. Bounding the smallest eigenvalue can also help in enforcing a positive definite
output, rather than only positive semidefinite.

What follows will detail a NAG partnership with Julia Computing which makes these routines much easier to utilize from a Julia environment.

Julia Interface

Julia’s ccall function provides an easy to use, “no boilerplate,” interface for calling out to shared C and Fortran libraries, such as the NAG C Library and NAG Fortran Library. The ccall function allows for calling into compiled shared libraries directly without the need to write any “glue” code, perform low-level code generation to create function interfaces or even recompile a binary. Full documentation on the use of ccall can be found in the “Calling C and Fortran Code” section of the Julia Language documentation.

For this project, we developed a set of Julia wrappers to the above set of nearest correlation matrix routines. Development of these wrappers consisted of the following components:

  • a Julia NAGdemo module containing all of the Julia code for the interface. The module includes all Julia type definitions, const values, function definitions and export definitions for the Julia/NAG interface.
  • a constant string (LIBNAGC_NAG) pointing to the location of the NAG shared library distributed with a NAG C Library installation.
  • a function nag_licence_query/nag_license_query that tests execution of the simple NAG routine a00acc as a method of performing proper license checkout for the NAG Library.
  • translation of entries in the nag_types.h header file included with the NAG C Library into corresponding Julia objects.
  • definition of proper error handling routines that can capture, store and print any errors in Julia that occur within the NAG C Library routines.
  • definition of Julia wrapper functions to NAG’s nearest correlation matrix routines.

Below we will discuss each of these items in turn.

Defining a NAGdemo module

In this post we define a simple NAGdemo module, included in full at the bottom of the page. NAGdemo is a Julia module much like any other Julia module, providing a standard mechanism for the organization of Julia code, as well as namespace resolution for all functions defined as part of the module when called from external Julia code. The list of exported functions in the module enumerate those functions that do not need to be called with full namespace resolution when the NAGdemo module is loaded into a Julia session via a using NAGdemo statement.

Defining a constant for the location of the NAG shared library

The constant string LIBNAGC_NAG contains the location of the NAG shared library which is a required argument for all uses of ccall that will call into the NAG C Library from Julia.

const LIBNAGC_NAG = "/opt/NAG/cll6i25dcl/lib/libnagc_nag.so.25"

Testing access to a NAG licence

The nag_licence_query \ nag_license_query function provides our first, and simplest, example of calling into the NAG C Library from Julia using ccall.

nag_licence_query() = ccall((:a00acc, LIBNAGC_NAG), Cint, ()) == 1
const nag_license_query = nag_licence_query
nag_license_query() || warn("Cannot acquire a NAG licence.")

This particular use of ccall takes 3 arguments:

  • a tuple (:a00acc, LIBNAGC_NAG) containing a symbol for the name of the exported function to be accessed, :a00acc, and the const string, LIBNAGC_NAG, pointing to the location of the NAG shared library. The NAG routine :a00acc outputs information about the version of the NAG C Library in use.
  • the output type, Cint, of the NAG C routine being called. In the case of :a00acc, the output type is a C integer. The routine will return 0 if a licence is acquired and the function executes successfully, or 1 if the routine was unsuccessful.
  • a tuple of the types for all input arguments, In this particular case, :a00acc does not take any input arguments, so this tuple argument is empty.

Translating the NAG header file

The header file nag_types.h is a large file containing many C enums, typedefs and pre-processor macros. As the contents of this header file are widely used throughout the NAG C Library when referring to the input and output types of most functions, the definitions included in this header, or at least the atomic types and values that are represented by these definitions, need to be present in Julia when wrapping routines via ccall.

While there are packages within the Julia ecosystem, such as Clang.jl, that can assist in automation of wrapping C libraries in certain situations, a manual translation of the necessary portions of nag_types.h file into corresponding Julia code is not difficult, and is sufficient for this exercise. Each of the enum entries within nag_types.h are fundamentally integer constant values and this Nearest Correlation Matrix demonstration only requires one of those enum values:

@static if is_windows()
    const NagInt = Int32
else
    const NagInt = Int
end
const Nag_ColMajor = NagInt(102)

NagInt is an alias for an appropriately sized integer value on Windows or Unix platforms, as needed for use of the NAG C Library in each operating system environment for appropriate 32-bit or 64-bit flavours as necessary.

Handling errors from NAG routines

The following block of Julia code defines a NagError type that inherits from Julia’s abstract Exception type, as well as various functions for handling and storing any errors returned by any NAG routine as Julia exceptions:

# Error handling
type NagError <: Exception
    code::Int
    name::ASCIIString
    message::UTF8String
end

Base.showerror(io::IO, e::NagError) =
    print(io, "NAG function \"$(e.name)\" [$(e.code)] – $(e.message)")

const NAG_ERROR = zeros(UInt8,544)
const nag_errors = Array(NagError)
nag_errors[] = NagError(0, "", "NO_ERROR")

cstr_to_array(p::Ptr{UInt8}, own::Bool = false) =
    pointer_to_array(p, Int(ccall(:strlen, Csize_t, (Ptr{UInt8},), p)), own)

function error_handler(msg::Ptr{UInt8}, code::Cint, name::Ptr{UInt8})
    code == 0 && return
    msg = UTF8String(copy(cstr_to_array(msg)))
    name = ASCIIString(copy(cstr_to_array(name)))
    nag_errors[] = NagError(Int(code), name, msg)
    throw(nag_errors[])
end
const ptr_error_handler = cfunction(error_handler, Void, (Ptr{UInt8}, Cint,
                                    Ptr{UInt8}))

function reset_nag_error()
    fill!(NAG_ERROR, 0)
    unsafe_store!(
        convert(Ptr{Ptr{Void}}, pointer(NAG_ERROR)) + 2*sizeof(Cint) + 512,
        ptr_error_handler
    )
end
reset_nag_error()

last_nag_error() = nag_errors[]

The call to reset_nag_error() on load of the NAGdemo module sets our custom error_handler routine to collect and handle any errors thrown by NAG routines, as well as throw those errors as Julia exceptions. The last_nag_error() function resets the stored list of NagError values.

Creating Julia wrappers

For each of the NAG routines for which a Julia function API is to be created we start with inspection of the appropriate NAG function signature to determine the datatypes of all arguments.

With a ccall invocation defined we can subsequently make a determination of which arguments to a NAG routine can be considered as purely input arguments, which arguments could be considered as purely outputs and which arguments could be considered both input and output arguments. This classification can help in the design of a user-facing Julia API.

Pure input arguments are those arguments which are passed to the NAG routine but are not modified in any way. Pure output arguments are those arguments for which memory is allocated prior to calling the NAG routine, but for which the initial values in that memory location have no importance the memory is only needed for returning an output value. Arguments that are both inputs and outputs are those arguments for which the initial value is both used by the NAG routine and modified before being returned.

Understanding the context of each argument in the NAG C API also helps to determine which of the C arguments might be able to have default values assigned via Julia keyword arguments, which of the C arguments might be able to infer their values from other input arguments at the Julia level and which of the C arguments might not need to be present at all within a user-facing Julia API.

For each of the nearest correlation routines considered in this exercise, the routine returns an output of type Void.

For the g02aac routine, its C API is the following:

// g02aac / nag_nearest_correlation
g02aac(Nag_OrderType order, double g[], Integer pdg, Integer n,
    double errtol, Integer maxits, Integer maxit, double x[], Integer pdx,
    Integer *iter, Integer *feval, double *nrmgrd, NagError *fail);

Constructing a ccall command for this API signature requires mapping the type of each input argument from C type to the corresponding Julia data type. The following table provides that mapping:

argument C data type Julia data type
order Nag_OrderType NagInt
g double[] Ptr{Float64}
pdg Integer NagInt
n Integer NagInt
errtol double Float64
maxits Integer NagInt
maxit Integer NagInt
x double[] Ptr{Float64}
pdx Integer NagInt
iter Integer* Ptr{NagInt}
feval Integer* Ptr{NagInt}
nrmgrd double* Ptr{Float64}
fail NagError* Ptr{Void}

With each of these variables defined in the current Julia scope a pure ccall into NAG’s g02aac routine would look like the following block of code:

order = Nag_ColMajor    # Nag_ColMajor is defined in nag_types.jl
n = 4
pdg = 4
g = [ 2.0 -1.0  0.0  0.0;
     -1.0  2.0 -1.0  0.0;
      0.0 -1.0  2.0 -1.0;
      0.0  0.0 -1.0  2.0]
pdx = 4
x = zeros(4,4)
errtol = 0.0
maxits = 0
maxit = 0
iter = Ref{NagInt}(0)
feval = Ref{NagInt}(0)
nrmgrd = Ref{Float64}(0.0)

ccall((:g02aac, LIBNAGC_NAG), Void,
        (NagInt, Ptr{Float64}, NagInt, NagInt, Float64,
        NagInt, NagInt, Ptr{Float64}, NagInt, Ptr{NagInt},
        Ptr{NagInt}, Ptr{Float64}, Ptr{Void}),
        order, g, pdg, n, errtol,
        maxits, maxit, x, pdx, iter,
        feval, nrmgrd, NAG_ERROR)

As stated previously, for the design of our initial Julia API, we need to make a few decisions about which of the arguments to ccall should be Julia inputs and which values should be returned as outputs.

In this proof of concept project, we made the following decisions for a first level Julia wrapper to g02aac named nag_nearest_correlation!:

  • the scalar arguments being passed to ccall by value (e.g. order, pdg, n, errtol, maxits, maxit, pdx) will be arguments to nag_nearest_correlation!
  • all array arguments (e.g. g and x) will be arguments to nag_nearest_correlation!. Argument g is actually a pure input argument, as its value is not modified, while argument x could be considered a pure output argument as its memory is only used to store the output correlation matrix. We will return to this point below.
  • the scalar arguments being passed by reference into the ccall (e.g. iter, feval, nrmgrd) are being used to track and report internal behaviour of the NAG routine. These pointers to scalar values will be created within the Julia routine and their values returned as Julia integer or floating point values as needed.

The initial implementation of the nag_nearest_correlation! Julia function subsequently took the following form:

function nag_nearest_correlation!(order::NagInt, g::Matrix{Float64},
                                    pdg::NagInt, n::NagInt, errtol::Float64,
                                    maxits::NagInt, maxit::NagInt,
                                    x::Matrix{Float64}, pdx::NagInt)

    iter = Ref{NagInt}(0)
    feval = Ref{NagInt}(0)
    nrmgrd = Ref{Float64}(0.0)

    ccall((:g02aac, LIBNAGC_NAG), Void,
            (NagInt, Ptr{Float64}, NagInt, NagInt, Float64,
            NagInt, NagInt, Ptr{Float64}, NagInt, Ptr{NagInt},
            Ptr{NagInt}, Ptr{Float64}, Ptr{Void}),
            order, g, pdg, n, errtol,
            maxits, maxit, x, pdx, iter,
            feval, nrmgrd, NAG_ERROR)

    return iter[], feval[], nrmgrd[]
end

Using the same data defined above, we can call this routine as follows:

order = Nag_ColMajor
n = 4
pdg = 4
g = [ 2.0 -1.0  0.0  0.0;
     -1.0  2.0 -1.0  0.0;
      0.0 -1.0  2.0 -1.0;
      0.0  0.0 -1.0  2.0]
pdx = 4
x = Array(Float64, 4, 4)
errtol = 0.0
maxits = 0
maxit = 0
nag_nearest_correlation!(order, g, pdg, n, errtol, maxits, maxit, x, pdx)

Now if we decide that we would like to provide an interface that does not require one to manage the memory allocation for x (and also only provides the nearest correlation matrix x as output), we can define the following method nag_nearest_correlation that allocates the memory for x internally and then calls nag_nearest_correlation!.

function nag_nearest_correlation(order::NagInt, g::Matrix{Float64}, pdg::NagInt, n::NagInt, errtol::Float64, maxits::NagInt, maxit::NagInt, pdx::NagInt)
    x = Array(Float64,n,pdg)
    nag_nearest_correlation!(order, g, pdg, n, errtol, maxits, maxit, x, pdx)
    return x
end

But the values we have assigned for errtol, maxits and maxit are essentially default values for these arguments. Consequently, we can assign these values in keyword arguments in our nag_nearest_correlation function signature:

function nag_nearest_correlation(g::Matrix{Float64}, pdg::NagInt, n::NagInt, pdx::NagInt, order = Nag_ColMajor, errtol = 0.0, maxits = NagInt(0), maxit = NagInt(0))
    x = Array(Float64,n,pdg)
    nag_nearest_correlation!(order, g, pdg, n, errtol, maxits, maxit, x, pdx)
    return x
end

Furthermore, the values for arguments pdg, n and pdx can be deduced from the size of the input array g and can also be removed from the function signature for nag_nearest_correlation as follows:

function nag_nearest_correlation(g::Matrix{Float64}, order = Nag_ColMajor, errtol=0.0, maxits=NagInt(0), maxit=NagInt(0))
    n, pdg = size(g)
    pdx = pdg
    x = Array(Float64,n,pdg)
    nag_nearest_correlation!(order, g, pdg, n, errtol, maxits, maxit, x, pdx)
    return x
end

Calling this method now involves only the following code:

G = [ 2.0 -1.0  0.0  0.0;
     -1.0  2.0 -1.0  0.0;
      0.0 -1.0  2.0 -1.0;
      0.0  0.0 -1.0  2.0]

X = nag_nearest_correlation(G)

Having seen the entire process for construction of a user-friendly wrapper for the nag_nearest_correlation function below are the full contents of our NAGdemo module:

module NAGdemo

# Section translated from nag_types.h
@static if is_windows()
    const NagInt = Int32
else
    const NagInt = Int
end
const Nag_ColMajor = NagInt(102)

export
    nag_licence_query,
    nag_license_query,
    nag_nearest_correlation!,
    nag_nearest_correlation,
    last_nag_error

###############################################################################

const LIBNAGC_NAG = "/opt/NAG/cll6i25dcl/lib/libnagc_nag.so.25"

###############################################################################
# Access License

nag_licence_query() = ccall((:a00acc, LIBNAGC_NAG), Cint, ()) == 1
const nag_license_query = nag_licence_query # US vs UK spelling
nag_license_query() || warn("Cannot acquire a NAG license.")

###############################################################################
# Error handling
type NagError <: Exception
    code::Int
    name::ASCIIString
    message::UTF8String
end

Base.showerror(io::IO, e::NagError) =
    print(io, "NAG function \"$(e.name)\" [$(e.code)] – $(e.message)")

const NAG_ERROR = zeros(UInt8,544)
const nag_errors = Array(NagError)
nag_errors[] = NagError(0, "", "NO_ERROR")

cstr_to_array(p::Ptr{UInt8}, own::Bool = false) =
    pointer_to_array(p, Int(ccall(:strlen, Csize_t, (Ptr{UInt8},), p)), own)

function error_handler(msg::Ptr{UInt8}, code::Cint, name::Ptr{UInt8})
    code == 0 && return
    msg = UTF8String(copy(cstr_to_array(msg)))
    name = ASCIIString(copy(cstr_to_array(name)))
    nag_errors[] = NagError(Int(code), name, msg)
    throw(nag_errors[])
end
const ptr_error_handler = cfunction(error_handler, Void, (Ptr{UInt8}, Cint,
                                    Ptr{UInt8}))

function reset_nag_error()
    fill!(NAG_ERROR, 0)
    unsafe_store!(
        convert(Ptr{Ptr{Void}}, pointer(NAG_ERROR)) + 2*sizeof(Cint) + 512,
        ptr_error_handler
    )
end
reset_nag_error()

last_nag_error() = nag_errors[]

###############################################################################

# g02aac / nag_nearest_correlation
# g02aac(Nag_OrderType order, double g[], Integer pdg, Integer n,
#        double errtol, Integer maxits, Integer maxit, double x[], Integer pdx,
#        Integer *iter, Integer *feval, double *nrmgrd, NagError *fail);

function nag_nearest_correlation!(order::NagInt, g::Matrix{Float64},
                                    pdg::NagInt, n::NagInt, errtol::Float64,
                                    maxits::NagInt, maxit::NagInt,
                                    x::Matrix{Float64}, pdx::NagInt)

    iter = Ref{NagInt}(0)
    feval = Ref{NagInt}(0)
    nrmgrd = Ref{Float64}(0.0)

    ccall((:g02aac, LIBNAGC_NAG), Void,
            (NagInt, Ptr{Float64}, NagInt, NagInt, Float64,
            NagInt, NagInt, Ptr{Float64}, NagInt, Ptr{NagInt},
            Ptr{NagInt}, Ptr{Float64}, Ptr{Void}),
            order, g, pdg, n, errtol,
            maxits, maxit, x, pdx, iter,
            feval, nrmgrd, NAG_ERROR)

    return iter[], feval[], nrmgrd[]
end

function nag_nearest_correlation(g::Matrix{Float64}, order = Nag_ColMajor,
                                    errtol = 0.0, maxits = NagInt(0),
                                    maxit = NagInt(0))
    n, pdg = size(g)
    pdx = pdg
    x = Array(Float64,n,pdg)
    nag_nearest_correlation!(order, g, pdg, n, errtol, maxits, maxit, x, pdx)
    return x
end

end

Summary

By using Julia’s features for calling external libraries we have provided easy interfaces into the NAG nearest correlation matrix routines. This example shows that the numerical computing capabilities of Julia can be easily extended to incorporate pre-existing industry grade libraries. For users of Julia, NAG integration allows the possibility of including widely utilized, specialized libraries developed outside of the Julia ecosystem, and for users of the NAG Library it means their existing workflows depending on NAG need not be disrupted by a switch to Julia.

In summary: the NAG Library, which is the worlds’ largest collection of robust and commercially maintained numerical and statistical routines and Julia, which is
a high-productivity, high-performance programming language is a winning combination.

If you are interested in a more extensive integration of Julia and the NAG Library then please contact Julia Computing at info@juliacomputing.com and NAG at infodesk@nag.com.

Trademark Usage

Java® is a registered trademark of Oracle Corporation

MATLAB® is a registered trademark of The MathWorks, Inc.

Python® is a registered trademark of the Python Software Foundation

Visual Basic® is a registered trademark of Microsoft Corporation

References

  1. N. J. Higham. “Computing the nearest correlation matrix – A problem from finance”. IMA J. Numer. Anal., 22(3):329–343, 2002. doi: 10.1093/imanum/22.3.329.

  2. R. Borsdorf and N. J. Higham. “A preconditioned (Newton) algorithm for the nearest correlation matrix”. IMA J. of Numer. Anal., 30(1):94–107, 2010. doi: 10.1093/imanum/drn085.

  3. N. J. Higham, N. Strabić, and V. Šego, “Restoring definiteness via shrinking, with an application to correlation matrices with a fixed block”. SIAM Rev., 58(2), 245–263, 2016. doi:10.1137/140996112.