Approximating a function using derivatives

By: Tamás K. Papp

Re-posted from: https://tamaspapp.eu/pages/blog/2022/hermite-approximation-spectralkit/index.html

When simulating from an economic model, I had to approximate a function \(f(x; \theta): [0,1] \to [0,1]\) for a variety of \(\theta\)s. \(f\) itself has to be solved for numerically, but otherwise it is pretty friendly, being continuous and increasing, with \(f(0)=0\) and \(f(1)=1\).

After profiling, this turned out to be the most costly part, so I had to approximate it. Since I also needed the derivatives \(f'(x)\), I was wondering whether making the approximation match them (known as Hermite interpolation) would increase accuracy.

The (pedagogical, unoptimized) code below sums up the gist of my numerical experiments, with f below standing in for my implicitly solved function. It also demonstrates the new features of SpectralKit.jl v0.10.

First, we set up the problem:

using SpectralKit, PGFPlotsX, DisplayAs

f(x) = (exp(x) - 1) / (exp(1) - 1)
f′(x) = exp(x) / (exp(1) - 1)
const I = BoundedLinear(0, 1)   # interval we map from

Then define an interpolation using N Chebyshev nodes, matching the values.

function interpolation0(f, N)
    basis = Chebyshev(EndpointGrid(), N)
    ϕ = collocation_matrix(basis) \ map(f ∘ from_pm1(I), grid(basis))
    linear_combination(basis, ϕ) ∘ to_pm1(I)
end;
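
As a quick sanity check, the interpolant can be compared against f directly:

f̂ = interpolation0(f, 10)
f̂(0.5) - f(0.5)   # tiny; cf the error tables below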

Same exercise, but now also matching the derivatives. Since each node provides two conditions (a value and a derivative), we need two bases: one with N functions that we evaluate on, and one with N ÷ 2 functions whose grid provides the nodes (so N has to be even).

function interpolation01(f, f′, N)
    @assert iseven(N)
    basis1 = Chebyshev(EndpointGrid(), N ÷ 2) # nodes from this one
    basis2 = Chebyshev(EndpointGrid(), N)     # evaluate on this basis
    x = from_pm1.(I, grid(basis1))            # map nodes from [-1,1]
    M = collocation_matrix(basis2, to_pm1.(I, derivatives.(x)))
    ϕ = vcat(map(y -> y[0], M), map(y -> y[1], M)) \ vcat(f.(x), f′.(x))
    linear_combination(basis2, ϕ) ∘ to_pm1(I)
end;

Importantly, note that lifting to derivatives has to precede the mapping to [-1,1] for the collocation matrix, so that the derivative of the variable transformation is picked up via the chain rule.
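
For example, f̂ from the sanity check above can be evaluated together with its derivative by lifting the argument first, the same mechanism the error calculations below use:

ŷ = (f̂ ∘ derivatives)(0.5)
ŷ[0], ŷ[1]   # value and first derivative: approximately f(0.5) and f′(0.5)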

Then we calculate the maximum absolute difference, in digits (log10).

function log10_max_abs_diff(f, f̂; M = 1000)
    x = range(0, 1; length = M)
    log10(maximum(@. abs(f(x) - f̂(x))))
end;

Then let's explore the errors in values …

Ns = 4:2:20
errors = [(log10_max_abs_diff(f, interpolation0(f, N)),
           log10_max_abs_diff(f, interpolation01(f, f′, N)))
          for N in Ns]
9-element Vector{Tuple{Float64, Float64}}:
 (-3.1996028783051695, -2.594513489315976)
 (-5.882145733021446, -5.488666848393999)
 (-8.835160643191552, -8.232779398084544)
 (-11.994023867372805, -11.44897859894945)
 (-15.176438519807359, -14.664555158828485)
 (-15.35252977886304, -15.35252977886304)
 (-15.255619765854984, -15.35252977886304)
 (-15.35252977886304, -15.35252977886304)
 (-15.35252977886304, -15.35252977886304)

… and derivatives.

d_errors = [(log10_max_abs_diff(f′, (x -> x[1]) ∘ interpolation0(f, N) ∘ derivatives),
             log10_max_abs_diff(f′, (x -> x[1]) ∘ interpolation01(f, f′, N) ∘ derivatives))
            for N in Ns]
9-element Vector{Tuple{Float64, Float64}}:
 (-2.0758500387125216, -2.093336352131656)
 (-4.549339116162139, -4.611253272379436)
 (-7.363367596306161, -7.429305371299876)
 (-10.417554370684012, -10.485171320264207)
 (-13.381718167990524, -13.689771947181466)
 (-13.834015838985154, -14.374806173574193)
 (-14.03551167781493, -14.539616422220185)
 (-13.724140848812729, -14.750469787535078)
 (-13.714040521908403, -14.724140848812729)

Finally, the plots:

@pgf Axis({ xlabel = "number of basis functions",
            ylabel = "log10 abs error in values",
            legend_cell_align= "left" },
          PlotInc(Table(Ns, first.(errors))),
          LegendEntry("fitting values"),
          PlotInc(Table(Ns, last.(errors))),
          LegendEntry("fitting values and derivatives")) |> DisplayAs.SVG

@pgf Axis({ xlabel = "number of basis functions",
            ylabel = "log10 abs error in derivatives",
            legend_cell_align= "left" },
          PlotInc(Table(Ns, first.(d_errors))),
          LegendEntry("fitting values"),
          PlotInc(Table(Ns, last.(d_errors))),
          LegendEntry("fitting values and derivatives")) |> DisplayAs.SVG

The conclusion is that even without matching them explicitly, derivatives are well-approximated. Matching derivatives gains up to an extra digit of accuracy in the derivatives above 12–14 nodes, at the cost of about half a digit of accuracy in the values at a low number of nodes. 14 seems to be the break-even point here, but by then we are at machine precision anyway.

As usual, simply approximating with Chebyshev polynomials is extremely accurate in itself for practical purposes, even when derivatives are needed. Of course, this depends on the function being “nice”.

Julia and batteries

By: Tamás K. Papp

Re-posted from: https://tamaspapp.eu/post/2019-12-21-julia-batteries/

New Julia users frequently suggest that some features should be included in Base (the part that is available directly without using any packages) or the standard libraries. The colloquial phrase is that “Julia should come with batteries included”, hence the title of this post.

In this blog post, I explain why doing this is unlikely to improve anything, likely to make some things worse, and thus often meets with resistance from developers. Most of these points are well-known, and repeated in various discussions. I thought it would be nice to have them in one place.

About the standard libraries

Before the 0.7/1.0 release of Julia, a lot of functionality now in the standard libraries was part of Base. As the codebase grew, this was causing practical problems, so modularizing the code was on the agenda from the very beginning, with the understanding that this would make things load faster, encapsulate implementation details better, and the code would be easier to maintain. The excision was completed for 0.7, and “standard libraries” as we know them now were born.

One very neat thing about Julia is that conceptually, standard libraries are like any other package. There is no “secret sauce” that makes them special: some things like precompilation are enabled by default, but you can do the same for any package.

What is different about standard libraries is an implicit guarantee for code quality and support: they are officially endorsed, and the Julia maintainers fix issues relating to these packages.

When users suggest that other packages they find useful are included in the standard libraries, they usually expect that the same kind of code quality and support guarantees would extend automatically to these packages.

Unfortunately, since developer time is a scarce resource, this would only mean that the same number of people would now have to maintain more code. This would most likely result in longer delays for releases, and is not necessarily a guarantee of better support: some standard libraries have (minor) issues that have been outstanding for a long time.

The benefits of (third party) packages

Below, I list some major advantages of having code in third party packages (what we normally think of as “packages”, usually registered) that are maintained outside the Julia repository.

Separate versioning, more frequent releases

Standard libraries, not being versioned separately, maintain the same SemVer compatibility guarantees as the rest of Julia. This means that no incompatible changes are allowed without changing the major version — in other words, anything that ends up there would have to be supported without breaking changes for the rest of the 1.x life cycle.

Since nontrivial interfaces usually require a few iterations to become polished, this would mean that the authors would need to get it right without a lot of community testing, or keep legacy methods around for the rest of 1.x.

Third party packages are versioned separately. This means that they can release as often as they like, experiment with new ways of doing things and deprecate old APIs, and add and reorganize features without the implicit cost of having to support them for a long time. Because of this, it is not uncommon for actively maintained packages to have minor releases with new features every month or so, and bugfix releases within a day.

Easier contribution

Contributing to standard libraries is usually a more involved process than making a PR to third party packages. Unit tests take longer to run (since the whole of Julia is tested), ramifications of changes have to be considered in wider context, and contributors have to be much more cautious with experimentation since anything that ends up there would need to be supported for a long time.

Changes need to be reviewed by people who may be very busy working on some other part of Julia, so reviews can take a long time; decisions frequently need a wider consensus, and some PRs just languish unresolved for months. This naturally raises the barrier to entry, and can be quite frustrating to contributors.

In contrast, many packages have a single or a few authors, who have a clear vision of where they want to take their package, and are usually able to make decisions quicker.

Separate documentation and issue tracker

The Julia repository has a total of 17k issues (!), 14k of them closed, a similar number of pull requests, and a manual that has 400 pages on the standard libraries (in PDF format). It is occasionally challenging to find an earlier issue, discussion, or documentation for something.

For third party packages, having a separate issue tracker and documentation makes it much easier to find what you are looking for.

Development happens in packages

If you would like Julia to support something that is experimental or not fully specified (and most software isn’t), it is unlikely that adding it to Base or the standard libraries is the right solution.

Instead, it is usually recommended that you put it in a package and work out the details, improving the code based on experience and feedback. There is no downside to this in most cases, just the benefits mentioned above.

If the experiment pans out, great: you have created a nice package! If it doesn’t, it can be retired and archived, without having to commit to continued support.

How my Julia coding style changed

By: Tamás K. Papp

Re-posted from: https://tamaspapp.eu/post/2019-09-07-coding_style_retrospective/

The recent redesign of DynamicHMC.jl provides an opportunity to reflect on how my Julia coding style changed. I started this package two years ago, and did not do any major refactoring after it became usable, mostly because I did not have a good comprehensive test suite or a solid conceptual understanding of all the ramifications of the algorithm, especially the adaptation, and I was afraid of breaking something that I just got working. Consequently, the code before the redesign largely reflected the coding style I developed for Julia 0.5 and then partly 0.6.

I am one of those people who recognize the benefits of reflecting on the past from time to time, but are too busy (read: lazy) to keep a diary. But code under version control is effectively a diary about the life of software, and I find it interesting to think about how my Julia coding style has changed in the past two years.

It is important to emphasize that this writeup reflects how I code at the moment. I recognize that other coding styles in Julia can be perfectly legitimate, and when I contribute to packages, I do my best to follow existing style. Also, it is very likely that my own coding style will change as the language evolves. In this blog post, I won’t address details of coding style about which there is a more or less general agreement — see the Style Guide in the manual and YASGuide for useful summaries. Finally, some examples are stylized adaptations of real code for simpler exposition.

Randomness

Random numbers are very important for scientific computing, but at the same time present an important challenge for writing bug-free parallel code and unit testing. A lot of code in Random and Base follows the interface conventions

rand([rng], ...)

making the random number generator an optional first argument, and defaulting to the global one.

While this can be convenient for user code, in packages I find that it is better to keep the random number generator as an explicit argument to almost all methods. This paves the way for making the code work nicely with composable multi-threading too.
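As a stylized sketch (the function is made up for illustration):

using Random

random_direction(rng::AbstractRNG) = rand(rng, Bool) ? +1 : -1   # rng is always explicit

random_direction(Random.default_rng())   # the caller passes even the default RNG
random_direction(MersenneTwister(42))    # reproducible, eg for unit tests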

If random and deterministic parts of the algorithm can be separated, it should be done, as it usually makes the code cleaner, and unit testing much easier. An example of this is factoring out the random directions for tree doubling in NUTS: drawing a single set of bit flags once and providing this to the tree building algorithm, the latter can treat the doubling directions as predetermined. This also made it much easier to verify detailed balance in the unit tests.
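Continuing the sketch above (the representation here is made up; the actual implementation in DynamicHMC.jl differs in details):

# draw all doubling directions at once, as bit flags (one bit per potential doubling)
flags = rand(MersenneTwister(42), UInt32)

# the tree building code can then treat direction i as predetermined data
is_forward(flags, i) = isodd(flags >> (i - 1))

is_forward(flags, 1)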

Named tuples

Named tuples are extremely useful for organizing multiple values as inputs and outputs when a composite type (struct) is not warranted or needed. If a function returns multiple values, especially more than two, I now always default to using named tuples, ie

function f(x, y)
    a = ...
    b = ...
    c = ...
    (a = a, b = b, c = c)
end

then use @unpack from Parameters.jl for destructuring:

@unpack a, b = f(x, y) # I don't need c here

This is costless and makes the interfaces more flexible: for example, I can add extra fields, or eventually make the returned value a composite type later, without changing anything in the callers.

Multi-liners as opposed to one-liners

Almost all the “clever” one liners are now expunged from the code. Compare

EBFMI(x) = (πs = map(x -> x.π, x); mean(abs2, diff(πs)) / var(πs))

with

function EBFMI(tree_statistics)
    πs = map(x -> x.π, tree_statistics)
    mean(abs2, diff(πs)) / var(πs)
end

The first one is kind of crammed, and difficult to read, while the second one makes it clear that an interim value is obtained for the calculation.

When a function returns multiple values, I find it nice to assign each to a variable first, ie

function f(x, y)
    a = g(x, y)
    b = h(x, y)
    a, b
end

as opposed to

f(x, y) = g(x, y), h(x, y)

This makes the returned structure much more readable when I need to look at some code later.

That said, I still have a strong preference for writing small functions that “do one thing”. This again is frequently an optimal strategy for Julia, as the compiler is very good about inlining decisions (and when not, can be nudged with @inline).

Abstract types can be an implementation detail

Abstract types can be very useful for organizing method dispatch: code like

abstract type SomeAbstractType end

f(x::SomeAbstractType) = do_something()

struct ThatSubType{T} <: SomeAbstractType
    y::T
end

g(x::ThatSubType) = do_something_else(x.y)

can help avoid repetition.

But abstract types are just one convenient tool for this, and building a type tree is a tool, not an end in itself. When they are no longer adequate, other solutions such as traits should be used instead.

Consequently, these days I mostly consider type hierarchies an implementation detail and I am wary of exposing them in the API of a package. This is especially important if some functionality requires that the user of a package implements their own custom types, e.g. for a model. Requiring

UserDefinedModel <: SomePackage.ModelType

in order to work with some API is usually unnecessary, and can cause complications when the user wants UserDefinedModel to implement some other API. Either duck typing, or an explicit interface check like

implements_model(::Any) = false                       # fallback
implements_model(::Type{<:UserDefinedModel}) = true   # opt-in for user-defined types

can work better (returning types when that is necessary, eg for further dispatch — see Base.IteratorSize for an example).

Small type unions

The ability of the compiler to create efficient code for small type unions made code organization much easier for me. For example, DynamicHMC.adjacent_tree attempts to build a tree next to some position, but this can fail because of divergence or turning in the leapfrog integrator, in which case it is useful to bail out early and return information about the reason. Consequently, this function either returns a tree, or an InvalidTree, which is a container with the aforementioned diagnostic information. The compiler recognizes this and generates fast code.
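A stylized sketch of the pattern (the actual types in DynamicHMC.jl carry more information):

struct Tree
    depth::Int
end

struct InvalidTree          # container for diagnostic information
    reason::Symbol
end

function adjacent_tree(depth)
    depth > 5 && return InvalidTree(:divergence)   # bail out early with the reason
    Tree(depth)
end

# the inferred return type is Union{Tree, InvalidTree}, for which the
# compiler generates fast code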

Some attention should be paid to avoid combinatoric type explosions, but when used well, small type unions can make code much simpler at little or no extra cost.

Functional is often the best solution

Preallocating containers for outputs can make code faster, but at the cost of extra complexity, especially that of having to figure out type parameters for said containers.

My general strategy is to write code in a functional, side-effect-free style until I am really convinced that preallocation would help, and then confine it to carefully compartmentalized parts of the code. This is of course ideal only for some kinds of code — when allocations dominate, a functional style is not optimal.

That said, Julia’s compiler is getting more and more clever about optimizing allocations, so I frequently find that I never rewrite code with preallocated containers, and occasionally even revert from preallocated code to a functional style without a relevant performance cost.
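A stylized illustration of the two styles:

# functional, side-effect-free: allocates the result, easy to test and reason about
normalize(v) = v ./ sum(v)

# preallocating variant, worth confining to hot loops when profiling warrants it
function normalize!(out, v)
    s = sum(v)
    @. out = v / s
    out
end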

When in doubt, document

Some readers of the code of DynamicHMC.jl will have the impression that it is overdocumented. Probably 20–25% of the lines in the source are docstrings, and most internal interfaces and convenience functions have one.

Primarily, this is driven by the embarrassing experience of revisiting code written two years ago which kind of works, but no longer being sure of all the details that would be nice to know before refactoring. But another consideration is encouraging contributions, or uses of the package for research: besides being an implementation of the NUTS sampler, DynamicHMC.jl is also a platform for experimenting, which is easier to do when the internals are documented.

Julia’s documentation ecosystem is now very mature and robust. Besides the excellent Documenter.jl, I also find DocStringExtensions.jl very useful.
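For example, a toy docstring for an internal convenience function, using $(SIGNATURES) from DocStringExtensions.jl:

using DocStringExtensions

"""
$(SIGNATURES)

Sum of squares of the elements of `xs`.
"""
sum_of_squares(xs) = sum(abs2, xs)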

Internal interfaces have high return on investment

Many packages, especially those that are work in progress or experimental, do not document their user interfaces very carefully. This is of course understandable, but I have developed a preference for abstracting out and documenting interfaces in internally used code, too, as the package matures.

This makes refactoring and compartmentalization much easier, and makes sense even if a particular interface has only a single implementation in the package, because unit tests can just provide a “dummy” implementation and test related code that way. An example of this is the code for “trees” and the related tests: trees here are an abstraction for walking integer positions in two directions. This is key for the NUTS algorithm, but can be handled as an abstraction separate from the details of leapfrog integrators.
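A stylized sketch of the testing pattern (the interface here is made up, and much simpler than the actual one):

using Test

# generic code only assumes that `move(tree, position, direction)` is defined
walk(tree, position, directions) =
    foldl((p, d) -> move(tree, p, d), directions; init = position)

# a "dummy" implementation, just enough to exercise the generic code
struct DummyTree end
move(::DummyTree, position, direction) = position + direction

@test walk(DummyTree(), 0, [1, 1, -1]) == 1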