Tag Archives: General Programming

MATLAB vs Julia: Best Programming Language for Renewable Energy Simulations

By: Great Lakes Consulting

Re-posted from: https://blog.glcs.io/matlab-vs-julia-renewable-energy

This post was written by Steven Whitaker.

At GLCS, we’re proud to have delivered innovative projects across top industries, including renewable energy, aerospace, and biomedical engineering.Our comprehensive Modeling and Simulation Servicesempower clients to elevate their designs,whether rewriting models in Juliafor greater efficiency or unlocking cutting-edge featuresto solve complex problems.With deep expertise spanning several engineering and scientific domains,including computational fluid dynamics,thermodynamics,controls,biomedical engineering,and chemistry,we are your trusted partner in pushing modeling boundariesand achieving breakthrough results.

In this post,we’ll focus on systems modelingfor renewable energy.

Renewable energy systems,from wind farms to wave power,require precise modeling and simulation.Selecting the optimal computational tools is crucial;it can significantly acceleratedevelopment, reduce costs, and drive the transitionto a sustainable future.

Both MATLAB and Julia are widely used in engineering,particularly for solving differential equations.This post compares how each handlesa renewable energy modeling scenario,the steady axisymmetric turbulent wakebehind a wind turbine,ultimately showing why Julia has the edge.Maximize efficiency and energy outputwith cutting-edge technologiesdesigned for tomorrows energy.

Steady Axisymmetric Turbulent Wake

When air flows past a wind turbine,a wake forms downstream.The wake velocity deficit impacts turbine spacing,efficiency,and power output.

A simplified steady axisymmetric turbulent wake equation is:

\[\frac{\partial U}{\partial x} = \nu_t \cdot \left( \frac{\partial^2 U}{\partial r^2} + \frac{1}{r} \frac{\partial U}{\partial r} \right)\]

where:

  • \( U \) is the axial velocity.
  • \( x \) is the downstream distance.
  • \( r \) is the radial coordinate.
  • \( \nu_t \) is the turbulent viscosity.

This is a reduced form of the momentum equation,capturing diffusion of momentumdue to turbulence.

MATLAB vs Julia: ODE/PDE System Set-up

Let’s see how to convert the math into codeand solve this PDE.

  • MATLAB:

    function dUdx = wake_eq(x, U, p)    % Radial step size    dr = p.r(2) - p.r(1);    % First derivative (central difference)    dUdr = (U(3:end) - U(1:end-2)) / (2 * dr);    % Second derivative (central difference)    d2Udr2 = (U(3:end) - 2 * U(2:end-1) + U(1:end-2)) / dr^2;    dUdx = zeros(size(U));    dUdx(2:end-1) = p.nu_t * (d2Udr2 + dUdr ./ p.r(2:end-1));endU0 = initial_profile();xspan = [0 100];p.nu_t = 0.05;p.r = linspace(0, 1, numel(U0));prob = ode;prob.ODEFcn = @dUdx;prob.InitialTime = xspan(1);prob.InitialValue = U0;prob.Parameters = p;prob.Solver = "ode45";sol = solve(prob, xspan(1), xspan(2));
  • Julia:

    using DifferentialEquations@kwdef mutable struct WakeParams{T}    _t::Float64    const r::Tendfunction wake_eq!(dU, U, p, x)    # Radial step size    dr = p.r[2] - p.r[1]    # First derivative (central difference)    dUdr = (U[3:end] .- U[1:end-2]) ./ (2 * dr)    # Second derivative (central difference)    d2Udr2 = (U[3:end] .- 2 .* U[2:end-1] .+ U[1:end-2]) ./ dr^2    dU[1] = dU[end] = 0    dU[2:end-1] .= p._t .* (d2Udr2 .+ dUdr ./ p.r[2:end-1])endU0 = initial_profile()xspan = (0.0, 100.0)p = WakeParams(; _t = 0.05, r = range(0.0, 1.0, length(U0)))prob = ODEProblem(wake_eq!, U0, xspan, p)solver = Tsit5()sol = solve(prob, solver)

(Note that, in practice,the boundary conditionsfor \( r = 0 \) and \( r = 1 \)would need to be handled with more care.)

As you can see,the syntax for Julia and MATLABis quite similar.However,there are some key differencesbetween the two approaches:

  • Julia’s broadcasting (.=, .*, etc.) is explicit and fast.
  • Julia can use in-place functionsto minimize memory allocations,boosting performance.
  • The Julia code for the dynamics above was writtento look more like the MATLAB code.However,additional easy performance optimizations are possibleto reduce memory allocations.
  • Julia’s DifferentialEquations.jl supports many solverswith a unified interface.MATLAB supports only a limited set of solvers.

Event Handling and Callbacks: Wind Gusts

Now let’s add a gust at \( x = 50 \)that will change \( \nu_t \)for \( x \ge 50 \).

  • MATLAB:

    function v = events_func(x, U, p, gust_position)    % Event occurs when `x == gust_position`.    v = x - gust_position;endfunction [stop, U, p] = callbacks_func(x, U, ie, p)    stop = false;    % Check if the event occurred.    if ismember(1, ie)        p.nu_t = 0.08;    endendgust_position = 50;event = odeEvent;event.EventFcn = @(x, U, p) events_func(x, U, p, gust_position);event.Response = "callback";event.CallbackFcn = @callbacks_func;prob.EventDefinition = event;sol = solve(prob);
  • Julia:

    function gust_affect!(integrator)    integrator.p._t = 0.08endgust_position = 50.0;callback = PresetTimeCallback([gust_position], gust_affect!)sol = solve(prob, solver; callback)

When it comes to events and callbacks,Julia’s approach is better:

  • Julia’s callbacks are better organized;events and corresponding callbackscan be defined next to each otherin the code,instead of separated across filesas is typical in MATLAB.
  • Multiple events can be combined cleanlywith CallbackSet,no need to try to cram multiple eventsinto a single functionlike you have to do in MATLAB.
  • Julia keeps track of what events are triggered.In MATLAB,you have to checkwhat events were triggered manually.
  • Julia provides many pre-defined callbacks(in DiffEqCallbacks.jl)that often have to be manually implemented in MATLAB.

Other Considerations

In addition to the differencesin solving differential equations,here are some other key differencesbetween Julia and MATLAB:

  • Performance:Julia’s JIT-compilation and method specializationyield C-like speeds.Large-scale renewable energy simulationsrun faster and scale better,especially for parameter sweeps.You can expect to see 50–150 times faster code with Julia.
  • Workflow:MATLAB offers a polished GUI and plotting out of the box.However,Julia offers open-source freedom,easy integration with Python/C,and a thriving ecosystem.
  • Licensing:MATLAB requires a paid license;Julia is entirely free and open-source.

Summary

In this post,we saw how Julia and MATLAB comparefor defining and solvingsteady axisymmetric turbulent wake.Both languages can model renewable energy systems effectively.However, Julia offers:

  • Performance: JIT speed for large systems.
  • Flexibility: Powerful callbacks and open integrations.
  • Cost: No license fees.
  • Modern syntax: Designed for productivity and clarity.

Ready to revolutionize your renewable energy simulations? Transition your MATLAB models to Juliaand experience unparalleled speed, flexibility, and long-term maintainability.Reach out today and let’s accelerateyour green energy innovations together!

Worried about the technical hurdles and costsof switching from MATLAB to Julia?Discover our cutting-edgeJulia-MATLAB Integration.We develop high-performance Julia modelsthat seamlessly connect withyour existing MATLAB codebase,minimizing risks while maximizingyour return on investment.Transition smarter, faster, and more cost-effectivelywith our expert solutions!

Additional Links

MATLAB is a registered trademarkof The MathWorks, Inc.

]]>

Accelerate Your Julia Code with Effective Profiling Methods

By: Great Lakes Consulting

Re-posted from: https://blog.glcs.io/profiling_allocations

This post was written by Steven Whitaker.

In this Julia for Devs post,we will discussusing Julia’s Profile standard libraryfor performance and allocation profiling.We will illustrate these toolswith example codeand then show how to improve the code.

This post will showcasethe powerful techniques we usedto significantly accelerateour clients simulations,resulting in a remarkable 25% reduction in run timeof code that was already highly optimized for performance.The impact of these techniqueson our client’s workis a testament to their effectivenessand should inspire youin your own projects.

Many new Julia developersface poor code performance,but these techniques can net10x or even 100x faster run times!

While we won’t be sharing client-specific code,we will provide similar examplesthat are practical and applicableto a wide range of simulations.This approachshould give you the confidenceto apply these techniquesin your work.

Here are some of the key ideas we will focus on:

  • Pinpoint areas of improvement with Profile.@profile.
  • Track down allocations with Profile.Allocs.@profile.
  • Improve run time and reduce allocations with @generated functions.

Let’s dive in!

Profiling in Julia

Profiling is a crucial toolfor locating performance bottlenecks.This knowledge is invaluableas it guides development effortsto the parts of the codethat will have the most significant impact on run time.Understanding this processwill keep you informedand in controlof your code’s performance.

Here is some example codewe will useto illustrate profiling in Julia:

using StaticArrays: SVectorfunction kernel_original(x::SVector{N}) where N    y = zeros(N + 1)    y[1] = x[1]    for i = 2:N        y[i] = 0.5 * y[i-1] + x[i]    end    y[end] = sum(@view(y[1:end-1]))    return SVector{N+1}(y)endfunction workflow_original()    total = 0.0    for i = 1:100        x = SVector{10, Float64}(rand(10))        y = kernel_original(x)        total += y[end]    end    return totalend

The main idea with this exampleis that there is a workflow,workflow_original,that calls out to a core computationfor many different input values.

To profile this code,we will use the Profile standard library.Since Profile implements a statistical profiler,we need to ensure the code we profileruns long enoughto reduce the impact of noise in the measurements(see the documentation for more info).So,let’s see how long workflow_original takes(using BenchmarkTools.jl):

julia> using BenchmarkToolsjulia> @btime workflow_original();  6.410 s (300 allocations: 25.00 KiB)

6 microseconds is very fastcompared to the default profiling sample delayof 1 millisecond.Therefore,we must run the workflow many timesto get enough samples.We have found that 1000–5000 samplesare usually plentyfor identifying performance bottlenecks,though slower code will likely need more samplesand/or a larger sample delay.

So,to get approximately 1000 samples,we will need to run the workflow\( 1000 \cdot \frac{1 \mathrm{ms}}{0.006 \mathrm{ms}} = 166,667 \) times.We’ll round up to 200,000 for good measure.

Let’s see how to profile the code:

julia> using Profilejulia> Profile.clear()julia> workflow_original();julia> Profile.@profile for _ in 1:200_000 workflow_original() end

A couple of notes:

  • The Profile.clear() stepis not strictly necessary.However, a habit of always clearing the profile dataensures one doesn’t inadvertently spoil their profileswith old data from previous profiling results.
  • We called workflow_original once before profilingto avoid profiling JIT compilation.

Now we want to display the profile data.One way is via Profile.print,which prints a textual representationof the profile to the console,but this isn’t the most efficient methodfor inspecting the data.Typically,profile data is visualizedusing a flamegraph.

The Profiling docs list several packagesfor visualizing profiles.We will use ProfileCanvas.jl,which creates an HTML filewe can view in a web browserand interactively inspect the profiling results:

julia> using ProfileCanvasjulia> ProfileCanvas.view()

This code creates and displays a flamegraph:

Profile of original workflow

One thing that stands out in this flamegraphis the three yellow rectangles.These indicate occurrences of garbage collection (GC),which implies the code allocated memory.Since these yellow blocksrepresent a decent portion of the run time(as indicated by the width of the rectangles),let’s investigate.

Allocation Profiling

We will use Julia’s allocation profiling toolsto further inspectthe allocations we know the workflow has.The process is similar to performance profiling,with the following differences:

  • We will use Profile.Allocs.@profileand pass the option sample_rate = 1.0to record all allocations.In a larger workflow with many more allocations,a smaller sample_rate is advised(the default value is 0.1).
  • We will run the workflow just one time.
  • We will visualize the results with ProfileCanvas.view_allocs.

Here’s how to profile allocations:

julia> using Profile, ProfileCanvasjulia> Profile.Allocs.clear()julia> workflow_original();julia> Profile.Allocs.@profile sample_rate = 1.0 workflow_original();julia> ProfileCanvas.view_allocs()

Allocation profile of original workflow

In this flamegraph,the widths of each rectanglecorrespond to how many allocations were made.In this example,each yellow blockis the same width,meaning they all contributedthe same number of allocations.

Now we need to determinewhether we can do anything about these allocations.

Moving up from the bottom of the flamegraphlets us trace the call stackto see where these allocations came from.For example,we can see that GenericMemorywas called from Array,which itself was called from Array,and so on.Eventually,we get to workflow_original.

If we hover our mouse cursor over that block,it will display the file and line number:

Mouse hover displaying file and line info

(In this case,I defined workflow_originalin the first REPL promptof a fresh Julia session,so that’s why REPL[1] shows up for the file name.)

Looking at these profile resultsshows us that the offending lines are:

  • x = SVector{10, Float64}(rand(10)) (in workflow_original)
  • y = zeros(N + 1) (in kernel_original)

This makes sense;in each of these lines,we explicitly create an array(via rand and zeros),which allocates memory.

But it turns outwe can eliminate these allocations!

Eliminating the first allocationrequires knowledgeof the StaticArrays.jl package.In particular,the @SVector macrocan create an SVectordirectly from standard array expressions(like rand)without allocating memory.So,instead of using the SVector constructor,we can write:

x = @SVector rand(10)

The second allocation,however,is a bit trickier to remove.

Reducing Allocations with @generated Functions

What makes it difficultto remove the allocationin kernel_originalis the elements of the vector ydepend on previous elementsof the vector.That means we have to compute one elementto compute the next,which means we have to store that element somehow.If we knew exactly how long y would be,we could store the computationsin local (scalar) variables.However, the function needs to work with any input size;we don’t want to create a separate methodfor each possible input size.

At least, not by hand.

It turns outJulia can automatethe creation of these specialized methods.The trick is to use the @generated macro.

A method annotated with @generateduses type informationto produce specialized implementationsof the methoddepending on the input types.And since type information is available at compile time,this specialization occurs at compile time,leading to run time improvements.

Using a @generated functionto replace kernel_originalwill allow us to, essentially,move the run time allocationto compile time.

Here’s what the new function looks like:

@generated function kernel_generated(x::SVector{N}) where N    assignments = [:(y1 = x[1])]    for i = 2:N        yprev = Symbol(:y, i - 1)        yi = Symbol(:y, i)        push!(assignments, :($yi = 0.5 * $yprev + x[$i]))    end    yend = Symbol(:y, N + 1)    sum_expr = reduce((a, b) -> :($a + $b), (Symbol(:y, i) for i = 1:N))    push!(assignments, :($yend = $sum_expr))    vars = ntuple(i -> Symbol(:y, i), N + 1)    return quote        $(assignments...)        return SVector{$(N + 1), Float64}($(vars...))    endend

The first thing you might notice,especially if you are unfamiliar with metaprogramming,is that this function looks quite differentfrom kernel_original.So, let’s unpack this a bit:

  • A @generated functionneeds to return an expression.The compiler will then compilethe code resulting from the expression.Finally, at run time,the compiled code will be used,not the code usedto generate the compiled expression.In other words,this function must returnthe specialized code itself,not the result of a run time computation.
  • The returned expressionis created with a quote block.This quote blockinterpolates (using $) expressionsbuilt up earlier in the function.The computationsto build up these expressionsoccur only at compile time.

If we want to inspectwhat the generated function looks like,we need to refactor the code just a bit.Essentially,we’ll create a regular Julia functionthat returns an expression,and the @generated functionwill just call that function:

function _gen_kernel_generated(::Type{<:SVector{N}}) where N    # Same code as `kernel_generated` above.end@generated kernel_generated(x::SVector) = _gen_kernel_generated(x)

Then we can call _gen_kernel_generated directlyto see what code actually runs:

julia> _gen_kernel_generated(typeof(@SVector rand(1)))quote    #= REPL[2]:18 =#    y1 = x[1]    y2 = y1    #= REPL[2]:19 =#    return SVector{2, Float64}(y1, y2)endjulia> _gen_kernel_generated(typeof(@SVector rand(2)))quote    #= REPL[2]:18 =#    y1 = x[1]    y2 = 0.5y1 + x[2]    y3 = y1 + y2    #= REPL[2]:19 =#    return SVector{3, Float64}(y1, y2, y3)endjulia> _gen_kernel_generated(typeof(@SVector rand(3)))quote    #= REPL[2]:18 =#    y1 = x[1]    y2 = 0.5y1 + x[2]    y3 = 0.5y2 + x[3]    y4 = (y1 + y2) + y3    #= REPL[2]:19 =#    return SVector{4, Float64}(y1, y2, y3, y4)end

Yep, the implementation looks right!

We can also seethat the generated codeavoids creating an arrayto store intermediate results,instead storing computationsin local variables.But we didn’t have to writeany of those methods ourselves!Using @generated allows usto maintain just one functionto generate all these specialized implementations.

Let’s benchmark the new implementation:

julia> @btime workflow_generated();  1.093 s (0 allocations: 0 bytes)

Nice, six times fasterand no allocations!

And here’s the profile:

Profile of optimized workflow

(Click here to see the code for profiling.) juliausing StaticArrays: @SVector, SVector@generated function kernel_generated(x::SVector{N}) where N assignments = [:(y1 = x[1])] for i = 2:N yprev = Symbol(:y, i - 1) yi = Symbol(:y, i) push!(assignments, :($yi = 0.5 * $yprev + x[$i])) end yend = Symbol(:y, N + 1) sum_expr = reduce((a, b) -> :($a + $b), (Symbol(:y, i) for i = 1:N)) push!(assignments, :($yend = $sum_expr)) vars = ntuple(i -> Symbol(:y, i), N + 1) return quote $(assignments...) return SVector{$(N + 1), Float64}($(vars...)) endendfunction workflow_generated() total = 0.0 for i = 1:100 x = @SVector rand(10) y = kernel_generated(x) total += y[end] end return totalendusing BenchmarkTools@btime workflow_generated();using Profile, ProfileCanvasProfile.clear()Profile.@profile for _ in 1:200_000 workflow_generated() endProfileCanvas.view()Note,there’s no need for allocation profilingbecause there are no allocations.

There aren’t obvious places for improvement,so I’d say we did a pretty good joboptimizing the code.

Summary

In this post,we saw how to use the Profile standard libraryto profile the run timeand the allocationsof a piece of code.We also illustratedhow a @generated functioncan eliminate run time allocationsand speed up the code.These were key ideas we usedto help one of our clientsspeed up their simulations.

Do you need help pinpointing performance bottlenecksor tracking down allocationsin your code?Contact us, and we can help you out!

Additional Links

]]>

Enhance Model Accuracy Using Universal Differential Equations

By: Great Lakes Consulting

Re-posted from: https://blog.glcs.io/ude_symbolic_regression

This post was written by Steven Whitaker.

In this Julia for Devs post,we will explore the fascinating world of using universal differential equations (UDEs)for model discovery.A UDE is a differential equationwhere part (or all) of itis defined by a universal approximator(typically a neural network).These UDEs play a pivotal rolein uncovering missing aspectsof established models.

UDEs enable seamless integrationof known mechanistic modelswith data-driven modeling approaches.Rather than omit or guess at unknown aspects of a model,a neural network can be embedded into the model,which can then be trained using observed data.

Once trained,the embedded neural networkcan be approximated by a mathematical expression.This allows us to replace the neural networkwith an interpretable formula,improving transparencywhile filling in the missing aspectsof the original modelwith new structure learned from observed data.

This post will illustratekey ideasand design decisions madewhile helping one of our clientsuncover missing dynamics in their model,a complex system representingthe human body under treatmentof a rare disease.

By replacing just one aspect of our client’s modelwith a neural network,we achieved a significant milestone:a 30% reduction in the error of the model predictions.This success story is a testament to the powerof UDEs in enhancing model performance.

We will not share client-specific code,but will provide similar examplesto illustrate our approach.

Here are some of the key ideas we will focus on:

  • Ensure nonnegativity of state variables (as needed).
  • Appropriately initialize neural network weights.
  • Normalize neural network inputs.
  • Make dynamics modular.
  • Fit symbolic expression.

Note that we will discussfully connected neural networks,so some comments belowmay not apply if other architectures are used.

And with that,let’s dive in!

Ensure Nonnegativity of States

When working with a modelrepresenting a real phenomenon,chances are that at least some of the model’s state variablesare naturally nonnegative.For example,the energy of a systemor the concentration of a chemicalcannot be negative quantities.

However,a differential equation solversees only math,not the physical phenomenonthe math represents.So,we have to tell the solverto respect the natural constraintsthat exist.

One way to enforce nonnegativityis to pass the isoutofdomain optionto solve,where isoutofdomain is a functionthat returns truewhen at least one of the provided state variablesviolates the nonnegativity constraint.For example,if all state variables should be nonnegative,the function can be defined as:

isoutofdomain(u, p, t) = any(<(0), u)

As another example,if only the first and third states should be nonnegative:

isoutofdomain(u, p, t) = u[1] < 0 || u[3] < 0

For more informationand other toolsfor tackling nonnegativity,check out the SciML docs.

The ideaof telling the solverabout the nonnegativity constraintsmay seem obvious.Still, it has important implicationsfor the initialization of neural networksembedded into the model.Care should be takento ensure the initial network weights(before training)do not cause states to become negative.See the next section below for more details.

Appropriately Initialize Neural Network Weights

To learn missing dynamics,the neural network in a UDEneeds to be trained.

Training the neural network can occuronly if the UDE can be solvedusing the initial network weights.Otherwise,the gradient of the loss,which is used by optimization algorithmsto train the neural network,cannot be computed.

As a result,it is necessaryto choose a good initializationfor the neural network weights.Here are some approaches for initializationand how well they work:

  • Random initialization:Typically,neural network weightsare initialized randomly.However,this results in random UDE dynamics,so whether a state variablestays positive(or satisfies any other constraints)is left to chance.And if constraints are violated,but we try to enforce themvia, e.g., isoutofdomain,the solver will be unable to solve the UDE,preventing computation of the loss gradient.So,random initialization might work,but it might not.
  • Zero initialization:Don’t do this.Network weights won’t ever update during trainingbecause the gradients with respect to those weightswill always be zero.
  • Constant initialization:Don’t do this, either.Initializing all the weightsto the same, non-zero valuewill result in some amount of learning,but it severely limitsthe expressivity of the network.
  • Pre-training:If the neural network in the UDEis used to replace an expressionthat is believed to be incorrect(but otherwise gives a usable model),one way to initialize the network weightsfor UDE trainingis to initialize the weights randomlybut then train the network directly(i.e., not the UDE as a whole)to fit the expression the network is to replace.This works because no constraints are involvedduring pre-training(so randomly initializing the weights is okay),but when training the UDE,the network weights have been setto avoid running into issues with constraints.
  • Smoothed Collocation:Smoothed collocation is another approachfor pre-training the neural networkwithout involving constraints.See the SciML docs for more info.

For our client,randomly initialized weightsfailed to give a usable gradientfor training,so we decided on pre-trainingbecause they had a reasonable guessthat we could fit the initial network weights to.

Normalize Neural Network Inputs

One key to ensuring UDE training proceeds nicelyis to normalize the inputsto the neural networkto ensure all inputsare roughly of the same magnitude.

Normalizing is essentialbecause the gradient of the training loss functionis proportional to the network inputs.If one input tends to be 1000 times largerthan another,the gradient will also tend to be that much larger,dominating the learning algorithm.

Effect of input scale imbalance on training

To illustrate how to normalize,suppose we have the following neural network(created using Lux.jl):

nn = Chain(    Dense(2, 5, tanh),    Dense(5, 5, tanh),    Dense(5, 1),)

Then we can add a function to the Chainthat performs the normalization:

nn_normalized = Chain(    x -> x ./ normalization,    # Other layers as before)

In this example,normalization is a Vectorcontaining the valuesto scale each input variable.And of course,we’re not limited to scaling the inputs;we can implement whatever function we wantto process the inputs(such as z-score standardization).

The neural network in our client’s UDEdid not learn without normalization.After normalizing the inputs,we got the neural networkto learn meaningful dynamics.

Make Dynamics Modular

While not strictly necessary for UDEs,making the dynamics modularvastly simplifies comparingbetween different approaches.

For our client,we wanted to comparethree versions of the dynamics:the original dynamics,the UDE(where part of the original dynamicswas replaced with a neural network),and the updated dynamics(where the trained neural networkwas replaced with a symbolic expressionfit to the neural network).

Rather than duplicate the dynamics function three timeswith each slight variation,we allowed the variable part of the dynamicsto be passed in as an optionto the dynamics function.To illustrate:

function f!(du, u, p, t; g = original_g)    # ... some code ...    # `a` and `b` are values computed above    # or pulled from `u`, etc.    val = g(a, b, p)    # `val` is used to update `du` in some way.    # ... some code ...end

However,ODEProblems don’t allow passing optionsto the dynamics function.But that’s not a problem;we can create a closureto set the optionbefore handing the function to an ODEProblem.For example,if we want g to use a Lux.jl neural network:

nn = Chain(...)(p_nn, st) = Lux.setup(...)# Be sure to include `p_nn` in the `ODEProblem` parameters `p`.g_nn = (a, b, p) -> nn([a, b], p.p_nn, st)[1][1]f_nn! = (du, u, p, t) -> f!(du, u, p, t; g = g_nn)

Setting up our client’s code like thisreduced code duplicationwhile allowing easy comparisonsacross different versions of the dynamics.

Fit Symbolic Expression

One of the key stepsto learning interpretable dynamicsfrom a UDEis to fit a symbolic expressionto the trained neural network.Doing so allows us to replacethe black box neural networkwith a mathematical expressionwe can understand.

The first stepis to collect the data to usefor the fitting.If the neural network takes two inputs,we will need a Matrix of values:

X = [    a1 a2 ... aN;    b1 b2 ... bN;]

The (ai, bi) pairsshould cover the expected range of input values.One way to determine this rangeis to solve the UDEand save off the valuesthat end up going into the neural network.

To get the target values,evaluate the neural networkwith the X we just created:

y = nn(X, p_nn, st)[1] |> vec

(Here we assume the neural networkoutputs a single numberper input pair.)

Now we can figure outa mathematical expressionrelating X to y.One way to do thisis to use SymbolicRegression.jl.

After obtaining an Expression object eq(called tree in the README for SymbolicRegression.jl),we can use it in our dynamics functionas described in the previous section:

# `p` needs to be an input even though it's not used.g_eq = (a, b, p) -> eq([a; b;;])[1]f_eq! = (du, u, p, t) -> f!(du, u, p, t; g = g_eq)

Using SymbolicRegression.jlto fit the trained UDE neural network,we could provide our clientwith a precise mathematical expressionrepresenting the dynamics that were missingfrom their original differential equation.

Summary

In this post,we discussed vital aspectsof implementing UDEsto learn missing dynamics.We learned some tipsfor helping UDE training,including appropriately initializing network weightsand normalizing network inputs.We also saw the benefitsof code modularityand how that can easily allow usto insert a learned mathematical expressioninto the dynamics.For our client,we were able to use these ideasto train a UDEto achieve 30% less errorin the model predictions.

Do you want to leverage UDEsto improve your models’ predictive capabilities?Contact us, and we can help you out!

Additional Links

]]>