Julia, custom serialization with JSON.jl

By: Picaud Vincent

Re-posted from: https://pixorblog.wordpress.com/2026/05/05/julia-custom-serialization-with-json-jl/

Introduction

The GitHub:JSON3.jl package has been deprecated. That bothered me a little because I had to migrate a lot of my code to use GitHub:JSON.jl. Luckily, the migration turned out to be easier than I expected.

My use case is a bit special: I have to serialize my structures with type information so that I can retrieve the exact types after deserialization.

I know about GitHub:BSON.jl (see also Wiki:BSON) and Julia:Serialization, but I didn’t want to use them because they produce binary files. I wanted to keep a human‑readable format.

In this note I give a minimal working example that might save you some time.

Code

We’ll need the JSON.jl package. We also use StaticArrays.jl to show how to preserve the right vector type when deserializing an AbstractVector.

using JSON
using StaticArrays 

Let’s imagine we have an abstract type Abstract_Foo and two concrete types: Foo_A and Foo_B.

abstract type Abstract_Foo end

@nonstruct struct Foo_A{V <: AbstractVector}  <: Abstract_Foo
    v::V
    x::Float64
end

@nonstruct struct Foo_B <: Abstract_Foo
    v::AbstractVector
    n::Int
end 

Nothing special here, except the @nonstruct macro. That macro comes from GitHub:StructUtils.jl, a package used by JSON.jl to automate common struct operations (construction, etc.).

Using Doc:@nonstruct in front of a struct definition marks it as “special”. You tell JSON.jl to treat it as a primitive type that should be converted directly using lift() and lower() methods, rather than constructing it from field values. In short, you have to do all the work by hand, but you also get all the freedom to serialize and deserialize the structure however you want.

Serialization

During serialization the lower() method is called. We save the field values and also any type information needed for deserialization. Personally, I store this information in a field called type that holds the type of the structure. The name type isn’t special; you could call it internal_type, but I think it’s good practice to adopt a convention and stick to it.

function StructUtils.lower(to_serialize::Foo_A)

    return (type = string(typeof(to_serialize)),
            v = to_serialize.v,
            x = to_serialize.x)
end

For Foo_B, it’s a bit more complicated: the v field is declared as an AbstractVector, so typeof(to_serialize) does not record the concrete vector type. We need an extra field to save that information:

function StructUtils.lower(to_serialize::Foo_B)

    return (type = string(typeof(to_serialize)),
            v_type = string(typeof(to_serialize.v)),
            v = to_serialize.v,
            n = to_serialize.n)
end

Demonstration

Here’s a demonstration of serialization:

a = Foo_A(@SVector(Int[1,2]),1.2)

a_json_str = JSON.json(a, pretty=true)
{
  "type": "Foo_A{SVector{2, Int64}}",
  "v": [
    1,
    2
  ],
  "x": 1.2
}

Now for Foo_B

b = Foo_B(Float16[3,4],34)

b_json_str = JSON.json(b, pretty=true)
{
  "type": "Foo_B",
  "v_type": "Vector{Float16}",
  "v": [
    3.0,
    4.0
  ],
  "n": 34
}

Deserialization

To deserialize you have to define the lift() methods.

First, we intercept all Abstract_Foo occurrences and extract the concrete type. At this point the type is still a String; to turn it into a Julia DataType we use Meta.parse() and Base.eval(). Once we have the actual type, we continue deserialization with it.

function StructUtils.lift(type::Type{<:Abstract_Foo},
                          to_deserialize)

    actual_type = Base.eval(Main,Meta.parse(to_deserialize.type))
    StructUtils.lift(actual_type,to_deserialize)
end
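The string-to-type step can be tried in isolation with Base only. The sketch below first round-trips a type through its string form, as the lift() method above does, and then shows an alternative under the assumption that the set of serialized types is known in advance: a lookup table that avoids evaluating text read from a file. The names KNOWN_TYPES and resolve_type are hypothetical and are not part of JSON.jl or StructUtils.jl.

```julia
# Round trip: type -> String -> type, evaluated in Main.
T = Vector{Float16}
type_name = string(T)
T_back = Base.eval(Main, Meta.parse(type_name))

# Alternative sketch (assumption: the serialized types are known in advance):
# resolve names through an explicit table instead of calling eval.
const KNOWN_TYPES = Dict{String,Type}(
    string(S) => S for S in (Vector{Float16}, Vector{Int})
)

# Hypothetical helper: return the registered type or fail loudly.
function resolve_type(name::AbstractString)
    haskey(KNOWN_TYPES, name) || throw(ArgumentError("unknown type name: $name"))
    return KNOWN_TYPES[name]
end

T_lookup = resolve_type(type_name)
```

Note that the eval approach only works if every name appearing in the string (for example SVector in the Foo_A case) is visible in the module passed to Base.eval.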

Now we define lift() for the specific concrete types. Be careful to define these methods for every concrete subtype: if one is missing, the call in the previous function dispatches back to that same function, and you get an infinite recursion. It would be nice to detect this situation, but how? (feel free to add a comment 🙂 )

For Foo_A:

function StructUtils.lift(type::Type{<:Foo_A{V}},
                          to_deserialize) where {V<:AbstractVector}

    v = StructUtils.lift(V,to_deserialize.v) # deserialize vect.
    x = to_deserialize.x

    type(v,x)
end 

For Foo_B:

function StructUtils.lift(type::Type{<:Foo_B},
                          to_deserialize)

    v_type = Base.eval(Main,Meta.parse(to_deserialize.v_type))
    v = StructUtils.lift(v_type,to_deserialize.v) # deserialize vect.
    n = to_deserialize.n

    type(v,n)
end 
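One possible way to detect the missing-method case mentioned above is to ask Julia which method a call would dispatch to before recursing. The sketch below uses only Base and a hypothetical stand-in function mylift with made-up types AbstractShape, Circle, and Square; the SHAPES table is an assumption standing in for the string-to-type step. It has not been tried against StructUtils.lift itself, so take it as an idea rather than a ready-made fix.

```julia
abstract type AbstractShape end

struct Circle <: AbstractShape
    r::Float64
end

struct Square <: AbstractShape
    side::Float64
end

# Hypothetical lookup table standing in for the string-to-type step.
const SHAPES = Dict{String,Type}("Circle" => Circle, "Square" => Square)

# Fallback method: resolves the concrete type, then re-dispatches on it.
function mylift(::Type{T}, data) where {T<:AbstractShape}
    actual = SHAPES[data.type]
    # Method object of this fallback itself.
    fallback = which(mylift, Tuple{Type{AbstractShape},Any})
    # If the concrete type would dispatch to the fallback again, stop here
    # instead of recursing forever.
    if which(mylift, Tuple{Type{actual},Any}) == fallback
        error("no specialized mylift method for $actual")
    end
    return mylift(actual, data)
end

# Specialized method for Circle only; Square is left out on purpose.
mylift(::Type{Circle}, data) = Circle(data.r)

circle = mylift(AbstractShape, (type = "Circle", r = 2.0))
```

With this check, asking for a Square raises an error at the first call instead of overflowing the stack.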

Demonstration

Notice that we don’t need to give the exact type; Abstract_Foo is enough.

JSON.parse(a_json_str,Abstract_Foo)
Foo_A{SVector{2, Int64}}([1, 2], 1.2)
JSON.parse(b_json_str,Abstract_Foo)
Foo_B(Float16[3.0, 4.0], 34)

Remarks

@kwdef and @nonstruct together

You cannot use @kwdef and @nonstruct together. The following code generates an error:

@nonstruct @kwdef struct Foo_C <: Abstract_Foo
end

The solution is to do the work of @nonstruct by hand. First, look at what this macro does:

@macroexpand @nonstruct  struct Foo_C <: Abstract_Foo
end
quote
    begin
        $(Expr(:meta, :doc))
        struct Foo_C <: Abstract_Foo
        end
    end
    StructUtils.structlike(::StructUtils.StructStyle, ::Type{<:Foo_C}) = false
end

So the fix is simply to replace

@nonstruct @kwdef struct Foo_C <: Abstract_Foo
end

by

@kwdef struct Foo_C <: Abstract_Foo
end

StructUtils.structlike(::StructUtils.StructStyle,
                       ::Type{<:Foo_C}) = false

Writing / reading file

Please follow the JSON.jl official doc, nothing special here:

JSON.json(file, a, pretty=true)      # write file
JSON.parsefile(file, Abstract_Foo)   # read file

Complete code

To make your life easier, here’s the complete code:

using JSON
using StaticArrays

abstract type Abstract_Foo end

@nonstruct struct Foo_A{V <: AbstractVector}  <: Abstract_Foo
    v::V
    x::Float64
end

@nonstruct struct Foo_B <: Abstract_Foo
    v::AbstractVector
    n::Int
end

function StructUtils.lower(to_serialize::Foo_A)

    return (type = string(typeof(to_serialize)),
            v = to_serialize.v,
            x = to_serialize.x)
end

function StructUtils.lower(to_serialize::Foo_B)

    return (type = string(typeof(to_serialize)),
            v_type = string(typeof(to_serialize.v)),
            v = to_serialize.v,
            n = to_serialize.n)
end

a = Foo_A(@SVector(Int[1,2]),1.2)

a_json_str = JSON.json(a, pretty=true)

println(a_json_str)

b = Foo_B(Float16[3,4],34)

b_json_str = JSON.json(b, pretty=true)

println(b_json_str)

function StructUtils.lift(type::Type{<:Abstract_Foo},
                          to_deserialize)

    actual_type = Base.eval(Main,Meta.parse(to_deserialize.type))
    StructUtils.lift(actual_type,to_deserialize)
end

function StructUtils.lift(type::Type{<:Foo_A{V}},
                          to_deserialize) where {V<:AbstractVector}

    v = StructUtils.lift(V,to_deserialize.v) # deserialize vect.
    x = to_deserialize.x

    type(v,x)
end

function StructUtils.lift(type::Type{<:Foo_B},
                          to_deserialize)

    v_type = Base.eval(Main,Meta.parse(to_deserialize.v_type))
    v = StructUtils.lift(v_type,to_deserialize.v) # deserialize vect.
    n = to_deserialize.n

    type(v,n)
end

JSON.parse(a_json_str,Abstract_Foo)

JSON.parse(b_json_str,Abstract_Foo)

Conclusion

There’s nothing more ridiculous than a conclusion, because nothing is ever finished. But I admit it’s still handy to say goodbye 🙂

Tricksy Tuple Types…

By: Jacob Quinn

Re-posted from: https://quinnj.home.blog/2019/06/19/tricksy-tuple-types/

Some of you may be aware of my obsession with JSON libraries, and it’s true, there’s something about simple data formats that sends my brain into endless brainstorming of ways to optimize reading, writing, and object-mapping in the Julia language. JSON3.jl is my latest attempt at a couple of new ideas for JSON <=> Julia fun. The package is almost ready for a public release, and I promise I’ll talk through some of the fun ideas going on there, but today, just wanted to point out a tricky performance issue that took a bit of sleuthing to track down.

Here’s the scenario: we have a string of JSON like {"a": 1}, super simple right? In the standard Julia JSON.jl library, you just call JSON.parse(str) and get back a Dict{String, Any}. In JSON3.jl, we have a similar “plain parse” option, JSON3.read(str), which returns a custom JSON3.Object type that I can talk about in more detail in another post. Another option in JSON3.jl is JSON3.read(str, Dict{String, Any}), i.e. we can specify the type we’d like to parse from any string of JSON. While doing some quick benchmarking to make sure things looked reasonable, I noticed that JSON3.read(str, Dict{String, Any}) was about 2x slower than both JSON.parse and JSON3.read(str, Dict{String, Int}). Hmmm, what’s going on here??

I first turned to profiling, and used the wonderful StatProfilerHTML.jl package to inspect my profiling results. That’s when I noticed that around 40% of the time was spent on a seemingly simple line of code:

Hmmmm……a return statement with a simple ifelse call? Seems fishy. Luckily, there’s a fun little project called Cthulhu.jl, which allows debugger “stepping” functionality with Julia’s unparalleled code inspection tools (@code_lowered, @code_typed, @code_llvm, etc.). As I “descended into madness” to take a look at the @code_typed of this line of code, I found this:

%1865 = (JSON3.ifelse)(%1864, %1857, %1851)::Union{Float64, Int64}
%1866 = (Core.tuple)(%1853, %1865)::Tuple{Int64,Union{Float64, Int64}}

Ruh-roh Shaggy…….the issue here is this Tuple{Int64,Union{Float64,Int64}} return type. It’s not concrete and leads to worse type inference in later code that tries to access this tuple’s second element. This is also undesirable because we know that the value should be either an Int64 or Float64, so ideally we could structure things so that code generation can just do a single branch and generate nice clean code the rest of the way down. If we change the code to:

Let’s take another cthulic descent and check out the generated code:

%1863 = (%1857 === %1862)::Bool
│ │ @ float.jl:484 within `==' @ float.jl:482
│ │┌ @ bool.jl:40 within `&'
│ ││ %1864 = (Base.and_int)(%1861, %1863)::Bool
│ └└
└──── goto #691 if not %1864
@ /Users/jacobquinn/.julia/dev/JSON3/src/structs.jl:330 within `read' @ /Users/jacobquinn/.julia/dev/JSON3/src/structs.jl:99
690 ─ %1866 = (Core.tuple)(%1853, %1857)::Tuple{Int64,Int64}
└──── goto #693
@ /Users/jacobquinn/.julia/dev/JSON3/src/structs.jl:330 within `read' @ /Users/jacobquinn/.julia/dev/JSON3/src/structs.jl:101
691 ─ %1868 = (Core.tuple)(%1853, %1851)::Tuple{Int64,Float64}
└──── goto #693

Ah, much better! Though there are a few more steps, we can now see we’re getting what we’re after: our return type will be Tuple{Int64,Int64} or Tuple{Int64,Float64} instead of Tuple{Int64,Union{Int64,Float64}}. And the final performance results? Faster than JSON.jl!
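The effect described above can be reproduced with a small stand-in that assumes nothing about the actual JSON3.jl source: the functions number_ifelse and number_branch below are hypothetical. The first returns one tuple whose second element is chosen by ifelse; the second returns from two separate branches. Base.return_types reports what inference concludes for each.

```julia
# Hypothetical stand-ins; this is not the JSON3.jl code.

# One return statement: the second tuple element is picked by ifelse,
# so inference only knows it is either an Int or a Float64.
function number_ifelse(pos::Int, i::Int, f::Float64)
    return pos, ifelse(f == i, i, f)
end

# Two return statements: each branch builds a tuple of concrete types.
function number_branch(pos::Int, i::Int, f::Float64)
    if f == i
        return pos, i
    else
        return pos, f
    end
end

# Ask inference for the return type of each version.
inferred_ifelse = first(Base.return_types(number_ifelse, (Int, Int, Float64)))
inferred_branch = first(Base.return_types(number_branch, (Int, Int, Float64)))

println(inferred_ifelse)
println(inferred_branch)
```

On a 64-bit machine with a recent Julia version, the first is expected to infer as Tuple{Int64, Union{Float64, Int64}} and the second as a Union of the two concrete tuple types, which later code can handle with a single branch.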

Thanks for reading and I’ll try to get things polished up in JSON3.jl soon so you can take it for a spin.

Feel free to follow me on twitter, ask questions, or discuss this post there.

Cheers.