Overview

Background

Similar to JavaScript’s relationship with JSON, where JSON can be converted directly into JS objects, a lot of C code consuming JSON is basically trying to turn the JSON into a struct that matches the schema. That is not trivial, whether you use the traditional JSON -> object model -> struct route or a stream parser like lejp: members may appear in any order, and there may be deeper structure such as lists of other objects. The other direction is also common: you have the information in a struct and need to translate it to JSON conveniently.

For example, a common pattern is to receive JSON representing an object / struct, validate it, and then apply it to an SQL database. Or, you receive a request in a URL or JSON and need to return the SQL database query results as JSON. These patterns boil down to JSON -> struct -> SQL, or SQL -> struct -> JSON.

lws_struct overview

Doing it by hand is possible if it’s one or two instances, but if it’s the basic bread-and-butter of your application and is done dozens or hundreds of times throughout the code with different structs and schemas, having to deal with it at a low level quickly overwhelms any chance of maintaining it. Trying to manually deal with schemas where, eg, the struct contains a list of different structs that themselves contain lists of different structs, gets out of hand in terms of the amount of custom code needed.

Features

lws_struct lets you describe the struct members you want to convert in a table, which generic apis can then use to serialize and deserialize your actual structs, or lists of structs, between JSON, on-heap structs, and sqlite3 interchangeably. lws_dll2 support is built in, so, eg, it handles linked-lists of subobjects that can be manipulated before consuming them as structs. It also natively uses lwsac for heap storage, so deserialized objects exist inside a single logical chained heap allocation that can be destroyed in one step, no matter how complex the result or how many objects were allocated inside, without having to walk the objects. Strings pointed to by struct members are also allocated inside the same lwsac.
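
To make the "destroyed in one step" idea concrete, here is a minimal sketch of a chained-chunk allocator. It is not the lwsac implementation or its api: the names chunk_t, arena_alloc, arena_strdup and arena_free are hypothetical, and it only assumes the general technique of carving allocations out of a chain of larger chunks.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names: this is not the lwsac api, only the same idea */
typedef struct chunk {
    struct chunk *next;  /* older chunk in the chain */
    size_t        size;  /* usable bytes following this header */
    size_t        used;  /* bytes already handed out */
} chunk_t;

/* Allocate len bytes from the newest chunk, chaining a new chunk if full */
static void *
arena_alloc(chunk_t **head, size_t len, size_t chunk_size)
{
    chunk_t *c;
    void *p;

    assert(head != NULL);
    c = *head;

    /* round up so every allocation stays pointer-aligned */
    len = (len + sizeof(void *) - 1) & ~(sizeof(void *) - 1);

    if (!c || c->size - c->used < len) {
        size_t sz = len > chunk_size ? len : chunk_size;

        c = malloc(sizeof(*c) + sz);
        if (!c)
            return NULL;
        c->next = *head;
        c->size = sz;
        c->used = 0;
        *head = c;
    }

    p = (char *)(c + 1) + c->used;
    c->used += len;

    return p;
}

/* Copy a string into the arena so it shares the arena's lifetime */
static char *
arena_strdup(chunk_t **head, const char *s, size_t chunk_size)
{
    size_t n = strlen(s) + 1;
    char *d = arena_alloc(head, n, chunk_size);

    if (d)
        memcpy(d, s, n);

    return d;
}

/* Free every chunk in one pass, without knowing what was stored inside */
static void
arena_free(chunk_t **head)
{
    assert(head != NULL);

    while (*head) {
        chunk_t *next = (*head)->next;

        free(*head);
        *head = next;
    }
}
```

The point of the design is that the objects and the strings they point to never need individual frees: the caller keeps one head pointer and releases everything through it.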

JSON itself, and the lws_struct approach of producing explicit schemas, burn some transmission efficiency. But it’s really easy to look at packets and understand what is going on, and much easier to produce and understand code translating between JSON / structs / sqlite3. If your usecase is nontrivial, you may care a lot more about keeping the complexity manageable than about some bloat in the data.

Glossary

Serialization

The process of encoding some or all members of a struct suitable for storage or transmission in Sqlite3 or JSON, be they strings, numbers, arrays of objects etc.

Deserialization

The process of decoding a Sqlite3 record or JSON back into a C struct, together with copies of any strings or other objects it referenced.

Preparing your objects for lws_struct

First, define your C struct as usual. It can have other members that are not part of the serialization or that are, eg, absent in a particular JSON object. Any structs that are instantiated are zeroed by default, so other members and unspecified members become zero or NULL.

    typedef struct mystruct {
        lws_dll2_t      list;  /* not serialized, optional list we are part of */
        char            fixstring[30];
        const char      *varstring;
        int             value;
    } mystruct_t;

To use lws_struct, you would first mark up the serializable members using mapping helpers that lws defines for you. It’s OK if some members have no markup: they will be skipped for serialization and left as NULL / 0 on deserialization until you set them. You only need one of these “maps” per struct type that you will serialize or deserialize.

Member Helper     Functionality
----------------  ------------------------------------------------------------
LSM_SIGNED        Signed integer; size will be discovered by sizeof
LSM_UNSIGNED      Unsigned integer; size will be discovered by sizeof
LSM_BOOLEAN       true or false; size will be discovered by sizeof
LSM_CARRAY        C string array; size will be discovered by sizeof
LSM_STRING_PTR    const char * string pointer
LSM_LIST          A lws_dll2_t list of other objects (ie, [ {...}, ... ])
LSM_CHILD_PTR     A single pointer to an object of a given type

The general format is to give the struct type, the member name in that type, and an export name for the member. The export name is used as the JSON field name and as the sqlite3 schema field name for the member.

    const lws_struct_map_t lsm_mystruct[] = {
        LSM_CARRAY      (mystruct_t, fixstring, "fixstring"),
        LSM_STRING_PTR  (mystruct_t, varstring, "varstring"),
        LSM_SIGNED      (mystruct_t, value,     "value"), /* value is an int */
    };

It is possible to model serializable, typed lists-of-objects in members, but for simplicity let’s just stick with these simple types. The arguments to each helper are the struct type, then the member name in the struct, then the name to use in JSON or as the sqlite3 column.
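
As an illustration of how a helper can discover a member’s offset and size from just the type and member name, here is a minimal sketch. It is not the real lws_struct_map_t layout or the real LSM_ macros: member_map_t, MAP_ENTRY and map_set_signed are hypothetical names, and the sketch only assumes the standard offsetof / sizeof technique.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { MT_SIGNED, MT_CARRAY, MT_STRING_PTR } member_type_t;

/* Hypothetical map entry: not the real lws_struct_map_t layout */
typedef struct member_map {
    const char    *name;  /* export name: JSON key / SQL column */
    size_t         ofs;   /* offset of the member inside the struct */
    size_t         size;  /* sizeof the member */
    member_type_t  type;
} member_map_t;

/* Hypothetical helper: captures offset and size at compile time */
#define MAP_ENTRY(ctype, member, type, export_name) \
    { export_name, offsetof(ctype, member), \
      sizeof(((ctype *)0)->member), type }

typedef struct mystruct {
    char        fixstring[30];
    const char *varstring;
    int         value;
} mystruct_t;

static const member_map_t map_mystruct[] = {
    MAP_ENTRY(mystruct_t, fixstring, MT_CARRAY,     "fixstring"),
    MAP_ENTRY(mystruct_t, varstring, MT_STRING_PTR, "varstring"),
    MAP_ENTRY(mystruct_t, value,     MT_SIGNED,     "value"),
};

/*
 * Generic setter: store v into the signed member called name, for any
 * struct described by map.  Returns 0 on success, -1 if there is no such
 * signed member or its width is not handled by this sketch.
 */
static int
map_set_signed(const member_map_t *map, size_t entries, void *obj,
               const char *name, long long v)
{
    assert(map && obj && name);

    for (size_t n = 0; n < entries; n++) {
        char *dst;

        if (map[n].type != MT_SIGNED || strcmp(map[n].name, name))
            continue;

        dst = (char *)obj + map[n].ofs;
        if (map[n].size == sizeof(int)) {
            int i = (int)v;

            memcpy(dst, &i, sizeof(i));
            return 0;
        }
        if (map[n].size == sizeof(long long)) {
            memcpy(dst, &v, sizeof(v));
            return 0;
        }
        return -1;
    }

    return -1;
}
```

Because the table carries offsets and sizes, one generic function can fill in members of any described struct; nothing in map_set_signed knows about mystruct_t.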

The toplevel types used with lws_struct need at least one “schema” description.

    const lws_struct_map_t lsm_schema_map_mystruct[] = {
        LSM_SCHEMA  (mystruct_t, NULL,
                     lsm_mystruct,      "mystruct-schema-name"),
    };

The last entry is used as the value of a “schema” member in JSON, or as the table name to use for this type of object in sqlite3. It’s possible to have different SCHEMA structs using the same lws_struct_map_t, so you can use different names for the JSON schema and sqlite3 table cases.

The toplevel schema naming allows patterns like receiving arbitrary lws_struct messages for different purposes, and using the schema to understand what kind of message you have and what type of struct it would instantiate to: basically a polymorphic deserialization to the correct C type.
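
The schema-name dispatch can be pictured with a small standalone sketch. This is not how lws_struct is implemented internally: schema_entry_t, msg_a_t, msg_b_t and instantiate_by_schema are hypothetical names, and the sketch only assumes a table that pairs each schema name with the size of the C type to instantiate.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical schema table entry, for illustration only */
typedef struct schema_entry {
    const char *name;         /* value expected in the "schema" member */
    size_t      struct_size;  /* sizeof the C type to instantiate */
} schema_entry_t;

/* Two hypothetical message types */
typedef struct { int value; } msg_a_t;
typedef struct { char text[16]; int n; } msg_b_t;

static const schema_entry_t schemas[] = {
    { "msg-a-schema", sizeof(msg_a_t) },
    { "msg-b-schema", sizeof(msg_b_t) },
};

/*
 * Look up the schema name, allocate a zeroed object of the matching type
 * and report which table entry matched.  Returns NULL with *index set to
 * -1 if the name is unknown (or NULL on allocation failure).
 */
static void *
instantiate_by_schema(const schema_entry_t *s, size_t count,
                      const char *name, int *index)
{
    assert(s && name && index);

    for (size_t n = 0; n < count; n++)
        if (!strcmp(s[n].name, name)) {
            *index = (int)n;
            return calloc(1, s[n].struct_size);
        }

    *index = -1;

    return NULL;
}
```

The caller then switches on the returned index to decide which struct type to cast the object to.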

lws_struct for JSON

When lws_struct produces JSON output, it includes a “schema” entry with the name given above, “mystruct-schema-name”. When it’s asked to parse JSON back into an object, it checks through the array of schemas it was given to find out which matching object to instantiate, for the above, a mystruct_t.

The code to parse the incoming JSON object into a struct is:

    struct lejp_ctx ctx;
    lws_struct_args_t a;
    mystruct_t *ms;
    int m;

    memset(&a, 0, sizeof(a));
    a.map_st[0] = lsm_schema_map_mystruct;
    a.map_entries_st[0] = LWS_ARRAY_SIZE(lsm_schema_map_mystruct);
    a.ac_block_size = 512; /* lwsac chunk size for the parsed objects */

    lws_struct_json_init_parse(&ctx, NULL, &a);
    /* in / len: the caller's buffer holding the JSON, and its length */
    m = (int)(signed char)lejp_parse(&ctx, in, len);
    if (m < 0) {
        lwsl_notice("%s: JSON decode failed '%s'\n",
                __func__, lejp_error_to_string(m));
        lwsac_free(&a.ac); /* drop anything allocated before the error */
        return m;
    }

    if (!a.dest) {
        lwsac_free(&a.ac);
        return 1;
    }

    /* parsed object is pointed-to by a.dest, a.top_schema_index says
     * which schema it is, 0 = first in map array, etc */

    switch (a.top_schema_index) {
    case 0:
        ms = (mystruct_t *)a.dest;
        ...
        break;
    }

    ...

    lwsac_free(&a.ac); /* destroy everything from the parse action */

You can see it easily extends to being able to parse a bunch of different schemas into different structs if the map array contained more SCHEMA entries.

Conversely, the code to emit a single JSON object from a struct looks like this:

    ... mystruct_t *ms ...

    uint8_t buf[1024], *start = buf, *end = buf + sizeof(buf) - 1, *p = start;
    lws_struct_serialize_t *js = lws_struct_json_serialize_create(
                        lsm_schema_map_mystruct,
                        LWS_ARRAY_SIZE(lsm_schema_map_mystruct), 0, ms);
    size_t w;

    if (!js)
        return -1;

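    /* the return value says whether the JSON is complete or whether
     * another call with a fresh buffer is needed; a small object like
     * this one fits in a single pass */
    lws_struct_json_serialize(js, p, (size_t)(end - p), &w);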
    lws_struct_json_serialize_destroy(&js);

    /* w = number of bytes used from p */

This will emit something like:

    {
        "schema": "mystruct-schema-name",
        "fixstring": "whatever",
        "varstring": "blah",
        "value": 1
    }
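
To show why one table is enough to drive this kind of output, here is a minimal standalone sketch of table-driven serialization into a bounded buffer. It is not the lws_struct serializer: mmap_t and sketch_serialize are hypothetical names, the output is compact rather than pretty-printed, and string escaping is deliberately left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef enum { MT_SIGNED_INT, MT_CARRAY, MT_STRING_PTR } mtype_t;

/* Hypothetical member description, for illustration only */
typedef struct {
    const char *name;  /* JSON member name */
    size_t      ofs;   /* offset of the member in the struct */
    mtype_t     type;
} mmap_t;

typedef struct mystruct {
    char        fixstring[30];
    const char *varstring;
    int         value;
} mystruct_t;

static const mmap_t map_mystruct[] = {
    { "fixstring", offsetof(mystruct_t, fixstring), MT_CARRAY },
    { "varstring", offsetof(mystruct_t, varstring), MT_STRING_PTR },
    { "value",     offsetof(mystruct_t, value),     MT_SIGNED_INT },
};

/*
 * Emit obj as one JSON object into buf.  Returns the number of bytes
 * written (excluding the NUL), or -1 if buf is too small.
 * No string escaping is done: an assumption that keeps the sketch short.
 */
static int
sketch_serialize(const char *schema, const mmap_t *map, size_t entries,
                 const void *obj, char *buf, size_t len)
{
    size_t used;
    int n;

    assert(schema && map && obj && buf);

    n = snprintf(buf, len, "{\"schema\":\"%s\"", schema);
    if (n < 0 || (size_t)n >= len)
        return -1;
    used = (size_t)n;

    for (size_t i = 0; i < entries; i++) {
        const char *src = (const char *)obj + map[i].ofs;
        const char *s;
        int v;

        switch (map[i].type) {
        case MT_CARRAY:      /* the member is the char array itself */
            n = snprintf(buf + used, len - used, ",\"%s\":\"%s\"",
                         map[i].name, src);
            break;
        case MT_STRING_PTR:  /* the member holds a pointer to the string */
            memcpy(&s, src, sizeof(s));
            n = snprintf(buf + used, len - used, ",\"%s\":\"%s\"",
                         map[i].name, s ? s : "");
            break;
        default:             /* MT_SIGNED_INT */
            memcpy(&v, src, sizeof(v));
            n = snprintf(buf + used, len - used, ",\"%s\":%d",
                         map[i].name, v);
            break;
        }
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }

    if (len - used < 2)  /* need room for '}' and the NUL */
        return -1;
    buf[used++] = '}';
    buf[used] = '\0';

    return (int)used;
}
```

Unlike this sketch, which simply fails when the buffer is too small, a production serializer would normally be able to resume into a fresh buffer.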

JSON serialization of lists

You can use a slightly different schema type to indicate that you will give an lws_dll2 list of the structures (list is the member name of the list in mystruct_t) instead of a pointer to the structure itself:

    typedef struct mystruct_owner {
        lws_dll2_owner_t  mylist; /* list of mystructs */
    } mystruct_owner_t;

    static const lws_struct_map_t lsm_mystruct_owner[] = {
        LSM_LIST    (mystruct_owner_t, mylist, mystruct_t,
                     list, NULL, lsm_mystruct, "mylist"),
    };

    const lws_struct_map_t lsm_schema_map_mystruct_list[] = {
        LSM_SCHEMA_DLL2 (mystruct_owner_t, mylist, NULL, lsm_mystruct_owner,
                                  "mystruct-list-schema-name"),
    };

It still points to the same underlying lsm_mystruct definition above, but instead of just binding the schema name to it, it introduces a toplevel lws_dll2_owner_t list and identifies the list member in the objects. This way it’s the list owner that is passed in as the thing that’s actually being dumped. It will produce output like this (with the number of elements in the mylist [...] reflecting the number of entries in the list):

    {
        "schema": "mystruct-list-schema-name",
        "mylist": [
            {
                "schema": "mystruct-schema-name",
                "fixstring": "whatever",
                "varstring": "blah",
                "value": 1
            },
            {
                "schema": "mystruct-schema-name",
                "fixstring": "something",
                "varstring": "something else",
                "value": 2
            }
        ]
    }
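
Walking a list whose link is embedded in each object relies on recovering the containing struct from the address of its list member. The following is a minimal sketch of that technique with a hypothetical singly-linked list standing in for lws_dll2: node_t, owner_t, container_of, owner_add_tail and owner_sum_values are illustrative names, not lws apis.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal intrusive list, standing in for lws_dll2 */
typedef struct node {
    struct node *next;
} node_t;

typedef struct owner {
    node_t *head, *tail;
    size_t  count;
} owner_t;

typedef struct mystruct {
    int    value;
    node_t list;  /* deliberately not first, so the offset matters */
} mystruct_t;

/* Recover the containing struct from a pointer to its embedded member */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Append the embedded node n to the end of the owner's list */
static void
owner_add_tail(owner_t *o, node_t *n)
{
    assert(o && n);

    n->next = NULL;
    if (o->tail)
        o->tail->next = n;
    else
        o->head = n;
    o->tail = n;
    o->count++;
}

/*
 * Visit every object on the list in order.  A serializer would emit one
 * JSON array element per visit; this sketch just sums the value members.
 */
static long
owner_sum_values(const owner_t *o)
{
    long sum = 0;

    assert(o);

    for (node_t *n = o->head; n; n = n->next)
        sum += container_of(n, mystruct_t, list)->value;

    return sum;
}
```

This is why the list map entry has to name both the owner member and the list member inside the child type: the offset of the embedded link is what turns a list node back into an object.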

Using the same lws_struct metadata that produced this, the recipient can turn it back into the same structs, including the object list as an lws_dll2. And again reusing the same metadata, it can store those structs in sqlite3, and recover them back from there later into structs.

What did we learn this time?

  • If your data follows a lifecycle of JSON for transport, a struct for processing, and maybe sqlite3 for storage, lws_struct can help formalize handling it at each step and drastically reduce the code involved.

  • Although the member description is overhead, you only have to do it once and it works for both JSON and sqlite cases. It’s also smaller and much easier to maintain than the equivalent code in all 3 cases in both directions.