CBOR IoT Data Serialization for Embedded C and Rust

Jared Wolff · 2020.11.19 · 13 Minute Read · embedded · zephyr

When sending data, there are a few ways you can go about it. In the embedded world, it’s not uncommon to serialize data so it can be efficiently sent through the ether. On the other end it’s reassembled and used however necessary. In today’s post we’ll chat about serializing with CBOR (Concise Binary Object Representation). It’s open, widely adopted and available in most popular programming languages.

The idea of CBOR is to take a structure or object and pack it up efficiently for network usage. That way we transmit the least amount of data necessary. This saves bandwidth and, especially for LTE deployments, cost.

For the purpose of this post we’ll be creating a simple codec for use in embedded. On the “cloud” side we’ll use Rust’s serde_cbor to do the heavy lifting for us. We’ll start with the embedded side.

Your embedded options

For the purposes of this post, we’ll work on creating an example you can run on the machine you’re using. The only extra thing you’ll need is to download and include a CBOR library in your project. I’m using QCBOR for this example, but other libraries such as TinyCBOR are available. I’ve chosen QCBOR over the latter because of its ease of use.

Creating the C Codec

First, let’s create a codec that will encode and decode messages into CBOR format.

  1. First, let’s create the header file: telemetry_codec.h

    #include <stddef.h>
    #include <stdint.h>
    #include <qcbor/qcbor.h>
    
    /* Position enum: the integer key each field uses in the CBOR map */
    enum telemetry_data_positions
    {
        telemetry_version_pos,
        telemetry_rssi_pos,
    };
    
    /* Used to store strings */
    typedef struct
    {
        char bytes[64];
        size_t size;
    } telemetry_data_string_t;
    
    /* Struct for telemetry data */
    typedef struct
    {
        telemetry_data_string_t version;
        int32_t rssi;
    
    } telemetry_data_t;
    
    /* Encode function */
    QCBORError telemetry_codec_encode(const telemetry_data_t *p_data, uint8_t *p_buf, size_t len, size_t *p_size);
    

    As you can see above, telemetry_data_t is the data that we’re trying to encode/decode. Within telemetry_data_t is a telemetry_data_string_t, which holds a variable-length string of up to 64 bytes along with its size.

    You’ll also notice that I’m defining an enum called telemetry_data_positions. This defines the position (the integer key) in the CBOR map where each piece of data lives. It replaces the string keys you’d often see in JSON. See below:

    Instead of:

    {"version":"0.1.0"}
    

    You get something like this:

    {0:"0.1.0"}
    

    As you can imagine, you’re saving some bytes here. Even more when you encode using CBOR rather than JSON!

  2. Next we’ll want to define the encode function in telemetry_codec.c. First, using the definition provided in the .h file:

    QCBORError telemetry_codec_encode(const telemetry_data_t *p_data, uint8_t *p_buf, size_t len, size_t *p_size)
    

    Within the function we want to set up a UsefulBuf which we’ll be encoding the data into. There’s also a QCBOREncodeContext that will contain the state of the encoder as it’s used:

        // Setup of the goods
        UsefulBuf buf = {
            .ptr = p_buf,
            .len = len};
        QCBOREncodeContext ec;
        QCBOREncode_Init(&ec, buf);
    

    There are a handful of ways to encode data. To emulate a JSON key/value store we’ll be putting our values into a map:

        /* Create over-arching map */
        QCBOREncode_OpenMap(&ec);
    

    This allows us to use functions like QCBOREncode_AddBytesToMapN and QCBOREncode_AddInt64ToMapN to add data along with a position key:

        /* Add the version string */
        UsefulBufC data = {
            .ptr = p_data->version.bytes,
            .len = p_data->version.size};
        QCBOREncode_AddBytesToMapN(&ec, telemetry_version_pos, data);
    
        /* Add the rssi */
        QCBOREncode_AddInt64ToMapN(&ec, telemetry_rssi_pos, p_data->rssi);
    

    As you can see above, you wrap the version bytes in a UsefulBufC before adding them to the map. Single values are added with QCBOREncode_AddInt64ToMapN or similar. This function sizes the CBOR output appropriately and emits the fewest bytes necessary. Thus, in practice it will almost never send a full 64-bit number.

    Finally, make sure you close out the map and finish encoding!

        /* Close the map*/
        QCBOREncode_CloseMap(&ec);
    
        /* Finish !*/
        return QCBOREncode_FinishGetSize(&ec, p_size);
    

    In the end your telemetry_codec.c should look something like this:

    #include <telemetry_codec.h>
    #include <stdio.h>
    
    /* Encode function */
    QCBORError telemetry_codec_encode(const telemetry_data_t *p_data, uint8_t *p_buf, size_t len, size_t *p_size)
    {
    
        // Setup of the goods
        UsefulBuf buf = {
            .ptr = p_buf,
            .len = len};
        QCBOREncodeContext ec;
        QCBOREncode_Init(&ec, buf);
    
        /* Create over-arching map */
        QCBOREncode_OpenMap(&ec);
    
        /* Add the version string */
        UsefulBufC data = {
            .ptr = p_data->version.bytes,
            .len = p_data->version.size};
        QCBOREncode_AddBytesToMapN(&ec, telemetry_version_pos, data);
    
        /* Add the rssi */
        QCBOREncode_AddInt64ToMapN(&ec, telemetry_rssi_pos, p_data->rssi);
    
        /* Close the map*/
        QCBOREncode_CloseMap(&ec);
    
        /* Finish !*/
        return QCBOREncode_FinishGetSize(&ec, p_size);
    }
    

    That’s all that’s needed to encode a telemetry_data_t structure! I’ve also created a decode function in the example code. You can get that at the bottom of this post.
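
If you want to sanity-check what the encoder should produce, you can build the same message by hand. The sketch below is not QCBOR and not part of the example code: cbor_put_head, cbor_put_int, cbor_put_bytes and telemetry_encode_by_hand are made-up helper names, and it only covers the three CBOR types this message uses (map, integer, byte string). It assumes definite-length items written in their shortest form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a CBOR head (major type + argument) in its shortest form.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t cbor_put_head(uint8_t *out, size_t cap, uint8_t major, uint64_t arg)
{
    assert(out != NULL && major <= 7);
    uint8_t type = (uint8_t)(major << 5);
    size_t extra; /* argument bytes that follow the first byte */

    if (arg < 24)               extra = 0;
    else if (arg <= 0xFF)       extra = 1;
    else if (arg <= 0xFFFF)     extra = 2;
    else if (arg <= 0xFFFFFFFF) extra = 4;
    else                        extra = 8;

    if (cap < 1 + extra)
        return 0;

    if (extra == 0) {
        out[0] = (uint8_t)(type | arg);
    } else {
        /* additional info 24, 25, 26, 27 announces 1, 2, 4, 8 argument bytes */
        uint8_t info = (extra == 1) ? 24 : (extra == 2) ? 25 : (extra == 4) ? 26 : 27;
        out[0] = (uint8_t)(type | info);
        for (size_t i = 0; i < extra; i++)
            out[1 + i] = (uint8_t)(arg >> (8 * (extra - 1 - i)));
    }
    return 1 + extra;
}

/* Integers: major type 0 for n >= 0, major type 1 with argument (-1 - n) for n < 0. */
static size_t cbor_put_int(uint8_t *out, size_t cap, int64_t v)
{
    if (v >= 0)
        return cbor_put_head(out, cap, 0, (uint64_t)v);
    return cbor_put_head(out, cap, 1, (uint64_t)(-1 - v));
}

/* Byte string: major type 2, length as the argument, then the raw bytes. */
static size_t cbor_put_bytes(uint8_t *out, size_t cap, const void *data, size_t len)
{
    size_t n = cbor_put_head(out, cap, 2, len);
    if (n == 0 || cap - n < len)
        return 0;
    memcpy(out + n, data, len);
    return n + len;
}

/* Builds {0: <version bytes>, 1: <rssi>}. Returns the encoded size, or 0 on overflow. */
size_t telemetry_encode_by_hand(uint8_t *out, size_t cap, const char *version, int32_t rssi)
{
    size_t off = 0, n;

    n = cbor_put_head(out + off, cap - off, 5, 2); /* map with 2 pairs */
    if (n == 0) return 0;
    off += n;
    n = cbor_put_int(out + off, cap - off, 0);     /* key 0: version position */
    if (n == 0) return 0;
    off += n;
    n = cbor_put_bytes(out + off, cap - off, version, strlen(version));
    if (n == 0) return 0;
    off += n;
    n = cbor_put_int(out + off, cap - off, 1);     /* key 1: rssi position */
    if (n == 0) return 0;
    off += n;
    n = cbor_put_int(out + off, cap - off, rssi);
    if (n == 0) return 0;
    off += n;
    return off;
}
```

Calling telemetry_encode_by_hand(buf, sizeof(buf), "0.1.0", -49) gives 11 bytes: a2 00 45 30 2e 31 2e 30 01 38 30. The RSSI takes two bytes (38 30) rather than eight, which is the shortest-form sizing at work.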

Creating the HTTP Client

  1. Before getting started you’ll need some dependencies. On Mac you can install cmake and curl using Homebrew.

    > brew install cmake curl
    
  2. You should also create a CMakeLists.txt with the following contents:

    cmake_minimum_required(VERSION 3.8.2)
    
    # Project
    project(CBORClientExample)
    
    # Include directory
    include_directories(include)
    include_directories(lib/QCBOR/inc/)
    
    # Include sources 
    file(GLOB QCBOR_FILES lib/QCBOR/src/*.c)
    
    # Libcurl
    link_libraries(curl)
    
    # Executable
    add_executable(client main.c src/telemetry_codec.c ${QCBOR_FILES})
    

    This allows us to import dependencies and compile everything that’s needed. You’ll also notice that there’s a lib/QCBOR reference. We’ll need to clone the QCBOR repo for that:

    > mkdir -p lib/
    > cd lib
    > git clone https://github.com/laurencelundblade/QCBOR.git
    

    In the example code I added it as a submodule. No cloning necessary. (Initialize with git submodule update --init)

  3. Finally, in main.c we can get to work encoding some data and then sending it via HTTP. First let’s create the data:

    /* Create the object */
    telemetry_data_t data = {
        .rssi = -49,
        .version = {
            .bytes = "0.1.0",
            .size = strlen("0.1.0"),
        }};
    

    Then encode using the telemetry_codec_encode function we just created:

    /* Encode the data */
    uint8_t buf[256];
    size_t size = 0;
    QCBORError err = telemetry_codec_encode(&data, buf, sizeof(buf), &size);
    if (err)
    {
        /* Don't send a half-encoded buffer */
        printf("QCBOR Error: %d\n", err);
        return 1;
    }
    

    Then, using some modified code from the libcurl http-post example, we’ll POST the data to the Rust server we’ll develop in the next section:

        /* Send data using Libcurl */
        /* Adapted from here: https://curl.se/libcurl/c/http-post.html */
        CURL *curl;
        CURLcode res;
    
        /* In windows, this will init the winsock stuff */
        curl_global_init(CURL_GLOBAL_ALL);
    
        /* get a curl handle */
        curl = curl_easy_init();
        if (curl)
        {
            /* First set the URL that is about to receive our POST. This URL can
           just as well be a https:// URL if that is what should receive the
           data. */
            curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:3030/telemetry/1234");
    
            /* Now specify the POST data */
            curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, (long)size); /* option expects a long */
            curl_easy_setopt(curl, CURLOPT_POSTFIELDS, buf);
    
            /* Perform the request, res will get the return code */
            res = curl_easy_perform(curl);
            /* Check for errors */
            if (res != CURLE_OK)
                fprintf(stderr, "curl_easy_perform() failed: %s\n",
                        curl_easy_strerror(res));
    
            /* always cleanup */
            curl_easy_cleanup(curl);
        }
        curl_global_cleanup();
    

    The biggest difference from the example is the usage of CURLOPT_POSTFIELDSIZE and CURLOPT_POSTFIELDS:

    /* Now specify the POST data */
    curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, (long)size); /* option expects a long */
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, buf);
    

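    These options describe the variable-sized data we have to send after encoding. In this case the length is size and the buffer is buf (both defined earlier). Setting the size explicitly matters because CBOR is binary and can contain zero bytes; without it libcurl measures the buffer with strlen.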

    Here’s a full look at the code:

    #include <telemetry_codec.h>
    #include <qcbor/qcbor.h>
    #include <stdio.h>
    #include <string.h> /* strlen */
    #include <curl/curl.h>
    
    int main()
    {
        printf("Start of QCBOR Example\n");
    
        /* Create the object */
        telemetry_data_t data = {
            .rssi = -49,
            .version = {
                .bytes = "0.1.0",
                .size = strlen("0.1.0"),
            }};
    
        /* Encode the data */
        uint8_t buf[256];
        size_t size = 0;
        QCBORError err = telemetry_codec_encode(&data, buf, sizeof(buf), &size);
        if (err)
        {
            /* Don't send a half-encoded buffer */
            printf("QCBOR Error: %d\n", err);
            return 1;
        }
    
        /* Send data using Libcurl */
        /* Adapted from here: https://curl.se/libcurl/c/http-post.html */
        CURL *curl;
        CURLcode res;
    
        /* In windows, this will init the winsock stuff */
        curl_global_init(CURL_GLOBAL_ALL);
    
        /* get a curl handle */
        curl = curl_easy_init();
        if (curl)
        {
            /* First set the URL that is about to receive our POST. This URL can
           just as well be a https:// URL if that is what should receive the
           data. */
            curl_easy_setopt(curl, CURLOPT_URL, "http://localhost:3030/telemetry/1234");
    
            /* Now specify the POST data */
            curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, (long)size); /* option expects a long */
            curl_easy_setopt(curl, CURLOPT_POSTFIELDS, buf);
    
            /* Perform the request, res will get the return code */
            res = curl_easy_perform(curl);
            /* Check for errors */
            if (res != CURLE_OK)
                fprintf(stderr, "curl_easy_perform() failed: %s\n",
                        curl_easy_strerror(res));
    
            /* always cleanup */
            curl_easy_cleanup(curl);
        }
        curl_global_cleanup();
    
        return 0;
    }
    

    If you’re working in an embedded context you’ll likely substitute all the curl calls with an appropriate library of your choice (MQTT, CoAP, LWM2M, HTTP, etc). Since the codec only needs QCBOR as a dependency, it’s quite portable to other platforms. I’ve used this technique both on Zephyr and bare metal.

  4. Building is fairly straightforward after that:

    > cd client
    > cmake .
    -- The C compiler identification is AppleClang 12.0.0.12000032
    -- The CXX compiler identification is AppleClang 12.0.0.12000032
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc - skipped
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ - skipped
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /Users/jaredwolff/Git/cbor-example/client
    >
    > cmake --build .
    Scanning dependencies of target client
    [ 12%] Building C object CMakeFiles/client.dir/main.c.o
    [ 25%] Building C object CMakeFiles/client.dir/src/telemetry_codec.c.o
    [ 37%] Building C object CMakeFiles/client.dir/lib/QCBOR/src/UsefulBuf.c.o
    [ 50%] Building C object CMakeFiles/client.dir/lib/QCBOR/src/ieee754.c.o
    [ 62%] Building C object CMakeFiles/client.dir/lib/QCBOR/src/qcbor_decode.c.o
    [ 75%] Building C object CMakeFiles/client.dir/lib/QCBOR/src/qcbor_encode.c.o
    [ 87%] Building C object CMakeFiles/client.dir/lib/QCBOR/src/qcbor_err_to_str.c.o
    [100%] Linking C executable client
    [100%] Built target client
    

    The client app is ready to roll! Let’s get the server working next.
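
One detail worth seeing in isolation before moving on is why the POST needs an explicit size. This is a small sketch, not part of the example code: example_payload is the byte sequence I would expect the encoder to emit for the data above (an assumption based on the integer-key map layout), and length_as_c_string is a made-up helper that shows what a string-based API would measure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The bytes expected for {0: "0.1.0", 1: -49} (assumed, see the text above). */
const uint8_t example_payload[] = {
    0xA2,                          /* map with 2 key/value pairs */
    0x00,                          /* key 0 (version): a zero byte */
    0x45, '0', '.', '1', '.', '0', /* byte string of length 5 */
    0x01,                          /* key 1 (rssi) */
    0x38, 0x30                     /* -49 */
};

/* Length a string-based API would measure: everything before the first zero byte. */
size_t length_as_c_string(const uint8_t *buf)
{
    assert(buf != NULL);
    return strlen((const char *)buf);
}
```

Here sizeof(example_payload) is 11 while length_as_c_string(example_payload) is 1, because key 0 is itself a zero byte. Without CURLOPT_POSTFIELDSIZE only the first byte would go out.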

Rust server bits

Make sure you have the latest version of Rust installed. The fastest way to do that is with rustup: https://rustup.rs

  1. Once you’re set, we’ll create a new Rust project using cargo:

    cargo init cbor-example
    
  2. Open up the cbor-example folder and update your Cargo.toml file with these dependencies:

    [dependencies]
    warp = "0.2"
    serde = { version = "1.0", features = ["derive"] }
    serde_cbor = "0.11"
    tokio = { version = "0.2", features = ["full"] }
    

    Warp is our HTTP server, serde allows us to efficiently serialize/deserialize data and Tokio allows us to use Warp (which is inherently async).

    Note: warp has yet to upgrade to Tokio version 0.3 so we’re using 0.2 here.

  3. In main.rs we’ll create a very basic warp server:

    use serde::{Deserialize, Serialize};
    use serde_cbor;
    use warp::{hyper::body, Filter};
    
    #[tokio::main]
    async fn main() {
        // POST /telemetry/:id  {"version":"0.1.0","rssi":-49}
        let telemetry = warp::post()
            .and(warp::path("telemetry"))
            .and(warp::path::param::<String>())
            .and(warp::body::content_length_limit(1024))
            .and(warp::body::bytes())
            .map(|id, data: body::Bytes| {
                println!("Message from id: {}", id);
    
                // Code goes here!
    
                warp::reply::json(&{})
            });
    
        warp::serve(telemetry).run(([127, 0, 0, 1], 3030)).await
    }
    

    The payload we’re working with is defined at the top of main.rs:

    #[derive(Deserialize, Serialize)]
    struct TelemetryData {
        version: String,
        rssi: i32,
    }
    
  4. For sending an error, we’ll also create an error type:

    #[derive(Deserialize, Serialize)]
    struct TelemetryError {
        error: String,
    }
    
  5. To translate the bytes into TelemetryData, we’ll need to use serde_cbor to decode the bytes:

    // Get the telemetry
    let telemetry: TelemetryData = match serde_cbor::from_slice(&data) {
        Ok(t) => t,
        Err(_) => {
            // Create error
            let error = TelemetryError {
                error: "Unable to parse telemetry data.".to_string(),
            };
    
            // Return error
            return warp::reply::json(&error);
        }
    };
    

    This uses the from_slice function which turns raw bytes into something useful. I’m using a match expression here to handle the result. If the data is valid, we’re good to go. If the data is invalid, we reply with an error and return early.

That’s all we need to test! In the end your code should look something like this:

use serde::{Deserialize, Serialize};
use serde_cbor;
use warp::{hyper::body, Filter};

#[derive(Deserialize, Serialize, Debug)]
struct TelemetryData {
    version: String,
    rssi: i32,
}

#[derive(Deserialize, Serialize)]
struct TelemetryError {
    error: String,
}

#[tokio::main]
async fn main() {
    // POST /telemetry/:id  {"version":"0.1.0","rssi":-49}
    let telemetry = warp::post()
        .and(warp::path("telemetry"))
        .and(warp::path::param::<String>())
        .and(warp::body::content_length_limit(1024))
        .and(warp::body::bytes())
        .map(|id, data: body::Bytes| {
            println!("Message from id: {}", id);

            // Get the telemetry
            let telemetry: TelemetryData = match serde_cbor::from_slice(&data) {
                Ok(t) => t,
                Err(_) => {
                    // Create error
                    let error = TelemetryError {
                        error: "Unable to parse telemetry data.".to_string(),
                    };

                    // Return error
                    return warp::reply::json(&error);
                }
            };

            println!("Telemetry: {:?}", telemetry);

            warp::reply::json(&{})
        });

    warp::serve(telemetry).run(([127, 0, 0, 1], 3030)).await
}

Make it rain

To test we’ll boot up the Rust server using cargo run.

❯ cargo run
   Compiling cbor-example v0.1.0 (/Users/jaredwolff/Git/cbor-example)
    Finished dev [unoptimized + debuginfo] target(s) in 3.22s
     Running `target/debug/cbor-example`

Then we’ll run the example C code with:

❯ ./client
Start of QCBOR Example

If you’re watching the terminal running the Rust server, you’ll see some new output:

❯ cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.48s
     Running `target/debug/cbor-example`
Message from id: 1234
Telemetry: TelemetryData { version: "0.1.0", rssi: -49 }

Success! ✨ We’ve serialized some CBOR data, transmitted it over HTTP and deserialized it!
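
To make the deserialization step less of a black box, here is a dependency-free C sketch that walks the same payload by hand. It is not how serde_cbor or QCBOR is implemented, only an illustration of what any decoder has to do with these bytes. The names telemetry_t, reader_t, read_head and telemetry_decode_by_hand are made up for this sketch, and it handles only definite-length maps with the two integer keys used in this post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded form: version as a NUL-terminated string. */
typedef struct {
    char version[64];
    int32_t rssi;
} telemetry_t;

/* Cursor over the input buffer. */
typedef struct {
    const uint8_t *p;
    size_t left;
} reader_t;

/* Read one CBOR head. Returns false on truncated or unsupported input. */
static bool read_head(reader_t *r, uint8_t *major, uint64_t *arg)
{
    if (r->left < 1)
        return false;
    uint8_t first = *r->p++;
    r->left--;
    *major = (uint8_t)(first >> 5);
    uint8_t info = first & 0x1F;
    if (info < 24) {                      /* argument stored in the first byte */
        *arg = info;
        return true;
    }
    if (info > 27)                        /* indefinite lengths etc. not handled */
        return false;
    size_t n = (size_t)1 << (info - 24);  /* 1, 2, 4 or 8 argument bytes */
    if (r->left < n)
        return false;
    *arg = 0;
    for (size_t i = 0; i < n; i++)
        *arg = (*arg << 8) | *r->p++;
    r->left -= n;
    return true;
}

bool telemetry_decode_by_hand(const uint8_t *buf, size_t len, telemetry_t *out)
{
    assert(buf != NULL && out != NULL);
    reader_t r = { buf, len };
    uint8_t major;
    uint64_t pairs, key, arg;
    bool have_version = false, have_rssi = false;

    if (!read_head(&r, &major, &pairs) || major != 5)  /* must start with a map */
        return false;

    for (uint64_t i = 0; i < pairs; i++) {
        if (!read_head(&r, &major, &key) || major != 0) /* unsigned integer key */
            return false;
        if (!read_head(&r, &major, &arg))               /* head of the value */
            return false;

        if (key == 0 && (major == 2 || major == 3)) {   /* byte or text string */
            if (arg > r.left || arg >= sizeof(out->version))
                return false;
            memcpy(out->version, r.p, (size_t)arg);
            out->version[arg] = '\0';
            r.p += arg;
            r.left -= (size_t)arg;
            have_version = true;
        } else if (key == 1 && (major == 0 || major == 1)) { /* integer */
            if (arg > INT32_MAX)
                return false;
            out->rssi = (major == 0) ? (int32_t)arg : (int32_t)(-1 - (int64_t)arg);
            have_rssi = true;
        } else {
            return false;                               /* unknown key or wrong type */
        }
    }
    return have_version && have_rssi && r.left == 0;
}
```

Fed the 11 bytes from the client (a2 00 45 30 2e 31 2e 30 01 38 30), this fills in version "0.1.0" and rssi -49, matching the server output above. Dropping the last byte makes it return false instead of reading past the buffer.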

Comparing sizes

We can compare sizes of CBOR versus compact (whitespace-free) JSON for the same TelemetryData. We’ll use the Rust server to do these calculations.

  1. First make sure that you include serde_json in your Cargo.toml

    [dependencies]
    warp = "0.2"
    serde = { version = "1.0", features = ["derive"] }
    serde_cbor = "0.11"
    tokio = { version = "0.2", features = ["full"] }
    serde_json = "1.0"
    
  2. Then we’ll add some extra bits in the POST function before decoding the CBOR. In this case I’m printing the size of the binary CBOR data in bytes:

    println!("Message from id: {} size: {} bytes", id, data.len());
    
  3. After decoding we’ll encode to JSON and get the size of the JSON string.

    // Now encode to json
    let json = match serde_json::to_string(&telemetry) {
        Ok(j) => j,
        Err(_) => {
            // Create error
            let error = TelemetryError {
                error: "Unable to parse telemetry data to JSON.".to_string(),
            };
    
            // Return error
            return warp::reply::json(&error);
        }
    };
    println!("JSON message size: {} bytes", json.len());
    
  4. Make sure you also add use serde_json; to the top of your file!

  5. Re-run cargo run and then re-run the C client. The results?

    Message from id: 1234 size: 11 bytes
    Telemetry: TelemetryData { version: "0.1.0", rssi: -49 }
    JSON message size: 30 bytes
    

    CBOR is roughly a third the size of JSON for the exact same data (11 bytes versus 30)! That’s what I’m talking about! Now imagine multiplying that by however many API calls you expect and it starts adding up quite fast!

There is something important here about the savings though. The way I am indexing the CBOR data is with an enum. Instead of using strings as keys, it will use a number. As you can see, it reduces the size of your CBOR binary significantly!
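
To put numbers on that, here is a small sizing sketch. It does no real encoding; it only adds up how many bytes each CBOR item needs under shortest-form encoding. All function names (cbor_head_len, cbor_string_len, cbor_int_len, size_with_int_keys, size_with_string_keys) are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes needed for a CBOR head carrying the given argument (shortest form). */
size_t cbor_head_len(uint64_t arg)
{
    if (arg < 24)          return 1;
    if (arg <= 0xFF)       return 2;
    if (arg <= 0xFFFF)     return 3;
    if (arg <= 0xFFFFFFFF) return 5;
    return 9;
}

/* Bytes needed for a CBOR byte or text string of the given length. */
size_t cbor_string_len(size_t len)
{
    return cbor_head_len(len) + len;
}

/* Bytes needed for a CBOR integer (negative n is stored as -1 - n). */
size_t cbor_int_len(int64_t v)
{
    return cbor_head_len(v >= 0 ? (uint64_t)v : (uint64_t)(-1 - v));
}

/* {0: version, 1: rssi}: integer keys, as in the C codec. */
size_t size_with_int_keys(const char *version, int32_t rssi)
{
    assert(version != NULL);
    return cbor_head_len(2)                          /* map with 2 pairs */
         + cbor_int_len(0) + cbor_string_len(strlen(version))
         + cbor_int_len(1) + cbor_int_len(rssi);
}

/* {"version": version, "rssi": rssi}: same values, string keys. */
size_t size_with_string_keys(const char *version, int32_t rssi)
{
    assert(version != NULL);
    return cbor_head_len(2)                          /* map with 2 pairs */
         + cbor_string_len(strlen("version")) + cbor_string_len(strlen(version))
         + cbor_string_len(strlen("rssi")) + cbor_int_len(rssi);
}
```

For version "0.1.0" and rssi -49 this works out to 11 bytes with integer keys and 22 bytes with string keys, against 30 bytes for the JSON string. So half of the CBOR savings here comes from the keys alone.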

By default, the to_vec function in serde_cbor does not produce this optimized form. To get the compact form, use the serde_cbor::ser::to_vec_packed function. serde_cbor then does the same thing that my manual C code did, but automagically. As long as your C code is in sync you can send messages back and forth all day long!
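
One cheap way to keep the C side in sync is to pin the positions at compile time. This is only a sketch: it assumes the packed keys follow the field declaration order of the Rust struct (version first, rssi second), and telemetry_pos_count is a made-up sentinel that is not in the original header. The enum is repeated here so the snippet compiles on its own.

```c
#include <assert.h> /* static_assert (C11) */

/* Position values from telemetry_codec.h, plus a hypothetical count sentinel. */
enum telemetry_data_positions
{
    telemetry_version_pos,
    telemetry_rssi_pos,
    telemetry_pos_count /* hypothetical: number of fields */
};

/* Assumed to mirror the Rust side:
 *   struct TelemetryData { version: String, rssi: i32 }
 * If someone reorders the enum, the build fails instead of the server. */
static_assert(telemetry_version_pos == 0, "version must be field 0 in TelemetryData");
static_assert(telemetry_rssi_pos == 1, "rssi must be field 1 in TelemetryData");
static_assert(telemetry_pos_count == 2, "update TelemetryData when adding a field");
```

Adding a field then means touching the enum, the Rust struct and these three lines together, which makes a mismatch harder to ship by accident.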

Other Options

CBOR isn’t the only game in town. I’ve personally written about Protobuf and have used it in several projects. Protobuf allows you to create a top-level definition of your data structures. You then generate your code in your language of choice. It works on the most popular programming languages like C, C++, Go, Rust and more.

Another interesting project is Cap’n Proto. One of the original developers of Protobuf is the founding developer on the project. He took many of his learnings at Google and applied them to make Cap’n Proto. So, in some cases, it soars where Protobuf falls short.

Let me explain.

Instead of serializing your data structures, Cap’n Proto keeps them in memory in a ready-to-send format. This is a virtually “no overhead” protocol because there is no serialization per se. It does (marginally) increase the work required when accessing and updating your data. I have not played much with Cap’n Proto but it seems to have promise.

Similarly, there’s also MessagePack, which is the most similar to CBOR. It’s also available in most programming languages. It does tend to serialize a bit larger than CBOR or Protobuf but it’s another great option. The choice is in your hands!

Conclusion

In this post you’ve learned how to:

  • Create a codec in C to encode and decode a simple data structure.
  • Send that data over HTTP to a Rust server.
  • Program a Rust server using warp to receive binary data.
  • Convert the binary data to the equivalent Rust data structure.

While you can run this example locally, you can utilize the concepts elsewhere. For example, you can apply the same techniques for sending data over a metered LTE link. This will save you mucho dinero over the long run. Especially when your deployments start to scale!

If you’d like to tinker more, make sure you check out the example code and give it a star. ⭐️

Like this post? Be sure to share and spread the word. 👍

Last Modified: 2020.11.19
