# 04-18 Upload Queue

I am currently working on uploading the measurement data. In this post I explain the problems I am facing with my approach.

# Technical background

I have two conflicting requirements for measurement uploading and have to find a compromise between them. On the one hand, I want good data throughput per request: uploading a lot of measurement data at once reduces the number of requests, puts less load on the server, and cuts the per-request system resource overhead on the client. On the other hand, it is also important to send a continuous stream of data to the server for processing, because I can update / recalculate the position of the client more often if data is available promptly.

The compromise I have reached is the following: my measurementWebservice class gets a private member of type vector<IpsMeasurement>. It acts as a first-in-first-out queue in which I buffer the measurements I collect from the servers. After a given time frame or vector length (still to be defined), the data is converted to JSON and uploaded to the server. This way I can tune the refresh rate by choosing how large the vector may grow, while still sending many measurements in one go. The payload looks like this:

```json
{
    "client_addr": "XX:XX:XX:XX:XX:XX",
    "measurements": [
        {
            "server_addr": "AB:CD:EF:GH:IJ:KL",
            "rssi": "-50"
        },
        {
            "server_addr": "12:34:56:78:90:12",
            "rssi": "-21"
        },
        {
            "server_addr": "AB:CD:EF:GH:IJ:KL",
            "rssi": "-74"
        }
    ]
}
```
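The buffering idea can be sketched roughly like this. Note that this is only a minimal illustration: the class and member names beyond IpsMeasurement are my working names, and the serializer is hand-rolled string building rather than a real JSON library.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One measurement as described above.
struct IpsMeasurement {
    std::string server_addr;
    int rssi;
};

// Buffers measurements and flushes them as one JSON document once the
// buffer reaches a threshold (time-based flushing omitted for brevity).
class MeasurementBuffer {
public:
    explicit MeasurementBuffer(std::size_t flush_size) : flush_size_(flush_size) {}

    // Returns the JSON payload when the buffer is full, an empty string otherwise.
    std::string push(const IpsMeasurement& m, const std::string& client_addr) {
        buffer_.push_back(m);
        if (buffer_.size() < flush_size_) return "";
        std::string json = toJson(client_addr);
        buffer_.clear();  // the container is reused for the next batch
        return json;
    }

private:
    std::string toJson(const std::string& client_addr) const {
        std::ostringstream out;
        out << "{\"client_addr\":\"" << client_addr << "\",\"measurements\":[";
        for (std::size_t i = 0; i < buffer_.size(); ++i) {
            if (i) out << ",";
            out << "{\"server_addr\":\"" << buffer_[i].server_addr
                << "\",\"rssi\":\"" << buffer_[i].rssi << "\"}";
        }
        out << "]}";
        return out.str();
    }

    std::vector<IpsMeasurement> buffer_;
    std::size_t flush_size_;
};
```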

There are several C++ container adapters that provide FIFO behaviour on top of an underlying container. One of them is std::queue. It needs a backing container to work, since it is only an adapter that encapsulates the implementation.

std::deque is the default backing container of std::queue. The std::list container offers similar functionality, but in practice it tends to perform worse for frequent insertions and removals at the ends, since every element lives in its own heap-allocated node. Its strengths are constant-time splicing and insertion or removal at any position given an iterator.
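The adapter relationship can be shown in a few lines (nothing project-specific here):

```cpp
#include <deque>
#include <list>
#include <queue>
#include <type_traits>

// std::queue is only an adapter: it forwards push/pop to a backing
// container and exposes FIFO semantics. std::deque is the default.
static_assert(std::is_same<std::queue<int>,
                           std::queue<int, std::deque<int>>>::value,
              "std::deque is the default container of std::queue");

// The backing container can be swapped out, e.g. for std::list:
using ListBackedQueue = std::queue<int, std::list<int>>;

// FIFO behaviour is identical regardless of the backing container.
inline int frontAfterTwoPushes() {
    ListBackedQueue q;
    q.push(1);
    q.push(2);
    return q.front();  // the oldest element comes out first
}
```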

# Problem

How do I make sure that I do not lose data while posting? A POST is a synchronous call, so it blocks until the response arrives. During that time measurements keep coming in, and the POST request must not block the data collection process.

# Possible Solutions

# FIFO Queue and Concurrency

# Splice a list and pass by ref, recycle queue

The queue uses a std::list as its backing container, because std::list allows easy splicing. New measurement data is constantly pushed to the back of the queue. On a store call, the entire contents of the list are spliced into a second list, which is passed by reference to the JSON converter. Splicing removes the elements from the original list but keeps the container itself intact for reuse, so new measurements can be inserted while the serialization and the POST request are underway. The trick is to make the splice and serialization fast enough that the producer effectively never blocks. My main worry was that splicing might copy a lot of data around, because transferring the list to another list by copying would be very expensive. Fortunately that is not how splicing works: std::list::splice relinks the nodes and copies no elements, so handing over the whole list is a constant-time operation.
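A minimal sketch of the splice-and-recycle handover (the drain function is a hypothetical name for the step that would feed the serializer):

```cpp
#include <list>
#include <string>

struct IpsMeasurement {
    std::string server_addr;
    int rssi;
};

// Moves the entire contents of `queue` into a fresh list in O(1):
// splice relinks the nodes, no element is copied, and `queue` stays
// alive (now empty) so the producer can keep pushing into it while
// the returned batch is serialized and posted.
inline std::list<IpsMeasurement> drain(std::list<IpsMeasurement>& queue) {
    std::list<IpsMeasurement> batch;
    batch.splice(batch.end(), queue);  // constant time, transfers all nodes
    return batch;
}
```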

# Static deque or LIFO stack

Static variables are not destroyed when they leave scope, which means I can reuse the container every time the function is called: static variables and their values persist for the program's lifetime. I fill the static deque by pushing measurements to its back, then pass its contents by reference to the serializer. After serializing, the POST request is made, and afterwards the deque is cleared without awaiting the response. Since the static deque is not destroyed, it is ready the next time the store function is called. The benefit is that no element data is copied around between iterations: the container object persists, so nothing has to be reconstructed on every call. A deque also performs better than a list for simple push-back operations. The downside is that serializing the data and processing the HTTP request are blocking I/O, so during that time it is still impossible to save new measurements.
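A sketch of the static-deque variant. The serializeAndPost function here is only a stand-in for the real serializer and HTTP call; what matters is that the static deque inside store persists across calls:

```cpp
#include <cstddef>
#include <deque>
#include <string>

struct IpsMeasurement {
    std::string server_addr;
    int rssi;
};

// Stand-in for the real JSON serializer + POST; here it just counts.
inline std::size_t serializeAndPost(const std::deque<IpsMeasurement>& batch) {
    return batch.size();
}

// The static deque survives between calls, so the same container
// object is reused instead of constructing a new one every time.
inline std::size_t store(const IpsMeasurement& m, std::size_t flush_size) {
    static std::deque<IpsMeasurement> buffer;
    buffer.push_back(m);
    if (buffer.size() < flush_size) return 0;
    // This call is still blocking, which is the downside of this approach.
    std::size_t sent = serializeAndPost(buffer);
    buffer.clear();  // not destroyed, just emptied for the next batch
    return sent;
}
```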

# Change pointer to & of new Queue

Keep a few deques around and use a pointer to the currently active one; measurements are pushed to its back just as one would do with a list. Splicing is not possible with a deque, so instead the entire deque is passed by reference to the serializer, an async request is posted to the server, and the pointer is switched to the address of a different deque. The first deque is cleared once the request is done. The benefit is that nothing has to be copied while the producer waits: the handover is just a pointer change. It seems quite complicated though, and not easy to pull off correctly.
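This is essentially double buffering. A minimal sketch with two fixed deques and an active pointer; note that thread synchronization around the swap is deliberately omitted here, and a real concurrent version would need a mutex or an atomic pointer:

```cpp
#include <deque>
#include <string>

struct IpsMeasurement {
    std::string server_addr;
    int rssi;
};

class DoubleBuffer {
public:
    DoubleBuffer() : active_(&a_) {}

    // The producer always pushes into the currently active deque.
    void push(const IpsMeasurement& m) { active_->push_back(m); }

    // Swap buffers and hand the filled one to the caller for
    // serialization; new measurements now go into the other deque.
    std::deque<IpsMeasurement>& beginFlush() {
        std::deque<IpsMeasurement>* filled = active_;
        active_ = (active_ == &a_) ? &b_ : &a_;
        return *filled;
    }

    // After the POST completes, clear the drained deque for reuse.
    void endFlush(std::deque<IpsMeasurement>& drained) { drained.clear(); }

private:
    std::deque<IpsMeasurement> a_, b_;
    std::deque<IpsMeasurement>* active_;
};
```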

# Make REST API calls async

The request could be posted asynchronously, like an async XHR request in JavaScript AJAX. In the current configuration, the client waits for the server to receive, process, and save the measurements, and then waits for the response. Synchronous HTTP requests are blocking: they stop the program from executing until the server responds in any way, and during that time no new measurements can be pushed to the back of the queue. It is therefore important to minimize this time. I think an async request would alleviate the concurrency issues, because it lets the program continue immediately after calling the webservice.
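In C++ the same effect can be approximated with std::async / std::future: the request runs on another thread and the caller returns immediately. The postToServer function here is a stand-in for the real blocking HTTP call, not an actual client:

```cpp
#include <future>
#include <string>

// Stand-in for the real blocking HTTP POST.
inline bool postToServer(const std::string& json) {
    return !json.empty();  // pretend non-empty payloads succeed
}

// Fire off the request on another thread; the caller keeps running
// and can continue queueing measurements. The returned future can be
// checked later (or ignored) for the server's response.
inline std::future<bool> postAsync(std::string json) {
    return std::async(std::launch::async, postToServer, std::move(json));
}
```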

# Further actions

  • Make the POST request asynchronous so that HTTP operations do not block. Waiting for a response can lock up the program for tens of milliseconds and make it unresponsive to incoming measurement data.
  • The static deque seems to be the best option here, since it is the least complex. I could also try a static vector; popping from its back would make it a stack (LIFO), but technically it would work as well. The deque seems more fitting for a FIFO task.
Last Updated: 11/23/2020, 9:42:47 PM