# 10-01 Testing

Did even more tests today. Once again I evaluated the accuracy of the positioning system based on the root mean squared error (RMSE) of the position estimates. I experimented with different parameters for the averaging calculations in the frontend and backend. My idea was that increasing the population sizes that feed into the means would yield lower RMSEs. The population sizes can be tweaked in two concrete places: once in the backend and once in the frontend.
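For reference, the per-node RMSE I report below boils down to the root of the mean squared Euclidean distance between the estimates and the known grid position. A minimal sketch (the function and variable names are illustrative, not the actual test-harness code):

```python
import math

def position_rmse(estimates, truth):
    """RMSE of 2D position estimates against a known ground-truth point.

    estimates: list of (x, y) tuples produced by the positioning system
    truth:     the (x, y) ground-truth location of the test-grid node
    """
    squared_errors = [
        (ex - truth[0]) ** 2 + (ey - truth[1]) ** 2
        for ex, ey in estimates
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Example: estimates scattered around a hypothetical true node at (1.0, 2.0)
samples = [(1.1, 2.0), (0.9, 1.8), (1.2, 2.3), (1.0, 2.1)]
print(round(position_rmse(samples, (1.0, 2.0)), 3))  # → 0.224
```

This is done once per grid node, separately for the raw and the interpolated estimates.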

First I benchmarked the system using the recommended default settings that I found in research papers online. The 3D graphic below shows the RMSE values in relation to the client's position on the test grid. The test grid forms a plane at the bottom of the 3D space. The varying heights show the RMSE (higher is worse) at nine locations distributed across the field. Furthermore I distinguish between the values for the interpolated positions (blue) and the untouched, raw positions (orange). The RMSE values are then connected to form a 3D surface, to better show how the RMSE changes with the client's location in 2D space.

# Baseline configuration


The average raw RMSE is 1.13m. On average the interpolated RMSE is 0.64m. Every single average has a population size of 50, so the central limit theorem is applicable. The interpolated positions clearly have a lower RMSE than the raw values and are always more accurate: in the graphic above the surfaces never intersect. Also interesting is that the surface is creased along the middle in the X-direction to form something akin to a tent. That means that along that crease line there are higher RMSE values relative to the bottom and top edges of the grid. The largest error is measured in the right part of the grid, in the middle of the Y-axis.
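For the significance claims in the following experiments, a two-sample test such as Welch's t-test on the per-node error samples is one way to back them up. A stdlib-only sketch of the statistic (function name and sample values are illustrative, not taken from my measurements):

```python
import math

def welch_t(sample_a, sample_b):
    """Welch's t statistic and degrees of freedom for two independent
    samples with possibly unequal variances (e.g. the error samples of
    two configurations)."""
    na, nb = len(sample_a), len(sample_b)
    ma = sum(sample_a) / na
    mb = sum(sample_b) / nb
    # Unbiased sample variances
    va = sum((x - ma) ** 2 for x in sample_a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in sample_b) / (nb - 1)
    se2 = va / na + vb / nb                 # squared standard error of the difference
    t = (ma - mb) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    dof = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, dof

# Hypothetical error samples from two configurations
t, dof = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 3.0, 4.0, 5.0, 6.0])
```

The resulting |t| is then compared against the critical value for the computed degrees of freedom; with population sizes of 50 per average, the normality assumption is reasonable.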

# 50 position interpolation


In a second experimental setup I changed the number of positions that are processed to get the interpolation, ceteris paribus. Here each averaged position component consisted of 50 data points as opposed to only ten in the previous example. The average raw RMSE is 1.42m. On average the interpolated RMSE is 0.70m. Both RMSEs are higher than in the baseline setup; the increase is 8.9% (interpolated) and 20.4% (raw) respectively. Both increases are statistically significant. There no longer is a tent shape; the topography is generally flatter (the slopes are smaller), but the right center node still has an abnormally high RMSE compared to its neighbors and thus forms a little peak.
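The frontend averaging in both setups amounts to a sliding-window mean over the most recent position estimates. A minimal sketch of the idea (class and parameter names are my own, not the actual frontend code):

```python
from collections import deque

class PositionSmoother:
    """Sliding-window mean over the last `window` position estimates.

    A larger window smooths more, but it must reach further into the
    past, which is exactly what hurts responsiveness to movement.
    """
    def __init__(self, window=50):
        self.buffer = deque(maxlen=window)  # old estimates fall out automatically

    def update(self, x, y):
        """Add a new raw estimate and return the current windowed mean."""
        self.buffer.append((x, y))
        n = len(self.buffer)
        mean_x = sum(p[0] for p in self.buffer) / n
        mean_y = sum(p[1] for p in self.buffer) / n
        return mean_x, mean_y

# Example with a tiny window of 2 for readability
s = PositionSmoother(window=2)
s.update(0.0, 0.0)   # → (0.0, 0.0)
s.update(2.0, 2.0)   # → (1.0, 1.0)
s.update(4.0, 4.0)   # → (3.0, 3.0); the (0, 0) estimate has dropped out
```

Going from a window of ten to 50 means each displayed position depends on estimates that can be considerably older, which matches the loss of responsiveness discussed below.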

# 31 measurement moving average


Next I cranked up the number of measurements that are used to compute the moving average in the backend. As opposed to taking the mean of more X and Y components, which was the variable in the previous experiment, here the linearly scaled mean of more RSSI values is the variable. Once again the RMSE of positions based on these means is the observed quantity; all else is equal. The raw RMSE mean is 1.07m, the interpolated RMSE is 0.84m. Compared to the baseline values we have a statistically insignificant raw RMSE decrease of 5% and a significant RMSE increase of 30% for the interpolation. What must be noted is the extreme outlier of 2.51m, once again in the same right node that was mentioned previously. When ignoring the outlier, the average interpolated RMSE is 0.64m, the same as the baseline result. The raw RMSE would then have been only 0.79m, which is a 29% reduction compared to the baseline. The raw and the interpolated surfaces are much closer to each other in this experiment; the spread of the raw RMSE is smaller, since the interpolation average is very close to the raw average.
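The "linearly scaled mean" deserves a quick illustration: RSSI is reported in dBm, a logarithmic unit, so averaging the raw dBm numbers directly would bias the result. Converting to linear milliwatts first, averaging, and converting back is what I mean by linear scaling; a sketch (assuming dBm input, function name is mine):

```python
import math

def rssi_mean_dbm(rssi_values_dbm):
    """Mean of RSSI readings taken in linear (mW) space, returned in dBm."""
    linear = [10 ** (rssi / 10) for rssi in rssi_values_dbm]   # dBm -> mW
    return 10 * math.log10(sum(linear) / len(linear))          # mW  -> dBm

# Identical readings average to themselves
rssi_mean_dbm([-50.0, -50.0])   # → -50.0
# The linear mean is dominated by the stronger (less negative) reading
rssi_mean_dbm([-40.0, -60.0])   # ≈ -42.97, not the naive -50.0
```

With a 31-sample window, one strong multipath spike can therefore pull the average noticeably, which might contribute to outliers like the 2.51m node.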

# Discussion of results

Compared to other indoor positioning systems built by various universities around the world, my system fared quite well. Researchers at Chongqing University achieved an average RMSE of 0.7343m. The ETH wireless communications group reached an average position RMSE of 1.36m, and students at the department of computer science in Oldenburg, Germany got a mean absolute error of 0.71m.

The GUI-side increase of the sample size actually made the accuracy of the system worse. It also has to include older data in order for the estimate to consist of 50 positions, so responsiveness to movement is really bad.

When using a higher amount of data in the backend, the reliability of the raw position estimates increases. Apart from the huge outlier (the large peak in the surface) that formed on the right side of the grid, the results were almost the same as in the baseline configuration. It is disappointing that the results are not an improvement overall (only for the raw data), but I think that might be caused by a different issue: I collected the measurements over the course of a few hours with lunch in between, so towards the end the path loss model parameters might have been off. Thus the comparison is not completely fair. It might be good to give it another chance some other time.

The peak on the right side and the tent shape also showed up outdoors, where there were no objects next to the measurement field as there were indoors (columns, tables, chairs). The beacon placement and client orientation were the same, so it is probably related to one of those. That means it might really be worth giving the 31 measurement average window another chance.


# Single Client

In a fourth experiment with recalibrated PLM parameters I tested the use of only a single client. The results are really bad compared to the multi-client setup used before. Here are the results:


The average RMSE of interpolated positions was 1.21m. On average the RMSE of raw positions was 1.38m. What is worrisome is that the RMSE was often above 1.5m, which is really bad in such a small field. With dimensions of 3.9m × 3m, that means the system often only barely knew which half of the room the client was in. Since the multi-client setup is no more work to use than a single client, it makes sense to move forward with that setup.

# Next steps

I want to do one more test with a much larger grid. Then I will move forward with creating the mockup store and writing the blog post.

Last Updated: 11/23/2020, 2:52:08 PM