2000 (GMTI) Tracks on a Raspberry Pi

Pentland Edge has developed a new, efficient tracker core, which we plan to use to increase the number of simultaneous GMTI (Ground Moving Target Indication) tracks that can be maintained on a range of hardware. This article is based on initial benchmarking tests we performed to provide a baseline against which future improvements can be measured. Testing on a simple Raspberry Pi 4 single-board computer shows that the tracker is already capable of handling a steady load of 2000 simultaneous tracks on a modest compute platform, with further scope for improvement.

What is GMTI Tracking?

Airborne surveillance radar systems typically have GMTI (Ground Moving Target Indication) detection modes, which scan an area on the ground and periodically generate detections of moving vehicles and perhaps people. One of the difficulties in making use of this raw data is that the interval between detections may be a few seconds long. From a single snapshot of detections, without context, it can be difficult to make sense of what is actually happening on the ground, particularly in a busy environment. Tracker systems aim to maintain context between radar updates, model the behaviour of the moving targets, and allow for interpolation and extrapolation of the radar data points. This allows operator displays to update continuously, showing a moving scene rather than a series of static snapshots. By modelling movement, the tracker is also able to reduce some of the noise inherent in the raw detections, and present a more accurate set of positions at a given time.
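To make the idea of extrapolating between radar updates concrete, here is a minimal sketch that assumes a simple constant-velocity motion model. The TrackState class and extrapolate function are hypothetical names chosen for illustration; the motion model used inside the Hawkstream tracker is not described in this article and may well differ.

```python
from dataclasses import dataclass


@dataclass
class TrackState:
    """Hypothetical 2-D track state (assumption, not the tracker's own type)."""
    t: float   # time of the last update, in seconds
    x: float   # east position, in metres
    y: float   # north position, in metres
    vx: float  # east velocity, in metres per second
    vy: float  # north velocity, in metres per second


def extrapolate(state: TrackState, t_out: float) -> tuple[float, float]:
    """Predict the position at t_out, assuming constant velocity since state.t."""
    dt = t_out - state.t
    return (state.x + state.vx * dt, state.y + state.vy * dt)


# A vehicle last detected at the origin, heading east at 15 m/s.
state = TrackState(t=0.0, x=0.0, y=0.0, vx=15.0, vy=0.0)

# Display positions every 0.5 s while waiting for the next radar detection.
positions = [extrapolate(state, 0.5 * k) for k in range(1, 5)]
```

A display fed from positions shows the vehicle moving smoothly along its predicted path, rather than jumping once every few seconds when a new detection arrives.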

Why is Tracking Compute Intensive?

As the number of simultaneous tracks grows, and the maximum number of radar detections fed in during each scan interval increases, so does the number of potential combinations. Combining each incoming detection with an existing track, or potentially starting a new track, therefore becomes increasingly compute-intensive. Due to this “combinatorial explosion”, increasing the track count beyond a certain point quickly overwhelms live tracking systems that must keep up with the real-time arrival rate. Depending on the implementation of the tracker, updating the track models can also be a significant task. It is believed that many existing, readily deployable systems (those not requiring a shipping container full of computers) are limited to between 500 and 1000 simultaneous tracks. Our aim is to find a way to boost this to 10000 simultaneous tracks through improved algorithms and efficiently written software that takes advantage of modern compute platforms.
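As a rough illustration of the growth, the sketch below counts only the track/detection pairs that a naive tracker would have to score in each scan. This is a deliberately simplified assumption: it is a lower bound on the work, since the number of possible joint assignments grows far faster, and it says nothing about how the Hawkstream tracker actually performs association. The function name candidate_pairs is hypothetical.

```python
def candidate_pairs(n_tracks: int, n_detections: int) -> int:
    """Number of track/detection pairs a brute-force association step must score."""
    return n_tracks * n_detections


# One detection per track per scan, at the track counts discussed in the text.
for n in (500, 1000, 2000, 10000):
    print(n, candidate_pairs(n, n))
```

Under this assumption, going from 2000 to 10000 tracks multiplies the pair count by 25 (from 4 million to 100 million per scan), which is why simply buying faster hardware is unlikely to be enough on its own.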

Hawkstream Tracker Initial GMTI Benchmarking

While the software has been carefully written, it has not yet had any optimisation effort, or any attempt to make full use of the number of cores available on modern processors. As a starting point, we need to measure the initial performance so that we can tell if any changes made are indeed improvements, and to inform us where optimisation effort is best applied.

To provide a continuous test load, we used our detection simulator, which can generate real-time detections for a specified number of targets moving along a network of roads. The tracker was configured with a limit of 2000 tracks, and the simulator set to generate 2000 targets, updating once per second. The tracker was also configured to generate extrapolated track updates at a 10 Hz rate (100 milliseconds between updates).
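The test configuration implies the following steady-state message rates. This is a back-of-envelope sketch that assumes every target yields exactly one detection per scan and every track yields one extrapolated update per output tick; the function and parameter names are hypothetical.

```python
def load_per_second(n_targets: int, scan_interval_s: float,
                    n_tracks: int, output_rate_hz: float) -> tuple[float, float]:
    """Return (detections in per second, extrapolated track updates out per second)."""
    detections_in = n_targets / scan_interval_s
    updates_out = n_tracks * output_rate_hz
    return detections_in, updates_out


# Figures from the test setup: 2000 targets at 1 s intervals, 2000 tracks at 10 Hz.
detections_in, updates_out = load_per_second(2000, 1.0, 2000, 10.0)
print(detections_in, updates_out)
```

So the tracker must absorb about 2000 detections per second while emitting about 20000 extrapolated track updates per second.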

Soak testing on a desktop P.C.

The first test system was a standard desktop P.C. (AMD Ryzen 5 5500, 6 cores, 3600 MHz). Running under constant load, the system took an average of 170 ms to process each batch of 2000 new detections, occupying around 17% of one of the 6 cores on the CPU, with the overall system resource monitor showing little deviation from the baseline CPU loading of 5-10%. There is clearly scope to take advantage of the idle cores to reduce the 170 ms latency in providing updates based on the latest detections, and to boost the track count further.

Soak testing on a Raspberry Pi (version 4)

A Raspberry Pi was selected as a simple test device, due in no small part to the fact that we have a number of these lying around, but also because it is reasonably representative of the sort of small, single board platform that can be deployed on all sorts of mobile platforms. The Pi was one of the older Pi 4 models, running a 32-bit Raspbian OS. The tracker core was compiled for the 1.5 GHz Broadcom CPU based on an ARM Cortex-A72 with 4 cores.

The detection simulator and output recorder processes ran on the host P.C. and communicated with the tracker over the wireless network.

Under constant load, the time to process each batch of updates was 788 ms. This is still keeping up with the real-time arrival rate of one batch per second, but is getting close to the wire at 2000 tracks. Monitoring the resource usage showed that the tracker process was using around 85% of one of the CPU cores. A 788 ms lag in processing position updates is not ideal. Ideally, the updates should be processed before the next extrapolated position output is provided, which would mean a 100 ms limit for a 10 Hz update rate. In a similar manner to the desktop system, there is clearly considerable scope to make use of the other cores to reduce the update latency. However, 2000 simultaneous real-time tracks on a Raspberry Pi is a good baseline to start from, since this already exceeds the performance of many other systems.
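The measured batch times can be turned into two simple figures: the fraction of each one-second detection interval spent processing, and the speed-up needed to meet the 100 ms target. The sketch below uses only the numbers reported in this article; the function names are hypothetical, and the calculation ignores the extra cost of generating the 10 Hz extrapolated outputs.

```python
def duty_cycle(process_ms: float, interval_ms: float = 1000.0) -> float:
    """Fraction of each detection interval spent processing one batch."""
    return process_ms / interval_ms


def speedup_needed(process_ms: float, target_ms: float = 100.0) -> float:
    """Factor by which batch processing must speed up to meet target_ms."""
    return process_ms / target_ms


# Measured batch processing times from the two soak tests, in milliseconds.
measured = {"desktop": 170.0, "raspberry_pi_4": 788.0}

for name, ms in measured.items():
    print(f"{name}: duty cycle {duty_cycle(ms):.1%}, "
          f"speed-up needed {speedup_needed(ms):.2f}x")
```

On these figures the desktop needs a speed-up of about 1.7x to process a batch within one 10 Hz output period, while the Pi needs roughly 7.9x.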

Conclusions and Further Work

The soak testing showed that 2000 simultaneous tracks are already achievable on resource-limited platforms comparable to a Raspberry Pi 4, and that a desktop system is nowhere near its processing limit under that load. Further work will focus first on making use of the idle cores, followed by mapping to the Nvidia GPU which sits inside the desktop system, to see how far the track limit can be raised in a practical, readily deployable system.