Accelerate output transformations
This MR focuses on accelerating the transformations of the output clouds:
- Improve multi-threading during pointcloud transformation and undistortion
- Do not aggregate, undistort or transform keypoints that should not be used (e.g. blobs)
- Only transform keypoints to WORLD coordinates if needed: if the user wants to output them in WORLD coordinates, or if we are dealing with a new keyframe
- Only aggregate and transform all input points to WORLD coordinates if needed: if the user wants to get them, or if the overlap computation is enabled.
This also brings some new helpers to easily aggregate pointclouds or transform them to the BASE or WORLD coordinate systems.
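The "only if needed" conditions above can be sketched as follows. This is a minimal illustration, not the code of this MR: `Point`, `Pose`, `Options`, `NeedWorldKeypoints`, `NeedWorldInputPoints`, `TransformCloud` and `TransformIfNeeded` are hypothetical names, and the cloud is a plain `std::vector` rather than the project's own pointcloud type.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical minimal types, for illustration only.
struct Point { double x, y, z; };
using Cloud = std::vector<Point>;

// Rigid transform p' = R * p + t, with R stored row-major. Defaults to identity.
struct Pose
{
  std::array<double, 9> R{{1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0}};
  std::array<double, 3> t{{0.0, 0.0, 0.0}};
};

// Hypothetical flags mirroring the conditions listed above.
struct Options
{
  bool outputKeypointsInWorld = false;    // user wants keypoints in WORLD
  bool isNewKeyframe = false;             // current frame is a new keyframe
  bool outputInputPointsInWorld = false;  // user wants the aggregated input points
  bool overlapEnabled = false;            // overlap computation is enabled
};

bool NeedWorldKeypoints(const Options& o)
{
  return o.outputKeypointsInWorld || o.isNewKeyframe;
}

bool NeedWorldInputPoints(const Options& o)
{
  return o.outputInputPointsInWorld || o.overlapEnabled;
}

// Apply the rigid transform to every point and return the new cloud.
Cloud TransformCloud(const Cloud& in, const Pose& pose)
{
  const std::array<double, 9>& R = pose.R;
  const std::array<double, 3>& t = pose.t;
  Cloud out;
  out.reserve(in.size());
  for (const Point& p : in)
  {
    out.push_back(Point{R[0] * p.x + R[1] * p.y + R[2] * p.z + t[0],
                        R[3] * p.x + R[4] * p.y + R[5] * p.z + t[1],
                        R[6] * p.x + R[7] * p.y + R[8] * p.z + t[2]});
  }
  return out;
}

// Fill `world` and return true only when the result is needed.
// When it is not needed, no point is touched and `world` is left as-is.
bool TransformIfNeeded(const Cloud& base, const Pose& baseToWorld, bool needed, Cloud& world)
{
  if (!needed)
    return false;
  world = TransformCloud(base, baseToWorld);
  assert(world.size() == base.size());
  return true;
}
```

A caller would pass `NeedWorldKeypoints(options)` or `NeedWorldInputPoints(options)` as the `needed` argument, so the per-point work is skipped entirely for frames where nobody consumes the WORLD clouds.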
Overall gain
Here are the processing durations (in ms) of the modified steps, measured on a VLP16 dataset played through the ROS wrapping while displaying the current registered frame:
SLAM step | master | this MR |
---|---|---|
Localization: initial undistortion | 0.545 | 0.114 |
Localization: 1 LM optim iteration | 2.074 | 1.660 |
Maps update | 0.131 | 0.105 |
ROS wrapping: publish output | 2.535 | 0.145 |
TOTAL: ROS wrapping: main scan callback | 32.972 | 28.990 |
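For this dataset, this MR brings an overall ~14% speedup of the main scan callback (32.972 ms down to 28.990 ms, i.e. about 12% less processing time per scan).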
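The overall figure can be reproduced from the TOTAL row of the table. The two helpers below are hypothetical and only make the arithmetic explicit: a speedup factor is the old duration divided by the new one, while the relative reduction is the saved time divided by the old duration.

```cpp
#include <cassert>

// Speedup factor: how many times faster the new code runs (old / new).
double SpeedupFactor(double beforeMs, double afterMs)
{
  assert(afterMs > 0.0);
  return beforeMs / afterMs;
}

// Relative reduction of the duration, as a fraction of the old duration.
double RelativeReduction(double beforeMs, double afterMs)
{
  assert(beforeMs > 0.0);
  return (beforeMs - afterMs) / beforeMs;
}
```

With the TOTAL row, `SpeedupFactor(32.972, 28.990)` is about 1.14 and `RelativeReduction(32.972, 28.990)` is about 0.12. The largest single gain is the publish step, where 2.535 / 0.145 gives a factor of roughly 17.5.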
Multithreading gain
Here are some numbers about the gain obtained with multithreading for various steps:
- transformation of a single pointcloud:
  - no gain for small clouds
  - for bigger clouds (~100000 points): x1.25 with 2 threads, x1.5 with 4 threads
- aggregation of input frames:
  - without undistortion: x1.35 with 2 threads, x1.5 with 4 threads
  - with undistortion: x1.8 with 2 threads, x3 with 4 threads
- refinement of undistortion: x1.9 with 2 threads, x3.5 with 4 threads
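Since small clouds gain nothing from extra threads, one way to parallelize a per-point operation is to enable threads only above a size threshold. The sketch below uses an OpenMP `parallel for` with an `if` clause. It is an assumption for illustration, not the code of this MR: `TranslateCloud`, `Point` and the threshold value are hypothetical, and a simple translation stands in for a full rigid transform or undistortion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal point type, for illustration only.
struct Point { double x, y, z; };

// Hypothetical threshold below which the loop stays serial.
// The value is arbitrary, not a measured one.
constexpr std::size_t kMinPointsForThreads = 10000;

// Translate every point in place by (tx, ty, tz), using up to nbThreads threads.
void TranslateCloud(std::vector<Point>& cloud, double tx, double ty, double tz, int nbThreads)
{
  assert(nbThreads >= 1);
  const int n = static_cast<int>(cloud.size());
  // Each iteration writes a distinct element, so no synchronization is needed.
  // If OpenMP is not enabled at compile time, the pragma is ignored
  // and the loop simply runs serially with the same result.
  #pragma omp parallel for num_threads(nbThreads) if(cloud.size() >= kMinPointsForThreads)
  for (int i = 0; i < n; ++i)
  {
    cloud[i].x += tx;
    cloud[i].y += ty;
    cloud[i].z += tz;
  }
}
```

The per-point work here is tiny, so the loop is likely bound by memory bandwidth. That would be consistent with the modest gains reported above for plain transformations, compared to the heavier undistortion steps.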
Edited by Nicolas Cadart