Copy speed benchmarks

Cleans up the device adapter algorithm benchmarks and makes it easier to modify settings such as array size, fixed buffer size vs. numValues, and the number of TBB threads used. Also reduces the default number of benchmarking iterations from 500 to 100 to shorten run time.

Adds a new benchmark that copies arrays of various sizes and prints the transfer speed.
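
As a rough illustration of what such a copy benchmark measures, here is a minimal standalone sketch. It is not the benchmark code from this merge request: `MeasureCopyGiBPerSec` and `PrintCopySpeeds` are hypothetical names, and it uses plain `std::vector` and `std::copy` rather than the device adapter algorithms.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical helper (not from this merge request): copies numBytes worth
// of T with std::copy, `iterations` times, and returns the mean GiB/s.
template <typename T>
double MeasureCopyGiBPerSec(std::size_t numBytes, int iterations = 10)
{
  const std::size_t numValues = numBytes / sizeof(T);
  assert(numValues > 0 && iterations > 0);

  std::vector<T> src(numValues);
  std::vector<T> dst(numValues);

  // Untimed warm-up pass so the timed loop does not pay first-touch costs.
  std::copy(src.begin(), src.end(), dst.begin());

  using Clock = std::chrono::steady_clock;
  const Clock::time_point start = Clock::now();
  for (int i = 0; i < iterations; ++i)
  {
    std::copy(src.begin(), src.end(), dst.begin());
  }
  const Clock::time_point stop = Clock::now();

  // Best-effort guard: read one byte of the destination through a volatile
  // so the copies have an observable effect.
  volatile unsigned char sink =
    *reinterpret_cast<const unsigned char*>(dst.data());
  (void)sink;

  // Clamp the elapsed time to avoid dividing by zero on very small buffers.
  const double seconds =
    std::max(std::chrono::duration<double>(stop - start).count(), 1e-9);
  const double gib = static_cast<double>(numValues * sizeof(T)) * iterations /
    (1024.0 * 1024.0 * 1024.0);
  return gib / seconds;
}

// Prints the copy speed for a few buffer sizes (1, 4, 16, and 64 MiB).
inline void PrintCopySpeeds()
{
  for (std::size_t mib = 1; mib <= 64; mib *= 4)
  {
    const double speed = MeasureCopyGiBPerSec<std::uint8_t>(mib * 1024 * 1024);
    std::printf("%4zu MiB: %6.2f GiB/s\n", mib, speed);
  }
}
```

Calling `PrintCopySpeeds()` from `main` prints one line per buffer size. Instantiating `MeasureCopyGiBPerSec` with a struct type instead of `std::uint8_t` gives a comparison along the lines of the numbers reported in this description, although absolute values depend heavily on the machine, compiler, and optimization flags.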

The new copy benchmarks highlighted an issue: std::copy of Pairs and Vecs was not being optimized to memcpy. For a 256 MiB buffer on my laptop with GCC, the serial copy speeds were:

UInt8:                  10.10 GiB/s
Vec<UInt8, 2>:           3.12 GiB/s
Pair<UInt32, Float32>:   6.92 GiB/s

After the last patch, which ensures that these containers are trivial types, the optimization occurs:

UInt8:                  10.12 GiB/s
Vec<UInt8, 2>:           9.66 GiB/s
Pair<UInt32, Float32>:   9.88 GiB/s
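
To illustrate why triviality matters here, the sketch below contrasts two hypothetical pair types. `TrivialPair` and `NonTrivialPair` are stand-ins invented for this example, not the actual Pair and Vec classes, and the exact change made by the patch is not reproduced. The underlying assumption is that standard library implementations generally only lower std::copy over contiguous ranges to a memmove/memcpy-style bulk copy when the element type satisfies a triviality trait, so user-provided copy operations can silently disable that path.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical pair with defaulted special members: the compiler-generated
// copy operations keep the type trivially copyable.
template <typename T1, typename T2>
struct TrivialPair
{
  T1 first;
  T2 second;

  TrivialPair() = default;
  TrivialPair(const T1& f, const T2& s) : first(f), second(s) {}
};

// Hypothetical pair with user-provided copy operations: functionally the
// same, but no longer trivially copyable.
template <typename T1, typename T2>
struct NonTrivialPair
{
  T1 first;
  T2 second;

  NonTrivialPair() : first(), second() {}
  NonTrivialPair(const NonTrivialPair& other)
    : first(other.first), second(other.second) {}
  NonTrivialPair& operator=(const NonTrivialPair& other)
  {
    first = other.first;
    second = other.second;
    return *this;
  }
};

using TrivialU32F32 = TrivialPair<std::uint32_t, float>;
using NonTrivialU32F32 = NonTrivialPair<std::uint32_t, float>;

// Compile-time checks that guard against accidentally losing triviality.
static_assert(std::is_trivially_copyable<TrivialU32F32>::value,
              "TrivialPair should be eligible for bulk copies");
static_assert(std::is_trivial<TrivialU32F32>::value,
              "TrivialPair should be a trivial type");
static_assert(!std::is_trivially_copyable<NonTrivialU32F32>::value,
              "user-provided copy operations make the type non-trivial");
```

Keeping `static_assert` checks like these next to a container's definition is one way to catch a regression at compile time rather than through a benchmark.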
