Compare and Swap (CAS) Sequence for BenchmarkAtomicArray Takes Too Long to Iterate

When running the CASSeq benchmark in BenchmarkAtomicArray, a single iteration takes ~10 minutes for the single-value case (AtomicsValues:1). This suggests that the benchmark may be implemented inefficiently and may not be stressing the functionality it is meant to measure. The benchmark should be refactored so that each iteration takes less time and so that the intended features are the ones actually being benchmarked.
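To illustrate one possible reason the single-value case is so slow, here is a minimal host-side sketch of a CAS increment loop in which a fixed number of operations is spread sequentially over a small number of atomic slots. This is not the VTK-m benchmark code: `RunCasSeq`, its parameters, and the `i % numValues` indexing are hypothetical stand-ins, and the assumption that the benchmark maps operations onto values this way has not been checked against the source. The point is only that with one slot, every thread retries against the same address.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for the benchmark kernel (NOT the VTK-m code).
// Performs `numOps` CAS-based increments spread over `numValues` atomic
// slots using `numThreads` threads, then returns the sum of all slots.
// If no increment is lost, the returned sum equals `numOps`.
std::uint64_t RunCasSeq(std::size_t numValues, std::size_t numOps, unsigned numThreads)
{
  assert(numValues > 0 && numThreads > 0);

  std::vector<std::atomic<std::uint32_t>> values(numValues);
  for (auto& v : values)
  {
    v.store(0);
  }

  auto worker = [&](unsigned tid) {
    // Each thread handles operations tid, tid + numThreads, ...
    for (std::size_t i = tid; i < numOps; i += numThreads)
    {
      // Assumed "sequential" mapping: operation i targets slot i % numValues.
      std::atomic<std::uint32_t>& slot = values[i % numValues];
      std::uint32_t expected = slot.load(std::memory_order_relaxed);
      // On failure, compare_exchange_weak reloads `expected`, so the loop
      // retries until this thread's increment is applied. With few slots,
      // most attempts collide with other threads and have to be retried.
      while (!slot.compare_exchange_weak(expected, expected + 1,
                                         std::memory_order_relaxed))
      {
      }
    }
  };

  std::vector<std::thread> threads;
  for (unsigned t = 0; t < numThreads; ++t)
  {
    threads.emplace_back(worker, t);
  }
  for (auto& t : threads)
  {
    t.join();
  }

  std::uint64_t total = 0;
  for (auto& v : values)
  {
    total += v.load();
  }
  return total;
}
```

Calling `RunCasSeq(1, n, threads)` and `RunCasSeq(4096, n, threads)` with the same `n` performs the same number of successful increments, but the first serializes every operation on one address. If the real benchmark behaves similarly, the low-value cases would mostly be measuring retry contention rather than CAS throughput.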

The benchmark results below were produced on the Sandia GPU cluster with the following configuration and command:

CUDA 10.2, GCC 7.3.0, Kokkos 3.4.01
./compare-benchmarks.py --benchmark1='../../build/bin/BenchmarkAtomicArray --vtkm-device Cuda' --benchmark2='../../build/bin/BenchmarkAtomicArray --vtkm-device Kokkos' -- benchmarks

Cuda results:

BenchCASSeq<unsigned int>/AtomicsValues:1/AtomicOps:33554432/manual_time        616023 ms  610317 ms     1  217.878k/s
BenchCASSeq<unsigned int>/AtomicsValues:8/AtomicOps:33554432/manual_time         72146 ms   71492 ms     1  1.86035M/s
BenchCASSeq<unsigned int>/AtomicsValues:64/AtomicOps:33554432/manual_time        10449 ms   10367 ms     1  12.8444M/s
BenchCASSeq<unsigned int>/AtomicsValues:512/AtomicOps:33554432/manual_time         256 ms     255 ms     3  524.776M/s
BenchCASSeq<unsigned int>/AtomicsValues:4096/AtomicOps:33554432/manual_time       8.20 ms    8.22 ms    85  16.3659G/s
BenchCASSeq<unsigned int>/AtomicsValues:32768/AtomicOps:33554432/manual_time      1.14 ms    1.15 ms   613  117.789G/s
BenchCASSeq<unsigned int>/AtomicsValues:262144/AtomicOps:33554432/manual_time    0.452 ms   0.461 ms  1551  297.178G/s
BenchCASSeq<unsigned int>/AtomicsValues:1048576/AtomicOps:33554432/manual_time   0.452 ms   0.461 ms  1544  297

Edited Jun 30, 2021 by Nickolas Davis