
Prevent compiler from optimizing out benchmark body.

Since the result of the Reduce call was not used, several compilers were omitting the call completely on release builds.
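
For illustration only, and not the code changed in this merge request: a minimal sketch of one common way to keep an unused result alive in a timed region. `KeepAlive` and `TimeReduce` are hypothetical names, and `std::accumulate` stands in for the actual Reduce call.

```cpp
#include <chrono>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of VTK-m). Writing to a volatile object is an
// observable side effect, so the compiler has to produce `value` rather than
// discard the computation that feeds it.
template <typename T>
void KeepAlive(const T& value)
{
  volatile T sink = value;
  (void)sink; // silence unused-variable warnings
}

// Times a single reduction over `values` and returns the elapsed seconds.
// std::accumulate is only a stand-in for the benchmarked Reduce call.
double TimeReduce(const std::vector<float>& values)
{
  const auto start = std::chrono::steady_clock::now();
  const float sum = std::accumulate(values.begin(), values.end(), 0.0f);
  // Without this line `sum` is never used, and an optimizing build is free
  // to drop the accumulate call, leaving an empty timed region.
  KeepAlive(sum);
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}
```

Any approach that makes the result observable (writing it to a volatile, printing it, or comparing it against an expected value) should have the same effect; the volatile store is just the smallest one to show.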

FWIW, the TBB implementation shows pretty solid scaling once this issue is fixed:

Speedup  Warn  serial                 parallel               Benchmark (Type)
7.107          0.002597 +- 0.000185   0.000365 +- 0.000033   Reduce on 2097152 values (vtkm::Float32)
3.039    !     0.002816 +- 0.000130   0.000927 +- 0.000058   Reduce on 2097152 values (vtkm::Float64)
1.972    !!!   0.000648 +- 0.000039   0.000328 +- 0.000037   Reduce on 2097152 values (vtkm::Int32)
1.421    !!!   0.001321 +- 0.000049   0.000929 +- 0.000059   Reduce on 2097152 values (vtkm::Int64)
1.983    !!!   0.000650 +- 0.000037   0.000328 +- 0.000038   Reduce on 2097152 values (vtkm::UInt32)
3.170    !     0.000112 +- 0.000016   0.000035 +- 0.000003   Reduce on 2097152 values (vtkm::UInt8)
2.148    !!    0.004162 +- 0.000135   0.001938 +- 0.000071   Reduce on 2097152 values (vtkm::Vec< vtkm::Float32, 4 >)
1.488    !!!   0.004316 +- 0.000133   0.002900 +- 0.000080   Reduce on 2097152 values (vtkm::Vec< vtkm::Float64, 3 >)
1.229    !!!   0.001159 +- 0.000051   0.000943 +- 0.000060   Reduce on 2097152 values (vtkm::Vec< vtkm::Int32, 2 >)
1.495    !!!   0.001556 +- 0.000123   0.001041 +- 0.000029   Reduce on 2097152 values (vtkm::Vec< vtkm::UInt8, 4 >)
Edited by Allison Vacanti
