vtkm::Vec: unroll arithmetic operators
This MR explicitly unrolls the arithmetic operators in vtkm::Vec for the two forms Vector1 operation Vector2 and Vector operation Scalar.
UPDATE
The benchmark results below are from the latest commit.
Rationale
As discussed in a follow-up to a past VTK-m weekly meeting, some compilers give up on unrolling the component loops in the vtkm::Vec arithmetic operators when several operators are chained, that is, in expressions such as v1 + v2 + ... + vN.
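To make the difference concrete, here is a minimal sketch contrasting a loop-based component-wise add with an explicitly unrolled one. `Vec3f`, `AddLoop`, and `Sum4` are hypothetical names used only for illustration; this is not the actual vtkm::Vec code.

```cpp
#include <cstddef>

// Hypothetical stand-in for a 3-component float vector; not the VTK-m class.
struct Vec3f
{
  float c[3];
};

// Loop form: the compiler has to unroll this loop on its own.
inline Vec3f AddLoop(const Vec3f& a, const Vec3f& b)
{
  Vec3f r{};
  for (std::size_t i = 0; i < 3; ++i)
  {
    r.c[i] = a.c[i] + b.c[i];
  }
  return r;
}

// Explicitly unrolled form: one expression per component, no loop left to unroll.
inline Vec3f operator+(const Vec3f& a, const Vec3f& b)
{
  return Vec3f{ { a.c[0] + b.c[0], a.c[1] + b.c[1], a.c[2] + b.c[2] } };
}

// Chained use, as in v1 + v2 + ... + vN.
inline Vec3f Sum4(const Vec3f& v1, const Vec3f& v2, const Vec3f& v3, const Vec3f& v4)
{
  return v1 + v2 + v3 + v4;
}
```

Both forms compute the same result; the unrolled one simply does not depend on the optimizer to remove the loop at every step of the chain.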
Design caveats
Unfortunately, this requires a large number of arithmetic functions to cover all the variations we might find in our codebase.
To keep that manageable, the functions are generated with preprocessor macros.
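The macro-based generation could look roughly like the sketch below, which stamps out the Vector-Vector and Vector-Scalar overloads of one operator per macro invocation. The `Vec2` type and the `SKETCH_VEC2_OPERATOR` macro are hypothetical and only illustrate the technique; the macros in this MR may be organized differently.

```cpp
// Hypothetical stand-in for a 2-component vector; not the VTK-m class.
template <typename T>
struct Vec2
{
  T c[2];
};

// Hypothetical generator macro: for a given operator token OP it emits
// an unrolled Vec-Vec overload and an unrolled Vec-scalar overload.
#define SKETCH_VEC2_OPERATOR(OP)                                         \
  template <typename T>                                                  \
  inline Vec2<T> operator OP(const Vec2<T>& a, const Vec2<T>& b)         \
  {                                                                      \
    return Vec2<T>{ { static_cast<T>(a.c[0] OP b.c[0]),                  \
                      static_cast<T>(a.c[1] OP b.c[1]) } };              \
  }                                                                      \
  template <typename T>                                                  \
  inline Vec2<T> operator OP(const Vec2<T>& a, T s)                      \
  {                                                                      \
    return Vec2<T>{ { static_cast<T>(a.c[0] OP s),                       \
                      static_cast<T>(a.c[1] OP s) } };                   \
  }

// One line per operator instead of two hand-written functions each.
SKETCH_VEC2_OPERATOR(+)
SKETCH_VEC2_OPERATOR(-)
SKETCH_VEC2_OPERATOR(*)
SKETCH_VEC2_OPERATOR(/)

#undef SKETCH_VEC2_OPERATOR
```

In this simplified sketch the scalar must have the same type as the vector components, so `Vec2<int>{} * 3` compiles but `Vec2<float>{} * 3` does not; a real implementation has to cover more combinations, which is where the number of generated functions grows.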
Impact
The size of the generated libraries does not change.
library | master | unroll |
---|---|---|
libvtkm_filter_gradient-1.5.a | 185M | 185M |
libvtkm_filter_common-1.5.a | 194M | 194M |
libvtkm_filter_extra-1.5.a | 216M | 216M |
libvtkm_cont-1.5.a | 238M | 238M |
libvtkm_filter_contour-1.5.a | 259M | 259M |
libvtkm_rendering-1.5.a | 286M | 286M |
libvtkm_io-1.5.a | 29M | 29M |
libvtkm_worklet-1.5.a | 33M | 33M |
libvtkm_source-1.5.a | 8.4M | 8.4M |
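There is a slight performance improvement: every benchmark below runs faster, with reductions in time ranging from about 1% (BenchThreshold) to about 17% (BenchGradientPoint), and most gradient benchmarks improving by roughly 4%.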
Comparing master.json to unroll.json
Benchmark | Time | CPU | Time Old | Time New | CPU Old | CPU New |
---|---|---|---|---|---|---|
BenchGradientScalar/manual_time | -0.0487 | -0.0504 | 997 | 949 | 997 | 947 |
BenchGradientVector/manual_time | -0.0387 | -0.0403 | 2713 | 2608 | 2713 | 2603 |
BenchGradientVectorRow/manual_time | -0.0348 | -0.0364 | 2617 | 2526 | 2617 | 2522 |
BenchGradientPoint/manual_time | -0.1744 | -0.1763 | 867 | 716 | 867 | 714 |
BenchGradientDivergence/manual_time | -0.0404 | -0.0405 | 2648 | 2541 | 2648 | 2540 |
BenchGradientVorticity/manual_time | -0.0384 | -0.0383 | 2681 | 2578 | 2681 | 2578 |
BenchGradientQCriterion/manual_time | -0.0370 | -0.0370 | 2697 | 2597 | 2696 | 2597 |
BenchGradientAll/manual_time | -0.1182 | -0.1165 | 1113 | 981 | 1112 | 983 |
BenchThreshold/manual_time | -0.0092 | -0.0075 | 584 | 579 | 584 | 580 |
BenchThresholdPoints/CompactPts:0/manual_time | -0.0578 | -0.0577 | 441 | 416 | 441 | 416 |
BenchThresholdPoints/CompactPts:1/manual_time | -0.0474 | -0.0474 | 3898 | 3713 | 3897 | 3713 |
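For reading the numbers above: the Time and CPU columns appear to be relative changes, (new - old) / old, so negative values mean the unroll branch is faster. This interpretation is an assumption based on the usual output of benchmark comparison tools; `RelativeChange` below is a hypothetical helper, not part of VTK-m or of the benchmark tooling.

```cpp
// Relative change between two timings; negative means the new run is faster.
// Assumes oldTime is non-zero.
inline double RelativeChange(double oldTime, double newTime)
{
  return (newTime - oldTime) / oldTime;
}
```

For BenchGradientPoint, the displayed timings 867 and 716 give about -0.174, which agrees with the reported -0.1744 up to the rounding of the displayed values.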
Signed-off-by: Vicente Adolfo Bolea Sanchez <vicente.bolea@kitware.com>