ArrayHandleZip performance implications

Many of the ByKey methods in DeviceAdapterAlgorithm implementations use a 'building block' approach, combining inputs with keys into an ArrayHandleZip and calling a simpler version of the algorithm.
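To make the 'building block' pattern concrete, here is a minimal sketch in plain standard C++, not VTK-m code. `SortByKeyViaZip` is a hypothetical name, `std::pair` stands in for vtkm::Pair, and `std::stable_sort` stands in for the simpler underlying algorithm. One notable difference: this sketch materializes the pairs up front, whereas ArrayHandleZip is a lazy view whose portal builds a pair on each access.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a ByKey method built on a simpler algorithm.
// Not VTK-m's implementation; it only illustrates the zip-then-delegate shape.
template <typename K, typename V>
void SortByKeyViaZip(std::vector<K>& keys, std::vector<V>& values)
{
  assert(keys.size() == values.size());

  // "Zip" step: combine each key with its value into one pair.
  std::vector<std::pair<K, V>> zipped;
  zipped.reserve(keys.size());
  for (std::size_t i = 0; i < keys.size(); ++i)
  {
    zipped.emplace_back(keys[i], values[i]);
  }

  // Delegate to the simpler algorithm (a plain sort), comparing keys only.
  std::stable_sort(zipped.begin(), zipped.end(),
                   [](const std::pair<K, V>& a, const std::pair<K, V>& b) {
                     return a.first < b.first;
                   });

  // "Unzip" step: write the sorted keys and values back to the inputs.
  for (std::size_t i = 0; i < zipped.size(); ++i)
  {
    keys[i] = zipped[i].first;
    values[i] = zipped[i].second;
  }
}
```

Every comparison and every element move inside the sort operates on whole pairs here, which is the kind of per-element work the zip portals add on top of a basic array sort.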

We've noticed significant overhead in these methods, possibly caused by the temporary vtkm::Pair objects created in ArrayPortalZip::Get and by the related WrappedBinaryOperator and ArrayPortalValueReference::Swap code paths.

The following benchmark highlights this issue: BenchmarkZipArraySort.cxx

| Test | Serial | TBB | CUDA |
|---|---|---|---|
| Sort Basic Array | 1.864s | 0.638s | 0.457s |
| Sort Zip Array | 2.884s | 1.229s | 2.390s |
| Sort Zip Array w/ KeyComparator | 2.513s | 1.167s | XXXXXX |

It may be worthwhile to look at either optimizing the zip portals or refactoring the algorithms to remove them.
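As one possible shape for the refactoring option, a sort-by-key can avoid zipped pairs entirely by sorting an index permutation on the keys and then gathering both arrays through it. The sketch below is plain standard C++ under that assumption. `SortByKeyViaIndices` is a hypothetical name, nothing here is existing VTK-m API, and it has not been benchmarked against the zip-based path.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical zip-free sort-by-key: sort indices by key, then gather.
// An illustration of the idea only, not a proposed VTK-m implementation.
template <typename K, typename V>
void SortByKeyViaIndices(std::vector<K>& keys, std::vector<V>& values)
{
  assert(keys.size() == values.size());

  // Start from the identity permutation 0, 1, 2, ...
  std::vector<std::size_t> order(keys.size());
  std::iota(order.begin(), order.end(), std::size_t{ 0 });

  // Sort the indices by the keys they refer to; only indices are moved.
  std::stable_sort(order.begin(), order.end(),
                   [&keys](std::size_t a, std::size_t b) {
                     return keys[a] < keys[b];
                   });

  // Gather keys and values through the sorted permutation.
  std::vector<K> sortedKeys;
  std::vector<V> sortedValues;
  sortedKeys.reserve(keys.size());
  sortedValues.reserve(values.size());
  for (std::size_t idx : order)
  {
    sortedKeys.push_back(keys[idx]);
    sortedValues.push_back(values[idx]);
  }

  keys.swap(sortedKeys);
  values.swap(sortedValues);
}
```

The trade-off is an extra index array and indirect key reads during the sort, in exchange for moving only fixed-size indices instead of key/value pairs. Whether that is a net win on any given device would need to be measured.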