
[WIP] OpenMP Backend

Will Usher requested to merge Twinklebear/vtk-m:backend-openmp into master

This is some initial work on putting together an OpenMP backend for VTK-m.

Specialized Algorithms:

  • Schedule: Just a #pragma omp parallel for around a for loop calling the kernel.

  • Synchronize: Does nothing. The backend follows a fork/join model for each algorithm, so no tasks are left running in the background after returning from a method.

  • ScanInclusive: This is a port of a hand-vectorized scan optimized for MIC by Jeff Inman at LANL. It performs well but is ~35x slower than Intel's TBB scan on MIC, so there's still some work left to do here to catch up to TBB. I took a quick glance at their code but would like to find a paper or a better description of what they've implemented, since it looks a bit complicated. The plots below show the speedup of scan over the serial backend on CPU and MIC when working on 4194304 vtkm::Float32 values.

[Plots: cpu_scan_speedup (scan speedup on CPU), mic_scan_speedup (scan speedup on MIC)]

  • Sort: This is a parallel quicksort. TBB also uses a parallel quicksort and we get similar performance; however, there may be a few bugs to sort out here. If I reduce the grain size so that more than one thread is used in the Sort unit test, it seems to hang after finishing the sort. I'm not sure why at the moment.
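To make the Schedule and ScanInclusive descriptions above more concrete, here is a minimal sketch of the two patterns. It is not the code in this branch: the names ScheduleSketch and ScanInclusiveSketch, the std::vector interface, and the numBlocks parameter are all hypothetical, and the scan is a plain two-pass blocked scan rather than the hand-vectorized port described above. It should compile with or without OpenMP enabled (e.g. -fopenmp); without it the pragmas are ignored and the loops run serially.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Schedule: call functor(i) for every i in
// [0, numInstances), with the loop iterations shared among OpenMP threads.
template <typename Functor>
void ScheduleSketch(Functor functor, std::ptrdiff_t numInstances)
{
#pragma omp parallel for
  for (std::ptrdiff_t i = 0; i < numInstances; ++i)
  {
    functor(i);
  }
}

// Hypothetical two-pass blocked inclusive scan (in place).
// Returns the last scanned element, i.e. the total of the input.
template <typename T>
T ScanInclusiveSketch(std::vector<T>& values, std::ptrdiff_t numBlocks = 8)
{
  const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(values.size());
  if (n == 0)
  {
    return T();
  }
  numBlocks = std::max<std::ptrdiff_t>(1, std::min(numBlocks, n));
  const std::ptrdiff_t blockSize = (n + numBlocks - 1) / numBlocks;
  std::vector<T> blockSums(static_cast<std::size_t>(numBlocks), T());

  // Pass 1: each block computes its own inclusive scan independently.
#pragma omp parallel for
  for (std::ptrdiff_t b = 0; b < numBlocks; ++b)
  {
    const std::ptrdiff_t begin = std::min(b * blockSize, n);
    const std::ptrdiff_t end = std::min(begin + blockSize, n);
    T running = T();
    for (std::ptrdiff_t i = begin; i < end; ++i)
    {
      running = running + values[i];
      values[i] = running;
    }
    blockSums[b] = running;
  }

  // Serial step: turn the per-block totals into per-block starting offsets.
  std::vector<T> offsets(static_cast<std::size_t>(numBlocks), T());
  for (std::ptrdiff_t b = 1; b < numBlocks; ++b)
  {
    offsets[b] = offsets[b - 1] + blockSums[b - 1];
  }

  // Pass 2: add each block's offset to its locally scanned values.
#pragma omp parallel for
  for (std::ptrdiff_t b = 1; b < numBlocks; ++b)
  {
    const std::ptrdiff_t begin = std::min(b * blockSize, n);
    const std::ptrdiff_t end = std::min(begin + blockSize, n);
    for (std::ptrdiff_t i = begin; i < end; ++i)
    {
      values[i] = offsets[b] + values[i];
    }
  }

  return values[n - 1];
}
```

Both functions return only after their parallel loops have joined, which is why a Synchronize in this style of backend has nothing left to wait for. Note that for floating-point types the blocked scan adds values in a different order than a serial scan, so results can differ slightly in the last bits.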

Let me know what comments you have. There's still a good bit of work left on this backend.
