CUDA kernels now execute for multiple values in a strided manner. (!1090) · Merge requests · VTK / VTK-m

Robert Maynard requested to merge robertmaynard/vtk-m:more_efficient_cuda_task_scheduling into master Feb 21, 2018

Instead of launching lots of kernel instances, we re-use the kernel instances using a stride iteration pattern.

The compelling reason for this change is that it allows us to re-use the worklet infrastructure that is created for each thread, and therefore reduces constant overhead costs.
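As a rough illustration of the stride iteration pattern, here is a minimal CPU-side sketch in plain C++. It is not the VTK-m implementation: `RunStats`, `runStrided`, and the doubling operation are hypothetical names and stand-ins chosen for this example, and the outer loop only models what independent CUDA threads would do in parallel. Each modeled thread pays its setup cost once and then visits indices `tid`, `tid + numThreads`, `tid + 2 * numThreads`, and so on.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bookkeeping for this sketch only: counts how often the
// per-thread setup runs and how many elements are processed.
struct RunStats {
    std::size_t setups = 0;
    std::size_t elements = 0;
};

// Models a strided launch: numThreads kernel instances cover all values.
// On a GPU the stride would typically be the total number of launched
// threads; here it is simply numThreads.
RunStats runStrided(const std::vector<float>& in,
                    std::vector<float>& out,
                    std::size_t numThreads)
{
    assert(numThreads > 0);
    assert(out.size() == in.size());

    RunStats stats;
    // Each iteration of this loop stands in for one thread.
    for (std::size_t tid = 0; tid < numThreads; ++tid) {
        // Stand-in for building the per-thread worklet infrastructure:
        // done once per thread, not once per value.
        ++stats.setups;

        // Stride loop: the same thread state is re-used for every value.
        for (std::size_t i = tid; i < in.size(); i += numThreads) {
            out[i] = 2.0f * in[i]; // placeholder for the worklet body
            ++stats.elements;
        }
    }
    return stats;
}
```

With 10 values and 4 modeled threads, the setup runs 4 times while all 10 values are still processed; launching one instance per value would run it 10 times. The saving grows with the ratio of values to threads.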

