Fix the default launch sizes for Tesla hardware. (!1667) · Merge requests · VTK / VTK-m

Robert Maynard requested to merge robertmaynard/vtk-m:cuda_hpc_defaults into master May 06, 2019

An 8x8x8 thread block is the better launch strategy for most VTK-m kernels. The current problem is that a few VTK-m kernels use a large number of registers per thread, and at this block size (512 threads) the block as a whole requires more registers than the hardware provides.
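To make the register arithmetic concrete, here is a minimal sketch in plain C++. The 65,536-register per-block limit is an assumption typical of recent Tesla parts; in practice it would be read from `cudaDeviceProp::regsPerBlock`. The names `BlockShape`, `ThreadsPerBlock`, and `FitsRegisterBudget` are hypothetical and not part of VTK-m, and the check ignores the hardware's per-warp register allocation granularity.

```cpp
#include <cassert>
#include <cstdint>

// Assumed per-block register limit (32-bit registers). Query
// cudaDeviceProp::regsPerBlock for the real value on a given device.
constexpr std::uint32_t kAssumedRegistersPerBlock = 65536;

// Hypothetical helper type: a 3D thread-block shape such as 8x8x8.
struct BlockShape
{
  std::uint32_t x;
  std::uint32_t y;
  std::uint32_t z;
};

// Total number of threads in one block of the given shape.
inline std::uint32_t ThreadsPerBlock(const BlockShape& shape)
{
  return shape.x * shape.y * shape.z;
}

// Returns true when threads * registers-per-thread stays within the
// per-block register budget. This is a simplified model: real hardware
// allocates registers per warp with some rounding.
inline bool FitsRegisterBudget(const BlockShape& shape,
                               std::uint32_t registersPerThread,
                               std::uint32_t registersPerBlock = kAssumedRegistersPerBlock)
{
  assert(registersPerThread > 0);
  const std::uint64_t needed =
    static_cast<std::uint64_t>(ThreadsPerBlock(shape)) * registersPerThread;
  return needed <= registersPerBlock;
}
```

Under these assumptions an 8x8x8 block has 512 threads, so a kernel may use at most 128 registers per thread (512 x 128 = 65,536) before the launch exceeds the budget.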

What we should do in the long run is add more control over kernel launches on a per-kernel basis. This will require VTK-m to extract the number of registers used by each kernel.
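One way per-kernel control could look is sketched below in plain C++. This is not VTK-m's implementation: `LaunchShape`, `DefaultCandidates`, and `PickLaunchShape` are hypothetical names, and the candidate list is illustrative. The register count is passed in as a parameter; on the CUDA side it could be obtained from `cudaFuncGetAttributes`, which fills in `cudaFuncAttributes::numRegs` for a given kernel.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper type: a 3D thread-block shape.
struct LaunchShape
{
  std::uint32_t x;
  std::uint32_t y;
  std::uint32_t z;
};

// Total number of threads in one block of the given shape.
inline std::uint32_t ThreadCount(const LaunchShape& shape)
{
  return shape.x * shape.y * shape.z;
}

// Illustrative candidates, ordered from largest to smallest block.
inline std::vector<LaunchShape> DefaultCandidates()
{
  return { { 8, 8, 8 }, { 8, 8, 4 }, { 8, 4, 4 }, { 4, 4, 4 } };
}

// Returns the first (largest) candidate whose total register demand fits
// in registersPerBlock. Falls back to the last (smallest) candidate when
// none fits. registersPerThread is assumed to come from a per-kernel
// query such as cudaFuncGetAttributes (numRegs).
inline LaunchShape PickLaunchShape(std::uint32_t registersPerThread,
                                   std::uint32_t registersPerBlock,
                                   const std::vector<LaunchShape>& candidates)
{
  assert(!candidates.empty());
  for (const LaunchShape& candidate : candidates)
  {
    const std::uint64_t needed =
      static_cast<std::uint64_t>(ThreadCount(candidate)) * registersPerThread;
    if (needed <= registersPerBlock)
    {
      return candidate;
    }
  }
  return candidates.back();
}
```

With a 65,536-register budget, a kernel using 64 registers per thread keeps the 8x8x8 default, while one using 200 registers per thread drops to 8x8x4 (256 threads). Ordering the candidates from largest to smallest keeps the preferred default for the common case and only shrinks the block for register-heavy kernels.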
