Fix the default launch sizes for Tesla hardware.

Robert Maynard requested to merge robertmaynard/vtk-m:cuda_hpc_defaults into master

An 8x8x8 thread block is a better launch strategy for most VTK-m kernels. The current problem is that a few VTK-m kernels use a large number of registers per thread, and at 512 threads per block they end up requiring too many registers.
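To make the register arithmetic concrete, here is a minimal host-side sketch. The `BlockDims` type and both helper functions are hypothetical names for illustration, not VTK-m API. The per-thread register counts (64 and 160) and the 65536-register per-block limit in the usage note are assumed example values. On a real device the limit would come from the `regsPerBlock` field of `cudaDeviceProp`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for illustration; not a VTK-m type.
struct BlockDims
{
  std::uint32_t x, y, z;

  // Total number of threads in one block.
  std::uint32_t threads() const { return x * y * z; }
};

// Estimated registers needed by one block: threads per block times
// registers per thread. Hardware allocates registers at a coarser
// granularity, so treat this as a lower bound.
inline std::uint64_t RegistersPerBlock(BlockDims dims, std::uint32_t regsPerThread)
{
  assert(regsPerThread > 0);
  return static_cast<std::uint64_t>(dims.threads()) * regsPerThread;
}

// True when the estimated demand stays within the per-block register limit.
inline bool BlockFits(BlockDims dims,
                      std::uint32_t regsPerThread,
                      std::uint32_t regsPerBlockLimit)
{
  return RegistersPerBlock(dims, regsPerThread) <= regsPerBlockLimit;
}
```

With these assumed numbers, an 8x8x8 block has 512 threads. A kernel using 64 registers per thread needs 32768 registers and fits under a 65536 limit, while one using 160 registers per thread needs 81920 and does not.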

In the long run we should have more control over kernel launches on a per-kernel basis. This will require VTK-m to extract the number of registers used by each kernel.
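One possible shape for such per-kernel control is sketched below. It is an illustration, not the approach this merge request takes. `LaunchDims`, `ChooseLaunchDims`, and the candidate block shapes are all hypothetical. The sketch assumes the register count per thread has already been obtained, for example from the `numRegs` field that the CUDA runtime's `cudaFuncGetAttributes` reports, and that the per-block limit comes from `cudaDeviceProp::regsPerBlock`.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical launch-shape type for illustration; not a VTK-m type.
struct LaunchDims
{
  std::uint32_t x, y, z;
};

// Pick the largest candidate block shape whose estimated register demand
// (threads per block * registers per thread) fits within the per-block limit.
// The candidate list is an assumption made for this sketch.
inline LaunchDims ChooseLaunchDims(std::uint32_t regsPerThread,
                                   std::uint32_t regsPerBlockLimit)
{
  assert(regsPerThread > 0);

  // Ordered from the preferred 8x8x8 shape down to smaller fallbacks.
  static const std::array<LaunchDims, 4> candidates = {
    { { 8, 8, 8 }, { 8, 8, 4 }, { 8, 4, 4 }, { 4, 4, 4 } }
  };

  for (const LaunchDims& c : candidates)
  {
    const std::uint64_t needed =
      static_cast<std::uint64_t>(c.x) * c.y * c.z * regsPerThread;
    if (needed <= regsPerBlockLimit)
    {
      return c;
    }
  }

  // Nothing fits the estimate; fall back to the smallest candidate.
  return candidates.back();
}
```

Under these assumptions, a kernel using 64 registers per thread keeps the 8x8x8 shape against a 65536-register limit, while one using 160 registers per thread drops to 8x8x4.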
