vtkm does not compile for CUDA architectures < 60 (Solution included)
Problem
I found a compile error in:
- vtkm 2.0.0
- on Windows MSVC,
- Using CUDA 11.8, and
- CMAKE_CUDA_ARCHITECTURES set to "all-major"
Looking at the source code, the problem also appears to be present in master, but I haven't tested master because compilation takes forever. The problem should be reproducible when compiling for CUDA architecture 'sm_35', though.
The compile fails in Atomic.h because atomicAdd() is not natively supported for double on CUDA architectures < 60 (compute capability below 6.0).
Solution
The fallback implementation is actually in the source file Atomic.h, but it is named wrongly. Since there are no usages of vtkmAtomicAdd() in the whole code base, I assume it should be named AtomicAddImpl() and the current name was just a mistake during refactoring. The same applies to the implementation for vtkm::Float32 directly above, which should be renamed from vtkmAtomicAddImpl() to AtomicAddImpl().
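For readers unfamiliar with what such a fallback does: on devices without a native double atomicAdd(), the usual technique is a compare-and-swap loop on the 64-bit representation of the value (the CUDA C++ Programming Guide shows this pattern using atomicCAS() on unsigned long long int). The following is a minimal host-side C++ sketch of that idea using std::atomic so it can be compiled without a GPU. It is not the code from Atomic.h, and the name AtomicAddViaCas is hypothetical.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Hypothetical helper, NOT the VTK-m implementation.
// `target` holds the bit pattern of a double. The function atomically adds
// `val` to that double and returns the value that was stored before the add.
inline double AtomicAddViaCas(std::atomic<std::uint64_t>& target, double val)
{
  std::uint64_t oldBits = target.load();
  for (;;)
  {
    // Reinterpret the currently observed bits as a double.
    double oldVal;
    std::memcpy(&oldVal, &oldBits, sizeof(oldVal));

    // Compute the desired new value and convert it back to raw bits.
    const double newVal = oldVal + val;
    std::uint64_t newBits;
    std::memcpy(&newBits, &newVal, sizeof(newBits));

    // Succeeds only if nobody changed `target` in the meantime. On failure,
    // `oldBits` is refreshed with the current contents and the loop retries.
    if (target.compare_exchange_weak(oldBits, newBits))
    {
      return oldVal;
    }
  }
}
```

In a device-side version the std::atomic operations would be replaced by atomicCAS() and the memcpy conversions by the CUDA bit-cast intrinsics. The point of the sketch is only that the fallback does not need hardware support for double, which is why it can serve architectures below 60 once it has the name the callers expect.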
Renaming both of the implementations to AtomicAddImpl() lets me compile with the above config using "all-major", which for me goes down to 'sm_35'.