Implement Kokkos based device adapter backend
A few notes based on my investigation:
ArrayHandle
- Host side storage will be handled by current VTK-m infrastructure.
- Execution side portal access will be provided through Kokkos::View.
- To perform the memory transfer, we will have to use the Kokkos facility to form a view from a pointer to an existing memory allocation:
  - Build a Kokkos::View from the raw pointer obtained from StorageTagBasic.
  - Use Kokkos::create_mirror and Kokkos::deep_copy to transfer it to the execution space.
    - Kokkos::create_mirror is needed to create a View in a different execution/memory space that is compatible for deep-copy purposes.
  - Use a shallow copy to a Kokkos::View<const T*> for the const portal.
- The above only works for the default memory layout (AOS). For other layouts, it looks like we will have to copy into an AOS buffer before performing the transfer.
Atomic Access
- Atomic access is possible without performing a deep copy. Just shallow copy to a Kokkos::View with the “Atomic” trait. Shallow copy is performed using the copy constructors and assignment operators.
DeviceAdapterAlgorithmKokkos
- Both 1D and 3D Schedule can be implemented using Kokkos::parallel_for with execution policies:
  - RangePolicy for 1D
  - MDRangePolicy for 3D
- All execution side functions need to be tagged with the KOKKOS_INLINE_FUNCTION macro. It looks like it is the same as the VTKM_EXEC macro, so it should just work.
- There are implementations for basic sort and reduce functionalities.
- We will have to rely on DeviceAdapterAlgorithmGeneral for the rest of the algorithms and implement optimized functions for performance-critical algorithms.