Fix issue with CUDA and ArrayHandleMultiplexer
When you try to call the Reduce operation in the CUDA device adapter with a sufficiently complex iterator type, you get a compile error that says "error: cannot pass an argument with a user-provided copy-constructor to a device-side kernel launch".
This appears to be a bug in either nvcc or Thrust. I believe it is related to the following reported issues:
Work around this problem by making a special condition for calling Reduce with an ArrayHandleMultiplexer that calls the generic algorithm in DeviceAdapterAlgorithmGeneral instead of the algorithm in Thrust.
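
The dispatch pattern behind this workaround can be sketched in plain C++ as an overload that is selected only for the problematic array type and forwards to the generic implementation. This is a minimal illustration, not the actual VTK-m code: BasicArray, MultiplexedArray, GeneralAlgorithm, FastAlgorithm, and LastPath are hypothetical stand-ins, and the "fast" path uses std::accumulate in place of Thrust so the sketch compiles without CUDA.

```cpp
#include <numeric>
#include <string>
#include <vector>

// Hypothetical stand-in for an ordinary array type.
template <typename T>
struct BasicArray
{
  std::vector<T> Values;
};

// Hypothetical stand-in for an array type whose iterators are too
// complex for the specialized (Thrust-like) path.
template <typename T>
struct MultiplexedArray
{
  std::vector<T> Values;
};

// Records which implementation ran last, for demonstration only.
inline std::string& LastPath()
{
  static std::string path;
  return path;
}

// Generic implementation (plays the role of the general algorithm).
struct GeneralAlgorithm
{
  template <typename Array, typename T>
  static T Reduce(const Array& input, T initial)
  {
    LastPath() = "general";
    T result = initial;
    for (const auto& value : input.Values)
    {
      result = result + value;
    }
    return result;
  }
};

// Device-specific implementation (plays the role of the Thrust path).
struct FastAlgorithm : GeneralAlgorithm
{
  // Default: use the specialized implementation.
  template <typename Array, typename T>
  static T Reduce(const Array& input, T initial)
  {
    LastPath() = "fast";
    return std::accumulate(input.Values.begin(), input.Values.end(), initial);
  }

  // Special condition: this overload is more specialized, so overload
  // resolution prefers it for MultiplexedArray and the call is
  // forwarded to the generic implementation instead.
  template <typename U, typename T>
  static T Reduce(const MultiplexedArray<U>& input, T initial)
  {
    return GeneralAlgorithm::Reduce(input, initial);
  }
};
```

Calling FastAlgorithm::Reduce with a BasicArray takes the specialized path, while the same call with a MultiplexedArray is routed to the generic one. Callers do not change; only the overload set does.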