Kenneth Moreland authored
The implementation was calling PrepareForOutput on the delegate arrays rather than PrepareForInPlace, so when used with CUDA the existing data was not copied to the device. Also added a regression test to check this.
4d1da547
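To illustrate why the choice of preparation call matters, here is a minimal sketch using a toy model. It is not the library's actual API: `ToyArrayHandle`, `SyncToHost`, `IncrementAll`, `IncrementViaOutput`, and `IncrementViaInPlace` are hypothetical names, and the "device" is just a second `std::vector`. The assumption is that an output-style preparation only allocates device storage, while an in-place preparation first copies the existing values over.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy model, NOT the real library API: a host buffer plus a
// simulated "device" buffer, illustrating the two preparation modes.
class ToyArrayHandle
{
public:
  explicit ToyArrayHandle(std::vector<int> hostData)
    : Host(std::move(hostData))
  {
  }

  // Output mode: the caller promises to overwrite every value, so the device
  // buffer is only allocated. Existing host values are NOT transferred.
  std::vector<int>& PrepareForOutput(std::size_t numValues)
  {
    this->Device.assign(numValues, 0); // zeros stand in for uninitialized memory
    return this->Device;
  }

  // In-place mode: the caller reads and then writes existing values, so the
  // host data is copied to the device first.
  std::vector<int>& PrepareForInPlace()
  {
    this->Device = this->Host;
    return this->Device;
  }

  // Copy the simulated device buffer back to the host and return it.
  const std::vector<int>& SyncToHost()
  {
    this->Host = this->Device;
    return this->Host;
  }

private:
  std::vector<int> Host;
  std::vector<int> Device;
};

// A stand-in "kernel" that modifies values in place.
inline void IncrementAll(std::vector<int>& deviceValues)
{
  for (int& value : deviceValues)
  {
    ++value;
  }
}

// Mirrors the bug: an in-place operation that prepares its array for output,
// so the kernel never sees the original values.
inline std::vector<int> IncrementViaOutput(std::vector<int> data)
{
  const std::size_t numValues = data.size();
  ToyArrayHandle handle(std::move(data));
  IncrementAll(handle.PrepareForOutput(numValues));
  return handle.SyncToHost();
}

// Mirrors the fix: prepare for in-place access so the data reaches the device.
inline std::vector<int> IncrementViaInPlace(std::vector<int> data)
{
  ToyArrayHandle handle(std::move(data));
  IncrementAll(handle.PrepareForInPlace());
  return handle.SyncToHost();
}
```

In this toy model, `IncrementViaInPlace({1, 2, 3})` returns `{2, 3, 4}`, while `IncrementViaOutput({1, 2, 3})` returns `{1, 1, 1}` because the original values never reach the simulated device. A regression test of the kind the commit mentions could compare the two in the same way.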