Make BenchmarkArrayTransfer actually benchmark transfers
Previously, most of the benchmarks only measured the time spent reading or writing the array on the device. The transfer happened only on the first iteration; after that, the array stayed cached on the device.
This change clears out the array every iteration so that the array has to be transferred afresh.
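
As a rough illustration of the effect (not the actual benchmark code), the sketch below models a device that caches an uploaded array and counts how often a transfer really happens. The `device` and `array` types, their methods, and `countTransfers` are all hypothetical names made up for this example, and it assumes that clearing the array invalidates the cached device copy.

```go
package main

import "fmt"

// device is a hypothetical stand-in for an accelerator that keeps a copy
// of an uploaded array and counts how many transfers it performed.
type device struct {
	cached    []float32
	transfers int
}

// array is a hypothetical host array that remembers whether a device copy exists.
type array struct {
	host     []float32
	onDevice bool
}

// read returns the device copy, transferring the host data first if no copy exists.
func (d *device) read(a *array) []float32 {
	if !a.onDevice {
		d.cached = append(d.cached[:0], a.host...)
		d.transfers++
		a.onDevice = true
	}
	return d.cached
}

// clear marks the device copy as gone (an assumption about what clearing does),
// so the next read has to transfer again.
func (a *array) clear() { a.onDevice = false }

// countTransfers runs n iterations of a benchmark-style loop and reports
// how many transfers took place.
func countTransfers(n int, clearEachIter bool) int {
	d := &device{}
	a := &array{host: make([]float32, 1024)}
	for i := 0; i < n; i++ {
		if clearEachIter {
			a.clear()
		}
		d.read(a)
	}
	return d.transfers
}

func main() {
	fmt.Println(countTransfers(100, false)) // old behaviour: 1 transfer, then cached
	fmt.Println(countTransfers(100, true))  // new behaviour: 100 transfers
}
```

Without the per-iteration clear, the loop pays the transfer cost once and the remaining iterations only touch the cached copy, which is why the old numbers mostly reflected on-device access.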