Manage CUDA device memory with cudaMalloc instead of thrust::vector.
