Multi block memory

Use asynchronous memory malloc/free in CUDA
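
The description does not say which calls the change uses. As a minimal sketch, assuming it refers to the stream-ordered allocator in the CUDA runtime (`cudaMallocAsync` / `cudaFreeAsync`, available since CUDA 11.2), asynchronous allocation and free on a stream could look like the following. The `fill` kernel and the `run_async_alloc_demo` helper are hypothetical names used only for illustration and are not taken from this merge request.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: gives the buffer some work between alloc and free.
__global__ void fill(float* data, size_t n, float value) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

// Hypothetical helper: allocate, use, and free a buffer in stream order.
// Assumes n > 0 and a device that supports memory pools (CUDA 11.2+).
cudaError_t run_async_alloc_demo(size_t n) {
    cudaStream_t stream;
    cudaError_t err = cudaStreamCreate(&stream);
    if (err != cudaSuccess) return err;

    float* d_buf = nullptr;
    // The allocation is enqueued on the stream instead of blocking the device.
    err = cudaMallocAsync((void**)&d_buf, n * sizeof(float), stream);
    if (err == cudaSuccess) {
        unsigned int threads = 256;
        unsigned int blocks = (unsigned int)((n + threads - 1) / threads);
        fill<<<blocks, threads, 0, stream>>>(d_buf, n, 1.0f);
        err = cudaGetLastError();

        // The free is ordered after the kernel because both use the same stream.
        cudaError_t free_err = cudaFreeAsync(d_buf, stream);
        if (err == cudaSuccess) err = free_err;
    }

    // Wait for the queued work to finish before tearing the stream down.
    cudaError_t sync_err = cudaStreamSynchronize(stream);
    if (err == cudaSuccess) err = sync_err;
    cudaStreamDestroy(stream);
    return err;
}

int main() {
    cudaError_t err = run_async_alloc_demo(1 << 20);
    printf("%s\n", cudaGetErrorString(err));
    return err == cudaSuccess ? 0 : 1;
}
```

Unlike `cudaMalloc` and `cudaFree`, these calls take a stream argument and do not synchronize the whole device, so allocation and release can be interleaved with kernels on the same stream.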
