WIP: True virtual functions
This branch changes the behavior of VirtualObjectTransfer so that it copies a real virtual object from control (sometimes host) to execution (sometimes device), rather than setting up a bunch of function pointers to create faux virtual objects. The main reason we did not do this in the first place is that calling new on a CUDA device is prohibitively slow, so the memory has to be allocated from the host. This branch still allocates from the host. It then does a "placement new" on the device, which calls the constructor and sets up the virtual table in the pre-allocated memory. Ultimately, this requires less code to set up these virtual objects.
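
For readers unfamiliar with the pattern, here is a minimal host-only C++ sketch of "allocate raw storage in one place, placement new in another". It is not the code in this branch. Shape, Square, AllocateRaw, ConstructInPlace, and DestroyAndFree are hypothetical names made up for illustration, and std::malloc stands in for a device allocation made from the host. In a CUDA build, the construction step would have to run in device code so that the virtual table pointer is valid there.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical base class standing in for a virtual object used in execution.
struct Shape
{
  virtual ~Shape() = default;
  virtual double Area() const = 0;
};

// Hypothetical concrete type whose state is copied from control to execution.
struct Square : Shape
{
  explicit Square(double side) : Side(side) {}
  double Area() const override { return Side * Side; }
  double Side;
};

// "Control" step: allocate raw storage only. No constructor runs here.
// Assumption: in a CUDA build this would be a device allocation made from the host.
inline void* AllocateRaw(std::size_t bytes)
{
  return std::malloc(bytes);
}

// "Execution" step: construct the object in the pre-allocated storage.
// Placement new runs the copy constructor, which also installs the
// virtual table pointer for the environment the code is running in.
inline Shape* ConstructInPlace(void* storage, const Square& source)
{
  assert(storage != nullptr);
  return new (storage) Square(source);
}

// Placement new has no matching delete: call the destructor explicitly,
// then release the raw storage through the allocator that produced it.
inline void DestroyAndFree(Shape* object, void* storage)
{
  object->~Shape();
  std::free(storage);
}
```

A caller would pass sizeof(Square) to AllocateRaw, hand the result and a control-side Square to ConstructInPlace, and then call Area() through the returned Shape pointer. The storage pointer is kept separately because a base-class pointer is not guaranteed to equal the address that was allocated.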