ResampleToImage crashes when memory buffers are too big
In an attempt to resample a large (10 B cells) rectilinear grid to an Image Data using ResampleToImage(), I run into crashes due to 'std::bad_alloc'. Physical memory is not the issue, though: if I keep increasing the number of pvservers running on the same compute nodes, it works fine above a certain threshold. @kmorel had already suggested this to me, the idea being that the MPI messages are too big.

In the case at hand, I have a mesh of 25746089216 grid points. On a fixed set of 4 compute nodes, ResampleToImage crashes for every job with fewer than 128 pvservers. A few questions come to mind:
- Is there a way to do a back-of-the-envelope calculation to find the minimum number of pvserver tasks to allocate?
- Does this also depend on the shape of the grid (the ratio of its dimensions in i, j, k)?
- Could we do this in a streaming fashion with fewer pvserver tasks?
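
For the first question, here is a rough sketch of the kind of estimate I have in mind. It rests on an assumption I have not verified: that the failure comes from a single per-rank buffer or MPI message that must stay under 2^31 - 1 bytes, and that the data is split evenly across ranks. The function name `min_ranks` and the `bytes_per_point` values are hypothetical, not taken from ParaView.

```python
# Assumed cap on one buffer/message, in bytes (a signed 32-bit count).
# This is a guess at the mechanism, not a documented ParaView limit.
INT32_MAX = 2**31 - 1


def min_ranks(n_points, bytes_per_point, limit_bytes=INT32_MAX):
    """Smallest rank count for which an even per-rank share of the
    data (n_points * bytes_per_point) fits under limit_bytes."""
    total_bytes = n_points * bytes_per_point
    # Integer ceiling division, avoids float rounding on large values.
    return -(-total_bytes // limit_bytes)


if __name__ == "__main__":
    n_points = 25746089216  # grid points in the mesh described above
    for bytes_per_point in (4, 8):  # e.g. one float32 or one float64 array
        print(bytes_per_point, min_ranks(n_points, bytes_per_point))
```

Under these assumptions, one 8-byte value per point gives a minimum of 96 ranks, which is at least compatible with the job only succeeding at 128 pvservers, but it does not prove that this is the actual cause. The estimate ignores the grid shape, ghost cells, and any additional point or cell arrays.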