
Pre-allocate arrays for MergePartitionedDataSet

The initial implementation of MergePartitionedDataSet grew each merged array as it was generated. Each time a partition was visited, the arrays being merged were reallocated and the new data appended to the end. Although this works, it is slower than necessary because every reallocation has to copy all of the previously merged data into the newly allocated memory.

The new implementation first counts how large each merged array needs to be, allocates it once, and then copies the data from each partition into the appropriate location in that array.
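The difference between the two strategies can be sketched in plain C++. This is only an illustration, not the code in this merge request: it uses `std::vector<double>` in place of the real array types, and the names `Partition`, `MergeByAppending`, and `MergePreallocated` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one array belonging to one partition.
using Partition = std::vector<double>;

// Old strategy: grow the merged array as each partition is visited.
// Any insert may reallocate and copy everything merged so far.
std::vector<double> MergeByAppending(const std::vector<Partition>& partitions)
{
  std::vector<double> merged;
  for (const Partition& part : partitions)
  {
    merged.insert(merged.end(), part.begin(), part.end());
  }
  return merged;
}

// New strategy: count first, allocate once, then copy each partition
// into its own offset in the merged array.
std::vector<double> MergePreallocated(const std::vector<Partition>& partitions)
{
  // Pass 1: total number of values across all partitions.
  std::size_t total = 0;
  for (const Partition& part : partitions)
  {
    total += part.size();
  }

  // Single allocation of the final size.
  std::vector<double> merged(total);

  // Pass 2: copy each partition to its running offset.
  std::size_t offset = 0;
  for (const Partition& part : partitions)
  {
    std::copy(part.begin(), part.end(),
              merged.begin() + static_cast<std::ptrdiff_t>(offset));
    offset += part.size();
  }
  assert(offset == total); // every slot was written exactly once
  return merged;
}
```

Both functions produce the same result. The second makes an extra pass over the partitions to compute sizes, but it allocates the output once and copies each value once.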

This change also reworks the templating used to copy fields so that all field types are supported, not just those in the common types.
