
Dataset::AddField replaces existing fields with the same name

fixes: #194 (closed)

This MR changes the behavior of Dataset::AddField when two fields with the same name are added.

In our current implementation both fields are inserted, but only the first one is reachable through Dataset::GetField(name, assoc, found).

This MR introduces an invariant: only a single field with a given {name, assoc} can exist in a given Dataset.

This is implemented by replacing the data structure that holds the fields, changing it from a vector to a map.
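The new behavior can be illustrated with a minimal sketch. It is not the code from this MR: the names `SketchDataset`, `SketchField`, `Association`, and `payload` are hypothetical stand-ins, and `GetField` is simplified to return a pointer instead of taking a `found` argument. It only assumes that fields are stored in a `std::map` keyed by {name, assoc}.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for the real association and field types.
enum class Association { Points, Cells };

struct SketchField {
  std::string name;
  Association assoc = Association::Points;
  int payload = 0;  // placeholder for the handle to the actual data
};

class SketchDataset {
public:
  // Adding a field whose {name, assoc} already exists replaces the old one.
  void AddField(const SketchField& field) {
    fields_[Key{field.name, field.assoc}] = field;
  }

  // Returns nullptr when no field matches {name, assoc}.
  const SketchField* GetField(const std::string& name, Association assoc) const {
    auto it = fields_.find(Key{name, assoc});
    return it == fields_.end() ? nullptr : &it->second;
  }

  std::size_t GetNumberOfFields() const { return fields_.size(); }

private:
  using Key = std::pair<std::string, Association>;
  std::map<Key, SketchField> fields_;  // at most one entry per {name, assoc}
};

// Usage: the second "pressure" point field replaces the first one,
// while a cell field with the same name is kept as a separate entry.
inline void SketchUsage() {
  SketchDataset ds;
  ds.AddField({"pressure", Association::Points, 1});
  ds.AddField({"pressure", Association::Points, 2});
  ds.AddField({"pressure", Association::Cells, 3});
  assert(ds.GetNumberOfFields() == 2);
  assert(ds.GetField("pressure", Association::Points)->payload == 2);
}
```

With a vector and a linear search, the same sequence of calls would have stored three fields and returned the first "pressure" point field, leaving the second one unreachable.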

The map was also chosen to improve the performance of AddField / GetField, at the expense of some memory usage and a small performance cost when only a few fields are used.

Moving to non-contiguous memory reduces the benefit from the CPU cache. However, since each Field holds yet another pointer to its actual data, that benefit was small to begin with.

The choice of map vs. unordered_map comes down to the expected number of elements: the count is very likely to stay below 100, a range in which only a few rehashes would happen.

A sorted vector was also considered; however, the resizing and the shifting of elements after each insertion made it very unattractive.

Edited by Vicente Bolea
