Improve MultiDimensionBrowser for distributed environment
Update the behavior of MultiDimensionBrowser in a distributed environment.
Typical pipeline
- FilterA: produces a dataset. May have a GlobalIds array flagged.
- TemporalMultiplexing: produces a table. GlobalIds is in the array list, but it does not appear to be flagged as GlobalIds.
- MultiDimensionBrowser: never has GlobalIds flagged in its input.
Old behavior
The Index parameter is set for the hidden dimension on each rank. With N timesteps and R ranks, this leads to N×R rows for a given index. The final table therefore depends on the number of ranks, which does not make sense.
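To make the row count concrete, here is a minimal sketch of the old behavior. It is an illustration only, not the filter's code: `old_row_count` is a hypothetical helper that models each rank emitting one row per timestep for the requested index.

```python
def old_row_count(n_timesteps, n_ranks):
    """Count the rows produced when every rank contributes its own N rows.

    Hypothetical model of the old behavior: each (rank, timestep) pair
    yields one row in the gathered output table.
    """
    rows = [(rank, t) for rank in range(n_ranks) for t in range(n_timesteps)]
    return len(rows)


# Same data, same index, but the table size changes with the rank count.
single_rank = old_row_count(n_timesteps=5, n_ranks=1)
four_ranks = old_row_count(n_timesteps=5, n_ranks=4)
```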
New behavior
We aim to have N rows in the output table. Ideally, only one rank should contribute to the output, selected with a deterministic heuristic.
Generic strategy
Each rank has a hidden dimension of size M. We internally assign each rank a local offset that depends on the rank number and on the sizes of the hidden dimensions on the previous ranks, so each hidden table has a unique global id across the ranks. Then only the rank containing Index is used, with its backend set accordingly.
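The offset scheme can be sketched as an exclusive prefix sum over the per-rank sizes. This is a minimal illustration, not the actual implementation: `rank_offsets` and `owner_of` are hypothetical names, and the list `sizes` stands in for the per-rank hidden-dimension sizes, which in a real run would presumably be exchanged between ranks.

```python
from itertools import accumulate


def rank_offsets(sizes):
    """Exclusive prefix sum: the offset of a rank is the total size of the ranks before it."""
    return [0] + list(accumulate(sizes))[:-1]


def owner_of(index, sizes):
    """Return (rank, local_index) for the rank whose global id range contains index.

    Returns None when index is outside every range. Ranks with an empty
    hidden dimension own no ids and are never selected.
    """
    for rank, (offset, size) in enumerate(zip(rank_offsets(sizes), sizes)):
        if offset <= index < offset + size:
            return rank, index - offset
    return None


# Three ranks with hidden dimensions of size 3, 0 and 4.
sizes = [3, 0, 4]
offsets = rank_offsets(sizes)
selected = owner_of(3, sizes)
```

Because the ranges are disjoint by construction, exactly one rank can own a valid index, so the output no longer depends on how many ranks are running.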
Use GlobalIds array
The hidden dimension mostly matches a point in an input dataset (in the DSP plugin). Supposing GenerateGlobalIds was applied on the input dataset, we can use the resulting array.
Then only the rank whose global id array contains Index is used, with its backend set accordingly.
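The global id lookup can be sketched as follows. This is a hedged illustration: `select_rank` is a hypothetical helper, and the tie-break (the lowest rank wins when an id appears on several ranks, for example on duplicated points) is an assumption made here to keep the choice deterministic, not something the description above specifies.

```python
def select_rank(index, global_ids_per_rank):
    """Return (rank, local_position) of the rank whose global id array contains index.

    Assumption: if several ranks hold the same global id, the lowest rank
    is chosen so that the result is deterministic. Returns None if no
    rank holds the id.
    """
    for rank, ids in enumerate(global_ids_per_rank):
        ids = list(ids)
        if index in ids:
            return rank, ids.index(index)
    return None


# Global id 11 is present on ranks 0 and 1; rank 0 is selected.
global_ids_per_rank = [[10, 11], [12, 11], [13]]
selected = select_rank(11, global_ids_per_rank)
```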