Space-time decomposition in the Exodus reader.
Here is a feature request from a user.
Here is Ken's reply to that user:
SNIP,
I’m pretty sure the interleaved time slices will not work with the current Exodus reader. The reader is designed for restarted simulations and it assumes that any overlap is caused by a restart where the simulation likely crashed and caused invalid results.
Alan,
I’m adding you to this email thread I have been having with SNIP. He has a use case where he needs to read time steps that are interleaved among exodus files. (There will likely be other requirements as well.) As we start planning for the Exodus reader rewrite we should keep these on the list of things to support.
-Ken
From user: Hi Ken,
Thanks for the prompt reply!!
You are right, there are two questions here. One is getting the time-decomposed data to visualize in some reasonable way. The restart mechanism you described is adequate for this, thanks for the suggestion. I was basically missing the .e-s. ending in the file names. This works as advertised for time-continuous data.
The second question is about visualizing data with processors partitioned in space-time. Is this even needed? We have some parallel-in-time algorithms for problems where tens of thousands of time slices are distributed among hundreds of processors, in addition to spatial decomposition within each time group (also using hundreds or thousands of processors). In other words, all of time is kept in distributed memory during a simulation, in contrast to the conventional model where time is sequential and only one slice is kept in memory. To do this, we split an MPI communicator into time and space groups. However, once the simulation is done and the data is written to disk, time is effectively serialized. My guess is that some version of the restart mechanism could work. The only hurdle may be that the time slices are not in order, e.g., a time decomposition may give the following global time indices in each time group:
0 15 3 6 4 2 13 12 1 5 7 8 9 10 11 14
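The interleaved indices above could in principle be handled by a reader that sorts snapshots by their global time index before presenting them. A minimal sketch of that reordering (not what the current reader does, just the permutation involved):

```python
# Compute the permutation that puts interleaved snapshots into global
# time order. The index list is the example from this email.

def time_order(global_indices):
    """Return file-local positions sorted by global time index."""
    return sorted(range(len(global_indices)), key=global_indices.__getitem__)

indices = [0, 15, 3, 6, 4, 2, 13, 12, 1, 5, 7, 8, 9, 10, 11, 14]
order = time_order(indices)
# Visiting snapshots in `order` yields globally increasing time indices.
```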
Will the restart mechanism work as-is?
Thanks again,
From Ken:
SNIP,
There are a couple of questions in there. The first is about loading data from Exodus files where a subgroup of files holds all the spatial partitions for a range of time. ParaView's Exodus reader supports this in order to read files from restarted simulations. If you are not seeing all the time steps, it is likely that the files do not follow the naming convention that differentiates which groups of files constitute a partition of space and which constitute a partition of time. The naming conventions are documented here:
https://www.paraview.org/Wiki/Restarted_Simulation_Readers#Exodus
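For illustration only, a little name generator in the spirit of that convention. The ".e-s." restart suffix is mentioned elsewhere in this thread; the zero-padding width and the trailing "<nproc>.<rank>" spatial suffix used here are assumptions on my part -- the wiki page above is the authoritative spec:

```python
# Hypothetical name builder for restarted, spatially partitioned Exodus
# files. Padding widths and suffix order are illustrative assumptions;
# consult the ParaView wiki page for the exact rules.

def exodus_file_name(base, restart, nproc, rank):
    return f"{base}.e-s.{restart:04d}.{nproc}.{rank}"

# e.g. two restarts of a 4-way spatial partition:
names = [exodus_file_name("mysim", s, 4, r) for s in range(2) for r in range(4)]
```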
As for actually loading all the data of a mesh for some range of time on a subset of processors in ParaView, that is not really supported right now. It is possible to read in multiple time steps at once; they will be partitioned among all the processes. That's not really what you asked for, but since ParaView is often run on a smaller set of nodes than the simulation, it might be close enough.
The Exodus reader is slated for a rewrite soon. We have been working with Greg S. <snip, Exodus file guru> to support the new way in which IOSS is writing out Exodus files. They will need to be repartitioned as they are read in.
Is your need to have ParaView be able to read in multiple time steps on different partitions of nodes or simply for ParaView to be able to read data that was partitioned on the simulator like this?
-Ken
From user:
Hi Ken,
I have a quick question about ParaView. Is time decomposition supported? For example, each processor would own and output a certain number of time snapshots, and the created Exodus files would have time signatures as follows (after ncdump):
time_whole = 0, 0.1, 0.2, 0.3, 0.4 ;
...
time_whole = 6.1, 6.2, 6.3, 6.4 ;
(an example with 65 time steps). I've created such an example, with 16 files, where each file contains four snapshots with the exception of the first file (five snapshots). They are recognized as a group in ParaView, but the total number of time slices is reported as 4, not 65, so I don't think they are logically put together.
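As a sanity check on the layout described above (16 files, four snapshots each except the first, which has five), a reader that logically concatenates the files would need the per-file counts and each file's starting offset into the global time axis:

```python
# Per-file snapshot counts for the example in this email: the first file
# holds five snapshots, the remaining fifteen hold four each.
counts = [5] + [4] * 15

# Global index of each file's first snapshot, and the total step count
# (65, rather than the 4 that ParaView reports for this dataset).
offsets = [sum(counts[:i]) for i in range(len(counts))]
total = sum(counts)
```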
Taking this a step further, we would also want to combine spatial decomposition with time decomposition.
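The communicator split mentioned earlier in the thread amounts to mapping each global rank to a (time group, space rank) pair -- the same coloring an MPI_Comm_split call would use. A pure-Python sketch of just that arithmetic, with made-up group sizes:

```python
# Map a global MPI rank to its time group and its rank within that
# group's spatial communicator, assuming contiguous groups of
# space_size ranks. Illustrative only; real code would pass the first
# value as the color argument to MPI_Comm_split.

def split_rank(rank, space_size):
    return divmod(rank, space_size)  # (time_group, space_rank)

# e.g. 8 global ranks split into time groups of 4 spatial ranks each:
mapping = [split_rank(r, 4) for r in range(8)]
```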
There is a lot of new research in space-time decomposition algorithms. They may become fairly common in the exascale era.