Oscillator miniapp fails with 65k ranks
As part of testing a containerized SENSEI software stack, I ran the oscillator miniapp on Theta with 1024 nodes and 64 ranks per node (65536 ranks in total). The build consists of SENSEI 2.1.1, VTK 8.2.0, and ADIOS 1.13.1. I used the default histogram settings with 10 bins for mesh "mesh" and array "data". The config file passed to the miniapp is sample.osc.
The job failed with many error messages of the form:
Error: could not find appropriate neighbor for particle: xxxxx
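To quantify how widespread the failure is, the messages can be counted and grouped by particle id. The following is a minimal sketch, not part of SENSEI: the function name `summarize_neighbor_errors` is hypothetical, the pattern is copied from the message quoted above, and it assumes the particle id is a single whitespace-free token.

```python
import re
from collections import Counter

# Pattern copied from the error line quoted in this report. Treating the
# particle id as one non-whitespace token is an assumption.
NEIGHBOR_ERROR = re.compile(
    r"Error: could not find appropriate neighbor for particle:\s*(\S+)"
)

def summarize_neighbor_errors(lines):
    """Return (total error count, Counter of particle ids) for the log lines."""
    ids = Counter()
    for line in lines:
        match = NEIGHBOR_ERROR.search(line)
        if match:
            ids[match.group(1)] += 1
    return sum(ids.values()), ids
```

Passing an open file object for the job error file as `lines` gives the total number of messages and how many distinct particles are affected.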
In addition, the job error file contains multiple entries like this:
=========================================================
Process id 28120 Caught SIGABRT
Program Stack:
0x2aaaac67c5f0 : ??? [(???) ???:-1]
0x2aaab17c5337 : gsignal [(libc.so.6) ???:-1]
0x2aaab17c6a28 : abort [(libc.so.6) ???:-1]
0x484de4 : Block::move_particles(float, diy::Master::ProxyWithLink const&) [(oscillator) ???:-1]
0x46c3f1 : diy::Master::ProcessBlock::operator()() [(oscillator) ???:-1]
0x46d1e8 : diy::Master::execute() [(oscillator) ???:-1]
0x46d4c1 : void diy::Master::foreach_<Block>(std::function<void (Block*, diy::Master::ProxyWithLink const&)> const&, std::function<bool (int, diy::Master const&)> const&) [(oscillator) ???:-1]
0x4589ab : main [(oscillator) ???:-1]
0x2aaab17b1505 : __libc_start_main [(libc.so.6) ???:-1]
0x45b2b7 : ??? [(???) ???:-1]
=========================================================
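Since the error file contains many such dumps, it may help to check whether they all abort in the same place. The sketch below tallies the first frame of each dump that belongs to the application binary. The frame layout is inferred only from the dump above, the helper name `top_frames` is hypothetical, and it assumes the dumps from different ranks are not interleaved line by line.

```python
import re
from collections import Counter

# Frame layout inferred from the dump above:
#   <address> : <symbol> [(<module>) <file>:<line>]
FRAME = re.compile(r"^(0x[0-9a-fA-F]+) : (.*) \[\((.*?)\) (.*)\]$")

def top_frames(lines, module="oscillator"):
    """Count, per dump, the first frame whose module matches `module`."""
    counts = Counter()
    in_dump = False   # currently inside a "Program Stack:" section
    found = False     # a frame was already recorded for this dump
    for raw in lines:
        line = raw.strip()
        if line == "Program Stack:":
            in_dump, found = True, False
            continue
        if not in_dump:
            continue
        match = FRAME.match(line)
        if match is None:
            # A separator or unrelated text ends the current dump.
            in_dump = False
            continue
        if not found and match.group(3) == module:
            counts[match.group(2)] += 1
            found = True
    return counts
```

If every dump in the error file reports `Block::move_particles` as its first `oscillator` frame, the aborts most likely share the code path shown above.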
This same configuration worked fine with 128 nodes and 64 ranks per node (8192 ranks in total).
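One difference between the two runs that may be worth comparing is how small each block becomes as the rank count grows. The sketch below is illustrative arithmetic only. It assumes one block per rank, a hypothetical global grid of 64 x 64 x 64 cells (the actual shape in sample.osc is not stated here), and a simple balanced three-way factorization that is not necessarily what DIY does.

```python
def ranks(nodes, ranks_per_node):
    """Total MPI ranks for a job."""
    return nodes * ranks_per_node

def balanced_3d_factors(n):
    """Split n into factors (a, b, c), a <= b <= c, with the smallest spread.

    Illustrative stand-in for a regular decomposition, not DIY's algorithm.
    """
    best = None
    a = 1
    while a * a * a <= n:
        if n % a == 0:
            m = n // a
            b = a
            while b * b <= m:
                if m % b == 0:
                    c = m // b
                    if best is None or c - a < best[2] - best[0]:
                        best = (a, b, c)
                b += 1
        a += 1
    return best

def cells_per_block(grid_shape, blocks_per_axis):
    """Cells along each axis of one block, for a hypothetical grid shape."""
    return tuple(g / b for g, b in zip(grid_shape, blocks_per_axis))

HYPOTHETICAL_GRID = (64, 64, 64)  # assumption, not taken from sample.osc

for nodes in (128, 1024):
    total = ranks(nodes, 64)
    blocks = balanced_3d_factors(total)
    print(total, blocks, cells_per_block(HYPOTHETICAL_GRID, blocks))
```

Under these assumptions the blocks in the 65536-rank run are half as wide along each axis as in the 8192-rank run. Whether that is related to the neighbor lookup failure in `Block::move_particles` is not established by this report.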
I will keep all log files related to the failed run in case additional information is needed.
Thanks, Silvio
Edited by Silvio Rizzi