Very interesting. I usually turn examples off for such builds, which is likely why I have not seen this. Shall we report this upstream with blosc2?
Why can ParaView's transform options (either as a filter or in the renderer) not address this issue? This really does not look like something we should be double bookkeeping on the data reader side :)
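For example, here is a rough sketch of that approach from ParaView's Python shell; the file name and the uniform scale factor are placeholders, not a tested recipe:

```python
from paraview.simple import OpenDataFile, Transform, Show, Render

reader = OpenDataFile('simData.pmd')   # placeholder openPMD series
reader.UpdatePipeline()
xmin, xmax, ymin, ymax, zmin, zmax = reader.GetDataInformation().GetBounds()
longest = max(xmax - xmin, ymax - ymin, zmax - zmin)

# rescale in the pipeline, not in the reader
rescale = Transform(Input=reader)
rescale.Transform.Scale = [1.0 / longest] * 3
Show(rescale)
Render()
```

The same can be done interactively with the Transform filter in the GUI.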
I am OK with adding a GUI option "rescale all extents and positions to unit box" for openPMD data [1], but I would like to keep the amount of hacks in the read logic minimal.
[1] Note that this will also have alignment issues if not all fields and particle extents are aligned in a data set, which is a valid case to load as well.
Thank you for the report and digging into this.
The current solution appears incomplete to me, and I would prefer not to merge it in this form yet. Please see the details inline.
The name _rescaleFactor is not descriptive enough.
Rescaling could be:
Also, this is currently only applied to fields (meshes), not to particles, which leaves data sets that contain both broken (half-scaled in extent).
The current solution looks like it works around a rendering issue inside the data reader, which is not ideal. It also extends the user-facing API of the .pmd file, which I want to keep very simple.
If we really want to merge something like this, I would propose implementing a "normalize extent to unit box" button or similar in the GUI when loading openPMD files, which rescales particle positions and grid spacing / offset equally.
I would like this to not be part of the .pmd file but part of the ParaView GUI, if that is possible.
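For illustration, this is roughly what "rescale equally" would mean in terms of the openPMD data model, sketched with the openpmd-api Python bindings; the mesh name "E", the species name "electrons", the first-iteration shortcut, and the omission of positionOffset are all simplifications:

```python
import openpmd_api as io

series = io.Series("simData.pmd", io.Access.read_only)
_, step = next(iter(series.iterations.items()))    # first iteration only

E = step.meshes["E"]                               # placeholder mesh name
_, comp = next(iter(E.items()))                    # any component, just for its shape
extent_si = [n * dx * E.grid_unit_SI
             for n, dx in zip(comp.shape, E.grid_spacing)]
factor = 1.0 / max(extent_si)                      # one common factor for everything

# the same factor applies to the grid metadata ...
spacing = [dx * E.grid_unit_SI * factor for dx in E.grid_spacing]
offset = [o * E.grid_unit_SI * factor for o in E.grid_global_offset]

# ... and to the particle positions, so fields and particles stay aligned
x = step.particles["electrons"]["position"]["x"]   # placeholder species name
pos_x = x.load_chunk()
series.flush()
pos_x = pos_x * x.unit_SI * factor
```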
@berkgeveci can we add a GUI field for this?
@BenWibking please add an inline doc string, as we do for the other parameters in the lines above.
I think there is a detail we missed in !1109. In ADIOS 2.9, the compressor c-blosc was updated to c-blosc2.
Can you potentially update this as well for the ParaView superbuild? :) (c-blosc2 in ADIOS2 is backward compatible with c-blosc files.)
@vbolea @caitlin.ross Thank you for this update!
I think there is a detail we missed in this PR. In ADIOS 2.9, the compressor c-blosc was updated to c-blosc2.
Can you potentially update this as well for the ParaView superbuild? :) (c-blosc2 in ADIOS2 is backward compatible with c-blosc files.)
(Tracking in #263)
Yes, absolutely. Please update to openPMD-api 0.15.2, which plays well with ADIOS2 2.9.
Wohoo! :) Thanks!
An early attempt by one of our AMReX co-developers (Ben Wibking) hacks nvc++ -cuda as CMAKE_CUDA_COMPILER like this:
-DCMAKE_C_COMPILER=nvc -DCMAKE_CXX_COMPILER=nvc++ -DCMAKE_CUDA_COMPILER=nvc++ -DCMAKE_CUDA_COMPILER_ID=NVHPC -DCMAKE_CUDA_ARCHITECTURES=80 -DCMAKE_CUDA_COMPILER_FORCED=ON -DCMAKE_CUDA_COMPILE_FEATURES=cuda_std_17 -DAMReX_GPU_RDC=OFF -DCMAKE_CXX_FLAGS="-cuda" -DCMAKE_CUDA_FLAGS="-cuda"
Update: officially, it is not -x cuda or -x cu but -cuda.
Update: NVHPC supports compiling CUDA C++, e.g., to translate the parallel STL implementation they ship. I tested this with NVHPC 22.7 on an A100 for a simple example, and it compiled and ran :)
The feature is a frontend flag, nvc++ -cuda ..., for translating CUDA C++ sources.
@robertmaynard @tambre @brad.king et al., do you think we could add support for this in CMake's CUDA compilers (CMAKE_CUDA_COMPILER=nvc++ and no CMAKE_CUDA_HOST_COMPILER) so people can experiment with it in their CUDA C++ projects?
$ nvc++ --version
nvc++ 22.7-0 64-bit target on x86-64 Linux -tp zen3
NVIDIA Compilers and Tools
Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
$ nvc++ --help
...
-cudalib[=cublas|cufft:{callback}|cufftw|cufftmp|curand|cusolver|cusolvermp|cusparse|cutensor|nvblas|nccl|nvshmem|nvlamath|nvlamath_ilp64|nvtx3]
Add appropriate versions of the CUDA-optimized libraries
...
-cusuffix=<suff> Define a CUDA suffix
...
-cuda[=charstring|madconst]
Enable CUDA; please refer to -gpu for target specific options
charstring Enable limited support for character strings in GPU kernels
madconst Put Module Array Descriptors in CUDA Constant Memory
...
-gpu=ccnative|ccXY|ccall|ccall-major|cudaX.Y|[no]debug|fastmath|[no]flushz|nvlamath|[no]fma|keep|[no]lineinfo|zeroinit|[no]autocollapse|deepcopy|loadcache:{L1|L2}|maxregcount:<n>|pinned|[no]rdc|safecache|stacklimit:<l>|nostacklimit|ptxinfo|[no]unroll|[no]managed|beta|autocompare|redundant|[no]implicitsections
Select specific options for GPU code generation
ccnative Detects the visible GPUs on the system and generates codes for them. If no device is available, the compute capability matching NVCC's default will be used.
ccXY Compile for compute capability X.Y; supported values: 35,50,60,61,62,70,72,75,80,86
ccall Compile for all supported compute capabilities
ccall-major Compile for all major supported compute capabilities.
cudaX.Y Use CUDA X.Y Toolkit compatibility, where installed
[no]debug Generate GPU debug information
fastmath Use fast math library
[no]flushz Enable flush-to-zero mode on the GPU
nvlamath Use nvlamath module in program units
[no]fma Generate fused mul-add instructions (default at -O3)
keep Keep kernel files
[no]lineinfo Generate GPU line information
zeroinit Initialize allocated device memory with zero
[no]autocollapse
Automatically collapse tightly nested OpenACC parallel loops
deepcopy Enable Full Deepcopy support in OpenACC Fortran
loadcache Choose what hardware level cache to use for global memory loads
L1 Use L1 cache
L2 Use L2 cache
maxregcount:<n> Set maximum number of registers to use on the GPU
pinned Use CUDA Pinned Memory
[no]rdc Generate relocatable device code
safecache Allows variable-sized array sections in OpenACC cache directives and assumes they fit into CUDA shared memory
stacklimit:<l>|nostacklimit
Sets the limit of stack variables in a procedure or kernel, in KB
ptxinfo Print ptxas information
[no]unroll Enable automatic inner loop unrolling (default at -O3)
[no]managed Use CUDA Managed Memory
beta Enable beta code generation features
autocompare Automatically compare OpenACC CPU/GPU results: implies redundant
redundant Redundant OpenACC CPU/GPU execution
[no]implicitsections
Implicitly convert array element to array section in OpenMP or OpenACC data clauses
...
--c++03 C++98/C++03 language dialect
--c++11 C++11 language dialect
--c++14 C++14 language dialect
--c++17 C++17 language dialect
--c++20 C++20 language dialect
--c++23 C++23 language dialect
--c++98 Implies --c++03
-c11 C11 language dialect
-c17 C17 language dialect
-c18 Implies -c17
-c89 C89/C90 language dialect
-c90 Implies -c89
-c99 C99 language dialect
...
Do: test
@mwestphal done
@mwestphal thanks for the guidance, I was not aware of this workflow over here
Pushed now.
Please excuse the delay, I was traveling for work and then on vacation over the last month.
@ward-hpe Do you know the equivalent name key in PE_PKGCONFIG_PRODUCTS for Cray-HDF5? I tried PE_HDF5 and HDF5 without success, and I would generally need to add a similar workflow to other Cray-shipped products (e.g., FindHDF5.cmake).
@mwestphal Thank you for the ping, this is ready for review.
This PR adds a decent amount of inline documentation for existing logic and future directions. @berkgeveci and I continue to work on this reader for a variety of features.
I would suggest merging these doc strings now (freshly rebased), so we can build future updates on top of them and avoid rebasing against these doc-and-formatting changes.