HIP: Does not fully work with ROCm 6 packages in Fedora 40
While updating our CI base image from Fedora 39 to Fedora 40, I've not been able to get all our HIP tests passing.
There are a few problems that did not occur in Fedora 39's ROCm/HIP packages, but do with Fedora 40:
- The HIP tests fail to see the AMD device inside the CI job. If I run a container manually on the same host, they work fine. The failure occurs with the same GitLab CI runner under which the HIP tests in a CI job in a Fedora 39 container are able to see the AMD device.
- The `HIP.MathFunctions` test hangs when the test binary runs.
- The `rocm-device-libs` package only provides files for Clang 17, but Fedora 40's default `clang` package is `clang-18`. Therefore `rocm-hip-devel` requires an extra `clang-17` package. Also, if we want to use `HIPCXX=/usr/bin/clang++-18`, we need an additional `HIPFLAGS='--rocm-path=/usr --rocm-device-lib-path=/usr/lib/clang/17/amdgcn/bitcode'`.
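For reference, the Clang 18 workaround from the last item could be set up in a CI job's shell roughly as follows. This is only a sketch: the paths are the ones quoted above, while the existence check and the commented-out `cmake` invocation are illustrative assumptions, not part of our actual CI scripts.

```shell
# Workaround sketch: use Clang 18 as the HIP compiler while pointing it at
# the device libraries that rocm-device-libs installs under Clang 17's tree.
export HIPCXX=/usr/bin/clang++-18
export HIPFLAGS='--rocm-path=/usr --rocm-device-lib-path=/usr/lib/clang/17/amdgcn/bitcode'

# Illustrative sanity check (an assumption, not from the report): warn early
# if the Clang 17 bitcode directory is not installed.
bitcode_dir=/usr/lib/clang/17/amdgcn/bitcode
if [ ! -d "$bitcode_dir" ]; then
  echo "warning: $bitcode_dir not found; is rocm-device-libs installed?" >&2
fi

# A fresh configure would then pick both variables up from the environment,
# e.g. (hypothetical build directory name):
#   cmake -S . -B build
```

Both variables are only consulted on the first configure of a build tree, so an existing tree would need to be wiped for the change to take effect.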
In order to move forward with updating our CI to Fedora 40, I'm dropping the ROCm/HIP packages from the base image for now.