gitlab-ci: add jobs testing CUDA 11.6 with nvcc and clang 13

Also run the CUDA and HIP test jobs in every non-MR pipeline. Previously these jobs ran only in the scheduled nightly pipeline. They should also run in pipelines on integration branches, particularly the release branch.

