Use __CUDA_ARCH__ instead of __CUDACC__ to guard device-only code.
__CUDACC__ is defined whenever nvcc compiles a CUDA source file, in both the host and device passes, while __CUDA_ARCH__ is defined only during the device pass.
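
A minimal sketch of the two guards, for illustration only. The macro HOST_DEVICE and the function fast_rsqrt are hypothetical names, not taken from this change. The compiler-level guard hides CUDA qualifiers from a plain C++ compiler, and the device-pass guard selects the device-only intrinsic:

```cpp
// Sketch only: HOST_DEVICE and fast_rsqrt are hypothetical names.
#include <cmath>

// Compiler-level guard: true for the whole translation unit under nvcc,
// in both the host and device passes. Use it to hide CUDA-specific syntax
// from a plain C++ compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

HOST_DEVICE inline float fast_rsqrt(float x) {
#ifdef __CUDA_ARCH__
    // Device pass only: rsqrtf is a CUDA device math function.
    return rsqrtf(x);
#else
    // Host pass (nvcc or a plain C++ compiler): portable fallback.
    return 1.0f / std::sqrt(x);
#endif
}
```

Guarding the rsqrtf call with the compiler-level macro instead would also select it during nvcc's host pass, which is the case the device-pass guard avoids.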