CUDA kernels instead of OpenACC
@jahns Currently, the packing/unpacking kernels for GPU support are generated using OpenACC. Example for packing 8-bit (int8_t) data:
static void xt_ddt_pack_8(
  size_t count, ssize_t *restrict displs, void const *restrict src,
  void *restrict dst, enum xt_memtype memtype) {
  XtPragmaACC(
    parallel loop independent deviceptr(src, dst, displs)
      if (memtype != XT_MEMTYPE_HOST))
  for (size_t i = 0; i < count; ++i)
    ((int8_t*)dst)[i] = *(int8_t*)((unsigned char *)src + displs[i]);
}
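For comparison, a hand-written CUDA counterpart of this kernel could look roughly like the sketch below. The kernel name xt_ddt_pack_8_cuda, the 1-D launch layout, and the use of long long for the displacements (so the same source also compiles under NVRTC without POSIX headers) are assumptions for illustration, not existing yaxt code:

/* Illustrative CUDA counterpart of xt_ddt_pack_8 (names/types assumed). */
extern "C" __global__ void xt_ddt_pack_8_cuda(
    unsigned long long count, long long const *displs,
    void const *src, void *dst) {
  unsigned long long i =
    (unsigned long long)blockIdx.x * blockDim.x + threadIdx.x;
  if (i < count)
    ((signed char *)dst)[i] =
      *(signed char const *)((unsigned char const *)src + displs[i]);
}

The memtype check would move to the host side: the kernel is only launched when the buffers live in device memory, while host buffers keep the existing CPU path.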
Alternatively, we could write the kernels in CUDA code and compile them at runtime using NVRTC (see the sketch after the list below). This approach is somewhat more complex for us, but the advantages would be:
- no dependencies on OpenACC
- we could compile at runtime for the architecture that is actually being used
- configure would not have to determine any compile/link flags for the CUDA support (nor would the user have to provide them); specifying the CUDA root directory would be sufficient
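A minimal sketch of how such a kernel could be compiled at runtime with NVRTC and loaded through the driver API, assuming the kernel source above is kept as a string. All names are illustrative, most error handling is omitted, and the target architecture is queried from the device that is actually present (which covers the second point above):

#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>
#include <nvrtc.h>

/* Kernel source as it would be handed to NVRTC; in yaxt this string
 * could be assembled per element size instead of being hard-coded. */
static const char *kernel_src =
  "extern \"C\" __global__ void xt_ddt_pack_8_cuda(\n"
  "    unsigned long long count, long long const *displs,\n"
  "    void const *src, void *dst) {\n"
  "  unsigned long long i =\n"
  "    (unsigned long long)blockIdx.x * blockDim.x + threadIdx.x;\n"
  "  if (i < count)\n"
  "    ((signed char *)dst)[i] =\n"
  "      *(signed char const *)((unsigned char const *)src + displs[i]);\n"
  "}\n";

/* Compile the kernel for the compute capability of device 0 and return
 * a ready-to-launch CUfunction (illustrative helper, not yaxt API). */
static CUfunction compile_pack_kernel(void) {
  /* query the architecture that is actually being used */
  int major, minor;
  CUdevice dev;
  cuInit(0);
  cuDeviceGet(&dev, 0);
  cuDeviceGetAttribute(&major, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR, dev);
  cuDeviceGetAttribute(&minor, CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR, dev);
  char arch_opt[64];
  snprintf(arch_opt, sizeof(arch_opt), "--gpu-architecture=compute_%d%d",
           major, minor);

  /* compile CUDA source to PTX with NVRTC */
  nvrtcProgram prog;
  nvrtcCreateProgram(&prog, kernel_src, "xt_ddt_pack_8.cu", 0, NULL, NULL);
  const char * const opts[] = {arch_opt};
  if (nvrtcCompileProgram(prog, 1, opts) != NVRTC_SUCCESS) {
    size_t log_size;
    nvrtcGetProgramLogSize(prog, &log_size);
    char *log = malloc(log_size);
    nvrtcGetProgramLog(prog, log);
    fprintf(stderr, "NVRTC: %s\n", log);
    exit(EXIT_FAILURE);
  }
  size_t ptx_size;
  nvrtcGetPTXSize(prog, &ptx_size);
  char *ptx = malloc(ptx_size);
  nvrtcGetPTX(prog, ptx);
  nvrtcDestroyProgram(&prog);

  /* load the PTX and fetch the kernel handle via the driver API
   * (in real code the module handle would be kept for cuModuleUnload) */
  CUcontext ctx;
  cuDevicePrimaryCtxRetain(&ctx, dev);
  cuCtxSetCurrent(ctx);
  CUmodule module;
  CUfunction kernel;
  cuModuleLoadData(&module, ptx);
  cuModuleGetFunction(&kernel, module, "xt_ddt_pack_8_cuda");
  free(ptx);
  return kernel;
}

Launching would then go through cuLaunchKernel, with count, displs, src, and dst (device pointers) packed into a kernel-parameter array, e.g. a 1-D grid of 128-thread blocks covering count elements.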