Effective Host-GPU Memory Management through Code Generation

Hans-Nikolai Vießmann, Sven-Bodo Scholz

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

NVIDIA's CUDA provides several options for orchestrating the management of host and device memory and the communication between them. In this paper we examine these choices, identify the program changes required to switch between them, and observe their effect on application performance. We present code generation schemes that translate resource-agnostic program specifications, i.e., programs without any explicit notion of memory or GPU kernels, into five CUDA versions that differ only in their use of CUDA's memory and communication API. An implementation of these code generators within the compiler of the functional programming language Single-Assignment C (SaC) shows performance differences between the variants of up to a factor of 3. Performance analyses reveal that the preferred choice depends on a combination of several factors, including the hardware being used and several aspects of the application itself. A clear choice, therefore, cannot be made a priori. Instead, it seems essential that different variants can be generated from a single source to achieve performance portability across GPU devices.
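To make the kind of choice discussed in the abstract concrete, the following is a minimal hand-written sketch, not the code that the SaC compiler generates. The abstract does not name its five variants, so the three shown here (explicit synchronous copies, pinned host memory with asynchronous copies, and managed memory) are assumptions chosen only because they are commonly used CUDA runtime options. The names `scale`, `launch`, `run_explicit`, `run_pinned_async` and `run_managed` are hypothetical. The kernel is the same in all three; only the allocation and transfer calls change.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                               \
  do {                                                            \
    cudaError_t err_ = (call);                                    \
    if (err_ != cudaSuccess) {                                    \
      std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                   cudaGetErrorString(err_));                     \
      std::exit(EXIT_FAILURE);                                    \
    }                                                             \
  } while (0)

// Identical in every variant: doubles each element in place.
__global__ void scale(float *a, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) a[i] *= 2.0f;
}

// Launch the kernel on the given stream (default stream if omitted).
static void launch(float *a, int n, cudaStream_t s = 0) {
  scale<<<(n + 255) / 256, 256, 0, s>>>(a, n);
  CHECK(cudaGetLastError());
}

// Variant A (assumed): pageable host memory, explicit synchronous copies.
float run_explicit(int n) {
  size_t bytes = n * sizeof(float);
  float *host = (float *)std::malloc(bytes);
  float *dev = nullptr;
  for (int i = 0; i < n; ++i) host[i] = 1.0f;
  CHECK(cudaMalloc(&dev, bytes));
  CHECK(cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice));
  launch(dev, n);
  CHECK(cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost));
  float result = host[n - 1];
  CHECK(cudaFree(dev));
  std::free(host);
  return result;
}

// Variant B (assumed): pinned host memory, asynchronous copies on a stream.
float run_pinned_async(int n) {
  size_t bytes = n * sizeof(float);
  float *host = nullptr, *dev = nullptr;
  cudaStream_t s;
  CHECK(cudaStreamCreate(&s));
  CHECK(cudaMallocHost(&host, bytes));
  CHECK(cudaMalloc(&dev, bytes));
  for (int i = 0; i < n; ++i) host[i] = 1.0f;
  CHECK(cudaMemcpyAsync(dev, host, bytes, cudaMemcpyHostToDevice, s));
  launch(dev, n, s);
  CHECK(cudaMemcpyAsync(host, dev, bytes, cudaMemcpyDeviceToHost, s));
  CHECK(cudaStreamSynchronize(s));  // wait before reading the result
  float result = host[n - 1];
  CHECK(cudaFree(dev));
  CHECK(cudaFreeHost(host));
  CHECK(cudaStreamDestroy(s));
  return result;
}

// Variant C (assumed): managed memory, no explicit copies in the source.
float run_managed(int n) {
  size_t bytes = n * sizeof(float);
  float *a = nullptr;
  CHECK(cudaMallocManaged(&a, bytes));
  for (int i = 0; i < n; ++i) a[i] = 1.0f;
  launch(a, n);
  CHECK(cudaDeviceSynchronize());  // wait before the host touches a again
  float result = a[n - 1];
  CHECK(cudaFree(a));
  return result;
}

int main() {
  const int n = 1 << 20;
  std::printf("explicit: %.1f\n", run_explicit(n));
  std::printf("pinned:   %.1f\n", run_pinned_async(n));
  std::printf("managed:  %.1f\n", run_managed(n));
  return 0;
}
```

Even in this small sketch, switching variants touches allocation, transfer, synchronisation and deallocation calls at once, which suggests why generating the variants from a single resource-agnostic source is attractive. The sketch does not measure anything; relative performance would have to be timed on the target hardware.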

Original language: English
Title of host publication: IFL 2020: Proceedings of the 32nd Symposium on Implementation and Application of Functional Languages
Editors: Olaf Chitil
Publisher: Association for Computing Machinery
Number of pages: 12
ISBN (Electronic): 9781450389631
Publication status: Published - 2 Sept 2020
Event: 32nd Symposium on Implementation and Application of Functional Languages 2020 - Virtual, Online, United Kingdom
Duration: 2 Sept 2020 to 4 Sept 2020


Conference: 32nd Symposium on Implementation and Application of Functional Languages 2020
Abbreviated title: IFL 2020
Country/Territory: United Kingdom
City: Virtual, Online


  • code generation
  • communication models
  • CUDA
  • GPU
  • memory management
  • SaC
  • transfer bandwidth

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software

