Algorithmic Skeleton Framework for the Orchestration of GPU Computations

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

21 Citations (Scopus)


The Graphics Processing Unit (GPU) is gaining popularity as a co-processor to the Central Processing Unit (CPU). However, harnessing its capabilities is a non-trivial exercise that requires good knowledge of parallel programming, all the more so as the complexity of GPU applications keeps rising. Languages such as StreamIt [1] and Lime [2] have addressed the offloading of composed computations to GPUs. However, to the best of our knowledge, no such support exists at library level. To this end, we propose Marrow, an algorithmic skeleton framework for the orchestration of OpenCL computations. Marrow expands the set of skeletons currently available for GPU computing, and enables their combination, through nesting, into complex structures. Moreover, it introduces optimizations that overlap communication and computation, thus combining programming simplicity with performance gains in many application scenarios. We evaluated the framework from a performance perspective, comparing it against hand-tuned OpenCL programs. The results are favourable, indicating that Marrow's skeletons are both flexible and efficient in the context of GPU computing.
Original language: Unknown
Title of host publication: Lecture Notes in Computer Science
ISBN (Electronic): 978-3-642-40047-6
Publication status: Published - 1 Jan 2013
Event: Euro-Par 2013 - 19th International Conference on Parallel Processing
Duration: 1 Jan 2013 → …


Conference: Euro-Par 2013 - 19th International Conference on Parallel Processing
Period: 1/01/13 → …
