Breaking the GPU programming barrier with the auto-parallelising SAC compiler

J. Guo, J. Thiyagalingam, S. Scholz

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    31 Citations (Scopus)

    Abstract

    Over recent years, the use of Graphics Processing Units (GPUs) for general-purpose computing has become increasingly popular. The main reasons for this development are the attractive performance/price and performance/power ratios of these architectures. However, substantial performance gains from GPUs come at a price: they require extensive programming expertise and, typically, a substantial re-coding effort. Although the programming experience has been significantly improved by existing frameworks such as CUDA and OpenCL, it is still a challenge to utilise these devices effectively. Directive-based approaches such as hiCUDA or OpenMP variants offer further improvements but have not eliminated the need for expertise in these complex architectures. Similarly, special-purpose programming languages such as Microsoft's Accelerator try to lower the barrier further. They provide the programmer with a special form of GPU data structures and operations on them, which are then compiled into GPU code. In this paper, we take this trend towards a completely implicit, high-level approach yet another step further. We generate CUDA code from a MATLAB-like high-level functional array programming language, Single Assignment C (SAC). To do so, we identify which data structures and operations can be successfully mapped onto GPUs and transform existing programs accordingly. This paper presents the first runtime results from our GPU backend, together with the basic set of GPU-specific program optimisations that turned out to be essential. Despite our high-level program specifications, we show that speedups between a factor of 5 and 50 can be achieved for a number of benchmarks through our parallelising compiler.
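
    The abstract describes compiling data-parallel array operations from SAC into CUDA kernels. As an illustration only (not taken from the paper), the sketch below shows the kind of CUDA code an auto-parallelising compiler might emit for a simple element-wise operation such as b = a*a + 1 over a one-dimensional array; the kernel and host wrapper names are hypothetical, and the real SAC backend is considerably more involved.

    // Illustrative sketch only: CUDA code of the kind a compiler might emit
    // for an element-wise SAC-style array operation b = a*a + 1.
    // All names (elemwise_kernel, run_elemwise) are assumptions for this example.
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void elemwise_kernel(const double *a, double *b, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;   // one thread per element
        if (i < n)
            b[i] = a[i] * a[i] + 1.0;
    }

    void run_elemwise(const double *host_a, double *host_b, int n)
    {
        double *dev_a, *dev_b;
        size_t bytes = n * sizeof(double);

        // Host-to-device transfer of the argument array.
        cudaMalloc(&dev_a, bytes);
        cudaMalloc(&dev_b, bytes);
        cudaMemcpy(dev_a, host_a, bytes, cudaMemcpyHostToDevice);

        // Launch enough blocks to cover all n elements.
        int threads = 256;
        int blocks = (n + threads - 1) / threads;
        elemwise_kernel<<<blocks, threads>>>(dev_a, dev_b, n);

        // Device-to-host transfer of the result array.
        cudaMemcpy(host_b, dev_b, bytes, cudaMemcpyDeviceToHost);
        cudaFree(dev_a);
        cudaFree(dev_b);
    }

    int main(void)
    {
        const int n = 8;
        double a[n], b[n];
        for (int i = 0; i < n; ++i) a[i] = (double)i;
        run_elemwise(a, b, n);
        for (int i = 0; i < n; ++i) printf("%g ", b[i]);   // prints: 1 2 5 10 17 26 37 50
        printf("\n");
        return 0;
    }

    Redundant host-device transfers such as those above are exactly the kind of overhead the paper's GPU-specific optimisations target when operations are composed.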
    Original language: English
    Title of host publication: Proceedings of the 6th ACM Workshop on Declarative Aspects of Multicore Programming (DAMP '11)
    Publisher: ACM Press
    Pages: 15-23
    ISBN (Print): 978-1-4503-0486-3
    Publication status: Published - 2011

    Keywords

    • code
    • compiler
    • CUDA
    • generation
    • GPU
    • optimization
