An explicitly declared delayed-branch mechanism for a superscalar architecture

R. Collins, G.B. Steven

    Research output: Book/Report › Other report


    One of the main obstacles to exploiting the fine-grained parallelism available in general-purpose code is the frequency of branches, which cause unpredictable changes in a program's control flow at run time. Whenever a branch is taken, a performance penalty may be incurred while the processor waits for instructions to be fetched from the branch target stream. RISC processors introduce a delayed-branch mechanism that defines branch delay slots into which code can be scheduled. This strategy keeps the processor busy executing useful instructions while the change of control flow takes place. While the concept of delayed branches can be readily extended to VLIW architectures, it is less clear how it should be incorporated into a superscalar architecture. This paper proposes a general branch-delay mechanism that is suitable for a range of code-compatible superscalar processors and that completely avoids the need to introduce NOPs into the code. The technique was developed as an integral part of the HSP superscalar project. HSP is a superscalar architecture currently being developed at the University of Hertfordshire with the aim of using compile-time instruction scheduling to achieve an order-of-magnitude speed-up over traditional RISC architectures for a suite of non-numeric benchmark programs.
    Original language: English
    Publisher: University of Hertfordshire
    Publication status: Published - 1994

    Publication series

    Name: UH Computer Science Technical Report
    Publisher: University of Hertfordshire


    • instruction-level parallelism
    • code scheduling
    • conditional branches
    • delayed branch

