Evaluating Three Approaches to Extracting Fault Data from Software Change Repositories

Tracy Hall, David Bowes, Gernot Liebchen, Paul Wernick

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    8 Citations (Scopus)


    Software products can only be improved if we have a good understanding of the faults they typically contain. Code faults are a significant source of software product problems which we currently do not understand sufficiently. Open source change repositories are potentially a rich and valuable source of fault data for both researchers and practitioners. Such fault data can be used to better understand current product problems so that we can predict and address future product problems. However, extracting fault data from change repositories is difficult. In this paper we compare the performance of three approaches to extracting fault data from the change repository of the Barcode Open Source System. Our main finding is that we have most confidence in our manual evaluation of diffs to identify fault-fixing changes. We had less confidence in the ability of the two automatic approaches to separate fault-fixing from non-fault-fixing changes. We conclude that it is very difficult to reliably extract fault-fixing data from change repositories, especially using automatic tools, and that we need to be cautious when reporting or using such data.

    Original language: English
    Title of host publication: Product-Focused Software Process Improvement
    Subtitle of host publication: Procs of 11th Int Conf PROFES 2010
    Editors: MA Babar, M Vierimaa, M Oivo
    Place of Publication: BERLIN
    Publisher: Springer Nature
    Number of pages: 9
    ISBN (Print): 978-3-642-13791-4
    Publication status: Published - 2010
    Event: 11th Int Conf, PROFES 2010 - Limerick, Ireland
    Duration: 21 Jun 2010 to 23 Jun 2010

    Publication series

    Name: Lecture Notes in Computer Science


    Conference: 11th Int Conf, PROFES 2010


    • Software
    • fault
    • data
    • prediction

