University of Hertfordshire

The Jinx on the NASA software defect data sets

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution

Original language: English
Title of host publication: Proceedings of the 20th International Conference on Evaluation and Assessment in Software Engineering (EASE)
Place of Publication: NY, New York
Publisher: ACM Press
Volume: 01-03-June-2016
ISBN (Print): 9781450336918
DOI: 10.1145/2915970.2916007
Publication status: Published - 1 Jun 2016
Event: 20th International Conference on Evaluation and Assessment in Software Engineering, EASE 2016 - Limerick, Ireland
Duration: 1 Jun 2016 to 3 Jun 2016

Conference

Conference: 20th International Conference on Evaluation and Assessment in Software Engineering, EASE 2016
Country: Ireland
City: Limerick
Period: 1/06/16 to 3/06/16

Abstract

Background: The NASA datasets have previously been used extensively in studies of software defects. In 2013 Shepperd et al. presented an essential set of rules for removing erroneous data from the NASA datasets making this data more reliable to use. Objective: We have now found additional rules necessary for removing problematic data which were not identified by Shepperd et al. Results: In this paper, we demonstrate the level of erroneous data still present even after cleaning using Shepperd et al.'s rules and apply our new rules to remove this erroneous data. Conclusion: Even after systematic data cleaning of the NASA MDP datasets, we found new erroneous data. Data quality should always be explicitly considered by researchers before use.

Notes

Jean Petrić, David Bowes, Tracy Hall, Bruce Christianson, Nathan Baddoo, ‘The Jinx on the NASA software defect data sets’ in Proceedings of the 20th International Conference on Evaluation and Assessment in Software Engineering, University of Limerick, Limerick, Ireland, 1-3 June 2016. ISBN 9781450336918. doi: 10.1145/2915970.2916007.

ID: 10384619