Building an Ensemble for Software Defect Prediction Based on Diversity Selection

Jean Petri, David Bowes, Tracy Hall, Bruce Christianson, Nathan Baddoo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

28 Citations (Scopus)

Abstract

Background: Ensemble techniques have gained attention in various scientific fields. Defect prediction researchers have investigated many state-of-the-art ensemble models and concluded that in many cases these outperform standard single classifier techniques. Almost all previous work using ensemble techniques in defect prediction relies on the majority voting scheme for combining prediction outputs, and on the implicit diversity among single classifiers.

Aim: Investigate whether defect prediction can be improved using an explicit diversity technique with stacking ensemble, given the fact that different classifiers identify different sets of defects.

Method: We used classifiers from four different families and the weighted accuracy diversity (WAD) technique to exploit diversity amongst classifiers. To combine individual predictions, we used the stacking ensemble technique. We used state-of-the-art knowledge in software defect prediction to build our ensemble models, and tested their prediction abilities against 8 publicly available data sets.

Conclusion: The results show performance improvement using stacking ensembles compared to other defect prediction models. Diversity amongst classifiers used for building ensembles is essential to achieving these performance improvements.

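
The motivation stated in the abstract, that different classifiers identify different sets of defects and that majority voting may not exploit this, can be illustrated with a small sketch. This is not the WAD technique and not a stacking ensemble; the pairwise disagreement measure is a generic stand-in for a diversity measure, and the labels, classifier names, and helper functions are all hypothetical.

```python
from itertools import combinations


def majority_vote(predictions):
    """Combine per-classifier 0/1 predictions module by module using a strict majority."""
    n_classifiers = len(predictions)
    return [int(sum(votes) * 2 > n_classifiers) for votes in zip(*predictions)]


def disagreement(a, b):
    """Fraction of modules on which two classifiers give different labels."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_disagreement(predictions):
    """Average disagreement over all classifier pairs (a generic diversity proxy, not WAD)."""
    pairs = list(combinations(predictions, 2))
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical data: six modules (1 = defective) and three classifiers,
# each of which finds a different defect.
truth = [1, 1, 1, 0, 0, 0]
clf_a = [1, 0, 0, 0, 0, 0]
clf_b = [0, 1, 0, 0, 0, 0]
clf_c = [0, 0, 1, 0, 0, 0]
predictions = [clf_a, clf_b, clf_c]

voted = majority_vote(predictions)                       # every defect is outvoted
found_by_any = [int(any(v)) for v in zip(*predictions)]  # union of detected defects
diversity = mean_pairwise_disagreement(predictions)

print("majority vote:", voted)
print("found by at least one classifier:", found_by_any)
print("mean pairwise disagreement:", round(diversity, 3))
```

In this toy case each defect is flagged by only one of three classifiers, so a strict majority vote labels every module as clean, although together the classifiers cover all three defects. A learned combiner such as stacking is one way such complementary predictions might be retained.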
Original language: English
Title of host publication: Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
Place of Publication: New York, NY, USA
Publisher: ACM Press
Pages: 46:1-46:10
ISBN (Print): 978-1-4503-4427-2
DOIs
Publication status: Published - 9 Sept 2016
Event: IEEE International Symposium on Empirical Software Engineering and Measurement - Ciudad Real, Spain
Duration: 8 Sept 2016 → 9 Sept 2016

Publication series

Name: ESEM '16
Publisher: ACM

Conference

Conference: IEEE International Symposium on Empirical Software Engineering and Measurement
Country/Territory: Spain
City: Ciudad Real
Period: 8/09/16 → 9/09/16

Keywords

  • Software defect prediction, diversity, ensembles of learning machines, software faults, stacking
