TY - JOUR
T1 - Identification of Software Bugs by Analyzing Natural Language-Based Requirements Using Optimized Deep Learning Features
AU - Haq, Qazi Mazhar ul
AU - Arif, Fahim
AU - Aurangzeb, Khursheed
AU - Ain, Noor ul
AU - Khan, Javed Ali
AU - Rubab, Saddaf
AU - Anwar, Muhammad Shahid
N1 - © 2024 Tech Science Press. All rights reserved. This is an open access article distributed under the Creative Commons Attribution License, to view a copy of the license, see: https://creativecommons.org/licenses/by/4.0/
PY - 2024/3/26
Y1 - 2024/3/26
N2 - Software project outcomes heavily depend on natural language requirements, often causing diverse interpretations and issues like ambiguities and incomplete or faulty requirements. Researchers are exploring machine learning to predict software bugs, but a more precise and general approach is needed. Accurate bug prediction is crucial for software evolution and user training, prompting an investigation into deep and ensemble learning methods. However, these studies are not generalized and efficient when extended to other datasets. Therefore, this paper proposed a hybrid approach combining multiple techniques to explore their effectiveness on bug identification problems. The methods involved feature selection, which is used to reduce the dimensionality and redundancy of features and select only the relevant ones; transfer learning is used to train and test the model on different datasets to analyze how much of the learning is passed to other datasets, and ensemble method is utilized to explore the increase in performance upon combining multiple classifiers in a model. Four National Aeronautics and Space Administration (NASA) and four Promise datasets are used in the study, showing an increase in the model’s performance by providing better Area Under the Receiver Operating Characteristic Curve (AUC-ROC) values when different classifiers were combined. It reveals that using an amalgam of techniques such as those used in this study, feature selection, transfer learning, and ensemble methods prove helpful in optimizing the software bug prediction models and providing high-performing, useful end mode.
AB - Software project outcomes heavily depend on natural language requirements, often causing diverse interpretations and issues like ambiguities and incomplete or faulty requirements. Researchers are exploring machine learning to predict software bugs, but a more precise and general approach is needed. Accurate bug prediction is crucial for software evolution and user training, prompting an investigation into deep and ensemble learning methods. However, these studies are not generalized and efficient when extended to other datasets. Therefore, this paper proposed a hybrid approach combining multiple techniques to explore their effectiveness on bug identification problems. The methods involved feature selection, which is used to reduce the dimensionality and redundancy of features and select only the relevant ones; transfer learning is used to train and test the model on different datasets to analyze how much of the learning is passed to other datasets, and ensemble method is utilized to explore the increase in performance upon combining multiple classifiers in a model. Four National Aeronautics and Space Administration (NASA) and four Promise datasets are used in the study, showing an increase in the model’s performance by providing better Area Under the Receiver Operating Characteristic Curve (AUC-ROC) values when different classifiers were combined. It reveals that using an amalgam of techniques such as those used in this study, feature selection, transfer learning, and ensemble methods prove helpful in optimizing the software bug prediction models and providing high-performing, useful end mode.
KW - ensemble learning
KW - feature selection
KW - KEYWORDS Natural language processing
KW - software bug prediction
KW - transfer learning
UR - http://www.scopus.com/inward/record.url?scp=85189468253&partnerID=8YFLogxK
U2 - 10.32604/cmc.2024.047172
DO - 10.32604/cmc.2024.047172
M3 - Article
AN - SCOPUS:85189468253
SN - 1546-2218
VL - 78
SP - 4379
EP - 4397
JO - Computers, Materials & Continua
JF - Computers, Materials & Continua
IS - 3
ER -