TY - JOUR
T1 - How Do Crowd-Users Express Their Opinions Against Software Applications in Social Media? A Fine-Grained Classification Approach
AU - Khan, Nek Dil
AU - Ali Khan, Javed
AU - Li, Jianqiang
AU - Ullah, Tahir
AU - Alwadain, Ayed
AU - Yasin, Affan
AU - Zhao, Qing
N1 - © 2024 The Author(s). This is an open access article under the Creative Commons Attribution-Non Commercial-No Derivatives CC BY-NC-ND licence, https://creativecommons.org/licenses/by-nc-nd/4.0/
PY - 2024/7/10
Y1 - 2024/7/10
AB - App stores allow users to search, download, and purchase software applications to accomplish daily tasks. They also enable crowd-users to submit textual feedback or star ratings for the downloaded software apps based on their satisfaction. Crowd-user feedback contains critical information for software developers, including new features, issues, and non-functional requirements. However, identifying software bugs in low-star software applications has been ignored in the literature. For this purpose, we proposed a natural language processing (NLP)-based approach to recover frequently occurring software issues in the Amazon Software App (ASA) store. The proposed approach identified prevalent issues using NLP part-of-speech (POS) analytics. Also, to better understand the implications of these issues for end-user satisfaction, different machine learning (ML) algorithms are used to identify the crowd-user emotions, such as anger, fear, sadness, and disgust, associated with the identified issues. To this end, we shortlisted 45 software apps with comparatively low ratings from the ASA store. We investigated how crowd-users reported their grudges and opinions against the software applications using grounded theory and content analysis approaches and prepared a ground truth for the ML experiments. ML algorithms, such as MNB, LR, RF, MLP, KNN, AdaBoost, and Voting Classifier, are used to identify the emotions associated with each captured issue by processing the annotated end-user data set. We obtained satisfactory classification results, with the MLP and RF classifiers achieving average accuracies of 82% and 80%, respectively. Furthermore, ROC curves for the better-performing ML classifiers are plotted to determine which classifier, trained with undersampling or oversampling, should be selected as the final best classifier. To our knowledge, the proposed approach is a first step in identifying frequently occurring issues and corresponding end-user emotions for low-ranked software applications. Software vendors can use the proposed approach to improve the performance of low-ranked software apps by promptly incorporating it into the software evolution process.
KW - Reviews
KW - Computer bugs
KW - Social networking (online)
KW - Software algorithms
KW - Blogs
KW - Data mining
KW - User experience
KW - User reviews
KW - app store analytics
KW - software issues
KW - bug reports
KW - data-driven requirements
UR - http://www.scopus.com/inward/record.url?scp=85198297438&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2024.3425830
DO - 10.1109/ACCESS.2024.3425830
M3 - Article
SN - 2169-3536
VL - 12
SP - 98004
EP - 98028
JO - IEEE Access
JF - IEEE Access
ER -