TY - JOUR
T1 - A Big Data Analytics Approach for Construction Firms Failure Prediction Models
AU - Alaka, Hafiz
AU - Oyedele, Lukumon O.
AU - Owolabi, Hakeem O.
AU - Akinade, Olugbenga O.
AU - Bilal, Muhammad
AU - Ajayi, Saheed O.
N1 - © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
PY - 2019/11/1
Y1 - 2019/11/1
N2 - Using 693 000 data cells from 33 000 sample construction firms that operated or failed between 2008 and 2017, failure prediction models were developed using artificial neural network (ANN), support vector machine, multiple discriminant analysis (MDA), and logistic regression (LR). Surprisingly, the accuracy of the models on test data showed ANN to be only slightly more accurate than LR and MDA. The ANN's hyperparameters, namely the number of units in the hidden layer and the weight decay, were consequently tuned using grid search. The tuning process led to tedious machine computation that was aborted after many hours without completion. State-of-the-art big data analytics (BDA) technology was consequently employed, for the first time in failure prediction, and the tuning was completed within seconds. Mean accuracy from cross-validation was used to select the model with the best parameter values, which were then used to develop a new ANN model that outperformed all previously developed models on test data. Subsequently using selected variables to develop new models reduced the computational cost of tuning but did not improve performance. Since the real-life cost of a misclassification is greater than the cost of tedious computation, it was concluded that BDA is the best compromise.
AB - Using 693 000 data cells from 33 000 sample construction firms that operated or failed between 2008 and 2017, failure prediction models were developed using artificial neural network (ANN), support vector machine, multiple discriminant analysis (MDA), and logistic regression (LR). Surprisingly, the accuracy of the models on test data showed ANN to be only slightly more accurate than LR and MDA. The ANN's hyperparameters, namely the number of units in the hidden layer and the weight decay, were consequently tuned using grid search. The tuning process led to tedious machine computation that was aborted after many hours without completion. State-of-the-art big data analytics (BDA) technology was consequently employed, for the first time in failure prediction, and the tuning was completed within seconds. Mean accuracy from cross-validation was used to select the model with the best parameter values, which were then used to develop a new ANN model that outperformed all previously developed models on test data. Subsequently using selected variables to develop new models reduced the computational cost of tuning but did not improve performance. Since the real-life cost of a misclassification is greater than the cost of tedious computation, it was concluded that BDA is the best compromise.
KW - Artificial neural networks
KW - big data applications
KW - construction industry
KW - machine learning
KW - predictive models
KW - support vector machines
UR - http://www.scopus.com/inward/record.url?scp=85051761533&partnerID=8YFLogxK
U2 - 10.1109/TEM.2018.2856376
DO - 10.1109/TEM.2018.2856376
M3 - Article
SN - 0018-9391
VL - 66
SP - 689
EP - 698
JO - IEEE Transactions on Engineering Management
JF - IEEE Transactions on Engineering Management
IS - 4
M1 - 8438924
ER -