SCIENCE CHINA Information Sciences, Volume 59, Issue 12: 122901(2016) https://doi.org/10.1007/s11432-015-5426-3

Empirical analysis of network measures for predicting high severity software faults

  • Received: May 18, 2016
  • Accepted: Jun 18, 2016
  • Published: Nov 7, 2016

Abstract

Network measures are useful for predicting fault-prone modules. However, existing work has not distinguished faults according to their severity. In practice, high severity faults cause serious problems and require further attention. In this study, we explored the utility of network measures in high severity fault-proneness prediction. We constructed software source code networks for four open-source projects by extracting the dependencies between modules. We then used univariate logistic regression to investigate the association between each network measure and fault-proneness at a high severity level. We also built multivariate prediction models to examine the explanatory ability of network measures for fault-proneness, and evaluated their predictive effectiveness compared with code metrics under forward-release and cross-project predictions. The results revealed the following: (1) most network measures are significantly related to high severity fault-proneness; (2) network measures generally have explanatory abilities and predictive powers comparable to those of code metrics; and (3) network measures are very unstable for cross-project predictions. These results indicate that network measures are of practical value in high severity fault-proneness prediction.
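To make the univariate step concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a toy dependency network, uses in-degree as a stand-in for the network measures studied in the paper, and fits a one-variable logistic regression by plain gradient descent. The module names, edges, fault labels, and the helper names `in_degree` and `fit_univariate_logit` are all hypothetical.

```python
import math

# Hypothetical dependency edges: (source, target) means source depends on target.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "c"),
         ("e", "c"), ("e", "b"), ("f", "a")]
# Hypothetical labels: 1 if the module had at least one high severity fault.
faulty = {"a": 1, "b": 0, "c": 1, "d": 0, "e": 0, "f": 0}


def in_degree(edges, nodes):
    """Count incoming dependencies per module (one simple network measure)."""
    deg = {n: 0 for n in nodes}
    for _, dst in edges:
        deg[dst] += 1
    return deg


def fit_univariate_logit(xs, ys, lr=0.1, steps=5000):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by gradient descent on the log-loss."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += p - y          # gradient w.r.t. the intercept
            g1 += (p - y) * x    # gradient w.r.t. the slope
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


modules = sorted(faulty)
deg = in_degree(edges, modules)
xs = [deg[m] for m in modules]
ys = [faulty[m] for m in modules]

b0, b1 = fit_univariate_logit(xs, ys)
odds_ratio = math.exp(b1)  # change in the odds of a fault per extra incoming edge
print(f"slope={b1:.3f}, odds ratio={odds_ratio:.3f}")
```

A positive slope (odds ratio above 1) on real data would suggest that modules with more incoming dependencies are more likely to contain a high severity fault. A real analysis would also need a significance test for the coefficient, which this sketch omits.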


Acknowledgments

This work was partially supported by National Natural Science Foundation of China (Grant Nos. 61472175, 61472178, 61272082, 61272080, 91418202), Natural Science Foundation of Jiangsu Province (Grant No. BK20130014), and Natural Science Foundation of Colleges in Jiangsu Province (Grant No. 13KJB520018). All support is gratefully acknowledged.

