SCIENCE CHINA Information Sciences, Volume 64, Issue 3: 132101 (2021) https://doi.org/10.1007/s11432-019-2745-5

Identifying change patterns of API misuses from code changes

  • Received Sep 23, 2019
  • Accepted Dec 26, 2019
  • Published Feb 7, 2021

Abstract


Table 1  Distribution of change types

    Change type   Extracted changes (#)   Covered projects (#)
    T1            4793 (288)              386
    T2            3689 (271)              387
    T3            4026 (117)              380
    T4            4120                    380
    T5            4796                    403
    Missed        2119                    280

Table 2  Statistics about commits and time overhead

              Total         Commits after   Commits after   Time overhead (s)
              commits (#)   H1 (#)          H2 and H3 (#)   Step 1   Step 2     Step 3    Total
    Sum       3191057       715620          320222          945.6    257946.2   46521.7   305413.5
    Average   2746          616             276             0.8      222.0      40.0      262.8

Table 3  Statistics about detected bugs

    Change pattern                                                                     Bugs   Fixed
    $\langle$Integer.valueOf(String), Integer.parseInt(String)$\rangle$                15     4
    $\langle$String.equals(Object), Objects.equals(Object, Object)$\rangle$            7      4
    $\langle$Thread.sleep(long), CountDownLatch.await(long, TimeUnit)$\rangle$         3      1
    StringBuilder.append(String) in T5                                                 19     9
    Sum                                                                                44     18
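
The first three change patterns in Table 3 can be illustrated with a small before/after sketch in Java. This is only an illustration under assumptions: the class name, the method names, and the stated motivation for each change (avoiding boxing, null safety, replacing a fixed delay with a signal) are hypothetical and are not taken from the table itself. The StringBuilder.append(String) row is left out because the definition of T5 is not given in this section.

```java
import java.util.Objects;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of three change patterns; names are assumptions.
final class ChangePatternSketch {

    // Pattern: Integer.valueOf(String) -> Integer.parseInt(String).
    // valueOf returns a boxed Integer, which is then auto-unboxed here.
    static int parsePortBefore(String text) {
        return Integer.valueOf(text);
    }

    // parseInt returns a primitive int directly.
    static int parsePortAfter(String text) {
        return Integer.parseInt(text);
    }

    // Pattern: String.equals(Object) -> Objects.equals(Object, Object).
    // Throws NullPointerException when a is null.
    static boolean sameNameBefore(String a, String b) {
        return a.equals(b);
    }

    // Null-safe: returns true when both are null, false when only one is.
    static boolean sameNameAfter(String a, String b) {
        return Objects.equals(a, b);
    }

    // Pattern: Thread.sleep(long) -> CountDownLatch.await(long, TimeUnit).
    // Sleeps for a fixed time and only hopes the worker has finished.
    static void waitForWorkerBefore(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns as soon as the latch reaches zero, or false on timeout.
    static boolean waitForWorkerAfter(CountDownLatch done, long timeoutMillis) {
        try {
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parsePortAfter("8080"));
        System.out.println(sameNameAfter(null, "x"));

        CountDownLatch done = new CountDownLatch(1);
        Thread worker = new Thread(done::countDown); // worker signals completion
        worker.start();
        System.out.println(waitForWorkerAfter(done, 1000));
    }
}
```

In each pair, the "before" method uses the first API of the pattern and the "after" method uses the second, so a detector for such a pattern would flag call sites shaped like the "before" variants.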