
SCIENCE CHINA Information Sciences, Volume 63, Issue 9: 190101 (2020). https://doi.org/10.1007/s11432-019-2830-8

LSTM-based argument recommendation for non-API methods

More info
  • Received: Sep 19, 2019
  • Accepted: Jan 20, 2020
  • Published: Aug 14, 2020

Abstract

Automatic code completion is one of the most useful features provided by advanced IDEs, and argument recommendation is a widely used special case of it. While existing approaches focus on argument recommendation for popular APIs, a large number of non-API invocations call for accurate argument recommendation as well. To this end, we propose an LSTM-based approach that recommends non-API arguments instantly as method calls are typed in. With data collected from a large corpus of open-source applications, we train an LSTM neural network to recommend actual arguments based on the identifiers of the invoked method, the corresponding formal parameter, and a list of syntactically correct candidate arguments. To feed these identifiers into the LSTM neural network, we convert them into fixed-length vectors with Paragraph Vector, an unsupervised neural-network-based learning algorithm. With the resulting LSTM neural network trained on sample applications, for a given call site we can predict which of the candidate arguments is most likely to be the correct one. We evaluate the proposed approach with ten-fold cross-validation on 85 open-source C applications. Results suggest that the proposed approach outperforms the state-of-the-art approaches in recommending non-API arguments, improving precision significantly from 71.46% to 83.37%.
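To make the pipeline concrete, the sketch below outlines how such a recommender could be wired together. It is a minimal, illustrative sketch only: it assumes gensim's Doc2Vec for Paragraph Vector and a Keras LSTM, and all function names, the network shape, and the hyperparameters are hypothetical rather than the authors' implementation.

```python
# Illustrative sketch of the recommendation pipeline (not the authors' code).
# Identifiers (invoked method, formal parameter, candidate argument) are split
# into tokens, embedded with Paragraph Vector, and an LSTM scores each
# syntactically correct candidate; the highest-scoring candidate is recommended.
import numpy as np
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
import tensorflow as tf

VEC_DIM = 100  # assumed embedding size


def train_identifier_embeddings(identifier_token_lists):
    """Train Paragraph Vector on tokenized identifiers (e.g., split on camelCase/underscores)."""
    docs = [TaggedDocument(tokens, [i]) for i, tokens in enumerate(identifier_token_lists)]
    return Doc2Vec(docs, vector_size=VEC_DIM, min_count=1, epochs=40)


def build_scorer():
    """LSTM that reads the (method, parameter, candidate) vector sequence and outputs a score."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(3, VEC_DIM)),              # three identifier vectors per example
        tf.keras.layers.LSTM(128),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # probability that the candidate is correct
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model


def recommend(model, d2v, method_tokens, param_tokens, candidate_token_lists):
    """Rank syntactically correct candidates and return the index of the most likely argument."""
    m = d2v.infer_vector(method_tokens)
    p = d2v.infer_vector(param_tokens)
    seqs = np.stack([[m, p, d2v.infer_vector(c)] for c in candidate_token_lists])
    scores = model.predict(seqs, verbose=0).ravel()
    return int(np.argmax(scores))
```

In this reading, each candidate is scored independently against the call context; training could use observed call sites as positive examples and the remaining candidates as negatives. The authors' actual network architecture and training setup may differ.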


Acknowledgment

This work was supported by National Natural Science Foundation of China (Grant Nos. 61772071, 61690205, 61832009) and National Key R&D Program of China (Grant No. 2018YFB1003904).



  • Figure 1

    (Color online) Results of manual argument selection.

  • Figure 2

    (Color online) Overview of the approach.

  • Figure 3

    (Color online) Example.

  • Figure 4

    (Color online) Architecture of our network.

  • Figure 5

    (Color online) Comparison between inner and inter-type recommendation.

  • Figure 6

ANOVA analysis of precision.

  • Figure 7

ANOVA analysis of recall.

  • Figure 8

ANOVA analysis of F1 measure.

  • Table 1

    Table 1: Comparison against existing approaches

    Ten-fold | Proposed approach: Precision (%), Recall (%), F1 | Similarity-based approach: Precision (%), Recall (%), F1 | SLP-Core approach: Precision (%), Recall (%), F1
    1#  | 87.77, 52.63, 0.66 | 74.65, 47.37, 0.58 | 38.14, 38.14, 0.38
    2#  | 79.26, 65.44, 0.72 | 73.09, 58.18, 0.65 | 37.94, 37.94, 0.38
    3#  | 92.92, 55.38, 0.69 | 69.55, 50.00, 0.58 | 27.86, 27.86, 0.28
    4#  | 93.97, 68.74, 0.79 | 85.27, 64.30, 0.73 | 52.96, 52.96, 0.53
    5#  | 65.30, 56.90, 0.61 | 55.78, 52.58, 0.54 | 20.81, 20.81, 0.21
    6#  | 84.23, 70.22, 0.77 | 66.79, 59.35, 0.63 | 28.03, 28.03, 0.28
    7#  | 84.42, 60.53, 0.71 | 77.01, 56.97, 0.65 | 53.73, 53.73, 0.54
    8#  | 84.42, 57.96, 0.69 | 66.37, 51.10, 0.58 | 28.46, 28.46, 0.28
    9#  | 89.89, 78.07, 0.84 | 89.62, 83.33, 0.86 | 40.34, 40.34, 0.40
    10# | 71.54, 58.99, 0.65 | 56.43, 42.91, 0.49 | 26.48, 26.48, 0.26
    Avg | 83.37, 62.49, 0.71 | 71.46, 56.61, 0.63 | 35.47, 35.47, 0.35
  • Table 2

    Table 2: Influence of learning models

    Ten-fold | Proposed approach: Precision (%), Recall (%), F1 | FNN-based approach: Precision (%), Recall (%), F1 | CNN-based approach: Precision (%), Recall (%), F1
    1#  | 87.77, 52.63, 0.66 | 15.63, 10.72, 0.13 | 67.64, 46.42, 0.55
    2#  | 79.26, 65.44, 0.72 | 19.50, 13.11, 0.16 | 59.94, 40.30, 0.48
    3#  | 92.92, 55.38, 0.69 | 15.76, 10.75, 0.13 | 67.65, 46.22, 0.55
    4#  | 93.97, 68.74, 0.79 | 10.75, 7.61, 0.09 | 73.86, 52.26, 0.61
    5#  | 65.30, 56.90, 0.61 | 13.45, 9.61, 0.11 | 59.29, 42.35, 0.49
    6#  | 84.23, 70.22, 0.77 | 20.88, 14.90, 0.18 | 60.52, 41.19, 0.49
    7#  | 84.42, 60.53, 0.71 | 15.00, 10.73, 0.13 | 64.24, 44.48, 0.53
    8#  | 84.42, 57.96, 0.69 | 16.72, 11.39, 0.14 | 67.62, 46.09, 0.55
    9#  | 89.89, 78.07, 0.84 | 19.32, 13.29, 0.16 | 69.54, 47.85, 0.57
    10# | 71.54, 58.99, 0.65 | 19.46, 12.86, 0.15 | 68.83, 45.48, 0.55
    Avg | 83.37, 62.49, 0.71 | 16.80, 11.50, 0.14 | 65.92, 45.26, 0.54
  • Table 3

    Table 3: Influence of parameter types

    Type of parameters | Number of parameters | Inter-type recommendation: Precision (%), Recall (%) | Inner-type recommendation: Precision (%), Recall (%)
    Struct  | 395262 | 94.91, 81.51 | 92.39, 84.51
    Union   | 1572   | 84.02, 56.46 | 73.01, 67.70
    Enum    | 12953  | 88.60, 23.77 | 80.54, 82.52
    Void    | 14608  | 72.22, 22.99 | 43.67, 17.56
    Char    | 14106  | 75.70, 26.47 | 51.84, 40.25
    Typedef | 10333  | 81.01, 30.21 | 82.08, 18.76
    Numeral | 117500 | 76.83, 22.07 | 68.97, 13.66
    Const   | 65914  | 80.03, 21.75 | 62.13, 21.69
    Bool    | 17251  | 81.71, 18.04 | 44.56, 13.59
  • Table 4

    Table 4: Influence of vectorization approaches

    Ten-fold | Paragraph Vector: Precision (%), Recall (%), F1 | One-hot approach: Precision (%), Recall (%), F1
    1#  | 87.77, 52.63, 0.66 | 86.11, 45.32, 0.59
    2#  | 79.26, 65.44, 0.72 | 69.46, 55.29, 0.62
    3#  | 92.92, 55.38, 0.69 | 87.78, 48.61, 0.63
    4#  | 93.97, 68.74, 0.79 | 85.69, 64.62, 0.74
    5#  | 65.30, 56.90, 0.61 | 59.95, 34.11, 0.43
    6#  | 84.23, 70.22, 0.77 | 78.00, 54.77, 0.64
    7#  | 84.42, 60.53, 0.71 | 77.54, 46.93, 0.58
    8#  | 84.42, 57.96, 0.69 | 79.30, 45.96, 0.58
    9#  | 89.89, 78.07, 0.84 | 78.08, 72.61, 0.75
    10# | 71.54, 58.99, 0.65 | 76.75, 45.27, 0.57
    Avg | 83.37, 62.49, 0.71 | 77.87, 51.35, 0.61
  • Table 5

    Table 5: Influence of evaluation models

    Evaluation model          | Precision (%) | Recall (%)
    Cross-project evaluation  | 83.37 | 62.49
    Within-project evaluation | 48.25 | 37.55
  • Table 6

    Table 6: Influence of unsupported arguments

    Evaluation data                 | Precision (%) | Recall (%)
    Including unsupported arguments | 83.37 | 62.49
    Excluding unsupported arguments | 92.69 | 72.80
