SCIENCE CHINA Information Sciences, Volume 63 , Issue 9 : 190103(2020) https://doi.org/10.1007/s11432-019-2929-9

An analysis of correctness for API recommendation: are the unmatched results useless?

  • Received: Sep 20, 2019
  • Accepted: Apr 24, 2020
  • Published: Aug 13, 2020

Abstract

API recommendation is a promising approach that is widely used during software development. However, the evaluation of API recommendation has not been explored with sufficient rigor. Current evaluation focuses mainly on correctness, which is measured by matching recommended results against ground-truth results. In most cases, there is only one set of ground-truth APIs for each recommendation attempt, yet the target code can be implemented in dozens of ways. Neglecting this code diversity introduces a possible defect in the evaluation. To address the problem, we invited 15 developers to analyze the unmatched results in a user study. The online evaluation confirms that some unmatched APIs can also benefit programming because of their functional correlation with ground-truth APIs. We then measure API functional correlation based on relationships extracted from the API knowledge graph, API method names, and API documentation. Furthermore, we propose an approach that improves the measurement of correctness based on API functional correlation. Our measurement is evaluated on a dataset of 6141 requirements and historical code fragments from related commits. The results show that 28.2% of unmatched APIs can contribute to correctness in our experiments.
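The exact-match measurement that the abstract describes can be illustrated with a minimal sketch. The function name `precision_recall` is hypothetical and not taken from the paper; the input data are the recommended and ground-truth APIs listed for HBase-18646 in Table 3.

```python
def precision_recall(recommended, ground_truth):
    """Exact-match precision and recall for one recommendation attempt."""
    rec, truth = set(recommended), set(ground_truth)
    matched = rec & truth  # APIs that appear in both lists
    precision = len(matched) / len(rec) if rec else 0.0
    recall = len(matched) / len(truth) if truth else 0.0
    return precision, recall


# Data from the HBase-18646 example in Table 3.
recommended = [
    "LoggerFactory.getLogger",
    "Arrays.asList",
    "Path.toString",
    "Configuration.setBoolean",
    "Configuration.set",
]
ground_truth = ["LogFactory.getLog"]

print(precision_recall(recommended, ground_truth))  # (0.0, 0.0)
```

Because no recommended API is textually identical to the single ground-truth API, both scores are zero, even though `LoggerFactory.getLogger` serves a closely related purpose. This is the kind of unmatched result the study examines.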


Acknowledgment

This work was supported in part by the National Key R&D Program of China (Grant No. 2018YFB1003900), in part by the National Natural Science Foundation of China (Grant Nos. 61402103, 61572126, 61872078), in part by the Open Research Fund of Key Laboratory of Safety-Critical Software Fund (Nanjing University of Aeronautics and Astronautics) (Grant No. NJ2019006), and in part by the Key Laboratory of Computer Network and Information Integration of the Ministry of Education of China (Grant No. 93K-9).



  • Figure 1

    An example of the contributions of unmatched APIs.

  • Figure 2

    An example of the "see also" relationship in API documentation.

  • Figure 3

    The overall framework of the evaluation of correctness based on API functional correlation.

  • Figure 4

    Combination of API functional correlations in a matrix.

  • Figure 5

    (Color online) Statistics of the difference between traditional and improved measurements.

  • Algorithm 1

    Greedy weight tuning algorithm

    Require: numIter = number of iterations.
    weights = {0.0, 0.0, 1.0, 1.0, 1.0};
    maxEvalScore = 0;
    valForMax = 0;
    for i = 0 to numIter
        for j = 0 to weights.Length
            for k = 1.0 to 0.0 by 0.01
                weights[j] = k;
                evalScore = eval();
                if evalScore > maxEvalScore then
                    maxEvalScore = evalScore;
                    valForMax = k;
                end if
            end for
            weights[j] = valForMax;
        end for
    end for
    return weights.
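Algorithm 1 can be sketched in Python as a coordinate-wise grid search. This is an illustrative reading, not the authors' implementation: it assumes the comparison is meant to keep the candidate with the highest score, and it deviates from the pseudocode by resetting the best score for each weight, using the weight's current value as the baseline, so that a stale `valForMax` from a previous weight is never written back. The names `greedy_weight_tuning`, `eval_fn`, and `toy_eval` are hypothetical.

```python
from typing import Callable, List, Optional, Sequence


def greedy_weight_tuning(
    eval_fn: Callable[[Sequence[float]], float],
    num_iter: int,
    weights: Optional[Sequence[float]] = None,
    step: float = 0.01,
) -> List[float]:
    """Tune one weight at a time over a grid in [0, 1], keeping the best value."""
    # Default starting point copied from Algorithm 1.
    w = list(weights) if weights is not None else [0.0, 0.0, 1.0, 1.0, 1.0]
    n_steps = round(1.0 / step)
    for _ in range(num_iter):
        for j in range(len(w)):
            # Baseline: the score obtained with the current value of w[j].
            best_k = w[j]
            best_score = eval_fn(w)
            # Scan candidates from 1.0 down to 0.0, as in the pseudocode.
            for s in range(n_steps, -1, -1):
                k = s / n_steps
                w[j] = k
                score = eval_fn(w)
                if score > best_score:
                    best_score, best_k = score, k
            w[j] = best_k  # fix the best value before tuning the next weight
    return w


# Toy objective (an assumption for illustration): the score is highest
# when the weights equal `target`.
target = [0.3, 0.7]


def toy_eval(w: Sequence[float]) -> float:
    return -sum((a - b) ** 2 for a, b in zip(w, target))


tuned = greedy_weight_tuning(toy_eval, num_iter=1, weights=[0.0, 1.0])
print(tuned)
```

In the paper's setting, `eval_fn` would stand for the evaluation routine `eval()` that scores a weight configuration; its definition is not given in this section, so the toy objective only demonstrates the search procedure.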

  • Table 1

    The statistics of projects in our study

    Project       # Feature requests    Project       # Feature requests
    Ambari        176                   Avro          155
    Axis2         162                   Cassandra     195
    CXF           192                   Derby         199
    Drill         180                   Falcon        155
    Felix         121                   Flume         154
    Groovy        162                   HBase         286
    Hive          397                   Jackrabbit    201
    LuceneCore    358                   Mahout        285
    MyFaces       188                   Nutch         237
    OFBiz         164                   Oozie         111
    PDFBox        173                   Phoenix       166
    Pig           318                   Qpid          175
    Solr          393                   Sqoop         121
    Tajo          138                   Tez           92
    Thrift        158                   Tika          119
    Wicket        79                    ZooKeeper     131
  • Table 2

    The feedback on unmatched results from the user study

    Developers            Useless (%)    A little helpful (%)    Very helpful (%)
    Senior developers     60             25                      15
    Primary developers    54             30                      16
  • Table 3

    The recommendation results on HBase-18646

    Feature request         HBase-18646
    Recommended results     ['LoggerFactory.getLogger', 'Arrays.asList', 'Path.toString', 'Configuration.setBoolean', 'Configuration.set']
    Ground-truth results    ['LogFactory.getLog']
    Precision and Recall    0.0 and 0.0
    $\mathrm{Precision}_{\mathrm{corr}}$ and $\mathrm{Recall}_{\mathrm{corr}}$    0.154 and 0.77
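The correlation-based values in Table 3 are consistent with one simple reading, sketched below under stated assumptions. Each recommended API is credited with its highest functional correlation to any ground-truth API, and each ground-truth API with its highest correlation to any recommended API. The correlation of 0.77 between `LoggerFactory.getLogger` and `LogFactory.getLog` is inferred from the table (0.77 / 5 = 0.154) rather than stated in this section, and the function name `corr_precision_recall` and the `corr` dictionary are hypothetical. The paper's actual formula may differ.

```python
def corr_precision_recall(recommended, ground_truth, corr):
    """Precision and recall where unmatched APIs earn partial credit.

    `corr` maps (recommended_api, ground_truth_api) pairs to a functional
    correlation in [0, 1]; identical APIs count as 1.0, missing pairs as 0.0.
    """
    if not recommended or not ground_truth:
        return 0.0, 0.0

    def credit(rec_api, truth_api):
        if rec_api == truth_api:
            return 1.0
        return corr.get((rec_api, truth_api), 0.0)

    # Each recommended API contributes its best correlation to the ground truth.
    precision = sum(
        max(credit(rec_api, truth_api) for truth_api in ground_truth)
        for rec_api in recommended
    ) / len(recommended)
    # Each ground-truth API contributes its best correlation to the recommendations.
    recall = sum(
        max(credit(rec_api, truth_api) for rec_api in recommended)
        for truth_api in ground_truth
    ) / len(ground_truth)
    return precision, recall


recommended = [
    "LoggerFactory.getLogger",
    "Arrays.asList",
    "Path.toString",
    "Configuration.setBoolean",
    "Configuration.set",
]
ground_truth = ["LogFactory.getLog"]
# Assumed correlation, inferred from the values reported in Table 3.
corr = {("LoggerFactory.getLogger", "LogFactory.getLog"): 0.77}

p, r = corr_precision_recall(recommended, ground_truth, corr)
print(round(p, 3), round(r, 3))  # 0.154 0.77
```

Under this reading, a single functionally related recommendation lifts both scores above zero, which matches the contrast between the two result rows of Table 3.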

Copyright 2020 CHINA SCIENCE PUBLISHING & MEDIA LTD. All rights reserved.
