1. 29 Jiangjun Ave., Nanjing, Jiangsu, China 211106
2. Dalian University of Technology, Dalian, China 116620
3. Peking University, No.5 Yiheyuan Road, Haidian District, Beijing, China 100871
4. School of EECS, Peking University, RM1542, Science Building No.1, Beijing, China 100871
5. Hangzhou Dianzi University, Xiasha Higher Education Zone, Hangzhou, Zhejiang, China 310018
Developers increasingly rely on Application Programming Interface (API) libraries to facilitate software development. API directives, the constraints and restrictions on API usage stated in API specifications, strongly affect how developers use APIs. Hence, it would be ideal if API directives could be detected automatically. However, detecting API directives requires tackling two challenges: the imbalance between directives and non-directives (Class Imbalance Challenge) and the multiple morphologies of directives (Multi-Morphologies Challenge). Although researchers have proposed an approach that relies on syntactic patterns to detect API directives, it does not tackle the two challenges well, and its results leave room for improvement. In this paper, we propose a novel deep-learning-based approach named DeepDir. To address the Class Imbalance Challenge, DeepDir first oversamples API directives in the class-imbalanced training set to balance directives and non-directives. To tackle the Multi-Morphologies Challenge, it then trains a Bidirectional Long Short-Term Memory (Bi-LSTM) network to capture the semantic differences between directives and non-directives. Finally, given a new sentence in an API specification, the trained Bi-LSTM network predicts whether it is a directive. We have evaluated DeepDir on an annotated API directive corpus with more than 85 thousand sentences from three API specifications. The experimental results reveal that the oversampling strategy can boost the performance of DeepDir. In addition, DeepDir significantly improves on the state-of-the-art approach by up to 22.83% in terms of F-Measure. In cross-project prediction, DeepDir achieves an F-Measure of up to 58.98%.
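The oversampling step described above can be illustrated with a minimal sketch. The abstract does not state which oversampling scheme DeepDir uses, so this example assumes plain random oversampling (duplicating randomly chosen minority-class sentences until the classes are the same size). The function name `random_oversample` and the toy sentences are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(sentences, labels, seed=0):
    """Duplicate randomly chosen minority-class examples until every
    class is as large as the largest one.

    Assumption: simple random oversampling; the scheme actually used
    by DeepDir is not specified in the abstract.
    """
    rng = random.Random(seed)

    # Group sentences by their class label.
    by_label = {}
    for sentence, label in zip(sentences, labels):
        by_label.setdefault(label, []).append(sentence)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_label.values())

    balanced = []
    for label, group in by_label.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        balanced.extend((s, label) for s in group + extra)

    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 1 = directive, 0 = non-directive.
sentences = [
    "The argument must not be null.",
    "Returns the length of this string.",
    "This class implements a hash table.",
    "Creates a new empty list.",
    "See also the related package.",
]
labels = [1, 0, 0, 0, 0]

balanced = random_oversample(sentences, labels)
counts = Counter(label for _, label in balanced)
```

In this sketch, only the training set would be rebalanced; the test set would keep its original class distribution so that the reported F-Measure reflects realistic conditions. The balanced pairs would then be fed to a sentence classifier such as the Bi-LSTM network the abstract describes.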
This work is partially supported by the National Key Research and Development Plan of China under Grant No. 2018YFB1003900.
Copyright 2020 China Science Publishing & Media Ltd. (中国科技出版传媒股份有限公司). All rights reserved.