1. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China
2. Golisano College of Computing and Information Sciences, Rochester Institute of Technology, Rochester 14623, USA
Various software-engineering problems have been solved by crowdsourcing. In many projects, the software outsourcing process is streamlined on cloud-based platforms. Among software-engineering tasks, test-case development is particularly suitable for crowdsourcing, because a large number of test cases can be generated at little monetary cost. However, the numerous test cases harvested from crowdsourcing vary in quality. Owing to the large volume, distinguishing the high-quality tests by traditional techniques is computationally expensive. Therefore, crowdsourced testing would benefit from an efficient mechanism that distinguishes the qualities of the test cases. This paper introduces an automated approach, TCQA, that evaluates the quality of test cases based on the onsite coding history. Quality assessment by TCQA proceeds through three steps: (1) modeling the code history as a time series, (2) extracting multiple relevant features from the time series, and (3) building a model that classifies the test cases based on their qualities. Step (3) is accomplished by feature-based machine-learning techniques. By leveraging the onsite coding history, TCQA can assess test-case quality without performing expensive source-code analysis or executing the test cases. Using the data of nine test-development tasks involving more than 400 participants, we evaluated TCQA from multiple perspectives. TCQA assessed the quality of the test cases with higher precision, faster speed, and lower overhead than conventional test-case quality-assessment techniques. Moreover, TCQA provided real-time insights on test-case quality before the assessment was finished.
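The three steps named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function and class names, the 20-point resampling grid, the choice of three features, and the nearest-centroid classifier standing in for the unspecified "feature-based machine-learning techniques". The toy histories and their labels are invented and say nothing about which growth pattern corresponds to which quality level.

```python
import numpy as np


def code_history_to_series(snapshots, n_points=20):
    """Step 1 (assumed form): turn (timestamp, code_size) snapshots into a
    fixed-length series. Time is rescaled to [0, 1] and size is divided by
    the peak size, so histories of different lengths become comparable."""
    times = np.array([t for t, _ in snapshots], dtype=float)
    sizes = np.array([s for _, s in snapshots], dtype=float)
    span = times[-1] - times[0]
    if len(snapshots) < 2 or span <= 0:
        raise ValueError("need at least two snapshots with increasing timestamps")
    rel_time = (times - times[0]) / span
    peak = sizes.max()
    rel_size = sizes / peak if peak > 0 else sizes
    grid = np.linspace(0.0, 1.0, n_points)
    return np.interp(grid, rel_time, rel_size)


def extract_features(series):
    """Step 2 (assumed subset): maximum, mean, and absolute energy."""
    return np.array([series.max(), series.mean(), float(np.sum(series ** 2))])


class NearestCentroidClassifier:
    """Step 3 stand-in: assign each sample to the class with the closest
    mean feature vector. Not the paper's model, only a placeholder."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.labels_ = np.unique(y)
        self.centroids_ = np.array([X[y == label].mean(axis=0) for label in self.labels_])
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        dists = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.labels_[dists.argmin(axis=1)]


# Invented toy histories: (seconds since start, size of the test code).
train_histories = [
    ([(0, 0), (10, 50), (20, 100)], "steady"),
    ([(0, 10), (5, 40), (10, 100)], "steady"),
    ([(0, 0), (18, 0), (20, 100)], "late"),
    ([(0, 0), (16, 5), (20, 100)], "late"),
]
X_train = [extract_features(code_history_to_series(h)) for h, _ in train_histories]
y_train = [label for _, label in train_histories]
model = NearestCentroidClassifier().fit(X_train, y_train)

new_history = [(0, 0), (45, 0), (50, 80)]
prediction = model.predict([extract_features(code_history_to_series(new_history))])[0]
```

Because the classifier only consumes the feature vector, neither the test code nor its execution results are needed at prediction time, which is the property the abstract emphasizes.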
This work was partly supported by the National Key Research and Development Program of China (Grant No. 2018YFB1403400) and the National Natural Science Foundation of China (Grant Nos. 61690201, 61772014).
(Color online) Overview of assessing the quality of a test case from the dynamic code history using TCQA.
(Color online) Dependence of precision on data volume for each task (in the within-task scenarios). (a) CMD; (b) Datalog; (c) ITClocks; (d) JMerkle; (e) LunarCalendar; (f) QuadTree.
Representative dynamic histories of code with different quality levels. The $x$ and $y$ axes represent the percentage of development time and the size growth of the test-case code, respectively. (a) Low-quality tests; (b) medium-quality tests; (c) high-quality tests.
|Category||Feature||Description|
|Simple metrics||Maximum||Highest (normalized) value of the time series.|
|Simple metrics||Mean||Mean of the time series.|
|Simple metrics||sum_of_reoccurring_values||Sum of reoccurring values in the time series.|
|Statistical metrics||c3*||Non-linearity of the time series, see|
|Statistical metrics||abs_energy||Absolute energy of the time series (sum of the squared values).|
|Statistical metrics||agg_linear_trend*||Linear least-squares regression of values of the time series.|
|Frequency-based metrics||fft_coefficient*||Fourier coefficients of the one-dimensional discrete fast Fourier transform for real parameters.|
|Frequency-based metrics||spkt_welch_density||Cross-power spectral density of the time series at different frequencies.|
* Multiple features of this type can result from different input parameters.
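A few of the tabulated features can be written out in NumPy to make their definitions concrete, and to show how one starred feature type yields several features under different input parameters. This is a hedged sketch: the formulas follow common definitions of these feature names (the names resemble those used by the tsfresh library, which the text here does not name), the function `feature_vector` and its output key format are hypothetical, and counting each reoccurring value once is one possible reading of "sum of reoccurring values".

```python
import numpy as np


def abs_energy(x):
    """Absolute energy: sum of the squared values."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(x * x))


def sum_of_reoccurring_values(x):
    """Sum of the distinct values that appear more than once
    (assumption: each such value is counted a single time)."""
    values, counts = np.unique(np.asarray(x, dtype=float), return_counts=True)
    return float(values[counts > 1].sum())


def c3(x, lag):
    """Non-linearity measure: mean of x[i + 2*lag] * x[i + lag] * x[i]."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if 2 * lag >= n:
        return 0.0  # series too short for this lag
    return float(np.mean(x[2 * lag:] * x[lag:n - lag] * x[:n - 2 * lag]))


def fft_coefficient_abs(x, k):
    """Magnitude of the k-th coefficient of the real-input FFT."""
    return float(np.abs(np.fft.rfft(np.asarray(x, dtype=float))[k]))


def feature_vector(x, lags=(1, 2, 3), coeffs=(0, 1)):
    """Collect features into a dict; starred types expand per parameter."""
    x = np.asarray(x, dtype=float)
    feats = {
        "maximum": float(x.max()),
        "mean": float(x.mean()),
        "abs_energy": abs_energy(x),
        "sum_of_reoccurring_values": sum_of_reoccurring_values(x),
    }
    for lag in lags:
        feats[f"c3__lag_{lag}"] = c3(x, lag)
    for k in coeffs:
        feats[f"fft_coefficient__abs__coeff_{k}"] = fft_coefficient_abs(x, k)
    return feats


features = feature_vector([1, 2, 2, 3, 1])
```

With the default parameters, the single `c3` row of the table expands into three features and `fft_coefficient` into two, so the feature vector grows with the number of parameter settings rather than with the number of table rows.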
|Task||No. tests||LOC||No. classes|
|Task||Traditional scoring||TCQA: feature extraction||TCQA: training||TCQA: prediction||TCQA in production environment (feature extraction + prediction)|
Copyright 2020 CHINA SCIENCE PUBLISHING & MEDIA LTD. (中国科技出版传媒股份有限公司). All rights reserved.