SCIENCE CHINA Information Sciences, Volume 59, Issue 8: 080104(2016) https://doi.org/10.1007/s11432-016-5595-8

## Determinants of pull-based development in the context of continuous integration

• Accepted May 30, 2016
• Published Jul 18, 2016

### Abstract

The pull-based development model, widely used by distributed software teams in open source communities, can efficiently gather the wisdom of crowds. Instead of sharing access to a central repository, contributors create a fork, update it locally, and request to have their changes merged back, i.e., submit a pull-request. On the one hand, this model lowers the barrier to entry for potential contributors, since anyone can submit pull-requests to any repository; on the other hand, it increases the burden on integrators, who are responsible for assessing the proposed patches and integrating the suitable changes into the central repository. The role of integrators in pull-based development is crucial. They must not only ensure that pull-requests meet the project's quality standards before being accepted, but also finish the evaluations in a timely manner. To keep up with the volume of incoming pull-requests, continuous integration (CI) is widely adopted to automatically build and test every pull-request at the time of submission. CI provides extra evidence about the quality of pull-requests, which helps integrators make the final decision (i.e., accept or reject). In this paper, we present a quantitative study that tries to discover which factors affect the pull-based development process, including acceptance and latency, in the context of CI. Using regression modeling on data extracted from a sample of GitHub projects deploying the Travis-CI service, we find that the evaluation process is a complex issue, requiring many independent variables to explain adequately. In particular, CI is a dominant factor in the process: it not only has a great influence on the evaluation process per se, but also changes the effects of some traditional predictors.

### Acknowledgment

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61432020, 61472430, 61502512) and the Postgraduate Innovation Fund of University of Defense Technology (Grant No. B130607). We thank Premkumar Devanbu, Vladimir Filkov and Bogdan Vasilescu for their very useful feedback on this paper.



Copyright 2019 Science China Press Co., Ltd. All rights reserved.