SCIENCE CHINA Information Sciences, Volume 61, Issue 9: 092102(2018) https://doi.org/10.1007/s11432-017-9418-9

Building application-specific operating systems: a profile-guided approach

More info
  • Received: Nov 21, 2017
  • Accepted: Mar 13, 2018
  • Published: Aug 13, 2018


Although operating system optimization has been studied extensively, previous work has mainly focused on solving performance problems. In the cloud era, many servers run only a single application, making it desirable to provide an application-specific operating system (ASOS) that is best suited to that application. In contrast to existing approaches that build an ASOS by manual redesign and reimplementation, this paper presents Tarax, a compiler-based approach to constructing an ASOS for each application. Using profiles collected by executing the target application on an instrumented Linux kernel, Tarax recompiles the kernel while applying profile-guided optimizations (PGOs). Although GCC already implements this optimization process for user applications, it does not work on the Linux kernel directly. We modify the Linux kernel and GCC to support kernel instrumentation and profile collection. We also modify GCC to reduce the size of optimized kernel images. We conduct experiments on six popular server applications: Apache, Nginx, MySQL, PostgreSQL, Redis, and Memcached. Experimental results show that application performance improves by 8.8% on average (up to 16%) on the ASOS. We also perform a detailed analysis to reveal how the resulting ASOS improves performance, and discuss future directions in ASOS construction.
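
The instrument, collect, and recompile loop described in the abstract can be illustrated with a toy sketch. This is not Tarax or GCC code: the function names (`fast_path`, `slow_path`, `error_path`), the workload, and the hot-first ordering rule are all hypothetical stand-ins, chosen only to show how execution counts gathered from a representative run can drive a later layout decision.

```python
from collections import Counter
import functools

# Hypothetical stand-in for the execution counters an instrumented build records.
profile = Counter()

def instrumented(fn):
    """Step 1 (instrumentation): count every call to fn."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        profile[fn.__name__] += 1
        return fn(*args, **kwargs)
    return wrapper

# Hypothetical routines; the names are illustrative only.
@instrumented
def fast_path(x):
    return x + 1

@instrumented
def slow_path(x):
    return x * 2

@instrumented
def error_path(x):
    return -x

def workload(requests):
    """Step 2 (profile collection): run a representative workload."""
    for x in requests:
        if x % 100 == 0:
            slow_path(x)   # rare case
        else:
            fast_path(x)   # common case

def plan_layout(counts, all_functions):
    """Step 3 (profile use): order executed functions hot-first and
    list the never-executed ones separately as cold."""
    hot = [name for name, _ in counts.most_common()]
    cold = [name for name in all_functions if name not in counts]
    return hot, cold

workload(range(1, 1001))
hot, cold = plan_layout(profile, ["fast_path", "slow_path", "error_path"])
print("hot: ", hot)
print("cold:", cold)
```

In this toy run, `fast_path` is called 990 times and `slow_path` 10 times, so they are placed first in that order, while `error_path` never runs and is classified as cold. A real compiler uses such counts for decisions like inlining and basic block and function reordering, which this sketch does not attempt to model.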


This work was partly supported by the National Key Research and Development Program (Grant No. 2017YFB1001904) and the National Natural Science Foundation of China (Grant No. 61772042).



  • Figure 3

    (Color online) System architecture of Tarax.

  • Figure 4

    (Color online) The building and optimization workflow in Tarax.

  • Figure 5

    Performance speedup on different optimized kernels. For each application, the numbers are normalized to the performance when it runs on the kernel optimized for itself.

  • Figure 6

    (Color online) Performance comparison with different workloads. (a) Nginx; (b) Memcached.

  • Figure 11

    I-cache miss rates from dynamic profiling. (a) Kernel; (b) application.

  • Figure 12

    Branch statistics from dynamic profiling. (a) Branch misprediction rate; (b) taken branch instructions; (c) branch taken rate.

  • Figure 13

    Top 10 live kernel functions at runtime of Apache. (a) Vanilla kernel; (b) optimized kernel.

  • Figure 14

    Code of kernel function thread_group_cputime_adjusted.

  • Figure 15

    Effects of profile feedback on different GCC optimizations. Results shown are the performance improvements from enabling the respective option over disabling it, with or without profile feedback. (a) Function inlining; (b) basic block and function reordering.

  • Table 1   Experimental environment
    Type               Parameters
    Processor          Intel Core i7-4770
    Memory             32 GB DDR3 1600 MHz
    Network            10 Gbps LAN
    Kernel             Linux 4.1.2
    Kernel compiler    GCC 5.1.1
    Operating system   Debian sid amd64
    File system        tmpfs
  • Table 2   Application versions
    Application name   Version
    Apache             2.4.23
    Nginx              1.10.2
    MySQL              5.6.25
    PostgreSQL         9.3.9
    Redis              3.0.2
    Memcached          1.4.21
  • Table 3
                                 Vanilla                 Tarax
    Application (metric)         Mean      Stdev (%)     Mean      Stdev (%)     Improvement (%)
    Apache (requests/s)          61843     0.16          69186     0.71          11.9
    Nginx (requests/s)           255397    0.25          298443    0.30          16.9
    MySQL (trans/min)            70499     0.25          74489     0.43          5.7
    PostgreSQL (trans/min)       80943     0.59          83194     0.50          2.8
    Redis (operations/s)         367807    0.45          396407    0.23          7.8
    Memcached (operations/s)     427715    0.80          464129    0.23          8.5
    Average (geomean)                                                            8.8
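
The improvement column and the geometric-mean average in Table 3 can be recomputed from the reported means. The sketch below assumes that improvement is defined as (Tarax mean / vanilla mean − 1) × 100 and that the average is the geometric mean of the per-application speedups; the variable and function names are ours, not the paper's.

```python
import math

# (vanilla mean, Tarax mean) throughput values copied from Table 3
results = {
    "Apache": (61843, 69186),
    "Nginx": (255397, 298443),
    "MySQL": (70499, 74489),
    "PostgreSQL": (80943, 83194),
    "Redis": (367807, 396407),
    "Memcached": (427715, 464129),
}

def speedup(vanilla, tarax):
    """Ratio of optimized to baseline throughput."""
    return tarax / vanilla

def improvement_pct(vanilla, tarax):
    """Relative improvement in percent (assumed definition)."""
    return (speedup(vanilla, tarax) - 1.0) * 100.0

def geomean(values):
    """Geometric mean, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))

per_app = {name: improvement_pct(v, t) for name, (v, t) in results.items()}
average = (geomean(speedup(v, t) for v, t in results.values()) - 1.0) * 100.0

for name, pct in per_app.items():
    print(f"{name:<11}{pct:5.1f}%")
print(f"{'Geomean':<11}{average:5.1f}%")
```

Under these assumptions, the per-application values round to the figures in the improvement column, and the geometric mean rounds to 8.8%, which agrees with the average stated in the abstract.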
