SCIENTIA SINICA Informationis, Volume 47, Issue 11: 1483-1492 (2017) https://doi.org/10.1360/N112017-00106

Model reuse with domain knowledge

  • Received May 16, 2017
  • Accepted May 22, 2017
  • Published Nov 14, 2017


Machine learning models often have short life spans, and many are wasted because they can be applied only to the specific task they were trained for. However, a well-designed, carefully trained model contains knowledge learned from its task, and that knowledge may be more concise than the training data. Furthermore, when the training data is no longer accessible, the trained model is the last remaining source of information. This study introduces a framework that reuses existing models trained on other tasks to help improve the model for the current task, especially when only limited data is available for the current task. The framework uses high-level domain knowledge to combine the existing models and treats them as black boxes, which makes it applicable to complex models. Experiments on practical problems demonstrate that reusing existing models improves performance on the current task.
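The idea of combining black-box models through domain knowledge can be illustrated with a minimal sketch. It is not the framework proposed in the paper: it assumes, for illustration only, that the domain knowledge is given as a relatedness matrix between the labels of an existing model and the labels of the current task, and that the current-task model is a simple ridge regressor. The names `augment_with_reused_models`, `fit_ridge`, `old_model`, and `rel` are hypothetical.

```python
import numpy as np


def augment_with_reused_models(x, black_box_models, relatedness):
    """Append reused-model signals to the raw features.

    Each existing model is only called for predictions (black box).
    Its scores over its own labels are mapped into the current task's
    label space by a relatedness matrix (assumed domain knowledge).
    """
    parts = [x]
    for predict, rel in zip(black_box_models, relatedness):
        source_scores = predict(x)         # shape (n, k_source)
        parts.append(source_scores @ rel)  # shape (n, k_target)
    return np.hstack(parts)


def fit_ridge(features, targets, reg=1.0):
    """Closed-form ridge regression, used here as the current-task model."""
    d = features.shape[1]
    gram = features.T @ features + reg * np.eye(d)
    return np.linalg.solve(gram, features.T @ targets)


# Toy current task: 20 labeled instances, 4 features, 2 labels.
rng = np.random.default_rng(0)
x_small = rng.normal(size=(20, 4))
y_small = (x_small[:, :2] > 0).astype(float)

# Hypothetical existing model with 3 labels; only its outputs are visible.
w_old = rng.normal(size=(4, 3))


def old_model(x):
    return 1.0 / (1.0 + np.exp(-(x @ w_old)))


# Assumed domain knowledge: how each old label relates to each new label.
rel = np.array([[1.0, 0.0],
                [0.5, 0.5],
                [0.0, 1.0]])

z = augment_with_reused_models(x_small, [old_model], [rel])
weights = fit_ridge(z, y_small)
scores = z @ weights  # current-task scores, shape (20, 2)
```

Because the existing model is accessed only through its predictions, it could be replaced by any other predictor, including a complex one, without changing the rest of the sketch.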





  • Figure 1   Illustration of the MRDK framework
  • Figure 2   The data and model flow of MRDK
  • Figure 3   Part of the ancestor chart of endodeoxyribonuclease activity

  • Table 1   Proteome datasets information
    Domain     Proteome  #Instance  #Class  Label cardinality
    Bacteria   GEOSL     378        319     3.143
    Bacteria   AZOVD     406        340     3.993
    Archaea    HALMA     304        234     3.247
    Archaea    PYRFU     425        321     4.480
    Eukaryota  YEAST     3507       1566    5.887
  • Table 2   Results of protein function prediction
    Proteome   Hamming loss $\downarrow$   F-measure $\uparrow$
               $f_0$   $f^+$               $f_0$   $f^+$
  • Table 3   Information of image datasets
    Dataset    #Instance  #Class  Label cardinality
    Scene      2000       5       1.236
    VOC07      9963       20      1.437
    MS-COCO    122218     80      2.926
    NUS-WIDE   133441     81      1.761
  • Table 4   Results of image classification
    Dataset    Hamming loss $\downarrow$   F-measure $\uparrow$
               $f_0$   $f^+$               $f_0$   $f^+$
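The quantities named in the table headers (label cardinality, Hamming loss, and F-measure) can be computed for binary multi-label matrices as in the sketch below. It uses the standard definitions of label cardinality and Hamming loss; for the F-measure it assumes the example-based variant averaged over instances, since this page does not state which variant is reported. The function names and the toy matrices are illustrative and not taken from the paper.

```python
import numpy as np


def label_cardinality(y):
    """Average number of relevant labels per instance."""
    return float(np.mean(np.sum(y, axis=1)))


def hamming_loss(y_true, y_pred):
    """Fraction of instance-label pairs predicted incorrectly (lower is better)."""
    return float(np.mean(y_true != y_pred))


def example_f_measure(y_true, y_pred):
    """Example-based F-measure averaged over instances (higher is better).

    Assumption: an instance with no true and no predicted labels scores 1.
    """
    tp = np.sum((y_true == 1) & (y_pred == 1), axis=1)
    denom = np.sum(y_true, axis=1) + np.sum(y_pred, axis=1)
    per_instance = np.where(denom > 0, 2.0 * tp / np.maximum(denom, 1), 1.0)
    return float(np.mean(per_instance))


# Toy data: 3 instances, 4 labels.
y_true = np.array([[1, 0, 1, 0],
                   [0, 1, 0, 0],
                   [1, 1, 0, 1]])
y_pred = np.array([[1, 0, 0, 0],
                   [0, 1, 0, 1],
                   [1, 1, 0, 1]])

print(label_cardinality(y_true))          # 2.0
print(hamming_loss(y_true, y_pred))       # 2 wrong out of 12 pairs
print(example_f_measure(y_true, y_pred))  # mean of 2/3, 2/3 and 1
```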
