
SCIENCE CHINA Information Sciences, Volume 59, Issue 7: 070104 (2016) https://doi.org/10.1007/s11432-016-5587-8

Identification of the clustering structure in microbiome data by density clustering on the Manhattan distance

  • Received Apr 15, 2016
  • Accepted May 19, 2016
  • Published Jun 16, 2016

Abstract

Clustering groups data points so that each cluster contains similar points. In real datasets such as microbiome data, each data point is presented as a profile or probability distribution. Such data points tend to lie on the periphery of a cluster, which makes the real clustering structure difficult to identify. In this study, we applied density clustering with several distance measures to overcome this difficulty. Experiments on a real dataset indicated that the Manhattan distance is an appropriate distance measure for clustering analysis of microbiome data.
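As a rough illustration of the idea, the following Python sketch computes pairwise Manhattan distances between abundance profiles and then runs a small density-based clustering on that distance matrix. It is only a sketch under stated assumptions: the abstract does not specify which density clustering procedure was used, so a simplified density-peaks-style scheme is assumed here, and the function names (`manhattan_matrix`, `density_peaks`), the `cutoff` parameter, and the toy profiles are all hypothetical.

```python
import numpy as np

def manhattan_matrix(profiles):
    """Pairwise Manhattan (L1) distances between the rows of `profiles`."""
    diff = profiles[:, None, :] - profiles[None, :, :]
    return np.abs(diff).sum(axis=2)

def density_peaks(dist, cutoff, n_clusters):
    """Simplified density-peaks-style clustering on a distance matrix.

    Assumed scheme (not taken from the paper): local density is the number
    of neighbours within `cutoff`; centres are the points with the largest
    density times distance-to-nearest-denser-point.
    """
    n = dist.shape[0]
    # Local density: points closer than `cutoff`, excluding the point itself.
    rho = (dist < cutoff).sum(axis=1) - 1
    # Densest first; the stable sort breaks ties by original index.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    parent = np.full(n, -1)
    delta[order[0]] = dist.max()
    for rank in range(1, n):
        i = order[rank]
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]  # distance to the nearest denser point
        parent[i] = j
    gamma = rho * delta
    gamma[order[0]] = np.inf  # the densest point is always a centre
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Each remaining point inherits the label of its nearest denser point,
    # which has already been labelled because we walk in density order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels

# Toy relative-abundance profiles (each row sums to 1); purely illustrative.
profiles = np.array([
    [0.70, 0.20, 0.10],
    [0.68, 0.22, 0.10],
    [0.72, 0.18, 0.10],
    [0.10, 0.20, 0.70],
    [0.12, 0.18, 0.70],
    [0.08, 0.22, 0.70],
])
dist = manhattan_matrix(profiles)
labels = density_peaks(dist, cutoff=0.2, n_clusters=2)
print(labels)
```

Because the clustering step only consumes a distance matrix, another measure can be compared by swapping `manhattan_matrix` for a different pairwise distance function while leaving `density_peaks` unchanged.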


Funded by

Self-determined Research Funds of CCNU from the Colleges' Basic Research and Operation of MOE (CCNU16KFY04)

National Natural Science Foundation of China (61532008)

Self-determined Research Funds of CCNU from the Colleges' Basic Research and Operation of MOE (CCNU14A02008)

International Cooperation Project of Hubei Province (2014BHE0017)


Acknowledgments

This research was supported by the National Natural Science Foundation of China (Grant No. 61532008), the International Cooperation Project of Hubei Province (Grant No. 2014BHE0017), and the Self-determined Research Funds of CCNU from the Colleges' Basic Research and Operation of MOE (Grant Nos. CCNU16KFY04, CCNU14A02008).


