English | 2016 | ISBN: 3658143183 | 105 Pages | PDF | 10 MB
This thesis presents a scalable, generic methodology for microbial phenotype prediction based on supervised machine learning, several models for highly relevant biological and ecological traits, and their deployment on metagenomic datasets. The results suggest that the presented prediction tool can be used to automatically annotate phenotypes in near-complete microbial genome sequences, which current metagenomic studies generate in large numbers. Unraveling the relationship between a living organism’s genetic information and its observable traits is a central biological problem. Phenotype prediction facilitated by machine learning techniques will be a major step toward creating biological knowledge from big data.