Project Description

A/Prof. Aaron Darling

University of Technology Sydney

Statistics and machine learning for metagenomics: current solutions and open challenges

Data science and machine learning for bioinformatics

Wednesday 4 July 2018

Aaron Darling is an Associate Professor in Computational Genomics and Bioinformatics in the UTS Faculty of Science’s ithree institute. He has over a decade of experience developing computational methods for comparative genomics and evolutionary modeling, and in 2013 he moved from the University of California, Davis to start a computational genomics group at UTS.

Metagenomics is the analysis of microbial communities by sequencing the total DNA of the biomass in a sample (human faeces, for example). Sample processing inherently mixes the DNA from all the organisms in the sample, so the reads generated by sequencing are no longer associated with the cell or species from which they were derived. However, to answer many pressing biological questions, the sequence reads must be assigned back to the species or genomes that were present in the sample. In this talk I will review some of the common types of analysis applied to metagenomic data, including taxonomic analysis, community profiling, metagenome assembly, and binning. For each of these I will highlight how machine learning and statistical methods can play a role in the analysis. Finally, I will describe the performance of some common metagenomic analysis software packages and highlight some ongoing challenges in this field.
