Project Description

Ms Alexandra Essebier

School of Chemistry and Molecular Biosciences
The University of Queensland

Exploring regulation through integration of high throughput sequencing data

Data science and machine learning for bioinformatics

Tuesday 3 July 2018

Alex Essebier completed her undergraduate degrees in Science (Biochemistry and Molecular Biology) and Information Technology at The University of Queensland in 2013 and a Master of Bioinformatics in 2015. Alex first developed an interest in bioinformatics in her second year of university, when she discovered it would allow her to solve a variety of biological problems by applying her programming skills. She has undertaken a number of research projects over the last five years as part of A/Prof. Mikael Bodén’s group at UQ.

These projects involved large biological datasets and allowed Alex to explore big data techniques to extract relevant patterns and relationships. Her main focus is on the use of machine learning to analyse high-throughput genomic datasets. She is currently a PhD student investigating the application of machine learning to predict long-distance regulatory interactions. The ability to accurately detect these interactions can improve our understanding of developmental disorders and diseases such as cancer.

Alex’s multidisciplinary background has allowed her to contribute bioinformatic insight to a number of research projects involving multiple collaborators. It has also given her the skills to drive her own research and work toward developing new bioinformatic tools and techniques.

High-throughput sequencing (HTS) technology has contributed to a number of discoveries about the human genome. A handful of consortia, including ENCODE and Roadmap Epigenomics, have generated thousands of HTS data sets describing different regulatory controls in the human genome. These range from ChIP-seq and DNase-seq, which describe transcription factor binding and epigenetic state, through RNA-seq and CAGE, which describe gene expression changes, to chromosome conformation capture, which explores three-dimensional structure. Together, these data describe the regulatory system in the human genome. The challenge we currently face is how to integrate these data to discover biologically relevant information.

This talk aims to explore how multiple HTS data sets can be used to answer a variety of biological questions using different bioinformatic and data science techniques.