Multivariate models for dimension reduction and biomarker selection in omics data – Dr. Kim-Anh Le Cao

Recent advances in high-throughput ‘omics’ technologies enable quantitative measurement of the expression or abundance of biological molecules across a whole biological system. The transcriptome, proteome and metabolome are dynamic entities, with the presence, abundance and function of each transcript, protein and metabolite being critically dependent on its temporal and spatial location.
Whilst single-omics analyses are commonly performed to detect between-group differences in either static or dynamic experiments, the integration of multi-layer information is required to fully unravel the complexities of a biological system. Data integration relies on the currently accepted biological assumption that the functional levels are related to one another. Therefore, considering all the biological entities (transcripts, proteins, metabolites) as part of a whole biological system is crucial to understanding living organisms.
With many contributors and collaborators, we have further developed several multivariate approaches to project high-dimensional data into a lower-dimensional space and select relevant biological features, while capturing the largest sources of variation in the data. These approaches are based on variants of Partial Least Squares regression and Canonical Correlation Analysis and enable the integration of several types of omics data.
In this presentation, I will illustrate how various techniques enable exploration, biomarker selection and visualisation for different types of analytical frameworks.
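
As a rough illustration of the projection-and-selection idea behind the Partial Least Squares variants mentioned above, the sketch below extracts one sparse PLS-style component from two toy data blocks. It is not the speaker's implementation: the function names, the `keep` parameter (how many features of the first block to retain) and the simulated data are all assumptions made for this example, and the soft-thresholding scheme is only one common way to obtain sparse loadings.

```python
import math
import random

def center_columns(M):
    """Subtract each column's mean (rows are samples, columns are features)."""
    n = len(M)
    means = [sum(row[j] for row in M) / n for j in range(len(M[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in M]

def normalise(w):
    """Scale a vector to unit Euclidean norm (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in w))
    return [x / norm for x in w] if norm > 0 else list(w)

def soft_threshold_keep(w, keep):
    """Shrink entries towards zero so that at most `keep` stay non-zero."""
    mags = sorted((abs(x) for x in w), reverse=True)
    lam = mags[keep] if keep < len(w) else 0.0
    return [math.copysign(max(abs(x) - lam, 0.0), x) for x in w]

def sparse_pls_component(X, Y, keep, n_iter=50):
    """Hypothetical helper: one sparse PLS-style component.

    Alternating updates approximate the leading singular vectors of X^T Y,
    with soft-thresholding applied to the X loadings only.
    Returns (u, v, t): sparse X loadings, Y loadings, and sample scores t = X u.
    """
    X, Y = center_columns(X), center_columns(Y)
    n, p, q = len(X), len(X[0]), len(Y[0])
    # Cross-product matrix M = X^T Y (p x q)
    M = [[sum(X[i][j] * Y[i][k] for i in range(n)) for k in range(q)]
         for j in range(p)]
    v = normalise([1.0] * q)
    for _ in range(n_iter):
        w = [sum(M[j][k] * v[k] for k in range(q)) for j in range(p)]
        u = normalise(soft_threshold_keep(w, keep))
        v = normalise([sum(M[j][k] * u[j] for j in range(p)) for k in range(q)])
    t = [sum(X[i][j] * u[j] for j in range(p)) for i in range(n)]
    return u, v, t

# Simulated toy data (an assumption for illustration): a latent variable z
# drives the first two X features and both Y features; the other four
# X features are pure noise.
random.seed(0)
n = 50
z = [random.gauss(0, 1) for _ in range(n)]
X = [[z[i] + 0.1 * random.gauss(0, 1), -z[i] + 0.1 * random.gauss(0, 1)]
     + [random.gauss(0, 1) for _ in range(4)] for i in range(n)]
Y = [[z[i] + 0.1 * random.gauss(0, 1), 0.5 * z[i] + 0.1 * random.gauss(0, 1)]
     for i in range(n)]

u, v, t = sparse_pls_component(X, Y, keep=2)
selected = [j for j, weight in enumerate(u) if weight != 0.0]
print("selected X features:", selected)
```

Here the scores `t` give a one-dimensional projection of the samples, while the non-zero entries of `u` indicate which features of the first block were selected. In practice, further components would be obtained after deflating the data, and the number of retained features would be tuned, for example by cross-validation.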

