In this talk I use an analysis of the insertion sites of human NuMts, which are (usually partial) copies of the mitochondrial genome found in nuclear genomes, to demonstrate some of the ways in which public data can be used to characterize genome regions. I will cover levels of transcription and open chromatin structure in various tissue types, as derived from publicly available NGS studies. I will also discuss the prediction of DNA physical structure (e.g. bendability) from sequence, as well as simple sequence trends such as C+G content and dimer frequency. I will touch upon repetitive sequences as well, both low-complexity sequences and the retrotransposons Alu and L1. Finally, I will explain the normalization methods we employed to try to understand the interdependencies between these genome characteristics. Throughout the lecture I will use NuMts as a case study, both to make the discussion concrete and because NuMts are an interesting topic in their own right. This NuMt study was a collaboration with Junko Tsuji, Martin Frith and Kentaro Tomii. In particular, Junko Tsuji did much of the thinking and all of the work.
Reference: Tsuji, J., Frith, M.C., Tomii, K. & Horton, P. (2012) Mammalian NUMT insertion is non-random. Nucleic Acids Research 40:9073-9088.