Sep - Max Libbrecht
Speaker: Max Libbrecht
Talk Title: “Understanding the human genome through unsupervised machine learning”
Thursday, Sept 21, 2017 6:00pm
Affiliation: Assistant Professor, School of Computing Science, Simon Fraser University.
Web-site: Max Libbrecht
Although the human genome was sequenced over fifteen years ago, much is still unknown about how it functions. With the advent of high-throughput genomics technologies, it is now possible to measure a property across the entire genome in a single experiment, such as where a given protein binds to the DNA or which genes are expressed. However, the complexity and massive scale of these data sets (billions of base pairs with thousands of measurements each) pose challenges to their analysis. My research focuses on the development of new machine learning methods that address these challenges.
I will focus on two projects. First, I will present a method for understanding chromatin domains. The genomic domain where a gene resides (on the scale of 100k-1M base pairs) influences its regulation: the same gene with the same local regulatory elements (that is, the same promoter) may be expressed in one neighborhood but be silent in another. This domain-scale effect is crucial for gene regulation, but is currently much less well understood than local regulation. I will present a new method for discovering and annotating genomic domains that integrates many types of genomics data sets. Unlike previous methods, this approach can incorporate information about the 3D conformation of the genome in the nucleus. This is possible through a novel type of regularization applied to a probabilistic graphical model.
Second, I will present on the use of submodular optimization for selecting representative subsets of biological data sets. Convex optimization has revolutionized many fields in the past few decades, including machine learning and computational biology. Submodularity is a discrete analog of convexity: a set function (that is, a mathematical function defined on subsets of a larger set) is submodular if it satisfies a particular diminishing returns property. Submodular optimization has had great success in many fields, but it is not yet widely used in biology. I will demonstrate the utility of submodular optimization for biology with two applications: (1) selecting a most-informative panel of genomics assays to perform on a new tissue type, and (2) removing redundancy in protein sequence data sets.
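As a rough illustration of the diminishing returns property and of representative-subset selection, the sketch below greedily maximizes a facility-location function, a standard example of a submodular objective. The abstract does not say which objective or algorithm the talk uses, so the choice of facility location, the function names, and the toy similarity matrix are all assumptions made for illustration only.

```python
from typing import List, Sequence

# Hypothetical toy data: pairwise similarities among six items that form
# two loose clusters, {0, 1, 2} and {3, 4, 5}. Values are invented.
SIMILARITY = [
    [10, 8, 6, 1, 0, 0],
    [8, 10, 8, 1, 1, 0],
    [6, 8, 10, 2, 1, 0],
    [1, 1, 2, 10, 7, 5],
    [0, 1, 1, 7, 10, 7],
    [0, 0, 0, 5, 7, 10],
]


def facility_location(similarity: Sequence[Sequence[float]],
                      subset: Sequence[int]) -> float:
    """Sum, over every item, of its best similarity to any chosen item."""
    if not subset:
        return 0.0
    return sum(max(row[j] for j in subset) for row in similarity)


def marginal_gain(similarity: Sequence[Sequence[float]],
                  subset: List[int], item: int) -> float:
    """Increase in the objective from adding `item` to `subset`."""
    return (facility_location(similarity, subset + [item])
            - facility_location(similarity, subset))


def greedy_select(similarity: Sequence[Sequence[float]], k: int) -> List[int]:
    """Pick up to k items, each time adding the one with the largest gain."""
    selected: List[int] = []
    remaining = set(range(len(similarity)))
    for _ in range(min(k, len(similarity))):
        # Iterate in sorted order so that ties are broken deterministically.
        best = max(sorted(remaining),
                   key=lambda j: marginal_gain(similarity, selected, j))
        selected.append(best)
        remaining.remove(best)
    return selected
```

On this toy matrix, `greedy_select(SIMILARITY, 2)` picks one central item from each cluster, and `marginal_gain(SIMILARITY, [], 0)` is larger than `marginal_gain(SIMILARITY, [1], 0)`: adding item 0 helps less once the similar item 1 has already been chosen, which is the diminishing returns behaviour described above.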
Maxwell Libbrecht is an Assistant Professor in Computing Science at Simon Fraser University. He received his PhD in 2016 from the Computer Science and Engineering department at University of Washington, advised by Bill Noble and Jeff Bilmes. He received his undergraduate degree in Computer Science from Stanford University, where he did research with Serafim Batzoglou. His research focuses on developing machine learning methods applied to high-throughput genomics data sets. He was the first author of a paper named one of ISCB’s Top 10 Regulatory and Systems Genomics papers of 2015.
Trainees are invited to meet with the VanBUG speaker for open discussion of both science and career paths. This takes place 5:00-5:45pm in either the Boardroom or Lunchroom on the ground floor of the BCCRC.
- Maxwell W. Libbrecht, William S. Noble. Machine learning applications in genetics and genomics. Nature Reviews Genetics, 16: 321-332, 2015. http://dx.doi.org/10.1038/nrg3920
- Maxwell W. Libbrecht, Ferhat Ay, Michael M. Hoffman, David M. Gilbert, Jeffrey A. Bilmes, William S. Noble. Joint annotation of chromatin state and chromatin conformation reveals relationships among domain types and identifies domains of cell-type-specific expression. Genome Research, 25: 544-557, 2015. http://doi.org/10.1101/gr.184341.114
- Maxwell W. Libbrecht, Michael M. Hoffman, Jeffrey A. Bilmes, William S. Noble. Entropic graph-based posterior regularization. Proceedings of the International Conference on Machine Learning (ICML) 2015. http://jmlr.org/proceedings/papers/v37/libbrecht15.html
- Maxwell W. Libbrecht, Jeffrey A. Bilmes, William S. Noble. Eliminating redundancy among protein sequences using submodular optimization. http://dx.doi.org/10.1101/051201
(This technology is brought to you by Compute Canada and WestGrid with support from PHSA Telehealth)