Mar - Michael Hoffman
Speaker: Michael Hoffman
Talk Title: Identifying transcription factor binding using open chromatin, transcriptome, and methylation data
Wednesday, March 17th, 2021, 11:00 am ~ 12:30 pm (Pacific Time)
Virtually on Zoom
Affiliation: Associate Professor, Department of Medical Biophysics/Department of Computer Science, University of Toronto
Michael Hoffman creates predictive computational models to understand interactions between genome, epigenome, and phenotype in human cancers. He implemented the genome annotation method Segway, which simplifies interpretation of large multivariate genomic datasets, and was a linchpin of the NIH ENCODE Project analysis. He is a principal investigator at the Princess Margaret Cancer Centre and Associate Professor in the Departments of Medical Biophysics and Computer Science, University of Toronto. He was named a CIHR New Investigator and has received several awards for his academic work, including the NIH K99/R00 Pathway to Independence Award and the Ontario Early Researcher Award.
First, we will discuss a new method, Virtual ChIP-seq, which predicts binding of individual transcription factors in new cell types using an artificial neural network that integrates ChIP-seq results from other cell types and chromatin accessibility data in the new cell type. Virtual ChIP-seq also uses learned associations between gene expression and transcription factor binding at specific genomic regions. This approach outperforms methods that use transcription factor sequence preferences in the form of position weight matrices, predicting binding for 33 transcription factors.
Second, we will discuss a new method to discover transcription factor motifs and identify transcription factor binding sites in DNA with covalent modifications such as methylation. Just as transcription factors distinguish one standard nucleobase from another, they also distinguish unmodified and modified bases. To represent the modified bases in a sequence, we replace cytosine (C) with symbols for 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC). Similarly, we adapted the well-established position weight matrix model of transcription factor binding affinity to an expanded alphabet. We created an expanded-alphabet genome sequence using genome-wide maps of 5mC and 5hmC in mouse naive T cells. Using this sequence and expanded-alphabet position weight matrices, we reproduced various known methylation binding preferences, including the preference of ZFP57 and C/EBPβ for methylated motifs and the preference of c-Myc for unmethylated motifs. Using these known binding preferences to tune model parameters enables discovery of novel modified motifs.
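As a rough illustration of the expanded-alphabet idea in the second part of the talk, the sketch below replaces selected cytosines with modified-base symbols and scores sequence windows against a position weight matrix that has one row per symbol. It is a toy example, not the speaker's implementation: the symbols `m` (5mC) and `h` (5hmC), the function names, and the matrix values are all assumptions made up for demonstration.

```python
# Assumed expanded alphabet: the four standard bases plus
# 'm' for 5-methylcytosine and 'h' for 5-hydroxymethylcytosine.
ALPHABET = "ACGTmh"


def expand_sequence(seq, modifications):
    """Return seq with the C at each given 0-based position replaced
    by a modified-base symbol ('m' or 'h')."""
    bases = list(seq)
    for pos, symbol in modifications.items():
        if symbol not in ("m", "h"):
            raise ValueError(f"unknown modification symbol {symbol!r}")
        if bases[pos] != "C":
            raise ValueError(f"position {pos} is {bases[pos]!r}, not C")
        bases[pos] = symbol
    return "".join(bases)


def score_window(pwm, window):
    """Sum the per-position scores of a window that is as long as the PWM."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, seq):
    """Scan seq and return (score, start) of the highest-scoring window."""
    width = len(pwm)
    return max(
        (score_window(pwm, seq[i:i + width]), i)
        for i in range(len(seq) - width + 1)
    )


def toy_column(preferred):
    """Hypothetical log-odds column: reward one symbol, penalize the rest."""
    return {base: (2.0 if base == preferred else -1.0) for base in ALPHABET}


# Toy motif that prefers a methylated cytosine in the middle: T, m, G.
TOY_PWM = [toy_column("T"), toy_column("m"), toy_column("G")]

if __name__ == "__main__":
    unmodified = "ATCGATCG"
    modified = expand_sequence(unmodified, {2: "m"})
    print(modified, best_hit(TOY_PWM, modified))
    print(unmodified, best_hit(TOY_PWM, unmodified))
```

Because the modified base is its own symbol, the same scanning code distinguishes a methylated site from an unmethylated one without any special handling; only the matrix gains extra rows.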
Introductory Speaker: Xi (Nicole) Zhang (Dr. Wyeth Wasserman’s lab, UBC)
Talk Title: Novel Cell-specific Generative Deep Learning Model