Apr - Ibrahim Numanagic

Speaker: Ibrahim Numanagic

Talk Title: How to Build a High-Performance and User-Friendly Language for Data Science and Bioinformatics

Event Details

Note the change in time/location!

Date/Time:

Thursday, April 16th, 2020 4:00pm

Location:

Zoom link details:

– Direct link: https://zoom.us/j/860225192?pwd=U2lUVUM2RlArb2NRZ1RGM3YxSThZQT09

– Room ID: 860 225 192

– Password: 090148

Affiliation: Assistant Professor, Department of Computer Science, University of Victoria

Canada Research Chair in Computational Biology

Bio:

Assistant Professor in the UVic Department of Computer Science. Previously a postdoctoral associate in the Computation and Biology Group at MIT CSAIL; PhD in Computational Biology from SFU. Research interests include developing efficient and scalable combinatorial algorithms and tools to help analyze vast amounts of genomic sequencing data.

Abstract:

The scope and scale of biological data are increasing at an exponential rate as sequencing technologies become radically cheaper and more prevalent. Over the last two decades, the cost of sequencing a genome has dropped from $100 million to nearly $100, a factor of roughly one million, and the amount of data to be analyzed has increased proportionally. Yet, as Moore’s Law continues to slow, scientists can no longer rely on computing hardware to compensate for the ever-increasing size of biological datasets. In a field where many researchers are primarily focused on biological analysis over computational optimization, the unfortunate solution to this problem is often to simply buy larger and faster machines, an ad hoc fix that cannot be sustained for long.

In this talk, I will introduce Seq, the first language tailored specifically to bioinformatics, which marries the ease and productivity of Python with C-like performance. Seq is a clone of Python, and a drop-in replacement for it, that also incorporates novel features oriented toward bioinformatics applications. Compared with equivalent CPython code, Seq attains a performance improvement of up to two orders of magnitude, and a 175x improvement once domain-specific language features and optimizations are used. Compared with optimized C++ code, which is already difficult for most biologists to produce, Seq frequently attains up to a 2x improvement with shorter, cleaner code. Thus, Seq opens the door to an age of democratization of highly optimized bioinformatics software.

Link to slides:

to be posted after the talk

Introductory Speaker: Emre Erhan (UBC, Jones Lab)

Talk Title: Support vector machines predict metastatic cancer patient drug response

Link to slides:

to be posted after the talk