Sep - Rayan Chikhi

Speaker: Rayan Chikhi

Talk Title: Efficient indexing of k-mer presence and abundance in sequencing datasets

Event Details

Note the change in day/time/location!


Wed, Sept 16th, 2020 11:00am



Virtually on Zoom

Seminar Date/Time:

Wednesday, September 16th, 2020, 11:00 am to 12:30 pm (Pacific Time)

Affiliation: Group leader and researcher at the Pasteur Institute and the University of Lille, France


Rayan Chikhi studied Computer Science at ENS Rennes and obtained a PhD in 2012 under the supervision of Dominique Lavenier. After a postdoc at Penn State University in Paul Medvedev’s lab, he was hired by the French National Centre for Scientific Research (CNRS) as a junior researcher in 2014 and became part of the Bonsai bioinformatics team, where he still supervises researchers. In 2019, he started a “Sequence Bioinformatics” research group at the Center of Bioinformatics, Biostatistics and Integrative Biology of Institut Pasteur, partly funded by the Inception program.


We will talk about k-mer indexing through REINDEER, a novel computational method that indexes sequences across collections of sequencing datasets. The main novelty is that REINDEER associates abundances with k-mers efficiently, which other indexing methods have so far been unable to do. We used REINDEER to index the abundances of sequences within 2,585 human RNA-seq experiments in 45 hours using only 56 GB of RAM. This makes REINDEER the first method able to record abundances within a large k-mer matrix, of size ∼4 billion (the number of distinct k-mers seen across datasets) times 2,585 (the number of datasets). REINDEER uses a compacted de Bruijn graph behind the scenes, and also supports exact presence/absence queries of k-mers. We will briefly discuss some actual and potential applications through the indexing of thousands of SARS-CoV-2 sequencing datasets and a user-friendly interface (
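
For readers new to the idea, the k-mer abundance matrix mentioned in the abstract can be pictured with a toy Python sketch. This is not REINDEER's implementation: the abstract says REINDEER relies on a compacted de Bruijn graph, whereas this sketch assumes a plain dictionary, ignores reverse complements, and would not scale to billions of k-mers. The function names and the two sample datasets are hypothetical and exist only for illustration.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every k-mer (substring of length k) in one sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def build_abundance_index(datasets, k):
    """Map each distinct k-mer to a row of per-dataset counts.

    Rows are k-mers and columns are datasets, mirroring the
    (distinct k-mers) x (datasets) matrix described in the abstract.
    Illustrative only: a dictionary of lists, not a compacted de Bruijn graph.
    """
    names = sorted(datasets)
    index = {}
    for column, name in enumerate(names):
        for read in datasets[name]:
            for kmer, count in count_kmers(read, k).items():
                row = index.setdefault(kmer, [0] * len(names))
                row[column] += count
    return names, index


def query(index, n_datasets, kmer):
    """Return the abundance of a k-mer in every dataset (zeros if unseen)."""
    return index.get(kmer, [0] * n_datasets)


# Hypothetical toy data: two "datasets", each a list of short reads.
toy_datasets = {
    "sample_A": ["ACGTACGT", "CGTA"],
    "sample_B": ["ACGTTT"],
}

names, index = build_abundance_index(toy_datasets, k=4)
abundances = query(index, len(names), "ACGT")
presence = [count > 0 for count in abundances]
```

A presence/absence query falls out of the abundance query by testing each count against zero, which is why the last line derives `presence` from `abundances`.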

Introductory Speaker: Courtney van Ballegooie (UBC, Donald Yapp and Marcel Bally labs)

Talk Title: Clinical Protein Nanoparticles: Creating a Tunable and Reproducible Microfluidic Synthesis Method