Mar - Chris Upton
Speaker: Chris Upton
March 13, 2003
Title of talk: How to manage and analyze poxvirus genomes with the Viral Genome Organizer and Poxvirus Orthologous Clusters software and still have time for a beer.
Affiliation: Biochemistry and Microbiology, UVic
Poxviruses are large, complex viruses with double-stranded DNA genomes (150-320 kb) that encode more than 200 proteins. More than 30 genomes have been sequenced. Despite significant effort, we still don’t know which proteins are responsible for the virulent phenotype of smallpox in comparison to the vaccine strain, vaccinia virus. Also, of the 80 genes present in every orthopoxvirus, 20% are of unknown function.
We have developed several programs designed to organize the poxvirus genome data in SQL databases and analyze the data at the level of the genome as well as the gene. Our focus has been on providing easy-to-use tools that allow the molecular virologist to make full use of the available sequence data. The tools can be thought of as catalysts for sequence analysis.
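As a rough illustration of what organizing genome data in an SQL database and querying it at both the genome and gene level might look like, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and toy rows are all hypothetical; they are not the actual schema or data behind the tools described in this talk.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE genome (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE gene (
            id        INTEGER PRIMARY KEY,
            genome_id INTEGER NOT NULL REFERENCES genome(id),
            name      TEXT NOT NULL,
            family    TEXT NOT NULL
        );
        """
    )
    # Toy placeholder rows, invented purely for the example.
    conn.executemany(
        "INSERT INTO genome (id, name) VALUES (?, ?)",
        [(1, "virus_A"), (2, "virus_B")],
    )
    conn.executemany(
        "INSERT INTO gene (genome_id, name, family) VALUES (?, ?, ?)",
        [
            (1, "A001", "family_1"),
            (1, "A002", "family_2"),
            (2, "B001", "family_1"),
        ],
    )
    return conn


def families_in_all_genomes(conn):
    """Return the gene families that have a member in every stored genome."""
    rows = conn.execute(
        """
        SELECT family
        FROM gene
        GROUP BY family
        HAVING COUNT(DISTINCT genome_id) = (SELECT COUNT(*) FROM genome)
        ORDER BY family
        """
    ).fetchall()
    return [row[0] for row in rows]
```

Calling families_in_all_genomes(build_demo_db()) returns only the families shared by both toy genomes, which is the same kind of question as asking which genes are present in every orthopoxvirus.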
The Viral Genome Organizer (VGO) and Poxvirus Orthologous Clusters (POCs) are Java client-server applications that access an up-to-date database containing all complete poxvirus genomes. This architecture supports a wide variety of platforms and removes the problem of distributing database and software updates to many users. Orthologous genes are automatically grouped into families based on BLASTP scores, and the groupings are then assessed by a human database curator. POCs permits complex SQL queries to retrieve interesting groups of DNA and protein sequences, as well as gene families, for subsequent interrogation by a variety of integrated tools: BLASTP, BLASTX, TBLASTN, Jalview (multiple alignment), Dotlet (dot plot), Laj (local alignment), and NAP (nucleotide to amino acid alignment).
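The abstract does not say how the BLASTP scores are turned into families, so the following is only one plausible sketch: single-linkage clustering with a union-find structure, where two proteins land in the same family if a chain of hits at or above a score cutoff connects them. The function name, the (query, subject, bit score) tuple layout, and the cutoff of 100 are assumptions made for illustration, not the POCs procedure.

```python
def cluster_by_blastp(hits, min_bitscore=100.0):
    """Group proteins into candidate families by single-linkage clustering.

    hits: iterable of (query_id, subject_id, bitscore) tuples (assumed layout).
    Returns a list of sorted ID lists, one per cluster.
    """
    parent = {}

    def find(x):
        # Register unseen IDs as their own root, then walk to the root
        # with path halving to keep the trees shallow.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for query, subject, bitscore in hits:
        # Every protein seen gets a cluster, even if all its hits are weak.
        find(query)
        find(subject)
        if query != subject and bitscore >= min_bitscore:
            union(query, subject)

    clusters = {}
    for protein in list(parent):
        clusters.setdefault(find(protein), set()).add(protein)
    return sorted((sorted(members) for members in clusters.values()),
                  key=lambda members: members[0])
```

Single linkage tends to merge families through one spurious strong hit, which is one reason a manual review step by a curator, as described above, is useful after any automatic grouping.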
The POCs database stores 1) a corrected version of the data provided in the GenBank file for complete poxvirus genomes and 2) annotations, ORFs, molecular weight (MW), isoelectric point (pI), nucleotide and amino acid (aa) frequencies, codon use, BLASTP scores, and orthologous cluster membership data for each gene. VGO is used to interact with the genome sequences via a GUI. Preprocessed BLAST searches (BLASTP and PSI-BLAST) for all proteins against the NR protein database, and TBLASTN searches against the EST database, are available and updated monthly.
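Two of the per-gene values listed above, nucleotide frequencies and codon use, are simple to compute from an ORF sequence. The sketch below shows one way to derive them; the function names are hypothetical and it makes no claim about how the POCs database actually calculates or stores these values.

```python
from collections import Counter


def nucleotide_frequencies(seq):
    """Return the fraction of A, C, G and T in a DNA sequence."""
    seq = seq.upper()
    total = len(seq)
    if total == 0:
        return {base: 0.0 for base in "ACGT"}
    counts = Counter(seq)
    # Ambiguity codes such as N count toward the total but are not reported.
    return {base: counts.get(base, 0) / total for base in "ACGT"}


def codon_usage(orf):
    """Count codons in an ORF read in frame from its first base.

    Trailing bases that do not form a complete codon are ignored.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    return Counter(orf[i:i + 3] for i in range(0, usable, 3))
```

For example, codon_usage("ATGAAATAA") counts one occurrence each of ATG, AAA and TAA.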
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License.