NCBI News
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health
Fall 2001





Notice to Subscribers

For your records, there was no Summer 2001 issue of NCBI News.

 



Using TaxPlot to Compare Genomes


TaxPlot is a tool for three-way comparisons of genomes on the basis of the protein sequences they encode. To use TaxPlot, one selects a reference genome and two other genomes to compare against it. Pre-computed BLAST results are then used to plot a point for each predicted protein in the reference genome, based on its best alignment with the proteins of each of the two genomes being compared.
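The plotting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not NCBI's implementation: the function names, the `(query_id, subject_id, score)` tuple layout, and the choice to score a protein with no hit as 0 are all hypothetical.

```python
def best_scores(hits):
    """Map each reference protein to its highest BLAST score in one genome.

    `hits` is an iterable of (query_id, subject_id, score) tuples, an
    assumed layout chosen for this sketch.
    """
    best = {}
    for query_id, _subject_id, score in hits:
        if score > best.get(query_id, 0.0):
            best[query_id] = score
    return best


def taxplot_points(reference_ids, hits_a, hits_b):
    """Return {protein_id: (x, y)} with one point per reference protein.

    x is the best score against genome A and y the best score against
    genome B. Proteins with no hit are placed at 0 (an assumption).
    """
    best_a = best_scores(hits_a)
    best_b = best_scores(hits_b)
    return {
        pid: (best_a.get(pid, 0.0), best_b.get(pid, 0.0))
        for pid in reference_ids
    }
```

Each resulting `(x, y)` pair corresponds to one point in the plot: a reference protein positioned by its best score in each of the two compared genomes.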

Figure 1 shows a TaxPlot in which E. coli K12 has been selected as the reference genome for comparison of two strains of H. pylori, J99 and 26695. Each point in the figure represents a single E. coli protein. The X and Y coordinates of a point are the BLAST scores of that protein’s closest matches in the two H. pylori strains. There are 217 E. coli proteins that are equally similar to proteins in the two H. pylori strains, as shown by the points lying on the central diagonal. E. coli has 678 proteins with greater similarity to H. pylori strain J99 (if only marginally), and 687 proteins with greater similarity to H. pylori strain 26695.
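Counts like those above amount to sorting the points into three groups: on the diagonal, below it, and above it. A minimal sketch follows, assuming the points are held in a dictionary of `(x, y)` score pairs. The function name and the group labels are hypothetical, and which strain sits on which axis is left open because the text does not say.

```python
def summarize_points(points):
    """Count points on the diagonal and on either side of it.

    `points` maps protein IDs to (x, y) best-score pairs. Scores are
    compared for exact equality, which assumes they are reported at a
    fixed precision.
    """
    counts = {"equal": 0, "higher_x": 0, "higher_y": 0}
    for x, y in points.values():
        if x == y:
            counts["equal"] += 1      # on the central diagonal
        elif x > y:
            counts["higher_x"] += 1   # better match in the X-axis genome
        else:
            counts["higher_y"] += 1   # better match in the Y-axis genome
    return counts
```

The three counts always sum to the number of plotted proteins, which makes a quick sanity check on the input.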

[Chart: distribution of E. coli K12 homologs]

Figure 1: TaxPlot for two strains of H. pylori against E. coli as the reference genome. Points representing proteins involved in amino acid transport and metabolism are highlighted in blue.

Overall, the proteomes of the two H. pylori strains appear to be equally similar to that of E. coli. However, a few significant differences between the H. pylori strains show up as off-diagonal points toward the left-hand portion of the plot. These points represent E. coli proteins that have a better match in one strain of H. pylori than in the other.

For instance, a number of E. coli proteins have low BLAST scores against the H. pylori J99 strain, yet relatively high BLAST scores against the 26695 strain. These points may represent cases in which different selection pressures are operating on the orthologs of these E. coli proteins in the two H. pylori strains. To determine whether there is a pattern to these differences, one may identify individual points on the plot to learn the function of the E. coli proteins indicated.

Subsets of the plotted points can be selected by clicking on an area of the graph or by using a menu box to select proteins by functional class. Hyperlinks to the BLAST2 Sequences service provide displays of pairwise alignments. In the figure, those proteins known in E. coli to be involved in amino acid transport and metabolism have been selected and appear blue in the plot. Note that most of the off-diagonal proteins of this type are more similar to proteins from H. pylori strain 26695 than to those from J99, suggesting that H. pylori strain J99 may be undergoing a restructuring of some aspects of its amino acid processing systems. Such restructuring could represent an important adaptation in the J99 strain of relevance to its pathogenesis.
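The same kind of selection can be approximated offline. The sketch below filters points by functional class and lists those that lie well off the diagonal. It is an illustration only: the function names, the class labels, and the `min_gap` cutoff are hypothetical, and TaxPlot itself defines no such threshold in the text above.

```python
def select_class(points, functional_class, wanted):
    """Keep only the points whose protein belongs to the wanted class.

    `functional_class` maps protein IDs to class labels (assumed layout).
    """
    return {
        pid: xy
        for pid, xy in points.items()
        if functional_class.get(pid) == wanted
    }


def off_diagonal(points, min_gap):
    """List (protein_id, x - y) pairs with |x - y| >= min_gap.

    Results are ordered with the largest absolute gap first. A negative
    gap means the protein matches better in the Y-axis genome.
    """
    gaps = [
        (pid, x - y)
        for pid, (x, y) in points.items()
        if abs(x - y) >= min_gap
    ]
    return sorted(gaps, key=lambda item: abs(item[1]), reverse=True)
```

If most gaps in a selected class share the same sign, the class is skewed toward one of the two genomes, which is the pattern described above for amino acid transport and metabolism.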

The TaxPlot tool is accessible from the Entrez Genomes Web page, under Tools and Analysis. In addition to the microbial genome version described here, there is also a TaxPlot service for eukaryotic genomes.



