Maximizing the efficacy of SAGE analysis identifies novel transcripts in Arabidopsis

Plant Physiol. 2004 Oct;136(2):3223-33. doi: 10.1104/pp.104.043406.

Abstract

The efficacy of using Serial Analysis of Gene Expression (SAGE) to analyze the transcriptome of the model dicotyledonous plant Arabidopsis was assessed. We describe an iterative tag-to-gene matching process that exploits the availability of the whole genome sequence of Arabidopsis. The expression patterns of 98% of the annotated Arabidopsis genes could theoretically be evaluated through SAGE, and with the iterative matching process 79% could be identified by a tag found at a unique site in the genome. A total of 145,170 reliable experimental tags from two Arabidopsis leaf tissue SAGE libraries were analyzed, of which 29,632 were distinct. The majority (93%) of the 12,988 experimental tags observed more than once could be matched within the Arabidopsis genome. However, only 78% were matched to a single locus within the genome, reflecting the complexities associated with working in a highly duplicated genome. In addition to a comprehensive assessment of gene expression in Arabidopsis leaf tissue, we describe evidence of transcription from pseudogenes as well as evidence of alternative mRNA processing and antisense transcription. This collection of experimental SAGE tags could be exploited to assist in the ongoing annotation of the Arabidopsis genome.
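The idea of tag-to-gene matching can be illustrated with a minimal sketch. It is not the authors' iterative procedure, whose details the abstract does not give. It only shows a single pass that derives one "virtual" tag per transcript and labels observed tags by how many genes they match. The anchoring site (`CATG`), the 10-base tag length, the function names, and the toy sequences are all assumptions made for illustration.

```python
from collections import defaultdict

ANCHOR = "CATG"  # assumption: anchoring-enzyme site used in conventional SAGE
TAG_LEN = 10     # assumption: number of bases taken after the anchor site


def virtual_tag(transcript):
    """Return the tag following the 3'-most anchor site, or None if absent."""
    pos = transcript.rfind(ANCHOR)
    if pos == -1:
        return None
    start = pos + len(ANCHOR)
    tag = transcript[start:start + TAG_LEN]
    # A site too close to the 3' end cannot yield a full-length tag.
    return tag if len(tag) == TAG_LEN else None


def build_tag_index(transcripts):
    """Map each virtual tag to the set of genes predicted to produce it."""
    index = defaultdict(set)
    for gene, seq in transcripts.items():
        tag = virtual_tag(seq)
        if tag is not None:
            index[tag].add(gene)
    return index


def classify(observed_tags, index):
    """Label each observed tag 'unique', 'ambiguous', or 'unmatched'."""
    labels = {}
    for tag in observed_tags:
        genes = index.get(tag, set())
        if len(genes) == 1:
            labels[tag] = "unique"
        elif genes:
            labels[tag] = "ambiguous"
        else:
            labels[tag] = "unmatched"
    return labels


# Hypothetical toy transcripts; geneA and geneB mimic duplicated genes.
toy_transcripts = {
    "geneA": "AAACATGACGTACGTACTTT",
    "geneB": "GGGCATGACGTACGTACCCC",
    "geneC": "TTTCATGTTTTGGGGCCAAA",
    "geneD": "AAAAAAAA",  # no anchor site, so SAGE could not evaluate it
}
toy_index = build_tag_index(toy_transcripts)
toy_labels = classify(["ACGTACGTAC", "TTTTGGGGCC", "GGGGGGGGGG"], toy_index)
```

In this toy set, the tag shared by `geneA` and `geneB` is labeled ambiguous, which mirrors the single-locus problem the abstract attributes to a highly duplicated genome. A transcript without an anchor site, like `geneD`, gets no tag at all.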

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Antisense Elements (Genetics)
  • Arabidopsis / genetics*
  • Arabidopsis / metabolism
  • DNA, Plant
  • Expressed Sequence Tags
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation, Plant*
  • Gene Library
  • Molecular Sequence Data
  • Plant Leaves / metabolism

Substances

  • Antisense Elements (Genetics)
  • DNA, Plant

Associated data

  • GENBANK/GSM30396