2011 Rustbelt RNA Meeting
RRM
Poster abstracts
Abstract:
The peanut (Arachis hypogaea), an annual herbaceous plant in the legume family, is an important crop for oil production and food. As of July 15, 2011, NCBI GenBank held a total of 198,156 nucleotide sequences for Arachis hypogaea, including 39,854 Nucleotide (Core Nucleotide Sequences), 150,177 EST (Expressed Sequence Tag), and 8,125 GSS (Genome Survey Sequence) records. NCBI UniGene Build #1, developed in December 2009 from the 73,407 Sanger ESTs available at that time, contains 11,909 UniGene clusters. With next-generation sequencing technologies such as 454 pyrosequencing and Illumina SBS, more and more genomic resources are being generated for peanut. For example, a total of 596.5 million bases (3.6 GB) of 454 ESTs are available in the NCBI SRA. Recently, Illumina paired-end sequencing of the peanut transcriptome was conducted at the Institute of Plant Protection, Chinese Academy of Agricultural Sciences, generating a total of 20 GB of sequence data (~80 million reads). As the first public genomic database dedicated to Arachis hypogaea, PeanutDB currently focuses on transcriptomic analysis of this species. Using all of the aforementioned cDNA data, we have created our first release of the peanut transcriptome assembly, consisting of 32,619 contigs. We not only provide GO, KEGG, EC, and InterProScan annotations for these contigs, but also determine the relationships among them based on sequence similarity and on potential linkage inferred from paired-end/clone-end read information. We expect the database to be a useful bioinformatics resource that will facilitate peanut genome sequencing and gene annotation in the future.
Keywords: peanut, transcriptome, database