Poster abstracts

Poster number 6 submitted by Sri Appasamy

Improved Non-Redundant Datasets for RNA 3D Structure Annotation and Data Integration Pipeline to Support RNA Science

Sri D. Appasamy (Department of Biological Sciences, Bowling Green State University), Blake Sweeney (Department of Biological Sciences, Bowling Green State University), James Roll, Craig L. Zirbel (Department of Mathematics and Statistics, Bowling Green State University), Jamie J. Cannone (Center for Computational Biology and Bioinformatics, University of Texas at Austin), Poorna Roy, Maryam Hosseini, Neocles B. Leontis (Department of Chemistry, Bowling Green State University)

Abstract:
The number and size of RNA 3D structures deposited in the Protein Data Bank (PDB) continue to grow, driven by advances in RNA structure determination methods and by interest in the functional diversity of RNA molecules. This growth has created a pressing need for automated methods to annotate, compare, and analyze these structures. We have developed and maintain an RNA 3D structure annotation pipeline in collaboration with the Nucleic Acid Database (NDB), and recently overhauled it to take advantage of the new mmCIF structure format for large structures. All output is available on our website, rna.bgsu.edu. One can now view annotations of basepairs and other pairwise interactions in all RNA-containing 3D structures, updated each week. The pipeline now groups RNA structures by molecule, sequence, and geometry at the level of RNA chains to produce “equivalence classes.” We have improved the methodology for selecting a representative from each equivalence class to produce non-redundant (NR) lists of 3D structures. The NR lists are the building blocks of the RNA 3D Motif Atlas, an online database of automatically classified internal and hairpin loop RNA 3D motifs. Work is in progress to integrate new visualization components and RSR values from the PDB to facilitate analysis and to evaluate the reliability of the annotations in this database. In addition, we have developed several web-based tools for identifying RNA 3D motifs in sequences and studying their conservation. A new server called JAR3D allows users to match the sequences of RNA internal loops and hairpin loops to motif groups in the Motif Atlas, providing a mechanism for identifying RNA 3D motifs in novel RNA sequences, even in the absence of exact sequence matches to known RNA motifs. Another sequence-oriented tool, R3D-2-MSA, allows the user to look up known sequence variants of nucleotides at given positions in RNA 3D structures.
Collectively, these tools serve as valuable resources for investigating diverse questions regarding the relationship between RNA 3D structure and sequence.

Keywords: RNA 3D motifs, RNA 3D structure annotation pipeline, non-redundant (NR) list of 3D structures