Poster abstracts

Poster number 25 submitted by Swapna Vidhur Daulatabad

Lantern: a semi-automated pipeline and repository for annotating lncRNAs with ontologies

Swapna Vidhur Daulatabad (Department of Biohealth Informatics, Indiana University-Purdue University Indianapolis), Sarath Chandra Janga (Department of Biohealth Informatics, Indiana University-Purdue University Indianapolis)

Abstract:
With advancements in omics technologies, the range of biological processes in which long non-coding RNAs (lncRNAs) are known to be involved is expanding rapidly [1, 2]. The accelerating discovery of evidence for the role of lncRNAs in critical biochemical, cellular and physiological processes creates a need for robust lncRNA annotation resources. Although many resources exist for annotating genes, and despite an extensive corpus of lncRNA literature, resources that provide lncRNA ontology annotations are rare. Here, we present a semi-automated pipeline and corresponding lncRNA annotation extractor and repository (Lantern), developed using PubMed’s abstract retrieval engine and NCBO’s recommender annotation system [3]. We retrieved between 1 and 150 abstracts per lncRNA and used them to extract annotations for each ontology by querying NCBO’s recommender system through its application programming interface (API). To evaluate the quality of the annotations in Lantern, we ran the pipeline on 182 lncRNAs from lncRNAdb [4] and compared the extracted annotations against annotations mapped onto lncRNAdb’s manually curated free text. This benchmarking suggested that Lantern has a recall of 0.62 against lncRNAdb for the 182 lncRNAs and a precision of 0.8 based on manual verification of ontology annotations for 50 lncRNAs. Lantern’s web interface currently provides Gene Ontology (GO), Human Phenotype Ontology (HPO) and Human Disease Ontology (DOID) annotations for 182 lncRNAs, along with five annotation evaluation scores from the NCBO recommender system. The extracted annotations for the 182 lncRNAs are available at http://www.iupui.edu/~sysbio/lantern/. Lantern will be expanded to cover more lncRNAs and more ontologies, with the aim of making it a public resource of high-quality controlled annotations for improving the annotation of the noncoding transcriptome.

References:
1. Mohanty, V., et al., Role of lncRNAs in health and disease—size and shape matter. Briefings in Functional Genomics, 2014. 14(2): p. 115-129.
2. Quinn, J.J. and H.Y. Chang, Unique features of long non-coding RNA biogenesis and function. Nature Reviews Genetics, 2016. 17(1): p. 47.
3. Martínez-Romero, M., et al., NCBO Ontology Recommender 2.0: an enhanced approach for biomedical ontology recommendation. Journal of Biomedical Semantics, 2017. 8(1): p. 21.
4. Quek, X.C., et al., lncRNAdb v2.0: expanding the reference database for functional long noncoding RNAs. Nucleic Acids Research, 2014. 43(D1): p. D168-D173.

Keywords: lncRNA, ontology, automated annotation