Tissue-spEcific mrNa iSoform functIOnal Networks (TENSION) Scripts

posted on 09.01.2019 by Julie Dickerson, Gaurav Kandoi
This folder contains all the scripts used to build Tissue-spEcific mrNa iSoform functIOnal Networks (TENSION). The README file describes the purpose of every script. The scripts are documented in detail within the actual files.

Alternative splicing produces multiple mRNA isoforms of a gene, and these isoforms play diverse, important roles in the regulation of gene expression, human heritable diseases, and the response to environmental stresses. However, very little has been done to assign functions at the mRNA isoform level. Functional networks, in which each interaction is quantified by the probability that both partners are involved in the same biological process, are typically generated at the gene level. We use a diverse array of tissue-specific RNA-seq datasets and sequence information to train random forest models that predict the functional networks, following a leave-one-tissue-out strategy. Since there is no mRNA isoform-level gold standard, we treat single-isoform genes co-annotated to Gene Ontology biological process annotations, Kyoto Encyclopedia of Genes and Genomes pathways, BioCyc pathways and protein-protein interactions as functionally related (positive pairs). To generate the non-functional pairs (negative pairs), we use the Gene Ontology annotations tagged with the “NOT” qualifier. We describe 17 Tissue-spEcific mrNa iSoform functIOnal Networks (TENSION) in addition to an organism-level reference functional network for mouse. We validate our predictions by comparing their performance with previous methods, randomized positive and negative class labels, and updated Gene Ontology annotations, and by literature evidence.
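
As a rough illustration of the leave-one-tissue-out strategy mentioned above, the sketch below builds one train/test split per tissue, holding that tissue out for testing and training on all the others. It is not taken from the TENSION scripts: the function name `leave_one_tissue_out`, the `(tissue, features, label)` tuple layout, and the toy values are assumptions made for this example only. Model training itself is omitted.

```python
from collections import defaultdict


def leave_one_tissue_out(samples):
    """Yield (held_out_tissue, train_indices, test_indices), one fold per tissue.

    `samples` is a list of (tissue, features, label) tuples. This layout is
    hypothetical and chosen only to keep the example self-contained.
    """
    # Group sample indices by the tissue they come from.
    by_tissue = defaultdict(list)
    for idx, (tissue, _features, _label) in enumerate(samples):
        by_tissue[tissue].append(idx)

    tissues = sorted(by_tissue)
    for held_out in tissues:
        # Test on the held-out tissue, train on every other tissue.
        test_idx = list(by_tissue[held_out])
        train_idx = [i for t in tissues if t != held_out for i in by_tissue[t]]
        yield held_out, train_idx, test_idx


# Toy isoform-pair samples (made-up feature values; label 1 = positive pair).
samples = [
    ("heart", [0.9, 0.1], 1),
    ("heart", [0.2, 0.7], 0),
    ("liver", [0.8, 0.3], 1),
    ("brain", [0.1, 0.6], 0),
]

folds = list(leave_one_tissue_out(samples))
for held_out, train_idx, test_idx in folds:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

Each fold would then be used to fit a model on the training indices and evaluate it on the held-out tissue. If scikit-learn is available, its `LeaveOneGroupOut` splitter produces the same kind of grouping when the tissue labels are passed as `groups`.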

Version 2: the framework was improved, resulting in better performance and new datasets.


This material is based upon work supported by the National Science Foundation under Grant IOS-1062546. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562. This work used the XSEDE Comet cluster at San Diego Supercomputer Center (SDSC) through allocation TG-BIO170049.

ABI Innovation: Model-based Alternative Splicing Analysis Across Expression Platforms

Directorate for Biological Sciences


XSEDE 2.0: Integrating, Enabling and Enhancing National Cyberinfrastructure with Expanding Community Involvement

Directorate for Computer & Information Science & Engineering
