propr: A Software Package for Identifying Proportionally Abundant Features using Compositional Data Analysis — ASN Events

propr: A Software Package for Identifying Proportionally Abundant Features using Compositional Data Analysis (#150)

Thomas Quinn 1, Mark Richardson 1, David Lovell 2, Tamsyn Crowley 1
  1. Bioinformatics Core Research Facility, Deakin University, Geelong, Victoria, Australia
  2. Queensland University of Technology, Brisbane, Queensland, Australia

Advances in the technology used to assay biological systems have led to a rapid increase in the amount of data generated. However, analytical methods have not kept pace. Analyses of biological data often rely on correlation-based statistics to prioritize a subset of genes for follow-up. Yet correlation is not a valid measure of association for compositional data (i.e., data that carry only relative information). Compositional data include some of the most frequently studied biological data, such as those produced by high-throughput RNA-sequencing, chromatin immunoprecipitation (ChIP), ChIP-sequencing, and Methyl-Capture sequencing. Here, we introduce compositional data analysis, discuss its relevance to biological research, and present a programmatic framework for handling these kinds of data. Specifically, we show how proportionality, an alternative measure of gene dependence, avoids the spurious results that arise when correlation is applied to relative data. We give an overview of our free and open-source software, the propr package for R, which provides a fast and user-friendly interface for calculating proportionality, along with a number of visualization tools for exploring high-dimensional biological data. Finally, we show how this tool could be applied to multi-omic datasets to uncover co-operative relationships within the interactome.
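To make the argument about spurious correlation concrete, the following is a minimal sketch in Python, not the propr implementation and not the package's R interface. It simulates hypothetical absolute abundances for four features, converts each sample to proportions, and compares Pearson correlation with the variance of the log-ratio, the quantity that proportionality measures are built on. The feature names, value ranges, and helper functions are all assumptions made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def log_ratio_variance(xs, ys):
    """Sample variance of log(x / y); it is zero when x and y are proportional."""
    lr = [math.log(x / y) for x, y in zip(xs, ys)]
    m = sum(lr) / len(lr)
    return sum((v - m) ** 2 for v in lr) / (len(lr) - 1)


def close(sample):
    """Rescale one sample so its parts sum to 1 (keeps only relative information)."""
    total = sum(sample)
    return [v / total for v in sample]


def column(rows, j):
    """Extract feature j across all samples."""
    return [row[j] for row in rows]


random.seed(0)  # fixed seed so the simulation is reproducible

# Hypothetical absolute abundances (illustrative values only):
#   a, b : independent features with little variation
#   c    : exactly proportional to a
#   d    : a highly variable feature that dominates the sample total
absolute = []
for _ in range(200):
    a = random.uniform(90, 110)
    b = random.uniform(90, 110)
    c = 2.0 * a
    d = random.uniform(100, 10000)
    absolute.append([a, b, c, d])

# What a sequencing-style assay would report: proportions, not absolute counts.
relative = [close(sample) for sample in absolute]

abs_a, abs_b, abs_c = (column(absolute, j) for j in range(3))
rel_a, rel_b, rel_c = (column(relative, j) for j in range(3))

# Correlation between the independent features a and b, before and after closure.
r_abs = pearson(abs_a, abs_b)
r_rel = pearson(rel_a, rel_b)

# Log-ratio variance is unchanged by closure because the sample total cancels.
vlr_ab_abs = log_ratio_variance(abs_a, abs_b)
vlr_ab_rel = log_ratio_variance(rel_a, rel_b)
vlr_ac_rel = log_ratio_variance(rel_a, rel_c)

print("Pearson r(a, b), absolute data:", round(r_abs, 3))
print("Pearson r(a, b), relative data:", round(r_rel, 3))
print("var log(a/b), absolute vs relative:", vlr_ab_abs, vlr_ab_rel)
print("var log(a/c), relative data:", vlr_ac_rel)
```

In this sketch, the two independent features appear strongly correlated once the data are expressed as proportions, because both are divided by the same fluctuating total. The log-ratio variance is the same whether it is computed from absolute or relative values, and it is zero only for the pair that is truly proportional. Published proportionality metrics rescale this variance in different ways, and the abstract does not say which scaling is used, so none is reproduced here.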

#LorneGenome