Transcriptomic Toolkit for Analysis of Second Generation Sequencing Data in the Cloud — ASN Events

Transcriptomic Toolkit for Analysis of Second Generation Sequencing Data in the Cloud (#270)

James R Torpy 1, Nenad Bartonicek 1, Melanie L Lehman 2, Marcel Dinger 1
  1. Kinghorn Centre for Clinical Genomics, Sydney, NSW, Australia
  2. Cancer Genomics and Genetics, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia

Bioinformatic analysis of sequencing data faces ever-growing compute and storage requirements. The field of transcriptomics is under additional pressure because there is no consensus on standard workflows, and many of the available tools are platform-specific or insufficiently optimised for performance. To mitigate these issues we have employed the DNAnexus platform, a cloud-based system that uses Amazon computing power. DNAnexus provides access to data and analyses through both a graphical user interface and the command line, making it usable by people with a wide range of expertise. Here we present a set of benchmarked and standardised workflows for RNA-seq that will be made available for public use through DNAnexus. The workflows are built by chaining the inputs and outputs of multiple benchmarked tools and cover quality control, genome mapping, transcript quantification and de novo assembly. Additional workflows for long-read sequencing and single-cell analysis provide a platform for a one-stop-shop approach to bioinformatics analyses in the cloud.

#LorneGenome