Development of a whole genome sequencing based classifier to determine telomere maintenance mechanism in tumours — ASN Events

Development of a whole genome sequencing based classifier to determine telomere maintenance mechanism in tumours (#263)

Michael Lee 1, Rebecca A Dagg 2, Loretta M Lau 2, Nic Waddell 3, John V Pearson 4, Sean M Grimmond 5, Roger R Reddel 6, Jonathan W Arthur 7, Hilda A Pickett 1
  1. Telomere Length Regulation Unit, Children's Medical Research Institute, University of Sydney, Westmead, NSW, Australia
  2. Children's Cancer Research Unit, The Children's Hospital at Westmead, University of Sydney, Westmead, NSW, Australia
  3. Medical Genomics Group, QIMR Berghofer Medical Research Institute, Herston, Queensland, Australia
  4. Genome Informatics Group, QIMR Berghofer Medical Research Institute, Herston, Queensland, Australia
  5. Centre for Cancer Research, University of Melbourne, Melbourne, Victoria, Australia
  6. Cancer Research Unit, Children's Medical Research Institute, University of Sydney, Westmead, NSW, Australia
  7. Bioinformatics Unit, Children's Medical Research Institute, University of Sydney, Westmead, NSW, Australia

Telomeres are repetitive DNA sequences at the ends of chromosomes, and are considered to consist almost exclusively of the hexameric sequence TTAGGG. Cancer cells must employ a telomere maintenance mechanism to attain proliferative immortality, and do so either by activating the enzyme telomerase or through the Alternative Lengthening of Telomeres (ALT) pathway. We have identified elevated levels of telomere variant repeat sequences in the telomeres of human cancers that use the ALT pathway. We aimed to use these differences in telomere sequence content to develop a classifier capable of distinguishing between ALT and telomerase-positive tumours. Analysis of telomeric reads extracted from whole genome sequencing (WGS) datasets revealed that telomeric sequences are subject to substantial sequencing error bias due to their repetitive nature. We developed a synthetic telomere substrate to measure the extent of sequencing error, and devised an approach to minimise this bias so that telomeric variant repeats could be quantitated accurately. We applied this approach to WGS data from a panel of ALT and telomerase-positive tumours along with matched normal tissue. Telomere reads were extracted and filtered, and the number of variant repeats was quantitated. Statistical analysis was performed, and the most statistically significant variant repeats were selected to generate a random forest classifier. We identified a subset of variant telomeric repeats that classifies ALT and telomerase-positive tumours with 90% accuracy. We also found that including the telomeric data extracted from the matched normal samples did not improve the classifier, suggesting that sequencing the tumour DNA alone is sufficient to identify the telomere maintenance mechanism. We propose that telomeric variant repeat profiles extracted from WGS data can be used to determine the telomere maintenance mechanism of a tumour.
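
As a rough illustration of the read extraction and variant repeat counting step described above, the following Python sketch tallies non-canonical hexamers in telomeric reads. It is not the authors' pipeline: the function names, the rule that a read is telomeric if it contains at least four consecutive TTAGGG repeats, and the in-frame hexamer scan are all assumptions made for this example. It also applies no sequencing error correction, and an in-frame scan only captures substitution-type variants, since an insertion or deletion would shift the frame.

```python
from collections import Counter

CANONICAL = "TTAGGG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (N is left as N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient_to_g_strand(read, min_repeats=4):
    """Return the read on the G-rich strand if it looks telomeric, else None.

    Assumption for this sketch: a read counts as telomeric when it contains
    at least `min_repeats` consecutive canonical repeats on either strand.
    """
    read = read.upper()
    seed = CANONICAL * min_repeats
    if seed in read:
        return read
    rc = reverse_complement(read)
    if seed in rc:
        return rc
    return None


def count_variant_repeats(reads, min_repeats=4):
    """Count non-canonical hexamers, in frame with the canonical run."""
    counts = Counter()
    seed = CANONICAL * min_repeats
    for read in reads:
        oriented = orient_to_g_strand(read, min_repeats)
        if oriented is None:
            continue  # not telomeric under the assumed rule
        # The first canonical run fixes the hexamer reading frame.
        frame = oriented.find(seed) % 6
        for i in range(frame, len(oriented) - 5, 6):
            hexamer = oriented[i:i + 6]
            if hexamer != CANONICAL and "N" not in hexamer:
                counts[hexamer] += 1
    return counts
```

Per-sample counts of this kind, normalised by the number of telomeric reads, could then serve as the feature vector for a classifier such as a random forest.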

#LorneGenome