Gene expression analysis identifies global gene dosage sensitivity in cancer. Fehrmann RS, Karjalainen JM, Krajewska M, Westra HJ, Maloney D, Simeonov A, Pers TH, Hirschhorn JN, Jansen RC, Schultes EA, van Haagen HH, de Vries EG, Te Meerman GJ, Wijmenga C, van Vugt MA, Franke L. Nature Genetics 2015
Press release 12 January 2015
New method makes larger studies into the origin of cancer possible
Re-using large amounts of biological data makes ‘big data’ analysis of DNA mutations possible
Cancer arises due to mutations in the DNA. But in order to identify these mutations, it is necessary to study the DNA of very many cancer patients. Researchers at the University Medical Centre Groningen (UMCG) have developed a method to do this and have now analysed the data from 16,000 cancer patients. This is one of the largest oncology studies to date, worldwide. They have published their method in the authoritative journal Nature Genetics.
The new method was developed by a team under the leadership of Dr. Lude Franke, a statistical geneticist: “The systematic analysis of DNA from 16,000 tumours is very costly. However, in the past 15 years, many studies have measured gene expression and these measurements are publicly available. We have developed a new statistical method so that we can re-assess this information. By investigating more than 16,000 tumours, we have been able to determine which changes occur in the DNA. We saw that certain mutations in the tumour DNA are very common, while others occur only in specific tumour types, like breast cancer.”
Medical oncologist Dr. Rudolf Fehrmann pointed out that this method makes it possible to look in a fresh light at the gene expression profiles gathered over the past 15 years. “It has enabled us to indicate potential starting points for developing new therapies, for example for a group of cancers that are difficult to treat (a group that has many mutations in the DNA). These are now being investigated with experiments in the laboratory.”
The researchers studied nearly 80,000 expression profiles in developing their method. Such a ‘big data’ approach has only recently become possible, with the arrival of better computers and new mathematical techniques that permit very efficient research to be performed. Large amounts of data, which were gathered for completely different purposes, are now proving useful in furthering insight into how cancer arises. This method makes new studies in this field possible and will greatly reduce their costs.
Many cancer-associated somatic copy number alterations (SCNAs) are known. Currently, one of the challenges is to identify the molecular downstream effects of these variants. Although several SCNAs are known to change gene expression levels, it is not clear whether each individual SCNA affects gene expression. We reanalyzed 77,840 expression profiles and observed a limited set of ‘transcriptional components’ that describe well-known biology, explain the vast majority of variation in gene expression and enable us to predict the biological function of genes. On correcting expression profiles for these components, we observed that the residual expression levels (in ‘functional genomic mRNA’ profiling) correlated strongly with copy number. DNA copy number correlated positively with expression levels for 99% of all abundantly expressed human genes, indicating global gene dosage sensitivity. By applying this method to 16,172 patient-derived tumor samples, we replicated many loci with aberrant copy numbers and identified recurrently disrupted genes in genomically unstable cancers.
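The core idea in the abstract (remove the dominant ‘transcriptional components’ from the expression profiles, then compare what remains with DNA copy number) can be illustrated with a small sketch. This is not the authors' pipeline: it uses plain principal component analysis on synthetic data, and the function names, the number of components removed, and all simulation parameters are assumptions chosen only for illustration.

```python
import numpy as np


def remove_top_components(expr, n_components):
    """Subtract the top principal components from a samples x genes matrix.

    Stand-in for 'correcting expression profiles for transcriptional
    components'; the published method selects components differently.
    """
    centered = expr - expr.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    k = n_components
    low_rank = (u[:, :k] * s[:k]) @ vt[:k]
    return centered - low_rank


def per_gene_correlation(a, b):
    """Pearson correlation between matching columns (genes) of a and b."""
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    denom = np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))
    return (a * b).sum(axis=0) / denom


def simulate(n_samples=300, n_genes=200, n_factors=5, seed=0):
    """Synthetic data (hypothetical): strong shared factors + weak dosage effect."""
    rng = np.random.default_rng(seed)
    # Copy number per sample and gene; 2 is the normal diploid state.
    copy_number = rng.choice(
        [1.0, 2.0, 3.0, 4.0],
        size=(n_samples, n_genes),
        p=[0.15, 0.60, 0.15, 0.10],
    )
    # A few latent factors dominate the variance, as physiology or tissue type might.
    factors = rng.normal(size=(n_samples, n_factors))
    loadings = rng.normal(scale=3.0, size=(n_factors, n_genes))
    noise = rng.normal(scale=0.5, size=(n_samples, n_genes))
    dosage = 0.5 * (copy_number - 2.0)
    expr = factors @ loadings + dosage + noise
    return expr, copy_number


expr, copy_number = simulate()
raw_corr = per_gene_correlation(expr, copy_number)
residual = remove_top_components(expr, n_components=5)
residual_corr = per_gene_correlation(residual, copy_number)

print("mean correlation, raw expression:     ", round(float(raw_corr.mean()), 3))
print("mean correlation, residual expression:", round(float(residual_corr.mean()), 3))
print("fraction of genes with positive residual correlation:",
      round(float((residual_corr > 0).mean()), 3))
```

In this toy setting the dosage effect is largely hidden in the raw profiles because the shared factors dominate the variance; once those factors are removed, the per-gene correlation with copy number becomes much clearer. Real tumour data are considerably messier, so the sketch only conveys the direction of the effect, not its size.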
For Dutch readers: Listen to a radio interview with Lude Franke about his work: RTV Noord (go to 20-1-2015, then to the 12.00 time slot, then to 12.41, where the interview begins)