Molecular genetic strategy for finding de novo mutations causing common disorders such as autism and schizophrenia.
There are several lines of evidence supporting the role of de novo mutations as a mechanism for common disorders such as autism and schizophrenia.
First, the de novo mutation rate in humans is relatively high, so new mutations arise frequently in the population. Nevertheless, de novo mutations have not been reported in most common diseases. Mutations in genes that cause severe diseases under strong negative selection, for example through lethality in embryonic stages or reduced reproductive fitness, will not be transmitted to multiple family members and therefore will not be detected by linkage gene mapping or association studies. The observation of very high concordance in monozygotic twins and very low concordance in dizygotic twins also strongly supports the hypothesis that a significant fraction of cases may result from new mutations. Such is the case for autism and schizophrenia.
Second, despite reduced reproductive fitness¹ and extremely variable environmental factors, the incidence of some diseases is maintained worldwide at a relatively high and constant rate. This is the case for autism and schizophrenia, with an incidence of approximately 1% worldwide. Mutational load can be thought of as a balance between selection for or against a deleterious mutation and its production by de novo mutation. Lower rates of reproduction constitute a negative selection factor that should reduce the number of mutant alleles in the population, ultimately leading to decreased disease prevalence. These selective pressures tend to differ in intensity between environments. Nonetheless, these severe mental disorders have been maintained at a constant, relatively high prevalence worldwide across a wide range of cultures and countries, despite strong negative selection against them². This is not what one would predict for diseases with reduced reproductive fitness unless the new mutation rate were high.
Finally, there is the effect of paternal age: the risk of disease increases significantly with increasing paternal age, which could result from the age-related increase in paternal de novo mutations. This is the case for autism and schizophrenia³. The male-to-female ratio of mutation rate is estimated at about 4–6:1, presumably due to the higher number of germ-cell divisions with age in males. Therefore, one would predict that de novo mutations more frequently come from males, particularly older males⁴.
A high rate of new mutations may in part explain why genetic studies have so far failed to identify many genes predisposing to complex diseases such as autism and schizophrenia, and why diseases have been identified for a mere 3% of genes in the human genome. Identifying de novo mutations as a cause of a disease requires a targeted molecular approach that includes studying parents and affected subjects. The process for determining whether the genetic basis of a disease may result in part from de novo mutations, and the molecular approach to establish this link, are illustrated here using autism and schizophrenia as examples.
1. Selection of disease that may be caused by de novo mutations
A disease that meets the following criteria fits the de novo mutation hypothesis:
Assessing how likely it is that de novo mutations explain part of the genetic basis of a common disease is a critical first step.
2. Selection of cases and DNA samples
Selecting appropriate samples is critical to successfully identifying de novo mutations. To maximize the chance of finding de novo mutations, we recommend the following:
3. Gene resequencing: two major approaches
4. Genomic variant prioritization
Identified variants are then prioritized for follow-up according to their probability of being de novo and deleterious to protein or mRNA function and/or structure. The variant follow-up priorities for detection of de novo variants should be as follows:
If using whole exome sequencing, selection of candidate genes can be used as a strategy for prioritizing variants for further study.
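The prioritization described in this step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the consequence labels, their relative order, and the dictionary keys (`gene`, `consequence`, `in_parents`) are hypothetical and are not the protocol's own priority list. The sketch ranks variants not seen in the parents first, then by an assumed severity order, with an optional candidate-gene filter for exome-wide data.

```python
# Assumed severity order for illustration only (lower number = higher priority).
SEVERITY_RANK = {
    "nonsense": 0,
    "splice_site": 1,
    "missense": 2,
    "synonymous": 3,
}


def prioritize(variants, candidate_genes=None):
    """Return variants sorted so the highest follow-up priority comes first.

    variants: list of dicts with hypothetical keys 'gene', 'consequence',
              and 'in_parents' (True if the variant was seen in either parent).
    candidate_genes: optional set of gene symbols used to pre-filter
                     variants, e.g. when starting from whole exome data.
    """
    if candidate_genes is not None:
        variants = [v for v in variants if v["gene"] in candidate_genes]
    unknown = len(SEVERITY_RANK)  # unrecognized consequences sort last
    return sorted(
        variants,
        # False sorts before True, so variants absent from parents come first.
        key=lambda v: (v["in_parents"],
                       SEVERITY_RANK.get(v["consequence"], unknown)),
    )
```

A caller would pass the annotated variant list for one affected subject and follow up the entries from the top of the returned list.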
5. Genetic validation
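At the core of this step is a trio comparison: a variant counts as a de novo candidate only if it is present in the affected subject and absent from both parents. A minimal Python sketch of that check follows. The function name, the genotype encoding (number of alternate alleles, or `None` when a sample was not genotyped), and the category labels are assumptions made for illustration and are not part of the protocol.

```python
def classify_variant(proband, father, mother):
    """Classify one variant from the genotypes of an affected subject and parents.

    Each argument is the number of alternate alleles carried (0, 1 or 2),
    or None if that sample could not be genotyped at this position.
    """
    if proband is None or father is None or mother is None:
        return "incomplete"          # cannot decide without all three genotypes
    if proband == 0:
        return "absent_in_proband"   # nothing to validate
    if father == 0 and mother == 0:
        return "candidate_de_novo"   # present in child, absent in both parents
    return "inherited"               # at least one parent carries the variant
```

For example, `classify_variant(1, 0, 0)` returns `"candidate_de_novo"`, whereas `classify_variant(1, 1, 0)` returns `"inherited"`. A candidate call from such a check would still need independent confirmation before being reported.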
6. Representative Results:
Following this protocol, we were able to identify new genes for schizophrenia and autism. One example is our recent SHANK3 finding (Figure 2). We identified two different de novo mutations in the SHANK3 gene: one nonsense mutation found in three affected brothers and one missense mutation found in one affected female.
Figure 2. (A) Segregation of the R1117X nonsense mutation in three affected brothers of family PED 419. The proband is indicated by the arrow. (B) Segregation of the R536W missense mutation in the proband but not her non-affected brother in PED 56.
The procedure outlined here aims to identify specific common diseases that likely result, in part, from de novo mutations, and to prove this hypothesis. De novo mutations are a well-established mechanism for the development of a number of diseases, for example the hereditary cancer syndromes, but have been poorly explored in common diseases. This results in part from the technical challenges of identifying de novo mutations, which requires sequencing large amounts of DNA and has only very recently become cost effective with the advent of Next Generation Sequencing. In addition, the de novo mutation rate in humans was, until very recently, only an estimate; reports directly determining it have appeared only very recently. Prior to these measurements, it was difficult to predict the sample size needed for this kind of study and to determine whether an observed de novo mutation rate is greater than the baseline rate.
Sequencing candidate genes versus whole genome?
Since the majority of reported disease mutations are missense, nonsense, or splice site mutations (according to the HGMD web site), our screening strategy would identify over 68% of known mutations. There is also a clear relationship between the severity of an amino acid replacement and the likelihood of a clinical phenotype. Compared with a conservative amino acid substitution, a nonsense change is 9.0 times more likely to present clinically⁷. Thus, at this time, sequencing candidate genes is the most cost-effective strategy.
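The comparison of an observed de novo count with the baseline rate, mentioned above, can be illustrated with a simple Poisson calculation. The sketch below is not the authors' analysis: the function names are hypothetical, the default rate of 1e-8 mutations per base per generation is an assumed round figure that should be replaced by a current direct estimate, and point mutations are treated as independent rare events across the two copies of each screened base.

```python
import math


def expected_de_novo(n_subjects, bases_per_subject, rate_per_base=1e-8):
    """Expected number of de novo point mutations across the cohort.

    bases_per_subject is the haploid length screened per subject; the factor
    of 2 accounts for the maternal and paternal copies.
    rate_per_base is an ASSUMED per-base, per-generation mutation rate.
    """
    return 2 * n_subjects * bases_per_subject * rate_per_base


def poisson_upper_tail(k_observed, mean):
    """P(X >= k_observed) for X ~ Poisson(mean)."""
    if k_observed <= 0:
        return 1.0
    term = math.exp(-mean)      # P(X = 0)
    cdf = term
    for i in range(1, k_observed):
        term *= mean / i        # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)
```

With these functions, one would compute the expected count for the number of subjects and bases actually screened, then ask how probable the observed number of validated de novo mutations (or more) would be under that baseline. A small tail probability suggests an excess over the baseline rate, within the limits of the assumptions above.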
The success of the outlined procedure depends on several critical steps, which are outlined in detail and illustrated using two examples, autism and schizophrenia. The main pitfalls to avoid concern which disease to select, which patients to screen, the source of DNA, and how to efficiently identify the de novo mutations. We provide a method for efficiently determining the fraction of cases of any disease that results from such spontaneous mutations.
The authors have nothing to disclose.
We thank our funding sources, Genome Canada, Génome Québec, and Université de Montréal, as well as the Canadian Foundation for Innovation, for funding our ‘Synapse to Disease’ (S2D) project.