Estimating the Selective Effect of Heterozygous Protein Truncating Variants from Human Exome Data

Created on 16th September 2016

Christopher A. Cassa; Donate Weghorn; Daniel J. Balick; Daniel M. Jordan; David Nusinow; Kaitlin E. Samocha; Anne O'Donnell Luria; Daniel G. MacArthur; Mark J. Daly; David R. Beier; Shamil R. Sunyaev

The dispensability of individual genes for viability has interested generations of geneticists. For some genes it is essential to maintain two functional chromosomal copies, while other genes may tolerate the loss of one or both copies. Exome sequence data from 60,706 individuals provide sufficient observations of rare protein truncating variants (PTVs) to make genome-wide estimates of selection against heterozygous loss of gene function. The cumulative frequency of rare deleterious PTVs is primarily determined by the balance between incoming mutations and purifying selection rather than genetic drift. This enables the estimation of the genome-wide distribution of selection coefficients for heterozygous PTVs and corresponding Bayesian estimates for individual genes. The strength of selection can help discriminate the severity, age of onset, and mode of inheritance in Mendelian exome sequencing cases. We find that genes under the strongest selection are enriched in embryonic lethal mouse knockouts, putatively cell-essential genes inferred from human tumor cells, Mendelian disease genes, and regulators of transcription. Using an essentiality screen, we find a large set of genes under strong selection that are likely to have critical function but that have not yet been studied extensively.
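
The mutation-selection balance argument above can be illustrated with a minimal sketch. It assumes the deterministic approximation that the cumulative PTV allele frequency x of a gene is roughly U / s_het, where U is the per-gene PTV mutation rate and s_het is the selection coefficient against heterozygotes. The Poisson likelihood for the observed PTV allele count and the log-uniform prior on s_het are assumptions chosen for illustration, not necessarily the model used by the authors, and all function and parameter names are hypothetical.

```python
import math


def s_het_point_estimate(ptv_allele_count, n_chromosomes, mutation_rate):
    """Point estimate from deterministic balance: x ~ U / s_het, so s_het ~ U / x."""
    if ptv_allele_count <= 0:
        raise ValueError("point estimate needs at least one observed PTV allele")
    frequency = ptv_allele_count / n_chromosomes
    # A selection coefficient cannot exceed 1, so cap the estimate there.
    return min(1.0, mutation_rate / frequency)


def s_het_posterior_mean(ptv_allele_count, n_chromosomes, mutation_rate,
                         s_min=1e-4, s_max=1.0, grid_size=2000):
    """Posterior mean of s_het on a log-spaced grid.

    Assumed model (illustrative only):
      count ~ Poisson(n_chromosomes * mutation_rate / s_het)
      prior on s_het is log-uniform over [s_min, s_max]
    """
    log_min, log_max = math.log(s_min), math.log(s_max)
    step = (log_max - log_min) / (grid_size - 1)
    # Equal weight per log-spaced grid point corresponds to a log-uniform prior.
    grid = [math.exp(log_min + i * step) for i in range(grid_size)]

    log_weights = []
    for s in grid:
        expected_count = n_chromosomes * mutation_rate / s
        log_weights.append(
            ptv_allele_count * math.log(expected_count)
            - expected_count
            - math.lgamma(ptv_allele_count + 1)
        )

    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_weights)
    weights = [math.exp(v - peak) for v in log_weights]
    total = sum(weights)
    return sum(s * w for s, w in zip(grid, weights)) / total


if __name__ == "__main__":
    n_chromosomes = 2 * 60706   # two chromosomes per sequenced individual
    mutation_rate = 1e-5        # hypothetical per-gene PTV mutation rate
    print(s_het_point_estimate(10, n_chromosomes, mutation_rate))
    print(s_het_posterior_mean(10, n_chromosomes, mutation_rate))
    print(s_het_posterior_mean(0, n_chromosomes, mutation_rate))
```

The grid posterior remains defined when no PTVs are observed, a case in which the simple ratio estimate cannot be computed; a gene with fewer observed PTVs than expected from its mutation rate receives a higher estimated s_het.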
