Research area: genetics

Comprehensive Analysis of Constraint on the Spatial Distribution of Missense Variants in Human Protein Structures

Created on 18th February 2017

Robert Michael Sivley; Jonathan Kropski; Jonathan Sheehan; Joy Cogan; Xiaoyi Dou; Timothy S Blackwell; John A Phillips; Jens Meiler; William S Bush; John A Capra

The spatial distribution of genetic variation within proteins is shaped by evolutionary constraint and thus can provide insights into the functional importance of protein regions and the potential pathogenicity of protein alterations. Here, we comprehensively evaluate the 3D spatial patterns of constraint on human germline and somatic variation in 4,568 solved protein structures. Different classes of coding variants have significantly different spatial distributions. Neutral missense variants exhibit a range of 3D constraint patterns, with a general trend of spatial dispersion driven by constraint on core residues. In contrast, germline and somatic disease-causing variants are significantly more likely to be clustered in protein structure space. We demonstrate that this difference in the spatial distributions of disease-associated and benign germline variants provides a signature for accurately classifying variants of unknown significance (VUS) that is complementary to current approaches for VUS classification. We further illustrate the clinical utility of our approach by classifying new mutations identified from patients with familial idiopathic pneumonia (FIP) that segregate with disease.
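
To make the contrast between clustered and dispersed variants concrete, the sketch below runs a simple permutation test on a toy structure. It is an illustration only: the abstract does not state which spatial statistic the authors use, so mean pairwise distance is an assumed stand-in, and the function names (`mean_pairwise_distance`, `clustering_p_value`), the straight-line toy backbone, and the 3.8 Å spacing are all hypothetical choices made for this example.

```python
import itertools
import math
import random


def mean_pairwise_distance(coords):
    """Mean Euclidean distance over all unordered pairs of 3D points."""
    pairs = list(itertools.combinations(coords, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def clustering_p_value(residue_coords, variant_idx, n_perm=2000, seed=0):
    """One-sided permutation test for spatial clustering.

    Compares the mean pairwise distance of the variant residues with that of
    equally sized random residue sets drawn from the same structure. A small
    p-value means the variants are more compact than expected by chance.
    (Assumed statistic, not necessarily the one used in the paper.)
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance([residue_coords[i] for i in variant_idx])
    k = len(variant_idx)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(residue_coords, k)
        if mean_pairwise_distance(sample) <= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Toy "structure": 50 residues on a straight line, 3.8 angstroms apart.
toy_coords = [(3.8 * i, 0.0, 0.0) for i in range(50)]

clustered_obs, clustered_p = clustering_p_value(toy_coords, [10, 11, 12, 13])
dispersed_obs, dispersed_p = clustering_p_value(toy_coords, [0, 16, 33, 49])

print(f"clustered set: mean distance {clustered_obs:.2f} A, p = {clustered_p:.4f}")
print(f"dispersed set: mean distance {dispersed_obs:.2f} A, p = {dispersed_p:.4f}")
```

In this toy setting the four adjacent residues give a small p-value (clustered), while the four evenly spread residues give a large one. A real analysis would use residue coordinates taken from a solved structure rather than a line of points, and would need to account for which residues can carry variants at all.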
