About Intensification
Intensification is a database of results for 12 repeat protein domains, obtained by amplifying population-genetic signal through a motif-based multiple sequence alignment (motif-MSA). Because protein-coding regions are typically under strong selective constraint, variants within them occur at low frequencies, so there are often too few observations for statistically meaningful downstream calculations. We use the modular structure of repeat motifs to amplify signals of selection from both population genetics and traditional inter-species conservation. For each repeat protein domain, we construct a motif-MSA and then accumulate single nucleotide variants (SNVs) across the human population based on the genomic coordinate system of the motif-MSA. This allows us to integrate the corresponding SNV population-genetic profiles, including enrichment of rare variants, the non-synonymous-to-synonymous ratio and delta derived allele frequencies (delta DAFs), with the amino acid variation across the motif-MSA.
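As a rough illustration of the accumulation step, the sketch below pools SNVs from several repeats into the columns of a toy motif-MSA. It is not the pipeline used to build this resource. The data layout, the names `repeats`, `snvs` and `accumulate_by_column`, and the simplification of one genomic position per motif column are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy motif-MSA: one row per repeat. Entry i is the genomic
# position aligned to motif column i, or None where the repeat has a gap.
# (Assumption: one position per column; real columns would span codons.)
repeats = {
    "repeat_1": [100, 103, 106, None],
    "repeat_2": [200, 203, None, 206],
    "repeat_3": [300, 303, 306, 309],
}

# Hypothetical SNVs: (genomic position, effect, is_rare).
snvs = [
    (100, "nonsynonymous", True),
    (200, "synonymous", False),
    (300, "nonsynonymous", True),
    (103, "synonymous", True),
    (306, "nonsynonymous", False),
    (309, "synonymous", False),
    (999, "nonsynonymous", True),  # falls outside every repeat
]


def accumulate_by_column(repeats, snvs):
    """Pool SNV counts from all repeats into motif-MSA columns."""
    # Invert the alignment: genomic position -> motif column index.
    pos_to_col = {}
    for columns in repeats.values():
        for col, pos in enumerate(columns):
            if pos is not None:
                pos_to_col[pos] = col

    counts = defaultdict(
        lambda: {"nonsynonymous": 0, "synonymous": 0, "rare": 0, "total": 0}
    )
    for pos, effect, is_rare in snvs:
        col = pos_to_col.get(pos)
        if col is None:
            continue  # SNV does not map to any motif column
        counts[col][effect] += 1
        counts[col]["total"] += 1
        counts[col]["rare"] += int(is_rare)
    return dict(counts)


profile = accumulate_by_column(repeats, snvs)
for col in sorted(profile):
    print(col, profile[col])
```

In this toy input each repeat carries only one or two SNVs, but motif column 0 collects three once the repeats are stacked. This pooling is what makes per-column quantities, such as the fraction of rare variants or the non-synonymous-to-synonymous ratio, estimable.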
'Query' The query page lets the user explore our resource by entering a range of genomic positions or a PDB ID, or by selecting one of the motifs from a pull-down menu.
'Download' This section contains the zip files for, and brief descriptions of, all the repeat domains used in this resource. We include flat files for the various SNV profiles (e.g. population allele frequencies, as well as SIFT and Condel output) and the sequence logos of the motif-MSAs.
For more details, please refer to the 'Documentation' section.