FAVOR
Ten functional categories consisting of 106 annotation fields from the FAVOR database have been selected and organized into GTB format. The categories are as follows: Integrative Score (25 fields), Protein Function (7 fields), Conservation (10 fields), Epigenetics (15 fields), Transcription Factors (2 fields), Chromatin States (25 fields), Local Nucleotide Diversity (3 fields), Mutation Density (9 fields), Mappability (8 fields), and Proximity Table (2 fields). The categories Variant Category, ClinVar, and Frequencies documented in the original FAVOR have not been included, as similar fields are provided in the latest version by KGGA. The field descriptions are sourced from the FAVOR official annotations, except for those marked with an asterisk (*).
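As a quick consistency check, the category breakdown above can be written down as a mapping and summed. This is only an illustrative sketch: the dictionary name `CATEGORY_FIELD_COUNTS` and the helper `total_fields` are hypothetical and not part of KGGA or FAVOR; the keys follow the category labels used in the table below.

```python
# Field counts per category, copied from the paragraph above.
# The variable and function names are illustrative only.
CATEGORY_FIELD_COUNTS = {
    "IntegrativeScore": 25,
    "ProteinFunction": 7,
    "Conservation": 10,
    "Epigenetics": 15,
    "TranscriptionFactors": 2,
    "ChromatinStates": 25,
    "LocalNucleotideDiversity": 3,
    "MutationDensity": 9,
    "Mappability": 8,
    "ProximityTable": 2,
}


def total_fields(counts):
    """Sum the per-category field counts."""
    return sum(counts.values())


if __name__ == "__main__":
    print(len(CATEGORY_FIELD_COUNTS), "categories,",
          total_fields(CATEGORY_FIELD_COUNTS), "fields")
```
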
Category | Field | Description | Type |
---|---|---|---|
IntegrativeScore | apc_protein_function | Protein function annotation PC: the first PC of the standardized scores of "SIFTval, PolyPhenVal, Grantham, Polyphen2_HDIV_score, Polyphen2_HVAR_score, MutationTaster_score, MutationAssessor_score" in PHRED scale. Range: [2.970, 97.690]. | Float |
IntegrativeScore | apc_protein_function_v2 | *Protein function annotation PC: the second PC of the standardized scores of "SIFTval, PolyPhenVal, Grantham, Polyphen2_HDIV_score, Polyphen2_HVAR_score, MutationTaster_score, MutationAssessor_score" in PHRED scale. | Float |
IntegrativeScore | apc_protein_function_v3 | *Protein function annotation PC: the third PC of the standardized scores of "SIFTval, PolyPhenVal, Grantham, Polyphen2_HDIV_score, Polyphen2_HVAR_score, MutationTaster_score, MutationAssessor_score" in PHRED scale. | Float |
IntegrativeScore | apc_conservation | Conservation annotation PC: the first PC of the standardized scores of "GerpN, GerpS, priPhCons, mamPhCons, verPhCons, priPhyloP, mamPhyloP, verPhyloP" in PHRED scale. Range: [1.478E-09, 99.451]. | Float |
IntegrativeScore | apc_conservation_v2 | *Conservation annotation PC: the second PC of the standardized scores of "GerpN, GerpS, priPhCons, mamPhCons, verPhCons, priPhyloP, mamPhyloP, verPhyloP" in PHRED scale. | Float |
IntegrativeScore | apc_epigenetics | *Epigenetic annotation PC | Float |
IntegrativeScore | apc_epigenetics_active | Active Epigenetic annotation PC: the first PC of the standardized scores of “EncodeH3K4me1.max, EncodeH3K4me2.max, EncodeH3K4me3.max, EncodeH3K9ac.max, EncodeH3K27ac.max, EncodeH4K20me1.max, EncodeH2AFZ.max” in PHRED scale. Range: [0, 99.451]. | Float |
IntegrativeScore | apc_epigenetics_repressed | Repressed Epigenetic annotation PC: the first PC of the standardized scores of “EncodeH3K9me3.max, EncodeH3K27me3.max” in PHRED scale. Range: [0, 99.451]. | Float |
IntegrativeScore | apc_epigenetics_transcription | Transcription Epigenetic annotation PC: the first PC of the standardized scores of “EncodeH3K36me3.max, EncodeH3K79me2.max” in PHRED scale. Range: [0, 99.451]. | Float |
IntegrativeScore | apc_local_nucleotide_diversity | Local nucleotide diversity annotation PC: the first PC of the standardized scores of "bStatistic, RecombinationRate, NuclearDiversity" in PHRED scale. Range: [0, 99.451]. | Float |
IntegrativeScore | apc_local_nucleotide_diversity_v2 | *Local nucleotide diversity annotation PC: the second PC of the standardized scores of "bStatistic, RecombinationRate, NuclearDiversity" in PHRED scale. | Float |
IntegrativeScore | apc_local_nucleotide_diversity_v3 | *Local nucleotide diversity annotation PC: the third PC of the standardized scores of "bStatistic, RecombinationRate, NuclearDiversity" in PHRED scale. | Float |
IntegrativeScore | apc_mutation_density | Mutation density annotation PC: the first PC of the standardized scores of "Common100bp, Rare100bp, Sngl100bp, Common1000bp, Rare1000bp, Sngl1000bp, Common10000bp, Rare10000bp, Sngl10000bp" in PHRED scale. Range: [0, 99.451]. | Float |
IntegrativeScore | apc_transcription_factor | Transcription factor annotation PC: the first PC of the standardized scores of "RemapOverlapTF, RemapOverlapCL" in PHRED scale. Range: [1.185, 99.451]. | Float |
IntegrativeScore | apc_mappability | Mappability annotation PC: the first PC of the standardized scores of "umap_k100, bismap_k100, umap_k50, bismap_k50, umap_k36, bismap_k36, umap_k24, bismap_k24" in PHRED scale. Range: [0.185, 99.451]. | Float |
IntegrativeScore | apc_micro_rna | *Micro RNA annotation PC | Float |
IntegrativeScore | apc_proximity_to_coding | *Proximity to coding annotation PC: the first PC | Float |
IntegrativeScore | apc_proximity_to_coding_v2 | *Proximity to coding annotation PC: the second PC | Float |
IntegrativeScore | apc_proximity_to_tsstes | Proximity to TSS (Transcription Starting Site) and TES (Transcription Ending Site) annotation PC: the first PC of "minDistTSS, minDistTSE" in PHRED scale. Range: [0, 99.451]. | Float |
IntegrativeScore | cadd_rawscore | The CADD raw score (integrative score). A higher CADD score indicates more deleterious. Range: [-237.102, 22.763]. | Float |
IntegrativeScore | cadd_phred | The CADD score in PHRED scale (integrative score). A higher CADD score indicates more deleterious. Range: [0, 99]. | Float |
IntegrativeScore | linsight | The LINSIGHT score (integrative score). A higher LINSIGHT score indicates more functionality. Range: [0.215, 0.995]. | Float |
IntegrativeScore | fathmm_xf | The FATHMM-XF score (integrative score). A higher FATHMM-XF score indicates more functionality. Range: [0.405, 99.451]. | Float |
IntegrativeScore | funseq_value | A flexible framework to prioritize regulatory mutations from cancer genome sequencing (integrative score). | Float |
IntegrativeScore | aloft_value | ALoFT provides extensive annotations to putative loss-of-function variants (LoF) in protein-coding genes including functional, evolutionary and network features (integrative score). | Float |
ProteinFunction | polyphen_val | PolyPhen score: It predicts the functional significance of an allele replacement from its individual features. Range: [0, 1] (default: 0). | Float |
ProteinFunction | polyphen2_hdiv_score | Predicts possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations. HumDiv is Mendelian disease variants vs. divergence from close mammalian homologs of human proteins (>=95% sequence identity). Range: [0, 1] (default: 0). | Float |
ProteinFunction | polyphen2_hvar_score | Predicts possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations. HumVar is all human variants associated with some disease (except cancer mutations) or loss of activity/function vs. common (minor allele frequency >1%) human polymorphisms with no reported association with a disease or other effect. Range: [0, 1] (default: 0). | Float |
ProteinFunction | grantham | Grantham score: oAA, nAA. It attempts to predict the distance between two amino acids, in an evolutionary sense. A lower Grantham score reflects less evolutionary distance. A higher Grantham score reflects a greater evolutionary distance, and is considered more deleterious. Range: [0, 215] (default: 0). | Float |
ProteinFunction | mutation_taster_score | MutationTaster is a free web-based application to evaluate DNA sequence variants for their disease-causing potential. The software performs a battery of in silico tests to estimate the impact of the variant on the gene product/protein. Range: [0, 1] (default: 0). | Float |
ProteinFunction | mutation_assessor_score | Predicts the functional impact of amino-acid substitutions in proteins, such as mutations discovered in cancer or missense polymorphisms. Range: [-5.135, 6.490] (default: -5.545). | Float |
ProteinFunction | sift_val | SIFT score, ranges from 0.0 (deleterious) to 1.0 (tolerated). Range: [0, 1] (default: 1). | Float |
Conservation | priPhCons | Primate phastCons conservation score (excl. human). A higher score means the region is more conserved. PhastCons considers n species rather than two. It takes into account the phylogeny by which these species are related and, instead of measuring similarity/divergence simply in terms of percent identity, uses statistical models of nucleotide substitution that allow for multiple substitutions per site and for unequal rates of substitution between different pairs of bases. Range: [0, 0.999] (default: 0.0). | Float |
Conservation | mamPhCons | Mammalian phastCons conservation score (excl. human). A higher score means the region is more conserved. PhastCons considers n species rather than two. It takes into account the phylogeny by which these species are related and, instead of measuring similarity/divergence simply in terms of percent identity, uses statistical models of nucleotide substitution that allow for multiple substitutions per site and for unequal rates of substitution between different pairs of bases. Range: [0, 1] (default: 0.0). | Float |
Conservation | verPhCons | Vertebrate phastCons conservation score (excl. human). A higher score means the region is more conserved. PhastCons considers n species rather than two. It takes into account the phylogeny by which these species are related and, instead of measuring similarity/divergence simply in terms of percent identity, uses statistical models of nucleotide substitution that allow for multiple substitutions per site and for unequal rates of substitution between different pairs of bases. Range: [0, 1] (default: 0.0). | Float |
Conservation | priPhyloP | Primate phyloP score (excl. human). A higher score means the region is more conserved. PhyloP scores measure evolutionary conservation at individual alignment sites. The scores are calculated by comparing with the evolution expected under neutral drift. Positive scores: measure conservation, i.e., slower evolution than expected, at sites that are predicted to be conserved. Negative scores: measure acceleration, i.e., faster evolution than expected, at sites that are predicted to be fast-evolving. Range: [-10.761, 0.595] (default: -0.029). | Float |
Conservation | mamPhyloP | Mammalian phyloP score (excl. human). A higher score means the region is more conserved. PhyloP scores measure evolutionary conservation at individual alignment sites. The scores are calculated by comparing with the evolution expected under neutral drift. Positive scores: measure conservation, i.e., slower evolution than expected, at sites that are predicted to be conserved. Negative scores: measure acceleration, i.e., faster evolution than expected, at sites that are predicted to be fast-evolving. Range: [-20, 4.494] (default: -0.005). | Float |
Conservation | verPhyloP | Vertebrate phyloP score (excl. human). A higher score means the region is more conserved. PhyloP scores measure evolutionary conservation at individual alignment sites. The scores are calculated by comparing with the evolution expected under neutral drift. Positive scores: measure conservation, i.e., slower evolution than expected, at sites that are predicted to be conserved. Negative scores: measure acceleration, i.e., faster evolution than expected, at sites that are predicted to be fast-evolving. Range: [-20, 11.295] (default: 0.042). | Float |
Conservation | GerpN | Neutral evolution score defined by GERP++. A higher score means the region is more conserved. Range: [0, 19.8] (default: 3.0). | Float |
Conservation | GerpS | Rejected Substitution score defined by GERP++. A higher score means the region is more conserved. GERP (Genomic Evolutionary Rate Profiling) identifies constrained elements in multiple alignments by quantifying substitution deficits. These deficits represent substitutions that would have occurred if the element were neutral DNA, but did not occur because the element has been under functional constraint. These deficits are referred to as "Rejected Substitutions". Rejected substitutions are a natural measure of constraint that reflects the strength of past purifying selection on the element. GERP estimates constraint for each alignment column; elements are identified as excess aggregations of constrained columns. Positive scores (fewer than expected) indicate that a site is under evolutionary constraint. Negative scores may be weak evidence of accelerated rates of evolution. Range: [-39.5, 19.8] (default: -0.2). | Float |
Conservation | GerpRS | *Gerp element score | Float |
Conservation | GerpRSpVal | *Gerp element p-Value | Float |
Epigenetics | encode_dnase_sum | Maximum Encode DNase-seq level over 12 cell lines. Range: [0, 118672] (default: 0.0). | Float |
Epigenetics | encodeh2afz_sum | Maximum Encode H2AFZ level over 13 cell lines. Range: [0.020, 468.98] (default: 0.42). | Float |
Epigenetics | encodeh3k27ac_sum | Maximum Encode H3K27ac level over 14 cell lines. Range: [0.010, 1442.690] (default: 0.36). | Float |
Epigenetics | encodeh3k27me3_sum | Maximum Encode H3K27me3 level over 14 cell lines. Range: [0.010, 193.38] (default: 0.47). | Float |
Epigenetics | encodeh3k36me3_sum | Maximum Encode H3K36me3 level over 10 cell lines. Range: [0.020, 246.88] (default: 0.39). | Float |
Epigenetics | encodeh3k4me1_sum | Maximum Encode H3K4me1 level over 13 cell lines. Range: [0.010, 227.81] (default: 0.37). | Float |
Epigenetics | encodeh3k4me2_sum | Maximum Encode H3K4me2 level over 14 cell lines. Range: [0.010, 774.99] (default: 0.37). | Float |
Epigenetics | encodeh3k4me3_sum | Maximum Encode H3K4me3 level over 14 cell lines. Range: [0.010, 1093.75] (default: 0.38). | Float |
Epigenetics | encodeh3k79me2_sum | Maximum Encode H3K79me2 level over 13 cell lines. Range: [0.020, 553.06] (default: 0.34). | Float |
Epigenetics | encodeh3k9ac_sum | Maximum Encode H3K9ac level over 13 cell lines. Range: [0.010, 1340.42] (default: 0.41). | Float |
Epigenetics | encodeh3k9me3_sum | Maximum Encode H3K9me3 level over 14 cell lines. Range: [0.010, 226.64] (default: 0.38). | Float |
Epigenetics | encodeh4k20me1_sum | Maximum Encode H4K20me1 level over 11 cell lines. Range: [0.010, 226.64] (default: 0.47). | Float |
Epigenetics | encodetotal_rna_sum | Maximum Encode totalRNA-seq level over 10 cell lines (minus and plus strand separately). Range: [0, 385096] (default: 0.0). | Float |
Epigenetics | gc | Percent GC in a window of +/- 75bp. Range: [0, 1] (default: 0.42). | Float |
Epigenetics | cpg | Percent CpG in a window of +/- 75bp. Range: [0, 0.604] (default: 0.02). | Float |
TranscriptionFactors | remap_overlap_cl | Remap number of different transcription factor - cell line combinations binding. Range: [1, 1068] (default: -0.5). | Int |
TranscriptionFactors | remap_overlap_tf | Remap number of different transcription factors binding. Range: [1, 350] (default: -0.5). | Int |
ChromatinStates | chmm_e1 | Number of 48 cell types in chromHMM state E1_poised (default: 1.92) | Float |
ChromatinStates | chmm_e2 | Number of 48 cell types in chromHMM state E2_repressed (default: 1.92) | Float |
ChromatinStates | chmm_e3 | Number of 48 cell types in chromHMM state E3_dead (default: 1.92) | Float |
ChromatinStates | chmm_e4 | Number of 48 cell types in chromHMM state E4_dead (default: 1.92) | Float |
ChromatinStates | chmm_e5 | Number of 48 cell types in chromHMM state E5_repressed (default: 1.92) | Float |
ChromatinStates | chmm_e6 | Number of 48 cell types in chromHMM state E6_repressed (default: 1.92) | Float |
ChromatinStates | chmm_e7 | Number of 48 cell types in chromHMM state E7_weak (default: 1.92) | Float |
ChromatinStates | chmm_e8 | Number of 48 cell types in chromHMM state E8_gene (default: 1.92) | Float |
ChromatinStates | chmm_e9 | Number of 48 cell types in chromHMM state E9_gene (default: 1.92) | Float |
ChromatinStates | chmm_e10 | Number of 48 cell types in chromHMM state E10_gene (default: 1.92) | Float |
ChromatinStates | chmm_e11 | Number of 48 cell types in chromHMM state E11_gene (default: 1.92) | Float |
ChromatinStates | chmm_e12 | Number of 48 cell types in chromHMM state E12_distal (default: 1.92) | Float |
ChromatinStates | chmm_e13 | Number of 48 cell types in chromHMM state E13_distal (default: 1.92) | Float |
ChromatinStates | chmm_e14 | Number of 48 cell types in chromHMM state E14_distal (default: 1.92) | Float |
ChromatinStates | chmm_e15 | Number of 48 cell types in chromHMM state E15_weak (default: 1.92) | Float |
ChromatinStates | chmm_e16 | Number of 48 cell types in chromHMM state E16_tss (default: 1.92) | Float |
ChromatinStates | chmm_e17 | Number of 48 cell types in chromHMM state E17_proximal (default: 1.92) | Float |
ChromatinStates | chmm_e18 | Number of 48 cell types in chromHMM state E18_proximal (default: 1.92) | Float |
ChromatinStates | chmm_e19 | Number of 48 cell types in chromHMM state E19_tss (default: 1.92) | Float |
ChromatinStates | chmm_e20 | Number of 48 cell types in chromHMM state E20_poised (default: 1.92) | Float |
ChromatinStates | chmm_e21 | Number of 48 cell types in chromHMM state E21_dead (default: 1.92) | Float |
ChromatinStates | chmm_e22 | Number of 48 cell types in chromHMM state E22_repressed (default: 1.92) | Float |
ChromatinStates | chmm_e23 | Number of 48 cell types in chromHMM state E23_weak (default: 1.92) [@ernst2015large] | Float |
ChromatinStates | chmm_e24 | Number of 48 cell types in chromHMM state E24_distal (default: 1.92) | Float |
ChromatinStates | chmm_e25 | Number of 48 cell types in chromHMM state E25_distal (default: 1.92) | Float |
LocalNucleotideDiversity | recombination_rate | Recombination rate measures how likely the region is to undergo recombination. Range: [0, 54.96] (default: 0). | Float |
LocalNucleotideDiversity | bstatistic | Background selection score. A background selection (B) value for each position in the genome. B indicates the expected fraction of neutral diversity that is present at a site, with values close to 0 representing near complete removal of diversity as a result of selection and values near 1000 indicating little effect of selection. Range: [0, 1000] (default: 800). | Int |
LocalNucleotideDiversity | nucdiv | Nuclear diversity measures how likely the region is to diversify. Range: [0.05, 60.25] (default: 0). | Float |
MutationDensity | freq100bp | Number of common (MAF > 0.05) BRAVO SNVs in the nearby 100 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutations. Scores range from 0 to 100. Range: [0, 14] (default: 0). | Int |
MutationDensity | rare100bp | Number of rare (MAF < 0.05) BRAVO SNVs in the nearby 100 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutations. Scores range from 0 to 100. Range: [0, 31] (default: 0). | Int |
MutationDensity | sngl100bp | Number of single-occurrence BRAVO SNVs in the nearby 100 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutations. Scores range from 0 to 100. Range: [0, 99] (default: 0). | Int |
MutationDensity | freq1000bp | Number of common (MAF > 0.05) BRAVO SNVs in the nearby 1000 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutations. Scores range from 0 to 1000. Range: [0, 73] (default: 0). | Int |
MutationDensity | rare1000bp | Number of rare (MAF < 0.05) BRAVO SNVs in the nearby 1000 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutations. Scores range from 0 to 1000. Range: [0, 74] (default: 0). | Int |
MutationDensity | sngl1000bp | Number of single-occurrence BRAVO SNVs in the nearby 1000 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutations. Scores range from 0 to 1000. Range: [0, 658] (default: 0). | Int |
MutationDensity | freq10000bp | Number of common (MAF > 0.05) BRAVO SNVs in the nearby 10000 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutations. Scores range from 0 to 10000. Range: [0, 443] (default: 0). | Int |
MutationDensity | rare10000bp | Number of rare (MAF < 0.05) BRAVO SNVs in the nearby 10000 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutations. Scores range from 0 to 10000. Range: [0, 355] (default: 0). | Int |
MutationDensity | sngl10000bp | Number of single-occurrence BRAVO SNVs in the nearby 10000 bp window. A higher value indicates that more mutations occur in the region and a higher likelihood of mutations. Scores range from 0 to 10000. Range: [0, 4750] (default: 0). | Int |
Mappability | k24_bismap | Mappability of the bisulfite-converted genome. Bisulfite sequencing approaches used to identify DNA methylation introduce large numbers of reads that map to multiple regions. This annotation identifies mappability of the bisulfite-converted genome. Range: [0, 1] (default: 0). | Float |
Mappability | k24_umap | Mappability of unconverted genome. It measures the extent to which a position can be uniquely mapped by sequence reads. Lower mappability means the estimates of genomic and epigenomic characteristics from sequencing assays are less reliable, and the region has increased susceptibility to spurious mapping from reads from other regions of the genome with sequencing errors or unexpected genetic variation. Range: [0, 1] (default: 0). | Float |
Mappability | k36_bismap | Mappability of the bisulfite-converted genome. Bisulfite sequencing approaches used to identify DNA methylation introduce large numbers of reads that map to multiple regions. This annotation identifies mappability of the bisulfite-converted genome. Range: [0, 1] (default: 0). | Float |
Mappability | k36_umap | Mappability of unconverted genome. It measures the extent to which a position can be uniquely mapped by sequence reads. Lower mappability means the estimates of genomic and epigenomic characteristics from sequencing assays are less reliable, and the region has increased susceptibility to spurious mapping from reads from other regions of the genome with sequencing errors or unexpected genetic variation. Range: [0, 1] (default: 0). | Float |
Mappability | k50_bismap | Mappability of the bisulfite-converted genome. Bisulfite sequencing approaches used to identify DNA methylation introduce large numbers of reads that map to multiple regions. This annotation identifies mappability of the bisulfite-converted genome. Range: [0, 1] (default: 0). | Float |
Mappability | k50_umap | Mappability of unconverted genome. It measures the extent to which a position can be uniquely mapped by sequence reads. Lower mappability means the estimates of genomic and epigenomic characteristics from sequencing assays are less reliable, and the region has increased susceptibility to spurious mapping from reads from other regions of the genome with sequencing errors or unexpected genetic variation. Range: [0, 1] (default: 0). | Float |
Mappability | k100_bismap | Mappability of the bisulfite-converted genome. Bisulfite sequencing approaches used to identify DNA methylation introduce large numbers of reads that map to multiple regions. This annotation identifies mappability of the bisulfite-converted genome. Range: [0, 1] (default: 0). | Float |
Mappability | k100_umap | Mappability of unconverted genome. It measures the extent to which a position can be uniquely mapped by sequence reads. Lower mappability means the estimates of genomic and epigenomic characteristics from sequencing assays are less reliable, and the region has increased susceptibility to spurious mapping from reads from other regions of the genome with sequencing errors or unexpected genetic variation. Range: [0, 1] (default: 0). | Float |
ProximityTable | minDistTSE | Distance to closest Transcribed Sequence End (TSE). Range: [1, 3608885] (default: 1e7). | Int |
ProximityTable | minDistTSS | Distance to closest Transcribed Sequence Start (TSS). Range: [1, 3604063] (default: 1e7). | Int |
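
The ranges and defaults listed in the table can be used to fill missing values and to sanity-check annotated records. The following is a minimal Python sketch, not a KGGA or FAVOR API: `FIELD_SPECS`, `fill_defaults`, `out_of_range`, and `phred_to_top_fraction` are hypothetical names, and a record is assumed to be a plain dictionary keyed by field name. The numbers are copied from a few rows of the table. The last helper assumes that "PHRED scale" is the usual rank-based transform, -10 * log10(rank / total), as used by CADD; that assumption is not stated in the table itself.

```python
import math

# (min, max, default) copied from a few rows of the table above.
# A default of None means the table lists no default for that field.
FIELD_SPECS = {
    "cadd_phred": (0.0, 99.0, None),
    "sift_val": (0.0, 1.0, 1.0),
    "grantham": (0.0, 215.0, 0.0),
    "bstatistic": (0.0, 1000.0, 800.0),
    "minDistTSS": (1.0, 3604063.0, 1e7),
}


def fill_defaults(record):
    """Return a copy of record with missing fields set to the table defaults."""
    filled = dict(record)
    for field, (_low, _high, default) in FIELD_SPECS.items():
        if filled.get(field) is None and default is not None:
            filled[field] = default
    return filled


def out_of_range(record):
    """List fields whose non-default value lies outside the documented range."""
    bad = []
    for field, (low, high, default) in FIELD_SPECS.items():
        value = record.get(field)
        if value is None or value == default:
            # Some defaults (e.g. 1e7 for minDistTSS) lie outside the range.
            continue
        if not low <= value <= high:
            bad.append(field)
    return bad


def phred_to_top_fraction(phred):
    """Assumed rank-based PHRED scale: fraction of variants scoring at least this high."""
    return math.pow(10.0, -phred / 10.0)
```

Under that assumption, a PHRED-scaled value of 20 corresponds to the top 1% of variants and a value of 30 to the top 0.1%. Note that some defaults in the table, such as 1e7 for `minDistTSS`, fall outside the documented range, which is why the range check skips values equal to the default.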