Region/Gene Prioritization with the prioritize Module
About
The prioritize module in KGGA ranks genomic regions and genes based on the presence of multiple mutations. Beyond analyzing allelic distribution differences associated with phenotypes, it incorporates genomic feature annotations, such as functional prediction scores and allele frequencies from reference populations. This integrated approach helps researchers prioritize the regions or genes most likely to influence the phenotype under investigation, which is particularly useful in genetic studies of complex traits or diseases.
Functions in the Prioritize Module
The prioritize module currently provides three specialized functions, each addressing a distinct aspect of prioritization. They can be applied individually or in combination, depending on the research objectives and the available data.
RUNNER: Identifies and prioritizes genes or genomic regions enriched for rare sequence variants relative to the expected background mutation rate. It uses a negative binomial regression model to evaluate the enrichment of rare variants within a gene or region. The model integrates allele frequencies from reference populations (e.g., gnomAD) to identify rare variants, and functional prediction scores (such as CADD or PolyPhen scores) to assess the potential impact of each variant.
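To make the enrichment idea concrete, here is a minimal Python sketch. It is not KGGA's implementation and does not fit the regression itself. It assumes that a gene's expected rare-variant count (mu) has already been predicted, for example by a fitted regression, and that a dispersion parameter (size) is known. The function names and the numbers used are hypothetical. Given those inputs, it computes the negative binomial upper-tail probability of observing at least the observed count.

```python
import math


def nb_log_pmf(k: int, mu: float, size: float) -> float:
    """Log-probability of k events under a negative binomial with
    mean mu (> 0) and dispersion size (> 0); variance is mu + mu**2 / size."""
    p = size / (size + mu)  # "success" probability in the standard form
    return (
        math.lgamma(k + size) - math.lgamma(k + 1) - math.lgamma(size)
        + size * math.log(p) + k * math.log1p(-p)
    )


def nb_upper_tail(observed: int, mu: float, size: float) -> float:
    """P(K >= observed): chance of seeing at least this many rare variants."""
    below = sum(math.exp(nb_log_pmf(k, mu, size)) for k in range(observed))
    return max(0.0, 1.0 - below)


# Hypothetical gene: 9 rare variants observed, 3 expected, dispersion 2.
p_value = nb_upper_tail(observed=9, mu=3.0, size=2.0)
print(f"enrichment p-value (sketch): {p_value:.4g}")
```

A smaller tail probability means the gene carries more rare variants than the background model predicts. Ranking genes by this value is one simple way to turn such a model into a prioritization.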
PubMed: Prioritizes genes according to the literature in PubMed. Papers that co-mention a candidate gene and the specified phenotypes are retrieved directly from PubMed.
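The co-mention search can be illustrated with a short Python sketch that builds a PubMed query through the public NCBI E-utilities ESearch endpoint. This is an assumption about how such a search could be expressed, not a description of what KGGA does internally. The helper names, the choice of the Title/Abstract field, and the example gene and phenotype are all hypothetical. The sketch only constructs the URL and sends no request.

```python
from typing import Sequence
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def comention_term(gene: str, phenotypes: Sequence[str]) -> str:
    """Query matching papers that mention the gene AND any listed phenotype."""
    pheno = " OR ".join(f'"{p}"[Title/Abstract]' for p in phenotypes)
    return f'"{gene}"[Title/Abstract] AND ({pheno})'


def esearch_url(gene: str, phenotypes: Sequence[str], retmax: int = 20) -> str:
    """Build (but do not send) an ESearch request for the co-mention query."""
    params = {
        "db": "pubmed",
        "term": comention_term(gene, phenotypes),
        "retmode": "json",
        "retmax": retmax,
    }
    return f"{ESEARCH}?{urlencode(params)}"


# Hypothetical example: one gene, two phenotype terms.
print(esearch_url("BRCA1", ["breast cancer", "ovarian cancer"]))
```

The number of matching papers returned by such a query could then serve as a simple literature-support score for each gene.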
Basic Usage
java -jar kgga.jar prioritize --input <input1> --input <input2> --output <output> [options]