gnomAD Allele Frequency

The Genome Aggregation Database (gnomAD) is a widely used resource for genetic studies, providing a large-scale catalog of human genetic variants and their allele frequencies. The latest release, gnomAD v4.1, incorporates sequencing data from 807,162 individuals, roughly a fivefold increase over earlier releases. It includes two core datasets:

  1. Exome Dataset: Covers 730,947 individuals, including 416,555 samples from the UK Biobank.
  2. Whole Genome Dataset: Includes 76,215 individuals with whole-genome sequencing data.
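As a quick consistency check, the two sample counts above should add up to the total quoted for v4.1. The variable names in this sketch are illustrative only:

```shell
# Sample counts quoted above for gnomAD v4.1
exomes=730947    # exome dataset
genomes=76215    # whole-genome dataset

# The sum should match the stated total of 807,162 individuals
total=$((exomes + genomes))
echo "$total"
```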

All data are aligned to the GRCh38 (hg38) reference genome and integrate resources from projects such as ExAC, 1000 Genomes, and UK Biobank. A standardized QC pipeline filters out low-quality samples and variants to ensure data reliability.

KGGA has extracted three distinct datasets from gnomAD v4.1: the whole-genome sequencing frequency data (gnomAD_genomes), the exome sequencing frequency data (gnomAD_exomes), and the joint dataset (gnomAD_joint). Together they provide alternate allele frequencies across global populations for use in disease association studies.

Generate gnomAD Annotation Files

  1. Switch to the gnomAD directory:
     cd path/to/gnomad/vcf/files
  2. Run the command to generate the annotation file:
     java -Dccf.compressor.zstd.level=16 -jar kgga.jar gbc make-database --gnomad $(echo $(ls *.vcf.bgz) | tr ' ' ',') -o gnomad.joint.v4.1.sites.hg38.gtb -t 20

Command Explanation

Parameter                              Description
-Dccf.compressor.zstd.level=16         Sets the Zstandard compression level to 16, trading longer compression time for a smaller output file.
-jar kgga.jar                          Runs the kgga.jar program.
gbc make-database                      Invokes the gbc tool to create a database.
--gnomad                               Specifies the input files as gnomAD VCF files, given as a comma-separated list.
echo $(ls *.vcf.bgz)                   Outputs the list of .vcf.bgz files in the current directory as a single space-separated line.
tr ' ' ','                             Replaces the spaces in the file list with commas to produce the comma-separated list.
-o gnomad.joint.v4.1.sites.hg38.gtb    Specifies the output file name.
-t 20                                  Uses 20 threads for parallel processing.
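The echo/ls/tr pipeline splits on spaces, so it assumes that no file name contains a space. A minimal alternative sketch builds the same comma-separated list directly from the shell glob. The function name join_vcfs is hypothetical and not part of KGGA:

```shell
# Join all bgzipped VCFs in the current directory into one comma-separated string.
# join_vcfs is an illustrative helper, not a KGGA command.
join_vcfs() {
  # Expand the glob into this function's positional parameters
  set -- *.vcf.bgz
  # If nothing matched, the pattern is left unexpanded and no such file exists
  [ -e "$1" ] || { echo "no .vcf.bgz files found" >&2; return 1; }
  # "$*" joins the parameters with the first character of IFS
  local IFS=,
  printf '%s\n' "$*"
}
```

The result can then be passed as the argument of --gnomad, for example by writing "$(join_vcfs)" in place of the echo/ls/tr substitution shown above.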

gnomAD Annotation Fields

Annotation Field      Description
gnomAD_joint@ALL      Alternate allele frequency across all samples in the joint dataset.
gnomAD_joint@AFR      Alternate allele frequency in samples of African/African American ancestry in the joint dataset.
gnomAD_joint@AMI      Alternate allele frequency in samples of Amish ancestry in the joint dataset.
gnomAD_joint@AMR      Alternate allele frequency in samples of Latino/Admixed American ancestry in the joint dataset.
gnomAD_joint@ASJ      Alternate allele frequency in samples of Ashkenazi Jewish ancestry in the joint dataset.
gnomAD_joint@EAS      Alternate allele frequency in samples of East Asian ancestry in the joint dataset.
gnomAD_joint@FIN      Alternate allele frequency in samples of Finnish ancestry in the joint dataset.
gnomAD_joint@MID      Alternate allele frequency in samples of Middle Eastern ancestry in the joint dataset.
gnomAD_joint@NFE      Alternate allele frequency in samples of Non-Finnish European ancestry in the joint dataset.
gnomAD_joint@SAS      Alternate allele frequency in samples of South Asian ancestry in the joint dataset.
gnomAD_genomes@ALL    Alternate allele frequency across all samples in the genomes dataset.
gnomAD_genomes@AFR    Alternate allele frequency in samples of African/African American ancestry in the genomes dataset.
gnomAD_genomes@AMI    Alternate allele frequency in samples of Amish ancestry in the genomes dataset.
gnomAD_genomes@AMR    Alternate allele frequency in samples of Latino/Admixed American ancestry in the genomes dataset.
gnomAD_genomes@ASJ    Alternate allele frequency in samples of Ashkenazi Jewish ancestry in the genomes dataset.
gnomAD_genomes@EAS    Alternate allele frequency in samples of East Asian ancestry in the genomes dataset.
gnomAD_genomes@FIN    Alternate allele frequency in samples of Finnish ancestry in the genomes dataset.
gnomAD_genomes@MID    Alternate allele frequency in samples of Middle Eastern ancestry in the genomes dataset.
gnomAD_genomes@NFE    Alternate allele frequency in samples of Non-Finnish European ancestry in the genomes dataset.
gnomAD_genomes@SAS    Alternate allele frequency in samples of South Asian ancestry in the genomes dataset.
gnomAD_exomes@ALL     Alternate allele frequency across all samples in the exomes dataset.
gnomAD_exomes@AFR     Alternate allele frequency in samples of African/African American ancestry in the exomes dataset.
gnomAD_exomes@AMI     Alternate allele frequency in samples of Amish ancestry in the exomes dataset.
gnomAD_exomes@AMR     Alternate allele frequency in samples of Latino/Admixed American ancestry in the exomes dataset.
gnomAD_exomes@ASJ     Alternate allele frequency in samples of Ashkenazi Jewish ancestry in the exomes dataset.
gnomAD_exomes@EAS     Alternate allele frequency in samples of East Asian ancestry in the exomes dataset.
gnomAD_exomes@FIN     Alternate allele frequency in samples of Finnish ancestry in the exomes dataset.
gnomAD_exomes@MID     Alternate allele frequency in samples of Middle Eastern ancestry in the exomes dataset.
gnomAD_exomes@NFE     Alternate allele frequency in samples of Non-Finnish European ancestry in the exomes dataset.
gnomAD_exomes@SAS     Alternate allele frequency in samples of South Asian ancestry in the exomes dataset.
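These fields are typically used to keep only rare variants. The sketch below assumes the annotated variants have been exported to a tab-delimited file with a header row naming the fields. That layout, the helper name filter_rare, and the choice to keep variants with a missing frequency are all assumptions for illustration, not documented KGGA behavior:

```shell
# Keep the header plus rows whose frequency in the named column is below a threshold.
# Assumes a tab-delimited file with a header row (hypothetical export layout).
filter_rare() {
  # $1: input TSV, $2: column name (e.g. gnomAD_joint@ALL), $3: maximum allele frequency
  awk -F'\t' -v col="$2" -v max="$3" '
    NR == 1 {
      # Locate the requested column by its header name
      for (i = 1; i <= NF; i++) if ($i == col) c = i
      if (!c) { print "column not found: " col > "/dev/stderr"; exit 1 }
      print; next
    }
    # Missing values ("." or empty) are treated as absent from gnomAD and kept
    $c == "." || $c == "" || ($c + 0) < (max + 0)
  ' "$1"
}
```

For example, filter_rare variants.tsv 'gnomAD_joint@ALL' 0.01 would print the header and every variant whose joint-dataset frequency is below 1% or missing. Here variants.tsv is a placeholder file name.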
Copyright © MiaoXin Li, all rights reserved. Last modified time: 2025-03-25 08:00:27