VCF-based Quality Control

Genotype-level Quality Control

A genotype is removed if it fails at least one of the following filters.

Option Description Default
--gty-gq Exclude genotypes with the minimal genotype quality (Phred Quality Score) per genotype < minGq. Set to ‘0’ to disable this filter.
Format: --gty-gq <minGq>
Example: --gty-gq 20
Valid setting: [int] >=0
20
--gty-dp Exclude genotypes with the minimal read depth per genotype < minDp. Set to ‘0’ to disable this filter.
Format: --gty-dp <minDp>
Example: --gty-dp 8
Valid setting: [int] >=0
8
--gty-pl Exclude genotypes whose second-smallest normalized Phred-scaled genotype likelihood (PL) is < minPl. A low second-smallest PL means that another genotype is almost as likely as the called one, so the call is ambiguous. Set to ‘0’ to disable this filter.
Format: --gty-pl <minPl>
Example: --gty-pl 20
Valid setting: [int] >=0
20
--gty-ad-hom-ref Exclude genotypes with the fraction of the reads carrying alternative allele > maxAdHomRef at a reference-allele homozygous genotype. Set to ‘1’ to disable this filter.
Format: --gty-ad-hom-ref <maxAdHomRef>
Example: --gty-ad-hom-ref 0.05
Valid setting: [float] 0.0 ~ 1.0
0.05
--gty-ad-hom-alt Exclude genotypes with the fraction of the reads carrying alternative allele < minAdHomAlt at an alternative-allele homozygous genotype. Set to ‘0’ to disable this filter.
Format: --gty-ad-hom-alt <minAdHomAlt>
Example: --gty-ad-hom-alt 0.75
Valid setting: [float] 0.0 ~ 1.0
0.75
--gty-ad-het Exclude genotypes with the fraction of the reads carrying alternative allele < minAdHet at a heterozygous genotype. Set to ‘0’ to disable this filter.
Format: --gty-ad-het <minAdHet>
Example: --gty-ad-het 0.25
Valid setting: [float] 0.0 ~ 1.0
0.25
--gty-qc Exclude genotypes whose quality metric, identified by the keyword, does not pass the Java expression rule. --gty-qc takes a combination of parameters so that custom genotype QC needs can be met. The default tag controls whether the genotype is retained or discarded when a parsing error occurs during quality control.
Format: --gty-qc <keyword> <rule> default=[RETAIN/DISCARD]
Example: --gty-qc DP "DP.toInt()>=10" default=DISCARD
[OFF]
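
The threshold filters in the table above can be illustrated with a minimal Python sketch. It is not the tool's implementation: the `Genotype` class, its field names, and `passes_genotype_qc` are hypothetical, and the sketch assumes diploid genotypes at biallelic sites with AD given as [reference reads, alternative reads]. The defaults mirror the table, and the documented "disable" values (0 for minimum thresholds, 1 for maxAdHomRef) make the corresponding check always pass.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Genotype:
    """Hypothetical container for the per-genotype FORMAT values of one sample."""
    gt: str          # genotype string, e.g. "0/0", "0/1", "1/1"
    gq: int          # genotype quality (Phred scale)
    dp: int          # read depth
    pl: List[int]    # normalized Phred-scaled likelihoods (smallest is 0)
    ad: List[int]    # allelic depths: [ref_reads, alt_reads]


def passes_genotype_qc(g, min_gq=20, min_dp=8, min_pl=20,
                       max_ad_hom_ref=0.05, min_ad_hom_alt=0.75,
                       min_ad_het=0.25):
    """Return True if the genotype survives every filter (illustrative only)."""
    if g.gq < min_gq:
        return False
    if g.dp < min_dp:
        return False
    # The second-smallest PL measures how far the runner-up genotype is
    # from the called one; a small value means an ambiguous call.
    if len(g.pl) >= 2 and sorted(g.pl)[1] < min_pl:
        return False

    alleles = g.gt.replace("|", "/").split("/")
    total_reads = sum(g.ad)
    # Skip the allelic-depth checks for missing calls or when no reads exist.
    if total_reads > 0 and "." not in alleles:
        alt_fraction = (total_reads - g.ad[0]) / total_reads
        if all(a == "0" for a in alleles):          # homozygous reference
            if alt_fraction > max_ad_hom_ref:
                return False
        elif len(set(alleles)) == 1:                # homozygous alternative
            if alt_fraction < min_ad_hom_alt:
                return False
        else:                                       # heterozygous
            if alt_fraction < min_ad_het:
                return False
    return True
```

For instance, a heterozygous call with GQ 50, DP 30, PL [40, 0, 60], and AD [15, 15] passes all default thresholds, whereas a reference-homozygous call with AD [27, 3] fails because 10% of its reads carry the alternative allele.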

Variant-level Quality Control

Option Description Default
--allele-num Exclude variants with the alternative allele number per variant outside the range [minAlleleNum, maxAlleleNum].
Format: --allele-num <minAlleleNum>~<maxAlleleNum>
Example: --allele-num 2~4
Valid setting: [int] 0 ~ 255
[OFF]
--seq-ac Exclude variants with the alternative allele count (AC) per variant outside the range [minAc, maxAc].
Format: --seq-ac <minAc>~<maxAc>
Example: --seq-ac 1~10
Valid setting: [int] >=0
[OFF]
--seq-an Exclude variants with the non-missing allele number (AN) per variant outside the range [minAn, maxAn].
Format: --seq-an <minAn>~<maxAn>
Example: --seq-an 160~200
Valid setting: [int] >=0
[OFF]
--seq-af Exclude variants with the alternative allele frequency (AF) per variant outside the range [minAf, maxAf].
Format: --seq-af <minAf>~<maxAf>
Example: --seq-af 0.05~1.0
Valid setting: [float] 0.0 ~ 1.0
[OFF]
--seq-qual Exclude variants with the minimal overall sequencing quality score (Phred Quality Score) per variant < minQual.
Format: --seq-qual <minQual>
Example: --seq-qual 30
Valid setting: [float] >=0.0
30
--seq-mq Exclude variants with the minimal overall mapping quality score (Mapping Quality Score) per variant < minMq.
Format: --seq-mq <minMq>
Example: --seq-mq 20
Valid setting: [float] >=0.0
20
--seq-fs Exclude variants with the overall strand bias Phred-scaled p-value (using Fisher’s exact test) per variant > maxFs. The strand bias estimation is best suited for low-coverage situations. Set to ‘100’ to disable this filter, as the maximal Phred-scaled p-value is 100.
Format: --seq-fs <maxFs>
Example: --seq-fs 60
Valid setting: [float] >=0.0
100
--seq-info Exclude variants where the value of the specified keyword in the INFO field does not pass the Java expression rule. --seq-info takes a combination of parameters. The default tag controls whether the variant is retained or discarded when a parsing error occurs during quality control.
Format: --seq-info <keyword> <rule> default=[RETAIN/DISCARD]
Example: --seq-info keyword=MQ rule=MQ.char2Float()>=20 default=DISCARD
[OFF]
--seq-filter Exclude variants where the value of the FILTER field of the VCF does not pass quality control. The rule is a Java expression in which value holds the content of the FILTER field as a Java String, e.g., ‘value.XXX(string)’.
Format: --seq-filter <Java String expression>
Example:
--seq-filter value.valueEquals(\"PASS\") will exclude variants at which the FILTER field is not equal to PASS
--seq-filter value.indexOf("q10") != -1 will exclude variants at which the FILTER field contains q10
[OFF]
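
Several options above take a range written as min~max. The following Python sketch shows one way such a range could be parsed and applied to AC, AN, and AF. It is an illustration under stated assumptions rather than the tool's code: `parse_range`, `in_range`, and `keep_variant` are hypothetical names, the bounds are treated as inclusive (as the [min, max] notation suggests), and AF is computed as AC/AN.

```python
def parse_range(text, cast=float):
    """Parse a 'min~max' option value into a (min, max) tuple."""
    low, high = text.split("~")
    return cast(low), cast(high)


def in_range(value, bounds):
    """Inclusive range check, matching the [min, max] notation."""
    low, high = bounds
    return low <= value <= high


def keep_variant(ac, an, qual, seq_ac=None, seq_an=None, seq_af=None,
                 min_qual=30.0):
    """Return True if the variant survives the enabled filters.

    A filter whose option is None is treated as switched off ([OFF]).
    """
    if qual < min_qual:
        return False
    if seq_ac is not None and not in_range(ac, parse_range(seq_ac, int)):
        return False
    if seq_an is not None and not in_range(an, parse_range(seq_an, int)):
        return False
    if seq_af is not None:
        # Assumption: AF is the alternative allele count over the
        # number of non-missing alleles.
        af = ac / an if an > 0 else 0.0
        if not in_range(af, parse_range(seq_af, float)):
            return False
    return True
```

With `seq_ac="1~10"` and `seq_an="160~200"`, a variant with AC 5 and AN 180 is kept, while the same variant is dropped once `seq_af="0.05~1.0"` is added, because 5/180 is below 0.05.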

Turn Off Quality Control

All quality control options mentioned above, including genotype-level and variant-level quality control options, can be turned off by --disable-qc.

Option Description Default
--disable-qc Disable all quality control options mentioned above.
NOTE: It cannot be used in conjunction with other quality control options, as it will render them ineffective.
Format: --disable-qc
[OFF]

Mutation Type

Option Description Default
--only-snv Only single-nucleotide variants (SNVs) are retained and analyzed.
Format: --only-snv
[OFF]
--only-indel Only small insertion or deletion (InDel, <=50 bp) variants are retained and analyzed.
Format: --only-indel
[OFF]

For the filters below, if a PED file is provided, only the genotype data of individuals present in both the VCF and PED files are used. Without a PED file, the filters use the genotype data of all individuals listed in the VCF file.

Allele Frequency

Option Description Default
--local-af Exclude variants in all subjects with alternative allele frequency (AF) outside the range [minAF, maxAF].
Format: --local-af <minAF>~<maxAF>
Example: --local-af 0.05~1.0
Valid setting: [float] 0.0 ~ 1.0
[OFF]
--local-af-case Exclude variants in cases with alternative allele frequency (AF) outside the range [minAF, maxAF].
Format: --local-af-case <minAF>~<maxAF>
Example: --local-af-case 0.05~1.0
Valid setting: [float] 0.0 ~ 1.0
[OFF]
--local-af-control Exclude variants in controls with alternative allele frequency (AF) outside the range [minAF, maxAF].
Format: --local-af-control <minAF>~<maxAF>
Example: --local-af-control 0.05~1.0
Valid setting: [float] 0.0 ~ 1.0
[OFF]
--min-case-control-af-ratio Exclude variants at which the alternative allele frequency (AF) in cases is less than the AF in controls multiplied by a specified ratio.
Format: --min-case-control-af-ratio <ratio>
Example: --min-case-control-af-ratio 2.0
Valid setting: [float] >= 0.0
[OFF]
--local-maf Exclude variants in all subjects with minor allele frequency (MAF) outside the range [minMAF, maxMAF]. By definition, MAF is the frequency of the less common allele. Note that the reference allele of the human reference genome is not always the common or “major” allele in the human population. When AF <= 0.5, MAF equals AF; when AF > 0.5, MAF is calculated as 1 - AF.
Format: --local-maf <minMAF>~<maxMAF>
Example: --local-maf 0.05~0.5
Valid setting: [float] 0.0 ~ 0.5
[OFF]
--local-maf-case Exclude variants in cases with minor allele frequency (MAF) outside the range [minMAF, maxMAF].
Format: --local-maf-case <minMAF>~<maxMAF>
Example: --local-maf-case 0.05~0.5
Valid setting: [float] 0.0 ~ 0.5
[OFF]
--local-maf-control Exclude variants in controls with minor allele frequency (MAF) outside the range [minMAF, maxMAF].
Format: --local-maf-control <minMAF>~<maxMAF>
Example: --local-maf-control 0.05~0.5
Valid setting: [float] 0.0 ~ 0.5
[OFF]
--min-case-control-maf-ratio Exclude variants at which the minor allele frequency (MAF) in cases is less than the MAF in controls multiplied by a specified ratio.
Format: --min-case-control-maf-ratio <ratio>
Example: --min-case-control-maf-ratio 2.0
Valid setting: [float] >= 0.0
[OFF]
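
The relationship between AF, MAF, and the case/control ratio can be made concrete with a short Python sketch. The function names are hypothetical, and the sketch assumes diploid genotypes at a biallelic site, encoded as the number of alternative alleles per subject (0, 1, or 2) with `None` for a missing genotype. It is not taken from the tool's source.

```python
def alt_allele_frequency(alt_counts):
    """AF over non-missing diploid genotypes; None if nothing was called."""
    called = [c for c in alt_counts if c is not None]
    if not called:
        return None
    return sum(called) / (2 * len(called))


def minor_allele_frequency(af):
    """MAF equals AF when AF <= 0.5, and 1 - AF otherwise."""
    return af if af <= 0.5 else 1.0 - af


def passes_case_control_ratio(freq_case, freq_control, ratio):
    """Keep the variant unless the case frequency is below control * ratio."""
    return freq_case >= freq_control * ratio


# Example: alternative-allele counts per subject, split by phenotype.
cases = [1, 1, 0, 2, None]
controls = [0, 0, 1, 0, 0]
af_case = alt_allele_frequency(cases)        # 4 alt alleles / 8 called alleles
af_control = alt_allele_frequency(controls)  # 1 alt allele / 10 called alleles
keep = passes_case_control_ratio(af_case, af_control, 2.0)
```

The same ratio check applies to --min-case-control-maf-ratio once both frequencies have been converted with `minor_allele_frequency`.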

Missing Genotype Rate

Option Description Default
--min-obs-rate Exclude variants in all subjects with the observed rate of non-missing genotypes < minObsRate.
Format: --min-obs-rate <minObsRate>
Example: --min-obs-rate 0.8
Valid setting: [float] 0.0 ~ 1.0
[OFF]
--min-obs-rate-case Exclude variants in cases with the observed rate of non-missing genotypes < minObsRate.
Format: --min-obs-rate-case <minObsRate>
Example: --min-obs-rate-case 0.8
Valid setting: [float] 0.0 ~ 1.0
[OFF]
--min-obs-rate-control Exclude variants in controls with the observed rate of non-missing genotypes < minObsRate.
Format: --min-obs-rate-control <minObsRate>
Example: --min-obs-rate-control 0.8
Valid setting: [float] 0.0 ~ 1.0
[OFF]
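
As a rough illustration of the observed-rate filters above, the sketch below computes the fraction of non-missing genotypes for all subjects and for a case or control subset. The function names and the `None` encoding for a missing genotype are assumptions made for this example only.

```python
def observed_rate(genotypes):
    """Fraction of subjects with a non-missing genotype (None = missing)."""
    if not genotypes:
        return 0.0
    return sum(g is not None for g in genotypes) / len(genotypes)


def passes_min_obs_rate(genotypes, min_obs_rate):
    """A variant is excluded when its observed rate is below the threshold."""
    return observed_rate(genotypes) >= min_obs_rate


# Example: one variant genotyped in three cases and two controls.
genotypes = {"s1": "0/1", "s2": None, "s3": "0/0", "s4": "1/1", "s5": None}
status = {"s1": "case", "s2": "case", "s3": "case",
          "s4": "control", "s5": "control"}

all_calls = list(genotypes.values())
case_calls = [g for s, g in genotypes.items() if status[s] == "case"]

rate_all = observed_rate(all_calls)     # 3 of 5 genotypes observed
rate_case = observed_rate(case_calls)   # 2 of 3 genotypes observed
```

With a threshold of 0.8, this variant would be excluded both by the all-subject filter and by the case-specific filter.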

Hardy-Weinberg Equilibrium

Option Description Default
--hwe Exclude variants in all subjects with the Hardy-Weinberg test p value <= pThreshold.
Format: --hwe <pThreshold>
Example: --hwe 1E-5
Valid setting: [double] 0.0 ~ 1.0
[OFF]
--hwe-case Exclude variants in cases with the Hardy-Weinberg test p value <= pThreshold.
Format: --hwe-case <pThreshold>
Example: --hwe-case 1E-5
Valid setting: [double] 0.0 ~ 1.0
[OFF]
--hwe-control Exclude variants in controls with the Hardy-Weinberg test p value <= pThreshold.
Format: --hwe-control <pThreshold>
Example: --hwe-control 1E-5
Valid setting: [double] 0.0 ~ 1.0
[OFF]
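
To show how a Hardy-Weinberg p value relates to the threshold, here is a minimal Python sketch based on the one-degree-of-freedom chi-square goodness-of-fit test. This is a swapped-in approximation chosen for brevity; the documentation above does not state which test the tool uses, and an exact test may give different p values, especially for rare alleles. The function names are hypothetical.

```python
import math


def hwe_chi_square_p(n_hom_ref, n_het, n_hom_alt):
    """Chi-square (1 d.f.) p value for deviation from Hardy-Weinberg
    proportions at a biallelic site. Illustrative approximation only."""
    n = n_hom_ref + n_het + n_hom_alt
    if n == 0:
        return 1.0
    p = (2 * n_hom_ref + n_het) / (2 * n)   # reference allele frequency
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0                          # monomorphic: nothing to test
    observed = (n_hom_ref, n_het, n_hom_alt)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 d.f.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_hwe(n_hom_ref, n_het, n_hom_alt, p_threshold=1e-5):
    """Variants with p value <= pThreshold are excluded."""
    return hwe_chi_square_p(n_hom_ref, n_het, n_hom_alt) > p_threshold
```

Genotype counts of 25, 50, and 25 match the expected proportions exactly and are kept, while counts of 50, 0, and 50 (no heterozygotes at all) yield a p value far below 1E-5 and are excluded.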