Input data:
1. A Variant Call Format (VCF) file (a simulated data set): examples/rare.disease.hg19.vcf
2. A linkage pedigree file: examples/rare.disease.ped.txt
Purpose: Identify candidate sequence variants that may cause a recessive form of arthrogryposis.
Run the commands step by step to see what happens at each stage.
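Before starting, it can help to confirm that the input files are where the commands expect them. The following is a minimal sketch in POSIX shell; check_inputs is a hypothetical helper (not part of KGGSeq), and it assumes kggseq.jar sits in the current directory, as the commands in this tutorial imply.

```shell
#!/bin/sh
# Sketch: confirm the tutorial inputs exist before running any step.
# check_inputs is a hypothetical helper, not a KGGSeq feature.
check_inputs() {
  missing=0
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      echo "missing or unreadable: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Paths as given in the tutorial; kggseq.jar is assumed to be in the current directory.
if check_inputs kggseq.jar examples/rare.disease.hg19.vcf examples/rare.disease.ped.txt; then
  echo "inputs OK"
else
  echo "fix the paths listed above before continuing"
fi
```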
1. Filter by genetic feature and inheritance model (recessive)
java -jar kggseq.jar --vcf-file examples/rare.disease.hg19.vcf --ped-file examples/rare.disease.ped.txt --out test1 --excel --genotype-filter 1,2,6
//when QC is imposed
java -jar kggseq.jar --vcf-file examples/rare.disease.hg19.vcf --ped-file examples/rare.disease.ped.txt --out test1 --excel --genotype-filter 1,2,6 --seq-qual 50 --seq-mq 20 --seq-fs 60 --gty-qual 20 --gty-dp 8
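The two commands in step 1 differ only by the five QC threshold options. As a rough sketch, the shared part and the QC part can be kept in separate shell variables so the thresholds are edited in one place. The flags and values are copied from the commands above; the variable names BASE and QC are illustrative only. The commands are printed rather than executed, so they can be inspected first.

```shell
#!/bin/sh
# Flags and values are copied from the tutorial; the variable names are illustrative.
# Plain strings are enough here because no argument contains whitespace.
BASE="java -jar kggseq.jar --vcf-file examples/rare.disease.hg19.vcf --ped-file examples/rare.disease.ped.txt --out test1 --excel --genotype-filter 1,2,6"

# The five QC thresholds used in the second command of step 1.
QC="--seq-qual 50 --seq-mq 20 --seq-fs 60 --gty-qual 20 --gty-dp 8"

# Print both variants instead of running them.
echo "$BASE"
echo "$BASE $QC"

# To execute for real in sh or bash, let the shell split the words:
# $BASE $QC
```

For the exact meaning of each threshold, consult the KGGSeq option reference; this sketch only rearranges the command text.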
2. Annotate sequence variants by RefGenes
java -jar kggseq.jar --vcf-file examples/rare.disease.hg19.vcf --ped-file examples/rare.disease.ped.txt --out test1 --excel --genotype-filter 1,2,6 --seq-qual 50 --seq-mq 20 --seq-fs 60 --gty-qual 20 --gty-dp 8 --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6
3. Filter sequence variants by common variants
java -jar kggseq.jar --vcf-file examples/rare.disease.hg19.vcf --ped-file examples/rare.disease.ped.txt --out test1 --excel --genotype-filter 1,2,6 --seq-qual 50 --seq-mq 20 --seq-fs 60 --gty-qual 20 --gty-dp 8 --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6 --db-filter-hard dbsnp138nf --db-filter 1kg201305,1kg201204,dbsnp138,dbsnp141,ESP6500AA,ESP6500EA --rare-allele-freq 0.01
4. Filter neutral sequence variants by disease-causing prediction
java -jar kggseq.jar --vcf-file examples/rare.disease.hg19.vcf --ped-file examples/rare.disease.ped.txt --out test1 --excel --genotype-filter 1,2,6 --seq-qual 50 --seq-mq 20 --seq-fs 60 --gty-qual 20 --gty-dp 8 --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6 --db-filter-hard dbsnp138nf --db-filter 1kg201305,1kg201204,dbsnp138,dbsnp141,ESP6500AA,ESP6500EA --rare-allele-freq 0.01 --db-score dbnsfp --mendel-causing-predict all --filter-nondisease-variant
5. Filter sequence variants in super-duplicate regions, which are often error-prone
java -jar kggseq.jar --vcf-file examples/rare.disease.hg19.vcf --ped-file examples/rare.disease.ped.txt --out test1 --excel --genotype-filter 1,2,6 --seq-qual 50 --seq-mq 20 --seq-fs 60 --gty-qual 20 --gty-dp 8 --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6 --db-filter-hard dbsnp138nf --db-filter 1kg201305,1kg201204,dbsnp138,dbsnp141,ESP6500AA,ESP6500EA --rare-allele-freq 0.01 --db-score dbnsfp --mendel-causing-predict all --filter-nondisease-variant --superdup-filter
6. Filter genes which have too many, say 4 or more, rare and pathogenic variants
java -jar kggseq.jar --vcf-file examples/rare.disease.hg19.vcf --ped-file examples/rare.disease.ped.txt --out test1 --excel --genotype-filter 1,2,6 --seq-qual 50 --seq-mq 20 --seq-fs 60 --gty-qual 20 --gty-dp 8 --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6 --db-filter-hard dbsnp138nf --db-filter 1kg201305,1kg201204,dbsnp138,dbsnp141,ESP6500AA,ESP6500EA --rare-allele-freq 0.01 --db-score dbnsfp --mendel-causing-predict all --filter-nondisease-variant --superdup-filter --gene-var-filter 4
7. Prioritize sequence variants by other genomic and OMIM annotation
java -jar kggseq.jar --vcf-file examples/rare.disease.hg19.vcf --ped-file examples/rare.disease.ped.txt --out test1 --excel --genotype-filter 1,2,6 --seq-qual 50 --seq-mq 20 --seq-fs 60 --gty-qual 20 --gty-dp 8 --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6 --db-filter-hard dbsnp138nf --db-filter 1kg201305,1kg201204,dbsnp138,dbsnp141,ESP6500AA,ESP6500EA --rare-allele-freq 0.01 --db-score dbnsfp --mendel-causing-predict all --filter-nondisease-variant --superdup-filter --gene-var-filter 4 --genome-annot --omim-annot
8. Prioritize sequence variants by candidate genes with protein interaction information
java -jar kggseq.jar --vcf-file examples/rare.disease.hg19.vcf --ped-file examples/rare.disease.ped.txt --out test1 --excel --genotype-filter 1,2,6 --seq-qual 50 --seq-mq 20 --seq-fs 60 --gty-qual 20 --gty-dp 8 --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6 --db-filter-hard dbsnp138nf --db-filter 1kg201305,1kg201204,dbsnp138,dbsnp141,ESP6500AA,ESP6500EA --rare-allele-freq 0.01 --db-score dbnsfp --mendel-causing-predict all --filter-nondisease-variant --superdup-filter --gene-var-filter 4 --genome-annot --omim-annot --candi-list ECEL1,MYBPC1,TNNI2,TNNT3,TPM2 --ppi-annot string --ppi-depth 1
9. Prioritize sequence variants by candidate genes with pathway information
java -jar kggseq.jar --vcf-file examples/rare.disease.hg19.vcf --ped-file examples/rare.disease.ped.txt --out test1 --excel --genotype-filter 1,2,6 --seq-qual 50 --seq-mq 20 --seq-fs 60 --gty-qual 20 --gty-dp 8 --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6 --db-filter-hard dbsnp138nf --db-filter 1kg201305,1kg201204,dbsnp138,dbsnp141,ESP6500AA,ESP6500EA --rare-allele-freq 0.01 --db-score dbnsfp --mendel-causing-predict all --filter-nondisease-variant --superdup-filter --gene-var-filter 4 --genome-annot --omim-annot --candi-list ECEL1,MYBPC1,TNNI2,TNNT3,TPM2 --ppi-annot string --ppi-depth 1 --pathway-annot cura
10. Prioritize sequence variants by PubMed
java -jar kggseq.jar --vcf-file examples/rare.disease.hg19.vcf --ped-file examples/rare.disease.ped.txt --out test1 --excel --genotype-filter 1,2,6 --seq-qual 50 --seq-mq 20 --seq-fs 60 --gty-qual 20 --gty-dp 8 --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6 --db-filter-hard dbsnp138nf --db-filter 1kg201305,1kg201204,dbsnp138,dbsnp141,ESP6500AA,ESP6500EA --rare-allele-freq 0.01 --db-score dbnsfp --mendel-causing-predict all --filter-nondisease-variant --superdup-filter --gene-var-filter 4 --genome-annot --omim-annot --candi-list ECEL1,MYBPC1,TNNI2,TNNT3,TPM2 --ppi-annot string --ppi-depth 1 --pathway-annot cura --pubmed-mining Arthrogryposis,Arthrogryposis+multiplex+congenita
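Each step in this tutorial repeats the previous command and appends a few options. The sketch below makes that pattern explicit by printing the cumulative command for steps 1 to 10 as a dry run. All options and values are copied from the commands above; the variable names (CMD, STEP_1 to STEP_10) are illustrative, and step 1 is taken with the QC thresholds included.

```shell
#!/bin/sh
# Dry-run sketch: print the cumulative command for each tutorial step.
# Option groups are copied from the tutorial; variable names are illustrative.
CMD="java -jar kggseq.jar --vcf-file examples/rare.disease.hg19.vcf --ped-file examples/rare.disease.ped.txt --out test1 --excel"

STEP_1="--genotype-filter 1,2,6 --seq-qual 50 --seq-mq 20 --seq-fs 60 --gty-qual 20 --gty-dp 8"
STEP_2="--db-gene refgene --gene-feature-in 0,1,2,3,4,5,6"
STEP_3="--db-filter-hard dbsnp138nf --db-filter 1kg201305,1kg201204,dbsnp138,dbsnp141,ESP6500AA,ESP6500EA --rare-allele-freq 0.01"
STEP_4="--db-score dbnsfp --mendel-causing-predict all --filter-nondisease-variant"
STEP_5="--superdup-filter"
STEP_6="--gene-var-filter 4"
STEP_7="--genome-annot --omim-annot"
STEP_8="--candi-list ECEL1,MYBPC1,TNNI2,TNNT3,TPM2 --ppi-annot string --ppi-depth 1"
STEP_9="--pathway-annot cura"
STEP_10="--pubmed-mining Arthrogryposis,Arthrogryposis+multiplex+congenita"

n=1
for opts in "$STEP_1" "$STEP_2" "$STEP_3" "$STEP_4" "$STEP_5" \
            "$STEP_6" "$STEP_7" "$STEP_8" "$STEP_9" "$STEP_10"; do
  # Append this step's options to everything accumulated so far.
  CMD="$CMD $opts"
  echo "# step $n"
  echo "$CMD"
  n=$((n + 1))
done
```

Printing instead of executing keeps the sketch safe to run anywhere; any printed line can be pasted into a terminal that has kggseq.jar and the example files available.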