A short tutorial on KGGSeq for annotating and prioritizing exome sequencing variants from cancer samples

Miaoxin Li (mxli@hku.hk)

 

Reference: http://statgenpro.psychiatry.hku.hk/limx/kggseq/doc/UserManual.html

Input data:

1.       A somatic variant summary file for prostate cancer [compiled from Nature Genetics 44, 685–689 (2012)]:

examples/hg19_prostate.txt

Note: Variants called in Variant Call Format (VCF) are a better input for somatic mutation analysis, because the QC and genotype filters in step 1 work only on VCF data.

Purpose: Identify cancer-driver somatic mutations, genes, and pathways in prostate cancer.


Run the commands below one step at a time and inspect the output of each.

1.       (Skipped in this tutorial because no VCF data are available.) Filter by QC and genetic features (works only for VCF data):

java -jar kggseq.jar --vcf-file XXX.vcf --ped-file XXX.ped.txt --indiv-pair NonTumor.1:Tumor.1,NonTumor.2:Tumor.2 --out test1 --excel --seq-qual 50.0 --gty-qual 20.0 --gty-sec-pl 50 --gty-dp 8 --gty-af-ref 0.05 --gty-af-het 0.25 --gty-af-alt 0.5 --gty-somat-p 0.05 --genotype-filter 8
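Judging from the command above, the value of `--indiv-pair` appears to be a comma-separated list of `NonTumor:Tumor` sample pairs. With many samples this string is tedious to type by hand, so a small helper can assemble it. The sketch below is only an illustration: `build_indiv_pairs` is a hypothetical helper name, not part of kggseq, and the sample IDs are the placeholders from the command above.

```shell
# Hypothetical helper (not part of kggseq): build the --indiv-pair value.
# Usage: build_indiv_pairs NORMAL1 TUMOR1 [NORMAL2 TUMOR2 ...]
build_indiv_pairs() {
  # Sample IDs must be given as normal/tumor pairs.
  if [ $(( $# % 2 )) -ne 0 ]; then
    echo "error: sample IDs must come in normal/tumor pairs" >&2
    return 1
  fi
  pairs=""
  while [ "$#" -gt 0 ]; do
    # Append "normal:tumor", adding a comma only between pairs.
    pairs="${pairs:+$pairs,}$1:$2"
    shift 2
  done
  printf '%s\n' "$pairs"
}

# Prints: NonTumor.1:Tumor.1,NonTumor.2:Tumor.2
build_indiv_pairs NonTumor.1 Tumor.1 NonTumor.2 Tumor.2
```

The result can be passed on the command line as `--indiv-pair "$(build_indiv_pairs NonTumor.1 Tumor.1 NonTumor.2 Tumor.2)"`.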



2.       Annotate sequence variants with RefGene:

java -jar kggseq.jar --annovar-file examples/hg19_prostate.txt --out test1 --excel --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6,7

3.       Predict cancer-driver somatic mutations and genes:

java -jar kggseq.jar --annovar-file examples/hg19_prostate.txt --out test1 --excel --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6,7 --db-score dbnsfp --cancer-driver-predict all --filter-nondisease-variant

 

4.       Predict cancer-driver biological pathways:

java -jar kggseq.jar --annovar-file examples/hg19_prostate.txt --out test1 --excel --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6,7 --db-score dbnsfp --cancer-driver-predict all --filter-nondisease-variant --pathway-db cura

 

5.       Annotate sequence variants with COSMIC somatic mutation and OMIM information:

java -jar kggseq.jar --annovar-file examples/hg19_prostate.txt --out test1 --excel --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6,7 --db-score dbnsfp --cancer-driver-predict all --filter-nondisease-variant --pathway-db cura --cosmic-annot --omim-annot

 

6.       Prioritize sequence variants by candidate genes using protein interaction information:

java -jar kggseq.jar --annovar-file examples/hg19_prostate.txt --out test1 --excel --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6,7 --db-score dbnsfp --cancer-driver-predict all --filter-nondisease-variant --pathway-db cura --cosmic-annot --omim-annot --candi-list NKX3,PTEN,TP53 --ppi-annot string --ppi-depth 1
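The command above passes the candidate genes as a comma-separated list (`NKX3,PTEN,TP53`). If the candidates are kept in a text file with one gene symbol per line, the list can be generated instead of typed. This is a minimal sketch using standard shell tools only; `join_genes` is a hypothetical helper name, and the one-symbol-per-line file layout is an assumption.

```shell
# Hypothetical helper (not part of kggseq): read one gene symbol per line
# from standard input, drop blank lines, and join the rest with commas.
join_genes() {
  grep -v '^[[:space:]]*$' | paste -sd, -
}

# Prints: NKX3,PTEN,TP53
printf 'NKX3\n\nPTEN\nTP53\n' | join_genes
```

With a file of candidates, the value could be supplied as `--candi-list "$(join_genes < candidates.txt)"`, where `candidates.txt` is a placeholder file name.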

 

7.       Prioritize sequence variants by candidate genes using pathway information:

java -jar kggseq.jar --annovar-file examples/hg19_prostate.txt --out test1 --excel --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6,7 --db-score dbnsfp --cancer-driver-predict all --filter-nondisease-variant --pathway-db cura --cosmic-annot --omim-annot --candi-list NKX3,PTEN,TP53 --ppi-annot string --ppi-depth 1 --pathway-annot cura

 

8.       Prioritize sequence variants by PubMed literature mining:

java -jar kggseq.jar --annovar-file examples/hg19_prostate.txt --out test1 --excel --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6,7 --db-score dbnsfp --cancer-driver-predict all --filter-nondisease-variant --pathway-db cura --cosmic-annot --omim-annot --candi-list NKX3,PTEN,TP53 --ppi-annot string --ppi-depth 1 --pathway-annot cura --pubmed-mining-gene prostate+cancer
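Each command in steps 2 to 8 is the previous one plus a few extra options. The sketch below makes that structure explicit by assembling the command for any step from the options introduced so far, and printing it as a dry run. Two things here are assumptions rather than part of the tutorial: the helper name `kggseq_cmd` is hypothetical, and the output prefix is changed from `test1` to `step2` ... `step8` so that later runs presumably do not overwrite earlier results. All other options are copied unchanged from the commands above.

```shell
# Hypothetical helper: print the kggseq command for tutorial step N (2-8).
kggseq_cmd() {
  n=$1
  # Step 2: gene-feature annotation only.
  cmd="java -jar kggseq.jar --annovar-file examples/hg19_prostate.txt --out step$n --excel --db-gene refgene --gene-feature-in 0,1,2,3,4,5,6,7"
  # Options added at steps 3, 4, 5, 6, 7 and 8, in that order.
  i=3
  for extra in \
    "--db-score dbnsfp --cancer-driver-predict all --filter-nondisease-variant" \
    "--pathway-db cura" \
    "--cosmic-annot --omim-annot" \
    "--candi-list NKX3,PTEN,TP53 --ppi-annot string --ppi-depth 1" \
    "--pathway-annot cura" \
    "--pubmed-mining-gene prostate+cancer"
  do
    # Include this group only if it was introduced at or before step N.
    if [ "$i" -le "$n" ]; then
      cmd="$cmd $extra"
    fi
    i=$((i + 1))
  done
  printf '%s\n' "$cmd"
}

# Dry run: print the command for every step without executing anything.
for n in 2 3 4 5 6 7 8; do
  kggseq_cmd "$n"
done
```

To actually run a single step, the printed line can be piped to the shell, for example `kggseq_cmd 3 | sh`.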