Software Platforms

KGG: A systematic biological Knowledge-based mining system for Genome-wide Genetic studies
KGG (Knowledge-based mining system for Genome-wide Genetic studies) is a software tool for knowledge-based secondary analyses of p-values from genome-wide association studies (GWAS). These analyses include gene-based, gene-pair-based and gene-set-based association analysis. It is implemented in Java with a user-friendly graphical interface to facilitate data analysis and result visualization. Built on advanced algorithms, it can process up to 10 million variants in several hours with 15 GB of RAM on a workstation.
KGGSeq: A biological Knowledge-based mining platform for Genomic and Genetic studies using Sequence data
KGGSeq is a software platform comprising bioinformatics and statistical genetics functions that make use of valuable biological resources and knowledge for sequencing-based genetic mapping of variants and genes responsible for human diseases and traits. Put simply, KGGSeq is like a fishing rod that helps geneticists fish for the genetic determinants of human diseases and traits in the big sea of DNA sequences. Compared with other genetic tools such as PLINK/SEQ, KGGSeq pays more attention to the downstream analysis of genetic mapping. Currently, KGGSeq implements a comprehensive and efficient framework to filter and prioritize genetic variants from whole-exome sequencing data.
KGGSEE: A biological Knowledge-based mining platform for Genomic and Genetic association Summary statistics using gEne Expression
KGGSEE is a standalone Java tool for knowledge-based secondary analyses of genomic and genetic association summary statistics of complex phenotypes by integrating gene expression and related data. It offers four major integrative analyses: 1) unconditional gene-based association guided by expression quantitative trait loci (eQTLs), 2) conditional gene-based association guided by selective expression in tissues or cell types, 3) estimation of phenotype-associated tissues or cell types based on gene expression in single cells or bulk cells of different tissues, and 4) causal gene inference for complex diseases and/or traits based on multiple eQTLs. More integrative analysis functions will be added to this platform in the future.
GBC: A parallel toolkit based on fast-accessible byte blocks for extremely large-scale genotypes of species
GBC (short for GenoType Blocking Compressor) is a blocking compressor for genotype data. It aims to provide a unified and flexible structure, the GenoType Block (GTB), for genotype data in variant call format (VCF) files. Compared with the conventional gz format, the GTB structure occupies less hard disk space, allows faster data access and extraction, simplifies the management of population files and makes data analysis more efficient. GBC provides the following functions: efficient compression, quality control, quick query, file management, fast LD calculations and genotype coding for a wide range of haploid/diploid species.
FAPI: Fast and Accurate P-value Imputation for genetic association
FAPI is a powerful multi-threaded Java-based application developed to infer p-values of untyped single-nucleotide polymorphisms (SNPs) from the p-values of SNPs in linkage disequilibrium (LD) with the untyped one. With imputation accuracy similar to that of genotype imputation tools (including IMPUTE and MACH), FAPI is very fast and requires neither phase information for the reference genotypes nor raw genotypes of the samples.
RNA-SSNV: A reliable somatic single nucleotide variant identification framework for bulk RNA-Seq data
RNA-SSNV is a scalable and efficient analysis method for detecting RNA somatic mutations from paired RNA-WES (tumor-normal) sequencing data. It uses Mutect2 as the core caller and combines a multi-filtering strategy with a machine-learning-based model to maximize precision and recall. Once the required configuration and information are set up properly, it runs in a highly automated fashion. It reports an aggregated mutation file (standard MAF format) to facilitate downstream analysis and clinical decision-making.
SnpTracker: Tool to track SNPs
SnpTracker is a Java-based tool developed to retrieve the latest rsID and genomic coordinates of SNPs, given rsID(s) of any version, according to the SNP merge history (RsMergeArch), coordinate data (SNPChrPosOnRef) and deletion history (SNPHistory) in dbSNP.
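The lookup described above can be pictured as following merge records until a current rsID is reached, then checking deletion status and coordinates. The Python sketch below is illustrative only and is not SnpTracker's implementation: the function names, the in-memory dictionaries standing in for the dbSNP tables, and the rsIDs themselves are all hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# Toy records with made-up rsIDs (assumption: real data would be loaded from dbSNP).
MERGED_INTO: Dict[int, int] = {300: 200, 200: 100, 500: 400}  # old rsID -> rsID it was merged into
DELETED: Set[int] = {400}                                     # rsIDs withdrawn from the database
POSITIONS: Dict[int, Tuple[str, int]] = {100: ("1", 12345)}   # current rsID -> (chromosome, position)


def resolve_rsid(rs_id: int, merged_into: Dict[int, int], deleted: Set[int]) -> Optional[int]:
    """Follow merge records to the newest rsID; return None if that rsID was deleted."""
    seen: Set[int] = set()
    current = rs_id
    while current in merged_into:
        if current in seen:  # guard against cyclic records in malformed input
            raise ValueError(f"cyclic merge history at rs{current}")
        seen.add(current)
        current = merged_into[current]
    return None if current in deleted else current


def locate(
    rs_id: int,
    merged_into: Dict[int, int],
    deleted: Set[int],
    positions: Dict[int, Tuple[str, int]],
) -> Optional[Tuple[int, str, int]]:
    """Return (current rsID, chromosome, position), or None if it cannot be placed."""
    current = resolve_rsid(rs_id, merged_into, deleted)
    if current is None or current not in positions:
        return None
    chromosome, position = positions[current]
    return current, chromosome, position


if __name__ == "__main__":
    print(locate(300, MERGED_INTO, DELETED, POSITIONS))  # merged twice, then located
    print(locate(500, MERGED_INTO, DELETED, POSITIONS))  # merged into a deleted rsID
```

Following the chain iteratively, rather than assuming a single merge step, matters because an rsID may have been merged more than once across database builds.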
IGG: A tool to Integrate Genotypes for genome-wide Genetic Studies
IGG is an open-source Java package with a graphical interface to efficiently and consistently integrate genotypes across high-throughput genotyping platforms (e.g., Affymetrix and Illumina), the HapMap genotype repository, and even genotypes from collaborators' projects. It is also equipped with a series of functions to control the quality of genotype integration and to flexibly export genotypes for genetic studies.
GEC: Genetic Type I Error Calculator
The Genetic Type I Error Calculator (GEC) is a Java-based application developed to address the multiple-testing issue with dependent single-nucleotide polymorphisms (SNPs).
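To illustrate why dependence between SNPs matters for multiple testing, the Python sketch below computes an effective number of independent tests from the eigenvalues of an LD correlation matrix, then uses it in a Bonferroni-style threshold. This is one eigenvalue-based estimator of the kind used in the literature. It is an assumption that it resembles what GEC computes, and the function names are hypothetical. The eigenvalues are taken as given input, so computing them from genotypes is outside the sketch.

```python
from typing import Sequence


def effective_number_of_tests(eigenvalues: Sequence[float]) -> float:
    """Estimate the effective number of independent tests.

    Assumed estimator: M_e = M - sum over eigenvalues > 1 of (eigenvalue - 1),
    where M is the number of SNPs and the eigenvalues come from the SNP
    correlation (LD) matrix.
    """
    if not eigenvalues:
        raise ValueError("at least one eigenvalue is required")
    m = len(eigenvalues)
    return m - sum(ev - 1.0 for ev in eigenvalues if ev > 1.0)


def adjusted_threshold(alpha: float, eigenvalues: Sequence[float]) -> float:
    """Bonferroni-style significance threshold using the effective number of tests."""
    return alpha / effective_number_of_tests(eigenvalues)


if __name__ == "__main__":
    # Two SNPs with correlation r have correlation-matrix eigenvalues 1 + r and 1 - r.
    independent = [1.0, 1.0]  # r = 0: two fully independent tests
    perfect_ld = [2.0, 0.0]   # r = 1: the two SNPs behave as a single test
    print(effective_number_of_tests(independent), adjusted_threshold(0.05, independent))
    print(effective_number_of_tests(perfect_ld), adjusted_threshold(0.05, perfect_ld))
```

With independent SNPs the threshold reduces to the ordinary Bonferroni correction. With SNPs in strong LD the effective number of tests shrinks, so the threshold is less conservative.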

Web Apps

REZ: Robust-regression z-score
REZ (robust-regression z-score) is a powerful approach for calculating the tissue-selective expression of genes. The website provides queries and tissue enrichment analysis of the selective expression of genes produced by REZ.
PCGA: Simultaneously estimate associated tissues/cell types and genes of complex diseases and traits from GWAS summary statistics
PCGA (phenotype-cell-gene association analysis platform) is a web server that simultaneously estimates the associated tissues/cell types and genes of complex diseases and traits from GWAS summary statistics. PCGA includes 54 human tissues, 2,214 human single-cell types and 4,384 mouse single-cell types. In addition, the website has analyzed 1,871 public GWASs of 1,588 unique phenotypes, which can be browsed and searched on the website.