Output
KGGSum runs every analysis as a sequence of tasks. Each task writes an efficient binary file named variants.annot.hg38.gtb, which stores the variants together with harmonized statistics, intermediate results and annotations. This design lets users resume an interrupted workflow from its last breakpoint, or re-run an analysis with adjusted parameters as a branch of an earlier workflow.
Option | Description | Default |
---|---|---|
--output | Specify the output folder path. All data from each task are written under this folder, which preserves intermediate files and avoids repeating tasks that have already run. Format: --output <dir> Example: --output ./out/test | ./kggsum |
--clean-intermediate-data | Clean all intermediate data of the analysis, reducing memory usage. Format: --clean-intermediate-data | [OFF] |
Although each analysis has its own specific tasks, some tasks are common to several analyses. Their output files are listed below.
File | Description |
---|---|
ConvertVCF2GTBTask\*.gtb | The gtb-format file converted from the input VCF file specified by --ref-gty-file. |
GenerateRootVariantSetTask\variants.annot.hg38.gtb | The variants extracted, in gtb format, from the reference genotype file specified by --ref-gty-file. They serve as the base variant set for the following analysis. |
AppendVariants2RootVariantSetTask\variants.annot.hg38.gtb | The base variants with the GWAS summary statistics from the file specified by --sum-file appended. |
GeneFeatureAnnotationTask\variants.annot.hg38.gtb | The variants after subsequent annotation with gene features. |
OutputVariants2TSVTask\variants.hg38.tsv.gz | The GWAS summary statistics and annotations of the variants retained for analysis, in TSV format. |
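
To check which of the common task outputs listed above are present after a run, a short script along the following lines may help. It is only a sketch, not part of KGGSum: it assumes each task writes into a subfolder named after the task directly under the --output folder, and that the first line of variants.hg38.tsv.gz is a header row. The function names find_task_outputs and read_tsv_header are hypothetical helpers introduced for illustration.

```python
import csv
import gzip
from pathlib import Path

# File names are taken from the table above. The layout
# <output folder>/<task name>/<file name> is an assumption.
COMMON_TASK_OUTPUTS = {
    "GenerateRootVariantSetTask": "variants.annot.hg38.gtb",
    "AppendVariants2RootVariantSetTask": "variants.annot.hg38.gtb",
    "GeneFeatureAnnotationTask": "variants.annot.hg38.gtb",
    "OutputVariants2TSVTask": "variants.hg38.tsv.gz",
}


def find_task_outputs(output_dir):
    """Return {task name: path} for each common task output that exists."""
    root = Path(output_dir)
    found = {}
    for task, filename in COMMON_TASK_OUTPUTS.items():
        candidate = root / task / filename
        if candidate.is_file():
            found[task] = candidate
    return found


def read_tsv_header(tsv_gz_path):
    """Return the tab-separated fields of the first line of a gzipped TSV.

    Treating that line as a header row is an assumption about the file.
    """
    with gzip.open(tsv_gz_path, "rt", newline="") as handle:
        return next(csv.reader(handle, delimiter="\t"), [])


if __name__ == "__main__":
    outputs = find_task_outputs("./kggsum")  # default --output folder
    for task, path in outputs.items():
        print(f"{task}: {path}")
    tsv = outputs.get("OutputVariants2TSVTask")
    if tsv is not None:
        print(read_tsv_header(tsv))
```

The .gtb files are binary and are not parsed here; the script only reports whether they exist. If a task is missing from the output, the corresponding step either has not run yet or its data were removed, for example by --clean-intermediate-data.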