Sort GTB by Coordinates

GTB files are typically ordered by coordinates. However, a liftover, or a sort on an annotated field (e.g., the pathogenic potential of variants), can leave a GTB file coordinate-disordered. Use the following command to sort a GTB file by coordinates:

java -jar gbc.jar sort <input> [output] [options]

When output is not set, the output file overwrites the original file. If the input is a file at a remote site, the output file is saved in the current local working directory.

The ordering that GTB defines is a weak ordering, i.e., the variants on the same chromosome must be ordered and stored contiguously. Many algorithms require ordered GTB or VCF files. For example, when calculating LD coefficients on an unordered GTB or VCF file, collecting the variants that fall within a window takes a long time.
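
The weak ordering described above can be illustrated with a minimal Java sketch. It does not use the GBC API: the `Variant` class and the `isWeaklyOrdered` and `sortByCoordinate` helpers are hypothetical names introduced only for this example, and it assumes that "ordered" within a chromosome means non-decreasing POS and that chromosomes can be compared as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WeakOrderSketch {
    // Hypothetical minimal variant record; this is not a GBC class.
    static final class Variant {
        final String chrom;
        final int pos;

        Variant(String chrom, int pos) {
            this.chrom = chrom;
            this.pos = pos;
        }
    }

    // Returns true if every chromosome forms one contiguous run and positions
    // never decrease inside a run. The order of the chromosomes themselves is
    // not checked (assumed meaning of "weakly ordered").
    static boolean isWeaklyOrdered(List<Variant> variants) {
        Set<String> finished = new HashSet<>();
        String current = null;
        int lastPos = Integer.MIN_VALUE;
        for (Variant v : variants) {
            if (!v.chrom.equals(current)) {
                if (current != null) {
                    finished.add(current);
                }
                if (finished.contains(v.chrom)) {
                    return false; // chromosome reappears after its run ended
                }
                current = v.chrom;
                lastPos = v.pos;
            } else if (v.pos < lastPos) {
                return false; // position goes backwards within a chromosome
            } else {
                lastPos = v.pos;
            }
        }
        return true;
    }

    // Returns a copy sorted by (CHROM, POS); chromosomes are compared as
    // plain strings here, which is an assumption made for illustration.
    static List<Variant> sortByCoordinate(List<Variant> variants) {
        List<Variant> sorted = new ArrayList<>(variants);
        sorted.sort(Comparator.comparing((Variant v) -> v.chrom)
                .thenComparingInt(v -> v.pos));
        return sorted;
    }

    public static void main(String[] args) {
        List<Variant> raw = Arrays.asList(
                new Variant("chr1", 100),
                new Variant("chr2", 50),
                new Variant("chr1", 200)); // chr1 is split into two runs
        System.out.println(isWeaklyOrdered(raw));                   // prints "false"
        System.out.println(isWeaklyOrdered(sortByCoordinate(raw))); // prints "true"
    }
}
```

With contiguous, position-sorted runs, a window-based computation such as LD can advance through each chromosome once instead of scanning the whole file for every window.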


Use GBC to sort the example file by variant coordinates (this file was lifted over from hg19 to hg38 without being re-sorted by coordinates):

# Run directly in the terminal
java -jar gbc.jar sort

# Run it using docker
docker run -v `pwd`:`pwd` -w `pwd` --rm -it -m 4g gbc \

Program Options

Usage: sort <input> [output] [options]
Java-API: edu.sysu.pmglab.gbc.toolkit.GTBSorter
About: Sort the variants in *.gtb by coordinate fields (CHROM, POS).
  --chromosome  Specify the chromosome tags file. e.g., identify 'X, chrX, 
                CHRX, ChrX' as '(int) 22' chromosome.
                format: --chromosome <string>
  --threads,-t  Set the number of threads.
                default: 4
                format: --threads <int>
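
The idea behind the --chromosome option, treating several spellings of a chromosome name as one integer index, can be sketched as below. This is only an in-memory illustration: the `ChromosomeTagSketch` class and its methods are hypothetical, and the sketch makes no claim about the actual format of the chromosome tags file that GBC reads.

```java
import java.util.HashMap;
import java.util.Map;

public class ChromosomeTagSketch {
    // Maps every accepted spelling (tag) to one integer chromosome index.
    private final Map<String, Integer> tagToIndex = new HashMap<>();

    // Registers all given tags as aliases of the same index.
    void register(int index, String... tags) {
        for (String tag : tags) {
            tagToIndex.put(tag, index);
        }
    }

    // Looks up the index for a tag; unknown tags are rejected explicitly.
    int indexOf(String tag) {
        Integer index = tagToIndex.get(tag);
        if (index == null) {
            throw new IllegalArgumentException("Unknown chromosome tag: " + tag);
        }
        return index;
    }

    public static void main(String[] args) {
        ChromosomeTagSketch tags = new ChromosomeTagSketch();
        // Same example as in the option description: four spellings, index 22.
        tags.register(22, "X", "chrX", "CHRX", "ChrX");
        System.out.println(tags.indexOf("chrX") == tags.indexOf("X")); // prints "true"
    }
}
```

Once every tag resolves to an integer index, variants can be compared by (index, POS) regardless of how the chromosome was spelled in the source file.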

API Toolkit

The API tool for sorting GTB files is edu.sysu.pmglab.gbc.toolkit.GTBSorter, and an example of its use is as follows:

Copyright © Liubin Zhang, all rights reserved. Last modified time: 2023-04-10 13:39:39