SDF Archive

image-20250511121143487

The SDF archive shown in the figure above is a core foundation of the SDFA tool and the basis for all downstream analysis. Here we describe each field of the SDF structure in more detail:

| Group | Field | Value Type | Description |
| --- | --- | --- | --- |
| LOCATION | coordinate | int[3] | The chromosome on which the current SV is located, together with its start and end positions |
| LOCATION | length | int | The length of the current SV (for an insertion, for example, the length cannot be determined from the coordinate field alone) |
| LOCATION | type | int | The type of the current SV |
| GENOTYPE | genotypes | bytecode | The genotype of the current sample at this SV |
| GENOTYPE | metrics | bytecodeList | Quality metrics of the current genotype |
| VCF Field | id | bytecode | The ID of the current SV in the original VCF file |
| VCF Field | ref | bytecode | The REF of the current SV in the original VCF file |
| VCF Field | alt | bytecode | The ALT of the current SV in the original VCF file |
| VCF Field | qual | bytecode | The QUAL of the current SV in the original VCF file |
| VCF Field | filter | bytecode | The FILTER of the current SV in the original VCF file |
| VCF Field | info | bytecodeList | The INFO of the current SV in the original VCF file |
| CSV INDEX | line | int | The line number of the current SV in the original VCF file |
| CSV INDEX | chr | int[N] | If the current SV is a complex SV, the chromosomes on which all of its split SVs are located |
| ANNOTATION INDEX | indexes | int[N] | The line intervals, in the various annotations, that relate to the current SV |
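
To make the field layout concrete, the following is a minimal Python sketch of one record. It is an illustration only, not the SDFA API: the class name `SDFRecord`, the use of `bytes` for the bytecode fields, the `(chromosome index, start, end)` reading of `coordinate`, and the type code `1` are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SDFRecord:
    """Hypothetical in-memory view of one SDF record (not the SDFA API)."""

    # LOCATION group
    coordinate: Tuple[int, int, int]  # assumed layout: (chromosome index, start, end)
    length: int                       # stored separately from the coordinate
    sv_type: int                      # integer type code; the mapping is assumed

    # GENOTYPE group
    genotypes: bytes = b""
    metrics: List[bytes] = field(default_factory=list)

    # VCF Field group (values carried over from the original VCF line)
    id: bytes = b"."
    ref: bytes = b"."
    alt: bytes = b"."
    qual: bytes = b"."
    filter: bytes = b"."
    info: List[bytes] = field(default_factory=list)

    # CSV INDEX group
    line: int = 0                                 # line number in the original VCF
    chr: List[int] = field(default_factory=list)  # chromosomes of all split SVs

    # ANNOTATION INDEX group
    indexes: List[int] = field(default_factory=list)


# An insertion occupies a single position, so start == end here
# and the coordinate alone says nothing about the inserted length.
ins = SDFRecord(coordinate=(0, 10_000, 10_000), length=300, sv_type=1, alt=b"<INS>")
span = ins.coordinate[2] - ins.coordinate[1]
print(span, ins.length)
```

The last lines show why `length` is a separate field: the span derived from the coordinate is zero for this insertion, while the stored length is 300.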

Decomposition and Assembly of SV

SDF storage introduces the concept of SV "decomposition". Specifically, we decompose every SV into single intervals, each confined to one chromosome. Each resulting single-interval SV is called a Standardized Decomposition SV (SDSV), which is the basic unit of SDF file analysis.
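
The idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not SDFA's actual algorithm: the names `SDSV`, `decompose`, and `assemble` are hypothetical, chromosomes are written as strings for readability (the table above stores them as integers), and we assume that the original VCF line number is what ties sibling SDSVs back together.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end)


@dataclass(frozen=True)
class SDSV:
    """Hypothetical single-interval unit produced by decomposition."""
    line: int                # line of the source SV in the original VCF
    chrom: str
    start: int
    end: int
    chroms: Tuple[str, ...]  # chromosomes of all sibling SDSVs


def decompose(line: int, intervals: List[Interval]) -> List[SDSV]:
    """Split one SV into single-interval SDSVs, one per interval."""
    chroms = tuple(c for c, _, _ in intervals)
    return [SDSV(line, c, s, e, chroms) for c, s, e in intervals]


def assemble(sdsvs: List[SDSV]) -> Dict[int, List[Interval]]:
    """Group SDSVs by source line to rebuild each original SV."""
    grouped: Dict[int, List[Interval]] = defaultdict(list)
    for sv in sdsvs:
        grouped[sv.line].append((sv.chrom, sv.start, sv.end))
    return dict(grouped)


# A made-up two-breakpoint SV spanning two chromosomes.
original = [("chr1", 1000, 1000), ("chr5", 52000, 52000)]
parts = decompose(42, original)
rebuilt = assemble(parts)
print(len(parts), rebuilt[42] == original)
```

Each SDSV carries enough bookkeeping (the source line and the sibling chromosomes) for the original SV to be reassembled after the units have been processed independently.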

To make this easier to understand, the schematic diagram below illustrates the decomposition and reconstruction principle:

image-20250511120941111

Next, we take a VCF file as input to show a specific decomposition example:

image-20250511121037723
Copyright © 彭文杰, all rights reserved. Document revised: 2025-05-20 14:25:29
