Mutation Annotation Format

MAF is a tab-delimited text format that aggregates mutation information from VCF files and is generated at the project level. It is often used to describe somatic mutations. KGGSeq requires six columns; their header names are fixed, but the columns can appear in any order. The six columns are:

  • Tumor_Sample_UUID: Aliquot UUID for tumor sample.
  • Chromosome: The affected chromosome.
  • Start_Position: Mutation start coordinate, i.e. the lowest numeric position of the reported variant on the genomic reference sequence.
  • Reference_Allele: The plus strand reference allele at this position. Includes the deleted sequence for a deletion or “-” for an insertion.
  • Tumor_Allele1: Primary data genotype for tumor sequencing (discovery) allele 1. For a deletion, “-” represents the variant allele; for an insertion, “-” represents the wild-type allele. The novel inserted sequence of an insertion does not include flanking reference bases.
  • Tumor_Allele2: Tumor sequencing (discovery) allele 2.
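
Because the six headers are fixed but their order is not, a reader should look columns up by name rather than by position. The following is a minimal Python sketch of that idea, not KGGSeq's own parser. The function name `read_required_columns` is hypothetical, and skipping lines that start with `#` is an assumption about comment lines that may precede the header.

```python
import csv
import io

# The six header names listed above.
REQUIRED_COLUMNS = (
    "Tumor_Sample_UUID",
    "Chromosome",
    "Start_Position",
    "Reference_Allele",
    "Tumor_Allele1",
    "Tumor_Allele2",
)


def read_required_columns(handle):
    """Yield one dict per data row, keeping only the six required columns."""
    # Assumption: lines starting with '#' are comments and can be skipped.
    lines = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(lines, delimiter="\t")

    # Columns are matched by header name, so their order does not matter.
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError("missing required MAF columns: " + ", ".join(missing))

    for row in reader:
        record = {name: row[name] for name in REQUIRED_COLUMNS}
        record["Start_Position"] = int(record["Start_Position"])
        yield record


# Usage: the columns are deliberately shuffled and one extra column is present.
demo = io.StringIO(
    "Chromosome\tTumor_Sample_UUID\tStart_Position\t"
    "Reference_Allele\tTumor_Allele2\tTumor_Allele1\tExtra\n"
    "chr19\tTCGA-A8-A06P\t58864307\tC\tC\tA\tx\n"
)
for record in read_required_columns(demo):
    print(record)
```

Extra columns are ignored in this sketch, and a missing required header raises `ValueError` before any row is read.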

Here is an example:

Tumor_Sample_UUID  Chromosome  Start_Position  Reference_Allele  Tumor_Allele1  Tumor_Allele2
TCGA-A8-A06P       chr19       58864307        C                 A              C
TCGA-A8-A06P       chr19       58864307        C                 A              C
TCGA-E9-A1NH       chr19       58864366        G                 A              G
TCGA-E9-A22B       chr19       58862784        C                 T              C
TCGA-BH-A0HP       chr10       52595854        G                 A              G
TCGA-BH-A18P       chr10       52595937        G                 A              G
TCGA-A2-A0EY       chr12       9246090         C                 T              C
TCGA-A8-A08G       chr12       9251298         G                 A              G
TCGA-B6-A0IC       chr12       9220358         -                 T              -
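
The “-” conventions in the column descriptions can be turned into a small classifier. This Python sketch is illustrative only: the function name `classify_allele` and the labels it returns are hypothetical and are not KGGSeq terminology. It compares one tumor allele against the reference allele.

```python
def classify_allele(reference_allele, tumor_allele):
    """Classify one tumor allele relative to the reference allele.

    Follows the conventions described above: a "-" reference allele marks
    an insertion, and a "-" tumor allele marks a deletion.
    """
    if tumor_allele == reference_allele:
        return "reference"      # allele matches the reference (wild type)
    if reference_allele == "-":
        return "insertion"      # reference has no bases, tumor allele adds some
    if tumor_allele == "-":
        return "deletion"       # tumor allele lacks the reference bases
    return "substitution"       # both are sequences, and they differ


# Usage with two rows from the example table:
# (Reference_Allele, Tumor_Allele1, Tumor_Allele2)
rows = [
    ("C", "A", "C"),   # TCGA-A8-A06P
    ("-", "T", "-"),   # TCGA-B6-A0IC
]
for reference, allele1, allele2 in rows:
    print(reference,
          classify_allele(reference, allele1),
          classify_allele(reference, allele2))
```

Classifying each tumor allele separately keeps the sketch independent of which of the two allele columns carries the variant.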