VCF file format

The VCF file contains the genetic data (genotypes). Below is a minimal example:

##fileformat=VCFv4.1
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT UNR1 UNR2 UNR3 UNR4
chr7 123 SNP1 A G 100 PASS INFO GT:DS 0/0:0.001 0/0:0.000 0/1:0.999 1/1:1.999
chr7 456 SNP2 T C 100 PASS INFO GT:DS 0/0:0.001 0/0:0.000 0/1:1.100 0/0:0.100
chr7 789 SNP3 A T 100 PASS INFO GT:DS 1/1:2.000 0/1:1.001 0/0:0.010 0/1:0.890

A precise description of this file format can be found here. FastQTL needs at least one of the following two FORMAT fields: GT or DS. It uses the DS field in priority and, if DS is absent, falls back to the GT field, from which it derives the required dosages. We strongly recommend using dosages instead of fixed genotypes in order to account for imputation uncertainty.
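The priority rule above can be sketched in Python as follows. This is only an illustration, not FastQTL's actual code: the function name `sample_dosage` is hypothetical, and the sketch assumes that a dosage derived from GT is simply the number of alternate alleles (0, 1 or 2) at a biallelic site.

```python
def sample_dosage(format_keys, sample_field):
    """Return the dosage for one sample, or None if the entry is missing.

    format_keys  : the FORMAT column, e.g. "GT:DS"
    sample_field : one sample column, e.g. "0/1:0.999"
    """
    values = dict(zip(format_keys.split(":"), sample_field.split(":")))

    # Prefer DS when it is present and not missing.
    ds = values.get("DS")
    if ds not in (None, "."):
        return float(ds)

    # Otherwise fall back to GT (assumption: dosage = number of ALT alleles).
    gt = values.get("GT")
    if gt is None:
        raise ValueError("neither GT nor DS is available")
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None  # missing entry such as ./., ./0 or ./1
    return float(sum(1 for a in alleles if a != "0"))


print(sample_dosage("GT:DS", "0/1:0.999"))  # DS is used: 0.999
print(sample_dosage("GT", "1/1"))           # derived from GT: 2.0
```

Returning `None` for missing genotypes keeps the parsing step separate from whatever imputation is applied afterwards.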

Missing entries (./., ./0 or ./1) are internally imputed as the mean dosage at the variant site.
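As a rough illustration of mean imputation, the Python sketch below replaces missing dosages at one variant site with the mean of the observed ones. The function name `impute_mean` is hypothetical, missing entries are represented as `None`, and the sketch assumes the mean is taken over the non-missing samples only; how FastQTL handles a site where every sample is missing is not described here, so the sketch simply raises an error in that case.

```python
def impute_mean(dosages):
    """Replace missing dosages (None) with the mean of the observed ones."""
    observed = [d for d in dosages if d is not None]
    if not observed:
        # Assumption: a site with no observed dosage cannot be imputed.
        raise ValueError("all entries are missing at this site")
    mean = sum(observed) / len(observed)
    return [mean if d is None else d for d in dosages]


# Four samples at one site; the second one is missing.
print(impute_mean([0.0, None, 1.0, 2.0]))  # [0.0, 1.0, 1.0, 2.0]
```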

Indexing VCF file (required)

To feed FastQTL with VCF files, you first need to compress them with bgzip and index them with tabix. The following command does both:

bgzip genotypes.vcf && tabix -p vcf genotypes.vcf.gz

See here for more details on the tabix and bgzip command lines. The above command line produces a file genotypes.vcf.gz.tbi that contains the index for genotypes.vcf.gz. These two files need to be together in the same folder so that FastQTL is able to read the index file when reading genotypes.vcf.gz.
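Before launching an analysis, it can be convenient to verify that the index sits next to the compressed VCF. The Python sketch below does only that; the helper name `missing_index` is hypothetical and is not part of FastQTL. It relies on the index being named after the VCF with a `.tbi` suffix, as in the file names above.

```python
import os


def missing_index(vcf_path):
    """Return the expected index path if it does not exist, else None."""
    index_path = vcf_path + ".tbi"
    return None if os.path.isfile(index_path) else index_path


# Check the file name used in the command above.
absent = missing_index("genotypes.vcf.gz")
if absent is not None:
    print("index not found:", absent)
```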