Create your config.yaml file for ProHap

General parameters



Select transcripts

Only available for Ensembl v.108 and above. For genes that do not include any MANE Select transcript in Ensembl, "Ensembl Canonical" transcripts will be selected.


ProHap



Data source:

Default: 1000 Genomes Project on GRCh38
One VCF file is expected per chromosome; replace the chromosome number in the file name with "{chr}"
See the wiki page for details
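
To illustrate the "{chr}" placeholder, here is a minimal Python sketch of how a per-chromosome file-name pattern expands into concrete paths. The pattern `my_cohort_chr{chr}.vcf.gz`, the chromosome list, and the helper `expand_vcf_paths` are hypothetical and are not taken from ProHap itself.

```python
# Hypothetical file-name pattern; substitute the naming used by your own VCF files.
vcf_pattern = "my_cohort_chr{chr}.vcf.gz"

# Autosomes 1-22 plus X; adjust to the chromosomes you actually have.
chromosomes = [str(i) for i in range(1, 23)] + ["X"]


def expand_vcf_paths(pattern, chroms):
    """Return one path per chromosome by substituting the "{chr}" placeholder."""
    return [pattern.replace("{chr}", c) for c in chroms]


vcf_paths = expand_vcf_paths(vcf_pattern, chromosomes)
print(vcf_paths[0])   # path for chromosome 1
print(vcf_paths[-1])  # path for chromosome X
```

Checking that every expanded path exists before starting a run can catch a misplaced placeholder early.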
Variants with an allele frequency below this threshold will not be included in haplotypes
Name of the AF column in the VCF file ("AF" by default). Change if you want to use the frequency in a specific population within 1000 Genomes, or according to your own file
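
The following Python sketch shows what thresholding on a named AF field could look like for a single VCF record. It is an illustration only, not ProHap's implementation. The helper names, the example field `AF_EUR`, the choice to use only the first listed allele, and the choice to drop records that lack the field are all assumptions; check your VCF header for the field names actually present.

```python
def parse_info(info):
    """Parse a VCF INFO string such as "AC=5;AF=0.01;DB" into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value  # flag entries such as "DB" get an empty value
    return fields


def passes_af_threshold(info, min_af, af_field="AF"):
    """Return True if the frequency stored in `af_field` is at least `min_af`."""
    value = parse_info(info).get(af_field)
    if value in (None, "", "."):
        return False  # assumption: records without a usable frequency are dropped
    # Simplification: for multi-allelic records, only the first allele is checked.
    return float(value.split(",")[0]) >= min_af


print(passes_af_threshold("AC=5;AF=0.001;AF_EUR=0.05", 0.01))            # uses AF
print(passes_af_threshold("AC=5;AF=0.001;AF_EUR=0.05", 0.01, "AF_EUR"))  # uses AF_EUR
```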
Threshold haplotypes by
Specify 0 to skip haplotype thresholding
Pseudo-autosomal regions (PAR) on the X chromosome
The default values for the GRCh38 human genome are 2781479 (end of PAR1) and 155701383 (start of PAR2). For GRCh37, use 2699520 and 154931044, respectively.
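
As a rough sketch of how these two coordinates delimit the pseudo-autosomal regions, the Python snippet below tests whether a position on the X chromosome lies in a PAR. It assumes the first value is the last base of PAR1 and the second is the first base of PAR2, and that both boundaries are inclusive. The function `in_par` is hypothetical, and ProHap's own handling may differ.

```python
# Boundary values quoted above, keyed by genome build: (PAR1 end, PAR2 start).
PAR_BOUNDARIES = {
    "GRCh38": (2781479, 155701383),
    "GRCh37": (2699520, 154931044),
}


def in_par(pos, build="GRCh38"):
    """Return True if a 1-based X-chromosome position lies in PAR1 or PAR2."""
    par1_end, par2_start = PAR_BOUNDARIES[build]
    # Assumption: both boundary positions are themselves part of the PARs.
    return pos <= par1_end or pos >= par2_start


print(in_par(1_000_000))    # inside PAR1
print(in_par(50_000_000))   # outside both PARs
print(in_par(155_800_000))  # inside PAR2
```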

If disabled, UTR sequences are still removed in the final optimized database, but retained in the haplotypes FASTA.

If disabled, these haplotype cDNA sequences are translated in 3 reading frames, including UTR sequences.
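
To make "translated in 3 reading frames" concrete, here is a small self-contained Python sketch that translates a cDNA string in frames 0, 1 and 2 using the standard genetic code. It is illustrative only and is not ProHap's code. The function names are hypothetical, and rendering stop codons as "*" and unknown codons as "X" are choices made for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cdna, frame=0):
    """Translate a cDNA sequence from the given frame offset (0, 1 or 2)."""
    seq = cdna.upper()[frame:]
    usable = len(seq) - len(seq) % 3  # ignore trailing bases that do not fill a codon
    # Codons containing characters other than A/C/G/T are rendered as "X".
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def three_frame_translation(cdna):
    """Return the translations for reading frames 0, 1 and 2."""
    return [translate(cdna, frame) for frame in range(3)]


print(three_frame_translation("ATGGCCTAA"))  # ['MA*', 'WP', 'GL']
```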

ProVar



Add your VCF files:
Specify 0 to skip thresholding



(e.g., one of the F2 files in the Zenodo repository)
(e.g., one of the F3 files in the Zenodo repository)

or copy the content below to your config.yaml file: