Truth Challenge V2: Calling Variants from Short and Long Reads in Difficult-to-Map Regions

This challenge calls on the public to assess variant calling pipeline performance on a common frame of reference, with a focus on benchmarking in difficult-to-map regions, segmental duplications, and the Major Histocompatibility Complex (MHC).

  • Starts
    2020-05-01 21:00:00 UTC
  • Ends
    2020-06-09 03:00:59 UTC


We are excited to announce that all datasets for this challenge are now available! To accommodate the analysis of this data, we have extended the submission deadline for the challenge to June 8th!




In the context of whole human genome sequencing, software pipelines typically rely on mapping sequencing reads or assemblies to a reference genome and subsequently identifying variants (differences). One way of assessing the performance of such pipelines is to use well-characterized datasets such as Genome in a Bottle’s 7 human genome benchmarks. Two of these benchmarks were used in the first precisionFDA “Truth Challenge” in 2016. However, these benchmarks were limited to “easier-to-map” regions of the genome. New long- and linked-read technologies, along with new bioinformatics pipelines, have enabled the characterization of increasingly challenging regions of the genome.

The Genome in a Bottle (GIAB) consortium, led by the National Institute of Standards and Technology (NIST), recently used linked and long reads to develop an expanded benchmark, v4.1, for the GIAB reference sample HG002, also known as NA24385, the son of an Ashkenazi trio. GIAB is developing similar benchmarks for the parents of HG002 (HG003 and HG004).

Taking advantage of this opportunity, the precisionFDA team decided to launch a “Truth Challenge Version 2” focused on variant calling in challenging genomic regions in GRCh38, before the release of the v4 HG003 and HG004 truth data by GIAB. By supplying short- and long-read sequencing datasets (FASTQ) for HG002, HG003, and HG004, and using the GA4GH Benchmarking framework for comparing variant call format (VCF) results, this challenge provides a common frame of reference for measuring performance aspects of participants’ pipelines on “difficult-to-map” regions, segmental duplications, and the Major Histocompatibility Complex (MHC).


The challenge begins with nine precisionFDA-provided input datasets: whole genome sequencing from Illumina (~35x), PacBio HiFi (~35x), and Oxford Nanopore (~50-80x) of the HG002 (NA24385), HG003 (NA24149), and HG004 (NA24143) human samples. The three genomes were sequenced under similar conditions and on similar instruments. Your mission is to process the FASTQ datasets from one or more technologies through your variant calling pipeline and create VCF files. You can generate those results in your own environment and upload them to precisionFDA. We do encourage you to implement your pipeline as an app on precisionFDA and run it there, since this would help others use your pipeline and make it easier to run on new datasets for subsequent challenges (e.g., on other samples). Regardless of how you generate them, these VCF files will be your entry to the challenge.


HG002 (NA24385)

HG003 (NA24149)

HG004 (NA24143)

For HG002, the benchmark VCF and BED files are already available. You are therefore asked to conduct a comparison between your VCF and the GIAB HG002 (NA24385) benchmark, and include it in your submission entry, for the following reasons:

  1. To ensure that your VCF files are compatible with the comparison process (remember that we won’t be able to check your HG003 and HG004 VCFs until after the end of submissions, so your HG002 VCF serves as a check that your files can be compared without issues)
  2. For the community to be able to contrast your performance on a previously known sample (HG002) versus previously unknown samples (HG003 and HG004), and to evaluate any overfitting on HG002
  3. To enable analysis of Mendelian consistency of the trio for each entry
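To make the Mendelian consistency idea concrete, the sketch below (a hypothetical helper, not part of the challenge tooling) checks whether a child’s diploid genotype could have been inherited as one allele from each parent, ignoring phasing and de novo mutations:

```python
from itertools import product

def mendelian_consistent(child, father, mother):
    """Return True if the child's diploid genotype could arise from
    one allele of each parent (ignores phasing and de novo mutation)."""
    # Try every combination of one paternal and one maternal allele.
    return any(sorted((p, m)) == sorted(child)
               for p, m in product(father, mother))

# Genotypes given as allele tuples, e.g. (0, 1) for a 0/1 VCF GT field.
print(mendelian_consistent((0, 1), (0, 0), (1, 1)))  # True
print(mendelian_consistent((1, 1), (0, 0), (0, 1)))  # False: no "1" allele available from the father
```

A site where many entries fail this check in a trio can indicate a systematic calling error rather than a real variant, which is why trio-wide submissions enable this analysis.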

Your entry to the challenge includes the following:

  • Submitted HG002, HG003, and HG004 VCFs from a particular WGS dataset
  • Submitted HG002 comparison results output as an HTML

The GA4GH Benchmarking app, which you can use to analyze potential submissions (including confirming your methods against the provided HG002 truth set), is publicly available on precisionFDA and can be found here. A modified version of this tool will be used to grade the HG003 and HG004 submissions. When preparing a submission, one of the components must be the HTML output for HG002 produced by this app.


For HG003 and HG004, the truth data will not be known during the challenge. After submissions close on June 8, GIAB will publish its benchmark VCF and BED files for HG003 and HG004. The precisionFDA team will run and publish comparisons between each contestant’s HG003 and HG004 VCF files and the GIAB benchmarks. This will publicly reveal how similar each result is to the GIAB benchmarks in three categories: (1) the MHC; (2) “difficult-to-map” regions and segmental duplications; and (3) other “easier-to-map” benchmark regions not in the MHC, “difficult-to-map” regions, or segmental duplications.


Selected participants and top performers from different technologies and “difficult-to-map” regions will be recognized on the precisionFDA website. Therefore, we hope you are willing to share your experience with others to further enhance the community's effort to ensure accuracy of NGS tests. You may submit multiple entries, particularly for the same methods applied to different technologies or substantially different methods applied to the same technology.

Your submission may be used in a future publication about the challenge. If a publication about the challenge includes your submission, you will have the option to be included as a co-author.


Please use the challenge discussion thread on the precisionFDA Discussions forum to discuss the challenge.


Question: Can I submit multiple entries with different variant calls from each data type?

Answer: Yes

Question: Can I submit an entry that uses more than one data type to generate the VCF?

Answer: Yes, but please let us know which data type(s) you used in your submission description

Question: Can you please clarify what metrics will be used to determine top performers? In the last challenge, sensitivity, specificity, and Fmeas were recognized individually. Will it be the same this time? Also, will there be top performers for HG003 and HG004 separately, or will the metrics be averaged across the two subjects?

Answer: In this challenge we plan to recognize ~12 top performers. We will likely use the mean of the HG003 and HG004 Fmeas in (1) the whole genome, (2) the MHC, and (3) “difficult-to-map” regions and segmental duplications, for entries using (1) Illumina, (2) PacBio, (3) Oxford Nanopore, and (4) combinations of data. A panel of experts will determine the top performers based on the number and types of entries received.
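For readers unfamiliar with Fmeas (the F-measure), it is the harmonic mean of precision and recall, both computed from the benchmarking counts of true positives, false positives, and false negatives. A minimal sketch using made-up counts, not real challenge results:

```python
def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall from benchmarking counts."""
    precision = tp / (tp + fp)   # fraction of called variants that are correct
    recall = tp / (tp + fn)      # fraction of benchmark variants that were found
    return 2 * precision * recall / (precision + recall)

# Illustrative counts only (not real challenge results):
hg003_f = f_measure(tp=90, fp=10, fn=10)  # precision = recall = 0.9, so F = 0.9
hg004_f = f_measure(tp=80, fp=20, fn=20)  # precision = recall = 0.8, so F = 0.8
mean_f = (hg003_f + hg004_f) / 2          # per-sample mean, as described in the answer above
```

The mean across the two previously unseen samples would then be computed separately within each region category (whole genome, MHC, difficult-to-map and segmental duplications) and each technology group.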


NIST: Justin Zook, Justin Wagner, Nate Olson

PrecisionFDA: Elaine Johanson, Emily Boja, Zivana Tezak

DNAnexus: Omar Serang, Sam Westreich, John Didion, Jason Chin

Booz Allen: Holly Stephens, Zeke Maier