PrecisionFDA
NCI-CPTAC Multi-omics Enabled Sample Mislabeling Correction Challenge - Subchallenge 1


Sample mislabeling (accidental swapping of patient samples) or data mislabeling (accidental swapping of patient omics data) is a known obstacle in basic and translational research because such accidental swapping contributes to irreproducible results and invalid conclusions. The objective of this challenge is to encourage the development and evaluation of computational algorithms that can accurately detect and correct mislabeled samples using rich multi-omics datasets.


  • Starts
    2018-09-24 19:00:00 UTC
  • Ends
    2018-11-05 04:59:59 UTC

The precisionFDA NCI-CPTAC Multi-omics Enabled Sample Mislabeling Correction Challenge – Subchallenge 1 ran from September 24, 2018 to November 4, 2018. This subchallenge asked participants to develop computational models for identifying samples that have unmatched clinical and protein profiling data. The subchallenge received 148 valid entries from 51 participants.

This NCI-CPTAC Multi-omics Enabled Sample Mislabeling Correction Challenge – Subchallenge 1 results page summarizes the results in the tables below. As with previous challenges, because of the novelty of the truth data and the comparison methodology, these results offer only a first glance at our understanding. We welcome the community to explore these results further and provide insight for the future.

Introductory Remarks

At the start of this challenge, participants were provided with paired proteomics and clinical data for each of 160 tumor samples. These 160 tumor samples contained labeling errors and were divided into training and test sets. Participants were asked to develop computational algorithms that model the relationship between clinical attributes and protein profiles using the training data set, then apply the model to identify samples in the test data set that have unmatched clinical and protein profiling data. Sample mislabeling patterns and rates were introduced based on observations in the TCGA data sets.
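For illustration only (this does not describe any particular participant's method, and the file names, column names, and clinical attribute below are hypothetical), one simple strategy along these lines is to train a classifier that predicts a discrete clinical attribute, such as sex, from the training-set protein profiles, and then flag test samples whose predicted attribute disagrees with the recorded clinical value:

    # Hypothetical sketch of one possible mismatch-detection strategy (not an
    # official baseline): predict a clinical attribute from protein profiles
    # and flag samples where the prediction contradicts the clinical record.
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    # Assumed file names and columns; the actual challenge data layout differs.
    train_prot = pd.read_csv("train_proteomics.csv", index_col=0)  # samples x proteins
    train_clin = pd.read_csv("train_clinical.csv", index_col=0)    # samples x clinical attributes
    test_prot = pd.read_csv("test_proteomics.csv", index_col=0)
    test_clin = pd.read_csv("test_clinical.csv", index_col=0)

    # Train a classifier to predict a binary clinical attribute (e.g., sex)
    # from the protein profile of each correctly labeled training sample.
    clf = LogisticRegression(max_iter=1000)
    clf.fit(train_prot.fillna(0), train_clin["sex"])

    # Flag test samples whose predicted attribute disagrees with the clinical label.
    predicted_sex = clf.predict(test_prot.fillna(0))
    mismatch_flag = (predicted_sex != test_clin["sex"]).astype(int)
    print(mismatch_flag.head())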

Overview of Results

For each sample in the test data set, participants submitted a binary prediction indicating whether there is a mismatch between the clinical and proteomics profile data. These predictions were compared to the known mislabeled samples, and precision, recall, and F-score were computed from the comparison. The three evaluation metrics are defined in the table below.

Metric    | Definition
Precision | True Positives / (True Positives + False Positives)
Recall    | True Positives / (True Positives + False Negatives)
F-score   | Harmonic mean of Precision and Recall
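As a concrete illustration of these definitions, the short Python sketch below computes precision, recall, and F-score from binary mismatch predictions (1 = predicted mismatch, 0 = predicted match); the example vectors are made up and are not challenge data.

    # Minimal illustration of the three evaluation metrics for binary
    # mismatch predictions (1 = predicted mismatch, 0 = predicted match).
    def precision_recall_fscore(predicted, truth):
        tp = sum(p == 1 and t == 1 for p, t in zip(predicted, truth))
        fp = sum(p == 1 and t == 0 for p, t in zip(predicted, truth))
        fn = sum(p == 0 and t == 1 for p, t in zip(predicted, truth))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        fscore = (2 * precision * recall / (precision + recall)
                  if precision + recall else 0.0)
        return precision, recall, fscore

    # Toy example (not challenge data): 2 true positives, 1 false positive,
    # 1 false negative -> precision 0.667, recall 0.667, F-score 0.667.
    predicted = [1, 1, 1, 0, 0, 0]
    truth     = [1, 1, 0, 1, 0, 0]
    print(precision_recall_fscore(predicted, truth))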

Finally, to determine significant performance differences between submissions, a bootstrapping approach was used to compute the confidence interval of the F-score of each submission. Rankings were generated based on: (1) method performance, by treating each submission as unique, and (2) submitter performance, by taking the median F-score of each participant’s submissions.
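The exact resampling scheme is not detailed on this page; the sketch below assumes the common approach of resampling test samples with replacement and recomputing the F-score on each replicate, reusing the precision_recall_fscore helper from the previous sketch.

    import random

    def bootstrap_fscore_ci(predicted, truth, n_iter=100, alpha=0.05, seed=0):
        """Bootstrap confidence interval for the F-score of one submission.

        Assumed scheme: resample the test samples with replacement n_iter
        times and recompute the F-score on each replicate.
        """
        rng = random.Random(seed)
        n = len(truth)
        scores = []
        for _ in range(n_iter):
            idx = [rng.randrange(n) for _ in range(n)]
            p = [predicted[i] for i in idx]
            t = [truth[i] for i in idx]
            scores.append(precision_recall_fscore(p, t)[2])
        scores.sort()
        lower = scores[int(alpha / 2 * n_iter)]
        upper = scores[min(int((1 - alpha / 2) * n_iter), n_iter - 1)]
        return sum(scores) / n_iter, lower, upper

    # Example, reusing the toy vectors above:
    # mean_f, ci_lower, ci_upper = bootstrap_fscore_ci(predicted, truth, n_iter=100)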

Method Performance Results

The table below shows the three highest-performing submissions (methods) based on F-score.

Name            | Submission        | Precision | Recall | F-score | Mean (F-score) | SD (F-score) | 95% CI lower | 95% CI upper
Renke Pan       | subchallenge_1_p2 | 1         | 0.75   | 0.857   | 0.854          | 0.05         | 0.845        | 0.864
Daniel Schlauch | subchallenge_1    | 0.833     | 0.833  | 0.833   | 0.836          | 0.047        | 0.827        | 0.846
Renke Pan       | subchallenge_1_p3 | 0.9       | 0.75   | 0.818   | 0.813          | 0.052        | 0.803        | 0.823

The anonymized complete results table can be downloaded here. To protect the identity of participants, each participant’s performance will be emailed.

Submitter Performance Results

The table below shows the three highest-performing participants based on the median F-score of their submissions.

Name            | F-score | Mean (F-score) | SD (F-score) | 95% CI lower | 95% CI upper
Daniel Schlauch | 0.833   | 0.836          | 0.047        | 0.827        | 0.846
Eric Li         | 0.768   | 0.766          | 0.067        | 0.753        | 0.779
Renke Pan       | 0.75    | 0.76           | 0.051        | 0.75         | 0.77

The performance of all submitters is shown in the figure below:

Figure: Median F-scores and the corresponding 95% confidence intervals. The confidence intervals were derived through a bootstrap procedure with 100 iterations.

The anonymized complete results table can be downloaded here. To protect the identity of participants, each participant’s performance will be emailed.

Scientific Manuscript

The NCI-CPTAC Multi-omics Enabled Sample Mislabeling Correction Challenge team plans to prepare a scientific manuscript that describes the challenge and its results. All challenge participants who submit a one-page description of their methods will be included as challenge participant consortium authors. In addition, the challenge team will select some participants, based on performance and/or unique methodology, to participate in the manuscript development.

Challenge Key

The subchallenge 1 key, including mislabeling information for the test samples, is available here.