PrecisionFDA
NCI-CPTAC Multi-omics Enabled Sample Mislabeling Correction Challenge - Subchallenge 1


Sample mislabeling (accidental swapping of patient samples) or data mislabeling (accidental swapping of patient omics data) is known to be one of the obstacles in basic and translational research because this accidental swapping contributes to irreproducible results and invalid conclusions. The objective of this challenge is to encourage development and evaluation of computational algorithms that can accurately detect and correct mislabeled samples using rich multi-omics datasets.


  • Starts
    2018-09-24 19:00:00 UTC
  • Ends
    2018-11-05 04:59:59 UTC

The precisionFDA NCI-CPTAC Multi-omics Enabled Sample Mislabeling Correction Challenge – Subchallenge 1 ran from September 24, 2018 to November 4, 2018. This subchallenge asked participants to develop computational models for identifying samples whose clinical and protein profiling data do not match. The challenge received 148 valid entries from 51 participants.

This results page for the NCI-CPTAC Multi-omics Enabled Sample Mislabeling Correction Challenge – Subchallenge 1 summarizes the results in the tables below. As with previous challenges, the truth data and the comparison methodology are novel, so these results offer only a first look. We welcome the community to explore them further and provide insight for the future.

Introductory Remarks

At the start of this challenge, participants were provided with paired proteomics and clinical data for each of 160 tumor samples. The samples contained labeling errors, introduced with mislabeling patterns and rates based on observations in the TCGA data sets, and were divided into training and test sets. Participants were asked to develop computational algorithms that model the relationship between clinical attributes and protein profiles using the training data set, then apply the model to identify samples in the test data set whose clinical and protein profiling data do not match.

Overview of Results

For each sample in the test data set, participants submitted a prediction indicating whether there is a mismatch between the clinical and proteomics profile data. For this subchallenge, precision, recall, and F-score were computed by comparing these binary mismatch predictions to the known mismatched samples. The three evaluation metrics are defined in the table below.

Metric     Definition
Precision  True Positives / (True Positives + False Positives)
Recall     True Positives / (True Positives + False Negatives)
F-score    Harmonic mean of Precision and Recall: 2 × Precision × Recall / (Precision + Recall)
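
As a minimal sketch of the definitions above, the three metrics can be computed from binary mismatch predictions as follows. The function name `mismatch_metrics`, the 0/1 encoding, and the example labels are hypothetical illustrations, not the challenge's scoring code, and reporting 0 when a denominator is zero is an assumed convention.

```python
def mismatch_metrics(truth, predicted):
    """Return (precision, recall, f_score) for binary mismatch labels.

    truth and predicted are equal-length sequences of 0/1, where 1 means
    "clinical and proteomics data do not match" (hypothetical encoding).
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    tp = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
    # Assumed convention: a metric with a zero denominator is reported as 0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if (precision + recall) else 0.0)
    return precision, recall, f_score


# Made-up example: 4 truly mismatched samples, 3 flagged, no false alarms.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0]
precision, recall, f_score = mismatch_metrics(truth, predicted)
print(round(precision, 3), round(recall, 3), round(f_score, 3))
```

In this made-up case a precision of 1 and a recall of 0.75 combine to an F-score of about 0.857, which shows how the harmonic mean is pulled toward the lower of the two values.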

Finally, to determine significant performance differences between submissions, a bootstrapping approach was used to compute the confidence interval of the F-score of each submission. Rankings were generated based on: (1) method performance, by treating each submission as unique, and (2) submitter performance, by taking the median F-score of each participant’s submissions.
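
The bootstrap step can be sketched as follows, under stated assumptions: test samples are resampled with replacement, the F-score is recomputed on each resample, and a normal-approximation interval is placed around the bootstrap mean. This page does not specify the exact resampling scheme or interval formula, so the function names, the seed, and the interval formula are illustrative choices rather than the organizers' procedure. The default of 100 iterations follows the figure caption further down, and the tabulated intervals are numerically close to mean ± 1.96 × SD / √100 (for example, 0.854 ± 1.96 × 0.05 / 10 gives roughly 0.844 to 0.864).

```python
import math
import random
import statistics


def f_score(truth, predicted):
    """F-score for binary labels; 0.0 when undefined (assumed convention)."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    # 2*TP / (2*TP + FP + FN) equals the harmonic mean of precision and recall.
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_f_score(truth, predicted, iterations=100, seed=0):
    """Return (mean, sd, ci_lower, ci_upper) of the F-score over resamples."""
    rng = random.Random(seed)
    n = len(truth)
    scores = []
    for _ in range(iterations):
        # Draw n sample indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f_score([truth[i] for i in idx],
                              [predicted[i] for i in idx]))
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    # Assumed normal-approximation 95% interval for the bootstrap mean.
    half_width = 1.96 * sd / math.sqrt(iterations)
    return mean, sd, mean - half_width, mean + half_width


# Made-up labels for illustration only.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0]
mean, sd, lower, upper = bootstrap_f_score(truth, predicted)
print(f"mean={mean:.3f} sd={sd:.3f} 95% CI=({lower:.3f}, {upper:.3f})")
```

Two submissions whose intervals do not overlap would then be treated as performing differently. A percentile interval over the resampled scores is a common alternative and would generally be wider than the interval for the mean used here.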

Method Performance Results

The table below shows the three highest-performing submissions (methods), ranked by F-score.

Name             Submission         Precision  Recall  F-score  Mean(F-score)  SD(F-score)  95% CI lower  95% CI upper
Renke Pan        subchallenge_1_p2  1          0.75    0.857    0.854          0.05         0.845         0.864
Daniel Schlauch  subchallenge_1     0.833      0.833   0.833    0.836          0.047        0.827         0.846
Renke Pan        subchallenge_1_p3  0.9        0.75    0.818    0.813          0.052        0.803         0.823

The anonymized complete results table can be downloaded here. To protect the identity of participants, each participant's own results will be emailed to them individually.

Submitter Performance Results

The table below shows the three highest-performing participants, ranked by the median F-score of their submissions.

Name             F-score  Mean(F-score)  SD(F-score)  95% CI lower  95% CI upper
Daniel Schlauch  0.833    0.836          0.047        0.827         0.846
Eric Li          0.768    0.766          0.067        0.753         0.779
Renke Pan        0.75     0.76           0.051        0.75          0.77
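
The submitter ranking can be sketched as a group-and-median step. The submitter names and F-scores in this example are invented for illustration, and `rank_submitters` is a hypothetical helper rather than the organizers' code.

```python
from collections import defaultdict
from statistics import median


def rank_submitters(submissions):
    """Rank submitters by the median F-score of their submissions.

    submissions is an iterable of (submitter, f_score) pairs.
    Returns a list of (submitter, median_f_score), best first.
    """
    by_submitter = defaultdict(list)
    for submitter, score in submissions:
        by_submitter[submitter].append(score)
    medians = {name: median(scores) for name, scores in by_submitter.items()}
    return sorted(medians.items(), key=lambda item: item[1], reverse=True)


# Invented scores for illustration only.
submissions = [
    ("submitter_a", 0.80), ("submitter_a", 0.60), ("submitter_a", 0.70),
    ("submitter_b", 0.75),
    ("submitter_c", 0.90), ("submitter_c", 0.40),
]
for name, score in rank_submitters(submissions):
    print(name, score)
```

Because the median summarizes all of a participant's entries, a participant with the single best submission does not necessarily rank first, which is the pattern seen when the two tables above are compared.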

The performance of all submitters is shown in the figure below:

Figure: Median F-scores and the corresponding 95% confidence intervals. The confidence intervals were derived through a bootstrap procedure with 100 iterations.

The anonymized complete results table can be downloaded here. To protect the identity of participants, each participant's own results will be emailed to them individually.

Scientific Manuscript

The NCI-CPTAC Multi-omics Enabled Sample Mislabeling Correction Challenge team plans to prepare a scientific manuscript that describes the challenge and its results. All challenge participants who submit a one-page description of their methods will be included as challenge participant consortium authors. In addition, the challenge team will select some participants, based on performance and/or unique methodology, to take part in developing the manuscript.

Challenge Key

The subchallenge 1 key, including mislabeling information for the test samples, is available here.

Submitter Organization Entry
Mohammad Siddique job-FKvGG6Q06B2V2yF23Bg38YVj-1
Marouen Ben Guebila job-FKyvQx006B2b95zyKVyz5j0v-1
Lisheng Zhou job-FP3JK8806B2Zg3Kk0fFZb8FJ-1
Yin Hung Lin Taiwan AI Labs job-FP6p3KQ06B2y19J994PZ8BFq-1
Sergio Kulikovsky job-FP8QPzj06B2f8GXkJpbYbZPV-1
Sergio Kulikovsky job-FP8QQZj06B2vPZk8417zvXkF-1
Ranjan Kumar Barman job-FP99j9j06B2v717x1b2qf48y-1
Cheng Qian job-FP9Z8f806B2k2JG740JK7BKP-1
Zachary Rom Deloitte job-FPB89VQ06B2vyGF83zZqbkpK-1
Mark Barna job-FPBvGF006B2gKZggK9KvppZq-1
Eric Li job-FPBx76Q06B2X6vXb405Q37PJ-1
Roland Luethy job-FPF415Q06B2yK8by4vpG3179-1
Sergio Kulikovsky job-FPFb7Qj06B2ggfpq5Xy6fygf-1
Libo Liu job-FPFvYKQ06B2vg3Z10vyYGYyF-1
Sergio Kulikovsky job-FPG1p3006B2ggfpq5Xy6g3fB-1
Bruno Giotti Biosciences and Biotechnology Institute of Grenoble (CEA, France) job-FPGpk6006B2p9Q5p0F4KYYkZ-1
Soon Jye Kho Wright State University job-FPGvQ2j06B2pqjxVP72yvGYk-1
Yeshwant Chiillakuru job-FPGyQ5006B2qg44J1x01vz2P-1
Soon Jye Kho Wright State University job-FPJ0PZQ06B2bYP7y1xVJ61gF-1
Jemma Wu job-FPJ355806B2fbJ2F0yXK4p71-1
Dana Pascovici Australian Proteome Analysis Facility (APAF). Macquarie University job-FPJ35Bj06B2Q8YQp0xvgQvpy-1
Xiaowei Zhan job-FPJj5bj06B2XK1yK9J7Z3y99-1
Roland Luethy job-FPJpfBj06B2p7j2jJXK1Xv2g-1
Renke Pan Sentieon Inc job-FPJvkv006B2p7j2jJXK1Xv85-1
Nelson Tang job-FPK07Bj06B2zz4ZBKFKG9bpv-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPK1Y9006B2VgjX65bG6Yvq1-1
Sergi Sayols Puig job-FPK1b2006B2y5Y6F9JvYBvY6-1
Taner Arslan Karolinska Institute job-FPK304j06B2xffj07xX5pyB5-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPK67fj06B2VgjX65bG6Z24P-1
Christophe Battail Biosciences and Biotechnology Institute of Grenoble (CEA, France) job-FPK6Zvj06B2qyzkF5PBP434j-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPK709j06B2y2vpj5ZX384K5-1
Taner Arslan Karolinska Institute job-FPK73J806B2VgjX65bG6Z3fp-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPK75P806B2pq0qB9Gf32FF5-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPK7K8j06B2pK2KJ7yJk9p2b-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPK7VY006B2g3jbG9G5X8qYF-1
Yuanfang Guan University of Michigan job-FPKBY2Q06B2XPq5B9GJqjGxK-1
Yuanfang Guan University of Michigan job-FPKBYg006B2xBv0F5Zb3pjJp-1
Yuanfang Guan University of Michigan job-FPKBZ7006B2jfjqF8jZ4f7b3-1
Yuanfang Guan University of Michigan job-FPKBZQ006B2zx1gbFZ0BFQKy-1
Amanda French Deloitte job-FPKBfb806B2zx1gbFZ0BFQVk-1
Amanda French Deloitte job-FPKBg7j06B2V76578jKg0B2P-1
Michael Bradshaw job-FPKJK4006B2kj8JZF5GK7qPv-1
Mohammad Siddique job-FPKPVv806B2VgjX65bG6Z9gp-1
Enhao Fang job-FPKgBKj06B2gyGfY5b3k25y7-1
Annelaura Bach Nielsen University of Copenhagen job-FPKy60Q06B2Z5fPGPBPXBQxv-1
Antonio Cappuccio job-FPP06Bj06B2x8J1xPvBfpV94-1
Achilleas Pitsillides job-FPP0qY006B2kp3FF069XGv4g-1
Annelaura Bach Nielsen University of Copenhagen job-FPP1vP806B2qyzkF5PBP50B6-1
Fredrik Vannberg job-FPP6Z6806B2V4gqpPJbb15pB-1
Fredrik Vannberg job-FPP6f3006B2jkYZ193FGYKF0-1
Fredrik Vannberg job-FPP6fVj06B2v8VKj3YF6QFV8-1
Fredrik Vannberg job-FPP6g8806B2XFQfPKZj7XY0f-1
Fredrik Vannberg job-FPP70kQ06B2xBv0F5Zb3q43Y-1
Tony Kaoma Luxembourg Institute of Health job-FPP8Ky006B2pq0qB9Gf32fPz-1
Tomas Bruna Georgia Institute of Technology job-FPP8fJj06B2xv0FB2032x812-1
Tomas Bruna Georgia Institute of Technology job-FPP9FqQ06B2xv0FB2032x84f-1
Tomas Bruna Georgia Institute of Technology job-FPP9J7j06B2pQVXf7x9bVB62-1
Rongshan Yu job-FPPJPG006B2y2vpj5ZX38xqq-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPPQ6xQ06B2V5QGj12ZGpZB7-1
Patrick Leong job-FPPQJ1j06B2bPgv11Xb3bqb5-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPPQZg006B2xBv0F5Zb3qBPk-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPPV5pj06B2x8y118g08xB94-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPPV6BQ06B2V4gqpPJbb192x-1
Feixiong Cheng Cleveland Clinic job-FPPV80806B2qyzkF5PBP551V-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPPV9Q006B2gj7Yy5bxfPZ2K-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPPV9pj06B2y2vpj5ZX390vX-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPPVB9Q06B2xv0FB2032xF85-1
Rongshan Yu job-FPPVXv006B2kp3FF069XJ27b-1
Rongshan Yu job-FPPVY8Q06B2f87yF1bPf78YK-1
Rongshan Yu job-FPPVYY806B2f87yF1bPf78YQ-1
Rongshan Yu job-FPPVZJQ06B2Z5fPGPBPXBfFf-1
Xijin Ge South Dakota State University job-FPPXpV006B2gF4ZP6G7KPKxX-1
Xijin Ge South Dakota State University job-FPPY94006B2fpBBQ1YkB7VG4-1
Tony Kaoma Luxembourg Institute of Health job-FPPYj7j06B2V4gqpPJbb19v7-1
Tony Kaoma Luxembourg Institute of Health job-FPPYzz006B2jfjqF8jZ4gFPx-1
Tony Kaoma Luxembourg Institute of Health job-FPPZ2gj06B2gF4ZP6G7KPZxZ-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPPZ8k006B2XFQfPKZj7XgKV-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPPZ97j06B2VBgzy9gb976Pv-1
Sergi Sayols Puig job-FPPZYPj06B2VBgzy9gb977bp-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPPbbj806B2pQVXf7x9bVbV3-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPPbf4Q06B2bjG6XFbFXg59k-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPPbfQ806B2bBb2X8g82gY3Y-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPPbfx006B2yXy5G2pYKkgjG-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPPbgJ006B2fpBBQ1YkB7VZy-1
Rintu Kutum CSIR-Institute of Genomics and Integrative Biology job-FPPbggQ06B2V5QGj12ZGpbj0-1
Anders Carlsson Bionamic job-FPPbjQQ06B2VgjX65bG6Zq7B-1
Anders Carlsson Bionamic job-FPPbkB006B2jZx3VBqkFYy49-1
Sergi Sayols Puig job-FPPfVyQ06B2z8xFJPvZy13kz-1
Omid Bazgir Texas Tech University job-FPPfy3806B2pK2KJ7yJkB9pZ-1
Kaspar Martens University of Oxford job-FPPg2z806B2VgjX65bG6ZqJY-1
Anders Carlsson Bionamic job-FPPg7PQ06B2bB5Kx39BxYXvj-1
Alden Leung job-FPPgJV806B2yFyJ00GXvJ50K-1
Kaspar Martens University of Oxford job-FPPgP2Q06B2f87yF1bPf79j0-1
Kaspar Martens University of Oxford job-FPPgpJ006B2zz4ZBKFKGBP14-1
Xiaowei Zhan job-FPPjPG806B2VBgzy9gb977v0-1
Omid Bazgir Texas Tech University job-FPPk81Q06B2vKjxX243XVBB4-1
Omid Bazgir Texas Tech University job-FPPkB7j06B2x8J1xPvBfpg16-1
Omid Bazgir Texas Tech University job-FPPkF0j06B2bB5Kx39BxYXy3-1
Omid Bazgir Texas Tech University job-FPPkFQQ06B2pQVXf7x9bVbj3-1
Yin Hung Lin Taiwan AI Labs job-FPPkYVj06B2f87yF1bPf79kY-1
Yin Hung Lin Taiwan AI Labs job-FPPkZBj06B2V4gqpPJbb1BFb-1
Yin Hung Lin Taiwan AI Labs job-FPPkZj006B2V4gqpPJbb1BFg-1
Yin Hung Lin Taiwan AI Labs job-FPPkb2006B2jfjqF8jZ4gFjy-1
Yin Hung Lin Taiwan AI Labs job-FPPkbBj06B2pgxvJ8gyyqx90-1
Yin Hung Lin Taiwan AI Labs job-FPPkbYj06B2Z5fPGPBPXBg46-1
Yin Hung Lin Taiwan AI Labs job-FPPkbqQ06B2yFJ6v1v52g53V-1
Yin Hung Lin Taiwan AI Labs job-FPPkf5006B2Z5fPGPBPXBg48-1
Ioannis Siavelis Karolinska Institute job-FPPkkB006B2vKjxX243XVBBj-1
Slim Fourati CWRU job-FPPkxg806B2VBgzy9gb977y9-1
Sunkyu Kim Korea University job-FPPkxxj06B2yFyJ00GXvJ52V-1
Marouen Ben Guebila job-FPPp45Q06B2V76578jKg0p32-1
Marouen Ben Guebila job-FPPp4X006B2V5QGj12ZGpfJg-1
Marouen Ben Guebila job-FPPp4pQ06B2xv0FB2032xGPy-1
Marouen Ben Guebila job-FPPp51006B2gF4ZP6G7KPb6x-1
Marouen Ben Guebila job-FPPp58006B2pq0qB9Gf335Yf-1
Marouen Ben Guebila job-FPPp5Q006B2Z5fPGPBPXBg51-1
Marouen Ben Guebila job-FPPp5f806B2jZx3VBqkFYyQ5-1
Marouen Ben Guebila job-FPPp5vQ06B2yFJ6v1v52g54Q-1
Marouen Ben Guebila job-FPPp6Y806B2y2vpj5ZX3927J-1
Marouen Ben Guebila job-FPPp71006B2vKjxX243XVBFf-1
Raziur Rahman Texas Tech University job-FPPp9pj06B2pq0qB9Gf335Zg-1
Enhao Fang job-FPPpJX006B2x8J1xPvBfpg31-1
Alden Leung job-FPPpvKj06B2XFQfPKZj7Xgqv-1
Alden Leung job-FPPpvk006B2XFQfPKZj7Xgqy-1
Alden Leung job-FPPpx0Q06B2qyzkF5PBP569g-1
Lisheng Zhou job-FPPpxQ806B2gj7Yy5bxfPbP5-1
Lisheng Zhou job-FPPpy7j06B2z8xFJPvZy13zY-1
Lisheng Zhou job-FPPpyjj06B2jkYZ193FGYVp0-1
Ben Greenwell job-FPPq09j06B2gF4ZP6G7KPb8F-1
Lisheng Zhou job-FPPq0PQ06B2x8J1xPvBfpg3g-1
Chia Lee job-FPPq4YQ06B2jJBFgFbXPyqQv-1
Renke Pan Sentieon Inc job-FPPq88j06B2gF4ZP6G7KPb8p-1
Renke Pan Sentieon Inc job-FPPq8g806B2yFyJ00GXvJ55j-1
Renke Pan Sentieon Inc job-FPPq91j06B2qyzkF5PBP56BJ-1
Renke Pan Sentieon Inc job-FPPq9Fj06B2b28BV8j796jv8-1
Lisheng Zhou job-FPPq9G806B2bB5Kx39BxYY31-1
Lisheng Zhou job-FPPq9g806B2bjG6XFbFXg5gp-1
Renke Pan Sentieon Inc job-FPPq9x806B2Z5fPGPBPXBg7Z-1
Lisheng Zhou job-FPPqB2806B2V76578jKg0p5Z-1
Renke Pan Sentieon Inc job-FPPqBG806B2y8257FZX9y38Q-1
Renke Pan Sentieon Inc job-FPPqF1006B2pK2KJ7yJkBB7g-1
Renke Pan Sentieon Inc job-FPPqFb806B2yFJ6v1v52g574-1
Renke Pan Sentieon Inc job-FPPqG2806B2v8VKj3YF6QVkk-1
Renke Pan Sentieon Inc job-FPPqGyQ06B2y8257FZX9y38Y-1
Lisheng Zhou job-FPPqYb806B2bB5Kx39BxYY3Y-1
Lisheng Zhou job-FPPqZ5806B2pgxvJ8gyyqxK4-1
Eric Li job-FPPqp1j06B2kj8JZF5GK8Gzz-1
Emily Flynn Stanford University job-FPPqp2006B2yFJ6v1v52g587-1
Sunkyu Kim Korea University job-FPPvJp006B2qyzkF5PBP56J6-1
Ali Afzal USC job-FPPvY8006B2V4gqpPJbb1BYY-1
Daniel Schlauch Genospace job-FQJGXZ006B2qVGfP13f8P9YF-1
Ian Green Booz Allen Hamilton job-FQJGZZj06B2Z30Y7Bgy0FpvY-1