Gaining New Insights by Detecting Adverse Event Anomalies Using FDA Open Data

During the life-cycle of FDA regulated products, FDA collects data from a diversity of sources including voluntary reports from healthcare providers and patients. While cause and effect are not always conclusive or relevant in these reports, valuable insights into the impact of regulated products on public health have been found in individual reports and in evaluation of reported data. This challenge engages data scientists to use evolving data science techniques to identify anomalies that may lead to valuable public health information.


The U.S. Food and Drug Administration (FDA) launched the Gaining New Insights by Detecting Adverse Event Anomalies Using FDA Open Data Challenge, which asked participants to develop computational algorithms that automatically detect adverse event anomalies using publicly available data.

Period: January 17, 2020 to May 18, 2020

Submissions: 6 submissions from 6 unique teams

Introductory Remarks

FDA regulators use a variety of data mining and hands-on methods and tools to analyze large volumes of adverse event reports and identify possible safety signals. While data mining and hands-on case reviews performed by medical officers have been the cornerstone of adverse event safety signal detection, other approaches, including machine learning (ML) and artificial intelligence (AI), may provide novel insights into FDA adverse event data. The goal of this challenge was to have participants explore the potential of these techniques for safety signal identification by developing computational algorithms for automatic detection of adverse event anomalies. The challenge was also launched as part of the “Modernizing FDA’s Data Strategy” public meeting (June 30, 2020) as a way to engage external scientists prior to the meeting.

Overview of Results

We received a total of 6 submissions from 6 unique teams (or individuals) to this challenge. Challenge participants had access to a wide and diverse array of public data sources which, when analyzed, could reveal interesting patterns and even surprises. This challenge was unlike other challenges hosted on precisionFDA in that, by design, it was more open-ended, allowing for potential “outside-of-the-box” solutions or findings from participants. This design resulted in a diversity of responses, with submitters using differing techniques for data analysis and focusing on a wide range of anomalies. Techniques employed by participants included: disproportionality analysis, clustering/outlier detection, logical inconsistency detection, novel ontology mapping, and natural language processing (NLP). Anomalies detected included: unexpected age distributions in AEs, unexpected sex disproportions in AEs, unexpected specific AE ratios, contradictory AEs found together, clusters of outlier AE reports with similar characteristics, and comparisons of reported AEs to what is already on the product label (via NLP).
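To illustrate the first of these techniques, a minimal sketch of a proportional reporting ratio (PRR), a standard disproportionality measure in pharmacovigilance, is shown below. The counts are invented for illustration and are not drawn from any FAERS data or challenge submission.

```python
def prr(a: int, b: int, c: int, d: int) -> float:
    """Proportional reporting ratio from a 2x2 contingency table.

    a: reports with the drug of interest AND the event of interest
    b: reports with the drug of interest, other events
    c: reports with other drugs AND the event of interest
    d: reports with other drugs, other events
    """
    # Fraction of the drug's reports mentioning the event, relative to
    # the same fraction across the rest of the database.
    return (a / (a + b)) / (c / (c + d))

# Hypothetical counts: the event appears far more often among reports
# for the drug of interest than across the rest of the database.
value = prr(a=20, b=980, c=200, d=98800)  # (20/1000) / (200/99000) = 9.9
```

A PRR well above 1 (here 9.9) flags a drug–event pair as disproportionately reported, which is a signal for further review rather than evidence of causation.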


Submission evaluation was carried out by a team of experts from diverse backgrounds including computer science, biostatistics, chemistry, epidemiology, medical science, pharmacovigilance, and biology. The team developed an evaluation rubric with scores based on participant algorithm methodologies and the anomalies detected. Specifically, methods were scored on quality, usefulness, innovation, and reproducibility, for a total of 6 points. Each submitter’s specific anomalies were scored (out of 24 points) on impact, surprise, and innovation:

  • Impact - applicability to FDA review or data processing efforts
  • Surprise - an anomaly’s statistically disproportionate or unexpected nature
  • Innovation - novelty and complexity

Top submissions were further analyzed by a team of medical officers, pharmacists, and physicians from FDA’s Center for Biologics Evaluation and Research (CBER) and Center for Drug Evaluation and Research (CDER).


Submissions comprised a diverse array of techniques for data analysis and focused on a wide range of anomalies. Histograms below (Figure 1) show the distribution of scores according to each criterion for the submitted anomalies. Trends deduced from these distributions showed that submitted anomalies tended to have:

  • Very well-described methods [Panel A];
  • Somewhat low estimated impact [Panel B];
  • Somewhat higher “surprise” [Panel C]; and
  • A similarly positively skewed “innovation” score [Panel D].

It is important to note that some of the anomalies submitted were so surprising that they required many hours of research and discussion to confirm preliminary evaluations.

Figure 1. Histograms of evaluation scores for each criterion

Submissions were ranked based on preliminary evaluations, and the top three were discussed with experts from CBER and CDER. These combined efforts helped us identify the two top-performing submissions based on utility and innovation.

Top Performers

We selected two categories for awards of best performers: “Most Innovative” and “Most Synergistic”. The Most Innovative submission was selected based on novelty and demonstrated utility, showing the potential for aiding FDA in finding important new categories of anomalies. The Most Synergistic submission was selected based on direct applicability to existing processes used within the FDA. In short, the Most Innovative submission showed potential for new kinds of monitoring, while the Most Synergistic submission showed potential for accelerating existing monitoring efforts.

Dr. Leihong Wu earned top performer status for the Most Innovative submission. Dr. Wu took a novel approach to data analysis that is potentially valuable for FDA consideration. His approach used statistical measures, diverse data sources, and multiple varied logical frameworks to uncover and rank certain predefined categories of novel anomalies from large data sets.

The Most Synergistic award went to a team consisting of Kelvin Chan and Nick Becker. This team produced a process that detected patterns of interest using natural language processing and disproportionality measures while leveraging FDA product labels. This approach is well aligned with some existing post-marketing techniques already in place at FDA. Both approaches are valuable to the FDA and have the potential to enhance the agency’s monitoring efforts.
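The general idea of comparing reported adverse events against a product label can be sketched in a few lines. The following is a generic illustration of that label-comparison step, not the winning team's actual code; the label text, reported terms, and function name are all invented for the example.

```python
def unlabeled_events(reported_terms, label_text):
    """Return reported AE terms not found in the label text (case-insensitive).

    A naive substring check; a real pipeline would normalize terms
    (e.g., map them to MedDRA preferred terms) before comparing.
    """
    label_lower = label_text.lower()
    return [term for term in reported_terms if term.lower() not in label_lower]

# Hypothetical label excerpt and reported terms.
label = "Warnings: may cause headache, nausea, and dizziness."
reports = ["headache", "nausea", "hepatic failure"]

# "hepatic failure" is not on the label, so it is flagged for review.
flags = unlabeled_events(reports, label)
```

Terms already on the label are expected; terms absent from it are the candidates worth pairing with disproportionality measures for further review.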

Take-Aways and Lessons Learned

  1. FDA post-marketing surveillance data can be a valuable source for challenges.
  2. As an initial exploration of a vast and complex challenge dataset, this was a good first step, especially considering the bevy of emergent COVID-19 challenges competing for participant attention during the same time period.
  3. Evaluation of Adverse Event reports is complex and nuanced. While automation has the potential to reduce reviewer burden, the background knowledge and experience of a reviewer proves an invaluable part of the process.
  4. Participants may have benefitted from more example submissions. While open-ended explorations are useful and necessary, additional worked examples would have facilitated a better understanding of the submission format and the kind of data expected, and may have lowered the barrier to entry.
  5. More direct data access and detailed data descriptions may have alleviated participant confusion. We showed participants datasets from a variety of sources that were relevant to adverse events. The basic merging and connecting operations needed to obtain and link these data sets were a heavy lift, with questions arising such as “what are the linking fields?” and “what are the important fields?” These questions have complex answers, particularly in a regulatory challenge. For example, for participants without domain expertise, it would have been useful to know how to associate substance names to UNII codes, then to SPL products, and then to drug classes. In the future, when diverse public data sets are to be analyzed, it will be critical to provide worked examples of an “analysis ready” merged dataset.
  6. Perhaps even exploratory challenges need more defined boundaries. It was clear that participants derived different understandings of the challenge narrative. While some of this is inevitable when implementing an open-ended challenge, some confusion may have been avoided by introducing constraints on the tools and datasets that could be used in a submission.
  7. A potential follow-up challenge would be to ask participants to provide tools FDA surveillance personnel and challenge evaluators could use in an interactive fashion to explore possible anomalies.
  8. As COVID-19 treatments and vaccines move forward at FDA, VAERS and FAERS data are going to be extremely important, if not central, to reassuring the public of their safety, so any future work in this area could be useful and timely.
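The substance-to-product linkage described in lesson 5 above can be sketched as a simple key join. The mini-tables below are invented for illustration; real FDA files (the UNII list and SPL/NDC product data) have different layouts, and the column names here are assumptions.

```python
import pandas as pd

# Hypothetical substance table: name -> UNII code.
substances = pd.DataFrame({
    "substance_name": ["IBUPROFEN", "ACETAMINOPHEN"],
    "unii": ["WK2XYI10QM", "362O9ITL9D"],
})

# Hypothetical product table: UNII code -> marketed product.
products = pd.DataFrame({
    "unii": ["WK2XYI10QM", "WK2XYI10QM", "362O9ITL9D"],
    "product_name": ["Advil", "Motrin", "Tylenol"],
})

# Join substance names to products via the shared UNII key,
# yielding an "analysis ready" substance-to-product mapping.
linked = substances.merge(products, on="unii", how="inner")
```

Chaining one more merge against a drug-class table would complete the substance → UNII → product → class path that the lesson describes.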


This challenge was an important first step in the strategic implementation of AI techniques to augment and facilitate existing monitoring and surveillance efforts within the agency. The results helped us better understand the approaches that may accelerate and enhance our existing efforts, as well as the kinds of approaches that may help establish entirely new mechanisms for meeting the agency’s broader mandates on public health, transparency, and interoperability. Equipped with these results, we will be better prepared to explore new opportunities for leveraging emerging technologies to meet new challenges.