An Evaluation of Clinical Natural Language Processing Systems to Extract Symptomatic Adverse Events from Patient-Authored Free-Text Narratives


Symptomatic adverse events (AEs) such as nausea are common among patients enrolled in cancer clinical trials. Historically, clinical staff have collected this information and reported it into research databases using the Common Terminology Criteria for Adverse Events (CTCAE), a set of AE grading criteria maintained by the National Cancer Institute (NCI). In the software system for NCI’s Patient-Reported Outcomes version of the CTCAE (PRO-CTCAE), patients can also provide supplemental free-text narratives about their AEs. When given this opportunity, 58% of patients submit supplemental AE information [1]. More importantly, there was little overlap between the supplemental AEs submitted by patients and those elicited in trial-specific questionnaires, which is evidence for the value of collecting free-text, patient-authored AEs. In our prior work, we also found that the majority (88%) of the symptom concepts within patient narratives could be manually mapped to the Medical Dictionary for Regulatory Activities (MedDRA), the standard lexicon for reporting AEs to regulatory agencies such as the FDA. However, manually mapping symptom concepts to lexicons is labor-intensive, which limits the widespread collection of free-text AEs. Clinical natural language processing (NLP) has the potential to accelerate the recognition and mapping of these symptom concepts and could enable real-time extraction, mapping, and reporting of patient-authored AEs. Off-the-shelf NLP systems, if they perform well, could make systematic text processing practical, but they have not previously been evaluated on patient-authored AEs. Thus, the objective of this study was to evaluate the performance of four widely used clinical NLP systems in extracting symptom concepts from patient-authored free-text AE narratives.

American Medical Informatics Association (AMIA) Informatics Summit Podium Abstract