Visual Cohort Queries for High-Dimensional Data: A Design Study


The large collections of electronic health data gathered by modern health institutions are increasingly used as a source of real-world evidence in population health studies. Cohort selection is a critical first step in these studies. However, querying for patient data within complex medical databases can be challenging because of two key factors: (1) the high dimensionality of medical data, and (2) the temporal nature of many queries (e.g., “patients with a specific medical procedure within X days after diagnosis”). Visual interfaces that enable non-technical experts to define queries of this type are available in systems such as the widely used i2b2 platform. Even so, retrieving a satisfactory cohort for a given study with such tools remains difficult and typically requires users to refine the cohort iteratively over multiple queries. This paper reports results from a formative design study with three aims: gaining a better understanding of the iterative query process, identifying the challenges users face as they define cohorts, and gathering feedback on a preliminary design for a novel interactive visual query interface.

Visual Analytics in Healthcare (VAHC) Workshop