Unleashing the Power of Clinical AI: The Dandelion Health Marketplace
Last week, Dandelion Health launched our Clinical AI Marketplace—a groundbreaking platform that pairs representative, multimodal real-world data (RWD) with validated clinical AI at scale. We believe that this powerful combination is poised to accelerate life sciences innovation, research, and drug development. In this post, I’ll dive into why we built this platform, how it operates in practice, and our vision for the future of RWD + AI in the life sciences.
The Intersection of Supply and Demand in Clinical AI
Dandelion’s Clinical AI Marketplace sits at a new intersection of supply and demand in clinical AI. On the supply side, there’s an ever-growing list of AI developers producing algorithms that are primarily intended for clinical practice but which can also be applied for analytical purposes. On the demand side, life sciences companies are eager to harness AI to make the process of bringing new drugs to market more efficient, but for most players these efforts are in their infancy and have yet to show clear results.
The Supply Side: AI Developers
Over the past five years, we’ve witnessed an explosion in FDA-cleared algorithms – over 1,000 are expected to be cleared by the end of 2024.
Untold numbers of additional algorithms have been developed by academic researchers. But despite their potential to significantly impact patient care, most of these algorithms remain underutilized, gathering virtual dust in digital storage. The reasons are well-known by now: unclear ROI, concerns about performance and bias, limited reimbursement, and deployment challenges.
Meanwhile, the barriers to developing new AI tools continue to fall as technology evolves, making it easier than ever to create and improve these algorithms.
So while the research on clinical AI moves forward rapidly, the commercialization of the products remains halting at best.
The Demand Side: Life Sciences
Life sciences companies are under immense pressure to accelerate clinical development and maximize the value of their products. Since the 21st Century Cures Act, RWD and real-world evidence (RWE) have been heralded as transformative tools for achieving these goals. The vision is to use the billions of data points collected every day by the healthcare system as a way to understand the safety and efficacy of drugs in the real world, outside of the highly controlled environment of clinical trials. But despite the growth in its collection and use, RWD has yet to live up to its full potential. Why?
The answer lies in the data itself. Most RWD used today comes from structured modalities such as EMR and claims data. While (comparatively) easy to work with, these types of data only scratch the surface of clinical reality. Generated for administrative purposes by overburdened clinicians, they fail to capture the true complexity and nuance of patient care. Diagnosis and procedure codes, despite their detail, cannot fully represent the granular data needed for scientific discovery.
Consider heart failure as an example. A patient with heart failure would often be represented in RWD analysis as having ICD-10 code I50. Within this code are various subcodes that might be applied throughout the chart depending on the coder. For example:
I50.1 (LV unspecified)
I50.2 (systolic)
I50.3 (diastolic)
I50.4 (combined systolic and diastolic)
I50.8 (other)
I50.9 (unspecified)
A richer characterization of the patient’s condition – one more useful and familiar from a physician’s standpoint – would include some or all of the following:
Patient symptoms: Particularly shortness of breath, fatigue, and swelling in the legs – all of which would be documented in clinical notes
Echocardiography: The actual ejection fraction measurement and other measurements from an echocardiogram, such as left ventricular mass and E/e' ratio
Blood tests: Particularly B-type natriuretic peptide (BNP) or N-terminal pro-BNP (NT-proBNP) – elevated levels of these biomarkers suggest heart failure
NYHA heart failure classification: Categorizes heart failure based on the severity of symptoms and physical limitations, ranging from Class I (no symptoms and no limitation in ordinary physical activity) to Class IV (severe limitations, with symptoms even at rest)
Electrocardiography: ECGs measure the electrical activity of the heart and can reveal patterns associated with heart failure, such as left ventricular hypertrophy or arrhythmias
Coronary angiography results: In some cases, especially if coronary artery disease is suspected as a cause of heart failure, coronary angiography may be performed to check for blockages in the coronary arteries that could be causing or worsening the condition
This is the level of granular clinical information that is collected in clinical trials. But it is several layers deeper than what is available in typical RWD.
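To make the contrast concrete, here is a minimal Python sketch of the two views of the same heart failure patient. The class names, field names, and units are hypothetical choices made for illustration and are not Dandelion's actual schema; the subcode descriptions are the short labels from the list above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Short labels for the I50 subcodes listed above.
I50_SUBCODES = {
    "I50.1": "LV unspecified",
    "I50.2": "systolic",
    "I50.3": "diastolic",
    "I50.4": "combined systolic and diastolic",
    "I50.8": "other",
    "I50.9": "unspecified",
}


@dataclass
class CodedView:
    """What a code-based RWD record typically knows about the patient."""
    patient_id: str
    icd10_code: str

    def description(self) -> str:
        # Use the first five characters so longer codes (e.g. I50.22)
        # resolve to their parent subcode.
        return I50_SUBCODES.get(self.icd10_code[:5], "unknown")


@dataclass
class ClinicalView:
    """Hypothetical richer record; any field may be missing in practice."""
    patient_id: str
    ejection_fraction_pct: Optional[float] = None  # from echocardiography
    nt_probnp_pg_ml: Optional[float] = None        # from blood tests
    nyha_class: Optional[int] = None               # 1-4, from clinical notes
    symptoms: Tuple[str, ...] = ()                 # from clinical notes
    ecg_findings: Tuple[str, ...] = ()             # from electrocardiography


coded = CodedView(patient_id="p-001", icd10_code="I50.2")
print(coded.description())  # the whole clinical picture reduced to one label
```

Every field in the second class comes from a different source system or an unstructured document, which is where the challenges discussed next come from.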
Why isn't this level of detail available in typical RWD? There are three main challenges to taking this more comprehensive approach.
Data access: Nearly every bullet point above represents a different IT system within a hospital.
Data harmonization: At this level of granularity there is little consistency in the nomenclature used to describe, for example, specific blood tests (see this chart from the CDC outlining the 50+ ways to describe the units of common blood tests).
Data interpretation: Since much of the data generated is unstructured, it has traditionally required expensive clinicians to carefully abstract the data into structured formats that can be used for analytical purposes.
These are the challenges that Dandelion solves for life sciences companies.
For the first two challenges, data access and data harmonization, the only solution is careful, expert human labor – this is what we've been working on for the four years since Dandelion was founded.
The launch of the Clinical AI Marketplace is designed to solve the third challenge, data interpretation.
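As a rough illustration of the harmonization challenge, the sketch below converts a lab value reported under different unit spellings into one canonical unit. It is a simplified assumption of how such a step might look, not a description of Dandelion's pipeline: the lookup table and function names are hypothetical, and it covers only a few mass-concentration units.

```python
# Hypothetical lookup: normalized unit spelling -> multiplier to pg/mL.
TO_PG_PER_ML = {
    "pg/ml": 1.0,
    "ng/l": 1.0,      # 1 ng/L is the same concentration as 1 pg/mL
    "ng/dl": 10.0,    # 1 ng/dL = 10 pg/mL
    "ng/ml": 1000.0,  # 1 ng/mL = 1000 pg/mL
}


def normalize_unit(raw_unit: str) -> str:
    """Collapse case and whitespace differences such as ' PG / mL '."""
    return raw_unit.strip().lower().replace(" ", "")


def to_pg_per_ml(value: float, raw_unit: str) -> float:
    """Convert a value to pg/mL, failing loudly on unknown units."""
    unit = normalize_unit(raw_unit)
    if unit not in TO_PG_PER_ML:
        # Unknown spellings are surfaced for human review, not guessed.
        raise ValueError(f"unrecognized unit: {raw_unit!r}")
    return value * TO_PG_PER_ML[unit]


print(to_pg_per_ml(450, "NG/L"))      # same concentration, now in pg/mL
print(to_pg_per_ml(0.5, " ng / mL"))  # scaled up by 1000
```

Raising an error on unrecognized units is a deliberate choice in this sketch: it reflects the point above that this kind of work still needs expert review and cannot be fully automated.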
AI's Potential to Unlock Unstructured Data
AI offers a way to unlock the vast reservoirs of unstructured data generated by the healthcare system. LLMs, for instance, can be used in the heart failure example above to abstract symptoms and NYHA classes from provider notes. Computer vision algorithms can automate various measurements that cardiologists typically perform using echocardiograms.
However, developing high-quality AI algorithms is no small feat. It requires access to high-quality, diverse data, rigorous model building, and thorough validation. While some life sciences companies have invested heavily in the data infrastructure and capabilities needed to do this work, there remains an intimidating amount of ground to cover – every data modality and clinical area requires its own specialized expertise and data. Even for large life sciences companies, the scale of the investment required can grow daunting quickly, particularly when the ROI of these exploratory efforts remains uncertain.
How the Clinical AI Marketplace Works
Dandelion’s Clinical AI Marketplace bridges the gap between AI developers and life sciences companies. The process can be broken down into the following steps:
Source algorithm. Dandelion will help life sciences companies source high-quality, validated algorithms from developers who have already done the heavy lifting of training the model. In some cases, life sciences companies may have their own algorithms developed with proprietary data, but many will benefit from leveraging third-party solutions.
Validate algorithm. Dandelion uses our extensive digital biobank covering more than 10 million patients to validate these algorithms by comparing the algorithm predictions against the outcomes actually observed for those patients in the real world. This validation step ensures the algorithms are not only high-performing and unbiased but also fit-for-purpose for the specific population and use cases under study. For instance, while FDA clearance sets the standard for clinical use, certain research and analyses may demand greater precision if they are intended to detect physiological changes caused by a particular drug with a high degree of statistical confidence.
Study cohort curation. Once we have validated the algorithm’s performance and suitability for the research task, we carefully curate patient cohorts with the relevant multimodal data points, providing a comprehensive, high-fidelity representation of the clinical trajectory of the patients in the cohort.
Deploy algorithm. We then run the validated algorithms on the unstructured data and generate tabular data fields that integrate seamlessly with other patient-level data and can be analyzed using the normal biostatistics toolkit.
The result? A novel, research-ready dataset delivered in weeks (not months) that allows life sciences companies to focus on answering their critical research questions rather than wrestling with the steps above.
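The validation step can be pictured with a small sketch: compare an algorithm's binary predictions with observed outcomes, separately for each patient subgroup, and flag subgroups that fall below a study-specific bar. This is an illustrative assumption about what such a check might involve, not Dandelion's actual validation protocol. The function names, the thresholds, and the toy records are all hypothetical.

```python
from collections import defaultdict


def confusion_by_group(records):
    """Tally TP/FP/TN/FN per subgroup.

    records: iterable of (group, predicted, observed) with boolean flags.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, predicted, observed in records:
        if predicted and observed:
            key = "tp"
        elif predicted and not observed:
            key = "fp"
        elif observed:
            key = "fn"
        else:
            key = "tn"
        counts[group][key] += 1
    return dict(counts)


def sensitivity(c):
    positives = c["tp"] + c["fn"]
    return c["tp"] / positives if positives else float("nan")


def specificity(c):
    negatives = c["tn"] + c["fp"]
    return c["tn"] / negatives if negatives else float("nan")


def fit_for_purpose(records, min_sensitivity, min_specificity):
    """Return {group: True/False} against study-specific thresholds."""
    return {
        group: sensitivity(c) >= min_sensitivity
        and specificity(c) >= min_specificity
        for group, c in confusion_by_group(records).items()
    }


# Synthetic toy records, for illustration only.
toy_records = [
    ("A", True, True), ("A", True, True), ("A", False, True),
    ("A", False, False), ("A", True, False),
    ("B", True, True), ("B", False, False), ("B", False, False),
]
print(fit_for_purpose(toy_records, min_sensitivity=0.8, min_specificity=0.8))
```

Reporting the result per subgroup rather than as one pooled number is what lets a check like this speak to bias and to suitability for a specific study population. A real validation would use far larger samples and confidence intervals rather than point estimates.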
Our Vision for the Clinical AI Marketplace
We believe that our Clinical AI Marketplace’s ability to combine multimodal RWD with validated AI will empower life sciences companies to move faster, reduce development costs, and foster greater creativity.
Faster: The Marketplace accelerates the iteration cycle, eliminating the friction associated with training algorithms from scratch and sourcing the right data.
Cheaper: By centralizing data infrastructure, we allow AI and data science teams to build on a shared foundation rather than constructing each study from the ground up.
More Innovative: By lowering barriers to the use of external AI innovations, the Clinical AI Marketplace will enable life sciences companies to make bolder bets and learn more quickly.
The end result of these improvements holds enormous potential to accelerate the development of therapies by supporting:
Trial enrichment: In the near term, AI algorithms can be used to enrich clinical trial cohorts in order to reduce the size and duration of clinical trials.
Companion diagnostics: In the medium term, AI algorithms running on raw clinical data will be able to predict which patients will respond to specific therapies, accelerating precision medicine across therapeutic areas.
Surrogate biomarkers: In the longer term, we believe sufficient evidence can be generated to allow certain AI algorithms to be used as surrogate biomarkers for clinical endpoints in slow-progressing conditions like cardiovascular disease and Alzheimer’s Disease, making it easier to bring to market therapies that slow or prevent progression.
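To see why trial enrichment can shrink a trial, consider a back-of-the-envelope sketch using the standard normal-approximation sample size formula for comparing two event proportions. The event rates and the 20% relative reduction are hypothetical numbers chosen for illustration. The sketch also assumes the therapy's relative effect is the same in the enriched population, which would need to be justified in a real study.

```python
import math
from statistics import NormalDist


def patients_per_arm(control_rate, relative_reduction, alpha=0.05, power=0.80):
    """Approximate per-arm sample size for a two-proportion comparison."""
    treated_rate = control_rate * (1.0 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = (control_rate * (1.0 - control_rate)
                + treated_rate * (1.0 - treated_rate))
    effect = control_rate - treated_rate
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical scenario: an algorithm selects patients whose event rate
# is 10% instead of 5% over the trial period.
unenriched = patients_per_arm(control_rate=0.05, relative_reduction=0.20)
enriched = patients_per_arm(control_rate=0.10, relative_reduction=0.20)
print(unenriched, enriched)
```

Under these assumptions the enriched cohort needs fewer patients per arm, because events accumulate faster among higher-risk patients and the absolute difference between arms is larger.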
To provide a tangible example of the power of this approach, we recently published a research study that measures GLP-1s’ cardioprotective benefits using an AI algorithm from Dandelion’s Clinical AI Marketplace that runs on raw ECG waveforms. Using the algorithm as a surrogate biomarker, we find that GLP-1 users had 15-20% lower MACE risk scores three years after initiation than matched patients not on a GLP-1. This is roughly the same level of risk reduction observed in the SELECT trial, but at a tiny fraction of the cost and time. It also enabled us to look at primary prevention in a manner that would be prohibitively expensive for a randomized controlled trial.
Interested in learning more? To read more about the process and results, download our white paper below or read our scientific pre-print with more detail.
Are you an AI developer who wants to put your algorithm on our Clinical AI Marketplace? Fill in this form to let us know who you are and what your algorithm does.
Do you work at a life sciences company and have a use case in mind that you’d like to discuss? Reach out to us at info@dandelionhealth.ai.