Building Dandelion's Multimodal Data Engine
In any given hospital system, there are maybe 16 or 17 different clinical information systems that store the data. So it's not like you go in and there's a single patient record that includes all of their data ever in one place, and you just have to take it out. Everything is stored separately based on clinical workflow, department, and type of data. Imaging gets stored here, but the associated physician notes from reading and interpreting the image are stored over there.
And these are data that were not built for the purpose of analysis. They were built for continuity of care in the healthcare setting and also for billing purposes, so they're not trying to enable ease of analysis. What we're really talking about is starting from scratch: looking at all of these different sources and linking them together with fidelity, in a way that actually tells you, okay, for patient A, whose image is over here, these are all of the procedure orders they had done and all the medications they took within the same healthcare visit.
That's just not readily available. You have to do all that linking and all that building yourself. You also have to do all that cleaning, because all kinds of stuff gets captured automatically that's not of particular use. And then on top of that, you have to ethically de-identify all of it in order to extract it. De-identification requires a different methodology depending on the type of data. Structured EMR data you de-identify in one way, versus imaging, which could contain identifiable information.
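To make the linking step concrete, here is a minimal sketch of grouping records from separate systems under a single patient and visit. It assumes every source already carries a shared `patient_id` and `visit_id`, which real hospital systems often do not; the field names, the sample records, and the `link_by_visit` helper are all hypothetical and are not Dandelion's actual pipeline.

```python
from collections import defaultdict

# Hypothetical extracts from three separate clinical systems.
imaging = [{"patient_id": "A", "visit_id": "V1", "study": "chest CT"}]
procedure_orders = [
    {"patient_id": "A", "visit_id": "V1", "order": "CT with contrast"},
    {"patient_id": "A", "visit_id": "V2", "order": "echocardiogram"},
]
medications = [{"patient_id": "A", "visit_id": "V1", "drug": "metoprolol"}]


def link_by_visit(sources):
    """Group records from every source under a (patient_id, visit_id) key."""
    visits = defaultdict(lambda: defaultdict(list))
    for source_name, records in sources.items():
        for record in records:
            # Assumption: both identifiers exist and agree across systems.
            key = (record["patient_id"], record["visit_id"])
            visits[key][source_name].append(record)
    return visits


linked = link_by_visit({
    "imaging": imaging,
    "procedure_orders": procedure_orders,
    "medications": medications,
})

# Everything known about patient A during visit V1, across all sources.
for source_name, records in sorted(linked[("A", "V1")].items()):
    print(source_name, records)
```

In practice the hard part is usually establishing that shared key in the first place, since each system may identify patients and encounters differently.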
With imaging, you obscure different things to de-identify it, so you're de-identifying it in a different way. So de-identifying everything correctly and ethically, then linking it all together, then getting rid of all the junk and being left with something interpretable is, one, a really large effort, and two, requires basically a combination of data science, clinical informatics, privacy, and engineering knowledge. The team we've built out here has built this tremendous engine to be able to extract all of these elements at this depth.
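As a rough illustration of why de-identification differs by data type, the sketch below routes each record to a handler for its modality. Everything here is an assumption made for the example: the field names, the header keys, the idea that burned-in text sits in the top rows of the image, and the salted-hash pseudonym. It is not a compliant de-identification method and not Dandelion's own.

```python
import hashlib

SALT = b"example-salt"  # hypothetical; a real salt would be a managed secret
DIRECT_IDENTIFIERS = {"name", "mrn", "address"}          # assumed field names
IMAGE_HEADER_IDENTIFIERS = {"PatientName", "PatientID"}  # assumed header keys


def pseudonym(patient_id):
    """Stable stand-in for the real ID so records can still be linked."""
    return hashlib.sha256(SALT + patient_id.encode()).hexdigest()[:12]


def deidentify_structured(row):
    """Structured data: drop identifying fields, swap in a pseudonym."""
    clean = {k: v for k, v in row.items() if k not in DIRECT_IDENTIFIERS}
    clean["patient_id"] = pseudonym(row["patient_id"])
    return clean


def deidentify_image(image, burned_in_rows=1):
    """Imaging: scrub header fields and blank rows assumed to hold burned-in text."""
    header = {k: v for k, v in image["header"].items()
              if k not in IMAGE_HEADER_IDENTIFIERS}
    pixels = [row[:] for row in image["pixels"]]  # copy, leave the input intact
    for r in range(min(burned_in_rows, len(pixels))):
        pixels[r] = [0] * len(pixels[r])
    return {"patient_id": pseudonym(image["patient_id"]),
            "header": header, "pixels": pixels}


# One handler per modality.
DEIDENTIFIERS = {"structured": deidentify_structured, "imaging": deidentify_image}


def deidentify(kind, record):
    return DEIDENTIFIERS[kind](record)


row = {"patient_id": "A", "name": "Jane Doe", "mrn": "123", "lab": "HbA1c 6.1"}
img = {"patient_id": "A",
       "header": {"PatientName": "Jane Doe", "Modality": "CT"},
       "pixels": [[9, 9], [5, 5]]}

clean_row = deidentify("structured", row)
clean_img = deidentify("imaging", img)
```

Because both handlers derive the same pseudonym from the same patient ID, the de-identified row and image can still be linked to each other afterward.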