Digital Pathology Podcast
203: Clarifying Validation Terminologies in Healthcare
Paper Discussed in this Episode:
Clarifying validation terminologies in healthcare. Amanda Dy, Sandra M. Buetow, Andrew J. Bredemeyer, et al. npj Digit. Med. (2026). https://doi.org/10.1038/s41746-026-02471-2.
Episode Summary:
In this deep dive, we unpack the silent chaos surrounding a single, universally used word in healthcare innovation: "validation". Exploring a 2026 paper by the Pathology Innovation Collaborative Community (PIcc), we uncover how differing definitions of this word across AI developers, hospital directors, regulators, and venture capitalists can lead to massive miscommunications, millions of wasted dollars, and compromised patient safety. We ask the critical question: when a developer says an AI tool is "validated," what are they actually selling you?
In This Episode, We Cover:
• The "Chameleon Word" of Healthcare: Tracing the evolution of "validation" from its Latin roots, to its 1940s use in physical measurement accuracy, and its 1962 shift into hold-out testing. Today, the word functions simultaneously as an evidence claim, a lifecycle activity, and a quality label, creating a fractured meaning across disciplines.
• The AI/ML Trap (Three Shades of Validation): Why an AI developer claiming a model is "validated" might just mean they checked the raw data for corrupt files (dataset validation) or tuned the model's math in the lab (validation data). Calling a model "validated" after internal cross-validation severely misrepresents its readiness for actual clinical deployment.
• The Clinical Lab Reality Check (Analytical vs. Clinical): The crucial difference between analytical validation (proving a tool is technically perfect, like a thermometer) and clinical validation (proving the tool actually helps diagnose patients correctly). We also explore why the gold-standard US lab framework, CLIA, completely abandons the word "validation" in favor of establishing "performance characteristics" that require rigorous, site-specific verification.
• The Regulatory and Business Minefields: How geography alters the legal definition, with the FDA focusing on intended use while European frameworks (IVDR) encompass entire lifecycle risk management. Furthermore, we discuss why "business validation" (securing investor funding) does not equate to clinical safety or regulatory readiness.
• The "Lightweight" Solution: The authors don't propose a massive new dictionary; instead, they advocate for simple, lightweight qualifiers. Teams must stop using "validation" as a binary yes/no label and instead explicitly define the context—stating exactly what phase, reference standard, and operational conditions were tested.
Key Takeaway:
The word "validation" has morphed into a pseudoscientific label of trust that can mask a product's true readiness. To prevent dangerous misalignments in medical innovation, interdisciplinary teams must demand explicit context: never just accept that a tool is "validated" without asking "validated for what exactly?"
Welcome back trailblazers to another session of the digital pathology podcast.
Yeah, I'm uh really excited for this discussion today.
Same here, because today we are taking a deep dive into something that sounds simple but is actually, you know, causing massive silent chaos across healthcare innovation.
Oh, absolutely. It is a staggering problem and um it's entirely invisible until it causes a major crisis.
Right. Imagine a hospital spends, I don't know, $2 million on a state-of-the-art AI diagnostic tool.
Just a huge investment.
Exactly. And the software developer swears up and down that the algorithm is validated. The venture capitalist backing it, they promise it's validated, too.
But then they actually deploy it.
Yeah. The very first week it's deployed in the hospital's pathology lab, the AI just fails completely. It misreads slides. It flags the wrong cells. It has to be pulled from the floor.
And the wild part is the code wasn't broken.
No. The issue was that neither the developer nor the investor nor the hospital director actually agreed on what the word validated meant.
Right? When different teams don't share a fundamental understanding of well what constitutes evidence, you end up selling the wrong assurances to the wrong audience.
Okay, let's unpack this because we are looking at a critical paper that exposes how this single universally used word is literally delaying medical innovation
and risking patient trust. Honestly,
Totally. The paper is called Clarifying Validation Terminologies in Healthcare, recently published in npj Digital Medicine in 2026,
authored by a massive multidisciplinary group.
Yeah, led by Amanda Dy and colleagues, together with the Pathology Innovation Collaborative Community, or PIcc.
And that authorship group is key. You know, PIcc brings together experts across lab medicine, AI, industry, and regulatory science, so they have a front-row seat to this communication breakdown.
right because the word validation is just thrown around everywhere right now from AI labs to hospital floors to investor pitch decks.
Oh, it's everywhere.
I was thinking about this and using the word validation in healthcare right now is um it's kind of like ordering a regular coffee in different parts of the world.
Ah, I like that. How so?
Well, if you order a regular coffee in a diner in New York, you are getting a large drip coffee with milk and sugar,
right? Classic.
But if you order a regular coffee in a cafe in Italy, you're getting a tiny shot of espresso.
Yeah, very different experiences.
It's the exact same phrase, but the local context completely dictates the output. But in healthcare, if an AI developer's definition of validated is an espresso and the surgeon expects a drip coffee,
you risk misdiagnosing a patient.
Exactly. It's not just a bad morning.
No, the stakes are incredibly high. And to understand why this one word is causing so much friction, we really have to look at how it migrated across different scientific disciplines over the decades.
So, it actually changed meaning over time.
Oh, absolutely. The paper maps out this conceptual evolution perfectly in Figure 2.
Mhm.
If we trace the word back to its Latin root validus, it actually means strength or effect.
Wait, really? It doesn't mean truth.
Nope. It just means something is strong. But then, uh, fast forward to 1946, and it moves into measurement accuracy.
Okay. Like physical measurements.
Yeah. Scientists needed a way to prove that an instrument actually captured the physical attribute it claimed to measure. So validation became about physical accuracy.
Interesting.
And then by 1962, we see another huge shift. That's when the concept of hold-out testing is introduced to the definition.
Hold-out testing. Break that down for me. How does that change the word?
Well, think about before 1962. Researchers were essentially, you know, grading their own homework. They'd build a statistical model with a set of data and then test it on that exact same data.
Oh, wow. So, of course, it worked. It already knew the answer.
Exactly. It's self-confirmation bias. So, hold-out testing forced researchers to lock away a portion of their data, completely untouched.
Right. So, you train on one batch and test on the locked-away batch.
Yes. To see if the model actually learned the rules or just, you know, memorized the test. And from there, it evolved into modern regulatory fit-for-purpose assurance.
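For readers who want to see this concretely, here is a minimal Python sketch of hold-out testing. This is our illustration, not code from the paper; the toy data and logistic model are stand-ins for any real dataset and algorithm.

```python
# Minimal sketch of hold-out testing (illustrative only, not from the paper).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))           # toy features
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # toy labels

# Lock away 20% of the data before any fitting happens.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LogisticRegression().fit(X_train, y_train)

# Grading your own homework: performance on data the model has seen.
print("train accuracy:", accuracy_score(y_train, model.predict(X_train)))
# Hold-out testing: performance on data the model never touched.
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```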
That is quite the journey. For one word
It is. What's fascinating here is that, because of this layered history, the word validation now functions simultaneously as an evidence claim, a lifecycle activity, and a quality label.
So it means I have proof I am testing and this product is good all at the same time.
Yes. And when people don't specify which one they mean, communication completely breaks down.
Okay. So, because of this fractured meaning, nowhere is this misinterpretation more dangerous today than in AI and machine learning for healthcare.
Oh, without a doubt, it is a minefield.
But I have to push back a little. If an AI developer tells me, a health care professional, that their diagnostic model is validated, shouldn't I be able to trust that it's ready for my patients?
See, that assumption is exactly the trap. The paper completely dismantles this using the AI and ML domain findings.
Really?
Yeah. In AI, validation could mean at least three totally different things. First, there's data set validation.
Okay. What is that?
That's just checking if your raw data represents the intended population. You know, checking for corrupt files or missing pixels.
Ah, so it's just making sure your ingredients are fresh before you bake the cake.
Perfect analogy. Second, you have validation data. And this is where it gets really tricky because validation data is actually used during the development phase for tuning.
Wait, tuning? Like adjusting the model while you build it.
Exactly. Like tweaking the dials on a mixing board. You feed the algorithm data, check it against the validation data, tweak the mathematical dials, and repeat.
But if they are still tweaking dials, they're still in the lab.
Yes. If a developer says they use validation data, it is not finished.
It is definitely not ready for the clinic.
That is terrifying. An administrator might buy it thinking it's ready to go.
It happens. And the third use is model validation which is the final performance evaluation.
Okay, so that one means it's ready, right?
Not necessarily. The paper points out that calling a model validated after internal cross-validation severely misrepresents its clinical readiness.
Oh, because it was only tested on data from the exact same hospital where it was built.
Exactly. It risks overfitting to a local environment. Yeah. It looks great on paper but falls apart in the real world.
So the AI developers are basically using the word while they're still experimenting.
Yeah. And regulatory frameworks actually explicitly discourage using the word validation during those development phases.
They discourage it? Like, the FDA tells them not to use it?
Yes. The paper suggests explicitly separating these phases into data representativeness, training, tuning, and testing. Avoid the word validation entirely during development.
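To make that phase separation concrete, here is a short Python sketch written for this episode page; it is our own illustration of the framing, not code from the paper, and all data, hyperparameter values, and thresholds are hypothetical. It keeps the three "shades" apart: a dataset check, tuning against a validation split, and one final look at held-out test data.

```python
# Illustrative sketch (not from the paper): keep "dataset validation",
# "validation data", and final "model validation" as separate phases.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def check_dataset(X, y):
    """Phase 1, data representativeness: basic integrity checks only."""
    assert not np.isnan(X).any(), "corrupt/missing values in features"
    assert len(np.unique(y)) == 2, "expected a binary label"

rng = np.random.default_rng(1)
X = rng.normal(size=(600, 10))
y = (X[:, 0] - X[:, 2] > 0).astype(int)
check_dataset(X, y)

# Phase 2, training and tuning: the "validation data" is used to pick
# hyperparameters, so it is part of development, not a readiness claim.
X_dev, X_test, y_dev, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(X_dev, y_dev, test_size=0.25, random_state=1)

best_c, best_auc = None, -1.0
for c in [0.01, 0.1, 1.0, 10.0]:  # tweak the "dials"
    m = LogisticRegression(C=c).fit(X_train, y_train)
    auc = roc_auc_score(y_val, m.predict_proba(X_val)[:, 1])
    if auc > best_auc:
        best_c, best_auc = c, auc

# Phase 3, testing: one look at held-out data, reported with context.
final = LogisticRegression(C=best_c).fit(X_dev, y_dev)
test_auc = roc_auc_score(y_test, final.predict_proba(X_test)[:, 1])
print(f"tuned C={best_c}; held-out test AUC={test_auc:.2f}")
```

The point of the structure is that nothing in the tuning loop ever touches the test split; it is consulted exactly once, for the final report.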
Wow. Okay. So, let's say a developer navigates all that. They finish the algorithm, and now they have to bring it into a real-world clinical pathology lab.
That is where they hit a massive cultural and regulatory wall.
I want to clarify something the paper brought up here. The distinction between analytical and clinical validation.
Oh yes, this is crucial
because analytical validation is like proving a thermometer accurately reads 98.6°F every single time. It's technically perfect,
right?
But clinical validation is proving that a temperature of 98.6 actually means the patient is healthy.
You can have an analytically perfect tool that is clinically useless, right? Like using that perfect thermometer to diagnose a broken leg.
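That thermometer analogy maps onto two different calculations. Here is a hedged Python sketch, our own example rather than anything from the paper, in which a tool agrees almost exactly with a reference instrument (passing the analytical check) yet separates an unrelated condition no better than chance. All numbers are made up.

```python
# Illustrative sketch (our example, not from the paper): the same tool can
# pass analytical validation and still fail clinical validation.
import numpy as np

rng = np.random.default_rng(2)
reference = rng.normal(98.6, 0.9, size=200)       # reference thermometer (°F)
tool = reference + rng.normal(0, 0.02, size=200)  # tool tracks it almost exactly

# Analytical validation: does the tool reproduce the reference measurement?
print("mean absolute error (°F):", np.abs(tool - reference).mean())  # tiny

# Clinical validation: does the measurement identify the condition of interest?
# The condition here (say, a broken leg) is unrelated to temperature, so an
# arbitrary temperature cutoff separates patients no better than chance.
has_condition = rng.random(200) < 0.3
flagged = tool > 99.5
sensitivity = (flagged & has_condition).sum() / has_condition.sum()
specificity = (~flagged & ~has_condition).sum() / (~has_condition).sum()
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```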
That captures the divide perfectly. And here is a surprising fact from the clinical practice domain: the Clinical Laboratory Improvement Amendments.
CLIA, right? The gold standard for US labs.
Yes, CLIA. They don't even use the term validation.
Wait, they don't use the word at all. What do they use?
Instead, CLIA requires labs to establish performance characteristics.
Establish performance characteristics. That is very specific.
Very. And it highlights the decentralized nature of pathology labs: CLIA labs must conduct site-specific validation for their own tests.
Just for things they build in-house?
No. Even site-specific verification is required for FDA-authorized, commercially purchased assays.
Hold on. So, if a tool has an FDA stamp of approval, the local lab still has to reprove it works on their specific computers and patients.
Yes, it's a huge contrast to other medical fields. In pharma, FDA approval is the finish line. In clinical labs, it's just the starting line
because of all the local variables. like how the tissue was handled or what chemical stains they used.
Exactly. All those physical variables alter the digital image. If an AI developer thinks their "validated" label bypasses the local lab's verification, they're in for a rude awakening.
Okay, so they navigate the AI terminology. They satisfy the clinical lab's strict performance characteristics. But now they have to get the product legally approved and financially backed.
And here the definition shifts yet again.
I have to ask: does geography actually change the legal definition of validation? If I'm a trailblazer launching a tool in Europe versus the US, am I speaking a totally different regulatory language?
Oh, absolutely. You really are. The FDA typically uses validation to mean a software function reliably meets its intended use.
Okay. Focused on intended use,
right? But European frameworks like the IVDR use it much more broadly. It encompasses everything from analytical performance to entire life cycle risk management.
So a developer can't just say, "We are regulatory validated." They have to say under which law and where.
Precisely. And then you bring in the business domain which the paper warns us about.
Business validation. That usually just means product market fit, right? Or securing investor funding.
Exactly. A business model might be validated because investors really like the pitch deck or the user interface,
but it could still be completely unvalidated clinically.
Yes. If we connect this to the bigger picture, aligning business, clinical, and regulatory validation early is the only way to prevent massive delays in getting these tools to patients,
right? Because if you only have business validation, you're just selling digital snake oil.
And if you only have clinical validation but no business validation, you go bankrupt and the tool never reaches the patient anyway.
Wow. So with all these conflicting domains, communication, AI, clinical labs, regulatory, and business, how do we actually fix this communication breakdown?
Well, we certainly don't want to get bogged down in endless red tape,
right? And here's where it gets really interesting, because the authors of this paper are not proposing a massive new regulatory framework or, like, a brand-new dictionary.
No, they're proposing something highly practical,
lightweight qualifiers.
Yes, that is consensus proposal one from the paper. Explicitly define the context at the outset.
So, how does that work in practice?
The authors walk through Table 2 with practical examples. So, instead of writing "the model has been validated," which means nothing...
Right.
The improved wording is: "the model's architecture was selected based on five-fold internal cross-validation, and performance was reported using an external data set."
Oh wow. Okay. So it takes an extra breath to say, but the ambiguity is just completely gone.
Exactly. We shouldn't abandon the word validation. It's too ingrained.
We just have to treat it as a conditional statement.
Right. It is no longer a binary yes-or-no quality label. It must be tied to a specific context, reference standard, and operational condition.
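One lightweight way to operationalize that idea is to make every validation claim carry its context by construction. The sketch below is hypothetical: the field names and example values are our assumptions, not a schema from the paper.

```python
# Illustrative sketch (field names are our assumption, not the paper's schema):
# treat "validated" as a conditional statement, never a bare yes/no label.
from dataclasses import dataclass

@dataclass(frozen=True)
class ValidationClaim:
    phase: str                   # e.g. "internal cross-validation", "external testing"
    reference_standard: str      # what ground truth was used
    operational_conditions: str  # population, site, scanner, workflow, etc.

    def __str__(self):
        return (f"Validated under: phase={self.phase}; "
                f"reference standard={self.reference_standard}; "
                f"conditions={self.operational_conditions}")

claim = ValidationClaim(
    phase="five-fold internal cross-validation + external test set",
    reference_standard="consensus pathologist diagnosis (hypothetical)",
    operational_conditions="single-site H&E slides, one scanner model",
)
print(claim)  # answers "validated for what, exactly?" up front
```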
That makes so much sense. So, to summarize for our trailblazers today: the word validation is a chameleon word.
It really is.
Assuming everyone is using the word the same way is just a recipe for duplicated work, blown budgets, and compromised patient safety.
You have to demand context. Ask what phase, what domain, and what standards are actually being validated.
And I want to leave everyone with a final sort of lingering question to ponder.
Oh, let's hear it.
Think about the digital tools and workflows you rely on in your practice today. How many of them are coasting on the halo effect of the word validated?
Oh, like functioning as a pseudoscientific label of trust.
Exactly. Simply because no one ever stopped to ask them: validated for what, exactly?
Wow, that is definitely something to think about next time someone hands you a pitch deck. Thank you so much for joining us for this session of the digital pathology podcast.
Thanks for having me.
Trailblazers, take these insights back to your interdisciplinary teams. Start using those lightweight qualifiers. And we will catch you next time.