
Digital Pathology Podcast
132: Ethical and Bias Considerations in Artificial Intelligence/Machine Learning
In this episode of the Digital Pathology Podcast, I explore the ethical and bias considerations in AI and machine learning through the lens of pathology. This is part six of our special seven-part series based on the landmark Modern Pathology review co-authored by the UPMC group, including Matthew Hanna, Liam Pantanowitz, and Hooman Rashidi.
From data bias and algorithmic bias to labeling, sampling, and representation issues, I break down where biases in AI can arise—and what we, as medical data stewards, must do to recognize, mitigate, and avoid them.
🔬 Key Topics Covered:
- [00:00:00] Introduction and post-USCAP 2025 reflections
- [00:03:00] Overview of AI and ethics paper from Modern Pathology
- [00:06:00] What it means to be a “data steward” in pathology
- [00:08:00] Core ethical principles: autonomy, beneficence, justice & more
- [00:13:00] Types of bias in AI systems: data, sampling, algorithmic, labeling
- [00:22:00] Temporal and feedback loop bias examples in pathology
- [00:29:00] FDA involvement and global guidelines for ethical AI
- [00:34:00] Bias mitigation: from diverse datasets to ongoing monitoring
- [00:43:00] The FAIR principles for responsible data use
- [00:49:00] AI development & reporting frameworks: QUADAS, CONSORT, STARD
🩺 Why This Episode Matters:
If we want to deploy AI ethically and reliably in pathology, we must check our bias—not just once, but at every stage of AI development. This episode gives you practical tools, frameworks, and principles for building responsible AI workflows from the ground up.
🎧 Listen now and become a more conscious and capable digital pathology data steward.
👉 Get the Paper here: Ethical and Bias Considerations in Artificial Intelligence/Machine Learning
📘 Explore more on this topic: https://digitalpathologyplace.com
Become a Digital Pathology Trailblazer: get the "Digital Pathology 101" FREE E-book and join us!
Ethics and Bias Considerations in AI and Pathology
Aleks: [00:00:00] Good morning, my digital pathology trailblazers. Welcome back to the normal studio; we are in Pennsylvania, at 6:00 AM for me. Let me make myself a little bigger and say hi to you in the chat, because if you are here, let me know that you're here. Let me know that you can hear me well. And today's topic is, let me share it with you.
Today's topic is, let's see if my aids here work, ethical and bias considerations in AI and machine learning. That's gonna be the topic of today's livestream. Let me know that you're here, because I see [00:01:00] that some of you are already here. How did you recover after USCAP, if you've been there? And if you haven't been there,
there are podcasts, live streams, and of course a bunch of LinkedIn posts, not only from me but from everybody who was at the conference. So that's the United States and Canadian Academy of Pathology; no, sorry, CAP is the College of American Pathologists and the other one is the Academy, and our paper is from USCAP as well.
So, United States and Canadian Academy of Pathology, correction: Academy. That was the conference not last week but the week before, the 22nd to the 27th of March, sorry guys. So who was there? Let me know in the chat [00:02:00] if you were at USCAP and I saw you. We had a fantastic conference with many digital pathology trailblazers giving podcast interviews.
So this is gonna be coming out as soon as I have it edited. And also, one interesting thing that happened just after I came back from USCAP was a LinkedIn post I came across, which I'm gonna do another podcast about because I wanna explain it a little bit: the lawsuit that was filed against the FDA, which wanted to regulate LDTs, lab-developed tests. It was won by CAP and by other organizations, and it was deemed not okay for the FDA to regulate those tests. So there's gonna be a separate podcast on that, because I wanna dig a little bit deeper: [00:03:00] why, on what basis was it won, and what does it mean? That whole last year, since it started, was like, oh, what's gonna happen with LDTs?
How is the diagnostic landscape gonna change with this? In the end, maybe it's not gonna change; maybe it's gonna change in a different direction. We're gonna be talking about this on another occasion, which is gonna be a podcast. And now let's go to ethical and bias considerations in AI. As we know, by design, we are biased creatures.
By the way, I also met a few people who are our friends from this paper. I met Matthew Hanna, and I met Liam Pantanowitz. I did not meet Hooman at this conference, so let's take him out of our [00:04:00] highlights. But I met both of them at the MUSE Microscopy booth, where I was hanging out with digital pathology trailblazers and with the MUSE team.
And I said thank you for those papers, because we are reviewing them with digital pathology trailblazers. Let me move the camera a little bit, and when you are joining, let me know that you're joining, let me know that you are back to the normal format, because this is also still the special seven-part series, and we are on number six of the seven-part series by Modern Pathology about AI in pathology, co-authored by the UPMC pathology group, including Matthew Hanna, Liam Pantanowitz, and Hooman Rashidi, with other co-authors. But let's dive into the topic of our conversation today. Yes, bias. We are biased [00:05:00] creatures. So what does that mean? It means that we have to be aware of it and mitigate it. And so here we are, number six. I can't believe we're already at number six, only number seven to go.
We are at Ethical and Bias Considerations in AI and Machine Learning. Yeah, medical practitioners, and especially pathology and laboratory medicine professionals in our case, but also imaging specialists, radiology, I would put them in the same category, and basically everybody who generates medical data: in this paper, they call them data stewards.
And I like this term very much, because we play a pivotal role in guiding the ethical development and deployment of current and future AI and machine learning [00:06:00] technologies. What does that mean for us? How can we be good data stewards? We need to know enough about this.
We need to know enough about how it's being developed, what is being done. Basically, we need to know enough about this whole AI revolution, I would call it, to be able to be good data stewards. So that's what we're doing right now: we're learning about it, and we're gonna learn about what can cause the bias.
And obviously they have fantastic tables and fantastic drawings, so we're gonna focus on those. Yes, we have our trailblazers. Welcome, Thomas, he's a regular. Thank you so much. Special thanks to the regulars, because they show up every time. Every time. Thank you so much. And without further ado, let's [00:07:00] dive into it.
Ethical principles in healthcare and medical research, with extensions to medical artificial intelligence. So this is important. I know I cannot show it or draw on it nicely. I gave a presentation, I'm gonna link to it in the cards after I publish it, where I explained in general the different ethical principles, where they come from, and how they are defined.
Some of these things are mentioned in this paper and obviously referenced. But if you wanna take a step back before diving into what they have here for AI, that's gonna be the presentation to watch. Basically, in this table we have the ethical principles for healthcare and medical research, because this is very well defined.
And now we also have a category for medical AI. [00:08:00] The first principle is respect for autonomy, the principle of self-governance, which in the case of healthcare and medical research is basically the individual's right to make their own decisions regarding their healthcare and participation in research.
Do they want that? Do they not want that? This is usually covered by informed consent, a little bit more about it later. And for AI, and here in AI we have both developers and users, not only the developers making those AIs but also the users must ensure that individuals have sufficient control over their interactions with AI.
And there are different regulations, more than I even knew there were; we're gonna be talking about them. Then the next principle, so that one was respect for autonomy, is beneficence, basically the principle of doing good, [00:09:00] because we also have non-maleficence in a second. So we must benefit the individual, both as healthcare professionals and researchers, and AI developers and users are supposed to do that as well.
They're supposed to maximize human benefit with those tools. Then we have non-maleficence: so not only are they supposed to do good, doing good is not enough, they also have to do no harm. AI developers and users are responsible for preventing harm and mitigating risks. And then we also need to think about justice, the principle of fairness.
So for physicians and researchers, it's that the treatment of individuals is fair and equitable regardless of factors such as race, gender, socioeconomic status, and medical condition, and AI developers need to work on that as well: promoting [00:10:00] equity regardless of factors such as race, gender, socioeconomic status, or medical condition.
And it's always these factors. Okay, let's go to accountability, and then I will tell you a little story, a little thing that comes to my mind when I read the statement about factors such as race, gender, socioeconomic status, and all that. So, accountability: we need to be responsible for our activities.
So the doctor, the physician, the researcher is responsible for what they're doing, and so are AI developers and users; they're accountable for ensuring that AI is designed, implemented, and operated ethically, transparently, and reliably. And about this justice, it's not funny, but I always
think about how it evolved. I recently had my citizenship interview for becoming a US citizen, [00:11:00] and one of the questions that they ask you there is about amendments to the constitution. They ask you about one of the amendments about voting, like, who can vote?
And the most recent answer is that everybody who is a US citizen over 18 can vote. And the evolution towards that was: white males could vote, then all males could vote, then females could vote, and now everybody over 18 can vote. Yeah, that's about justice and how it evolved in the US Constitution.
Moving on, I'm gonna skip the text because they have fantastic tables, and we're gonna focus on those today as well. Okay, we now know the rules, the principles. So we have the principles, these are our [00:12:00] ethical principles: respect for autonomy, beneficence, non-maleficence, justice, and accountability.
And where can this bias come from, from us who are creating AI? Because we are biased creatures and we create biased tools, and we have to figure out how to make them less biased, how to make them evolve like the US Constitution did. So let's look at this table about where the bias can come from.
We have the type of bias here in the left column, then the description, and examples in pathology. Some are very specific, some are less specific, but the obvious one is data bias. This one is pretty self-explanatory. And let me know if you are joining live, just say hi.
That's always super nice [00:13:00] when I hear from you in real time. So data bias is basically when we have biased data. What does that mean, biased data? We have underrepresentation, overrepresentation, or misrepresentation, in our case of pathology cases. An example in pathology would be overrepresentation of certain demographics in diagnostic data sets.
How can this happen? You work with one institution, they have data from that one institution, and that's already biased data. How do you overcome it? You try to have your data sets as diverse, from as many different places, as possible. Then we have algorithmic bias. This is a bias introduced during algorithm design, affecting diagnostic accuracy or treatment recommendations.
For example, in pathology: skewed [00:14:00] prioritization of symptoms or conditions in a diagnostic algorithm. This can happen by design or not by design. If by design you prioritize the correct one, then it's okay, but there is always the question: how do you know these are the correct ones?
In many diagnostic algorithms you give a weight to a certain symptom, so if there is something of a certain relevance, you upvote it, give it a higher weight. This can also happen unconsciously, in a biased way that's going to give priority to things that maybe should not get the priority they end up getting.
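To make the weighting idea concrete, here is a minimal sketch in Python. The finding names and weights are invented for illustration and are not from the paper; the point is simply that whoever picks the weights decides what the score prioritizes.

```python
# Hypothetical, simplified illustration of how hand-picked weights decide
# which findings a rule-based diagnostic score prioritizes.
# The finding names and weights below are invented for this example.

findings = {"nuclear_atypia": 1, "high_mitotic_count": 1, "necrosis": 0}

# Weights chosen by a developer. If a finding deserved more weight than it
# was given, the score systematically under-prioritizes it (algorithmic bias).
weights = {"nuclear_atypia": 0.5, "high_mitotic_count": 0.3, "necrosis": 0.2}

risk_score = sum(weights[name] * present for name, present in findings.items())
print(f"Risk score: {risk_score:.2f}")  # 0.80 in this toy case
```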
Then, sampling bias: when we have non-random sampling methods, we can have skewed conclusions. For example, in pathology, that would be insufficient gross dissection for mapping [00:15:00] of the tumor bed to predict response to therapy. Interesting, this one. So, for example, if we don't have the tumor fully dissected, maybe we end up with a lower percentage of tumor in the sample.
I don't know, but basically this sampling bias can already introduce a problem. Then we have measurement bias. Measurement bias is inaccuracies in diagnostic tests or imaging technologies affecting treatment decisions. And here for pathology, it's gonna be variation in test sensitivity or reference ranges across demographic groups.
Then we have labeling bias: misassigned labels or classifications influencing disease prediction models. And that's, let's call it, close to my heart [00:16:00] in the development of image analysis algorithms for pathology, because often you rely on annotations. Now we have models that don't need that;
they have self-supervision and work in different ways, but historically it was a lot of annotations by pathologists, and each pathologist annotates in a different way. So you basically introduce subjectivity at the stage of labeling your data, or you can mismatch reports to samples.
There is potential for error, potential for bias, but mostly it's the subjective interpretation of biopsy tumor grading results. This is an easy place to introduce bias. Prejudice bias: bias from preconceived notions about certain diseases or patient groups influencing pathology diagnosis. Then there are stereotypical assumptions about [00:17:00] patient demographics influencing diagnosis or management decisions.
I'm not gonna mention specific diseases, but you can think of diseases that make you think of a certain prejudice towards the person that carries that disease. You have to realize that and then counter it. Then, environmental bias: bias from environmental factors affecting disease prevalence or diagnostic outcomes in certain regions.
What can influence disease prevalence? This is so interesting, because I was thinking of environmental bias as something that happened in the environment, like the Chernobyl catastrophe, but it's a lot less dramatic than that, because you can simply have different patterns of patient testing in academic hospitals [00:18:00] in urban regions compared to community practices in rural settings.
When you test in a different way, more frequently or less frequently, you will have different prevalences of disease based on the environment that the person lives in. It doesn't have to be an environmental catastrophe like the Chernobyl catastrophe; it can just be a big hospital versus a community hospital.
Then, interaction bias: interactions between different diseases or comorbid conditions affecting diagnosis. At first I thought interaction bias meant how you interact with a patient or how you interact with AI. No, it's how diseases interact with each other. And in pathology, an example is gonna be serum antibody or chemistry
interferences in patients with similar diseases influencing test results. Yeah, basically comorbidities: elderly patients [00:19:00] often have multiple diseases, and not only them, many people have multiple diseases, and those diseases interact with each other, and you have to be conscious of that bias.
Something interesting is the feedback loop bias. This is a bias exacerbated by diagnostic feedback loops, where historical data or previous diagnoses influence future diagnostic decisions. This is so easy to have happen: oh, this was there before, so it has to be a continuation of the same thing, right?
And it can be. It happens both in treatment and diagnostic settings and also in research settings, where you have groups of research subjects; in my case it's gonna be animals, because I'm a veterinary pathologist. You see something in one group and you think, okay,
then it must be a different variation [00:20:00] of the same thing. It doesn't have to be, and that's already your feedback loop bias in action. An example is predicting future ancillary studies based on the initial diagnosis without supportive evidence. You don't have the evidence for this new conclusion;
you just have the historical finding. It's like acting according to "the best predictor of future behavior is past behavior", and maybe in psychology this tracks, but in medicine and research you should have some data backing it up. Representation bias: bias from inadequate representation of diverse populations in diagnostic data sets, affecting accuracy.
That was a little bit connected to data bias. And here you have underrepresentation of minority populations in genetic screening databases. [00:21:00] It's funny when I read that, because minority populations are going to be underrepresented precisely because they're a minority, and then
it becomes a vicious circle: because they're the minority, they're gonna be underrepresented anyway, there are fewer of them, and then you don't actively reach out or find this data to achieve representation at least at the level at which they're actually represented in the population.
And temporal bias. For some of these I have no idea how you should overcome them, or whether you even can; you just have to monitor and be super conscious about it. Oh, welcome Monica, amazing to have you here, thank you so much for joining. So, temporal bias is due to changes in disease prevalence
or diagnostic criteria not reflected in historical [00:22:00] diagnostic data. Something that can happen in pathology is the evolution of disease classification, leading to discrepancies between historical and current diagnoses. I think this is currently happening for cervical cancer screening and also maybe urinary bladder cytology.
It's happening as we speak, and the historical data is not gonna match the current data because the scoring criteria changed, probably also IHC scoring criteria for different markers, including HER2 and others. Anyway, that's something I don't know how to overcome; maybe you need to have a time cut point and use a different algorithm for data from before that cutoff, when the criteria changed, and a different one after.
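Just to sketch that time-cutoff idea (the date and names here are hypothetical, not from the paper), you could route each case to a model trained under the matching grading era:

```python
from datetime import date

# Hypothetical date on which the grading/scoring criteria changed.
CRITERIA_CHANGE_DATE = date(2023, 1, 1)

def pick_model(case_date: date, model_old, model_new):
    """Route a case to the model trained under the matching criteria era."""
    return model_old if case_date < CRITERIA_CHANGE_DATE else model_new

# Example: a case graded in 2021 goes to the model trained on pre-change data.
chosen = pick_model(date(2021, 6, 1), model_old="model_pre_2023", model_new="model_post_2023")
print(chosen)  # model_pre_2023
```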
Okay. [00:23:00] Transfer bias: bias from differences between diagnostic practices in training hospitals and those in community settings. We had something similar with environmental bias. And yeah, here we have variations in diagnostic criteria between academic and community healthcare
centers. So here it's a variation in diagnostic criteria; there it was basically different patterns of testing, but they're connected. And then, confirmation bias: initial beliefs or diagnostic decisions influence subsequent interpretations or actions. This is another thing that we have hardwired in our brains, shortcuts based on something that we already know.
We extrapolate other information from it. The pathology example is favoring evidence that confirms the initial diagnosis or treatment plan without considering contradictory [00:24:00] information. So we always have to check ourselves, check our bias. There are a lot of biases to check, and it's very important in the context of medicine, research, and AI.
Now let's look at this figure, where we have a list of guidelines that can actually be used. One statement people often make is: where are the guidelines, the regulatory agencies are lagging behind, how do we do this? There are a lot of guidelines that I didn't know existed, but now I know, and you'll know as well.
So these guidelines are for the respective stages of AI in the medical life cycle, and we're looking here at that medical life cycle. Let's see if I can make it a little bigger to show you [00:25:00] each of those parts. So let's start. What is this medical life cycle? We have research and development at the beginning of the life cycle.
Then we have clinical implementation at the end, and the parts where we are very active as researchers, clinicians, physicians, and pathologists. For me it's gonna be preclinical development. Then we have clinical trials and clinical translation. And in preclinical development we have basically everything before we can introduce
something, in my case a compound, a candidate drug, into the clinical trial. It can be a diagnostic test, an AI test, whatever, right? So we recognize the need, we harvest [00:26:00] data, we ensure explainability, and then think about copyright and IP considerations. And at the bottom here is the AI side.
We recognize the need, then we select the AI technique and we harvest the data. I love this little picture of harvesting the data. Let me show it to you. Ah.
Always a problem when I try to manipulate my PDF. I can do it, I can do it, guys, I can do it so that you can see it. Look at this: a tree with data, and a person is harvesting the data. So we harvest the data, and then develop and validate the algorithm. And when we develop and validate the algorithm, we have a bunch of guidelines that we can use.
We have QUADAS-AI, STARD-AI, PROBAST-AI, TRIPOD-AI. And there is another table; I'm not gonna go through these guidelines today, but I just wanted to let you know that when you are at this stage of developing and validating the algorithm, you need to check these guidelines. People already thought about it. You don't have to reinvent the wheel.
You don't have to start from scratch. Look at these guidelines and use them. It's always a super useful tool to leverage other people's research, and that's supposed to be the point of research, right, so that we can do new research. We can do the same with guidelines for AI. Then...
What have we done? We have developed, we ensured explainability, and then we published our research, [00:28:00] checked the copyright, and we have now developed a product and integrated it with digital systems. Oh my goodness, if we're here, we're really advanced. But then we need to go to clinical trials, and there we have another set of guidelines for the different phases of the clinical trials.
Let me make it bigger for you again.
So for early phase I-II trials we can use DECIDE-AI, and for phase III clinical trials we can use SPIRIT-AI and CONSORT-AI. So we already have different guidelines. And then, when we are the luckiest of all and also the best researchers of all, we go into [00:29:00] clinical implementation, clinical translation, and then we deal with regulatory agencies, right?
So we have regulatory inspection and approval, we analyze the health economics, we release to market, and then adopt and scale and check if there is a return on investment. And we have system health surveillance and updates. This last part is already with the authorities, with the regulators.
Before that, we have guidelines that we can apply, and obviously the regulators and authorities have their own guidelines. If you don't meet the criteria that they have, they're not even gonna be talking to you. But the point of this is to show you that there are guidelines you can use. Yes, and Monica, [00:30:00] thank you so much for this comment: the FDA also has recent regulations for developing and validating algorithms, both for preclinical and for clinical work. On the preclinical side it's very much risk-assessment based, not catering to but addressing the principles of beneficence and non-maleficence.
So the benefits have to very much outweigh the risks, and there is also explainability and other things. They are crafting those guidelines according to the principles that we are learning about in this paper series. Now let's see how something can become biased. We can see that it's not too difficult to bias something, because we have so many different sources of bias.
But here is the representative example of bias introduced in developing an AI algorithm for [00:31:00] prostate cancer. So here in this image you have a hypothetical distribution of prostate cancer Gleason score by hereditary population groups. We have Eurasian and North American populations, and we have the prevalence of the different Gleason grades.
And we can see that 1, 2, 3, 4, very low prevalence, right? We use this data for training, and then we have a biased distribution of training data, where our Gleason 5 is very low, and our AI robot gets this and applies it to diverse patients, whereas the training data was not geographically diverse.
And then the prediction is positive for populations with low Gleason grade [00:32:00] 5 tumor prevalence and negative for populations with a high prevalence of Gleason grade 5 tumors. So we have positive in the areas where the data was taken from and negative in the areas where there was no data. But if we also use data from South American, African, and Oceania populations, in which we can see that the Gleason grade 5 prevalence is a lot higher than in the previous data, and we
combine the data sets, so we have both the North American data and the rest of the world, South American, African, and Oceania, then we have a different distribution of training data, unbiased and representative, and we can see that the prediction of prostate cancer outcome is actually positive for all populations.
Yeah, it is easy to produce this type [00:33:00] of bias, because access to data is difficult. How are you gonna get data from all over the world? It's difficult. But nevertheless, if you want an unbiased, generalizable algorithm, then you need to do that. If you cannot do that, then you cannot apply the algorithm to the populations that were not represented in your data.
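As a rough sketch of the data check behind that figure (the numbers and group names below are made up, not taken from the paper), you can compare how often the class of interest, say Gleason grade 5, shows up in each source before and after pooling:

```python
from collections import Counter

# Hypothetical Gleason grade labels per data source; 5 marks grade 5 cases.
north_america = [3, 3, 4, 3, 4, 5, 3, 4, 3, 4]          # grade 5 is rare here
rest_of_world = [5, 4, 5, 3, 5, 4, 5, 5, 3, 5]          # grade 5 is common here

def grade5_prevalence(labels):
    return Counter(labels)[5] / len(labels)

print(grade5_prevalence(north_america))                  # 0.1
print(grade5_prevalence(rest_of_world))                  # 0.6
print(grade5_prevalence(north_america + rest_of_world))  # 0.35 after pooling
```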
And this is what the authors and I mean by check your bias: where can you be biased? Okay, we checked; there are a lot of sources of bias. How do you mitigate this bias? What is bias mitigation? When I was going through this part, it basically is: check for it, and figure out how to not have it. Once you
realize it's there, then you know what to do to counter it. [00:34:00] And another comment regarding our FDA discussion: multidisciplinary involvement is required in this. Yes, very much. In general, digital pathology is already a multidisciplinary field where you have everybody involved in the care
actually working on this: IT, lab professionals, pathologists, obviously, but then also clinicians or treating physicians getting the data from digital pathology, and other specialties like dermatologists showing their patients slides that were generated by digital pathology.
And the next level is AI, because now we also have developers, computer scientists, and engineers in that group, and definitely statisticians. Basically everybody [00:35:00] who is involved in research. Let's go back to our bias mitigation. What can we do better? When we collect and prepare data, we gather diverse data sets.
So it's funny, because once you realize, okay, this is a bias and I have to counter it, it's pretty self-explanatory. How do you counter a bias where you don't have diverse data sets? You get diverse data sets, right? Easy. But realizing that you don't have diverse data sets, or that you're applying a solution built on non-diverse data sets to diverse populations,
this is where you have to do the check. Here you actively seek out and include diverse data sets that represent different demographics, geographic locations, and socioeconomic backgrounds. This is crucial. And here I always mention different initiatives, because [00:36:00] access to data is difficult.
When you have a collaboration with a healthcare institution, you wanna maximize that collaboration, and then, okay, that is the population from that institution. So here I mention two organizations. One is the Digital Diagnostics,
what is it, Foundation, the Digital Diagnostics Foundation started by Matt Levitt, and that is basically his goal: he gathers data from diverse populations, pathology, radiology, all the medical data, and whoever wants access to this data then works with the foundation, both on the data provider side, so healthcare institutions and patients, and on the data user side, meaning the institutions that wanna develop tools or use this data.
And their role is to be an honest broker in the [00:37:00] handling of this data. Another initiative is actually having its annual meeting in January next year in Berlin, Germany, where I studied, where I did my PhD and my residency, so maybe I should go there and do a live stream from that event.
And that event is the annual meeting of the Big Picture Consortium. This is a public-private consortium in Europe, and its role is to aggregate data, not only from diverse countries and diverse places in Europe, but also from different stages of the medical life cycle,
so both preclinical and clinical, and they are building a huge database. So those are the two organizations I talk about. There is another option: federated learning, [00:38:00] swarm learning, different decentralized learning techniques, a topic for another presentation
and also the topic of other videos. I'm pointing up because I will link to them later so that you can have a look, or put them in the show notes; when I'm pointing down it's the show notes, when I'm pointing up it's a card on YouTube. Anyway, so we do that. Then, model development and training.
We select appropriate algorithms and features. This is important. Then we conduct evaluation across different demographic groups to identify any disparate impacts, and techniques such as fairness-aware learning and bias detection algorithms can help. These are automated ways of introducing such [00:39:00] checks into the models.
Model evaluation: performance evaluation across a spectrum of demographics and diagnostic groups. It should be more than traditional accuracy metrics and include bias and fairness assessment, evaluating the model's performance, for example, across patient populations, diagnostic categories, staining protocols, reference ranges, et cetera.
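Here is a minimal sketch of what going beyond a single accuracy number can look like: computing accuracy per demographic group and flagging a large gap. The group names, labels, and the 0.1 threshold are all assumptions for illustration.

```python
# Per-subgroup accuracy check: a simple bias and fairness assessment on top
# of overall accuracy. Groups, labels, and the gap threshold are illustrative.

records = [  # (demographic_group, true_label, predicted_label)
    ("group_A", 1, 1), ("group_A", 0, 0), ("group_A", 1, 1), ("group_A", 0, 1),
    ("group_B", 1, 0), ("group_B", 1, 1), ("group_B", 0, 0), ("group_B", 1, 0),
]

by_group = {}
for group, truth, prediction in records:
    by_group.setdefault(group, []).append(truth == prediction)

accuracy = {group: sum(hits) / len(hits) for group, hits in by_group.items()}
gap = max(accuracy.values()) - min(accuracy.values())

print(accuracy)                    # {'group_A': 0.75, 'group_B': 0.5}
print(f"accuracy gap = {gap:.2f}")
if gap > 0.1:                      # arbitrary threshold for this sketch
    print("Warning: performance differs notably across groups")
```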
There was one publication that we actually talked about in one of the digital pathology digests, where there was an algorithm that you could apply to your data, a performance checker, or maybe a bias checker, where you run an [00:40:00] AI algorithm on data and this other algorithm shows you on which data it didn't really perform, because that data was out of distribution compared to the training data of the algorithm you were checking. I know it's a convoluted explanation, but basically it shows you the subset of the data on which the algorithm you're evaluating
is working well, so you can use it there, and for the rest it tells you: no, don't use this algorithm on the rest of the data. So what do you do with the rest of the data? The rest of the data is gonna be images from patients, right? You maybe send them back to the pathologist, or you maybe use a different algorithm that's more optimized for a specific demographic, a specific presentation of a tumor, and things like that.
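I don't know exactly which method that publication used, but the general out-of-distribution idea can be sketched like this: compare a feature of each new case against the training distribution and flag cases that fall too far outside it. The feature, the values, and the z-score rule are assumptions for illustration only.

```python
import statistics

# Hypothetical per-slide feature (e.g., a stain-intensity summary) from training.
train_feature_values = [0.42, 0.45, 0.40, 0.44, 0.43, 0.41, 0.46, 0.44]
train_mean = statistics.mean(train_feature_values)
train_std = statistics.stdev(train_feature_values)

def is_out_of_distribution(value, z_threshold=3.0):
    """Flag a case whose feature lies far from the training distribution."""
    return abs(value - train_mean) / train_std > z_threshold

for case_value in [0.43, 0.44, 0.90]:  # 0.90 mimics a different staining protocol
    if is_out_of_distribution(case_value):
        print(f"{case_value}: out of distribution, route to pathologist or another model")
    else:
        print(f"{case_value}: in distribution, the algorithm's output can be used")
```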
There is a lot of research going on in this area. Model deployment and ongoing monitoring: regular bias assessments [00:41:00] throughout the AI model life cycle help identify and mitigate biases as they arise, including pre-processing steps, algorithmic adjustments, and post-deployment monitoring. So it's basically a bit of a never-ending story.
You need to monitor, you need to check. There is also interpretability and accountability, and here concepts like explainability and self-reflection come into play. Self-reflection, or introspection, specifically allows AI models to analyze their own decision-making processes by identifying the biases and factors that influence their outcomes.
This concept was new to me; I didn't know there was a way to flag biases with an AI, right? And explainability is, [00:42:00] I would call it, the branch of AI science that ensures that AI decisions are understandable to stakeholders and end users by providing clear justifications of their predictions, outputs, and actions.
Yeah, and words of wisdom here: effective bias mitigation requires a holistic approach. Like, really? Yes. Yes. Now, after listening to me for 42 minutes, you know that there isn't one fix for everything, like in life, right? A holistic approach at multiple stages of development. And something that guides everything here is the FAIR principles.
The FAIR principles regarding data refer to the findability, accessibility, interoperability, and reusability [00:43:00] principles. Is our data always FAIR? No. There's a fantastic graphic for this as well. So what does that mean? That it's findable, the data is findable.
I always think, let's think about annotations. Aren't they reusable? No, they disappear into your algorithm and then you never see them again. Hundreds or thousands of pathologist hours of work. But let's start with findability.
Let me use the highlighter. Findability: data is assigned a globally unique and persistent identifier. I don't know, when I go through these FAIR principles I'm thinking we're not that FAIR when it comes to data, but let's just go through it, right? Data is described with rich metadata. How often is metadata missing?
Probably often. Data is registered [00:44:00] and indexed in a searchable resource. Really? Is that the case at present for a lot of the data I'm thinking of? Searchable? Only recently have we come up with ways, and I saw some of these at USCAP, to search images, right, not only by metadata but also by image.
And image data is a super important part of pathology data, a crucial part, right? Metadata specify the data identifier. Okay, but now let's stop complaining about how little of this is actually available. Let's use these principles and make more of it available. So that was findability.
That's the F from FAIR. Accessibility, that's the A: data and metadata are retrievable by their identifier using a standard communications protocol; the [00:45:00] protocol is open, free, and universally implementable; authentication and authorization procedures, when necessary, are clearly outlined; and metadata remains accessible even when the data are no longer available.
That is interesting, see: even when there is no more data, there is metadata. The open padlock is the symbol. Interoperability, the I from FAIR: data use a formal, accessible, shared, and broadly applicable language for knowledge representation; data use vocabularies that follow FAIR principles;
data include qualified references to other data. And then we have reusability, which was my rant on annotations: data and metadata are well documented and richly described with accurate and relevant attributes; data are released with a clear and accessible data usage license. [00:46:00] And here I wanna give a shout-out to the authors and to Modern Pathology, because we have a license that allows us to use these papers for activities like what we are doing right now.
Data sources are shared, and data are shared alongside methodologies and research results. So these are the FAIR principles that we should adhere to when we generate data and when we build things from this data; a small sketch of what that can look like for a dataset record follows below.
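As a small illustration of what FAIR-minded bookkeeping could look like for a dataset (every field below is hypothetical, including the identifier), a record might carry a persistent identifier, rich metadata using shared vocabularies, access information, a license, and provenance:

```python
# A hypothetical, minimal FAIR-style dataset record: a persistent identifier
# (findable), an access protocol (accessible), shared-vocabulary terms
# (interoperable), and a license plus provenance (reusable).
dataset_record = {
    "identifier": "doi:10.1234/example-prostate-wsi",   # made-up DOI
    "title": "Prostate biopsy whole-slide images with Gleason annotations",
    "metadata": {
        "organ": "prostate",
        "stain": "H&E",
        "annotation_type": "Gleason grade",
        "vocabulary": "SNOMED CT",                       # shared terminology
    },
    "access": {"protocol": "https", "authentication": "required"},
    "license": "CC-BY-4.0",
    "provenance": {"methods": "linked protocol document", "version": "1.0"},
}
print(dataset_record["identifier"])
```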
Oh, and now we're going back to the guidelines that can guide you when you develop AI algorithms; let me show you the original image we were discussing, this one. So QUADAS, STARD, and all these different guidelines: there is a special table at the end of this paper, and it gives you a list [00:47:00] of those guidelines, STARD-AI, TRIPOD-AI, PROBAST-AI. It tells you what the purpose is; for example, for STARD-AI, reporting standards for studies evaluating diagnostic AI algorithms and ensuring transparency.
Then, when are you supposed to use it (evaluation before diagnosis), and how it corresponds to the FAIR principles.
How I would approach it: let's assume I'm an AI developer, which I'm not, I'm a veterinary pathologist, or somebody who manages the development of these algorithms. When we develop and validate, we have QUADAS, STARD, PROBAST, and TRIPOD, [00:48:00] and we go to the table and we check: okay, do these apply to us?
How do they relate to the FAIR principles? When do we use them? TRIPOD is for the development of diagnostic tools, and PROBAST is for the development of diagnostic and prognostic tools. And then we have a lot more of them, a full table. Let me just tell you the timelines for when we're gonna be using them.
Development and validation of AI algorithms, reporting of AI studies, development and validation of AI models, integration into clinical decision making, and clinical trial protocols. This one is important specifically for people working with clinical trials and reporting of AI, so there is one for clinical trial protocols and one for reporting clinical trials:
SPIRIT-AI is for the clinical trial protocol, CONSORT-AI [00:49:00] is for reporting of AI clinical trials, and QUADAS-AI is for reporting of AI studies. So we have a lot of them and we can use them. Let me know what's new for you from this, because some of it was not new for me, but these guidelines were new for sure.
The FAIR principles I knew about, but I had learned about them and they stayed in the background, not really at the forefront. They should be at the forefront.
Thank you so much for joining me today. If you are interested in learning more about digital pathology in general and AI,
there is a course that I have for AI: the Pathology AI [00:50:00] Makeover. You can have a look at it, and we are gonna be including all the live streams there, edited without the fluff, for you to follow. And if you're just starting, then grab the book, the Digital Pathology 101 book.
You can grab a free PDF, let me make myself big, a free PDF from the website with the QR code that you're seeing on the screen right now, and start your digital pathology journey. Once we are done with the series, the AI section is gonna be updated. So if you already downloaded the free PDF, even if you did it two years ago when I first published this book in 2023, you are on my list and will get the updated version.
And if you don't have any version yet, get whichever one is there, even if it's not updated yet, and then when the new one is out, you are gonna get a free [00:51:00] PDF.
Thank you so much for joining me, and I talk to you in the next episode.